Query needs doctor's attention :)

Hello,

Please help me out, and I will do my best to understand:

Here’s the tables structure:

tbl_activity


id
entity_id
name

tbl_activity_package_address


id
activity_id
name
publication_date
publication_duration_in_days

tbl_activity_package_address


id
activity_package_id
district

Here’s the query I’m running:


SELECT apa.district, t.name
FROM tbl_activity AS t
JOIN tbl_activity_package AS ap ON t.id = ap.activity_id
JOIN tbl_activity_package_address AS apa ON ap.id = apa.activity_package_id
WHERE (DATE_FORMAT(ap.publication_date, '%Y-%m-%d') <= DATE_FORMAT(CURDATE(), '%Y-%m-%d'))
AND (DATE_ADD(DATE_FORMAT(ap.publication_date, '%Y-%m-%d'), INTERVAL ap.publication_duration_in_days DAY) >= DATE_FORMAT(CURDATE(), '%Y-%m-%d'))
GROUP BY apa.district
ORDER BY apa.district

Here’s the result I’m getting:


District    Name:
Beja        Actividade-2
Braga      Actividade-5
Évora      Actividade-1
Faro        Actividade-9
Leiria      Actividade-12
Lisboa    Actividade-7
Porto      Actividade-11

Here’s the issues:

1) Activity-2 has TWO or MORE districts. How can we show them (comma separated) on the results?

2) Activity-2 has TWO or MORE districts and, one of them is, “Evora”. However, “Evora” already exists on another activity, and ONLY ONE activity with “Evora” should appear.

To say it on another way: If an activity has districts A, B. And another activity has districts B, F.

Only one of those activities should appear, because they BOTH have B.

I’m query once every two years… please have patience.

Thanks in advance.

Here’s an update. I didn’t know sqlfiddle existed.

Please take a look there. I’ve created some dummy data.

Issue:

On the results:

Activity-1 and Activity-2, they both have “Evora”. So, only one should appear. Only Activity-1 or Activity-2 not both.

Thanks in advance,
mem

Your query already does requirement #2, doesn’t it? Each district shows up in the result with only 1 activity, even if there are more activities in that district.

And it seems to me that requirement #1 and requirement #2 are contradictory? In #1 you say you want both, in #2 you say you want only 1?

By the way, why are you formatting your dates in the WHERE clause? Isn’t the publication_date column of the type date ?

Please disregard the first question, and take into consideration the updated information on the second reply. I was unable to edit the first post, hence, having two replies. Sorry for that.

You have activity A, within districts Da, Db, Dc.
You have activity B, within districts Dd, Df, Da.

Only ONE of those activities should appear, because, they both have Da.

What’s the criteria to select one, instead of another?
It could be the highest ID as precedence.

I will remove the formatting dates on the WHERE clause.

Ok, so in the fiddle, you changed the logic of the query. You now group by activity name instead of district, and you use the group_concat.

So you don’t like the output it gives you


GROUP_CONCAT(DISTINCT APA.DISTRICT) 	NAME
Braga,Sines 	                        activity-5
Evora,Beja,Faro,Setubal 	        activity-1
Evora 	                                activity-2
Faro 	                                activity-3
Lisboa,Porto,Rio de Janeiro,Montemor    activity-8
Sagres,Matosinhos,Guarda 	        activity-7
Santarem 	                        activity-6
Sintra 	                                activity-4

How should the output look like? Please rewrite this output how you want it to be.

It should be like this:


GROUP_CONCAT(DISTINCT APA.DISTRICT) 	NAME
Braga,Sines 	                        activity-5
Evora,Beja,Faro,Setubal 	        activity-1
Lisboa,Porto,Rio de Janeiro,Montemor    activity-8
Sagres,Matosinhos,Guarda 	        activity-7
Santarem 	                        activity-6
Sintra 	                                activity-4

So, the second activity and the third, should not appear.

Evora 	                                activity-2
Faro 	                                activity-3

Why?
Because they both have Evora or Faro as districts and only one should appear.
Which one?
The one with highest ID should have precedence.

There’s another catch:
What about the records that are NOT listed? That we disregard on this query?
What about all activities that ALSO appear on some districts already present? Shouldn’t they be present?
YES. So, after this tricky part, we must have some sort of list for ALL others that do belong to districts already present.

At the bottom line, I’m not sure if I’m being clear. We need to have ALL activities NOT repeated, but the ORDER they appear is very complex, showing first the ones within different districts, and then the rest.

Does this makes sense, or I’m getting crazy. o.O?

Ahahahahahaha

:goof::goof::goof::goof:

OK… aaahhhm… This [ALMOST] ]works.
But we had to repeat the date comparisons, and if we need more records we have to repeat them on each UNION select.
We still get a repetition.


( 
SELECT GROUP_CONCAT(APA_T.district), t.name 
FROM tbl_activity AS t 
INNER JOIN tbl_activity_package AS ap ON t.id = ap.activity_id 
INNER JOIN ( 
SELECT DISTINCT apa.district AS district, ( 
SELECT s1.activity_package_id 
FROM tbl_activity_package_address s1 
WHERE apa.district = s1.district 
ORDER BY s1.id DESC 
LIMIT 1 
) AS idActivityPackage 
FROM 
tbl_activity_package_address apa 
ORDER BY apa.district 
) AS APA_T 
ON ap.id = APA_T.idActivityPackage 
WHERE (ap.publication_date <= CURDATE()) 
AND (DATE_ADD(ap.publication_date, INTERVAL ap.publication_duration_in_days DAY) >= CURDATE()) 
GROUP BY t.name 
ORDER BY APA_T.district 
) 
UNION DISTINCT 
( 
SELECT GROUP_CONCAT(DISTINCT apa2.district ORDER BY apa2.district), t2.name 
FROM tbl_activity AS t2 
INNER JOIN tbl_activity_package AS ap2 ON t2.id = ap2.activity_id 
INNER JOIN tbl_activity_package_address AS apa2 ON ap2.id = apa2.activity_package_id 
WHERE (ap2.publication_date <= CURDATE()) 
AND (DATE_ADD(ap2.publication_date, INTERVAL ap2.publication_duration_in_days DAY) >= CURDATE()) 
GROUP BY t2.name 
ORDER BY apa2.district 
) 
#LIMIT 6, 6

I can’t have date fields on this fiddle, so I have removed the Dates comparison WHERE clause for convenience:

Activity-1 appears repeated.
I understand why it gets repeated, but it shouldn’t.

Perhaps some sort of INTERSECTION may help, but I’m really reaching my limit here.

SELECT * FROM
(
  SELECT GROUP_CONCAT(APA_T.district) AS DistCon, t.name
  FROM tbl_activity AS t
  JOIN tbl_activity_package AS ap ON t.id = ap.id_activity
  JOIN
  (
    SELECT DISTINCT apa.district AS district,
    (
       SELECT s1.id_activity_package
       FROM tbl_activity_package_address s1
       WHERE apa.district = s1.district
       ORDER BY s1.id DESC
       LIMIT 1
    ) AS idActivityPackage
    FROM
    tbl_activity_package_address apa
    ORDER BY apa.district
  ) AS APA_T
  ON ap.id = APA_T.idActivityPackage
  GROUP BY t.name
  UNION
  SELECT GROUP_CONCAT(apa.district), t.name
  FROM tbl_activity AS t
  JOIN tbl_activity_package AS ap ON t.id = ap.id_activity
  JOIN tbl_activity_package_address AS apa ON ap.id = apa.id_activity_package
  WHERE t.name NOT IN
  (
    SELECT DISTINCT t.name
    FROM tbl_activity AS t
    JOIN tbl_activity_package AS ap ON t.id = ap.id_activity
    JOIN
    (
      SELECT DISTINCT apa.district AS district,
      (
         SELECT s1.id_activity_package
         FROM tbl_activity_package_address s1
         WHERE apa.district = s1.district
         ORDER BY s1.id DESC
         LIMIT 1
      ) AS idActivityPackage
      FROM
      tbl_activity_package_address apa
    ) AS APA_T
    ON ap.id = APA_T.idActivityPackage
  )
  GROUP BY t.name
) AS Mm
ORDER BY  Mm.name

This works. I could never ever pull this out without help.

However, as you can see, this is quite huge and perhaps is NOT the best way of doing it.

Glad you got it to work.
Maybe I’ll study it later :wink:
Me, I would have done it in PHP probably.