Trying to be extra clever with a INNER JOIN SUM() query

Hi guys,

I’m trying to be clever with a query to add up credits from one table under categories from another. It’s a pretty simple query and it almost works. Here’s the details:

I’ve got an activities table and a categories table like so:


CREATE  TABLE IF NOT EXISTS categories (
  id INT NULL AUTO_INCREMENT ,
  category VARCHAR(45) NOT NULL ,
  description VARCHAR(255) NOT NULL ,
  minimum DECIMAL(6,2) NULL ,
  maximum DECIMAL(6,2) NULL ,
  PRIMARY KEY (id) )
ENGINE = InnoDB;

CREATE  TABLE IF NOT EXISTS activities (
  id INT NULL AUTO_INCREMENT,
  contact_id INT NOT NULL ,
  category_id INT NOT NULL ,
  credit_date DATETIME NOT NULL ,
  credits DECIMAL(6,2) NOT NULL ,
  activity VARCHAR(45) NOT NULL ,
  notes VARCHAR(255) NULL ,
  PRIMARY KEY (id, contact_id, category_id) )
ENGINE = InnoDB;

My ideal Query, returns the total credits per category ordered by categories… This is what I’ve done so far. It Adds up all of the credits for all categories and provides the first Category, minimum, maximum and description it comes across:


SELECT categories.category AS category,  
       SUM(activities.credits) AS credits, 
       categories.minimum AS minimum, 
       categories.maximum AS maximum,
       categories.description AS description 
       FROM categories 
       INNER JOIN activities 
       ON categories.id = activities.category_id 
       WHERE contact_id = $contact_id
       AND EXTRACT(YEAR FROM activities.credit_date) = $report_year
       ORDER BY categories.id;

#note: $contact_id and $report_year are variables like 1 and 2012. 


It returns this:

‘Work Experience’,24,‘0.00’,‘20.00’,‘Work Experience description’

What I’m after should return something that looks roughly like this:

‘Work Experience’,‘8’,‘0.00’,‘20.00’,‘Work Experience description’
‘Education’,‘4’,‘0.00’,‘20.00’,‘Education description’
‘Mentoring’,‘12’,‘0.00’,‘40.00’,‘Mentoring description’

Failing this, I can always use a query like the following and put it in a loop that runs for each category but I want to avoid that if I can (and I know I can):



SELECT categories.category AS category,  
        SUM(activities.credits) AS credits, 
        categories.minimum AS minimum, 
        categories.maximum AS maximum,
        categories.description AS description 
       FROM categories 
       INNER JOIN activities 
       ON categories.id = activities.category_id 
       WHERE contact_id = $contact_id
       AND EXTRACT(YEAR FROM activities.credit_date) = $report_year
       AND categories.id = $category_id;

#note: $contact_id, $report_year and $category_id are variables like 1, 2012 and 3.



Anyway, I’m certain I’ve accomplished this in the past I just can’t recall what I did.

Thanks for any help.

Andrew

looks like somebody forgot the GROUP BY clause…

:slight_smile:

Oh jeeze, I missed it by a single statement. Obviously I haven’t been thinking in SQL enough lately :blush:

Thanks for that, now I can zip out results for all of the credits I have for a particular individual for a particular year:

SELECT categories.category AS category,  
       SUM(activities.credits) AS credits, 
       categories.minimum AS minimum, 
       categories.maximum AS maximum,
       categories.description AS description 
       FROM categories 
       INNER JOIN activities 
       ON categories.id = activities.category_id 
       WHERE contact_id = $contact_id
       AND EXTRACT(YEAR FROM activities.credit_date) = $report_year
       GROUP BY categories.id;

There’s one more hitch… I’d like to still pull out a response when there is no data for a particular category so If I have 5 categories and only two of them have records relating to them, I’d still like them to show up in my results , even if the credit column returns null. Something like the following:

‘Work Experience’,‘8’,‘0.00’,‘20.00’,‘Work Experience description’
‘Education’,‘4’,‘0.00’,‘20.00’,‘Education description’
‘Mentoring’,null,‘0.00’,‘40.00’,‘Mentoring description’
‘Professional Activities’,‘12’,‘0.00’,‘20.00’,‘Professional Activities description’
‘Something Else’,null,‘0.00’,‘120.00’,‘Something Else description’

I’m, pretty pleased with this so far because this is an app I built nearly 11 years ago and when I built it originally, I ran several queries, loaded the results into arrays and then sorted and manipulated them in code to get my results. If I can shape this query up, it will simplify the code immensely.

Andrew


SELECT categories.category
     , SUM(activities.credits) AS credits
     , categories.minimum
     , categories.maximum
     , categories.description
  FROM categories
[COLOR="#0000FF"]LEFT OUTER 
  JOIN[/COLOR] activities
    ON activities.category_id = categories.id
   [COLOR="#0000FF"]AND [/COLOR]activities.contact_id = $contact_id
   [COLOR="#0000FF"]AND [/COLOR]EXTRACT(YEAR FROM activities.credit_date) = $report_year
GROUP 
    BY categories.id;

Oh, that works like magic.

I’m picking up my copy of Simply SQL and putting it back on my desk so I can brush up on JOINS… Trying to figure out anything other than INNER JOINS on bla-bla-bla gives me a headache. Seeing the power of this LEFT OUTER JOIN has convinced me that it’s time I got to know them better.

Thanks so much for the help. This has just simplified my app by a great number of lines and it will be much more maintainable.

Cheers,
Andrew

Hey Rudy,

I wonder if you can shed some light on an idea I have to extend this query a little further. I have it open in MySQL Workbench and I’m playing with it a little further.

Initially, with your help my Query produced a table of Categories and the number of activity credits recorded by the contact’s id in question.



$sql = "SELECT civi_cpd_categories.id AS id
        , civi_cpd_categories.category AS category
        , SUM(civi_cpd_activities.credits) AS credits
        , civi_cpd_categories.minimum
        , civi_cpd_categories.maximum
        , civi_cpd_categories.description
        FROM civi_cpd_categories
        LEFT OUTER JOIN civi_cpd_activities
        ON civi_cpd_activities.category_id = civi_cpd_categories.id
        AND civi_cpd_activities.contact_id = " . $contact_id . "
        AND EXTRACT(YEAR FROM civi_cpd_activities.credit_date) = " . $_SESSION["report_year"] . "
        GROUP BY civi_cpd_categories.id";


From that point, I have a hyperlink that lets you go to a page that lists out the details of the activities for that person under that category for that date.


$sql = "SELECT civi_cpd_categories.category
                                   , civi_cpd_activities.id AS activity_id
                                   , civi_cpd_activities.credit_date
                                   , civi_cpd_activities.credits
                                   , civi_cpd_activities.activity
                                   , civi_cpd_activities.notes 
                                   FROM civi_cpd_categories 
                                   INNER JOIN civi_cpd_activities 
                                   ON civi_cpd_categories.id = civi_cpd_activities.category_id 
                                   WHERE civi_cpd_activities.category_id = " . $category_id . " 
                                   AND contact_id = " . $contact_id . " 
                                   AND EXTRACT(YEAR FROM credit_date) = " . $_SESSION["report_year"] . " 
                                   ORDER BY credit_date";       


Now I’m considering how to combine the two so that I produce the Category information with the total and I produce a table of the detailed activities under each category which I can hide with jQuery and expose as needed.

I know I can do this with the following query and some creative coding but I wonder if there is a better way:


$sql = "SELECT civi_cpd_categories.id AS id
        , civi_cpd_categories.category AS category
        , civi_cpd_categories.description
        , civi_cpd_categories.minimum
        , civi_cpd_categories.maximum
        , civi_cpd_activities.credit_date
        , civi_cpd_activities.activity
        , civi_cpd_activities.credits
        FROM civi_cpd_categories
        LEFT OUTER JOIN civi_cpd_activities
        ON civi_cpd_activities.category_id = civi_cpd_categories.id
        AND civi_cpd_activities.contact_id = " . $contact_id . "
        AND EXTRACT(YEAR FROM civi_cpd_activities.credit_date) = " . $_SESSION["report_year"] . "
        ORDER BY civi_cpd_categories.id";


Thanks,
Andrew

i know what you’re asking and i have mixed opinions about the best way to do it

you could return both detail rows along with aggregate rows, using a UNION

using a simple coding technique, which i can explain if you wish, you can easily “hide and expose as needed” the detail rows

on the other hand, a simple query, returning only detail rows, could be processed multiple ways using arrays or whatnot on the application side

so i guess it’s whichever way you want to go – the purist would say separate the data retrival from the data display, but the lazy coder would use the UNION query

Yeah, that’s the direction I was thinking about going with using this query:

$sql = "SELECT civi_cpd_categories.id AS id
        , civi_cpd_categories.category AS category
        , civi_cpd_categories.description
        , civi_cpd_categories.minimum
        , civi_cpd_categories.maximum
        , civi_cpd_activities.credit_date
        , civi_cpd_activities.activity
        , civi_cpd_activities.credits
        FROM civi_cpd_categories
        LEFT OUTER JOIN civi_cpd_activities
        ON civi_cpd_activities.category_id = civi_cpd_categories.id
        AND civi_cpd_activities.contact_id = " . $contact_id . "
        AND EXTRACT(YEAR FROM civi_cpd_activities.credit_date) = " . $_SESSION["report_year"] . "
        ORDER BY civi_cpd_categories.id"; 

I wouldn’t consider the UNION route a lazy way out; it requires a more elegant query and returns a tight dataset that can be plugged into the code in a nice neat fashion. Running a query, returning detail rows and then processing it by looping through arrays is in my opinion taking the easy way out. I’ll probably end up doing it that way because it’s easier but if I can get my query working with the union, that’s the route I’ll take.

Hi Rudy,

I just wanted to pop back in and say thanks for helping out with my queries. Your LEFT OUTER JOIN and GROUP BY advice really cleaned up what I was doing with my decade old approach of several queries and looping through arrays.

I haven’t mastered the UNION approach as discussed for that final query. I’ll figure it out eventually now that I have an inkling of how to approach it but for now I’ve taken the easy way out by making several queries saving the results in arrays and then looping through the arrays in code to print the report. It’s worked on my old app for more than 10 years so although not as pretty as I’d like, it is effective.

Also congrats on DB Guru of the year for 2012!

Cheers,
Andrew

“not as pretty as i’d like, but effective” – one of the better yardsticks

:slight_smile: