Using GROUP BY and MAX in SQL

hi folks - I was hoping someone (Rudy?) could help me understand what I’m missing here; as I mentioned to Rudy earlier I know enough about SQL and databases in general to be dangerous.

I have two tables: customers & visits; one customer can have multiple visits; what I’m trying to come up with is a query that will provide me with the latest visit per customer:

mysql> select id, name from customers;
 +-----+-----------------+
 | id  | name            |
 +-----+-----------------+
 | 101 | AAA Electronics |
 | 102 | BBB Electronics |
 | 103 | CCC Electronics |
 | 104 | DDD Electronics |
 +-----+-----------------+
 4 rows in set (0.00 sec)
mysql> select id, date, customer_id from visits order by customer_id;
 +----+---------------------+-------------+
 | id | date                | customer_id |
 +----+---------------------+-------------+
 |  1 | 2012-02-20 00:00:00 |         101 |  <----
 |  2 | 2012-02-01 00:00:00 |         101 |
 |  3 | 2012-02-03 00:00:00 |         101 |
 |  4 | 2012-02-05 00:00:00 |         101 |
 |  5 | 2012-01-07 00:00:00 |         102 |
 |  6 | 2012-01-08 00:00:00 |         102 |
 |  7 | 2012-01-09 00:00:00 |         102 |
 |  8 | 2012-01-25 00:00:00 |         102 |  <----
 |  9 | 2011-12-01 00:00:00 |         103 |
 | 10 | 2011-12-02 00:00:00 |         103 |
 | 11 | 2011-12-25 00:00:00 |         103 |  <----
 | 12 | 2011-12-03 00:00:00 |         103 |
 | 13 | 2011-12-04 00:00:00 |         103 |
 +----+---------------------+-------------+
 13 rows in set (0.00 sec)

mysql>  SELECT c.name, c.id customer_id, v.id visit_id, max(v.date) visit_date
     ->  FROM customers c join VISITS v ON v.customer_id = c.id
     -> GROUP BY c.name;
 +-----------------+-------------+----------+---------------------+
 | name            | customer_id | visit_id | visit_date          |
 +-----------------+-------------+----------+---------------------+
 | AAA Electronics |         101 |        1 | 2012-02-20 00:00:00 |
 | BBB Electronics |         102 |        5 | 2012-01-25 00:00:00 |
 | CCC Electronics |         103 |        9 | 2011-12-25 00:00:00 |
 +-----------------+-------------+----------+---------------------+
 3 rows in set (0.00 sec)

I seem to be almost there, except that the visit_id isn’t getting grouped correctly.

what I need for the query to produce is this:


 +-----------------+-------------+----------+---------------------+
 | name            | customer_id | visit_id | visit_date          |
 +-----------------+-------------+----------+---------------------+
 | AAA Electronics |         101 |        1 | 2012-02-20 00:00:00 |
 | BBB Electronics |         102 |        8 | 2012-01-25 00:00:00 |
 | CCC Electronics |         103 |       11 | 2011-12-25 00:00:00 |
 +-----------------+-------------+----------+---------------------+
 3 rows in set (0.00 sec)

I’m starting to think I might need two queries with a UNION in there somewhere… thanks.

-Len

i seriously question why you would need the visit_id

but let’s pretend there is another column in the visits table, something that might realistically be required, but let’s call it “foo”

so you want the customers along with their latest visit’s foo

SELECT customers.id
     , customers.name
     , visits.date
     , visits.foo
  FROM customers
LEFT OUTER
  JOIN ( SELECT customer_id
              , MAX(date) AS max_date
           FROM visits
         GROUP
             BY customer_id ) AS latest
    ON latest.customer_id = customers.id
LEFT OUTER
  JOIN visits
    ON visits.customer_id = latest.id
   AND visits.date = latest.max_date

the “latest” subquery determines the latest date for each customer

then you join only those rows from the visits table to pull the corresponding foo

notice the join conditions on the last join

thanks - I actually understood that, so I must be making progress…

btw - had to fix part of the last join from:

ON visits.customer_id = latest.id

to:

ON visits.customer_id = latest.customer_id

well spotted!

ah, the perils of copy/paste …

:slight_smile: