SELECT COUNT(VI.id)
FROM Client_Voucher_Items VI
LEFT JOIN Client_Vouchers V
ON VI.voucherId = V.id
LEFT JOIN Clients C
ON C.id = V.clientId
LEFT JOIN Items I
ON I.id = VI.itemId
WHERE I.description = 'Glasses'
AND V.dateVoided IS NULL
AND V.dateCreated >= '2010-01-01'
AND V.dateCreated <= '2010-01-31'
AND C.id IN (
SELECT DISTINCT(V.clientId)
FROM Client_Vouchers V, Clients C
WHERE C.id = V.clientId
AND DATE_FORMAT(V.dateCreated, '%Y') - DATE_FORMAT(C.dob, '%Y') - (DATE_FORMAT(V.dateCreated, '00-%m-%d') < DATE_FORMAT(C.dob, '00-%m-%d')) >= 60
AND V.dateCreated >= '2010-01-01'
AND V.dateCreated <= '2010-01-31'
AND V.dateVoided IS NULL
)
Because of the IN() clause (and mysql’s horrible optimization of it) this query is taking 10+ seconds to run. I am not proficient enough with sql to rewrite this without the IN() clause. I can post the table structure if it’s not clear.
Thanks r937 - that did shave a second or two off the query - so now I have this:
SELECT COUNT(VI.id)
FROM Client_Voucher_Items VI
INNER JOIN Client_Vouchers V
ON VI.voucherId = V.id
INNER JOIN Clients C
ON C.id = V.clientId
INNER JOIN Items I
ON I.id = VI.itemId
WHERE I.description = 'Glasses'
AND V.dateVoided IS NULL
AND V.dateCreated >= '2010-01-01'
AND V.dateCreated <= '2010-01-31'
AND C.id IN (
SELECT DISTINCT(V.clientId)
FROM Client_Vouchers V, Clients C
WHERE C.id = V.clientId
AND DATE_FORMAT(V.dateCreated, '%Y') - DATE_FORMAT(C.dob, '%Y') - (DATE_FORMAT(V.dateCreated, '00-%m-%d') < DATE_FORMAT(C.dob, '00-%m-%d')) >= 60
AND V.dateCreated >= '2010-01-01'
AND V.dateCreated <= '2010-01-31'
AND V.dateVoided IS NULL
)
before rewriting the subquery as a join, i just want to take a moment and confirm that the subquery is returning something different from the main query
in the main query, you’re looking specifically for glasses
but in the subquery, you seem to be looking for any client of age 60+ who has ~any~ voucher, regardless of whether that voucher contains glasses or not
is this correct?
because if it isn’t, it sure looks like those WHERE conditions in the subquery could simply be added to the WHERE clause of the outer query
I get what you’re saying Rudy. The query is supposed to pull the unduplicated number of people over 60 that had vouchers for glasses in a date range, and I see now that I had it written incorrectly.
I rewrote this as:
SELECT COUNT(DISTINCT(V.clientId))
FROM ClientManagement_Client_Voucher_Items VI
INNER JOIN ClientManagement_Client_Vouchers V
ON VI.voucherId = V.id
INNER JOIN ClientManagement_Clients C
ON C.id = V.clientId
INNER JOIN SystemSetup_Items I
ON I.id = VI.itemId
WHERE I.description = 'Glasses'
AND V.dateVoided IS NULL
AND V.dateCreated >= '2010-01-01'
AND V.dateCreated <= '2010-12-31'
AND DATE_FORMAT(V.dateCreated, '%Y') - DATE_FORMAT(C.dob, '%Y') - (DATE_FORMAT(V.dateCreated, '00-%m-%d') < DATE_FORMAT(C.dob, '00-%m-%d')) >= 60
Does that look right to you? It’s yielding slightly different numbers, but I think it’s right and what I had previously was wrong (because I was counting voucher items instead of distinct clients).
Side note - this is one of about 50 subqueries in a massive annual report I’m working on…I’m trying to find the bottlenecks in it and I believe I’ve been staring at it wayyyyy too long.
certainly looks a lot simpler than what you had before
on a minor note, i try to write my FROM clause in a logical sequence, one that mirrors how i hope/assume the optimizer will approach it – starting with the essential or most restrictive table, and joining from there (so as to minimize the number of rows that it retrieves only to throw away)
since your whole query is about glasses, this is how i would write it…
SELECT COUNT( DISTINCT V.clientId )
FROM SystemSetup_Items I
INNER
JOIN ClientManagement_Client_Voucher_Items VI
ON VI.itemId = I.id
INNER
JOIN ClientManagement_Client_Vouchers V
ON V.id = VI.voucherId
AND V.dateVoided IS NULL
AND V.dateCreated >= '2010-01-01'
AND V.dateCreated <= '2010-12-31'
INNER
JOIN ClientManagement_Clients C
ON C.id = V.clientId
AND DATE_FORMAT(V.dateCreated, '%Y') -
DATE_FORMAT(C.dob, '%Y') -
(DATE_FORMAT(V.dateCreated, '00-%m-%d') <
DATE_FORMAT(C.dob, '00-%m-%d')) >= 60
WHERE I.description = 'Glasses'
note also that DISTINCT is ~not~ a function, and whatever comes after it does not need to be placed in parentheses