Hi,
I have a table which collects vehicle information (fms1). Most of these things count up, thus to get the ‘amount’ added in a period I need to subtract the last and the first message in that period.
These messages belong to a unit (gps_unit_id) and units belong to a relation (relation_id)
I have this query:
SELECT latest.gps_unit_id, latest.odometer - earliest.odometer AS diff_odometer, latest.total_fuel - earliest.total_fuel AS diff_total_fuel, latest.engine_hours - earliest.engine_hours AS diff_engine_hours, TIMEDIFF( latest.created, earliest.created ) AS diff_created
FROM (
SELECT gps_unit_id, MIN( created ) AS min_created, MAX( created ) AS max_created
FROM fms1
WHERE created
BETWEEN '2012-04-01'
AND '2012-04-30'
AND gps_unit_id
IN ( 76, 81, 133 )
GROUP BY gps_unit_id
) AS minmax
INNER JOIN fms1 AS earliest ON earliest.gps_unit_id = minmax.gps_unit_id
AND earliest.created = minmax.min_created
INNER JOIN fms1 AS latest ON latest.gps_unit_id = minmax.gps_unit_id
AND latest.created = minmax.max_created
WHERE minmax.gps_unit_id
IN (
SELECT id
FROM gps_units
WHERE relation_id =16
AND id
IN ( 76, 81, 133 )
)
If I change to a bigger time interval, let’s say 2 months, or a year, the query gets very slow since every message is viewed for the min() and max(). The problem lies in the subquery:
SELECT gps_unit_id, MIN( created ) AS min_created, MAX( created ) AS max_created
FROM fms1
WHERE created
BETWEEN '2012-04-01'
AND '2012-04-30'
AND gps_unit_id
IN ( 76, 81, 133 )
GROUP BY gps_unit_id
The explain command says:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 3 Using where
1 PRIMARY earliest ref created_gps_unit_id_idx,gps_unit_id created_gps_unit_id_idx 12 minmax.min_created,minmax.gps_unit_id 1
1 PRIMARY latest ref created_gps_unit_id_idx,gps_unit_id created_gps_unit_id_idx 12 minmax.max_created,freetrack.earliest.gps_unit_id 1 Using where
3 DEPENDENT SUBQUERY gps_units unique_subquery PRIMARY,relation_id PRIMARY 4 func 1 Using where
2 DERIVED fms1 range created_gps_unit_id_idx,gps_unit_id created_gps_unit_id_idx 12 NULL 224517 Using where; Using index; Using temporary; Using filesort
the last reveals that 224517 rows are evaluated for the subquery.
besides the query not being ‘scalable’ this seems like a lott of work for something rather simple.
We can assume that messages with a higher id also are also later in time. Thus you can select the first message for a given unit next to minimum or previous to the maximum date.
I just can’t think of a way to put that in query…