cydewaze — 2014-05-22T20:33:14-04:00 — #1
What's the difference between:
SELECT ride_name, dist
ORDER BY dist DESC
SELECT ride_name, dist
WHERE dist = ( SELECT MAX(dist) FROM ridelog )
They both seem to give me the same thing, but I'm thinking the second one is faster because ORDER BY is slow.
r937 — 2014-05-22T20:49:32-04:00 — #2
wait, ORDER BY is slow?
and you think MAX() is faster?
what do the EXPLAINs say?
cydewaze — 2014-05-22T22:00:32-04:00 — #3
I could have sworn you told me that at one point. Although it may have been order by vs not using order by.
I don't think anything (a common problem for me)
They don't say anything for some reason. If I cfdump the queries, I get an execution time of "1" for each version.
The MAX version returns two records for the query (there are two SELECTs after all) and the LIMIT version returns one.
mittineague — 2014-05-22T22:19:04-04:00 — #4
By "faster" what do you mean, are we talking 0.0000937 secs or something with potentially more impact?
cydewaze — 2014-05-22T22:24:12-04:00 — #5
I'm just wondering if there's an advantage to using one method over another.
In a database with 22 records, I don't think I have to much worry about performance. But someday I might have to do something similar at work, so I wanted to know the best way to do it.
felgall — 2014-05-22T23:05:36-04:00 — #6
Not using LIMIT would make the command more portable if you are thinking of switching to a different SQL database.
When it comes to the efficiency of different calls to receive the same info, it often depends on just how many records there are and what they contain as to which will be more efficient. Until such time as the two EXPLAIN statements give significantly different results there is no particular reason to switch.
r937 — 2014-05-23T04:50:09-04:00 — #7
it would also generate way too many rows
felgall — 2014-05-23T18:03:54-04:00 — #8
Why would it? From what I can see both the alternatives being discussed produce one row of output - one using LIMIT and one using an alternative way of getting just the one row. That alternative would also work in databases that do not support the proprietary LIMIT command.
Please explain how SELECT ride_name, dist FROM ridelog WHERE dist = ( SELECT MAX(dist) FROM ridelog ) generate way too many rows.
r937 — 2014-05-23T18:48:20-04:00 — #9
stephen, i was responding to your suggestion of leaving LIMIT off
of course the MAX subquery version will return one row...
or two, or three...
but not all
felgall — 2014-05-23T21:41:54-04:00 — #10
I never suggested leaving LIMIIT off. I suggested that a version that didn't use LIMIT was more portable.
r937 — 2014-05-23T22:45:05-04:00 — #11
i bow to your hair-splitting acumen
felgall — 2014-05-24T03:15:51-04:00 — #12
I don't understand this comment. We were discussing two specific alternative ways of coding an SQL call and I pointed out that using LIMIT makes the code less portable because it ties the code specifically to mySQL. I was simply pointing out that additional difference between the two versions beyond that of the comparative efficiency.
What does hair splitting have to do with that? Unless it has something to do with your misinterpretation of statements by not reading them in the context of what comes before. Does everyone have to quote everything that came before when making a comment in order for you to interpret the context of their comment properly?
Just for you - here's the context of my earlier statement:
cydewaze — 2014-05-24T23:40:11-04:00 — #13
For what it's worth, I decided to use MAX() rather than LIMIT.
molona — 2014-05-26T06:55:25-04:00 — #14
I know I'm late and I don't pretend to be an expert but my understanding would be that the first would be faster as using subqueries tends to be slower.
Haven't said this, each database is a world on its own and the best thing would be to test each of the queries on both databases to know what works best because, yes, ordering is slow
r937 — 2014-05-26T07:43:27-04:00 — #15
so is scanning for max
cydewaze — 2014-05-26T12:54:48-04:00 — #16
My problem is that I compare database functions to how I'd do the task myself.
If I had a stack of cards numbered 1 - 50, putting them in numeric order would take a certain amount of time. Picking the highest one would also require me to put them in order, then pick the top one (which is what I'd do with LIMIT 1) so from a human standpoint, they seem the same.
I don't know how a database handles things though. It's obviously faster at math than I am!
molona — 2014-05-27T10:49:36-04:00 — #17
So is that really faster? Or being more generic... is a subquery faster than ordering? Yes, you can call me ignorant
molona — 2014-05-27T10:50:25-04:00 — #18
But only a little bit faster
r937 — 2014-05-27T11:24:51-04:00 — #19
suppose you have a table where some column has various values... like, say, a table of rides where each ride has a PK and also a column for the distance
how do you image MAX(dist) is found? wouldn't the engine have to go through all the rows somehow? i can see it examining all the rows and keeping a variable which it replaces every time it finds a larger value
now imagine you had to return only one row from a result set which is sorted... wouldn't it be easier, instead of sorting them all, simply to scan through and find the largest value, then return that row?
what i'm saying is, we can speculate all we want about what's fast, and what's faster, but in the end, the optimizer decides how to do what you ask it to do, and it might not actually sort if it doesn't have to
if you want to know what's fast and what's faster, there is only one way -- EXPLAIN