I have an authentication system for my programs which logs every authentication attempt.
Over the months, the logs table has grown to 4,064,450 records.
The problem is that the query to get the latest 13 records now takes about 10 seconds to run:
SELECT * FROM logs_table ORDER BY time DESC LIMIT 0, 13
If you need to keep and use all of that data, that does limit your options.
Do you really need every field returned, i.e. the "*"? Even if you do, AFAIK explicitly listing the fields allows the database to make better use of its indexes and should run faster.
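For reference, that suggestion would look something like this (column names taken from the test table posted later in this thread):

SELECT id, user_id, action_id, time
FROM logs_table
ORDER BY time DESC
LIMIT 0, 13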
For the frequent query, maybe you could maintain a "recent" table that points back to rows in the "logs" table, and save the heavy grunt work for the occasions when you actually need the full history.
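A minimal sketch of that idea (the recent_logs name and structure are just placeholders, and the id column is assumed from the test table later in the thread); the application would add a row here with each log write and trim the table periodically:

CREATE TABLE recent_logs (
  log_id INT NOT NULL,
  PRIMARY KEY (log_id)
);

-- after each insert into logs_table:
INSERT INTO recent_logs (log_id) VALUES (LAST_INSERT_ID());

-- the frequent query then only scans the small table:
SELECT l.*
FROM recent_logs r
JOIN logs_table l ON l.id = r.log_id
ORDER BY l.time DESC
LIMIT 0, 13;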
That's not the culprit here… he's combining an ORDER BY with a LIMIT, which has to compare the time field on every matching row… even when the column is indexed, that can mean sorting a multi-million-record result set before the LIMIT is ever applied.
To increase performance, he has to do a "pre-sort pseudo-limit" by date. This will allow him to cut out a significant percentage of the rows on an indexed column prior to the ORDER BY/LIMIT.
It’ll significantly increase his performance (assuming the time column is indexed).
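Something along these lines (the one-day window is just an example; it would need to be wide enough to always contain at least 13 rows):

SELECT *
FROM logs_table
WHERE time > NOW() - INTERVAL 1 DAY
ORDER BY time DESC
LIMIT 0, 13

The WHERE clause can use the index on time to discard the vast majority of rows before the sort ever happens.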
What server-side language are you using? You could try dropping the LIMIT clause and, in the server-side code, running the loop that fetches each row of the result set 13 times (or however many rows you need), then breaking out of the loop.
Yikes!!! That will retrieve 4 MILLION records!!! I don't even KNOW how a server will handle that… even with an iterative fetch model, at some point the DB connection still has to process all 4 million records!
Ok, I decided to run a test on this, because I’m interested to see the results…
I’ve built the following table:
CREATE TABLE `logs_table` (
`id` INT NOT NULL AUTO_INCREMENT,
`user_id` INT NOT NULL,
`action_id` INT NOT NULL,
`time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
INDEX (`user_id`),
INDEX (`action_id`),
INDEX (`time`)
);
I'm currently populating it with data… 4,000,000 records spread out randomly over a 12-month period. I've got 1.5 million records inserted so far!!!
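One quick-and-dirty way to generate that kind of randomly-dated test data is a stored procedure like this (just a sketch, not necessarily how it was done here; the ID ranges are arbitrary assumptions, and inserting row by row is slow but simple):

DELIMITER //
CREATE PROCEDURE populate_logs(IN n INT)
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < n DO
    -- random user/action IDs (ranges are arbitrary for the test)
    -- and a timestamp anywhere in the past 12 months
    INSERT INTO logs_table (user_id, action_id, `time`)
    VALUES (
      FLOOR(1 + RAND() * 10000),
      FLOOR(1 + RAND() * 50),
      NOW() - INTERVAL FLOOR(RAND() * 365 * 86400) SECOND
    );
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;

CALL populate_logs(4000000);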
I’m going to test some different queries on it and see what kind of results I get!
PS - This is on a 512 MB Slicehost VPS - not the best machine in the world, so it should give some telling results!