Slow Query not using indexes

CKA · April 18, 2010, 2:43am

I have a query on my homepage that is getting slower and slower as my database table grows larger.

tweets_cache rows = 572,327

this is the query I’m currently using that is slow, over 5 seconds.

SELECT * FROM tweets_cache t WHERE t.province=‘’ AND t.mp=‘0’ ORDER BY t.published DESC LIMIT 50;

If I take out either the WHERE or the ORDER BY, then the query is super fast 0.016 seconds.

I have the following indexes on the tweets_cache table.
PRIMARY
published
mp
category
province
author

So i’m not sure why its not using the indexes since mp, provice and published all have indexes? Doing a profile of the query shows that its not using an index to sort the query and is using filesort which is really slow.

possible_keys = mp,province
Extra = Using where; Using filesort

Bagtjesen · April 18, 2010, 6:49pm

Hi,

Looks like MySQL decides on how to use indexes in an not so obvious way. See http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html - Have a look at chapter 7.4.4, but if you do this particular lookup a lot, have a look at 7.4.3 Multiple-Column Indexes!

Also, I don’t understand how you can compare your timings, at all - what would it find if there was no WHERE clause, and so on.

You LIMIT to 50 so there are a lot of records actually satisfying your recuirements, and they actually get handled - before you do the ORDER and LIMIT. If I understand in what order the database engine handles things. [Please anyone more experienced - CORRRECT ME IF I’M WRONG]

SpacePhoenix · April 18, 2010, 9:12pm

Do you really need all the fields of the tweets_cache returned by the query? If you don’t, then just specify the fields that you need returned in the SELECT clause.

CKA · April 18, 2010, 9:16pm

I’m pretty sure this has no effect on the speed of the query, or very very minimal in my tests. Actualy sometimes it seems faster if the query uses * rather than specifing fields.

Bagtjesen · May 9, 2010, 8:47pm

Hi again- sorry, I have been away taking a long weekend.

Yes I see some similar behaviour in a weather database I have, usually I do not have some of the columns indexed. I have 600+ MB numeric data. From what I see I have quite a bit of same data in the columns so the indexes for the columns are not very effective - If I involve the primary index which only has uniques it is very fast indeed.

I came to this conclusion after SELECT count(column) WHERE … for both of the critera. It is also much faster with just one criteria even with order by.

This makes me believe that both the merge of the columns (between the temporary memory tables) and the order by takes its toll. In my case I also found that LIMIT 50 did not improve much if the both column criteria matched lots (25,000 columns 64000 recs) each.

So, apparently it does not use the indexes under some circumstances, which is indicated if you have a look at possible_keys in the reference manual. It also refers to FORCE INDEX and other ways.

One difference is I have InnoDB (for using transactions), and you have MyISAM - For InnoDB there is a chance of using the primary index, with every index - since you use * in the select. But I haven’t had time to check out any improvements.

Since I mainly used SQL Server at work before, now at early retirement I decided for my weather project to use some things that were already made with other tools that included MySQL - I was thinking of using the implementation of NULs in MySQL, which differs quite a bit from what I was used to do. My thought being - a possible faster creation of the temp memory table. But my time is too limited right now to experiment with this.

Time to go to bed now, here in Europe - Hope someone else decides to take part in this, because it looks like you have a problem that I might also hit later on.

CKA · May 3, 2010, 10:41am

bump, still having this issue, getting slower and slower

Bagtjesen · May 3, 2010, 11:14am

Try using the EXPLAIN (ahead of the SELECT statement, i.e. EXPLAIN SELECT …) look for the differences in how the things are being executed.

Have you tested making a multiple-column index - see the refs I hinted at before.

CKA · May 3, 2010, 10:41pm

I don’t need to use EXPLAIN, I get the same data from the SQl profiler I use. Here is a screenshot of the profiler info on the query.

I tried adding a new multie-colum index with “profiles & mp”. The explain shows that this new index listed under “possible_keys” and “key”, but the query time is unchanged, still over 5 seconds.

If I remove the “ORDER BY t.published DESC” the query is fast. Or if I remove the “WHERE t.province=‘’ AND t.mp=‘0’” and keep the ORDER BY, the query is fast. But i need both.

Something weird, I made a dump of my database to test on my local desktop so i don’t screw up the live site. The same query on my local runs super fast, milliseconds. So I copied all the same mysql startup variables from the server to my local to make sure there wasn’t some setting that might be causing this. But even after that the local query runs super fast, but the one on the live server is over 5 seconds.

My database server is only using around 800MB of the 4GB it has available.
here are the related my.ini settings i’m using

[mysqld]
default-storage-engine = MYISAM
max_connections = 800
skip-locking
key_buffer = 512M
max_allowed_packet = 1M
table_cache = 512
sort_buffer_size = 4M
read_buffer_size = 4M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 128M

Try number of CPU’s*2 for thread_concurrency

thread_concurrency = 8

Disable Federated by default

skip-federated

[isamchk]
key_buffer = 512M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 512M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

Bagtjesen · May 9, 2010, 11:14pm

Got a phone call had to wait up for a couple of hours.

Here are a sum up of my tests with my database 600MB+, 2.6 million records, 80 columns in one single table.

the SELECT covers 25000+ for first comparison, 64.000+ for the second.

SELECT recdate, ot, dp FROM oasis FORCE INDEX (ix_OT, ix_DP) WHERE ot < - 15.00 AND dp < -20.00 ORDER BY ot

Without any indexes it took between 40 and 50 seconds.

With indexes created, but with not FORCE INDEX it took 4.5 sec.

With SELECT * and FORCE INDEX it took 1.063 sec.

With SELECT recdate, ot, dp … it took 0nly 0.516 sec.

Note that this is with numeric data, but a lot of hits, and it is LOWER THAN comparisions, not comparison for equals.

Havent tested with an index for two columns.

Try see if you get somthing like this with FORCE INDEX?! Good Luck!

Bagtjesen · May 10, 2010, 12:08am

with an index for two columns and FORCE INDEX (ix_OT_DP) I am now down to 0.047 secs…

Bagtjesen · May 10, 2010, 12:15am

Make shure the Indexes are in the order you want the to be in the sort, ORDER BY. I.e. declare the sortorder to be what you want when creating the indexes. That was a factor 30 for me.