Optimizing update of 10,000 rows

For our e-commerce store, we have a drop shipper with almost real-time data available on stock available for every item. We have created a script that downloads their 13MB XML file of data and updates the stock accordingly for all 10,000 products every 4 hours. The problem is that the script is doing 10,000 UPDATE commands as it loops through the products every 4 hours which brings the MySQL server to a hault.

Is there a better way to do this?

load the xml file into a separate table, then do a multi-table update

So would inserting 10k items, then doing the multi-table update be less taxing on mysql? I’m researching multi-table updates now…

yes, i think it would be less taxing

it would be one UPDATE statement for all 10000 rows, rather than 10000 separate UPDATE statements

I guess I need to learn more about multi-table update - from what I can tell it would still require 10,000 update statements.

nope, just one statement

UPDATE my_master_table
INNER
  JOIN my_trans_table 
    ON my_trans_table.id = my_master_table.id
   SET my_master_table.price = my_trans_table.price

rows are matched on id, so if there are 10000 different ids in the transaction table, there will be 10000 rows updated in the master table

Although there are less queries, it appears that the multi-table update still uses up a lot of mysql resources and brings the mysql server to a hault. The only real solution is to update in small increments throughout the day instead of all 10k at once.

13MB XML file should be something like 1.3MB of real data that you need updated.

Once update to write 1.3MB will not slow down your server to much, that is, unless your indexes are not used/messed up.

Just describe your update queries, and see if your using your indexes or not.

Then, if that slows down your site more than you want to, you can always do 1 update a second, so your 10k records will take 2.77h to finish updating.

But you usually do this when networking is the bottleneck, and from your description, your bottleneck is the CPU. In witch case, it will be allot faster to do one update instead of 10k, since your indexes will only be rebuilt once, (depending on what your updating).

What do you mean by this?

DESCRIBE
>> http://dev.mysql.com/doc/refman/5.0/en/describe.html

i think he means EXPLAIN
>> http://dev.mysql.com/doc/refman/5.4/en/using-explain.html

Tomato/Tomato, both work.


mysql> desc select * from log limit 10;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
|  1 | SIMPLE      | log   | ALL  | NULL          | NULL | NULL    | NULL | 5190 |       | 
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

mysql> explain select * from log limit 10;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
|  1 | SIMPLE      | log   | ALL  | NULL          | NULL | NULL    | NULL | 5190 |       | 
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

mysql> describe select * from log limit 10;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
|  1 | SIMPLE      | log   | ALL  | NULL          | NULL | NULL    | NULL | 5190 |       | 
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

I may, however, passed in the docs for the table desc only, if that’s the case, the link provided by r937 is the good one.