How do you deal with database transaction deadlocks in PHP code?

More and more often I come across the problem of deadlocks in MySQL. I like to enclose related SQL updates, inserts and deletes in a transaction to ensure the operation is atomic, so I don’t end up with a partially executed action if something goes wrong. Transactions also speed things up a bit. But sometimes a query in a transaction ends up with this error:

Error #1213: Deadlock found when trying to get lock; try restarting transaction

I’ve read up on deadlocks and I know there are certain steps that can be taken to minimize their occurrence, like always updating tables in the same order, etc. However, as the application grows more and more complex, it’s difficult to guarantee deadlock-free transactions: there are many different functions, methods, modules, stored procedures, etc. that execute within various transactions, and coordinating all those queries so that they never deadlock in any scenario seems like an almost impossible task. Moreover, the MySQL manual says that deadlocks are a natural thing in complex databases; the client simply needs to restart the transaction, and if deadlocks don’t occur often they don’t pose a problem. So I am now at this stage: deadlocks happen rarely, but they do happen, and I want to be prepared for that.

So here’s my question - how to restart transactions elegantly in PHP?

One thing I have done so far is make the query() method in my Db class detect deadlock error #1213 and retry the query up to 3 times. This was easy to implement because I have a central Db class, but it only retries a single query, not the whole transaction. This helps in many (most) cases, but not always.
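
Roughly, the retry looks something like this (simplified; the PDO details here are just illustrative):

<?php
class Db {
    /** @var PDO */
    private $pdo; // assumes PDO::ERRMODE_EXCEPTION is set

    public function query($sql, array $params = array()) {
        for ($attempt = 1; $attempt <= 3; $attempt++) {
            try {
                $stmt = $this->pdo->prepare($sql);
                $stmt->execute($params);
                return $stmt;
            } catch (PDOException $e) {
                // MySQL reports a deadlock as driver error 1213
                $isDeadlock = isset($e->errorInfo[1]) && $e->errorInfo[1] == 1213;
                if (!$isDeadlock || $attempt == 3) {
                    throw $e;
                }
                usleep(100000); // brief pause before retrying the single query
            }
        }
    }
}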

When I start a transaction, insert a new row, and then the next query causes a deadlock, the new row can be lost and I can’t simply retry the second query and continue. I have to start the whole transaction again. Here is some illustration code:


class Product {
  public function save() {
    $this->db->startTransaction();

    // INSERT new row to product table
    // ...

    // INSERT a few rows to product_category table
    // ...

    // INSERT a few rows to product_accessory table
    // ...

    // UPDATE a table which holds products indexed for search engine
    // ...

    // DELETE old entries in some other table...
    // ...

    // some other more db operations involving INSERT, UPDATE and DELETE
    // ...

    $this->db->commit();
  }
}

Each of the above operations can potentially end in a deadlock and I’d have to start from scratch. What would be the best way to handle this? The only solution I can see is to move the code from each method that uses a transaction into another method (e.g. saveWithTransaction()) and have the original method (save()) use a loop that runs the transaction method, catches any DeadlockException and retries if necessary. However, this seems ugly, and I’d have to litter a lot of my code with transaction-restarting mechanisms in places that (I think) should not have to deal with such issues. Any ideas?

Have you started analyzing your database logs to figure out which table is locked at the time you are attempting a write, delete, or update command?

Because of sheer volume, and the reporting we were doing during business hours at my prior job, we made 98% of our SELECT queries use the WITH (NOLOCK) hint [SQL Server only], but MySQL has a similar facility.
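
In MySQL the rough equivalent is the READ UNCOMMITTED isolation level (a dirty read, like NOLOCK). A minimal sketch of setting it per connection from PHP, with placeholder connection details:

<?php
// Placeholder DSN and credentials - adjust for your setup
$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');

// Allow dirty reads on this connection only, similar in spirit to NOLOCK:
// SELECTs can see uncommitted changes instead of waiting on other transactions
$pdo->exec('SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED');

$rows = $pdo->query('SELECT id, name FROM product')->fetchAll(PDO::FETCH_ASSOC);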

Once we did that, our deadlocks disappeared quickly, but we still encountered a few of them, which I later tracked down to the code requesting data from the same set of tables repeatedly within a loop (600 times!). Needless to say, writing a better query and updating the code for that situation solved that one.

If you can identify the table being deadlocked, search your code for when and how that table is utilized and ask yourself if it could be improved.

I have logs of all queries that result in an error - there are a few different queries that cause a deadlock. I think I know which tables are locked, because I can see which tables the problematic queries are trying to affect - if that’s how it works.

Are you saying that in your case SELECT queries were causing deadlocks? I’ve never seen a SELECT cause a deadlock here - it’s always an insert, update or delete.

For example, there was an insert into the product table, followed by an update of all rows in that table to refresh a calculated column. The update takes about 1 second, but if another thread tries to do the same - insert and then update - while the first update is still running, the deadlock occurs. I managed to solve this by moving the update to after the end of the transaction. Now there are very few problematic queries left and I’ll be trying to improve them, but what I’m asking here is more about dealing with deadlocks and restarting transactions in the application code.

What database engine are you using for the tables concerned?

InnoDB.

No, in my case the SELECT was killed because it had to wait too long for the prior transaction, which had a lock on the table, to finish. So by telling the SELECT it could read uncommitted data, we got around that issue.

Hmm, you have to update all rows of a table because it stores a calculated result? I’ve never been much of a fan of storing calculated results, especially ones that have to be updated routinely (unless they are used for reporting purposes only and can be refreshed off-hours instead of in real time). IMO, that is probably a design issue that should be analyzed to see how feasible it would be to remove that calculated column and do the calculation in the getter of that object’s property.

But seeing how it is what it is, this will not be an easy situation to avoid when you have multiple users invoking similar transactions against your database. Any query that must update ALL rows is going to lock the entire table, preventing a second user from running a similar transaction (regardless of how fast it is - it will eventually slow down as the number of records grows over time). I’ve prevented users from invoking simultaneous long-running requests by creating a queue and processing the requests FIFO (First In, First Out), but that is usually a pretty big overhaul too when working with an existing code base.

The gist of the queue (as I wrote it many years ago): the user sends a request for a known long-running or lock-intensive query. The request goes to the Queue Manager, and the Queue Manager hands back an ID they can use to check the status of their query (Pending, In Progress, Completed, Failed). The calling application is then in charge of keeping the user updated (think “Please wait while we process your request…” refreshing every X seconds so it can check the status of the queued request). The Queue Manager processes each item in its queue every minute, or whatever time frame is acceptable, and updates the status of each request to Completed/Failed accordingly. The calling application sees Completed/Failed and shows the user the appropriate Success/Failure message.
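
Sketched in PHP, the skeleton might look something like this (the table, class and method names are made up for illustration; the worker side would run from cron or similar):

<?php
class QueueManager {
    /** @var PDO */
    private $pdo;

    public function __construct(PDO $pdo) {
        $this->pdo = $pdo;
    }

    // Client side: enqueue a job and hand back an ID to poll with
    public function enqueue($payload) {
        $stmt = $this->pdo->prepare(
            "INSERT INTO job_queue (payload, status) VALUES (?, 'pending')");
        $stmt->execute(array($payload));
        return $this->pdo->lastInsertId();
    }

    public function getStatus($id) {
        $stmt = $this->pdo->prepare('SELECT status FROM job_queue WHERE id = ?');
        $stmt->execute(array($id));
        return $stmt->fetchColumn(); // pending | in_progress | completed | failed
    }

    // Worker side: called from cron, processes one job at a time (FIFO)
    public function processNext() {
        $stmt = $this->pdo->query(
            "SELECT id, payload FROM job_queue
             WHERE status = 'pending' ORDER BY id LIMIT 1");
        $job = $stmt->fetch(PDO::FETCH_ASSOC);
        if (!$job) {
            return;
        }
        $this->setStatus($job['id'], 'in_progress');
        try {
            // ... run the long/lock-intensive query for $job['payload'] ...
            $this->setStatus($job['id'], 'completed');
        } catch (Exception $e) {
            $this->setStatus($job['id'], 'failed');
        }
    }

    private function setStatus($id, $status) {
        $stmt = $this->pdo->prepare('UPDATE job_queue SET status = ? WHERE id = ?');
        $stmt->execute(array($status, $id));
    }
}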

Granted, you can still get deadlocks from items that are not process-intensive and do not go through this manager (which is ultimately why I gave up on continuing it and instead looked into why the queries were intensive to begin with, and whether they really needed to be). The manager also slows down any process that isn’t resource-intensive, which isn’t cool. So overall it was never really a good solution, but rather a lesson learned: look into why your query/transaction is locking tables for longer than it needs to. Updating ALL rows is something that should be looked into.

Hope that helps.

If you’re on PHP >= 5.3 you could use anonymous functions as callbacks. Something like


<?php
$this->db->transactional(function($db) {
    $db->insert('stuff');
    $db->update('stuff');
    $db->delete('stuff');
});

where transactional is something like:


<?php
class MyDatabase
{
    // ...
    public function transactional($callback)
    {
        for ($i = 0; $i < 3; $i++)
        {
            try {
                $this->beginTransaction();
                $callback($this);
                $this->commit();
                return; // success - don't run the transaction again
            } catch (MyDeadlockDetectedException $e) {
                // roll back the failed attempt before retrying
                $this->rollBack();
                if ($i == 2)
                {
                    // give up and rethrow if it happens 3 times in a row
                    throw $e;
                }
                // Also, may want to sleep() for a bit here to give
                // the db some breathing space
            }
        }
    }
}

Off Topic:

If you ever need to write something like this again you may want to look into beanstalkd or similar, to just keep processing a queue all the time with no overlap, guaranteed. And no need for cron-esque juggling of data :slight_smile:

It was actually a .NET system (the idea easily crosses platforms, but keep in mind it isn’t a guaranteed solution; the issue will likely crop back up somewhere else), and so I used a Windows Service so it could be event-driven (something sent to the queue, a timed interval, a constant run, etc.). I really like Windows Services in .NET; they are a lot of fun to code and can be used for all sorts of long-running processes.

So I suppose my case is a bit different because I don’t see any selects being killed.

I’d be happy to get rid of updating all the rows if I could achieve what I want a different way. Unfortunately, doing the calculation in the getter is of no use, and it wouldn’t even be possible without reading all rows in the table. This is a calculated integer column that I use to simulate a missing feature: ordering results with a natural sorting algorithm, so that, for example, Product A5 Black comes before Product A10. I have a stored procedure that reads all product names and updates the indexed name_sort column with a numeric rank, and then I just use ORDER BY name_sort - natural sorting is done lightning fast on the live site. Since adding a new product or changing a product name doesn’t happen very often, waiting 1 second for the update is no problem at all, and having other threads wait until the update is finished is no problem either. Most probably my mistake was to run this procedure inside a transaction, since I can’t see how two separate transactions updating all rows can both complete when run at the same time (my desired behaviour is for the latter one to wait until the first one finishes).
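
To illustrate the idea (the real thing is a stored procedure; this is just a hypothetical PHP equivalent): sort all names with PHP’s natural-order comparison and write sequential ranks back into name_sort:

<?php
// Hypothetical PHP equivalent of the stored procedure: assign each
// product a numeric rank so that ORDER BY name_sort gives natural order.
$products = $pdo->query('SELECT id, name FROM product')
                ->fetchAll(PDO::FETCH_ASSOC);

// Natural, case-insensitive comparison: "A5" sorts before "A10"
usort($products, function ($a, $b) {
    return strnatcasecmp($a['name'], $b['name']);
});

$update = $pdo->prepare('UPDATE product SET name_sort = ? WHERE id = ?');
foreach ($products as $rank => $product) {
    $update->execute(array($rank, $product['id']));
}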

As a last resort I can implement a much simpler version of queuing: a simple class that, when multiple users try to add or update a product at the same time, uses lock files to make the subsequent scripts sleep until the first action is finished.
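
Something along these lines, using flock(), which blocks until the first request releases the lock (the file path and class name are just placeholders):

<?php
class FileMutex {
    private $handle;

    // Blocks until no other process holds an exclusive lock on the file
    public function acquire($path = '/tmp/product-update.lock') {
        $this->handle = fopen($path, 'c'); // create if missing, don't truncate
        flock($this->handle, LOCK_EX);
    }

    public function release() {
        flock($this->handle, LOCK_UN);
        fclose($this->handle);
    }
}

// Usage: concurrent requests sleep here until the first one finishes
$mutex = new FileMutex();
$mutex->acquire();
// ... insert/update the product, refresh name_sort ...
$mutex->release();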

Hey, that is very clever! I’m on PHP 5.3 so I might actually use that idea, thanks!