Mass Update

Hi,

I’m not sure if this belongs in this forum or the PHP forum, but I have a database (“inventory-db”) that looks like

The database in phpMyAdmin has over 16,000 lines of data. What I am trying to do is mass update the outofstock field with another spreadsheet that looks like

The spreadsheet may have anywhere from 1,000 to 10,000 lines of data. Currently, I upload the CSV spreadsheet on a password protected web page (inventory.php) via a form post.

<form action="inventory.php" method="post" enctype="multipart/form-data">
<input type="hidden" name="MAX_FILE_SIZE" value="30000" />
Inventory File:&nbsp;&nbsp;&nbsp;<input name="inventory" type="file">
<br><br><input type="submit" name="update" value="Upload Inventory">
</form>

Once the form submits to the same page, I have it go through the CSV file to update the outofstock column in the database.

	$filename = $_FILES['inventory']['tmp_name'];	
	$handle = fopen($filename, "r");

	while (($inventory = fgetcsv($handle, 10000, ",")) !== FALSE)
	{
		$product = mysql_query("SELECT * FROM `inventory-db` WHERE `MPN` = '" . $inventory[0] . "'");

		$count = 0;

		while ($tmp = mysql_fetch_array($product))
		{
			$count = $count + 1;
		}

		if ($count == 0)
		{
			if ($inventory[2] <> "0")
			{
				echo $inventory[0] . " (" . $inventory[1] . ") is not listed on website.<BR>";
			}
		}

		/* Update outofstock */
		if ($inventory[2] <> "0")
		{
			mysql_query("UPDATE `inventory-db` SET `outofstock` ='0' WHERE `MPN` = '" . $inventory[0] . "'");
		}
		else
		{
			mysql_query("UPDATE `inventory-db` SET `outofstock` ='1' WHERE `MPN` = '" . $inventory[0] . "'");
		}
		
	}

So my problem is that this works only about 10% of the time. When I put a larger CSV file through the process, it does not respond. I have done some research on file upload restrictions within php.ini, but I do not see what would be the restriction with the settings that I have. I would greatly appreciate any suggestions on how to fix this current process, or a more efficient way to get the database updated using a secondary spreadsheet of some sort.

Your script may time out for larger files because the operation takes too long. A few basic things to watch out for:

  1. The most important question - is your table InnoDB or MyISAM? MyISAM updates are pretty fast, but InnoDB is very slow by default because the db makes sure the data get flushed to disk after every insert/update. So if your table is InnoDB, surround the whole process in a transaction - you will speed things up many, many times!

  2. Make sure the MPN column is indexed - you are using WHERE MPN = ‘…’ so many times in a loop and if there is no index the performance will be poor.

  3. Your counting method of selecting all rows and looping over them is inefficient. If you only need to check whether a row exists, it's much better to do it in one query:

SELECT EXISTS(SELECT * FROM `inventory-db` WHERE `MPN` = '...')

This will always return one row with one value: 0 or 1 so it’s enough to use mysql_fetch_row() and get the value at index 0.
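If the table does turn out to be InnoDB, point 1 above can be sketched like this, reusing the thread's own `mysql_*` calls and table/column names (a sketch only, assuming an open connection and the `$handle` from `fopen()`):

```php
// Sketch: one transaction around the whole update loop (InnoDB only).
mysql_query("START TRANSACTION");

while (($inventory = fgetcsv($handle, 10000, ",")) !== FALSE)
{
	$mpn   = mysql_real_escape_string($inventory[0]);
	$stock = ($inventory[2] != "0") ? "0" : "1";
	mysql_query("UPDATE `inventory-db` SET `outofstock` = '$stock' WHERE `MPN` = '$mpn'");
}

mysql_query("COMMIT"); // one flush to disk instead of one per row
```

With MyISAM there are no transactions, so this particular trick would not apply.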

if your php code is updating one row at a time in a loop, this is incredibly inefficient

you should be using a joined update, to update all rows at once

alternatively, an INSERT SELECT with ON DUPLICATE KEY UPDATE option

both of these require that the csv used for the update is loaded into a table, but that’s easy with LOAD DATA INFILE
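A sketch of that approach in SQL - note the staging table name `temp` and its column names `mpn` and `qty` are assumptions for illustration, not from the thread:

```sql
-- Load the CSV into a staging table (requires the FILE privilege).
LOAD DATA INFILE '/tmp/inventory.csv'
INTO TABLE `temp`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';

-- Joined update: flag every matching row in one statement
-- instead of one UPDATE per CSV line.
UPDATE `inventory-db` AS i
INNER JOIN `temp` AS t ON i.`MPN` = t.`mpn`
SET i.`outofstock` = IF(t.`qty` <> 0, 0, 1);
```

Rows in `temp` with no match in `inventory-db` are simply ignored by the join, which replaces the "is not listed on website" check if you still need that list.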

Thank you for the response.

  1. The table is myisam.
  2. MPN is indexed.
  3. I have corrected with your suggestion.
	$filename = $_FILES['inventory']['tmp_name'];	
	$handle = fopen($filename, "r");

	while (($inventory = fgetcsv($handle, 10000, ",")) !== FALSE)
	{
		$valid = mysql_fetch_row(mysql_query("SELECT EXISTS(SELECT * FROM `inventory-db` WHERE `MPN` = '" . $inventory[0] . "')"));

		if ($valid[0] == 0)
		{
			echo $inventory[0] . " (" . $inventory[1] . ") is not listed on website.<BR>";
		}
		else
		{
			/* Update outofstock */
			if ($inventory[2] <> "0")
			{
				mysql_query("UPDATE `inventory-db` SET `outofstock` ='0' WHERE `MPN` = '" . $inventory[0] . "'");
			}
			else
			{
				mysql_query("UPDATE `inventory-db` SET `outofstock` ='1' WHERE `MPN` = '" . $inventory[0] . "'");
			}
		}
	}

The CSV file is still not going through. Thank you for teaching me a better, more basic way to check for results with the SELECT query.

r937 - I’m obviously not more than a beginner with this stuff. I tried using Google to help make sense of your message, but I could not get it to perform even the first step that you are suggesting of having to load the CSV file into a table.


mysql_query("LOAD DATA INFILE '" . $filename . "' INTO TABLE `temp` FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\\n'");

With $filename being $_FILES['inventory']['tmp_name']; from the file submitted with the form. Any pointers for me?

How often is the spreadsheet updated?

Have you considered uploading the contents of the spreadsheet into a database table, then doing a joined update like Rudy suggests, and then, once the update is complete, truncating the table that the CSV was uploaded to?

Is the data in the spreadsheet coming from the same app or some external source?

Please be aware that the mysql_* extension is now deprecated as of the current version of PHP and will very likely be removed from the next 5.x version and will likely not be in PHP 6.x (when it eventually is released). You should migrate over to either the mysqli_* extension or to PDO. PDO is a better choice as it doesn’t tie you down so much to a particular database server software.

Once you have migrated you should use Prepared Statements to prevent SQL Injection attacks. Have a read of this article from the PHP manual, it shows how to use prepared statements with PDO and also explains the principle.
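To make the migration concrete, here is a sketch of the thread's update loop rewritten with PDO and a prepared statement - the DSN, username, and password are placeholders you would replace with your own:

```php
// Sketch: the same per-row update with PDO and a prepared statement.
// Connection details below are placeholders, not from the thread.
$pdo = new PDO('mysql:host=localhost;dbname=yourdb;charset=utf8', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Prepare once; execute many times with different values.
$stmt = $pdo->prepare(
    "UPDATE `inventory-db` SET `outofstock` = :stock WHERE `MPN` = :mpn");

while (($inventory = fgetcsv($handle, 10000, ",")) !== FALSE) {
    $stmt->execute(array(
        ':stock' => ($inventory[2] != "0") ? '0' : '1',
        ':mpn'   => $inventory[0],
    ));
}
```

Because the values are bound as parameters rather than concatenated into the SQL string, a malicious MPN value in the CSV can no longer alter the query.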

As far as I know, LOAD DATA INFILE is not always available, depending on the server set-up. First, the db server must reside on the same machine as the web server, and second, you need to have the FILE privilege and the target file needs to be readable by the MySQL server (see docs). Ask your hosting provider if you can use LOAD DATA INFILE.

If you can’t use it then you can create a temporary table, load the data using inserts and then perform the joined update.
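A sketch of that fallback, again assuming a staging table `temp` (with hypothetical columns `mpn` and `qty`) already exists:

```php
// Sketch: batch the CSV rows into one multi-row INSERT when
// LOAD DATA INFILE is not available, then do the joined update.
$rows = array();
while (($inventory = fgetcsv($handle, 10000, ",")) !== FALSE) {
    $rows[] = "('" . mysql_real_escape_string($inventory[0]) . "', "
            . (int)$inventory[2] . ")";
}

// One multi-row INSERT is far faster than one query per CSV line.
mysql_query("INSERT INTO `temp` (`mpn`, `qty`) VALUES " . implode(",", $rows));

// Joined update, as suggested earlier in the thread.
mysql_query("UPDATE `inventory-db` AS i
             INNER JOIN `temp` AS t ON i.`MPN` = t.`mpn`
             SET i.`outofstock` = IF(t.`qty` <> 0, 0, 1)");
```

For very large CSVs you may need to split the INSERT into chunks (say, 1,000 rows each) so the statement stays under MySQL's max_allowed_packet limit.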

BTW, how long does your script run? If there is an error what is it? Does the script update your data partially? 10000 rows is not a lot and in my opinion even the inefficient loop method should complete in less than 30 seconds. Does the mysql server reside on the same machine as your web server or are you connecting remotely?