Counting products in each category - php loop getting slow

Greetings,

I have a category page that lists all the products in a category. In addition, there is a side-navigation menu with a list of say 100 sub-categories along with an item count, kind of like this example:

Art:

  • Paintings (20 items)
  • Sculptures (2 items)
  • Carvings (4 items)

On my navigation script, I go through a PHP loop of each sub-category within Art and count all of the products within those sub-categories. I have it set up where it records the timestamp so it doesn’t count every time the page is loaded, only updates once per hour. Otherwise it will display the count stored in the database instead (which is quicker). The problem now is I have 50,000 items and this counting method is starting to get a bit slow.

My count script is currently like this within the sub-category loop:

Category “Art” is parent categorylevel1, while a sub-category like “Paintings” is categorylevel2. I also allow members to list a product in 2 different categories (category1 and category2):

if ($timestamp > 1 hour...) { //loops through 100 sub-categories to count products in each subcategory and store count in database.
	  $sqlcount = "SELECT COUNT(*) as num FROM products WHERE (category1level2 = '$categorylevel2id' OR category2level2 = '$categorylevel2id')";
	  $result = mysql_fetch_array(mysql_query($sqlcount));
	  $count = $result['num'];
	  mysql_query("UPDATE categories SET count = '$count', timestamp = '$onehour' WHERE id = '$categorylevel2id'") or die(mysql_error());
}

Let me know if there is a better system. More efficient MySQL query? Cron Jobs?

Thanks

i use a parent_id for each subcategory and in the product table i store the category_id for each product .
now i have only to make a join query to retrieve the related record.
its far better to perform a sql query instead of for loop .

Thanks for the tip. I also have the parent category ids for each sub category and the global category id of the entire category itself stored in my products table, so I’ll try experimenting with this a bit.

Although, I’m not certain on how to retrieve a count of all of the products listed in each subcategory without having to do the loop?

Kind regards

UPDATE:

I’ve tried doing something like this to get out of php looping mysql counting queries, but it still takes a long time:

$sql = "
SELECT categories.*,
		(
			SELECT	count(*)
			FROM    products
			WHERE	(products.category1level2 = categories.level2id OR products.category2level2 = categories.level2id)
		) AS productcount
FROM categories WHERE level1id = '$level1id'";

It works, looks simple, but it’s taking just as long as it was before. Let me know if there is a way to improve this.

Thanks

Well, I’ve come up with another method using JOIN with the categories and items table:

$sql = "
SELECT categories.*, COUNT(products.id) AS productcount 
FROM categories
LEFT OUTER JOIN products ON (categories.level2id = products.cat1level2 OR categories.level2id = products.cat2level2)
WHERE level1id = '$level1id'
GROUP BY categories.level2id";

This is still taking just as long, up to 20-30 seconds for 100+ subcategories. Is this the way I’m supposed to be doing it or is there another way?

Thanks

EDIT:

In addition, even if I do something like this to JOIN categories/products tables without counting (and without grouping), it still takes just as long. Some parent categories might hold 100 sub-categories but only have 2 products total within all sub-categories, and it STILL takes 20-30 seconds:

$sql = "
SELECT categories.*, products.id
FROM categories
LEFT OUTER JOIN products ON (categories.level2id = products.cat1level2 OR categories.level2id = products.cat2level2)
WHERE level1id = '$level1id'

please do a SHOW CREATE TABLE for each table, so that we can see the columns as well as the indexes

Here is my categories table with relevant columns:

CREATE TABLE categories (
id smallint(5) NOT NULL AUTO_INCREMENT,
level1id smallint(5) unsigned NOT NULL,
level2id smallint(5) unsigned NOT NULL,
level3id smallint(5) unsigned NOT NULL,
level4id smallint(5) unsigned NOT NULL,
level5id smallint(5) unsigned NOT NULL,
level6id smallint(5) unsigned NOT NULL,
parent tinyint(1) NOT NULL,
count int(8) NOT NULL,
timestamp datetime NOT NULL,
PRIMARY KEY (id)
) ENGINE=MyISAM AUTO_INCREMENT=29048 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT

Here is my products table with relevant columns:

CREATE TABLE items (
id int(8) NOT NULL AUTO_INCREMENT,
global_cat_id smallint(5) NOT NULL,
cat1level1 smallint(5) NOT NULL,
cat1level2 smallint(5) NOT NULL,
cat1level3 smallint(5) NOT NULL,
cat1level4 smallint(5) NOT NULL,
cat1level5 smallint(5) NOT NULL,
cat1level6 smallint(5) NOT NULL,
global_cat_id2 smallint(5) NOT NULL,
cat2level1 smallint(5) NOT NULL,
cat2level2 smallint(5) NOT NULL,
cat2level3 smallint(5) NOT NULL,
cat2level4 smallint(5) NOT NULL,
cat2level5 smallint(5) NOT NULL,
cat2level6 smallint(5) NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=20870 DEFAULT CHARSET=utf8

In categories table, the level1id is the parent category id of level2id, and level2id is the parent category id of level3id and so on up to 6 levels.

In the items table, cat1level1 is the parent category id of cat1level2 and so on. We also allow people to select a second category, which are 6 levels of “cat2level#”. global_cat_id corresponds to the category “id” column, the actual id number of the category.

I hope this helps.

the immediate problem, poor execution time, is due to the absence of indexes on the join columns –

categories.level2id = products.cat1level2 OR categories.level2id = products.cat2level2

as well as the absence of an index on the filter column –

WHERE level1id = ‘$level1id’

but a deeper, more pernicious problem, is the structure of the tables in having a built-in hierarchy

please do some research on first normal form and then ask if you need further help

Thanks for the help, I’ve added the 18 relevant columns (the 6 in categories table and 12 from items table) into indexes, and it appears to be working much faster now. Hopefully it’s the real thing and not a query from the cache.

Do you know how well the performance speed will be after there are millions of items in the item table?

Thanks
Kind regards

approximately the same :slight_smile:

One last thing… Let’s say my items table has a “live” column that is either 0 for not live (disabled) and 1 for live. Is there a way to count the items only when items.live = 1?

Here is what I got so far:

SELECT categories.*, COUNT(items.id) AS count1 FROM categories
LEFT OUTER JOIN items ON (categories.level2id = items.cat1level2 OR categories.level2id = items.cat2level2)
WHERE level1id = '$level1id'
GROUP BY categories.level2id

Like instead of “COUNT(items.id) AS count1” , do something like “COUNT(items.live = 1) AS count1” ?

Thanks

UPDATE:

I replaced “COUNT(items.id) AS count1” with: SUM(CASE WHEN items.live = ‘1’ THEN 1 ELSE 0 END) AS count1 . In addition, I added items.live column to the index.

This seems to be working well, but I’m not sure if this is the proper way or if it will degrade performance later on when millions of items are added - since this is a SUM function instead of COUNT. I was going to use “COUNT(CASE WHEN items.live = ‘1’ THEN 1 ELSE 0 END)”, but all empty sub-categories were showing “1” item by default instead of “0”.

Let me know what you think.

Thanks
Kind regards

Seems like a lot of code bloat.

Enter the number of items in the DB once and then do +1 or -1 each time you add or remove an item.