peppy — 2014-06-30T00:43:20-04:00 — #1
I have a site with 10,000+ pages and it slows down when visiting page 9000 for instance. I know the LIMIT/OFFSET is the problem and that I could use MySQL to select rows that start by using an "ProductID" indicated in the pagination links.
However, I need to be able to sort by Price and TimeStamp, so the ProductID will be completely out of order.
In addition, I found a very similar site that works fast, has many sorting options and doesn't seem to use any kind of "starting point" in their pagination links: Example Site
If anyone has an idea on what this example site is doing with their MySQL, I would greatly appreciate your help in explaining.
mittineague — 2014-06-30T00:57:20-04:00 — #2
What is(are) the index(es) you're using to query the table(s)?
Having the proper indexes will take up some space but it should help in speed.
r937 — 2014-06-30T05:42:15-04:00 — #3
this is an astonishing number
i have never heard of anyone with the patience to hit the "next" link on a search results page 9000 times
peppy — 2014-06-30T13:23:10-04:00 — #4
Even so, I would still like a pagination system like this. If ebay and that smaller example site could do it, I would also like to learn how to do the same thing.
r937 — 2014-06-30T13:26:52-04:00 — #5
could you respond to post #2 please
peppy — 2014-06-30T13:29:09-04:00 — #6
I have a indexes on ProductID, Price and TimeStamp. ProductID is the primary id index while Price and TimeStamp are the two columns for sorting.
r937 — 2014-06-30T13:34:29-04:00 — #7
okay, that helps
could you also do a SHOW CREATE TABLE for the tables, and then show your query with a typical ORDER BY clause
peppy — 2014-06-30T14:10:15-04:00 — #8
Here is the SHOW CREATE TABLE for the products table with the relevant columns:
CREATE TABLE `items` (
`id` int(8) NOT NULL AUTO_INCREMENT,
`seller` varchar(30) NOT NULL,
`title` varchar(85) NOT NULL,
`listingtype` tinyint(1) NOT NULL,
`created` datetime NOT NULL,
`endtime` datetime NOT NULL,
`live` tinyint(1) NOT NULL,
`currentprice` decimal(10,2) NOT NULL,
`binprice` decimal(10,2) NOT NULL,
PRIMARY KEY (`id`),
KEY `seller` (`seller`),
KEY `title` (`title`),
KEY `created` (`created`),
KEY `endtime` (`endtime`),
KEY `live` (`live`),
KEY `listingtype` (`listingtype`),
KEY `currentprice` (`currentprice`),
KEY `binprice` (`binprice`)
) ENGINE=InnoDB AUTO_INCREMENT=500000 DEFAULT CHARSET=utf8
There are more columns here that are relevant to our queries that you should know about but it could make it more complex for answering this question.
The "listingtype" column has a number (1 or 2) to tell us if the item is an "auction" or a "buy-it-now" item. We sometimes use the "listyingtype" column with the "binprice" column to see if an item is an auction with a "buy-it-now" option, if the binprice is > 0.00.
The "live" column tells us a number where "1" means the item is live and "2" means it is sold out and "3" means it is unsold.
The "created" column is the time an item is created, so we can sort newest listings first.
The "endtime" column is the time an item ends so we can sort which auctions/items are ending soonest.
The "currentprice" column is the current price of an item whether it is an auction or buy-it-now item, we use this to sort price (highest/lowest).
// Display ALL items, newest items
SELECT * FROM items USE INDEX (created) WHERE live = 1 ORDER BY created DESC LIMIT 0, 25
// Display ALL items, ending soonest
SELECT * FROM items USE INDEX (endtime) WHERE live = 1 ORDER BY endtime ASC LIMIT 0, 25
//Display ALL items, highest price first
SELECT * FROM items USE INDEX (currentprice) WHERE live = 1 ORDER BY currentprice DESC LIMIT 0, 25
//Display BIN (Buy It Now available) items, newest items
SELECT * FROM items USE INDEX (created) WHERE binprice != 0.00 AND live = 1 ORDER BY created DESC LIMIT 0, 25
//Display BIN items, ending soonest
SELECT * FROM items USE INDEX (endtime) WHERE binprice != 0.00 AND live = 1 ORDER BY endtime ASC LIMIT 0, 25
//Display BIN items, highest price first
SELECT * FROM items USE INDEX (currentprice) WHERE binprice != 0.00 AND live = 1 ORDER BY currentprice DESC LIMIT 0, 25
//Display AUCTION-ONLY items, newest items
SELECT * FROM items USE INDEX (created) WHERE listingtype = 1 AND live = 1 ORDER BY created DESC LIMIT 0, 25
//Display AUCTION-ONLY items, ending soonest
SELECT * FROM items USE INDEX (endtime) WHERE listingtype = 1 AND live = 1 ORDER BY endtime ASC LIMIT 0, 25
//Display AUCTION-ONLY items, highest price first
SELECT * FROM items USE INDEX (currentprice) WHERE listingtype = 1 AND live = 1 ORDER BY currentprice DESC LIMIT 0, 25
These are the more simple queries for page 1 of the homepage, each page having 25 items. On our member pages, we need to add the "WHERE seller = 'joe' " condition, or on our category pages, we add a category number and etc.
r937 — 2014-06-30T14:25:58-04:00 — #9
everythinig looks in order
so if you're experiencing slowdowns after a few thousand pages, perhaps you will have to consider modifying the queries whenever the "next" link is clicked
so if the first time you run a query you use this --
ORDER BY currentprice DESC LIMIT 0, 25
then what you could do is save the lowest price returned, then use that in the subsequent query call, and instead of using LIMIT 25,25, add a WHERE condition with the saved price --
WHERE currentprice < 937.42 ORDER BY currentprice DESC LIMIT 0, 25
peppy — 2014-06-30T14:43:19-04:00 — #10
I came across this solution before and there are a few problems. First off, we would have to use a pagination link for page 2 link this so the price is "remembered" for the next query: "?page=2¤tprice=937.42 . A problem also happens when a search engine indexes a site, they record the URLs "?page=2¤tprice=937.42". Then weeks later, they return directly to that URL and the currentprice and page numbers could be completely out of sync by then. There would be countless "currentprice" values being indexed, in addition to "created" and "endtime" parameters for sorting by newest listings or ending soonest listings...
That example site that I linked to in my first post doesn't include the price in their link, yet you can jump to the 2000th page very quickly. They are using a parameter: "?start=99850&keywords=&orderType=DESC&orderField=price . Ebay also uses a similar method where the prices parameter aren't included in the pagination links.
I'm trying to figure out how that example site is doing it so cleanly.
r937 — 2014-06-30T17:54:46-04:00 — #11
no, you wouldn't save two parameters, just one
and it wouldn't have to be in the url, it could be in a cookie
but no matter how you do it, i don't think i can help you with your page 9000 problem as far as the database is concerned
maybe it has to do with sort_buffer_size, but i am not a DBA
according to http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html you should do an EXPLAIN on the slow query and see if it uses filesort
peppy — 2014-06-30T18:27:16-04:00 — #12
I've dug deeper into what eBay and that example site's pagination. It seems like if I go to a high enough page on eBay, it throws an error saying it can only display up to 10,000 results and won't let you go further, even though there are page numbers available. The same thing happens after 100,000 items on that example site as well - again, they have additional pages available but they give up blank pages.
I'm thinking of limiting the pagination to 50,000 items instead of laying out the pagination for the full 500,000 items to help ensure load time is less than 3 seconds on the final page.
I'm not sure how the search engines are going to handle though...
wwb_99 — 2014-06-30T19:58:06-04:00 — #13
When you get past 1000 or so you've got a search problem not a pagination problem . . .
peppy — 2014-07-01T16:43:43-04:00 — #14
Do you mean an SEO search problem or a database search problem?
peppy — 2014-07-01T17:27:26-04:00 — #15
Sorry Mittineague, like you, I also had very little sleep and very little coffee, so I am confused. I've posted a question related not just to the MySQL search/query, but also the implications of SEO search as well...
I was notified by email of your post, but cannot find it here in this thread... which makes it even more confusing... Is there some hidden discussion going on that makes sense to everyone else except me?
mittineague — 2014-07-01T18:57:30-04:00 — #16
I guess I wasn't as awake as I thought I was
It wasn't until after I submitted the reply that I realized you were the OP :footmouth:
So I deleted it. Sorry for the confusion and my mistake.
I can't speak for wwb_99, but my guess is he is talking "cause vs. symptoms" not SEO
wwb_99 — 2014-07-02T15:08:26-04:00 — #17
Yup -- if you are presenting 1000 results for a human to parse through then whatever you are doing isn't filtering the haystack enough to begin with. Or, this might be a problem but you really shouldn't waste time solving it and rather focus on getting things so you don't need 10k results.
thereddevil — 2014-07-06T05:51:59-04:00 — #18
For 9000 pages the slowdown could be fixed by throwing more resources at the database or optimizing the settings if that has not been done.
If you want to fix the problem at the root, take a look on this thread for the articles I posted links to:
In regards to sorting by price/timestamp and search engines, this will not really work, since the result will change as changes are made. Due to this I would recommend adding nofollow to those links, that way search engines will only go through your products the normal way, and not by price or timestamp. I.e. you will still be indexed and those results can be viewed by users that click a link.