Group similar results, listing users

that screenshot doesn’t appear to be from the tags table

when you “run the home page” what is the exact query you’re using?

also, would you mind doing an EXPLAIN on it

The screen shot is from the tags table.

The query is taking almost eighty seconds to perform:

SELECT bookmarks.id, source.duplicates, bookmarks.url, bookmarks.title, SUBSTR(bookmarks.snippet, 1, 100) as snippet, bookmarks.datetime, t.tags, l.users FROM (SELECT url, COUNT(*) AS duplicates, MIN(datetime) AS earliest FROM bookmarks GROUP BY url) AS source INNER JOIN bookmarks ON (bookmarks.url = source.url) AND (bookmarks.datetime = source.earliest) LEFT OUTER JOIN ( SELECT bookmark_id, GROUP_CONCAT(tag) AS tags FROM tags GROUP BY bookmark_id ) AS t ON t.bookmark_id = bookmarks.id LEFT OUTER JOIN ( SELECT bookmark_id, GROUP_CONCAT(user_id) AS users FROM links WHERE status = ‘public’ GROUP BY bookmark_id ) AS l ON l.bookmark_id = bookmarks.id ORDER BY bookmarks.datetime DESC LIMIT 0, 9

Having run the query through phpMyAdmin, the users column is returning [BLOB - 1B], and the tags for one of the rows is returning NULL.

I’ve attached a screen shot for the EXPLAIN.

query looks fine (after i took the trouble to re-format it… why did you not preserve the indents and line breaks?)

links table looks like it might need same sort of de-construction/re-construction as we did with the tags table (compare lines 5 and 6 of the EXPLAIN)

bookmarks needs to have its indexes looked at too (see line 7)

I build all of the queries with conditional statements in PHP, based on various parameters, so there’s no need for indentation.

I just copied straight from the browser window, having asked PHP to echo the statement.

Not knowing much about what EXPLAIN does, I’m guessing it’s the “Using filesort” that’s slowing the whole thing down?

Both links and tags have identical configurations for their indexes, et cetera (see the previous screen shot for an example).

I’ve attached a screen shot for the bookmarks table.

Run this directly in MySQL and post the result as an image:


EXPLAIN SELECT bookmarks.id, source.duplicates, bookmarks.url, bookmarks.title, SUBSTR(bookmarks.snippet, 1, 100) as snippet, bookmarks.datetime, t.tags, l.users FROM (SELECT url, COUNT(*) AS duplicates, MIN(datetime) AS earliest FROM bookmarks GROUP BY url) AS source INNER JOIN bookmarks ON (bookmarks.url = source.url) AND (bookmarks.datetime = source.earliest) LEFT OUTER JOIN ( SELECT bookmark_id, GROUP_CONCAT(tag) AS tags FROM tags GROUP BY bookmark_id ) AS t ON t.bookmark_id = bookmarks.id LEFT OUTER JOIN ( SELECT bookmark_id, GROUP_CONCAT(user_id) AS users FROM links WHERE status = 'public' GROUP BY bookmark_id ) AS l ON l.bookmark_id = bookmarks.id ORDER BY bookmarks.datetime DESC LIMIT 0, 9 

oddz, is that the same query? he posted the EXPLAIN in post #42

i’m glad you’re here, it helps to have another set of eyes on these indexes…

can’t believe I missed that.

You will probably need to add some indexes but this should put you in the right direction. I’m running a little short on time but the main thing is that there were 2 unnecessary subqueries. Also, you will need a index on links: (bookmark_id,status) - if it doesn’t exist. You may need others but that is the main one I believe. The query is untested but you should get the idea. Suffice to say – always eliminate unnecessary subqueries to achieve a more optimized query – when possible. In this case it so happens that it is.


SELECT 
      b.id
     ,s.duplicates
     ,b.url
     ,b.title
     ,SUBSTR(b.snippet, 1, 100) snippet
     ,b.datetime
     ,GROUP_CONCAT(t.tag) tags
     ,GROUP_CONCAT(u.user_id) users 
  FROM 
     (SELECT 
            url
           ,COUNT(*) AS duplicates
           ,MIN(datetime) AS earliest 
        FROM 
           bookmarks 
       GROUP 
          BY 
           url) AS s 
  INNER 
   JOIN 
      bookmarks b
    ON 
      b.url = s.url
    AND 
      b.datetime = s.earliest 
   LEFT OUTER
   JOIN
      tags t
     ON
      b.id =  t.bookmark_id    
   LEFT OUTER 
   JOIN 
      links l
     ON
      b.id  = l.bookmark_id
    AND
      l.`status` = 'public'
 GROUP
    BY
      b.id
 ORDER 
    BY 
      b.datetime DESC     
 LIMIT 
      0, 9 

on the contrary, they are very much needed

bookmark to tags in one-to-many, and bookmark to links is one-to-many, and you will get cross join effects by joining to both at the same time

e.g. a bookmark with 3 tags and 4 links will generate 12 rows in the FROM clause, and your GROUP_CONCATs will be all shot to hell (unless you throw DISTINCT into them, but what’s the point of that, it’s way inefficient)

whereas joining to derived tables ensures that you join a bookmark to only one row for the tags, and again only one row for the links

so the subqueries are not unnecessary

yeah… you’re right.

What I would recommend is profiling each of those subqueries than to see if better optimization can be achieved on an individual by individual basis. The original explain isn’t looking to hot.

Hi and thanks for your time!

The query returned with an error:

#1054 - Unknown column ‘u.user_id’ in ‘field list’

you don’t want to use that query anyway :slight_smile:

What do I need to do next?

Breaking each of the queries down and testing them makes sense.

that’s right, take each of the subqueries, run them separately, with an EXPLAIN on them

OK, I’m onto to it!

I shall return…

OK, I’ve absolutely no idea what’s going on here with EXPLAIN.

I’ve had a look around, trying to understand, but it’s so beyond me, I don’t even know where to begin.

Right now, I don’t even know what I don’t understand, which is troubling.

There’s just far too much to learn and I simply don’t have the time to commit to this project, when client work is more pressing.

I can’t expect people to just do all of this for me, so I think the best thing to do is call the whole thing off, and then come back to this, perhaps, at some later date.

Guys, thanks for your time! Always appreciated.

call the whole thing off?? when you were so close??

:frowning:

Yeah, but you’re doing all of the work, which just isn’t right.

If I could take the advice and run with that, that would be fine. But I have no idea what I’m doing!