Why is this query taking forever to run?

Hi,

I have a table with a few 100K records. I have an index on column PiNumber - but when I run the following query in phpmyadmin it will run for
30+ minutes …


SELECT PiNumber
FROM data_table
WHERE PiNumber IN (
    SELECT PiNumber
    FROM data_table
    GROUP BY PiNumber
    HAVING COUNT(PiNumber) > 1
    )
ORDER BY PiNumber

But … when I just do this simple query it runs in under a second


SELECT PiNumber
FROM data_table
GROUP BY PiNumber
HAVING COUNT( PiNumber ) >1

When i run show processlist, the query is stuck in state ‘Sending data’
any thoughts/suggestions?

why do you need the first query at all? Why not just add the order by to the correct second query?

This is a known performance limitation of MySQL.

precisely :slight_smile:

in fact, why bother with the ORDER BY at all? grouping typically sorts results into ascending sequence anyway

but if you really need it, mysql has a non-standard extension that lets you put it on the GROUP BY clause –

SELECT PiNumber 
  FROM data_table
GROUP 
    BY PiNumber [COLOR="#0000FF"]ASC[/COLOR]
HAVING COUNT(*) > 1

Hi,

The first query that was not working was to list all occurrences of the duplicate PiNumber
The second query that was working was only producing One Occurrence of the duplicate PiNumber

I just realized in my initial post i did not even write what I was trying to do with the query, lol
But I got what I need using the following (which lists all occurrences of the duplicate PiNumber)


SELECT t1.*
FROM data_table t1
INNER JOIN (
SELECT PiNumber
FROM data_table
GROUP BY PiNumber
HAVING COUNT(PiNumber) > 1
) t2 ON
t1.PiNumber = t2.PiNumber

Thanks :slight_smile:

Do you have any other fields on that table that are unique to the row, like an id or something? If so, you can use that to find duplicates like so:


SELECT
   t1.id, t2.id
FROM
   data_table t1
   INNER JOIN
   data_table t2
   ON
      t1.piNumber=t2.piNumber
      AND
      t1.id <> t2.id

Which will return all duplicate pairs in the table and should be a lot faster than what you are doing.