Find duplicate rows

I need to search for duplicates - table as follows:


mysql> SELECT
	COUNT(*) c,
	theTel,
	theDate
FROM
	`doTable`
GROUP BY
	theTel,
	theDate
HAVING
	c > 1;

+---+-----------+------------+
| c | theTel    | theDate    |
+---+-----------+------------+
| 2 | 2107710   | 2012-08-03 |
| 2 | 290117984 | 2011-08-08 |
| 2 | 400018758 | 2009-10-02 |
+---+-----------+------------+
3 rows in set

I need to set value to 1 for duplicate records.
I tried this:


UPDATE `doTable` query1
JOIN (
       SELECT
	COUNT(*) c,
	theTel,
	theDate
FROM
	`doTable`
GROUP BY
	theTel,
	theDate
HAVING
	c > 1;
) AS query2 ON query1.theTel = query2.theTel
SET query1.dual = 1;

but it updated ALL the records that where duplicate.
E.g.:


+-----------+------------+-------+
| theTel    | theDate    | Dual  |
+-----------+------------+-------+
| 400018758 | 2009-10-02 |  1    |
| 400018758 | 2009-10-02 |  1    |
+-----------+------------+-------+

Instead I want to update only one of the duplicate records:


+-----------+------------+-------+
| theTel    | theDate    | Dual  |
+-----------+------------+-------+
| 400018758 | 2009-10-02 |  0    |
| 400018758 | 2009-10-02 |  1    |
+-----------+------------+-------+

Any help be would appreciated, thank you.

Hi Miguel,

I can see two solutions to your problem:

  1. add an identity column and add in having close id=max(id)
  2. use a stored procedure. Logic would be like this:
    a) select distinct thetel,thedate in a temp table
    b) loop through the time table using a while loop:
    at each iteration:
    get thetel, thedate in a parameter
    set rowcount 1
    execute the update statement (in the where clause, thetel,thedate equals the parameters you got)

Hope it helps

Vincent

I agree that you need a primary key/unique id. Without that I don’t think you can accomplish this without some recursive method (eg php script, stored procedure).


alter table doTable add id bigint unsigned auto_increment, add primary key(id);


SELECT
	COUNT(*) c, MIN(id) firstId,
	theTel,
	theDate
FROM
	`doTable`
GROUP BY
	theTel,
	theDate
HAVING
	c > 1;
	

     c  firstId  theTel      theDate
------  -------  ----------  ------------
     2        1  2107710     2012-08-03
     2        3  290117984   2011-08-08
     2        5  400018758   2009-10-02

UPDATE `doTable` query1
JOIN (
       SELECT
	COUNT(*) c, MIN(id) firstId,
	theTel,
	theDate
FROM
	`doTable`
GROUP BY
	theTel,
	theDate
HAVING
	c > 1
) AS query2 ON query1.theTel = query2.theTel AND query1.id!=query2.firstId
SET query1.dual = 1;	
	

select * FROM doTable;

theTel      theDate         id    dual
----------  ----------  ------  --------
2107710     2012-08-03       1    (NULL)
2107710     2012-08-03       2         1
290117984   2011-08-08       3    (NULL)
290117984   2011-08-08       4         1
400018758   2009-10-02       5    (NULL)
400018758   2009-10-02       6         1

thanks a lot!