Removing Duplicate rows in a table by matching two columns data

afridy · January 14, 2014, 5:05am

Hello,

empno | date

38084 | 2014-02-26 18:22
38084 | 2014-02-26 18:26
38084 | 2013-07-1 18:26
38011 | 2014-02-26 18:59
38011 | 2014-02-26 19:26

now i want to remove any duplicate employee numbers having the same date (yyyy-mm-dd and time can be diffrent).

i know how to remove duplicates, but not getting how to do it for more than one column.

i tried this

SELECT empno,date, COUNT(*) c FROM employees
GROUP BY empno HAVING c > 1
AND date=left(date,10)

but no luck.

SpacePhoenix · January 14, 2014, 7:26am

What is the “field type” (eg, date, varchar, text, etc) of the date field?

afridy · January 14, 2014, 7:36am

varchar

btw, i had put a mistake here in my eample table

38084 | 2013-07-1 18:26
should be
38084 | 2013-07-01 18:26

the date part is always fixed in length

afridy · January 14, 2014, 10:58am

hello

r937 · January 14, 2014, 1:09pm

hi, how ya doin?

SELECT empno
     , LEFT(date,10) 
  FROM employees 
GROUP 
    BY empno
     , LEFT(date,10)
HAVING COUNT(*) > 1

afridy · January 15, 2014, 5:33am

Hai r937,
thank you!
now i could not get the delete part to work. I want to remove the duplicates.

delete from employees
WHERE
  employees.empno 
  in 
  (SELECT empno, LEFT( date, 10 )
FROM test
GROUP
BY empno, LEFT( date, 10 )
HAVING COUNT( * ) >1)

r937 · January 15, 2014, 6:52am

gee, i’m sorry, i cannot see your error message from here

afridy · January 15, 2014, 6:56am

delete from employees
WHERE
  employees.empno 
  in 
  (SELECT empno, LEFT( date, 10 )
FROM employees
GROUP
BY empno, LEFT( date, 10 )
HAVING COUNT( * ) >1)

it was saying operand missing or something.

afridy · January 15, 2014, 6:58am

Operand should contain 1 column(s) - this is the error.

r937 · January 15, 2014, 8:07am

do you understand what it’s telling you?

your WHERE clause specifies one column, employees.empno, but your IN subquery produces two

afridy · January 15, 2014, 8:41am

undestood r937. then i also tried like this.

DELETE FROM employees
WHERE employees.empno
IN 
(SELECT empno
FROM employees
GROUP
BY empno, LEFT( date, 10 )
HAVING COUNT( * ) >1)

#1093 - You can’t specify target table ‘employees’ for update in FROM clause

afridy · January 15, 2014, 10:41am

any way i managed to solve the issue ass follows.

SELECT empno
     ,date
  FROM employees
GROUP
    BY empno
     , LEFT(date,10)
HAVING COUNT(*) > [B]0[/B]

It returned all unique rows. i experted the result as csv, then empty the table, imported back the data from csv.

Thank you.

r937 · January 15, 2014, 11:53am

you still have your duplicates

HAVING COUNT(*) > 0 will return all rows in the table

afridy · January 15, 2014, 12:24pm

hai r937

i see duplicate empnos but dates are unique to each.

see the sample data

btw, i had 550 records in the table with many duplicates, that query returned 164 of them

r937 · January 15, 2014, 12:37pm

yes, you’re right, i was confused for a moment

afridy · January 15, 2014, 12:57pm

tx for the confirmation r937