Delete Duplicate records in a table

damo2009 · May 19, 2010, 3:55pm

Hi
I'm using the query below to find duplicates in a table,
which is working fine.

But now I'm trying to delete all duplicates, leaving just one of each record, so if there are two identical records, delete one of them.
Any tips?


select Invoice_Number, Company_number, count(*) as n
from `TEST`.`invoice`
group by Invoice_Number
having n > 1

damo2009 · May 21, 2010, 11:00am

Ah, that was it. I had more than just those columns in the table, so when I added the others it worked fine for me.

Thanks guelphdad & r937

guelphdad · May 21, 2010, 1:01am

You’d have to explain what didn’t work, then, because either you haven’t described the problem accurately or you have implemented the solution incorrectly, probably because more columns were involved than described.

damo2009 · May 20, 2010, 7:35am

Yep, deleting random records will do the job; the records in each set of duplicates are exactly the same.

I tried your solution above but it wasn't working for me.

rpkamp · May 19, 2010, 10:16pm

Any preference as to which n-1 of the n records should be deleted? Or are all the records identical, so that deleting any n-1 of the n would do the job?

damo2009 · May 19, 2010, 7:57pm

The duplicate record has the same company number as well.
So for example, company 1 has an invoice number 12345.
This invoice (12345) is in the table twice with the same company (1),
so I need to delete one of those whole records.
Company 2 is in the table three times with an invoice number of 5678,
so I need to get rid of two of those records.

Hope that makes some sense.
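
To make the description above concrete, here is a rough sketch of a duplicate check that groups on both columns, so that a row only counts as a duplicate when the company and the invoice match. The `extra_rows` alias is made up for illustration; it is simply the number of rows that would have to go for each pair.

```sql
-- Count rows per (company, invoice) pair.
-- extra_rows (hypothetical alias) = how many copies would need deleting.
SELECT Company_number,
       Invoice_Number,
       COUNT(*)     AS n,
       COUNT(*) - 1 AS extra_rows
FROM `TEST`.`invoice`
GROUP BY Company_number, Invoice_Number
HAVING COUNT(*) > 1
ORDER BY Company_number, Invoice_Number;
```

With the example data described, company 1 / invoice 12345 would show n = 2 and extra_rows = 1, and company 2 / invoice 5678 would show n = 3 and extra_rows = 2.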

r937 · May 20, 2010, 2:27am

yes, it did the first time

did you try guelphdad’s solution?

of course, you backed up your table first, right?
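
For anyone unsure how to take that backup, one simple way (a sketch only, and `invoice_backup` is a made-up table name) is to copy the table inside the same database before touching it:

```sql
-- Create an empty copy with the same columns and indexes.
CREATE TABLE `TEST`.`invoice_backup` LIKE `TEST`.`invoice`;

-- Copy every row, duplicates included, into the backup.
INSERT INTO `TEST`.`invoice_backup`
SELECT * FROM `TEST`.`invoice`;
```

A dump to a file works just as well; the point is only that the original rows can be restored if the delete goes wrong.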

r937 · May 19, 2010, 5:44pm

dude!!! omg!!! it’s completely the opposite!!!

what you should really be aiming for is one company for each invoice

guelphdad · May 19, 2010, 5:35pm

Does it matter which invoice_number is kept for each company?

If not, you can run

ALTER IGNORE TABLE `TEST`.`invoice`
ADD UNIQUE (company_number, invoice_number)

That will leave only one row for each company_number / invoice_number combination.
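
Note that the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so the statement above fails on newer servers. A possible alternative, sketched here on the assumption that the duplicate rows are identical in every column (as stated elsewhere in the thread), is to rebuild the table from its distinct rows. The names `invoice_dedup` and `invoice_old` are made up for illustration.

```sql
-- Empty copy of the table with the same structure.
CREATE TABLE `TEST`.`invoice_dedup` LIKE `TEST`.`invoice`;

-- SELECT DISTINCT collapses rows that match in every column,
-- so each set of exact duplicates contributes a single row.
INSERT INTO `TEST`.`invoice_dedup`
SELECT DISTINCT * FROM `TEST`.`invoice`;

-- Swap the tables in one statement; the original is kept as invoice_old.
RENAME TABLE `TEST`.`invoice`       TO `TEST`.`invoice_old`,
             `TEST`.`invoice_dedup` TO `TEST`.`invoice`;
```

Once the new table checks out, a unique key on the relevant columns can be added to stop the duplicates coming back, and `invoice_old` can be dropped.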