junction_id (primary key & auto increment)
person_id (foreign key from person table)
person_job_id (foreign key from job table)
The thing is, there are multiple records in this table and I want to remove them except one. For instance, person1 can have three jobs, but these should be unique job_id’s. Some of these people have been matched with the same job_id twice or multiple times.
How can I delete the duplicated ones only?
EDIT : The query below worked.
DELETE n1 FROM personxperson_category n1, personxperson_category n2 WHERE n1.joint_id < n2.joint_id AND n1.person_id = n2.person_id AND n1.person_job_id = n2.person_job_id
[quote=“RT_, post:2, topic:195589, full:true”]… you could add a UNIQUE index on person_id and person_job_id.[/quote]this actually should be the primary key, and the auto_increment column should be dropped
I was thinking along the lines of the possibility that a lot of various queries may be in the middle-ware. Depending on how the software is implemented, it may be too costly/risky (business wise) to alter the queries, and test the results. Using UNIQUE, while technically incorrect, could be a cheaper business option, because it doesn’t interfere with the legacy implementation as much.
I don’t think you would need to do that. And if the auto increment column isn’t used then getting rid of it would definitely be the best move.
But, it is very common for people to come up with unexpected ways of doing things. It could be possible that the id is referenced as a FK by a table that holds information about the relationship between the person and job for instance. Then there could be a series of queries that join that table to other results. Or there might be a query that (for some reason) orders by id. I’ve seen a lot of funny ways people have implemented things.
True enough. I guess to be safe instead of Deleting the column it could be renamed, and then unit tests run,
Problems found - rename back
No problems - delete to improve efficiency