I have searched the form for duplicates and while there were many questions asked I am not sure what is the best way to do this.
If I am searching users by a registrant_id, first_name, last_name, prov_state, and postal_code where any three of these fields that match in multiple records are considered duplicates but the registrant_id is unique for each member then do you know the best way to return all of these duplicates?
so you will need four different queries, which, luckily, you can UNION
here’s one of them –
SELECT t.*
FROM ( SELECT last_name
, prov_state
, postal_code
FROM daTable
GROUP
BY last_name
, prov_state
, postal_code
HAVING COUNT(*) > 1 ) AS dupe1
INNER
JOIN daTable AS t
ON t.last_name = dupe1.last_name
t.prov_state = dupe1.prov_state
t.postal_code = dupe1.postal_code
having said that, i wonder if you truly are interested in the results of this query – it will return all people named Smith in Oakville
As I am not deleting these, I guess I could also just do a normal SELECT too ?
SELECT
l.lead_id as 'Registrant Number'
, l.first_name as 'First Name'
, l.last_name as 'Last Name'
, pst.name as 'State/Province'
, p.postal_code as 'Zip/Postal Code'
FROM
leads as l
LEFT OUTER
JOIN leads2addresses as l2a
ON l2a.lead_id = l.lead_id
LEFT OUTER
JOIN addresses as a
ON a.address_id = l2a.address_id
LEFT OUTER
JOIN postal_codes as p
ON p.postal_code = a.post_code
LEFT OUTER
JOIN provs_states as pst
ON pst.abbreviation = p.state_prov_abbr
AND p.postal_code = a.post_code
WHERE
l.first_name Like '%Bill%'
AND
l.last_name Like '%Smi%'
AND
p.postal_code Like '%L7L%'
Just was saying that your supplied method will work as it does the correct job of selecting (and not deleting) the duplicates… sorry for the lack of clarity.
and what does this query have to do with finding duplicates based on three of four columns?
My client wants to view all records where any three columns that match out of Registrant Number, First Name, Last Name, State/Province, or Zip/Postal Code will be shown as a possible duplicate.
So if in a HTML form they fill in three of the possible fields for matching and duplicates are returned then they can manually choose to delete them.
This is setup this way as a convenience for the client, to find a persons full address they only need to supply the person’s data and their postal code; the joins provide all the other address type data. Yes it is not as efficient as using it in one table, but their application is ‘heavy’ on the requirement to search for people’s addresses so this extra complication (in code) and slower efficiency is warranted.