Resolving Soft Delete Unique Key Conflicts

oddz · November 29, 2010, 7:11am

The scenario here is that I never want users to delete data. Instead a delete will just hide items. However, the issue becomes that insertions fail for deleted items since they will still exists within the database. However, I still want all active items to respect unique key constraints. So I was thinking since the NULL value does not respect unique key constraints, that adding a field to my unique key constraint such as; activated would solve the issue.

unique key example:

sites_id
node_types_id
node_url
activated NULL

So when a something is deleted activated will be set to NULL, allowing new items to be added that may have the exact same signature. All active items will have a value of 1. Is this generally a good approach? This will be an internal safety measure making it possible to retrieve data that has been accidentally deleted through my applications user interface. I will than have a separate, development only purge method to ride the database of soft-deleted items, when necessary to completely remove items.

Also, I’m thinking it would be good to make activated an enum value, since it should only ever have a single value, 1 when it has a value. Thoughts on that? I know the general idea is to use tinyint but that could result in values other than 1. By using an enum its an absolute tha value will always be 1.

DarthGuido · November 29, 2010, 7:14am

Never done this myself, but if it works it sounds like a great solution to me.

r937 · November 29, 2010, 7:29am

what happened when you tested it?

oddz · November 29, 2010, 7:51am

Worked as expected.

Additionally perhaps there is a way to hide the rows with activated set to NULL from all queries? So that I don’t have to add the necessary logic to my application layer via activated IS NOT NULL to everything. Essentially if activated is null I would like to treat the row as if doesn’t exist as far as my application layer is concerned.

IBazz · November 29, 2010, 12:52pm

you could assign each record a date which would mean the date the record was created. you then have a choice.

select col_name where
date = ( select max(date)
from your_table
)

that will return the latest/lastly added, record

or you could have an end date col as well as a start date col. when the file is created, the end date is set to 0000-00-00. if you add a new record to replace the current one, it would be assigned an end date of 0000-00-00 but the earlier record could be given a date of curdate() - 1.

So your current/live records have an end date of 0000-00-00
the archived/soft deleted have an end date < today.

benefits of this means you can have an archive of soft deletes going further back and of course the ‘archive’ will also have accidentally deleted items, which can be re-activated.

And if you would have loads of records in the table, you could split off the archived records into a new table of the same structure as the one you are using but, with the named archived in it somewhere

eg
my_table
my_table_archived

benefits:
you can re-instate accidentally deleted files/records
you can re-activate any one of numerous previous records without the need to re-enter them

hth

bazz

oddz · November 29, 2010, 1:42pm

That won’t work, think about it.

nodes_id
node_types_id
sites_id
created_on_timestamp

At least, not unless everything is created at the same time…

Splitting up the table is not something I feel is necessary at this point.