I need some guidance on how to prevent double data submission to the database.
This usually occurs when, after submitting, the user hits the Refresh button or the Back button.
In short, the data insertion script is run again and submits the same data a second time.
Here is my idea.
Put a code, maybe md5(time()+session_id), in a hidden field of the form. On insertion, first check whether that code is already in the database. If it is not, do the INSERT; otherwise show a message!
What do you people think: is this idea OK, or are there better ways?
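A minimal sketch of the idea above, in Python with an in-memory SQLite database so it runs on its own. The table name comments, the column form_token, and the helper names are made up for illustration. It also swaps in two things the post does not mention: a random token from secrets.token_hex instead of md5(time()+session_id), and a UNIQUE constraint with INSERT OR IGNORE so the "check, then insert" happens in one statement rather than two.

```python
import secrets
import sqlite3


def make_comments_db():
    """Create a throwaway database; the schema is hypothetical."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE comments ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " form_token TEXT NOT NULL UNIQUE,"  # one row per rendered form
        " body TEXT NOT NULL)"
    )
    return db


def new_form_token():
    """Value to place in the hidden field when the form is rendered."""
    return secrets.token_hex(16)


def save_comment(db, form_token, body):
    """Insert once per token. Returns True if a row was written,
    False if this token was already used (refresh or resubmit)."""
    cur = db.execute(
        "INSERT OR IGNORE INTO comments (form_token, body) VALUES (?, ?)",
        (form_token, body),
    )
    db.commit()
    return cur.rowcount == 1
```

When save_comment returns False, the script can show the "already submitted" message instead of inserting again.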
The only time a double submit would be a problem is where you have an insert with an autoincrement. Anything else would have no effect since you'd just be replacing data with the same thing, be unable to insert a duplicate, or be unable to delete something already deleted. So in most cases using post-redirect-get is just adding additional processing for no additional benefit - particularly since it doesn't actually prevent the data being submitted again (it just reduces the chance of it happening by accident).
Which is probably 90% of the MySQL tables out there.
You are assuming that most databases are badly designed since that ought to be a very rare occurrence with properly designed databases. You might be right though.
That of course would then mean that the best fix for the problem in those cases would be to redesign the database properly to get rid of the meaningless autoincrement field and replace it with a more appropriate key.
After you've coded your first site, you shouldn't even be able to think of a way not to use Post/Redirect/Get.
(you could, but then you would always end the thought with: "but that's stupid" :p)
Yes - using post/redirect/get is stupid most of the time. It always adds an overhead to the processing without actually doing much in return. Even with that code in place you would still need additional code to prevent a double submit of the same data in those instances where a double submit would result in a database corruption simply because there is nothing in a post/redirect/get to prevent someone deliberately double posting to corrupt the data.
I'm assuming because it's used everywhere. http://codex.wordpress.org/Database_Description#Table:_wp_posts for example.
Is it really so bad? Is the extra header really so much overhead? I understand normalization and key constraints, but surely convenience is also a factor.
Thanks for the link.
Good visual explanations:
No, it isn't really a huge overhead, but it doesn't achieve what a lot of people claim it does either, since it doesn't actually prevent a double submit of the same data. All it really does is reduce the chance of it happening by accident and get rid of an alert in Internet Exploder when you hit the back button to return to the page that was submitted. Whether or not you use the redirect, you still need exactly the same code to prevent database corruption from a double submit.
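For readers who have not seen the pattern being argued about, here is a rough Python sketch of Post/Redirect/Get as a plain function, with no web framework. The /orders paths, the handle_request name, and the list used as storage are all invented for the example. It assumes well-formed paths and does no error handling.

```python
from http import HTTPStatus


def handle_request(method, path, form, store):
    """Return (status, headers, body) for a tiny, hypothetical order form.

    store is a plain list standing in for the database table.
    """
    if method == "POST" and path == "/orders":
        store.append(form["item"])  # the insert happens here
        # Redirect instead of rendering a page, so the browser's
        # last request becomes a harmless GET.
        location = f"/orders/{len(store)}"
        return HTTPStatus.SEE_OTHER, {"Location": location}, ""
    if method == "GET" and path.startswith("/orders/"):
        index = int(path.rsplit("/", 1)[1]) - 1
        return HTTPStatus.OK, {}, f"Ordered: {store[index]}"
    return HTTPStatus.NOT_FOUND, {}, ""
```

Refreshing the confirmation page only repeats the GET, so nothing new is stored. A second POST, whether deliberate or from a double click, still appends a second row, which is the limitation described above.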
felgall, you are right.
It does not prevent double posting, for example from spamming the Send button.
But it is not bad either.
Also, in this very case the OP was asking about the Refresh button, so his problem is solved.
What database design are you talking about? A little example, please. This thread has an autoincrement id = 653767.
What do you claim it should be in a perfect world?
If Post/Redirect/Get is not a huge overhead, then I think it is worth using, because it reduces the risk of inserting duplicate rows. I normally do that. If we don't redirect, then on every submit we have to check whether the same data has already been inserted into the db, even in cases where we don't otherwise want that kind of duplicate validation.
So does that mean Post/Redirect/Get is not OK to use? And what about my solution?
"Have a code may be md5(time()+session_id) in a hidden field of the form. And upon insertion, first do a check if that code is already there in the database, if the answer is NO, INSERT that, else show a message !"
You still have to do that same duplication check because the redirect doesn't prevent someone deliberately inserting a duplicate, it only prevents some situations where people would insert the duplicate by accident. So the validation is still required to protect the integrity of the data from deliberate attack regardless of whether you use the redirect to prevent accidental duplication.
Post/Redirect/Get is perfectly fine. The point felgall is trying to make is that it's not enough on its own. You also need unique keys (e.g. username) to prevent duplication, and your code should handle errors like a duplicate record.
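As a rough illustration of that last point, here is a Python sketch using an in-memory SQLite database. The users table, the register function, and the messages are assumptions for the example only. The unique key on username is what rejects the duplicate; the code just catches the resulting error and reports it.

```python
import sqlite3


def make_users_db():
    """Create a throwaway database; the schema is hypothetical."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " username TEXT NOT NULL UNIQUE)"  # the key that blocks duplicates
    )
    return db


def register(db, username):
    """Try to insert the user and turn a duplicate into a message."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute("INSERT INTO users (username) VALUES (?)", (username,))
    except sqlite3.IntegrityError:
        return f"The username '{username}' is already taken."
    return "Registered."
```

Note that the autoincrement id stays in the table here; it is the separate UNIQUE constraint that makes a repeated submit fail, whether it arrives by accident or on purpose.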
I think that we are not talking on the same level of patterns:
the technical level
the business level
On the technical level, sure, one assumes that if the database is well designed (with key and foreign key constraints) there should be no duplicate problems, as a duplicate would violate the database integrity.
But the application's business-level pattern is not about technical integrity. For example, a user may inadvertently add the same line item to his shopping cart twice. Technically this should not be forbidden, as the user may indeed want the same item a second time (say, to give the same gift to 2 friends). In that case the right business practice would be to warn the user and make him confirm, or to forbid it and require him to modify the quantity field instead.
This level should not be the coder's responsibility but the product manager's responsibility (in a corporate organisation, that is; otherwise nothing prevents the same guy from wearing both hats, but the two roles are conceptually separate).
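To make the shopping cart case concrete, here is a small Python sketch of the "warn and confirm" rule. The Cart class, the add method, and the "confirm" and "added" return values are invented for illustration; a real cart would follow whatever rule the product manager decides on.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical cart: maps item name to quantity."""
    quantities: dict = field(default_factory=dict)

    def add(self, item, confirmed=False):
        """Add one unit of item.

        A repeat of an item already in the cart is not rejected as a
        technical error; it is held back until the user confirms it.
        """
        if item in self.quantities and not confirmed:
            return "confirm"  # caller should ask "add another one?"
        self.quantities[item] = self.quantities.get(item, 0) + 1
        return "added"
```

A refreshed "add to cart" request therefore produces a confirmation prompt rather than a silent second item, while a user who really wants two can still get them.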
Thanks a lot! You people are always great!
This is discouraging - if this thread accurately describes the available solutions.
In my case I do have an autoincrement key. I don't believe there is general agreement that this is a bad thing. In fact I have come to use it more over many years, converted from the opposite camp.
Anyway, the choices seem to be either to change the database design or to accept writing the record twice, so long as it is the same record you keep writing.
I also have automatic date and timestamps, so it would never really write the exact same record a second time.
I continue my search for the solution I am looking for.
Why is it considered bad design if your primary key is "auto increment"?
I found this thread while searching for a solution to the same problem as the OP. Nobody answered the last poster's question, namely why is it considered poor design to use autoincrement fields for primary keys? When I was learning database design (lo these many years ago) that's the standard that was recommended.