Question about UNIQUE identifier

Hello,

I’m currently building a site that requires information to be submitted by the users but must remain unique in some aspect in the mySQL database (no duplicate emails, etc.).

However, I’m kinda confused on how the UNIQUE constraint will actually play out with the database if I use it and I was hoping someone would be willing to explain the in’s and out’s a bit.

I understand that UNIQUE ensures that duplicate data is not stored but what if I had a column in the database say, “city” set as unique and two users from two different cities with the same name (eg. Hamilton, Ohio and Hamilton, NZ) submit information? Will one user not be accepted because the word “Hamilton” already existed? And if they were to be accepted, how does the database associate the two for proper data retrieval?

Because of the nature of the site, this will definitely have to be a many-to-many relationship but I just can’t seem to grasp the workings of it in my head if columns are set as UNIQUE.

Thanks for the help.

UNIQUE on a column will make sure every individual entry in that column is unique… a unique across multiple columns will make sure the combination of ALL columns is unique.

You will only want to make UNIQUE fields you genuinely need to be unique.

Hi Tim,

Though I appreciate the help, I’m not following you.

I realize what UNIQUE is supposed to do, but I’m not entirely sure of the outcome (or results if you will) if I did make these columns unique and other users submitted data that was the same.

So for example (using the city one again),

User A submits a form that has their first, last name and the city they live in.
User B submits a form that has their first, last name and the city they live in but this time, its the same exact city name even though they live in two different states / countries etc.

What happens to User B’s submission? Is it accept or rejected and if accepted, how will the database associate the fact that it’s two entirely different cities with the same name?

You really wouldn’t have a unique on city - because you’d be limiting yourself to the number of cities that exist - the first person to register in each would get the city.

If you did have it, mysql would throw an error complaining that you’ve already got a unique entry and not insert the row.

user B’s entry would be rejected with an error message

so you wouldn’t put UNIQUE on city, because that way, as tim says, you could have only one user per city

what was your thinking in wanting to make city unique? what did you really want to accomplish with that?

Hey Tim and R937,

Thank you both for your help, that definitely cleared things up.

To be honest, my thoughts of doing it was to prevent duplicate entries of the same data for data management and integrity.

The site is basically a CMS with a user submission search engine of reviews of any particular topic (let’s use skiing for an example).

The goal was to allow users to submit data but not allow that data to be duplicated in the database and search. So, if User A submits an entry for skiiing on a mountain called “Big Red” in “Anchorage, Alaska” (don’t know if it exists, just using it as an example) with a description, review and rating and someone else skis there and decides to make a submission, they could make a submission that accepts their description and any other additional info but not duplicate the name of the mountain for anchorage alaska.

That way if someone performed a search for “ski resorts in Anchorage Alaska” only one entry per name would appear for that city. (eg. one result returning for “Big Red” Anchorage Alaska rather than a multple submissions etc.) However, I also wanted to allow for a blanket search that would show the results of all listings for “Big Red” if they were in different locations etc.

I hope that makes sense.

If you don’t want duplicate data, you need to normalise your database and drag city information into a separate table.

i think you want the UNIQUE property to be applied to the combination of mountain and city

so there could be multiple entries for “Big Red” (in different cities), and multiple entries for “Anchorage, Alaska” (different resorts), but only one “Big Red” in “Anchorage, Alaska”

still, you need to think that through, too – it would mean you can have only one review for “Big Red” in “Anchorage, Alaska” and i’m pretty sure you would want multiple reviews

so do you really still want uniqueness somewhere? if so, could you have another go at explaining it?

Maybe, more what you are after, is distinct results actually… putting an index on City… and then SELECT DISTINCT city FROM table.

You both had it right. I was looking for “uniqueness” and be able to accept multiple reviews on one topic (eg Big Red) but I guess I was using the wrong method for achieving the end result I wanted. I was under the impression that UNIQUE allowed for the possibility of multiple submissions using the same information just not allowing duplicate entries. I’m sure I’m completely off base here with that thinking.

Would it make a huge difference if City and Big Red were in separate tables and user submissions were in their own table in terms of weeding out multiple submissions?

EDIT: I thought DISTINCT is what I was going for but from what I read, it only sorts the results. I don’t know. This is completely confusing me lol.

If you use DISTINCT, make sure you have proper indexes, or your queries will grind to a halt as your data increases in size.

Ok, scratch my last edit. I found a different blog explaining the DISTINCT option in better detail and they actually seemed to know what they were talking about.

I do believe DISTINCT is what I’m looking for. To be able to receive multiple submissions on one thing and only have one page result with additional user information (descripts, ratings, reviews etc.) returned, minus the duplicate submissions for “Big Red” etc., and viewable on the search page.

Again, thanks for all of your help guys!

Hopefully this will help explain exactly what I’m trying to do:


1. TABLE state
   stateID -> ID index = 1 -> PRIMARY KEY (current index number is 1)
   statename = "Alaska" -> ID index = 2
   statename = "Alaska" -> ID index = 3

2. TABLE city
  cityID -> ID index = 1 -> PRIMARY KEY
  cityname = "Anchorage" -> ID index = 2
  cityname = "Anchorage" -> ID index = 3

3. TABLE event
  eventID -> ID index = 1 -> PRIMARY KEY
  skiing -> ID index = 2
  skiing -> ID index = 3

4. TABLE resorts
   resortsID -> ID index = 1 ->  PRIMARY KEY
   resortnames = "Big Red" -> ID index = 2
   resortnames = "Big Red" -> ID index = 3

5. TABLE reviews
   reviewsID -> ID index = 1 -> PRIMARY KEY
   reviews = "I loved it! Will go again! 5 stars! -> ID index = 2
   reviews = "I HATED it! Don't waste your money! -> ID index = 3



User A submits the above information (Anchorage, Alaska and Big Red) after reviewing and rating it.

User B comes along and does the same thing for the same place.

Now User C comes along and wants to know about skiiing in Alaska so they do a search on “skiing in Alaska” and what I want to happen is only one result come up for “Big Red Ski Resort - Anchorage, Alaska” regardless of how many users submitted reviews and ratings etc for “Big Red - Anchorage Alaska”, but I do want all reviews, ratings etc to be viewable on the page of Big Red, Anchorage Alaska itself.

With that said, I’m trying to restrict searches based on area searched for example, if a user decides to search for ski resorts called “Big Red” in say, Switzerland, I don’t want Alaska popping up UNLESS they did a blanket search for “Big Red” without specifying region etc.

I think DISTINCT is what I’m looking for so I’ll play with it to see the results.

In that case, you would only use each entry once, and refer to its ID where you store the data joined together, so you’d only end up with one entry for Alaska, and one entry for Skiiing, and the review would say… state = 2, event = 2 linking you back to the correct entries.

Then when someone came and put a different state, but skiiing again, it’d be state = 4 (say) and event = 2.

Hey Tim,

Sorry, not trying to be dense, but nothing I try is working. Each search query I do prints out all submissions of the same place.

Could you provide an example of what you’re talking about? Are you referring to hard-coding those IDs into the searches themselves?

I’m currently using variables to pass information to the SQL query.

Example Code:


State <input type= "text" name= "<?php $state ?>" /><br />
City <input type = "text" name = "<?php $city ?>" /><br />
Event <input type = "checkbox" name = "Events" value = "<?php $event[] ?>" /><br />


No, I’m referring to database normalisation, the process of abstracting common data out so it is not duplicated. http://en.wikipedia.org/wiki/Database_normalization

In your case, places, events etc an all have multiple entries, so you take that out into a separate table and just refer to it from your main table then use joins in your query to piece it all back together.

Ah gotcha.

Thanks for the help!