I am building a large-ish site that has many content-managed sections. It all needs to be searchable so I am having a single
pages tables that has a FULLTEXT index on it for searches.
Each pages has a unique URL-slug that is used for URL rewriting.
I am using the section followed by the ID so on two sections you have the same page name.
…would both be valid. The above would have a unique ID of news/test-post and recipes/test-post respectively. I have three questions regarding indices:
- Is there any point having auto-increment int as the primary key when it already has a unique ID? I am guess no in the front end as lookups will always be done by the slug but perhaps in the CMS as they may use the int instead. Either way, my gut says the extra index is not worth the overhead.
- Is there a big performance difference between having a string of say 50 chars as a primary key compared to auto-increment int?
- Would there be any performance increase or decrease if I split the slug into two parts and set as a compound (?) index? So in the above examples news and test-post and recipes and test-post would be in different fields/columns. I am guess not as I will generally be selecting by primary key or FULLTEXT search.
- depends how you define "big"
Thanks, I thought as much re points 1 and 3.
Regarding point two. Let's say you have 10,000 records and SELECT * FROM
table WHERE id = '$id' takes 100 milliseconds with a 50 varchar as the PK, do you think with auto_increment it would be a couple of milliseconds quicker or say twice as fast? I guess I should set up a test table and find out for myself but if you have any experience, feel free to share it. Based on how I understand integer based indices work I would guess that ints are quite a bit faster.
depends on how you define "quite a bit"
time for some testing, yes?
Also consider whether your URL slugs might ever change, because the unique ID that you use as a foreign key in other tables must not change, otherwise those relations will break. (It's also harder to use compound values as a foreign key.) My preference, for simplicity, keep an int ID.
Of course if the slug is guaranteed to never change then adding the int id just makes for added complexity and should only be included if the added efficiency for lookups is sufficient to outweigh the added complexity that having two unique keys adds to the code.
Generally not using int ids where there is another field suited to use as the key is the simpler solution.
Even then, I'm still iffy about it. If it's a public-facing value, then no guarentee is as strong as you might think. Tomorrow, the boss/client could say, "I've changed my mind. I want people to be able to change their URLs." Then you'd be wishing you had used an int ID from the beginning.
Thanks for the replies. Jeff has an interesting point and in this particular implementation the URLs can change—though they won't very often. However, because I need a FULLTEXT index, it's a MyISAM table so any relation between other tables would have to be done manually anyway. (I'm assuming that's right, as you can tell, databases aren't my strong point)
That's prompted another question regarding IDs and slugs but I think it would be more appropriate in another thread.
This topic is now closed. New replies are no longer allowed.