URL Madness

I have so many questions, I don’t know where to begin. Let me give you a brief description and solicit any advice people can offer.

My websites use URLs that look like this:

MySite/People/Ulysses_S_Grant

I’m in the middle of a huge upgrade and may upgrade further to a CMS (probably Drupal or WordPress) in the near future. I’m wondering if I should change my URLs while I’m at it. The general consensus appears to be that hyphens are better than underscores. Accordingly, I may change my URLs to this:

MySite/People/Ulysses-S-Grant

I could also change my URLs to lower case (MySite/People/ulysses-s-grant), though I’m leaning towards not going that route. But what about other characters, like periods?

MySite/People/Ulysses-S.-Grant

What about accents and other special characters?

MySite/People/José-Martí

What about parentheses, like this?

MySite/World/Georgia-(country)

And if I change my URLs, I’m also going to have to redirect visitors from the old URLs to the new ones.

I’m thinking of covering all my bases by looking for a script that accepts any URL that matches the characters in my database, regardless of whether the words are separated by -, _, %20 or a space. Thus, any of the following URLs would resolve to the database value (Ulysses-S.-Grant):

Ulysses-S.-Grant
Ulysses_S._Grant
Ulysses S. Grant
ulysses-s-grant

Wikipedia has a similar function. If you read an article about “Some Person” and replace the URL Some_Person with Some Person or some_person, it will default to Some_Person.
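The matching idea described above could be sketched roughly like this in Python. This is only an illustration, not a recommendation for any particular CMS: the names `lookup_key`, `resolve`, `CANONICAL` and `INDEX` are made up for the example, and the short list stands in for the real database. It assumes that case, separators and periods should all be ignored when matching.

```python
import re
from urllib.parse import unquote

def lookup_key(segment: str) -> str:
    """Reduce a URL segment to a key that ignores case, separators and periods."""
    text = unquote(segment)               # turns %20 into a real space
    text = text.replace(".", "")          # "S." and "S" should match
    text = re.sub(r"[-_\s]+", "-", text)  # treat -, _ and spaces alike
    return text.strip("-").lower()

# Hypothetical stand-in for the database of canonical names.
CANONICAL = ["Ulysses-S.-Grant", "Georgia-(country)"]
INDEX = {lookup_key(name): name for name in CANONICAL}

def resolve(segment: str):
    """Return the canonical name for a requested segment, or None if unknown."""
    return INDEX.get(lookup_key(segment))
```

With this, `resolve("Ulysses_S._Grant")`, `resolve("Ulysses%20S.%20Grant")` and `resolve("ulysses-s-grant")` all come back as `Ulysses-S.-Grant`, so the site can redirect to that one form.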

One other problem I have to deal with: I don’t want my statistics to show 101 hits for This-URL, 13 hits for This_URL, 6 hits for this-url, etc. I only want to know how many visitors made it to This-URL.

Does anyone have any general advice? I’m leaning towards hyphens with first letters upper case and limited special characters (e.g. MySite/People/Ulysses-S.Grant and MySite/World/Georgia-(Country)). However, I’m not sure how to deal with accents and might simply replace People/José-Martí with People/Jose-Marti.
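For the accent question, one common approach (a sketch, assuming Python and its standard `unicodedata` module; the function name `fold_accents` is just for illustration) is to decompose each character and drop the combining marks:

```python
import unicodedata

def fold_accents(text: str) -> str:
    """Strip combining accent marks, e.g. turn 'é' into 'e'."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything except the combining marks themselves.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

This turns `José-Martí` into `Jose-Marti`. Letters that have no decomposition (ø or ß, for example) pass through unchanged, so they would need their own replacement table.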

Thanks for any tips.

I typically only use numbers, lower-case letters, and dashes. Anything else has the potential to become too confusing to read out loud, and special characters might not always survive being copied and pasted somewhere.

Periods are a definite no-no. By convention, those signify file extensions.

I would second the advice to only use lower case letters, numbers and dashes. Underscores are confusing, brackets may have to be percent-encoded, case-sensitive URLs are just plain evil, and full stops should only be used for delineating file extensions. Apostrophes can be confusing, but one of their biggest problems comes when word processors and other fancy software autocorrect them to curly quotes, breaking the URL. None of it is necessary; it just puts barriers in the way of potential users, and that reduces the number of people who will successfully get to the page they wanted.
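If it helps, the "lower case letters, numbers and dashes" rule is easy to check mechanically. Here is a minimal Python sketch; the name `is_clean_slug` and the extra rule that dashes cannot lead, trail or double up are assumptions added for the example:

```python
import re

# One or more runs of lower case letters/digits, joined by single dashes.
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

def is_clean_slug(segment: str) -> bool:
    """True if the segment uses only lower case letters, digits and dashes."""
    return SLUG_PATTERN.fullmatch(segment) is not None
```

Running every existing URL segment through a check like this before the migration shows which ones still need cleaning up.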

In terms of making sure that your hits track each page correctly in aggregate, the best way to do that is to set redirect rules in .htaccess to force the URL to lower case. Then even if your filenames are not case sensitive, you will get all the hits counted against the lower case format.
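The logic behind such a rule is simple enough to show in a few lines. This is a Python sketch of the decision only, not actual .htaccess syntax, and `lowercase_redirect` is a made-up name. It assumes a permanent (301) redirect is wanted:

```python
from typing import Optional, Tuple

def lowercase_redirect(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, lower-cased path) if a redirect is needed, else None."""
    lowered = path.lower()
    if lowered == path:
        return None          # already in the standard form, serve the page
    return (301, lowered)    # send the visitor to the lower case URL
```

Because every mixed-case request gets bounced to the lower case address first, the page itself is only ever served, and counted, under one URL.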

Wow, great tips! It sounds like the consensus is that lower case is best, so I guess I need to bite the bullet and go make that change, too. On the positive side, my URLs will be WordPress-friendly, if I decide to go that route. I thought periods were reserved for extensions, though Wikipedia uses them in URLs. Sounds like Wikipedia’s doing everything wrong! ;)

I especially appreciate the tip about modifying my .htaccess file. That’s a huge help.

Hmmmm… With my makeover, my URLs will look like this: mysite/People/ulysses-s-grant

It’s probably a stupid question, but should I change the second segment (the section name - People, in this case) to lower case, too? I guess it would be kind of weird to have a lower case URL with a single capital letter in the middle.

Yes, you should make your whole URL case-insensitive, and using all lower case as the standard form looks neater.