Using querystring IDs to create readable URLs

I’ve currently got the following .htaccess file.

The purpose of the file is to create SEF URLs for each branch (klvc). My first rule matches the shortname of the klvc and the id of the content item. What I’d like to do is match an alias I’ve given the content item, but this might be duplicated in the database. Come to think of it, I might have the same problem with the klvc shortname.

My question is: what is the best way to create SEF URLs that use the id of the content items/klvcs, but appear as /klvc/london/news-article rather than /klvc/20/5?

Can anyone also check my .htaccess? I’d like to know if what I’m trying to do is correct.


Options +FollowSymlinks
RewriteEngine on
RewriteBase /

#Catch specific reference to a klvc - how can I 
RewriteRule ^klvc/([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ klvc_store.php?klvc=$1&mantext_id=$2 [NC]

#Catch specific reference to a news article
RewriteRule ^klvc/([a-zA-Z0-9_-]+)/news/([0-9_-]+)$     klvc_news.php?news_id=$1 [L,R=301]
 
#Catch specific reference to a booking form
RewriteRule ^klvc/([a-zA-Z0-9_-]+)booking/(.*)$     klvc_booking.php/$1 [L,R=301]

#Catch specific reference to a contact form
RewriteRule ^klvc/([a-zA-Z0-9_-]+)contact/(.*)$     klvc_contact.php/$1 [L,R=301]

RewriteRule ^klvc/([a-zA-Z0-9_-]+)$ klvc_store.php?klvc=$1 [NC]

#Catch anything left
#How do I catch anything that hasn't matched and redirect to parent site?
# redirect all remaining traffic to /klvc



pete,

The best way is NOT to use the id of the content items but the title or name field in your db (you DID read my tutorial Article, didn’t you?). All that requires is that your title/name field be unique (MySQL can take care of that by preventing duplicates). If you don’t believe me, the article links at wilderness-wally.com ARE the titles, not the record ids, and nearly all links ARE the titles. Of course, I HAD to convert the title to something which did not show %20 for spaces, etc., and convert it back before accessing the db, but the effect is EXACTLY what my client wanted - and what you say that you want, too.
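A minimal sketch of what a title-based rule could look like. It assumes a hypothetical `alias` query parameter and that klvc_store.php looks the item up by that alias instead of by id; neither exists in the setup shown so far. If the same alias may legitimately appear under different branches, a unique index on the branch/alias pair rather than on the alias alone may be what is needed, which is likewise an assumption about the schema.

```apache
RewriteEngine on

# /klvc/london/news-article -> klvc_store.php?klvc=london&alias=news-article
# "alias" is a hypothetical parameter name; the script must map it to the record.
# The hyphen sits first in the character class so that it is read literally.
RewriteRule ^klvc/([a-z]+)/([-a-z0-9]+)$ klvc_store.php?klvc=$1&alias=$2 [L]
```

The letters-only pattern for the branch shortname is also a guess; it should be narrowed or widened to whatever the real shortnames contain.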

As requested:

Options +FollowSymlinks
# The + on the line above is not used
RewriteEngine on
RewriteBase /
# RewriteBase is only used to UNDO a mod_alias Redirect - none here!

#Catch specific reference to a klvc - how can I 
RewriteRule ^klvc/([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ klvc_store.php?klvc=$1&mantext_id=$2 [NC]
# Example URIs, if you please!
# Your atoms appear to contain all the usable characters
# rather than the subset which you are after
# Same below

#Catch specific reference to a news article
RewriteRule ^klvc/([a-zA-Z0-9_-]+)/news/([0-9_-]+)$     klvc_news.php?news_id=$1 [L,R=301]
 
#Catch specific reference to a booking form
RewriteRule ^klvc/([a-zA-Z0-9_-]+)booking/(.*)$     klvc_booking.php/$1 [L,R=301]

#Catch specific reference to a contact form
RewriteRule ^klvc/([a-zA-Z0-9_-]+)contact/(.*)$     klvc_contact.php/$1 [L,R=301]

RewriteRule ^klvc/([a-zA-Z0-9_-]+)$ klvc_store.php?klvc=$1 [NC]

#Catch anything left
#How do I catch anything that hasn't matched and redirect to parent site?
# redirect all remaining traffic to /klvc

# How about
RewriteRule ^klvc/ klvc_lost.php [L]

As I said at the start, though, this is NOT the way to do what you’re after. I’d probably start with the links, i.e., klvc/ is not recommended as it’s “useless” in the context of your regex. I’d start those (above) links with news/, booking/, contact/ or store/ (the redirection can remain to the klvc_yadda-yadda) which MAY allow you to combine your mod_rewrite statements.

From there, I’d have to see some examples of your URIs to give the EXACT regex (for the characters you DO want to accept) AND the … duh, WHY are you using MultiViews (klvc_contact.php/$1 rather than klvc_news.php?news_id=$1)? Using MultiViews can cause a LOT of problems (relative links) while you should be able to use something like klvc_$1.php?$1_title=$2 if you constructed your db using common field names differentiated by the news_/booking_/contact_/store_.
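One way the combined rule hinted at above might look. This is only a sketch: it assumes the links start with news/, booking/, contact/ or store/, that the scripts are named klvc_news.php and so on, and that each script accepts a hypothetical `<section>_title` parameter.

```apache
RewriteEngine on

# /news/some-title -> klvc_news.php?news_title=some-title
# $1 is the section name and $2 the title; the *_title parameter names are hypothetical.
RewriteRule ^(news|booking|contact|store)/([-a-z0-9]+)$ klvc_$1.php?$1_title=$2 [L]
```

Because the section name is captured once and reused in both the script name and the parameter name, one rule replaces four, but only if the scripts and db fields really do follow that common naming pattern.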

I guess the lesson here is PPPPPPP (proper prior planning prevents piss poor performance) … and other problems!

Regards,

DK

Thanks for the excellent reply DK.

This is my first step into htaccess so it took me a while to work out what was happening with the matching. Looking at it now, matching to everything is obviously wrong. The site is a mixture of static pages and db driven ones. Perhaps the desired structure, with a sample URL next to each page, might help:


Home        /klvc/clientname
Item 1      /klvc/clientname/item-1
Item 2      /klvc/clientname/item-2
Item N      /klvc/clientname/item-N
News        /klvc/clientname/news
-Article 1  /klvc/clientname/news/article-1
-Article 2  /klvc/clientname/news/article-2
Booking     /klvc/booking
Contact     /klvc/contact

Home, Items URLs are in the format:

/klvc/klvc_store.php?klvc=clientname&man_text=N

In this instance, we’re going to have a lot of similar page titles ie About Us, Our Store etc so I’m not sure I can use the title as it won’t be unique.

News URLs are in the format:

/klvc/klvc_news.php?klvc=clientname&newsid=N

If no news id is passed, you get the list of articles

Booking and Contact are standalone pages

/klvc/booking.php?klvc=clientname
/klvc/contact.php?klvc=clientname


I didn’t realise that I was using multiviews but I now understand the problems. Unfortunately, parts of the database I’m using are legacy stuff which I can’t alter - shame.

Is there a better way I could construct my rules based on what I’ve supplied above? Thanks very much for your time. I’ve only been using the sitepoint forums for a little while - the podcast made me think about giving it another go!

Pete

Pete,

You’re welcome. I’m here to “share the knowledge” (and, on occasion, guesses, too!).

With your examples above, why aren’t you making sure that you’re capturing item numbers with “…/item-([0-9]+)”? Ditto “…/news/article-([0-9]+)” for news? The whole point of using regex is to be as exact as you can be. Without knowledge of what your links look like, I can only guess by looking at your example and those JUMP OUT at me right away!
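A sketch of the item rule under that advice. It assumes the .htaccess sits in klvc’s parent directory, that client names contain letters only, and that the store script takes the `man_text` parameter shown in the sample URL earlier in the thread.

```apache
RewriteEngine on

# /klvc/clientname/item-7 -> /klvc/klvc_store.php?klvc=clientname&man_text=7
# Only "item-" followed by digits is accepted; anything else falls through.
RewriteRule ^klvc/([a-zA-Z]+)/item-([0-9]+)$ klvc/klvc_store.php?klvc=$1&man_text=$2 [L]
```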

Regards,

DK

David,

Thanks again for your reply. Not having used regexps much in the past either, I’m playing catch-up, but what you’re suggesting makes a lot of sense.

My current URLs are towards the end of my previous post

/klvc/klvc_news.php?klvc=clientname&newsid=N

Pete,

If you want to CREATE /klvc/clientname/news/article-# and use it to access /klvc/klvc_news.php?klvc=clientname&newsid=N, then all you’d need to do is use code like this (in klvc’s parent directory):

...
RewriteRule ^klvc/([a-zA-Z]+)/news/article-([0-9]+)$ klvc/klvc_news.php?klvc=$1&newsid=$2 [L]

Yes, it’s that simple when you have a “specification” to work from.

Let’s see what you can put together for the rest.

Regards,

DK

Thanks DK - that’s excellent.

I’ve been on another project since I started this thread but will be moving back to it shortly so it’s been good to get your help - the time delay would have been a nightmare otherwise!

Working towards the long-term goal I am going to be pointing several domains at this part of the site. Is there a way to re-write these as follows:

www.clientname.com/news to /klvc/klvc_news.php?klvc=clientname
www.clientnamesecond.com/news to /klvc/klvc_news.php?klvc=clientnamesecond

Would I have to have an entry for each domain and page I wish to rewrite? This wouldn’t be ideal as we could have in excess of 50 domains pointing to this part of the site.

Pete,

The delay is for me to “recharge my batteries” as I do need the sleep!

Multiple domains? The POWER of mod_rewrite is in its ability to examine (and capture) Apache variables. So, all you’d need to do is to use a RewriteCond to examine and capture the {HTTP_HOST} (domain) and use that in a redirection, i.e.

...
RewriteCond %{HTTP_HOST} ^(www\.)?([a-z]+)\.com$
... 

where you use %2, the captured ‘clientname’, to feed to klvc_news (etc) in the query string.
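Put together, a condition and rule pair along these lines might look as follows. It is a sketch that assumes every domain shares one DocumentRoot containing this .htaccess and that the client part of the hostname is lowercase letters only; names with digits or hyphens would need a wider character class.

```apache
RewriteEngine on

# %2 is the second group captured by the RewriteCond,
# i.e. the name between the optional "www." and ".com".
# www.clientname.com/news -> /klvc/klvc_news.php?klvc=clientname
RewriteCond %{HTTP_HOST} ^(www\.)?([a-z]+)\.com$
RewriteRule ^news$ klvc/klvc_news.php?klvc=%2 [L]
```

A RewriteCond applies only to the RewriteRule that immediately follows it, so each page rule that needs the client name would carry its own copy of the condition.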

Regards,

DK

Thanks David,

I’ve rewritten my ruleset as required. I also needed to match to hyphenated characters, and numbers (ie the clientname might be client-4-something).


Options +FollowSymlinks
RewriteEngine on

#Catch specific reference to a klvc  
RewriteRule ^klvc/([a-zA-Z0-9\-?]+)/([0-9]+)$ /klvc_store.php?klvc=$1&mantext_id=$2 [L]

#Catch specific reference to a news article
RewriteRule ^klvc/([a-zA-Z0-9\-?]+)/news/article-([0-9]+)$ /klvc_news.php?klvc=$1&newsid=$2 [L]
RewriteRule ^klvc/([a-zA-Z0-9\-?]+)/news$ /klvc_news.php?klvc=$1 [L]

#Catch specific reference to a booking form
RewriteRule ^klvc/([a-zA-Z0-9\-?]+)/booking$ /klvc_booking.php?klvc=$1 [L]

#Catch specific reference to a contact form
RewriteRule ^klvc/([a-zA-Z0-9\-?]+)/contact$ /klvc_contact.php?klvc=$1 [L]

#Catch anything left
RewriteRule ^klvc/ /klvc_store.php [NC]

I now need to match to the domain names. I’m assuming that I would need something in the URL that I’m rewriting to match against the header? Would that mean that my

^klvc/([a-zA-Z0-9\-?]+)$

Would need to match the domain name rather than the shortname?

pete,

“I also needed to match to hyphenated characters, and numbers (ie the clientname might be client-4-something).”

^klvc/([a-zA-Z0-9\-?]+)$

NEVER put an intended hyphen in the middle of a character range definition!

NEVER escape characters (other than a space) in a character range definition!

NEVER think that ? will be in a URI - that’s a reserved character (it denotes the end of the URI path and the start of the query string), so a RewriteRule pattern will never see it and it has no place in a character range definition. Inside the definition it also loses its metacharacter meaning, so it’s NOT an optional whatever-came-before.

Domain name regex is the same as all the other regex EXCEPT that you can expect users to try CaMeLcAsE every once in a while. Merely use lowercase characters and the No Case flag:

RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]

which will match the www’d and non-www’d versions of example.com. Leave off the start or end anchors if you want and you can search for the regex in the middle of the Apache variable:

RewriteCond %{HTTP_HOST} example [NC]

will match test.myexamplesite.tld in any character case.

Regards,

DK

Thanks Dave,

I understood that putting \-? would optionally match the hyphen in my addresses. I’ve now changed to:


RewriteRule ^klvc/([-a-zA-Z0-9]+)/([0-9]+)$ /klvc_store.php?klvc=$1&mantext_id=$2

Is that correct to match the hyphens?

I’m still not quite clear on the matching of the domains to rewrite rules I’ve got. It appears that I’d need to have a RewriteCond for each domain I wish to host and hardcode the URLs or have a way of matching requests in the database using the querystring.


RewriteCond %{HTTP_HOST} ^(www\.)?clientname\.com$ [NC]
RewriteRule ^http://www.clientname.com/klvc/([-a-zA-Z0-9]+)/([0-9]+)$ /klvc_store.php?klvc=$1&mantext_id=$2 [L]

RewriteCond %{HTTP_HOST} ^(www\.)?clientnamedifferent\.com$ [NC]
RewriteRule ^http://www.clientnamedifferent.com/klvc/([-a-zA-Z0-9]+)/([0-9]+)$ /klvc_store.php?klvc=$1&mantext_id=$2 [L]

This would mean I could have a lot of entries in the htaccess for all of the allowed domains.

Pete,

Yes, your treatment of hyphens is now correct.

Domains? How many? You can use a list ( match (domain1|domain2|domain3|…) ) in your RewriteCond. Moreover, you can add a Skip flag as a GoTo (positive match will SKIP the next n RewriteRules so use a negative to match the list). On the other hand, what other domains will be in your DocumentRoot? Specification?
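A sketch of the list and the Skip flag used together. The branch names and the `KLVC` environment variable are hypothetical, and the script paths follow the sample URLs given earlier in the thread. The captured name is stored in a variable because a RewriteCond backreference such as %2 is only available to the single rule that follows the condition. Note too that a RewriteRule pattern is matched against the path only, never the scheme or hostname, so the domain check belongs in the RewriteCond.

```apache
RewriteEngine on

# Remember the branch name when the host is in the list (names are hypothetical).
RewriteCond %{HTTP_HOST} ^(www\.)?(branchone|branchtwo|branchthree)\.com$ [NC]
RewriteRule ^ - [E=KLVC:%2]

# Host NOT in the list: skip the next two rules (S=2 acts as the "GoTo").
RewriteCond %{HTTP_HOST} !^(www\.)?(branchone|branchtwo|branchthree)\.com$ [NC]
RewriteRule ^ - [S=2]

# Branch-only rules; they read the name stored above.
RewriteRule ^news$ klvc/klvc_news.php?klvc=%{ENV:KLVC} [L]
RewriteRule ^contact$ klvc/contact.php?klvc=%{ENV:KLVC} [L]
```

With this layout a new branch means adding one name to the list in two places, not adding a new block of rules per domain.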

Regards,

DK

The purpose of what we’re doing is to provide ‘microsites’ for branches of the parent company. Initially this will probably be fewer than 10, but once the microsites are sold to the branches, this could stretch into the hundreds.

As I’ve stated earlier, all of the sites will live at:

which points to
www.parentcompany.com/branch
which points to
www.parentcompany.com/klvc_store.php?klvc=branch

As I was unsure as to what I could achieve I initially just set out to create some nice URLs on the parent.com domain. I would then redirect any traffic in the form of www.branch.com to the root of the www.parentcompany.com/branch.

This would actually be fine, but if we could get it that the branch.com domain resolves correctly for all of the pages, rather than redirecting to the root then that’s the best case.

I just need to offset the benefit against the maintenance.

What’s your opinion?

Pete,

Are you on a hosted server or the company’s server? The reason I ask is that cPanel allows (okay, it FORCES) Addon Domains to have their DocumentRoot at the maindomain/addondomain directory. This would bypass the maindomain’s DocumentRoot entirely (and maintain the addon domain as the {HTTP_HOST}) OR it could be configured as a Parked Domain which is directed to the maindomain’s subdirectory (with the maindomain being displayed as the {HTTP_HOST}).

Personally, I’d use the Addon Domain style of handling the branches.

Regards,

DK

We’re on a dedicated box so I’m assuming we can configure in whatever way we require. Having said that, I’m not sure they’ve got any form of control panel though so I’m not that sure how to configure it.

Thanks for all of your help on this David. It’s been a great learning experience for me.

That’s why I’m here (to help others learn). Pass it along!

Regards,

DK

David,

I hope you had a good Christmas and New Year. I’m currently arranging with the client to get some form of control panel installed on their server, as I’m no Apache expert. Once the control panel is installed and I’ve set the domains up on the new server, would I then just be able to take the {HTTP_HOST} and use it in my existing rewrite rules? Can you clarify?

Many thanks

petersen,

You don’t NEED a CP to add (sub)domains. All you need is access to the httpd-vhosts.conf file (and httpd.conf to be able to include the vhosts file) and merely follow the format there. You may also need to update your hosts file, too, as that’s necessary for me to see my many test domains (I either strip the .com off the public domain name or assign my own acronym) on my test server. For a public server, the changes would be the same but two entries would be required - one for localhost use and one for public use (with the IP address replacing 127.0.0.1). For me (in my testing environment), that satisfies the need for a DNS daemon.
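For illustration, an entry in httpd-vhosts.conf might look roughly like this. The hostnames and the path are placeholders, and the layout (every branch domain aliased onto the parent site’s DocumentRoot, so that {HTTP_HOST} still carries the branch name) is only one possible arrangement, not necessarily the one intended above.

```apache
# httpd-vhosts.conf (pulled in from httpd.conf with an Include directive)
# Hostnames and the DocumentRoot path are placeholders.
<VirtualHost *:80>
    ServerName www.parentcompany.com
    ServerAlias branchone.com www.branchone.com branchtwo.com www.branchtwo.com
    DocumentRoot "/var/www/parentcompany"
</VirtualHost>
```

On Apache 2.2 and earlier a `NameVirtualHost *:80` line is also needed before the block. On a test machine without DNS, each hostname additionally needs a hosts file line pointing it at 127.0.0.1.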

Regards,

DK

Thanks David,

I’ve taken a look at my WAMP setup and reckon I understand what’s going on. My question was actually more in relation to how to configure my rewrite conditions.

petersen,

That’s all in the tutorial linked in my signature.

Regards,

DK