Htaccess isn't accounting for spaces?

Hi Everyone,

My htaccess isn’t accounting for spaces in certain keywords. I’m not sure why; I thought my syntax was correct, but it turns out it’s wrong.

A URL like this one on my website is working. It’s working because there are no spaces in between keywords.
http://mysite.com/florida/orlando/2006/ford/explorer

A URL like this on my website is working but I don’t want the %20 in between any keywords.
http://mysite.com/florida/key%20west/2005/ford/explorer

My goal is to take the second URL above (the one with a space) and make it look like the URL below.
http://mysite.com/florida/key_west/2005/ford/explorer

or take a URL like this
http://mysite.com/rhode%20island/cranston/2005/ford/explorer

and turn it into this
http://mysite.com/rhode_island/cranston/2005/ford/explorer

How would I do that?

My .htaccess code is below.

RewriteEngine On
RewriteRule ^([_a-zA-Z_]+)$ state.php?state=$1 [L]
RewriteRule ^([_a-zA-Z_]+)/([_a-zA-Z_]+)$ city.php?state=$1&city=$2 [L]
# RewriteRule ^([_a-zA-Z_]+)/([_a-zA-Z_]+)/([_a-zA-Z_]+)/([_a-zA-Z_]+)/([_a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]
RewriteRule ^([^/]*)/([^/]*)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

Thanks everyone!

22,

I cover that in my signature’s tutorial. Also, “Uniform Resource Identifiers (URI): Generic Syntax” addresses the allowed/reserved/illegal characters in a URI, and a space is certainly only allowed when it’s encoded (as %20).

Regards,

DK

Hi Dklynn,

Thanks for your help.

Note #5: Since URLs can’t have spaces (except as %20), use underlines or hyphens to replace them. If you ABSOLUTELY have to use spaces (%20) in your URIs, you can include them in your regex within a range definition as \{space}, i.e., ([a-zA-Z\ ]+). However, this is NOT advised.
Note #6: If you are converting to/from a database field which does contain spaces, you should convert the spaces to some other character. Using PHP, you can use
$state = str_replace ( ' ', '_', $state );

I’ve tried a few different variations but my syntax still does not work. On my make-model.php page I’m replacing the space with a _ and it’s not working. What do you think I’m doing wrong?

Thanks for your help.

make-model.php syntax

$state = str_replace (' ','_', $state);
$city = str_replace (' ','_', $city);
$model = str_replace (' ','_', $model);

RewriteEngine On
RewriteRule ^([_a-zA-Z_]+)$ state.php?state=$1 [L]
RewriteRule ^([_a-zA-Z_]+)/([_a-zA-Z_]+)$ city.php?state=$1&city=$2 [L]
# RewriteRule ^([a-zA-Z_]+)/([a-zA-Z_]+)/([_a-zA-Z_]+)/([_a-zA-Z_]+)/([_a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]
# RewriteRule ^([^/]*)/([^/]*)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

RewriteEngine On
RewriteRule ^([_a-zA-Z_]+)$ state.php?state=$1 [L]
RewriteRule ^([_a-zA-Z_]+)/([_a-zA-Z_]+)$ city.php?state=$1&city=$2 [L]
RewriteRule ^([_a-zA-Z]+)/([_a-zA-Z]+)/([a-zA-Z]+)/([a-zA-Z]+)/([_a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]
# RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]
# RewriteRule ^([^/]*)/([^/]*)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

RewriteEngine On
RewriteRule ^([_a-zA-Z_]+)$ state.php?state=$1 [L]
RewriteRule ^([_a-zA-Z_]+)/([_a-zA-Z_]+)$ city.php?state=$1&city=$2 [L]
# RewriteRule ^([_a-zA-Z_]+)/([_a-zA-Z_]+)/([_a-zA-Z_]+)/([_a-zA-Z_]+)/([_a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]
# RewriteRule ^([^/]*)/([^/]*)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

22,

Your PHP code is fine (to convert spaces from your database’s data to use as links) but, once you use the link, you’ll need to convert back before you access the database.

On to your mod_rewrite code:

RewriteEngine On
RewriteRule ^([_a-zA-Z_]+)$ state.php?state=$1 [L]
# WHY _'s at the beginning and end? That (IMHO) shouldn't cause a problem but ...

RewriteRule ^([_a-zA-Z_]+)/([_a-zA-Z_]+)$ city.php?state=$1&city=$2 [L]
# RewriteRule ^([a-zA-Z_]+)/([a-zA-Z_]+)/([_a-zA-Z_]+)/([_a-zA-Z_]+)/([_a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

# There are MAJOR problems with this code:
# 1. You're allowing state/city/// - can make-model handle null input for year, make and model?
# 2. Shouldn't the value of year be [0-9]{2,4}?
# 3. Do you remove spaces from make and model?

# RewriteRule ^([^/]*)/([^/]*)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

# Most importantly, USE the R=301 flag to troubleshoot your mod_rewrite code!
# At least you'll be able to SEE the redirections it makes without resorting to
# a log file.

Okay, the others get the same comments.

The reason I recommend that you verbalize your “specification” is so you can see the problem which you’ll have in specifying the matches to be made as well as finding potential conflicts between RewriteRule sets. Certainly, your ([^/]*) sets are highly problematic (nearly the dreaded EVERYTHING atom) so I’m glad that you commented them out (for the most part). Specificity is important so try to nail down your intended matches as well as you can.
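
As a rough illustration of the R-flag advice above, the sketch below turns the single-segment state rule from this thread into a visible redirect for testing. The leading slash on the target and the use of R=302 instead of R=301 are assumptions made here, not part of the original advice: browsers tend to cache 301 responses, which can hide later changes while you are still experimenting.

```apache
RewriteEngine On

# TESTING ONLY: the R flag makes this an external redirect, so the address bar
# shows the rewritten target (e.g. /state.php?state=florida).
# R=302 is an assumption made here to avoid cached 301 responses while testing.
RewriteRule ^([a-zA-Z_]+)$ /state.php?state=$1 [R=302,L]

# Once the target looks right, drop the R flag to hide the rewrite again:
# RewriteRule ^([a-zA-Z_]+)$ state.php?state=$1 [L]
```

The redirected request (state.php) contains a dot, so it no longer matches the pattern and the rule does not fire a second time.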

Regards,

DK

Hey Dklynn,

Thanks for the help. I’m trying a few different .htaccess syntax examples but nothing seems to be working.

This code gave me an internal server error.

RewriteRule ^([_a-zA-Z_]+)/([_a-zA-Z_]+)/([0-9]{2,4)/([_a-zA-Z_]+)/([_a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

Not working

RewriteRule ^([_a-zA-Z_]+)/([_a-zA-Z_]+)/([0-9]{2} | [0-9] {2, 4})/([_a-zA-Z_]+)/([_a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

Not working

RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/([0-9]{2,4)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

I know you don’t recommend ([^/]*), but the example below is the syntax that’s come closest to actually working. My question is: if ([^/]*) accepts everything, then how come it’s not accepting the conversion of the space (%20) into an underscore (_)?

RewriteRule ^([^/]*)/([^/]*)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

Thanks for your help.

22,

The first thing you must learn is that a 500 error is a direct indicator that your syntax is in error.

The second thing you must learn is that when you’re advised that you don’t need TWO _'s in a character range definition, REMOVE ONE!

The third thing you must learn is to use the R=301 flag so you can see whether a redirection is working (your receiving script may be the source of an error).

Whew, with those off my chest, on to your code:

See learning objectives 1, 2 and 3 above then compare your code with the following and trial my code against the URI (which you’ve not shown) it’s supposed to match:

RewriteRule ^([a-zA-Z_]+)/([a-zA-Z_]+)/([0-9]{4})/([a-zA-Z_]+)/([a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [R=301,L]

Note that I’ve removed the leading _'s when duplicated, that I closed the quantifier on the year (the missing } ), decided on your behalf to use four digits for your year’s value and used the R=301 flag so you can see the URI change (assuming that your URI is properly encoded with letters and _'s except for the year which now must be four digits).

Your second attempt is an improvement as the syntax error (missing } ) is gone but you’ve used additional spaces which is another syntax error (should have been 500 also). I’m disappointed that you didn’t decide whether your years are two or four digits but it does show that you’re thinking about it. Using your alternative, I believe that the second year option should be {4}, though.

RewriteRule ^([a-zA-Z_]+)/([a-zA-Z_]+)/([0-9]{2}|[0-9]{4})/([a-zA-Z_]+)/([a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [R=301,L]

Your third attempt is back to #1, #2 and #3 and introduces possible null values into your URIs! Why are you trying to match spaces with [^/]*? Spaces (%20’s) can be matched within a character range definition using [\ ] or, in your situation, [a-zA-Z_\ ]+. NOTE THE PLUS SIGN which requires one or more of the preceding character!
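
Because mod_rewrite matches against the decoded path, a %20 in the request shows up as a literal space in the pattern. That also makes the redirect asked for in the opening post possible in .htaccess. The following is only a sketch, assuming the file sits in the document root and that paths contain nothing but letters, digits, underscores, slashes and spaces; it was not posted in this thread.

```apache
RewriteEngine On

# Replace the first space in the path with an underscore and redirect.
# If more spaces remain, the redirected request matches again and the
# next space is replaced, until none are left.
# e.g. /florida/key%20west/2005/ford/explorer -> /florida/key_west/2005/ford/explorer
RewriteRule ^([a-zA-Z0-9_/]+)\ ([a-zA-Z0-9_/\ ]+)$ /$1_$2 [R=301,L]
```

A rule like this would need to sit above the make-model.php rules so that the underscored URL is the one they see.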

I’ve given you code above for this situation (unless you really want to embed spaces in your make and model) so I won’t repeat it here.

Sorry, 22, but your last rule is almost as insane as ^(.*)/(.*)/(.*)/(.*)/(.*)$! If you use ridiculous code like that, you’ll need to build a LOT of intelligence into your make-model script and I wouldn’t want to tackle that when it’s so easy to do with proper regex and mod_rewrite.

Once again, you might benefit from reading the mod_rewrite tutorial linked in my signature as it contains explanations and sample code. It’s helped many members and should help you, too. Pay particular attention to how to develop proper regular expressions rather than using the EVERYTHING atom. After all, my first Standard Rant is:

Standard Rant #1: The use of “lazy regex,” specifically the EVERYTHING atom, (.*), and its close relatives, is the NUMBER ONE coding error of newbies BECAUSE it is “greedy.” Unless you provide an “exit” from your redirection, you will ALWAYS end up in a loop!
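
One common form of “exit” is a pair of RewriteCond lines that skip requests for files or directories that really exist. The sketch below is only an illustration: index.php and the page parameter are hypothetical names, not taken from this thread.

```apache
RewriteEngine On

# The "exit": skip the rule when the request is already a real file or directory.
# After the first pass the target (index.php) exists, so the rule no longer applies.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?page=$1 [L]
```

Without the two conditions, (.*) would also match index.php itself and the rewrite would be applied again on every pass.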

Regards,

DK

Hey Dklynn,

Thanks for the help. I appreciate it.

The below syntax is working for the .htaccess.

On my make-model.php page I accounted for the spaces with this syntax

$state = str_replace ('_',' ', $state);
$city = str_replace ('_',' ', $city);
$make = str_replace ('_',' ', $make);
$model = str_replace ('_',' ', $model);

I also used this question many years ago to help me.

A few variations of the .htaccess are working.

The below one is working and this is the one I’m using for my website.

RewriteRule ^([a-zA-Z_]+)/([a-zA-Z_]+)/([0-9]{2}|[0-9]{4})/([a-zA-Z_]+)/([a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

This syntax also works

RewriteRule ^([a-zA-Z_]+)/([a-zA-Z_]+)/([0-9]{4})/([a-zA-Z_]+)/([a-zA-Z_]+)$ make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

The below syntax also works but I’m not using it because it’s not the best way to do so.

RewriteRule ^([^/]*)/([^/]*)/([^/]*)/([^/]*)/([^/]*)$ /make-model.php?state=$1&city=$2&year=$3&make=$4&model=$5 [L]

Thanks again Dklynn, appreciate the help.

22,

No problem. I’ve been here a long time to help members both directly and via others’ questions.

As I showed in my tutorial, you’re using the same code to generate the links; I’m sure you’re using the same code (reversing the order of the first two parameters) on the key values before querying the database.

The difference between your two working code snippets is that the former requires EITHER 2 or 4 digits for the year while the second requires 4 digits. Because you are the webmaster, you should know whether your database is expecting queries with the year as either 2 or 4 digits (i.e., NOT an either/or for your links) so the second code is better (IMHO). It’s all in how tightly you can define your expected key values.

Of note is that my last sentence should be considered a “ban” against your “almost” EVERYTHING atoms - as I’m sure you’ve realized. I trust that you understand the reason for this as it IS important (and alleviates the necessity of massive pre-query checks in the receiving script).

Regards,

DK

Thanks again Dklynn. Appreciate the help.

I have another question, but I’m not entirely sure if this is related to htaccess or PHP, so I’ve decided to ask it here.

My website for some reason is echoing out keywords/key phrases that aren’t related to my website. I’ve listed some of them below. Why do you think this is happening? For example, if a user were to type in this_is_me_typing_in_something after the /, Google is indexing that string. I’m not sure why this is taking place. Perhaps my .htaccess code is not accounting for something?

http://www.mywebsite.com/advices <- ‘google will index this as a page when it’s not.’
http://www.mywebsite.com/ratings <- ‘’
http://www.mywebsite.com/widgets <-‘’
http://www.mywebsite.com/terms_and_conditions <-‘’
http://www.mywebsite.com/latest_news <-‘’
http://www.mywebsite.com/this_is_me_typing_in_something <-‘’

Also, on my landing page(s), whatever is appended to the URL (latest news, widgets, ratings, etc.) is being echoed out onto the page, and I’m not sure why.

My question is, how would I go about stopping/fixing this?

Also, I’ve noticed on some websites that if a user were to append a key phrase after the /, they often get automatically redirected back to the home page.

randomwebsite.com/whatever <- user types in “whatever”. The “randomwebsite” does not have a “whatever” page, so they get diverted back to the homepage. Perhaps that will correct my problem. Can I do that with .htaccess or PHP?

My apol. for rambling. Tell me what you guys think.

Thanks again.

s22,

Oh, my!

First, don’t get overly concerned about what SE’s are indexing EXCEPT for when they index a page they should not be viewing - these are generated by hackers looking for specific attack vectors at your website (like admin/index.php pages without a login). Most of the nonsense is just that - nonsense. At http://wilderness-wally.com (where I use page titles for the links), I have the “handler” file look at the database and, if the title is not there, it’s not a valid URI so I feed the default Welcome page.

Now, with my client entering new articles quite often, I can’t provide a list of titles in the .htaccess to check before sending requests to my “handler” but you can, if you only have a few, by including them in regex like ^(page1|page2|page3)$. You see, while I had my “handler” file provide the Welcome page, I prefer to use ErrorDocument 404 /sitemap.php because it adds value to the error message.
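
To make the whitelist idea concrete, here is a minimal sketch. It assumes that the broad single-segment rule from the top of this thread, ^([_a-zA-Z_]+)$, is what lets URLs such as /advices or /this_is_me_typing_in_something reach a script; the list of states is hypothetical and would need to be filled in from the real data.

```apache
RewriteEngine On

# Only known states are sent to state.php (hypothetical list; extend as needed)
RewriteRule ^(florida|rhode_island)$ state.php?state=$1 [L]

# Requests that match no rule and no real file fall through to the 404 page
ErrorDocument 404 /sitemap.php
```

With a long or frequently changing list this stops being practical in .htaccess, which is where checking the value in the receiving script comes in.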

Echoed out? Like a query string? If you kill the query string, your handler file may not be able to provide the desired content. I need an example of these URIs as well as your .htaccess, though, to give a definitive recommendation.

I believe that my “handler” file’s returning the default page is what you’re describing at the end so, “answered then asked” or did I answer before getting to the question by reading ahead?

See, I can ramble, too! And you didn’t even ask the time before I launched into how to build the watch!

Regards,

DK

Hey Dklynn,

Thanks for the reply.

“I have the “handler” file look at the database and, if the title is not there, it’s not a valid URI so I feed the default Welcome page.”

I’m a little lost. What do you mean by a handler file? Does this file specifically look at the URIs to see if they are “legit”, to make sure they are coming from the webmaster and no one else?

“Echoed out? Like a query string? If you kill the query string, your handler file may not be able to provide the desired content. I need an example of these URIs as well as your .htaccess, though, to give a definitive recommendation.”

Exactly, it’s like a query string. It’s a little bizarre. I’ve PM’ed you.

Thanks again DK. Appreciate the help.

22,

The “handler” file is the redirection script which is hidden from visitors. WP uses index.php to serve EVERYTHING and everyone knows that but can you identify my “handler” file at Wally’s website? Oh, well, for your redirections, it’s make-model.php.

I responded to your PM hours ago; situation resolved?

Regards,

DK

Hi DK,

Thanks again for taking the time to answer my question(s). I appreciate it. My apol. for my delayed response.

I answered your PM just now. Please give me your thoughts.

Again, thank you.

22,

Done.

Lessons: Use R=301 to view the redirections and use actual URIs to test the mod_rewrite.

When you’re satisfied with your “lessons learned,” please post a summary of PMs for others to use to learn, too.

Regards,

DK

Will do DK.

Appreciate the help. I PM’ed you one last question. I’ll post a summary once you’ve gotten back to me so that I (and you) will be able to help everyone.

Thanks again!

22,

Done.

Regards,

DK