Regexp question

Hello,

I have enabled #hashtags on my forum… here’s the replacement code:

$string = preg_replace('/(^|\\s)#(\\w*[a-zA-Z0-9_]+\\w*)/', '\\1<a href="http://mysite.com/link_to_hash/\\2">#\\2</a>', $string);

How can I make it so hashtags CANNOT begin with a NUMBER (0-9), but can begin with a-zA-Z_?

Thanks!

No one’s jumped on this for a couple of hours, so I’ll put me foot on the grenade and try and teach at the same time.

Define your hashtag first: write out in words which characters may start it and which may follow, then translate that into a pattern.

I’ll give you an example that I’m going to lift from my old college homework.
A valid string is one with at least one a, followed by a b, followed by any number of b’s and c’s.
/a+b[bc]*/
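
To see that definition in action, here is a quick PHP check of a few strings against the pattern. This is only a sketch: the ^ and $ anchors are my addition so that the whole string has to match, and the sample strings are made up.

```php
<?php
// Anchored version of the homework pattern: one or more a's, then a b,
// then any number of b's and c's. The ^ and $ anchors are added here
// (not part of the original pattern) so the entire string must match.
$pattern = '/^a+b[bc]*$/';

// Made-up sample strings for illustration.
$samples = ['ab', 'aaabbcc', 'abcbcb', 'b', 'ac', 'abd'];

$results = [];
foreach ($samples as $s) {
    // preg_match returns 1 on a match and 0 when there is no match.
    $results[$s] = preg_match($pattern, $s) === 1;
    echo $s, ': ', $results[$s] ? 'valid' : 'invalid', "\n";
}
```

The first three samples should come out valid; 'b' has no leading a, 'ac' is missing the required b, and 'abd' ends in a character the class does not allow.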

I’d use this

~(^|\\s)#([a-zA-Z_][a-zA-Z0-9_]*)~

~ - start regular expression
( - start capture group 1
^|\s - match either the start of the string or a whitespace character
) - end capture group 1
# - match a # literally
( - start capture group 2
[a-zA-Z_] - match a character in the range a-z or A-Z, or an underscore (i.e., the tag's first character is a lower- or upper-case letter or _, never a digit)
[a-zA-Z0-9_] - match a character in the range a-z, A-Z, 0-9, _ (i.e., any character lower case or upper, digits, and/or underscores)
* - match the last expression (i.e., the character class [a-zA-Z0-9_]) zero or more times
) - end capture group 2
~ - end of regular expression

I used ~ instead of / as the delimiter because I find it more practical: you need to match a literal / pretty often, and with / as the delimiter you’d have to escape it every time. With ~ you don’t have that problem. Feel free to swap it for / if you wish.
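
A small side-by-side of the two delimiters, in case it helps. The sample text and the /forum/ path are made up for the demonstration.

```php
<?php
// Made-up sample text containing a URL path.
$text = 'see /forum/topic for details';

// With / as the delimiter, every literal / in the pattern needs a backslash.
$with_slash = preg_match('/\/forum\/(\w+)/', $text, $m1);

// With ~ as the delimiter, the same pattern needs no escaping.
$with_tilde = preg_match('~/forum/(\w+)~', $text, $m2);

// Both patterns capture the same word after /forum/.
echo $m1[1], ' ', $m2[1], "\n";
```

Both calls match the same text; only the amount of escaping differs.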

HTH :)

Thank you very much for both explanations. :) It’s very helpful.

Works like a charm!