preg_replace() not replacing all instances

Hi folks, I’m trying to work with a little preg_replace() script to change the SRC of images posted on one of my WordPress blogs. Basically, what I’m trying to do is have preg_replace() strip out the width and height attributes from the image(s) and alter the SRC attribute to point to a timthumb.php resizing script.


$content = '<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-13.jpg" alt="" title="pics-(13)" width="820" height="668" class="alignnone size-full wp-image-64" />';

$content .= '<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-12.jpg" alt="" title="pics-(12)" width="820" height="757" class="alignnone size-full wp-image-63" />';


$content_fixed = preg_replace('/<img(.*)src=(\'|")(.*)(\'|")(.*)width=(\'|")(.*)(\'|")(.*)height=(\'|")(.*)(\'|")(.*)>/ims', "<img$1src=\"/photoblog/wp-content/themes/photoblog/scripts/timthumb.php?zc=0&w=580&src=$3$11\" /><br />", $content);


echo $content . '<br />----------------<br />' . $content_fixed;

Outputs:


<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-13.jpg" alt="" title="pics-(13)" width="820" height="668" class="alignnone size-full wp-image-64" />

<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-12.jpg" alt="" title="pics-(12)" width="820" height="757" class="alignnone size-full wp-image-63" />

----------------

<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-13.jpg" alt="" title="pics-(13)" width="820" height="668" class="alignnone size-full wp-image-64" />

<img src="/photoblog/wp-content/themes/photoblog/scripts/timthumb.php?zc=0&w=580&src=http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-12.jpg" alt="" title="pics-(12)757" class="alignnone size-full wp-image-63" /><br />

For some reason it’s only applying to the last IMG tag, and I’m not sure why. Can anyone who’s a little more seasoned with preg_replace() help me out and tell me what I’m doing wrong, and how I can fix it?

Your problem comes from the fact that the pattern matches only once: from the start of image one, all the way through to the end of image two.

The liberal use of (.*) means each group grabs far more than it needs to, e.g. a single group will match something like this:

src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-13.jpg" alt="" title="pics-(13)" width="820" height="668" class="alignnone size-full wp-image-64" />

<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-12.jpg

Hence the replacement looking a bit funny. I’d suggest going back over what each of those groups is matching, and perhaps matching anything EXCEPT a " or ' inside each attribute value, rather than matching anything at all, which is what (.*) does.
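To make that suggestion concrete, here is a rough sketch rather than a definitive fix. It assumes the attributes are always double-quoted, always appear in the order src, width, height, and never contain a > inside a value. The group numbering and the replacement string are my own arrangement, not the one from the question.

```php
<?php
// Two sample tags taken from the question.
$content  = '<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-13.jpg" alt="" title="pics-(13)" width="820" height="668" class="alignnone size-full wp-image-64" />';
$content .= '<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-12.jpg" alt="" title="pics-(12)" width="820" height="757" class="alignnone size-full wp-image-63" />';

// [^"]* cannot run past the closing quote of a value,
// and [^>]* cannot run past the end of the current tag.
$pattern = '/<img([^>]*)src="([^"]*)"([^>]*)width="[^"]*"([^>]*)height="[^"]*"([^>]*)>/i';

// $2 is the original src; $3 and $5 are the other attributes we keep.
// The space in $4 (between width and height) is dropped on purpose.
$replacement = '<img$1src="/photoblog/wp-content/themes/photoblog/scripts/timthumb.php?zc=0&w=580&src=$2"$3$5><br />';

$content_fixed = preg_replace($pattern, $replacement, $content, -1, $count);

echo $count, "\n", $content_fixed, "\n";
```

Because neither character class can cross a tag boundary, each img tag is matched and replaced on its own.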

Actually, replacing each (.*) with (.*?) will do it. By default (.*) is greedy and matches as much as possible, whereas (.*?) is lazy and matches as little as possible.
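Here is a small sketch of the lazy-quantifier version, using the two tags from the question. The pattern is the original one with every group made lazy. The replacement string is an assumption on my part: once the groups are lazy, $11 holds only the height value, so this version uses $3 alone for the src and keeps the remaining attributes via $5 and $13.

```php
<?php
// Two sample tags taken from the question.
$content  = '<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-13.jpg" alt="" title="pics-(13)" width="820" height="668" class="alignnone size-full wp-image-64" />';
$content .= '<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-12.jpg" alt="" title="pics-(12)" width="820" height="757" class="alignnone size-full wp-image-63" />';

// Same pattern as in the question, with every (.*) changed to (.*?).
$pattern = '/<img(.*?)src=(\'|")(.*?)(\'|")(.*?)width=(\'|")(.*?)(\'|")(.*?)height=(\'|")(.*?)(\'|")(.*?)>/ims';

// Assumed replacement: $3 = original src, $5 = attributes between src and width,
// $13 = whatever follows the height attribute up to the closing >.
$replacement = '<img$1src="/photoblog/wp-content/themes/photoblog/scripts/timthumb.php?zc=0&w=580&src=$3"$5$13><br />';

// The fifth argument receives the number of replacements made.
$content_fixed = preg_replace($pattern, $replacement, $content, -1, $count);

echo $count, "\n", $content_fixed, "\n";
```

Checking $count is a quick way to confirm that both tags were matched separately instead of being swallowed by one big match.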

Also, because you can’t guarantee the order of the src, height and width attributes, you should probably be using an HTML parser instead, as it’s more reliable.

http://www.google.com/search?hl=en&safe=off&q=html+parser+php&btnG=Search
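As one possible illustration of the parser route, the sketch below uses PHP’s bundled DOM extension (DOMDocument), which is an assumption on my part since no particular parser was named above. The $thumb prefix is a hypothetical variable built from the path in the question. Attribute order and quoting style no longer matter with this approach.

```php
<?php
// Two sample tags taken from the question.
$content  = '<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-13.jpg" alt="" title="pics-(13)" width="820" height="668" class="alignnone size-full wp-image-64" />';
$content .= '<img src="http://blogs.theworldlink.com/photoblog/wp-content/uploads/2010/03/pics-12.jpg" alt="" title="pics-(12)" width="820" height="757" class="alignnone size-full wp-image-63" />';

// Hypothetical prefix; the path and query string are copied from the question.
$thumb = '/photoblog/wp-content/themes/photoblog/scripts/timthumb.php?zc=0&w=580&src=';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
// Wrap the fragment in a <div> so it can be pulled back out after parsing.
$doc->loadHTML('<div>' . $content . '</div>');
libxml_clear_errors();

foreach ($doc->getElementsByTagName('img') as $img) {
    $img->removeAttribute('width');
    $img->removeAttribute('height');
    $img->setAttribute('src', $thumb . $img->getAttribute('src'));
}

// Serialize only the children of the wrapper <div>,
// not the <html>/<body> elements that loadHTML() adds.
$fixed = '';
foreach ($doc->getElementsByTagName('div')->item(0)->childNodes as $child) {
    $fixed .= $doc->saveHTML($child);
}

echo $fixed, "\n";
```

Note that the serializer may write the & characters in the src value as entities, which browsers read back as plain ampersands.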

Mal Curtis, thank you so much for your help. The (.*?)'s did the trick…
The HTML parser is also a great idea, but these img tags are being generated by WordPress, so I’m sure that the order of the attributes will stay the same.