preg_match_all() puts all into $result[0] each time

Hi,

I got preg_match_all() working with the below pattern. It searches through an entire file read into a buffer.

$match_pattern = '/[\r\n]<my_custom_tag>[\r\n](.*)[\r\n]<\\/my_custom_tag>/s';

If there is only one instance of <my_custom_tag> </my_custom_tag> in the buffer then it works well.

If however I include two or more instances of <my_custom_tag> </my_custom_tag>, rather than give me each one in its own array ID (i.e. $result[0], $result[1], $result[2] etc.), it packs everything into $result[0].

Say there is only one instance of <my_custom_tag> </my_custom_tag> in the buffer (read from a file): $result[0] will be 27 chars long. With two instances $result[0] will be 1027 chars long, and with three it will be 2220 chars long (and so forth). Not only the text between <my_custom_tag> and </my_custom_tag> ends up in $result[0], but also all the HTML/PHP code that sits between one pair of tags and the next (when there’s more than one instance).

As you can see, it all gets crammed into $result[0] for some reason. It’s as if preg_match_all() can’t fully (though it does it once fine) identify between <my_custom_tag> and </my_custom_tag>.

What gives?

[I]$rss_source_file = fopen("$rss_from_file", "r") or die("can't open file [SOURCE]");
$rss_write_file = fopen("$rss_to_file", "a") or die("can't open file [DESTINATION]");


while (!feof ($rss_source_file))
{
	$buffer = fgets($rss_source_file);
	$lines[] = $buffer;	
} 
$array_count = count($lines);

$match_pattern = '/[\r\n]<my_custom_tag>[\r\n](.*)[\r\n]<\\/my_custom_tag>/s';

for ($i = 0; $i < $array_count; $i++) 
{	
	
	$current_line = $lines[$i];
	$content = get_all_content_between2($current_line, $match_pattern); 
			
	print var_dump($content[$i]);
	
	fwrite($rss_write_file, trim($content[$i]) . "\r\n");
	fwrite($rss_write_file, " " . "\r\n");

}

fclose($rss_source_file) or die("can't close file [SOURCE]");
fclose($rss_write_file) or die("can't close file [DESTINATION]");[/I]

Something wrong in $match_pattern yes?

Thanks,

* means zero or more, but you can have a variation on it:

*? means zero or more, but match as little as possible

otherwise, * matches as much as possible, aka “greedy”

Since you do .*, you tell it to match as many of anything as possible. It obliges. It doesn’t stop until it finds the last </my_custom_tag> in the string.

It’s a greedy search that’s being performed.
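To make the difference concrete, here is a minimal sketch comparing the greedy and the lazy quantifier. The sample $buffer is made up for illustration, and the pattern is a simplified version of the one in the question (the line-break classes are left out).

```php
<?php
// Hypothetical sample buffer: two tag pairs with unrelated markup between them.
$buffer = "<my_custom_tag>first</my_custom_tag>\n"
        . "<p>other markup</p>\n"
        . "<my_custom_tag>second</my_custom_tag>";

// Greedy: .* runs on to the LAST closing tag, so everything lands in one match.
preg_match_all('/<my_custom_tag>(.*)<\/my_custom_tag>/s', $buffer, $greedy);

// Lazy: .*? stops at the FIRST closing tag it can reach, so each pair matches on its own.
preg_match_all('/<my_custom_tag>(.*?)<\/my_custom_tag>/s', $buffer, $lazy);

echo count($greedy[1]), "\n"; // 1
echo count($lazy[1]), "\n";   // 2
print_r($lazy[1]);            // "first" and "second"
```

With the greedy version the single captured string also contains the markup between the two tag pairs, which matches the symptom described in the question.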

The following information from the PHP pattern modifiers page should help.

U (PCRE_UNGREEDY)
This modifier inverts the “greediness” of the quantifiers so that they are not greedy by default, but become greedy if followed by ?. It is not compatible with Perl. It can also be set by a (?U) modifier setting within the pattern or by a question mark behind a quantifier (e.g. .*?).
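As a rough sketch of the two ways to set that modifier (the sample $buffer is invented, and the pattern is again a simplified form of the one in the question):

```php
<?php
// Hypothetical sample buffer with two tag pairs.
$buffer = "<my_custom_tag>one</my_custom_tag><b>x</b><my_custom_tag>two</my_custom_tag>";

// Trailing U modifier: a plain .* now matches as little as possible.
preg_match_all('/<my_custom_tag>(.*)<\/my_custom_tag>/sU', $buffer, $result);

// Same effect with the inline (?U) setting inside the pattern.
preg_match_all('/(?U)<my_custom_tag>(.*)<\/my_custom_tag>/s', $buffer, $inline);

print_r($result[1]); // "one" and "two"
print_r($inline[1]); // "one" and "two"
```

Note that with the default flags, $result[0] is itself an array holding each full match (tags included), and $result[1] holds the text captured by the first pair of parentheses, one element per match.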

Thanks, both of you. I put a “/U” at the end of $match_pattern and it’s working just fine now, filling the array at 1, 2, 3 etc. instead of just 0 all the time.