RegEx I can't solve

Hello,

Here’s one I can’t solve.

I need to remove a string unless it appears inside a comment (the only comments targeted are “/**/”). Everywhere else, the string should be removed.


$foo = var;
/* And this is some comment $foo = var; which isn't touched*/

Output should become:


/* And this is some comment $foo = var; which isn't touched*/

The uncommented $foo = var; has been removed. The "$foo = " part will never change. However, “var” could be anything.

Thanks in advance for the help.


-jj.

If you have only one-line comments, a negative lookahead in the regex can be used:


// Assume $s contains the original string

// this will work for single-line comments only
$out = preg_replace('#(\\$foo *= *\\w+;)(?!.*?\\*/)#', '', $s);

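This will work as long as $foo = var; is either in a one-line comment or on the last line of a multi-line comment, because the lookahead only checks for a closing */ later on the same line (without the s modifier, the dot does not match newlines). For the same reason, a $foo = var; that sits before a comment on the same line will not be removed.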
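To make the behaviour of the lookahead concrete, here is a small sketch that runs the pattern on the sample from the question, and then on an assumed counter-example (the $multi string is made up for illustration) where the assignment sits on a middle line of a multi-line comment:

```php
<?php
// Sample input taken from the question.
$s = <<<'EOT'
$foo = var;
/* And this is some comment $foo = var; which isn't touched*/
EOT;

$pattern = '#(\\$foo *= *\\w+;)(?!.*?\\*/)#';

// The first assignment has no */ after it on its line, so it is removed.
// The commented one is followed by */ on the same line, so it stays.
$out = preg_replace($pattern, '', $s);
echo $out, "\n";

// Assumed counter-example: the assignment is on a middle line of a
// multi-line comment, no */ follows on that line, so it is removed anyway.
$multi = "/* a\n\$foo = b;\nc */";
$bad = preg_replace($pattern, '', $multi);
echo $bad, "\n";
```

The second result shows why this version should only be relied on when all comments fit on one line.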

Unfortunately, PHP doesn’t support variable-length lookbehinds, so this approach can’t handle multi-line comments: there is no way to look back over the preceding comment content, which can be any length, to find the opening comment tag. A workaround is to first split the string into non-comment sections and comment sections and then do the replacements only in the non-comment ones:


// split string into non-comment elements and comment elements
// -1 means "no limit"; passing null for the limit is deprecated since PHP 8.1
$split = preg_split('#(/\\*.*?\\*/)#s', $s, -1, PREG_SPLIT_DELIM_CAPTURE);

$out = '';

foreach ($split as $key=>$elem) {
	if ($key % 2 == 0) {
		// element with no comment - do replacements
		$out .= preg_replace('#(\\$foo *= *\\w+;)#', '', $elem);
		
	} else {
		// element with commented code - copy as is
		$out .= $elem;
	}
}
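As a usage sketch, the same split-and-replace loop can be wrapped in a function and tried on a multi-line comment. The function name remove_outside_comments and the sample input are made up for illustration; they are not part of the original question:

```php
<?php
// Hypothetical helper wrapping the split-and-replace approach.
function remove_outside_comments(string $s): string
{
    // With PREG_SPLIT_DELIM_CAPTURE, even indexes hold non-comment text
    // and odd indexes hold the captured /* ... */ comments.
    $parts = preg_split('#(/\\*.*?\\*/)#s', $s, -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    foreach ($parts as $i => $part) {
        $out .= ($i % 2 === 0)
            ? preg_replace('#\\$foo *= *\\w+;#', '', $part) // outside comments
            : $part;                                        // comment, copied as is
    }
    return $out;
}

// Assumed sample: assignments before, inside and after a multi-line comment.
$s = <<<'EOT'
$foo = a;
/* first line
   $foo = b;
   last line */
$foo = c;
EOT;

$result = remove_outside_comments($s);
echo $result;
```

Only the assignment inside the comment should survive, including when it is on a middle line of the comment.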

In these examples var can be any sequence of word characters, that is, letters, digits and underscores.

Edit: if you want to allow any content as var, then simply replace

(\\$foo *= *\\w+;)

with

(\\$foo *=.*?;)
(but I guess that is the easy part).
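A quick sketch of the looser pattern, using a made-up input line ($elem and $bar are assumptions for illustration). Note that the lazy .*? stops at the first semicolon, so a value that itself contains a semicolon would only be partly removed, and without the s modifier a value spanning several lines would not match at all:

```php
<?php
// Assumed input: a quoted value that \w+ would not match.
$elem = '$foo = "hello world"; $bar = 1;';

// .*? lazily consumes everything up to the first semicolon.
$out = preg_replace('#(\\$foo *=.*?;)#', '', $elem);
echo $out, "\n";
```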