preg_replace confusion

dujmovicv · June 2, 2010, 11:14am

Hi All!

I created this short listing to replace ALL the ‘non-standard’ (Slavic) letters with their extended ASCII codes:


$patterns = array();
$replacements = array();
$patterns[0] = '/&#268;/';
$replacements[0] = '&#268;';
$patterns[1] = '/&#262;/';
$replacements[1] = '&#262;';
$patterns[2] = '/Š/';
$replacements[2] = 'Š';
$patterns[3] = '/&#272;/';
$replacements[3] = 'Ð';
$patterns[4] = '/Ž/';
$replacements[4] = 'Ž';
$patterns[5] = '/ž/';
$replacements[5] = 'ž';
$patterns[6] = '/&#273;/';
$replacements[6] = '&#273;';
$patterns[7] = '/š/';
$replacements[7] = 'š';
$patterns[8] = '/&#263;/';
$replacements[8] = '&#263;';
$patterns[9] = '/&#269;/';
$replacements[9] = '&#269;';

$h_transport = "TRANSPORTERI";
$h_couplings = "OSOVINSKE SPOJNICE";
$h_spantech  = "SPAN-TECH TRANSPORTERI - NOVO";

$myFile = "content/products_spantech_".$language.".txt";
$fh = fopen($myFile, 'r');
$theData = fread($fh, filesize($myFile));
fclose($fh);
$theData = preg_replace($patterns, $replacements, $theData);

$h_transport = preg_replace($patterns, $replacements, $h_transport);
$h_couplings = preg_replace($patterns, $replacements, $h_couplings);
$h_spantech  = preg_replace($patterns, $replacements, $h_spantech);

echo $theData;
echo $h_transport;
echo $h_couplings;
echo $h_spantech;

The script even replaces a ‘simple’ letter c or C with č or Č; when I comment out those lines, it moves on to the next pair (ć or Ć), and so on…
Why doesn’t it replace just the given characters, as it is supposed to? Am I doing something wrong?

Thank you for your help!

dujmovicv · June 8, 2010, 9:07am

I think I’m getting closer…
I save lang_1.php with some of the patterns like this:


$patterns[0] = '/&#268;/';
$replacements[0] = '&#268;';
$patterns[1] = '/&#262;/';
$replacements[1] = '&#262;';
$patterns[8] = '/&#269;/';
$replacements[8] = '&#263;';
$patterns[9] = '/&#263;/';
$replacements[9] = '&#269;';

but then when I close my PHP editor (Macromedia Dreamweaver) and re-open the file, those lines are shown as:


$patterns[0] = '/C/';
$replacements[0] = '&#268;';
$patterns[1] = '/C/';
$replacements[1] = '&#262;';
$patterns[8] = '/c/';
$replacements[8] = '&#263;';
$patterns[9] = '/c/';
$replacements[9] = '&#269;';

…
Why does the editor allow me to type and save š or ž, but not ć or č?
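
The reopened listing suggests the editor saved the file in an encoding that has no č or ć, so the patterns degraded to plain C and c. That would be consistent with a Western single-byte encoding such as Windows-1252, which includes š and ž but not č or ć, although this is only a guess about the editor's settings. One possible workaround, sketched here on the assumption that the text being processed is UTF-8, is to keep the PHP source pure ASCII and name each letter by code point using the PCRE \x{...} escape with the u modifier. The sample string is made up for illustration.

```php
<?php
// Assumption: the input text is UTF-8.
// The patterns contain only ASCII bytes, so re-saving this file in another
// encoding cannot turn them into plain C or c.
$patterns = array(
    '/\x{010C}/u', // U+010C, capital C with caron
    '/\x{0106}/u', // U+0106, capital C with acute
    '/\x{010D}/u', // U+010D, small c with caron
    '/\x{0107}/u', // U+0107, small c with acute
);
$replacements = array('&#268;', '&#262;', '&#269;', '&#263;');

// Illustrative sample: "\xC4\x8C" is the UTF-8 byte sequence for U+010C,
// "\xC4\x87" the one for U+0107. Plain C and c must stay untouched.
$sample = "C \xC4\x8C c \xC4\x87";

$result = preg_replace($patterns, $replacements, $sample);
echo $result, "\n"; // prints: C &#268; c &#263;
```

The same idea extends to the other letters in the original list by adding their code points.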

dujmovicv · June 3, 2010, 8:28am

I have checked; here’s the line from my header:


<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

and the issue is still unsolved…
Maybe I should find/replace ALL those characters in my plain txt files?

dujmovicv · June 3, 2010, 10:34am

I tried the PHP line you provided; the result was the same…

logic_earth · June 2, 2010, 9:05pm

Use UTF-8 and send the page as UTF-8…
header( 'Content-Type: text/html;charset=utf-8' );
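
For context, a minimal sketch of how that line might sit in a page script. The helper name send_utf8_header is hypothetical, and the sketch assumes the script file and any text it prints are themselves saved as UTF-8. The point it illustrates is that header() sets the real HTTP response header, which generally takes precedence over a meta element in the markup, and that it must run before any output.

```php
<?php
// Hypothetical helper: declare the charset in the HTTP response header.
function send_utf8_header()
{
    // header() has no effect once output has started, so check first.
    if (headers_sent()) {
        return false;
    }
    header('Content-Type: text/html; charset=utf-8');
    return true;
}

send_utf8_header();

// Assumption: this string (taken from the first post) is stored as UTF-8.
$heading = "SPAN-TECH TRANSPORTERI - NOVO";

// Tell htmlspecialchars() the encoding too, so it escapes consistently.
$html = '<h1>' . htmlspecialchars($heading, ENT_QUOTES, 'UTF-8') . '</h1>';
echo $html, "\n";
```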

dujmovicv · June 2, 2010, 8:44pm

Well, the problem is that if I just type the letters (those with the Slavic ‘marks’) in my PHP editor, the browser won’t display them correctly. Now I have plenty of text to insert (copy/paste) and I don’t want to replace ALL the characters with the corresponding ASCII codes…
Hope I was clear enough!

dujmovicv · June 2, 2010, 11:52am

Thank you rajug, unfortunately str_replace() doesn’t work at all in my case…

logic_earth · June 3, 2010, 9:28am

…That is an HTML meta element. It is not the same thing as what I posted.

Betcour · June 2, 2010, 1:23pm

Not sure what you are trying to do here; it seems the replacement letters are exactly the same as those they are supposed to replace? Or are you trying to convert Unicode symbols to some variant of 8-bit extended ASCII?

Raju_Gautam · June 2, 2010, 11:30am

I am not quite sure about such special characters, but individual characters like these can be replaced with str_replace() alone. Consider the following example:


$string = "cWMKonDl";
echo str_replace(array('W', 'D'), array('www', 'ddd'), $string);
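
Applied to the letters from the first post, the same approach might look like the sketch below. It assumes the input text is UTF-8 and writes each letter as a byte escape so the source file stays ASCII; the $map name and the sample text are illustrative only.

```php
<?php
// Assumption: input is UTF-8. Keys are UTF-8 byte sequences written as
// escapes; values are the matching decimal HTML entities.
$map = array(
    "\xC4\x8C" => '&#268;', // U+010C, capital C with caron
    "\xC4\x86" => '&#262;', // U+0106, capital C with acute
    "\xC5\xA0" => '&#352;', // U+0160, capital S with caron
    "\xC4\x90" => '&#272;', // U+0110, capital D with stroke
    "\xC5\xBD" => '&#381;', // U+017D, capital Z with caron
    "\xC5\xBE" => '&#382;', // U+017E, small z with caron
    "\xC4\x91" => '&#273;', // U+0111, small d with stroke
    "\xC5\xA1" => '&#353;', // U+0161, small s with caron
    "\xC4\x87" => '&#263;', // U+0107, small c with acute
    "\xC4\x8D" => '&#269;', // U+010D, small c with caron
);

// Illustrative sample: plain ASCII letters plus two accented ones.
$text = "OSOVINSKE SPOJNICE, \xC5\xA0 and \xC4\x8D";

// No regular expression is needed for fixed strings.
$converted = str_replace(array_keys($map), array_values($map), $text);
echo $converted, "\n"; // prints: OSOVINSKE SPOJNICE, &#352; and &#269;
```

Because the replacement values are plain ASCII and none of them contains a key, the sequential replacements in str_replace() cannot interfere with each other here.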