Reading the Euro symbol

Hi Everybody,
I am reading the Euro symbol (€) from a data stream that I am parsing and inserting into a MySQL db. The problem is that whenever the symbol occurs, garbage is written into the field in the database. I presume I need to do a str_replace, but when I search for the € character, it's not found.

I guess it's encoded differently? How do I search for it and replace it with ‘€’?

Thanks in advance.

The PHP manual lists the use of:

mysql_set_charset('utf8',$link_name); 

as the preferred way to set the character encoding for a MySQL database connection. It should be called right after connecting to the server and before selecting a database to work with.
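The mysql_* functions used in this thread were removed in PHP 7.0, so on current PHP the same step would be done with mysqli. The following is only a minimal sketch of that idea: the helper name open_utf8_connection and all connection details are hypothetical placeholders, not something taken from the posts above.

```php
<?php
// Hypothetical helper: open a mysqli connection and declare its encoding.
// Host, user, password and database name are placeholders.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    $link = new mysqli($host, $user, $pass, $db);
    if ($link->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $link->connect_error);
    }

    // Declare the connection encoding before any query is sent.
    if (!$link->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $link->error);
    }

    return $link;
}

// Example call (placeholder credentials):
// $link = open_utf8_connection('localhost', 'user', 'secret', 'shop');
```

Like mysql_set_charset, set_charset also tells the client library which encoding is in use, which matters for escaping.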

You probably need to set up the encoding properly.

What do you mean?

I mean that the MySQL db must know which encoding the data it holds is in, and which encoding the data you send is in.
Here I posted a small checklist to ensure you set all encodings properly.

Which encoding does this data stream have?

I’m not sure. I’m reading from an Excel file; how do I find out?

It doesn’t really matter.
You already assume that the data is in utf8, don’t you?
So tell MySQL that your data has this encoding.

How do you read the Excel file? Which library are you using?

Yes, I do this with MySQL:

mysql_query("SET NAMES 'utf8' COLLATE 'utf8_unicode_ci'");

I am using this project to read the file.

http://sourceforge.net/projects/phpexcelreader/files/Spreadsheet_Excel_Reader/

How do you check that symbol in the database?
Is it done with proper encoding set?

To check the DB, I am using the MySQL Query Browser. The table is set up as utf8 with utf8_general_ci collation.

Does this MySQL Query Browser support utf-8?
Your “garbage” seems strange to me, because if utf-8 is used along the whole data path, no recoding is involved and every symbol must remain the same, no matter what the source encoding is.

What does your garbage look like?
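One way to answer that question is to look at the raw bytes instead of the rendered characters. The sketch below assumes a hypothetical helper named describe_bytes. It prints a value as hex and reports whether it is valid UTF-8, using the `//u` pattern trick because preg_match only returns 1 when the subject is well-formed UTF-8.

```php
<?php
// Hypothetical helper: show the bytes a value really contains and
// whether they form valid UTF-8.
function describe_bytes(string $value): string
{
    // With the /u modifier, preg_match() returns 1 only for valid UTF-8.
    $label = preg_match('//u', $value) === 1 ? 'valid UTF-8' : 'not valid UTF-8';
    return bin2hex($value) . ' (' . $label . ')';
}

// Euro sign as the single byte 0x80 (Windows code pages such as cp1250/cp1252).
echo describe_bytes("\x80"), "\n";         // 80 (not valid UTF-8)

// Euro sign encoded as UTF-8.
echo describe_bytes("\xE2\x82\xAC"), "\n"; // e282ac (valid UTF-8)
```

This also suggests why a str_replace for € can fail: a € typed into a script saved as UTF-8 is the three bytes E2 82 AC, which will never match a single 0x80 byte in the incoming data.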

I’m making a little bit of progress. Basically, it seems Excel uses the cp1250 charset. So I told my browser to display the page as cp1250, and the symbol shows up correctly.

So I guess I now need to convert the cp1250 code for the € symbol to utf-8?

I sorted this out using the following line to convert the charset:

$product = iconv('Windows-1252', 'UTF-8//TRANSLIT', $product);

Fixed now :)

You already figured it out, but yes, you need to convert to utf-8 manually. You don’t need the //TRANSLIT part, since UTF-8 is capable of representing all the characters that exist in cp1252.
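As a small illustration of the conversion without //TRANSLIT, here is a sketch with a made-up sample value. The string "10 " followed by byte 0x80 is an assumption standing in for a cell read from the Excel file.

```php
<?php
// Hypothetical sample: "10 " followed by the single byte 0x80,
// which is the euro sign in Windows-1252.
$product = "10 \x80";

// Convert from Windows-1252 to UTF-8; no //TRANSLIT suffix is needed.
$utf8 = iconv('Windows-1252', 'UTF-8', $product);
if ($utf8 === false) {
    throw new RuntimeException('Conversion failed');
}

// The euro sign is now the three-byte UTF-8 sequence E2 82 AC.
echo bin2hex($utf8), "\n"; // 313020e282ac
```

After this step the value can be sent over a utf8 connection as-is.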


It can also be done by setting the client charset properly.
SET NAMES cp1250
should do the trick.
SET NAMES itself was invented to do such things.

Technically yes, but I would recommend doing the conversion in PHP, as livewire figured out himself. The connection charset applies to everything sent over the connection, so by setting it to cp1252 you would have to make everything in the application use this charset.