Accented characters sometimes work, sometimes don't

I have a webpage with lots of accented characters from different languages. In IE8, sometimes the page loads fine but sometimes certain accented characters (e.g. á, í, é, ð) come up as boxes. It seems completely random whether it works or not: just reloading the page without making any changes can show different results.

In my HTML header, I use

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

At the start of the PHP code, I use

header('Content-type: text/html; charset=utf-8');

Any tips on why it’s intermittently working like this?

Some way for us to replicate the problem you are facing will help us to help you.

I tried posting a link in my original post but it wouldn’t let me. The URL is skepticalscience dot com/translation.php

I have reloaded the page 20 times in a row in IE8. I see several boxes, but they never change between reloads as you described.

What am I supposed to be looking for? How are we to experience your issue?

Thanks for the response and sorry it’s a tricky issue. It’s like taking a faulty appliance to the repairman: sometimes it works, sometimes it doesn’t, and it seems completely random. I’ve been hitting reload here. Sometimes it shows weird characters like this:

�El sol ha estado enfri�ndose o calent�ndose en las �ltimas d�cadas?

Sometimes the same line displays fine:

¿El sol ha estado enfriándose o calentándose en las últimas décadas?

Sometimes clicking on one of the links then hitting back toggles from one to the other. Could it be a caching issue?

BTW, now I’m up to 5 posts, the link is http://www.skepticalscience.com/translation.php and an example of a subpage is http://www.skepticalscience.com/translation.php?a=8&l=2

Do you experience the same trouble with Mozilla Firefox or Google Chrome, or other browsers than Internet Explorer?

Works fine in Mozilla - doesn’t seem to do any weird characters upon reloading. Only seems to happen in IE8. Do you see the bad characters or do all the accented characters display ok for you?

Page: http://www.skepticalscience.com/translation.php?a=8&l=2

blue title - box instead of accented text
pink background - has accented text
green title - boxes instead of accented text
blue background - boxes instead of accented text

No amount of refreshing/reloading shows any changes to what is accented and what are boxes.

When I got your message, I went to the page and saw the same thing.

I hit reload and all the boxes disappeared; the accented text displayed okay.

This is in IE8.

Just noticed similar problem in Mozilla. First loaded the page, the accented characters showed up not as boxes but some other weird character. Hit reload, they worked okay. Hit reload again, back to broken and multiple reloads stays broken. Left it for a minute, went to IE, came back to Mozilla, hit reload, fixed again. So both browsers reporting similar issue.

Do you see same problem in both browsers?

I don’t have Firefox on this box, but have Google Chrome 4

With Chrome, I see all the proper accents, no boxes at all, even after many refreshes.

Edit: and going back, I do see the boxes with Chrome.

We need to find out what is happening with the title first.

How is it stored in the database? How is it accessed? What happens to it on the way to the page?

It’s stored in a MySQL database. The MySQL charset is UTF-8 Unicode (utf8)

I access it via basic query with no encoding or decoding:

"SELECT * FROM translation WHERE ArgumentId = ".$_GET['a']." AND LanguageId = ".$_GET['l']

I then output with simple echo (I use a $db class), again no encoding or decoding:

echo "<title>".$db->Record['Argument_Title']."</title>";

There is a way to convert a string into its component hex values, but it slips my mind at the moment.

Putting that information in an HTML comment in the page header would allow us to tell where the difference lies.

If the display of the text changes while the hidden comment shows the same hex codes, it’s something about the browsers at hand.

If the display of the text changes and the hidden comment shows different hex codes, we need to investigate the PHP code more closely.
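The function that slipped the mind here may be bin2hex(), which returns the hex value of every byte in a string. Below is a minimal sketch of the hidden-comment idea. The helper name hexComment() and its label argument are made up for illustration and do not come from the site's code.

```php
<?php
// Hypothetical helper: wraps the hex dump of a string in an HTML comment
// so the raw bytes can be compared between page loads via "view source".
function hexComment(string $label, string $bytes): string
{
    // bin2hex() emits two hex digits per byte, so multi-byte UTF-8
    // characters show up as several pairs (e.g. "c3a9" for é).
    return '<!-- ' . $label . ' hex: ' . bin2hex($bytes) . ' -->';
}

// The same word as UTF-8 bytes and as ISO-8859-1 bytes gives different dumps.
echo hexComment('utf8', "d\xC3\xA9cadas"), "\n";
echo hexComment('latin1', "d\xE9cadas"), "\n";
```

In the page itself this could be called as echo hexComment('title', $db->Record['Argument_Title']); right after the title tag. If é appears as c3a9 the bytes are UTF-8; a lone e9 would indicate ISO-8859-1 bytes being sent under a UTF-8 header.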

The hidden comment is exactly the same as the displayed text. This seems to point towards a server-side issue. As a bit of speculation, I tried using the PHP function utf8_encode() and, fingers crossed, it seems to be displaying all the text at http://www.skepticalscience.com/translation.php okay. Does it look good to you?

No, that’s not it, it still toggles on and off. However, I have isolated the problem by looking at the raw database.
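For context on why utf8_encode() may not help: it treats its input as ISO-8859-1 and converts it to UTF-8, so running it on text that is already UTF-8 encodes it a second time. The sketch below assumes the text arrives as either valid UTF-8 or ISO-8859-1, which nothing in this thread confirms, and converts only when the bytes are not already valid UTF-8. All three function names are made up for illustration.

```php
<?php
// Hypothetical helper: true when the byte string is well-formed UTF-8.
// An empty pattern with the /u modifier matches only if the subject is valid UTF-8.
function isValidUtf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

// Hypothetical helper: converts ISO-8859-1 bytes to UTF-8 by hand.
// Every byte >= 0x80 becomes a two-byte UTF-8 sequence.
function latin1ToUtf8(string $bytes): string
{
    $out = '';
    foreach (str_split($bytes) as $char) {
        $code = ord($char);
        if ($code < 0x80) {
            $out .= $char;
        } else {
            $out .= chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F));
        }
    }
    return $out;
}

// Converts only when needed, so text that is already UTF-8 is left alone.
function ensureUtf8(string $bytes): string
{
    return isValidUtf8($bytes) ? $bytes : latin1ToUtf8($bytes);
}

echo bin2hex(ensureUtf8("\xE4")), "\n";       // ISO-8859-1 ä, gets converted
echo bin2hex(ensureUtf8("\xC3\xA4")), "\n";   // UTF-8 ä, left untouched
echo bin2hex(latin1ToUtf8("\xC3\xA4")), "\n"; // blind conversion double-encodes
```

The last line shows the double-encoding effect: converting bytes that were already UTF-8 produces four bytes where two were expected.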

Accents that are prone to going wrong are stored as literal accented characters in the raw database: characters like ä. However, wherever the HTML entity is used instead (e.g. &auml; instead of ä), it works fine. So a hack solution would be to go through the entire database and replace every accented character with its HTML entity. Or I could figure out why the browser sometimes shows the character OK and sometimes doesn’t. If no other option presents itself, I will have to go with the hack option.
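If it does come to the hack option, the replacement could be done in PHP rather than by hand. This is only a sketch and it assumes the stored text is valid UTF-8. The function name accentsToEntities() is made up for illustration. It touches only non-ASCII characters, so any HTML markup stored alongside the text is left alone.

```php
<?php
// Hypothetical helper: replaces each non-ASCII character in a UTF-8 string
// with its named HTML entity where one exists (e.g. ä becomes &auml;).
function accentsToEntities(string $utf8Text): string
{
    $result = preg_replace_callback(
        '/[^\x00-\x7F]/u', // any single character outside the ASCII range
        function (array $match): string {
            // Characters without a named entity are returned unchanged.
            return htmlentities($match[0], ENT_QUOTES, 'UTF-8');
        },
        $utf8Text
    );

    // preg_replace_callback() returns null when the input is not valid UTF-8;
    // in that case hand the original bytes back untouched.
    return $result === null ? $utf8Text : $result;
}

echo accentsToEntities("<b>d\xC3\xA9cadas</b>"), "\n";
```

This works around the symptom only. The text would display correctly whatever encoding the page is decoded as, but the underlying cause would still be there.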

If it was changing back and forth like you say, without you doing anything, you have some other issue.

Are you using persistent database connections?
Are you messing with the iconv or mbstring extensions, or their settings?

I don’t know what persistent db connections are, or iconv or mbstring extensions - so I’m guessing no. I think it makes a fresh db connection each time the page loads.
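One thing that is often checked in situations like this is the character set of the database connection itself, which is separate from the charset of the tables. Nothing in this thread confirms that it is the cause here. The sketch below builds a PDO connection string that states the charset explicitly. The helper name, host, database name and the choice of utf8mb4 are all assumptions for illustration, and the site's own $db class may connect differently.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that names the connection
// charset explicitly, so it does not depend on server or client defaults.
function buildMysqlDsn(string $host, string $dbName, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

echo buildMysqlDsn('localhost', 'example_db'), "\n";

// Usage sketch (needs a real server and credentials, so it is left commented out):
// $pdo = new PDO(buildMysqlDsn('localhost', 'example_db'), $user, $password);
//
// With mysqli the equivalent step would be:
// $mysqli->set_charset('utf8mb4');
```

Setting the charset on every connection means each page load talks to MySQL in the same encoding, whether the connection is fresh or reused.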