Proper markup of bilingual title tag

In HTML 4.01, given a page including, among other content, the following tags:

<html lang="zh-TW">
<title>&#20027;&#38913;/Chinese home page</title>

what is the proper way to mark it up for language? I’d like to code:

<title>&#20027;&#38913;<span lang="en">/Chinese home page</span></title>

but I suspect that the span tag would be invalid inside a title tag.

The span would be invalid there; the content of the title tag needs to be just plain text.

You should mark the whole page’s language as Chinese if that’s the Chinese page. Any English on that page would be inconsequential.

As was said, HTML is the root element, so TITLE inherits the lang value set on HTML, since it is a descendant of HTML. Though of course TITLE is not considered part of the flow of text and cannot contain other markup.

what is the proper way to mark it up for language?

You cannot mark it up properly. This is a drawback of how the <title> tag was designed.

You’ll have to decide which language gets preference, and use the lang attribute for that one (unless the one you choose matches the one you have already on the html element). I have the same problem with Dutch/English titles.

The reason you are using the attribute at all is mainly for screen readers. With the proper lang attribute, a screen reader can choose the correct voice with the correct pronunciation (if it has that voice installed). This means if I have an English lang page with a title that says
<title lang="nl">Stomme poes</title>
then a screen reader with a Dutch voice will correctly pronounce that title, and can optionally announce the language of that tag (as it can for the page as a whole).

In your case, you can’t have two languages in one tag both marked up correctly, so you’ll have to pick one. It’s not your fault.
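To make the two choices concrete, here is a rough sketch using the title from the original question. The DOCTYPE, the head and body wrappers, and the placeholder paragraph are my additions for completeness, not part of the original page.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="zh-TW">
<head>
<!-- Option 1: no lang on the title, so it inherits zh-TW from the html element -->
<title>&#20027;&#38913;/Chinese home page</title>
<!-- Option 2 (use instead of option 1, not alongside it): give English preference
<title lang="en">&#20027;&#38913;/Chinese home page</title>
-->
</head>
<body>
<!-- placeholder content, assumed for illustration only -->
<p>&#20027;&#38913;</p>
</body>
</html>
```

Either way, one of the two languages in the title ends up labelled with the wrong language; that is the limitation described above.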

For the rest of your page, you can of course use the span as a hook to add the lang="en" as needed, and I encourage you to do this if you aren’t already. This will work best for screen readers that have both English and Chinese voices, and will prevent the mangled noises you get otherwise.
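Something like this is what I mean for the body of the page. It is only a sketch: the paragraph text is borrowed from the title in the question, and the surrounding markup is assumed.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="zh-TW">
<head>
<title>&#20027;&#38913;/Chinese home page</title>
</head>
<body>
<!-- The paragraph inherits zh-TW; only the English phrase is overridden -->
<p>&#20027;&#38913; <span lang="en">Chinese home page</span></p>
</body>
</html>
```

Unlike the title, a paragraph can contain other markup, so both languages can be labelled correctly here.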

Hi everyone,

I just joined today and I have a similar issue. I would like to ask for advice, since I always have to work on my own. I am working on a bilingual website in Basque (Euskara) and English. There won’t be two different sites; it will all go in one site, mainly in English, with some comments, titles, links and the occasional article in Basque on some pages. In these cases, there’ll be a paragraph in English, and right below it the Basque translation in the next paragraph. I guess I can code those in Basque as <p lang="eu"></p>…? I say I guess as I’ve never done it. Likewise, should I use <span lang="eu"></span> to wrap Basque words when inserting them?

How about the head section? I am coding in HTML5; should I mention the language at all in the head section? The way I have learned HTML5, the language is not mentioned in the head, only the <meta charset="utf-8" />. I am guessing it is optional…?

Any other advice will be welcome and appreciated. This website is for an association working for the community very much like a charity, so I do this for free. I want to do a decent job. If anyone can tell me if this is the right section to talk about html and css, I would appreciate it as well. Thanks in advance and all the best.

You can switch languages in a single document the way you’ve described.

<header lang="en">English text here, no matter how many inner elements there are, so long as they are all in English… though if there is one word or phrase in Basque, you can always override with an inner element: <span lang="eu">Basque phrase here</span> and this text is automatically back to English.</header>

Use a span only if there isn’t already another tag around the text in the other language (if there is, just put the lang attribute on that tag); otherwise a span is fine.

According to the specs, the lang attribute applies to everything inside the element it is placed on, and is overridden by a lang attribute on any element inside (it doesn’t really matter what the inner element is, except that the new overriding language applies only within the scope of that tag).

<p lang="zh">This paragraph and all inside are Chinese <a href="foobar" lang="ja">but this link is in Japanese <span lang="es">with a Spanish phrase inside</span> more Japanese text</a> this is back to Chinese</p>

Now I remember when I had an older version of JAWS where we used links for people to switch languages. Firefox at the time was trying to speed up the user experience by prefetching all links on a page. Until we added in the no-prefetch meta tag, using JAWS with Firefox meant that after hitting the last link (which was in Portuguese), the rest of the page was being read as if it were in Portuguese, even though the text was Dutch and the HTML tag had lang="nl" on it (this was a bug in JAWS; the HTML was fine).

So maybe test in a few popular screen readers that have Basque and English voices (I’m surprised this page doesn’t have any Spanish dialects on it) to make sure some bug in them doesn’t cause problems… or just be aware of the possibility if you get a complaint in. But it’s pretty easy to nest the languages correctly in the HTML.

You said “mainly in English” so I take it you want
<html lang="en">
and just add in the lang="eu" where necessary.
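Put together, a page like the one you describe might look roughly like this. It is a minimal sketch: the title and the English and Basque sample text are made up for illustration, so swap in your real content.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- hypothetical title, for illustration only -->
  <title>Association home page</title>
</head>
<body>
  <!-- Inherits lang="en" from the html element -->
  <p>Welcome.</p>
  <!-- Basque translation of the paragraph above -->
  <p lang="eu">Ongi etorri.</p>
  <!-- A single Basque word inside English text -->
  <p>The Basque word for hello is <span lang="eu">kaixo</span>.</p>
</body>
</html>
```

The language goes on the html element rather than in the head, so the head keeps only the charset and the title.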