If I code in strict XHTML will it validate in transitional doctype?

4SeeN · November 12, 2010, 5:59pm

Just wanted to thank everyone who posted in this thread, I learned a lot!

Cheers!

clairs · November 14, 2010, 2:35pm

It’s perfectly fine to use XHTML syntax in the new HTML(5) doctype - I do it all the time because I prefer the stricter syntax. If you want to “convert” your website to “HTML5” all you need do is change the doctype at the top. All browsers (including IE6) will still display your page in the same way, so it is completely safe to do this. It’s also completely safe to use the new features of the spec, with sufficient fallback (which will not take you long to do)

When the W3C spec becomes recommendation, it will still be ok to use the HTML4 and XHTML doctypes - support for these will not go away. You’ll just probably have to change your doctype when you want to use new features. And then work on HTML will continue: it will still use the same new HTML doctype

sitepoint.com (not the forums) uses the new HTML doctype with XHTML syntax

system · November 15, 2010, 3:07pm

Code in Strict, deploy in Tranny is common practice for when you can’t trust the next joker to come along to have any business writing markup in the first place (Which it sounds like your marketing department falls into the category of). There are ZERO problems since technically STRICT is a SUBSET of Transitional – Tranny just has all the extra crap you have no business using in coding a website after 2002 still in it. (basically when we were able to kick Nyetscape 4 to the curb)

Just kiss off having validation that means a damn since the HTML5 spec is so loose it makes Belladonna look like the Mother Mary. (Just another of the reasons I have ZERO use for that steaming pile of manure known as HTML5 – the day it becomes ‘expected’ for new sites is probably the day I stop developing altogether)

Though as already implied by many posters, not that ANYONE has any business using HTML5 for production sites for the foreseeable future.

Which is something a lot of people seem to lose sight of – lands sake HTML 3.2 still works just fine if you feel like a trip in the wayback machine to 1997.

Christmas on a cracker not this bull again. It says in the XHTML specification it’s ok to change the mime-type to text/html – so how is it NOT XHTML? WAY too much value is put on mime-type by some folks, when frankly I don’t even think it should be relevant and was a bad idea from the start!

IF you want to use it as an XML application, then sure, serving it with the allegedly “wrong” doctype prevents that – but if you want an XML application you should be using 1.1, not 1.0

1.0 exists to bridge the gap between XML and HTML being subsets of BOTH – this is WHY either mime-type is valid. It gives you the clean consistent rules of XML, the ability to parse it with a standard XML parser and backwards compatibility to HTML. If you are trying to use it for anything more than that, you missed the point of XHTML 1.0

“Oh but it’s not true XHTML” – BULLCOOKIES.

felgall · November 15, 2010, 6:24pm

The MIME type is what tells the browser how it is supposed to handle the content.

If you tell the browser that it is HTML then it processes it as HTML regardless of the doctype (which the browser ignores except for checking that there is one).

So if you have an XHTML doctype and an HTML MIME type then the source is written in XHTML but the browser interprets it as HTML.

If you want the browser to interpret it as XHTML instead of HTML then you have to give it an XHTML MIME type.
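
As a rough illustration of the point above, the Python sketch below picks a parsing mode from the Content-Type header alone and never looks at the doctype. The function name `choose_parser` and the exact set of XML media types are assumptions made for this example, not something a browser exposes.

```python
# Hypothetical helper: decide how markup would be handled based only on
# the MIME type it was served with. The doctype plays no part here.
XML_MEDIA_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def choose_parser(content_type: str) -> str:
    """Return "xml", "html" or "unknown" for a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_MEDIA_TYPES:
        return "xml"
    if media_type == "text/html":
        return "html"
    return "unknown"


if __name__ == "__main__":
    for header in ("text/html; charset=utf-8", "application/xhtml+xml"):
        print(header, "->", choose_parser(header))
```

The same XHTML 1.0 source would take the "html" branch or the "xml" branch depending purely on what the server sends.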

That XHTML 1.0 can be served as HTML, XHTML, or XML is what allows its use to transition from one to the other as all you have to do to convert your site from HTML to XHTML (at least as far as the (X)HTML is concerned) is to change which MIME type you use. You can’t do that with any other version of (X)HTML.

That it can be served as HTML does not make that HTML into XHTML though. When you serve XHTML as HTML you can only use the sub-set of XHTML that is able to be interpreted as HTML. For example:

perfectly valid XHTML 1.0 if served as XHTML but cannot be served as HTML and so cannot use the HTML MIME type.

Stevie_D · November 15, 2010, 11:04pm

deathshadow60:

Christmas on a cracker not this bull again. It says in the XHTML specification it’s ok to change the mime-type to text/html – so how is it NOT XHTML? WAY too much value is put on mime-type by some folks, when frankly I don’t even think it should be relevant and was a bad idea from the start!

IF you want to use it as an XML application, then sure, serving it with the allegedly “wrong” doctype prevents that – but if you want an XML application you should be using 1.1, not 1.0

1.0 exists to bridge the gap between XML and HTML being subsets of BOTH – this is WHY either mime-type is valid. It gives you the clean consistent rules of XML, the ability to parse it with a standard XML parser and backwards compatibility to HTML. If you are trying to use it for anything more than that, you missed the point of XHTML 1.0

“Oh but it’s not true XHTML” – BULLCOOKIES.

One of the requirements of XHTML is zero error compensation. Any syntactical error must result in a YSOD, user agents must not try to recover from it.

As far as I am aware, there are no browsers that do this when it is served as text/html, therefore there are no browsers that treat pages served as text/html as XHTML. You are getting none of the advantages of XHTML, but most of the headaches, and you’re setting yourself up for some really big headaches in the future. The only reason that you are allowed the kludge of serving XHTML as text/html with the *******ised self-closing tags (that are technically wrong in both languages) is because W3 realised that MS were so stuck in the stone age that with IE not supporting proper XHTML, if they didn’t introduce any such kludge, XHTML would have been stillborn - if there was no way to make it work with IE, nobody would have given it the second glance that it didn’t deserve.
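
The difference in error handling can be sketched with Python's standard library, under the assumption that `xml.etree.ElementTree` stands in for an XML user agent and `html.parser.HTMLParser` (a tokenizer, not a full browser parser) stands in for tolerant HTML handling. The sample markup and the helper names are invented for this illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Invented sample: <br> is never closed, so this is not well-formed XML.
BROKEN = "<html><body><p>Unclosed paragraph<br></body></html>"


class TagCollector(HTMLParser):
    """Records every start tag the tolerant HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def xml_accepts(markup: str) -> bool:
    """True if a strict XML parser accepts the markup, False if it halts."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def html_start_tags(markup: str) -> list:
    """Start tags seen by the HTML tokenizer, which does not halt on errors."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print("XML parser accepts it:", xml_accepts(BROKEN))
    print("HTML tokenizer saw:", html_start_tags(BROKEN))
```

The XML parser refuses the whole document at the first well-formedness error, while the HTML side carries on and reports every tag it finds.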

system · November 16, 2010, 5:04am

Which again I’ve felt was a bad idea from the start – it’s more of your typical unix server geek ‘file extensions are evil’ nonsense; funny since 99% of files have their mime-type set by IIS or Apache by their filename…

… and since the specification allows for either and if you follow the recommendations and guidelines it should be processed identically by either, what difference does it make? Answer? NONE.

WHICH IS THE POINT OF XHTML 1.0!

Due to SCRIPT NOT being an empty content model container (you are familiar with the HTML content models, right?), you shouldn’t be using the shorthand self-closing on it. Only “empty model” elements can use the shorthand and guess what? SCRIPT isn’t on the list of empty elements. It’s like making an empty <p></p> – in XHTML <p /> is invalid! (well, not invalid but it’s not recommended either) You’re thinking of the XML rules, NOT the XHTML rules. There’s a difference! Yes, “/>” is valid on any element in XML – XHTML 1.0 is NO MORE XML than it is HTML, it’s a subset of BOTH.

Now if you wanted to say that about 1.1 or later, FINE… but for 1.0 that’s NOT TRUE. I really don’t get why people seem to act like the appendix C guidelines is the work of the Great Satan or something.

… and I quote:

XHTML 1.0 (this specification) is the first document type in the XHTML family. It is a reformulation of the three HTML 4 document types as applications of XML 1.0 [XML]. It is intended to be used as a language for content that is both XML-conforming and, if some simple guidelines are followed, operates in HTML 4 conforming user agents.

That’s the POINT of it, XML conforming HTML 4 that can be served as HTML for legacy agents. That’s what XHTML 1.0 is – and saying that it ceases to be XHTML when you follow those “simple guidelines” INCLUDED IN THE SPECIFICATION DEFINING IT is 100% horse-hockey.

But then I’ve never understood the appeal of the whole XML application nonsense either so… Always seemed like a fat bloated train wreck of using XML for something it was never meant to be; though that’s my opinion of XML databases as well and the idiocy of people calling XML ‘machine readable’… said people apparently not understanding what machine readable is. (That or my machine language background giving me a different definition!)

felgall · November 16, 2010, 5:47am

Now you are really getting HTML and XHTML really confused. The XHTML content model is not the same as the HTML one.

In XHTML it is perfectly valid to use a self closing script tag and the same is true for any other tag where there is no content. That’s one of the differences between HTML and XHTML.

If HTML and XHTML were identical then there’d be no reason for Appendix C which specifies the subset of XHTML 1.0 that is allowed to be served as HTML. If you ignore Appendix C of the XHTML 1.0 specification (which deals specifically with serving it as HTML) then there are lots of valid XHTML constructs that the XHTML 1.0 standard allows you to use that cannot be handled as HTML. You are only required to follow Appendix C if you are going to serve it as HTML. Appendix C is the transitioning rules - you must follow it while still serving your XHTML as HTML but can disregard it once IE8 dies and you can complete the transition to full XHTML 1.0.

system · November 16, 2010, 6:09am

NOWHERE in the actual XHTML 1.0 specification does it actually say that. It does say:
Empty elements must either have an end tag or the start tag must end with />

But nowhere does it say you can do that on just any tag. Do they mean empty as in devoid of content, or do they mean empty MODEL elements as defined by HTML 4? If it’s a reformulation of HTML4, it should mean empty model (IMG, BR, HR, INPUT, BASEFONT, etc…) vs. content model (P, DIV, OBJECT, H1, etc…)… guess which category SCRIPT falls into? The defining syntax in the HTML 4 specification says it has an opening and closing tag, which makes it a content element. Only elements that are NOT defined in the HTML 4 specification with closing tags are the ones that should get the self closer.

Since the XHTML specification says right at the top:
The semantics of the elements and their attributes are defined in the W3C Recommendation for HTML 4

Let’s look at an empty element in the specification:

<!ELEMENT HR - O EMPTY -- horizontal rule -->

Then let’s look at SCRIPT:

<!ELEMENT SCRIPT - - %Script; -- script statements -->

It is not defined as empty, as such it does NOT qualify under Rule 4.6 for using the self closer!
Empty elements must either have an end tag or the start tag must end with />. For instance, <br/> or <hr></hr>.

Since it’s a content element, the specification format for SCRIPT is Rule 4.3
In SGML-based HTML 4 certain elements were permitted to omit the end tag; with the elements that followed implying closure. XML does not allow end tags to be omitted. All elements other than those declared in the DTD as EMPTY must have an end tag. Elements that are declared in the DTD as EMPTY can have an end tag or can use empty element shorthand

Lemme just restate that again breaking it out and fixing the wording to something clearer:

All elements NOT in the DTD as EMPTY must have an end tag.
Elements that are declared in the DTD as EMPTY can have an end tag or can use empty element shorthand

Since SCRIPT is NOT defined as EMPTY, it must use an end tag and NOT the shorthand!

So <script /> and <p /> are in fact INVALID XHTML 1.0 – even if the validator is too stupid to check said rules.

The devil is in the details – in this case the wording. When they say an “empty element” they do NOT mean an element that has no content between the opening and closing tags! They mean, as it says in the specification “declared in the DTD as EMPTY”

Which means the shorthand closer is only for use on BASE, BASEFONT, META, LINK, HR, BR, PARAM, IMG, AREA, INPUT, COL, ISINDEX, and if using a FRAMESET doctype, FRAME. That’s WHY it’s called the “Empty Element Shorthand”… you can only use it on elements declared as EMPTY in the DTD!

BY the specification! It might be valid XML to slap it on anything any old way, but XHTML has specific rules BECAUSE it’s a reformulation of HTML. That would be like expecting full SGML rules to work in HTML (which also doesn’t work). Bottom line, DTD rules trump the XML rules in XHTML…
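
The reading argued in this post can be turned into a small check. This is only a sketch of that interpretation (the thread disputes whether it is a validity rule or a guideline). The element list is the one given above, while the regular expression and the function name `misused_shorthand` are assumptions of this example. The pattern is deliberately naive and would be fooled by unquoted attribute values ending in `/` or by `>` inside an attribute.

```python
import re

# Elements declared EMPTY in the DTDs, as listed in the post above.
EMPTY_ELEMENTS = {
    "base", "basefont", "meta", "link", "hr", "br", "param",
    "img", "area", "input", "col", "isindex", "frame",
}

# Naive pattern for a tag written with the "/>" shorthand.
SELF_CLOSED = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^<>]*/>")


def misused_shorthand(markup: str) -> list:
    """Names of self-closed elements that are not declared EMPTY."""
    return [
        name
        for name in SELF_CLOSED.findall(markup)
        if name.lower() not in EMPTY_ELEMENTS
    ]


if __name__ == "__main__":
    sample = '<p /><br /><script type="text/javascript" src="a.js" />'
    print(misused_shorthand(sample))
```

A validator that only checks the document against the DTD as XML would not flag these, which is why a separate lint-style pass like this is needed to enforce the rule as described.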

felgall · November 16, 2010, 6:17am

That’s exactly what they mean.

XHTML 1.0 is a derivation of XML. It is not derived at all from HTML except in that it has been deliberately set up to look like HTML so that a subset of it can be served as HTML.

You quoted a part of the standard where it said <hr></hr> is valid XHTML. Well it isn’t valid HTML because in HTML <hr> is defined as EMPTY and HTML relies on that to determine which tags need closing and which don’t while XHTML requires all tags to be closed but doesn’t care whether you use <tag/> or <tag></tag> to do it as both are always valid XHTML. XHTML doesn’t need to define a set of EMPTY tags because all tags must be closed.
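
At the XML level the two spellings are indistinguishable, which a short Python sketch can show. It assumes `xml.etree.ElementTree` as a stand-in for a generic XML parser. The helper name `xml_equivalent` is made up for this example.

```python
import xml.etree.ElementTree as ET


def xml_equivalent(first: str, second: str) -> bool:
    """True if both fragments parse to the same serialised XML element."""
    # Parsing discards the choice between <tag/> and <tag></tag>, so
    # re-serialising both gives the same bytes when the content matches.
    return ET.tostring(ET.fromstring(first)) == ET.tostring(ET.fromstring(second))


if __name__ == "__main__":
    print(xml_equivalent("<hr/>", "<hr></hr>"))
    print(xml_equivalent("<p/>", "<p></p>"))
    print(xml_equivalent("<p/>", "<p>text</p>"))
```

This only shows what a generic XML parser sees. It says nothing about whether XHTML 1.0 itself permits the shorthand on a given element.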

Also, why are you leaving spaces before your />? That is only necessary when serving XHTML as HTML to run in Netscape 4.

system · November 16, 2010, 6:36am

Read the rest of the post - NO it isn’t. They mean, as they explain in section 4.3 and 4.6, elements declared as EMPTY in the DTD!

No clue where you even get that notion – have you even READ the specification? I mean, it says it over and over again in every declarative:

This specification defines the Second Edition of XHTML 1.0, a reformulation of HTML 4 as an XML 1.0 application, and three DTDs corresponding to the ones defined by HTML 4. The semantics of the elements and their attributes are defined in the W3C Recommendation for HTML 4.

XHTML is a family of current and future document types and modules that reproduce, subset, and extend HTML 4

It is a reformulation of the three HTML 4 document types as applications of XML 1.0

… and I could quote another dozen passages re-iterating that point!

No they AREN’T – READ SECTION 4 and try comprehending it! Yes, it’s “informative” – but that just means it’s EXPLAINING the rules.

All elements other than those declared in the DTD as EMPTY must have an end tag. Elements that are declared in the DTD as EMPTY can have an end tag or can use empty element shorthand.

The empty element shorthand is for elements declared as empty in the DTD, NOT FOR ELEMENTS THAT JUST HAPPEN TO BE EMPTY! – which is why it should NOT be used for <p /> or <div /> or <script /> – by the specification that is invalid XHTML! (even if it is valid XML)

Actually I’ve also seen it trip something similar to the disappearing content bug in IE7 and 8 when scrolling. If it’s in the guidelines I follow it – you never know when something stupid will leave you tearing your hair out… though in my production code my /> often tend to end up on their own line – though that depends on how many parameters the tag has and if it breaks the 76 column rule. (again showing my old-school formatting tendencies on that one)

system · November 16, 2010, 7:01am

Oh, and “empty elements” is another of those bits of the specification that people do often misunderstand as they take the words at face meaning without the appropriate context.

“empty elements” does not mean elements that have no content in the markup, it means elements declared as ‘empty’ in the DTD.

“importance” is another one you get with heading tags a lot since they mean ‘importance’ structurally, not as in the content. The content of the H1 is not supposed to mean the most important text on the page, it’s just the one with the most import structurally – which is to say all other headings are by definition subsections of it. (which is why the people who use h1’s for their content area are actually misusing them on most layouts!)

“paragraph” does not mean the typographical mark, it means a grammatical structure.

etc, etc, etc…

felgall · November 16, 2010, 8:08pm

Okay, I have read through the specification carefully and we are both wrong.

I am wrong because I assumed that they had got rid of the definition of EMPTY since it is in fact completely unnecessary in any XML based language (as XHTML is).

So <script> which should have been defined as EMPTY in HTML can’t use the short form of the close in XHTML.

You are wrong in that the /> is only an alternative way of closing EMPTY tags. All tags in XHTML may be validly closed using <tag></tag>.

This means that the following are valid XHTML but invalid HTML (and so my original argument still stands apart from my having used the wrong example).

Getting rid of the definition of EMPTY would not have allowed <p/> even with no mention of it in the specification since to have a paragraph you must have some content and so <p></p> is semantically meaningless even in HTML.

system · November 16, 2010, 8:16pm

This is a good point… but with a not so fortunate wording choice. Because elements declared empty in DTD actually have no content, the same with those elements that have no content but are allowed by their DTD definition to have one. “Empty elements” applies to both.

My simple understanding of an element: start tag, content, end tag.

On DTD declared empty elements, we have only the start and the end tag, or a self closing tag. Content is missing. By prohibition.

Elements that are not restricted in DTD to be empty, like <span>, which don’t have anything between their tags, those elements are also called… empty. Content is missing. By choice. They are empty in that particular context. How else would you call them?

It’s all about the context. That way you know which “empty element” category the element fits in.

These are a bit convoluted. And false.

Convoluted because it replaces the natural order: user, text, developer. It’s about the user experience, not about the developer’s understanding and logic. Specs and UAs are created to serve the user with text, not to give a bigger importance to developers when they talk “semantics” :). At first it is the user, to whom you direct your developer efforts, then it’s the text, having classic old rules DTD borrow from, and the developer is last, USING a system, not reinventing or assigning to his own will.

It’s false because when you have a definition saying an element has a bigger importance than another, it truly does say that the rendering should make the user aware of that thing. Literally. In DTD, specs etcetera, you, the developer, come in last place. First comes the user, then comes the text. So “importance” has to do with the user and how he sees the text, not with a coded text, from a developer to another to understand.

And it’s a moot point to say that some other text on page may be made to look bigger when in fact it’s a… shout out. That’s called exception. We all understand and live with it.

This is actually where many developers are making a BIG mistake: they think the markup is for their semantic play and has nothing to do with the user or the text. No, the markup is there in the first place for developers to achieve a semantic for the user (using classic text rules), not for them, the developers.

I look on in awe when one developer or another comes and says: “It’s a paragraph, but a developer’s paragraph, not a classic one. It’s a new breed, it doesn’t say “text” it says “developer semantic element”!” Let’s get real: we talk about text. Some developers only invented contradictions in order to have a Genesis experience of their own.

system · November 17, 2010, 12:58am

No, you just misread what I was saying… and quoting. Again:

Chinese==Oriental
Oriental!==Chinese

specifically:

NOWHERE did I say that. I said only EMPTY (by the DTD definition) elements can use the empty element shorthand, I did NOT say that was the only closing they could use, and in fact repeatedly quoted the section that says they can use EITHER method! ONLY “empty” by the DTD elements can use the empty element shorthand, ANY element can use the fully qualified closing tag.

No, it shouldn’t have been declared as an empty element because SCRIPT can in fact contain content…

EMPTY elements can NEVER contain content… that’s why they are marked as special in the DTD in the first place!

Though IMHO SCRIPT is the wrong tag for including a javascript file – that should be LINK’s job! Shame it doesn’t work that way.

Except what’s the core rule of HTML in regards to unknown/unrecognized tags? They’re to be ignored (though their content should render)… so under the rules of HTML they have ZERO impact. One of those early forward thinking rules of HTML that has served us quite well.

IF I’m understanding that properly (Not sure I am) then I’d modify the wording on that a bit further.

By definition EMPTY elements in the sense of the specification says that the element CANNOT have content. That’s why they’re called “empty elements”. That’s the difference in the wording that trips up most people. They’re NOT talking about IF an element has content, but if it is ABLE to have content.

Which none of those elements on that list I spewed out a few posts back are ABLE to have content between an opening and closing tag. That’s EXACTLY what they are referring to when talking about an empty element in the specification.

Though the rest of noonnope’s post I’ll need a translator for as I can’t make any sense out of that gibberish which seems to contradict itself in every paragraph.

felgall · November 17, 2010, 1:28am

Except that you implied it by your response because those constructs are valid XHTML but invalid in HTML which is the point I was making which you were disagreeing with.

Which is exactly why I am saying that such inline garbage should have been prohibited in the first place by defining <script> as EMPTY.

And since they didn’t define script as empty from the start defining link that way would have been the next best thing.

Which is one of the reasons WHY JavaScript should have never been allowed inline.

But an <img> tag should be allowed to have content - then we wouldn’t need the alt attribute and it would work exactly the same way as other tags with fallback content.

system · November 17, 2010, 3:20am

Not allowing javascript inline would be absurdly prohibitive – how do you pass values unique to the page to the parent script? It has its place – it’s only abusive/a problem when it’s used to pass STATIC script like functions that are not unique to that page.

Saying SCRIPT can never contain content is impractical at best, neuters the functionality almost entirely at worst. How would you then pass the tracking information for things like adsense or analytics? Would you force an extra user file? Force them to serve their own copy they have to edit instead of just linking to a common file off the package’s server and enjoying the fact that every site that uses it can share that same cached copy?

It has its place – getting rid of the capability to do it entirely is impractical and nonsensical.

system · November 17, 2010, 3:35am

Oops, didn’t see this response thanks to the page break.

REALLY?!? … and where exactly in the XHTML 1.0 specification does it say that? Here’s a tip – It doesn’t!. It is in fact one of the parts of the XML specification REMOVED in the XHTML 1.0 subset so that it remains backwards compatible to HTML… again, that’s the POINT of 1.0!

That is one of those cases of the “XML for everything” die-hards REALLY wanting XHTML 1.0 to be something it isn’t…

system · November 17, 2010, 5:11am

What do you call, then, an element having no content, either because it CANNOT or because it DOESN’T have any? An “empty element”. So you’re contradicting yourself in this paragraph, until you provide an alternative name for those elements that CAN have content but DON’T have any.

It may be that some will not understand the difference very well, but still, both categories have the same description: “empty elements”, right?

system · November 17, 2010, 5:29am

Where in the specs have you found that it says so, “the most important text on the page”? You are using a probable misuse as an argument for a wrong point. When I read the specs I understand that h1 has the biggest importance when it comes to h1-h6.

That alone means h1 has to stand out. When it comes to h1-h6. It implies a default style. Not to be mistaken with the default style sheet rules most UAs agreed upon: 16px, blue color links etcetera.

h1 is the most important structurally… when it comes to h1-h6.

Tags have been invented to provide mark up for TEXT. Not to structure a TEXT, to DESCRIBE a TEXT. Starting from the way written text looked before any digital content ever existed. That means that tags main use is to offer the basic functionality for PRESENTING TEXT (yeah, that’s right). CSS later came on with ADVANCED presentation tasks. But the text has to exist w/o CSS. CSS is only for the priggish

But some developers think it never started from there, because… I don’t know. Because there are some tags that go beyond that initial purpose? Because CSS came along and made mistakes in presentation possible by those having insufficient knowledge? So they start to mix things up: it’s ONLY semantics now. Really?

When you read the html specs, you find everywhere descriptions for a default style for those tags. Starting from the HTML 4.01 DTD (http://www.w3.org/TR/html401/sgml/dtd.html):

<!--================== HTML content models ===============================-->

<!--
    HTML has two basic content models:

        %inline;     character level elements and text strings
        %block;      block-like elements e.g. paragraphs and lists
-->

I’m sure block content model has to be different in style than inline content model. Starting from pure HTML. Never involving a bit of CSS. And so it’s true for the rest of the specs. The html specs convey a default style. Again, not to be mistaken for the trivial default style sheet rules UAs seem to have agreed upon: 16px, blue colour for links etcetera.

Specs talk about a basic way of PRESENTING TEXT to a USER, not about how to CODE TEXT for another DEVELOPER to read its semantics.

xhtmlcoder · November 17, 2010, 10:27am

I think Stevie D got mixed up with well-formed markup, as obviously you can have an invalid XHTML document that still renders correctly so long as it follows the well-formedness constraints.

If it violates well-formedness and the UA uses an XML parser then yes, it should halt on the error (draconian error handling) and report back, as such a violation classifies as a ‘fatal error’.
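
To make the distinction concrete, here is a minimal Python sketch that assumes a non-validating XML parser (`xml.etree.ElementTree`) in place of the user agent. Both sample fragments and the function name `first_fatal_error` are invented for this example. A `div` inside a `p` breaks the XHTML 1.0 content model but is still well-formed, so the parser accepts it. Crossed tags are a well-formedness error and stop the parse.

```python
import xml.etree.ElementTree as ET

# Invalid against the XHTML 1.0 DTDs (div is not allowed inside p),
# yet well-formed XML.
INVALID_BUT_WELL_FORMED = "<body><p><div>nested block</div></p></body>"

# Not well-formed: the b and p end tags are crossed.
NOT_WELL_FORMED = "<body><p><b>crossed</p></b></body>"


def first_fatal_error(markup: str):
    """Return None if well-formed, else the (line, column) of the fatal error."""
    try:
        ET.fromstring(markup)
        return None
    except ET.ParseError as error:
        # The parser stops at the first error and reports where it happened.
        return error.position


if __name__ == "__main__":
    print(first_fatal_error(INVALID_BUT_WELL_FORMED))
    print(first_fatal_error(NOT_WELL_FORMED))
```

Catching the first case requires validating against the DTD, which is a separate step from parsing.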

I think people often get confused between the grammar and the user-agent. Being a reformulation it is written as an application of XML but obviously it was written to take into account HTML user-agents and the HTML 4.01 DTD. Hence XHTML 1.0 Transitional.