I am reading this book for the second time and it is just as interesting as the first time: http://www.abookapart.com/products/html5-for-web-designers
What I am wondering about is this:
Browser makers are the ones who decide which markup to support, so for HTML5 it's about supporting anything and everything to make it easiest on everyone. Rather than discontinuing old coding practices and trying to force a change to something "new and better" (which is what XHTML 2 tried to do but failed), the browser makers will do their best to support everything.
I guess this is where my knowledge of whose role is what gets a bit foggy when it comes to HTML advancements, the W3C, the WHATWG, and the browser makers themselves. I understand that XHTML 1.0 is a "coding practice", or a style of writing code that is a good one. Close all tags, write in lower case, quote attribute values - it seems like a good way to go.
How will IE6 render this code? Will it have any problems with it? If it doesn't, and this may sound like a silly question, how come we haven't been using this syntax from the very beginning, when IE6 first came out?
Also, when it comes to doctypes... does the simple HTML5 doctype "<!DOCTYPE html>" work in all browsers too? Again, if it works in all browsers (old and new), then how come we have been using the long, messy "XHTML / W3C" doctypes all this time before HTML5?
I am also getting the hunch that HTML5 code syntax is nothing groundbreakingly "new"; in fact, haven't we already been using it all this time? It's now up to us which coding practices we wish to use and follow through with. So why do we put a new version number on "HTML5"? Is it to give it the appearance that it's something new?
Also, if we adopt these leaner, more simplified HTML5 coding practices that I mentioned above, will it break anything when it comes to the older browsers, such as IE6? And if the old browsers already fully support this coding practice, then how come we haven't been using the leaner, more simplified code to begin with, when IE6 first came out?
Hope my questions make sense - I'm just trying to wrap my head around how and why the code syntax for the HTML version advancements that the W3C pushed forth was so confusing in the past, how the W3C / WHATWG is rectifying this problem for the future, and how these changes affect the rendering engines when it comes to the older browsers.
The W3C is basically made up of the browser makers so those two groups are much the same.
Because versions 2 through 4 of HTML were defined according to a standard called SGML, and the doctype is an SGML tag identifying which standard the document that follows it uses. The browsers implemented HTML without making use of that standard and so didn't need to reference the doctype at all. That changed because Internet Explorer had implemented an early version of CSS 2 before the standard was finished, so IE6 needed a way to distinguish pages written for the version of CSS 2 implemented in IE5 from those written for the standard version. They chose to use the presence of a doctype for this. The HTML 5 doctype is the short version of the doctype that is valid for HTML 2, but without the part that says what particular SGML standard the HTML is using, as HTML 5 has abandoned using the standards for how to define markup languages.
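To make the doctype point a bit more concrete, here is a rough sketch in Python (my choice of language for illustration, nothing more) that uses the standard library's html.parser to pull the doctype declaration out of two tiny documents. The names DoctypeGrabber and doctype_of and the sample documents are made up for this example, and Python's parser is not a browser: this only shows what the two declarations contain, not how any browser picks its rendering mode.

```python
from html.parser import HTMLParser


class DoctypeGrabber(HTMLParser):
    """Records the text of the first <!DOCTYPE ...> declaration seen."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # decl is everything between "<!" and ">"
        if self.doctype is None:
            self.doctype = decl


def doctype_of(markup):
    """Return the doctype declaration text, or None if there is none."""
    parser = DoctypeGrabber()
    parser.feed(markup)
    parser.close()
    return parser.doctype


# Hypothetical sample pages: one with the long XHTML 1.0 Strict doctype,
# one with the short HTML5 doctype.
XHTML_STRICT = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    "<html><head><title>t</title></head><body></body></html>"
)
HTML5 = "<!DOCTYPE html><html><head><title>t</title></head><body></body></html>"

long_form = doctype_of(XHTML_STRICT)
short_form = doctype_of(HTML5)

print(short_form)
print(long_form)
```

Run on these two samples, the long declaration is just the short one followed by the PUBLIC identifier and DTD address, which is the part HTML 5 drops.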
HTML 5 is a far more bloated version of HTML compared to the much leaner HTML 4. While it has added a few additional useful tags there are also many tags being added specifically for older browsers such as Netscape 4 even though those browsers are already or will be long dead by the time HTML 5 is finished. Presumably those tags will be removed before HTML 5 becomes a standard.
Felgall answered most everything and quite well. Some notes:
Because everyone was hoping everyone would get in on the standards compliance bandwagon. When the WHATWG were forming they spent some time looking at what browsers in existence were actually bothering to react to (as well as what their error-rendering model was), and it turned out there were lots of things browsers didn't give two whits about. Like MIME types on certain tags.
Browsers dealing with errors the same way, to the point that it could be made a new standard... for example, the <meta charset='utf-8'> bit. So many authors got the quotes wrong, and nobody (except maybe Opera sometimes) wanted to be the browser who puked when users went to some page written by some dork who never bothered to validate... which would have been most pages, and still is.
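As a small illustration of that kind of tolerance, here is a sketch in Python using the standard library's html.parser. It is only a stand-in for a forgiving parser, not a model of any real browser, and the names MetaCollector and meta_attrs plus the four sample tags are invented for this example. It feeds the same meta tag written four different ways and collects what the parser reports.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects the attributes of every <meta> tag as a dict."""

    def __init__(self):
        super().__init__()
        self.metas = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased from the parser.
        if tag == "meta":
            self.metas.append(dict(attrs))


def meta_attrs(markup):
    """Return a list with one attribute dict per <meta> tag in markup."""
    parser = MetaCollector()
    parser.feed(markup)
    parser.close()
    return parser.metas


variants = [
    '<meta charset="utf-8">',    # double quotes
    "<meta charset='utf-8'>",    # single quotes
    "<meta charset=utf-8>",      # no quotes at all
    '<META CHARSET="utf-8" />',  # upper case with an XHTML-style slash
]
results = [meta_attrs(v) for v in variants]

for variant, result in zip(variants, results):
    print(variant, "->", result)
```

All four spellings come out as the same single charset attribute, which is the sort of "muddle through and get the same answer" behaviour being described.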
Browser vendors want to "win" browser wars by following Postel's law: liberal in what they accept (including egregious errors, so long as the browser can muddle through it and figure something out that looks ok), and conservative in what they send out. This means lots of crap and errors don't raise any flags and the world continues turning and multicultural children hold hands and sing Kumbaya and people writing HTML (or munging HTML in something gross like a WYSIWYG) never realise there's anything wrong.
Apparently, lots of people have been, and browsers had to learn to deal with it. Throw poop at HTML and guess what? It sticks.
<Curly Howard whoop here/>