
> What? Who knows. Nobody knows because you cannot, by definition, create _wrong_ HTML 5.

HTML5 defines pretty strict conformance requirements for authors. That's a separate thing from defining error recovery mechanisms for UAs.

You can easily learn what is wrong with your code using the W3C Validator.

http://validator.w3.org/check?uri=http%3A%2F%2Fpornel.net%2F...

which is a big improvement over the old DTD-based one, which couldn't verify the contents of attributes or structures more than one level deep:

http://validator.w3.org/check?uri=http%3A%2F%2Fpornel.net%2F...



> HTML5 defines pretty strict conformance requirements for authors.

You are using "wrong" in the sense of _not valid_; I was using it in the sense of _not working_.

Invalid HTML 5 _works_, and so does invalid HTML. At no point will your browser stop and say "Come on, that is not HTML, that is garbage". If there is no such point, then there is no _wrong_ HTML 5.

Take this code. It is valid HTML 5 (may the XML gods forgive me):

    <!DOCTYPE html>
    <title>My feelings</title>
    I love HTML
    </html>

It will be shown without any problem by a browser. The title will be "My feelings" and the body will be "I love HTML".

The following is invalid HTML5:

    <title>My feelings</title>
    I love HTML

Yet, it will be shown "correctly" by browsers without any problem, just like the previous one.
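
To make the comparison concrete, here is a minimal Python sketch that runs both snippets through the standard library's `html.parser`. That module is only a stand-in for a forgiving parser (it is not a browser's HTML5 tree builder), and the `DoctypeProbe` class and `probe` helper are hypothetical names made up for this illustration. Under those assumptions, both inputs yield the same title and text, and the only observable difference is whether a doctype was declared.

```python
from html.parser import HTMLParser


class DoctypeProbe(HTMLParser):
    """Hypothetical helper: records title, body text, and doctype presence."""

    def __init__(self):
        super().__init__()
        self.has_doctype = False
        self.in_title = False
        self.title = ""
        self.text = []

    def handle_decl(self, decl):
        # Called for <!DOCTYPE ...>; decl is the text between "<!" and ">".
        if decl.lower().startswith("doctype"):
            self.has_doctype = True

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data
        elif data.strip():
            # Ignore whitespace-only runs between tags.
            self.text.append(data.strip())


def probe(source):
    """Return (has_doctype, title, text) for an HTML string."""
    parser = DoctypeProbe()
    parser.feed(source)
    parser.close()
    return parser.has_doctype, parser.title, " ".join(parser.text)


with_doctype = "<!DOCTYPE html>\n<title>My feelings</title>\nI love HTML\n</html>"
without_doctype = "<title>My feelings</title>\nI love HTML\n"

print(probe(with_doctype))
print(probe(without_doctype))
```

Neither call raises or warns, which is the leniency being described here. A validator is the only place where the missing doctype gets reported.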

Once such a lax error recovery mechanism is in place _without additional warning in the UI_, how is one able to define what is wrong and what is correct?


> how is one able to define what is wrong and what is correct?

There are many arbitrary lines there. There were huge bikeshedding debates in the HTML WG about just how much must be quoted, escaped, declared and closed.

Generally the correct/valid subset is chosen to be free from gotchas as much as possible (only things that behave as expected are allowed).

It's a compromise between best practices and not so pretty, but very common code out there.

It's counter-productive to declare 99% of working pages "invalid". With fewer nitpicking errors, validators can have a better signal-to-noise ratio and flag errors that are more likely to cause trouble, and authors are more likely to take those seriously rather than assume the validator is impossible to please.

e.g. misnested tags are disallowed, because it's hard to understand how they are interpreted.

DOCTYPE is required, because it disables emulation of IE5 bugs (Quirks Mode).

OTOH unquoted attributes and some unescaped ampersands are allowed, because most often they're parsed unambiguously in a way that authors expect.
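
A rough Python sketch of that distinction, using only the standard library. `html.parser` and `html.unescape` stand in for a browser here, which is an assumption: `html.unescape` applies the HTML5 character-reference rules for text content, and browsers treat ampersands inside attribute values a little more conservatively. `AttrCollector` and `attributes` are hypothetical names for this example.

```python
from html import unescape
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: gathers (name, value) pairs from every start tag."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        self.attrs.extend(attrs)


def attributes(source):
    parser = AttrCollector()
    parser.feed(source)
    parser.close()
    return parser.attrs


# Unquoted and quoted attribute values come out identical.
unquoted = attributes("<a href=about.html class=nav>About</a>")
quoted = attributes('<a href="about.html" class="nav">About</a>')
print(unquoted == quoted)

# Unescaped ampersands in text: these two survive untouched...
plain_amp = unescape("AT&T")
query_amp = unescape("?a=1&b=2")
print(plain_amp, query_amp)

# ...but "&copy" is a legacy named reference that works without a
# semicolon, so this one is silently rewritten.
gotcha_amp = unescape("?a=1&copy=2")
print(gotcha_amp)
```

The last case is the kind of gotcha that explains why only _some_ unescaped ampersands are allowed.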


    <!DOCTYPE html>
    <title>My feelings</title>
    I love HTML

This is valid HTML5.


You are missing the point: I know that you could add `<!DOCTYPE html>` to make that document valid and you know as well. But whoever writes the second snippet does not know because we are not there to point it out. And if you point it out they will look at you puzzled: "You are saying that it is not valid, but it renders, and in exactly the same way! Why are you making all this fuss about this "validity" thing?"



