
> how is one able to define what is wrong and what is correct?

There are many arbitrary lines there. There were huge bikeshedding debates in the HTML WG over just how much must be quoted, escaped, declared and closed.

Generally the correct/valid subset is chosen to be free from gotchas as much as possible (only things that behave as expected are allowed).

It's a compromise between best practices and the not-so-pretty but very common code out there.

It's counter-productive to declare 99% of working pages "invalid". With fewer nitpicky errors, validators get a better signal-to-noise ratio and can flag the errors that are more likely to cause trouble, and authors are more likely to take those seriously rather than assume the validator is impossible to please.

e.g. misnested tags are disallowed, because it's hard to understand how they are interpreted.

DOCTYPE is required, because it disables emulation of IE5 bugs (Quirks Mode).

OTOH unquoted attributes and some unescaped ampersands are allowed, because most often they're parsed unambiguously in a way that authors expect.
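
A rough way to see the "parsed the way authors expect" part for yourself. This is only a sketch using Python's stdlib html.parser, which is a lenient tokenizer and not the browser parsing algorithm, so treat it as an illustration rather than proof of what the spec says. The Collector class, the parse helper and the sample markup are made up for this example.

```python
from html.parser import HTMLParser


class Collector(HTMLParser):
    """Hypothetical helper: records attributes and text as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.attrs = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag
        self.attrs.extend(attrs)

    def handle_data(self, data):
        # text may arrive in several chunks, so collect and join later
        self.text.append(data)


def parse(markup):
    collector = Collector()
    collector.feed(markup)
    collector.close()
    return collector.attrs, "".join(collector.text)


# Unquoted attribute values and a bare ampersand followed by a space
attrs, text = parse("<a href=/faq class=nav>Tom & Jerry</a>")
print(attrs)
print(text)
```

Here the two unquoted attributes come back as href=/faq and class=nav, and the lone ampersand stays a literal "&" in the text, which is what an author would expect. An ampersand immediately followed by something that looks like an entity name is a different story, which is presumably why only "some" unescaped ampersands are allowed.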


