In Which The Ferrett Argues For Higher Standards
Way back in 2003, I bitched about XHTML and its stricter standards. The thing I liked about HTML back then was that it was very forgiving of errors, which made it easy for people new to HTML. If you screwed up – as you often do when you’re learning – the browser would try to work around your error. If you left off a closing </B> tag, the browser would try to guess what you wanted made bold. If you screwed up a table, the browser would try to figure out what sort of table you intended to make, and so on.
XHTML doesn’t do that. If you leave off a tag, it goes, “Whoa! That’s malformed!” and refuses to proceed until every last “i” is dotted.* XHTML is a very harsh mistress.
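To make the difference concrete, here is a minimal sketch in Python, using the standard library's forgiving `html.parser` and its strict `xml.etree.ElementTree` as stand-ins for the two attitudes. This is not how any browser is implemented; the helper names `parse_forgiving` and `parse_strict` and the sample snippet are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Sample markup with the closing </b> left off (illustrative only).
SNIPPET = "<p>Some <b>bold text</p>"


class TextCollector(HTMLParser):
    """Forgiving parser: collects text and never complains about structure."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_forgiving(markup):
    """Return whatever text the lenient HTML parser can dig out."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


def parse_strict(markup):
    """Return None if the markup is well-formed XML, else the error message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print("forgiving:", parse_forgiving(SNIPPET))
    print("strict:   ", parse_strict(SNIPPET))
```

The forgiving parser shrugs and hands back the text; the strict one stops at the mismatched tag and reports an error instead of a result.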
At the time, I found that to be pretty darned prissy of XHTML, and I likened it to grammar. People understand the phrase “Bread, milk, and butter” or “Bread, milk and butter” just fine, said I. It’s better when the computer makes a best guess for you, I argued.
I was wrong. And that’s because my perspective has shifted.
In 2003, I was someone who just wanted to get a page up quickly, with a minimum of fuss. Having a brittle parser that flung up its hands at the slightest deviation from its strict routine was like working with Raymond Babbitt – driving with a very smart guy who flipped out at the slightest new thing. And it frustrated me that unless I spent ten minutes tracking down one stupid rogue tag in a sea of over a thousand HTML elements, everything ground to a halt.
But five years later, I’ve got a different perspective. Getting a page up is old hat. Now, I’m trying to work with pages through the JavaScript DOM, which allows me to find specific tags in the page (like, say, a table) and do cool things with them (like re-sorting a table full of items when someone clicks on the top row). I’m trying to get my pages to look the same in Internet Explorer and Firefox and Safari and Opera and Lord knows what else. I’m trying to get live shipping quotes from other sites that give me their price quotes via XML feeds.
When I’m writing, I want quick and easy. When I’m processing what others have written, I want it to conform to a strict set of data so it doesn’t break my parser.
I want no surprises in the response when I’m trying to dope out whether UPS can ship this Next-Day Air. And when I’m doing cool JavaScript tricks, I want the DOM as neat and organized as Dick Clark’s hair.
Now, some would argue that I’ve just flipped my laziness. After all, when I was writing, I wanted the easy solution of having the browser fix things for me when I screwed up. And now that I’m processing, I want the easy solution of not having to program in exceptions for weird things like unclosed tags when other people screw up. You might even call me a hypocrite.
Except that I’ve seen where sloppy standards lead.
The problem with the browser trying to fix the bad HTML for you is that, because that left-out </B> tag is non-standard, each browser has to guess what happens next. That’s all well and good if they all guess the same way… But if Internet Explorer handles a missing </B> one way and Firefox another, then suddenly you can’t tell how a page is going to look when someone views it.
One little quirk isn’t a dealbreaker, but when you add up a hundred of them, suddenly you have a page that looks just fine in Firefox but becomes some Matrix screen-melty blur of characters in Internet Explorer. That’s not good, and ultimately it makes the job of the writer a lot harder.
Plus, there are the security issues. If you have something that tries its best to guess when it encounters something strange, it’s possible for a clever person to insert very strange data in the hopes of exploiting some loophole. Having a whitelist of accepted data and melting down whenever it encounters some weirdo thing reduces the risk of strange data doing something harmful.
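As a rough sketch of the whitelist idea in Python: anything not explicitly allowed gets the whole input rejected, rather than guessed at. The allowed tags and attributes, the `RejectedMarkup` exception and the `check_markup` helper are all hypothetical choices for illustration. A real sanitizer needs far more than this (it would also have to vet attribute values such as `href`, for a start).

```python
from html.parser import HTMLParser

# Hypothetical whitelist: the only tags and attributes we agree to accept.
ALLOWED_TAGS = {"p", "b", "i", "a"}
ALLOWED_ATTRS = {"a": {"href"}}


class RejectedMarkup(ValueError):
    """Raised as soon as the input contains anything off the whitelist."""


class WhitelistChecker(HTMLParser):
    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names already lowercased.
        if tag not in ALLOWED_TAGS:
            raise RejectedMarkup(f"tag <{tag}> is not on the whitelist")
        allowed = ALLOWED_ATTRS.get(tag, set())
        for name, _value in attrs:
            if name not in allowed:
                raise RejectedMarkup(
                    f"attribute {name!r} is not allowed on <{tag}>"
                )


def check_markup(markup):
    """Return True if everything is whitelisted; raise RejectedMarkup if not."""
    checker = WhitelistChecker()
    checker.feed(markup)
    checker.close()
    return True
```

The design choice is the melting down: `check_markup("<p>Hi <b>there</b></p>")` passes, while a stray `<script>` tag or an `onclick` attribute raises instead of being quietly worked around.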
Hence, in the past five years, I’ve flipped and agreed that standards are a good thing. Parsers should break down like a car with a blown tire when they encounter bad syntax. It heads off more problems in the long run.
But I wasn’t entirely wrong.
See, one of the reasons HTML took off so quickly in the first place is that it was so forgiving. That forgiveness allowed everyone to try out the Web, take it for a spin, create their own home page even if it was absolutely terrible. That ease of use drove adoption.
If the first Web pages had crashed whenever someone left off a </B> tag, we wouldn’t have had the excitement of the Internet as we know it today. It would have put up a barrier between the casual user and the computer nerd… And rather than the Internet being an “us” thing where people made their own awful, awful blogs back in 1995, it still probably would have been restricted to the hands of advanced users only.
As a novice, I want a forgiving parser. As an intermediate-to-advanced user, I want Ilsa, She-Wolf of the SS at the helm. So how do I reconcile these?
Simple: Whenever you create a standard, you need to create tools to match. If HTML had been harsh but there had been easily-available, open-source tools that everyone knew about – tools that would hunt down silly problems like unclosed <B> tags and do their best to fix them for you – then the Web probably would have been a little slower to take off, but easier to handle for webmasters right now.**
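For a feel of how small such a helper could start out, here is a sketch in Python of a checker that reports tags that never get closed, along with the line they were opened on. It is an illustration built on the standard library's `html.parser`, not an existing tool; the `UnclosedTagFinder` class, the `find_unclosed` function and the short list of void tags are assumptions of this example, and it only reports problems rather than fixing them.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag (deliberately incomplete list).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class UnclosedTagFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag, line number where it opened)
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_startendtag(self, tag, attrs):
        pass  # self-closing form such as <br/> opens nothing

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # Look for the most recent matching opening tag.
        for index in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[index][0] == tag:
                # Anything opened after it was skipped over, so never closed.
                for name, line in self.open_tags[index + 1:]:
                    self.problems.append(f"line {line}: <{name}> is never closed")
                del self.open_tags[index:]
                return
        line = self.getpos()[0]
        self.problems.append(f"line {line}: </{tag}> has no matching opening tag")


def find_unclosed(markup):
    """Return a list of human-readable complaints about tag structure."""
    finder = UnclosedTagFinder()
    finder.feed(markup)
    finder.close()
    # Whatever is still on the stack at the end was never closed at all.
    for name, line in finder.open_tags:
        finder.problems.append(f"line {line}: <{name}> is never closed")
    return finder.problems
```

Fed a page where a `<b>` on line 1 is missing its partner, `find_unclosed` points at that line instead of leaving you to hunt through a thousand elements by hand.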
Parsers should be strict. But the creation tools that make the content should be like a loving uncle – looking over your shoulder, anticipating your needs, and fixing as much as they can for you. In other words, the tools for creation should be a priority along with the standards, so we don’t leave the new kids out in the cold, wondering what to do next.
That’s what I think now, anyway. I wonder where I’ll be in 2013.
* - In real life, it generally throws the browser back into quirks mode to have ol’ sloppy HTML handle it instead, at least as I understand it… But you get my drift.
** - And yes, I know, the browser wars between Netscape and IE would still have caused major issues, but let me dream there would have been fewer of them.