UsableType: Typography for the World Wide Web


Presentational markup in HTML 5

I’ve been contributing to the W3C HTML Working Group, which is working on the next version of HTML. I say contributing, but I’ve found it tough enough just keeping up with the sheer volume of mail the group generates, let alone making much of a contribution so far. Also, a lot of the initial discussions have been around some pretty high-level design principles, which I’m leaving to the proper experts for the moment.

One thought-provoking topic that I have been following closely, however, is the <indent> vs. <blockquote> discussion. It’s not worth going into the specifics of that conversation here; you can find it in the archives if you want to follow it. What it has raised in my mind, though, is the question of using presentational mark-up in HTML.

Most of what I’ve learnt about Web Standards like XHTML and HTML teaches that mark-up is for giving structure and meaning to content, and that CSS is for presenting that content in a web browser or other environment of human consumption. However, history tells us that HTML is not used like that. The <blockquote> example is a good one: it’s regularly used to indent content in the middle of prose, whether or not the content inside the <blockquote> element is really a quotation at all.

The question is, would it be better for the Semantic Web if, instead of less knowledgeable page authors using <blockquote> incorrectly, an element like <indent> was created which allowed authors to simply indent content with no implied semantics?

I think most people’s reaction to this is: No, use CSS to indent the content. And that was my initial reaction as well. However, having followed the discussion through, I think I can see both sides of the argument. Why pollute the web with incorrectly used HTML when we can give authors that want it a semantic-free mechanism for indenting their content?
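For anyone unsure what the two options look like in practice, here is a rough sketch. The class name `indented` and the `2em` value are placeholders of my own, not anything proposed on the list.

```html
<!-- The misuse: <blockquote> used only for its default indentation,
     even though nothing here is quoted from anywhere -->
<blockquote>
  <p>An aside that the author simply wants set in from the margin.</p>
</blockquote>

<!-- The CSS route: an ordinary paragraph, indented by a style rule.
     "indented" is a hypothetical class name chosen for this example. -->
<style>
  .indented { margin-left: 2em; }
</style>
<p class="indented">An aside that the author simply wants set in from the margin.</p>
```

Both versions render indented in a typical browser, but only the first one claims that the text is a quotation.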

Comments:

On 6 May 07

patrick h. lauke (contact, colleague) said:

i have to admit that i'm getting very concerned about the direction HTML5 is taking. i'm just waiting for blink and marquee to make an appearance any day now. the primary concern of many people on the list seems to be "authors who hand-code and don't care about semantics". to me, that's certainly the wrong audience to aim this spec at. the "everyman" will be using authoring tools, so it's up to the tools to hide the complexity (?) of CSS styling for visual intent, and to offer adequate semantic choices to authors in a logical and transparent way. will some authors still abuse presentational stuff rather than choosing the semantic equivalents? of course they will. will there be people hand-coding and abusing elements? of course. but for the spec to just officially roll over and say "hey, here you go...you're going to get it wrong anyway, so there's some more elements for you to get it wrong with" seems...disingenuous to me.

On 6 May 07

Brent Miller said:

I think that it's very important that the W3C try to enforce best practices as much as possible. That means, don't give people <indent>, because they'll use it.

No matter what you decide the spec will be, there will always be people who misuse it. There will always be bad tools that generate awful markup (MS Word, anyone?) as well. There will always be people who write bad code.

For the rest of us, though, a little bit of enforcing best practices, like semantic markup, could go quite a long way. And if the W3C puts out a spec that encourages semantic markup, it will start a larger discussion among web developers about the utility of good markup, and hopefully more people will start writing better code.

On 6 May 07

Andy Hume (me) said:

I'm inclined to agree with you Patrick, but in the interests of discussion I'll sit on the fence a bit more.

I think it is a balance: creating a spec that moves HTML in the right direction for the Web, while keeping it relevant and practical for as many of the people using it as possible.

I think some of these presentational elements will make it in, and I don't really see the harm if they are there for the 'right' reasons, and aren't polluting or diluting the spec for authors that want to do things 'properly'.

On 6 May 07

Andy Hume (me) said:

Hi Brent,

I think the W3C would always encourage semantic markup.

"There will always be people who write bad code."

I think that's exactly the point. There will always be bad code, so is it not more practical for the web if people used <indent>, which means nothing, than to pollute the web with incorrectly used <blockquote> elements?

On 6 May 07

patrick h. lauke (contact, colleague) said:

but the only 'right' reasons that i can see are authors writing bad code, legacy content, and authoring tools that are presentational. in all those cases, browsers are free to support the display of presentational stuff (either via a doctype switch, or by simply employing a very lax, backwards-compatible fallback rendering mechanism). however, that is completely separate, in my mind, to actually including those things in the spec. a document can display "properly" in a future browser, yet doesn't have to validate. letting presentational stuff pass validation simply carries forward the whole "a document may be valid, but it doesn't mean it's right" debacle...

On 7 May 07

Richard Rutter (friend, met, colleague) said:

I can see where HTML 5 is coming from with this, but it seems so wrong. Just because one can use meaningless or misleading markup doesn't mean it should be encouraged. And anyone who's been preaching the good word of the web standards philosophy over the past few years is gonna feel rightly peed off.

It would seem better to be concentrating on creating some levels of generic markup that can cope with more situations beyond the current scientific document approach of HTML 4.

On 8 May 07

Darius Jahandarie said:

To be honest, I think the new HTML branches back to the argument of 'Should we care about semantics, if it works fast and properly?'. For example, the <blockquote> element works absolutely fabulously for indenting text; however, it is obviously being used incorrectly, given the name 'block quote'. The argument is not what it was before, when improper markup obviously introduced slow-downs and things not working cross-browser.
HTML 5 COULD go in the right direction, but I fear that it would either be discarded, or completely replace XHTML. The current reason for XHTML not being fully used right now is that it is just parsed as 'tag soup', not as XML, because Internet Explorer still does not accept the application/xhtml+xml type. However, IE is already looking to support HTML 5.
XHTML is basically the step in the future. Flexibility and fast parsing. However, some idiots ( sorry Microsoft fans(?) ) are not fully supporting it. So, instead, this new HTML 5 fully backward-compatible hyped up language is coming out to save HTML 4.01 from its sure demise.
Back to reality now. HTML 5 is going to support new tags and old ones. XHTML 1.1 can already support any new tags or attributes with modules, and furthermore, most of the visual changes that are going to be supported by HTML 5 can be handled by CSS. So really, the question is: "Should we help out the idiot standing in headlights in the middle of the road who can't already make use of all the help that is available, or should we continue developing the already easy-to-use, but unsupported methods already created?"

On 11 May 07

Matt Wilcox said:

"The question is, would it be better for the Semantic Web if, instead of less knowledgeable page authors using <blockquote> incorrectly, an element like <indent> was created which allowed authors to simply indent content with no implied semantics?"

No, because <indent> is not semantic, and HTML is not about presentation. The solution to the mis-use of elements is better education, not changing the definitions.

On 11 May 07

mattur said:

Another way to look at it: if lean, mean, semantic markup offers significant benefits (and it generally does) then anyone seeking to publish professional web pages will eventually adopt it to gain those benefits - no social engineering is required in the browser or spec.

I do think accessibility applies to both readers and publishers. Everyone, regardless of technical aptitude, should be able to publish web pages, and IMHO presentational markup is easier for non-experts to understand and use.

On 11 May 07

Andy Hume (me) said:

Matt - no one is talking about changing the definitions of existing elements. We're talking about introducing presentational elements for authors that want to use them.

The argument is that it is better to allow these in the spec than for authors to continue to misuse things like <blockquote>. As I say, I don't think it's as clear cut as some people would have you believe, and I think HTML 5 will get some of these presentational elements.

On 17 May 07

Robert Love said:

"Why pollute the web with incorrectly used HTML when we can give authors that want it a semantic-free mechanism for indenting their content?"

Because authors who use semantic markup incorrectly are highly unlikely to start using non-semantic markup correctly. They'll simply use both incorrectly. For them, semantics is neither here nor there - blockquote or indent - whichever they find first on the WYSIWYG toolbar.

On 17 May 07

Lachlan Hunt said:

The indent element, or any other purely presentational elements, will not be introduced into HTML5.

Patrick, I think you are seriously overreacting. HTML5 is not being designed as a presentational language. We've tried to explain it to you, but you seem to be ignoring what we say and jumping to your own conclusions based on the presence of B and I, which are actually included for semantic reasons.

On 17 May 07

Mike said:

In some of your other comments, you defend critiques of HTML 5 by saying that it is only a draft. It's been a draft for 3 years now and the FONT element and inline CSS are still part of the spec. Who are you kidding?

On 17 May 07

Mike said:

Part of my previous message was cut off - it started with: Lachlan, please stop telling people that they are overreacting. When the FONT element and inline CSS are removed from the HTML 5 spec, then you can tell people they are overreacting.

On 17 May 07

David Hucklesby said:

Perhaps the authors of the OED should include "their" as an alternative spelling for "they're" as I see it abused this way so often?

Seriously though, I think HTML 4 is a great advance over 3.2, and would like to see the trend to semantically cleaner HTML continue. Standard error handling seems to me a good idea, too, but it may take a lot of compromise to get agreement.

On 18 May 07

brent said:

The only people that would study up enough to use the indent tag would not use it because it has no semantic value. And where would this rationale lead? I remember seeing people use the li tag without a ul just because it indents and puts in a bullet in some browsers, are they going to also make a bullet tag? There are probably dozens of other useless tags they could make.

On 18 May 07

JJ07 said:

an <indent> tag would be pointless, indenting can be done just as easily in CSS and <indent> has no actual semantic meaning. You might as well go back to using <font>, <b> and <strike> if you're thinking like that. HTML is for content and CSS is for presentation; if this wasn't the case the web would be dirty and standards would not exist.

On 19 May 07

Lachlan Hunt said:

FWIW, the font element will be removed. AFAIK, it received no support from anyone at all; not even the WYSIWYG editor vendors it was introduced for. It's just not a priority to remove it yet.

With regards to the style attribute, that's an open issue. There's a lot of pressure from at least one WYSIWYG editor vendor that I know of pushing for it to be included. I know it's bad, but we can't ignore WYSIWYG editors; and until someone demonstrates that a semantics-based editor is possible and can be successful, we're stuck with them.

On 24 May 07

Matt Wilcox said:

"Matt - ... We're talking about introducing presentational elements for authors that want to use them. The argument is that it is better to allow these in the spec than for authors to continue to misuse things like <blockquote>..."

HTML is not about presentation. If it were up to me every presentational element would be stripped out, and browsers would render HTML without a style-sheet as plain-text, and thus abuse of elements based on their default rendering behaviour would stop dead. However, that's entirely impractical, and it won't ever happen. In the meantime I'm happy to let the status quo be, as long as there's no regression.

What concerns me (and made me join the HTML Working Group) is the idea of introducing new tags which are there purely for presentational purposes, which IS a regression. No, no, no and no again. HTML is not the place for presentational elements. If something is to be displayed differently it's extremely likely to be due to a semantic difference from surrounding data, so mark it up properly and use CSS. And if it's a rare event where it really is pure presentation, wrap it in a span. That's what spans are there for.

On 8 Jun 07

Nathan Davies said:

Hi Everyone,
I'm fairly new to this whole issue but have come to love CSS. With a history in OO programming for the desktop environment, the move to web development was a bit of a leap. However, with the use of CSS the development of a site can be properly modularised: layout in the CSS, content in the HTML. This really makes sense. Any moves forward should really enhance this difference.
The main problem I can see is that there are a lot of websites out there, by amateurs and professionals alike, that are built in website authoring tools. I have recently started working for a new company and have to redesign one of their many websites. The existing site needs to be maintained at the same time. The current site was developed in Dreamweaver, has a mixture of style and content on every page, and is truly awful. This is the mess we have. Standards are a great idea, but they are not totally enforceable, and as anyone can publish a web site there will always be poor practice out there.
The challenge for us, and one I am currently relishing, is to ensure that sites we design and build are standards compliant and properly modularised.
Sorry for wading in as a newbie but I really enjoy this sort of discussion.
