You are reading an archived post from the first version of my blog. I've started fresh, and the new design and content is now at boxofchocolates.ca

Headings, Hierarchy and Document Structure

July 22, 2004

Eric Meyer has followed up on Tomas Jogin’s post “Hierarchy” in his own post, “Pick a Heading”. Something Eric said struck me, so I need to post as well. He provides a code sample from Netscape DevEdge that really got me thinking about where this might go in the future:

<h1>
 <a href="/" target="_top"><span>Netscape</span> DevEdge</a>
</h1>
<form action="/search/app/" id="srch" method="get">
<h4><label for="search-input">Search</label></h4>
(…inputs…)
</form>
<h2>Netscape 7.1 is now available</h2>
(…paragraph text…)
<h2>The DevEdge RSS-News Ticker Toolbar</h2>
(…paragraph text…)
<h3>Recent News</h3>
(…list of links…)

Just some quick analysis here, if you haven’t read Eric’s post. Eric himself states that the heading order is “off” (h1-h4-h2-h2-h3), but I think most would agree that the structure used makes sense. There are very legitimate reasons for this markup. First, the label for the form is in an h4 element, so it shows up in the document outline (as it should) and is easily navigable for screen reader users and others who navigate via headings. Second, marking it up as an h4 implies that it is less important than the two main sections of the page in question. Finally, it is source ordered to be near the top, which is useful for all users regardless of the device they are using, be it a cell phone, a PDA, or a desktop user agent with or without assistive technology.

What about the future?

Eric also points to two elements available to us in XHTML 2: the <section> element and the <h> element. In essence, the W3C is telling us that heading hierarchy is to be determined by context, that is, by where the <h> element sits in the document hierarchy. In theory this seems like quite a rational approach, and I’m sure many developers are looking for that level of semantic connection between a heading and all of the contents found within that “section”. Many of us are already doing that anyway; we’re just using divs with appropriate names.

Update: Anne van Kesteren points out that the following examples in this post are mistaken because forms aren’t part of XHTML 2. Fair enough – I slipped. But read on anyway. The point I am trying to make really has nothing to do with the fact that it is a form in that particular section – it could be any content. Content that you want source ordered closer to the top of the document, but that has less “importance” than the other main content.

So what about the code from DevEdge in XHTML 2? What would it look like? What happens when the source order conflicts with the hierarchical structure of the document? Let’s try it:

<section>
  <h><a href="/" target="_top"><span>Netscape</span> DevEdge</a></h>
  <section>
    <section>
      <section>
        <form action="/search/app/" id="srch" method="get">
        <h><label for="search-input">Search</label></h>
        (…inputs…)
        </form>
      </section>
    </section>
    <h>Netscape 7.1 is now available</h>
    (…paragraph text…)
    <h>The DevEdge RSS-News Ticker Toolbar</h>
    (…paragraph text…)
    <section>
      <h>Recent News</h>
      (…list of links…)
    </section>
  </section>
</section>

Code bloat. Meaningless <section> elements added just to impose the correct hierarchical structure? Adding meaningless wrapper divs upon wrapper divs is frowned upon in our current practice. Why would adding nested sections within nested sections to provide the appropriate level of semantic structure be any different?

It may make sense for the content of an article or a resource or blog post or whatever to have a linear structure, but we can’t necessarily force linear structure on the rest of the site wrapped around that content.

Give us some options

Why not create a new attribute for the <section> element then? Call it level, call it whatever you want… This would allow for documents to have both implicitly and explicitly defined structures. If the “level” is not defined, the structure is determined completely by nesting. If levels are defined, then we mix the two — we remove the explicitly defined levels from the linear flow and place them accordingly. You know, kind of like absolutely positioned elements are removed from the regular document flow, we could absolutely position sections within the document hierarchy for outlining purposes.

<section>
  <h><a href="/" target="_top"><span>Netscape</span> DevEdge</a></h>
  <section level="4">
    <form action="/search/app/" id="srch" method="get">
    <h><label for="search-input">Search</label></h>
    (…inputs…)
    </form>
  </section>
  <section>
    <h>Netscape 7.1 is now available</h>
    (…paragraph text…)
    <h>The DevEdge RSS-News Ticker Toolbar</h>
    (…paragraph text…)
    <section>
      <h>Recent News</h>
      (…list of links…)
    </section>
  </section>
</section>

Document outline creators, other user agents, whatever, can then respect source order and we won’t all get stuck worrying about how exactly our documents look in a Document Outline Extension (albeit a very cool tool!), because we’ve defined precisely the hierarchical structure that we intend.
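To make the idea a little more concrete, here is a rough Python sketch of how an outlining tool might resolve heading levels if sections could carry an explicit level. Everything in it is an assumption for illustration: the “level” attribute is only my proposal and is not part of XHTML 2, the function name outline is made up, the markup is a trimmed-down version of the example (no form, no links), and I have assumed that sections nested inside an explicitly levelled section continue counting from that level.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the proposed markup. The level attribute is
# hypothetical: it is the proposal from this post, not part of XHTML 2.
DOC = """
<section>
  <h>Netscape DevEdge</h>
  <section level="4">
    <h>Search</h>
  </section>
  <section>
    <h>Netscape 7.1 is now available</h>
    <h>The DevEdge RSS-News Ticker Toolbar</h>
    <section>
      <h>Recent News</h>
    </section>
  </section>
</section>
"""


def outline(section, depth=1):
    """Return (level, heading text) pairs in source order."""
    # An explicit level="N" overrides the level implied by nesting depth.
    level = int(section.get("level", depth))
    entries = []
    for child in section:
        if child.tag == "h":
            text = "".join(child.itertext()).strip()
            entries.append((level, text))
        elif child.tag == "section":
            # Assumption: nested sections count on from the resolved level.
            entries.extend(outline(child, level + 1))
    return entries


if __name__ == "__main__":
    for level, text in outline(ET.fromstring(DOC)):
        print("  " * (level - 1) + f"{level}. {text}")
```

The headings stay in source order, with Search still near the top, while the resolved levels follow the same 1-4-2-2-3 pattern as the original DevEdge headings.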

Admittedly, I’m examining this through the lens of current techniques and practices and what “works” in today’s user agents, but clearly we need to deal with this sooner rather than later… Either that, or we don’t ever move to XHTML 2, because at the end of the day, I’ll have to side with source order as being more important.

Update 2: Reading through the XHTML 2.0 spec tells us that there are two types of headings: numbered headings (<h1> through <h6>) and structural headings (<h> inside <section>). It seems to me that the mechanism I proposed above, with a “level” attribute, wouldn’t be needed. Instead, document structure could be determined by nested sections and headings, and overridden whenever an actual numbered heading is found within a section. Clearly more thought is required… watch for another post!
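As a rough illustration of that override rule, here is a small Python sketch. It is only my reading of how the two heading styles might be combined, not something the spec defines: the function name outline_mixed and the trimmed-down markup are made up for the example. A structural <h> takes its level from how deeply its section is nested, while a numbered heading simply reports its own number.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed-down markup mixing both heading styles: Search uses a numbered
# heading, everything else uses the structural <h> inside <section>.
MIXED_DOC = """
<section>
  <h>Netscape DevEdge</h>
  <section>
    <h4>Search</h4>
  </section>
  <section>
    <h>Netscape 7.1 is now available</h>
    <h>The DevEdge RSS-News Ticker Toolbar</h>
    <section>
      <h>Recent News</h>
    </section>
  </section>
</section>
"""


def outline_mixed(section, depth=1):
    """Return (level, heading text) pairs in source order."""
    entries = []
    for child in section:
        text = "".join(child.itertext()).strip()
        numbered = re.fullmatch(r"h([1-6])", child.tag)
        if numbered:
            # A numbered heading overrides the level implied by nesting.
            entries.append((int(numbered.group(1)), text))
        elif child.tag == "h":
            # A structural heading takes its level from section depth.
            entries.append((depth, text))
        elif child.tag == "section":
            entries.extend(outline_mixed(child, depth + 1))
    return entries


if __name__ == "__main__":
    for level, text in outline_mixed(ET.fromstring(MIXED_DOC)):
        print("  " * (level - 1) + f"{level}. {text}")
```

Under these assumptions, Search keeps its place near the top of the source but lands at level 4 in the outline, with no extra wrapper sections and no new attribute.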

3 Responses

Comment by Karl Dubost — Jul 23 2004 @ 10:14 am

I don’t know if you have sent your comment to the appropriate mailing list, but if not, asking here will not achieve anything.

Formal issues and error reports on this specification shall be submitted to www-html-editor@w3.org (archive). It is inappropriate to send discussion email to this address. Public discussion may take place on www-html@w3.org (archive). To subscribe send an email to www-html-request@w3.org with the word subscribe in the subject line.

You can send a copy to www-html@w3.org as well, but the latter will not be filed as an issue.

Hope it helps.

Comment by Derek Featherstone — Jul 23 2004 @ 12:01 pm

Hi Karl — thanks for your comment… I haven’t sent anything on to the appropriate W3C list – I may send something, but as the XHTML 2.0 Working Draft has just been updated, I’m going to wade through it first.

Then I’ll update this post and look at sending something along. Admittedly, it is tough keeping up with all the lists, and blogs, and forums. Distributed knowledge and discussions are getting out of control…

Cheers,
Derek.

Comment by Jules — Aug 20 2004 @ 10:36 am

Hey Derek:

I must admit that I believe that all content belongs in containers – I say this knowing full well that header, footer, and navbar content can sometimes be difficult to assign to a container other than a purposeless <div>. However, all other content, in my opinion, should be contained within (below) headings. I view a document from the perspective of an outline – everything must fall within a container and containers must be hierarchical. Part of this comes from my writing experience (actually, I should tell the truth, it is because of my editor that this opinion has been formed). I also work with editors and they pound the same message into the authors.

I have wrestled with h1 in the past and now believe that it is the title of the current page (not necessarily the same as the title tag) and if the h1 appears part way down the content, so be it (although, with some layouts, source ordering can place the header near the bottom but layout can display it at the top which means that h1 can appear at the top of the source but placed further down in the layout). I don’t create a h3 if it is not part of a h2. I teach my book readers that if h2 is the appropriate heading for a block but you don’t like the size, then style it smaller, don’t play with the structure.

Although I know very little about XHTML 2, I am not comfortable with the h element. If you have an h element, then a p element, then another h element and another p element, is the second h at the same level as the first or nested below the first? If you are using nested section elements, is that not the same as a hierarchy? Therefore, if you create a hierarchy using the section elements, why would you be opposed to the hierarchy of h1, h2, h3, … ? (By “you”, I am referring to the general case, not specifically you, Derek.)

My 2 cents.

Jules