Understanding HTML5 Content Models

Earlier this week we looked at the new text-level and structural semantic elements html5 provides. Today I want to continue and talk about content models in html5, specifically the new outline algorithm for creating hierarchy.

Once again much of the content below comes to me via Jeremy Keith’s book HTML5 for Web Designers, which I highly recommend.

Unfortunately some of what we’ll look at below isn’t yet supported by browsers. Some of it will be, but not all. Still I think what’s here is important to understand with an eye toward the future.

Venn diagram of html5 content models

Content Models

Before html5 we had two categories of elements, inline and block. With html5 we now have a more fine-grained set of categories with their own content models.

  • Text-level semantics — what were previously inline tags
  • Grouping content — block level elements like paragraphs, lists, and divs
  • Forms — everything inside form tags
  • Embedded content — images, video, audio, and canvas
  • Sectioning content — the new structural tags described in my previous post

Currently to create a hierarchical outline of our content we use a set of h1–h6 tags. They work for the most part, but can break down at times. Consider the following:

<h1>Web Design</h1>
    <p>Some general info about web design</p>
    <h2>Layout</h2>
    <p>Info about layouts</p>
        <h3>Grids</h3>
        <p>Info about grids</p> 
    <h2>Typography</h2>
    <p>Info about typography</p>
    <h2>Color</h2>
    <p>Info about color</p>
    <h2>Design Principles</h2>
    <ul>
        <li>List of</li>
        <li>several different</li>
        <li>design principles</li>
    </ul>
<p>Where in the outline does this paragraph belong?</p>

The above would produce the following outline based on the headings.

  • web design
    • layout
      • grids
    • typography
    • color
    • design principles

In general each paragraph below a heading belongs under that heading in the outline, but does it have to?

Where in the outline does the very last paragraph belong? Is it under the Design Principles or does it belong under Web Design?

You can tell my intention based on the indentation, but a machine isn’t going to see that with the whitespace stripped, and there’s no reason the code needed to be indented the way it is above.

Visually that last paragraph will look just like the one above it as well. A reader wouldn’t really know which section it belongs to.
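
To make the ambiguity concrete, here is a minimal JavaScript sketch of how an outline can be built from heading levels alone. The function name outlineFromHeadings and the flat { level, text } input are assumptions made for illustration, not part of any browser API.

```javascript
// Hypothetical helper: builds a nested outline from a flat list of headings,
// using only the heading level (1-6) to decide the nesting.
function outlineFromHeadings(headings) {
  const root = { level: 0, text: null, children: [] };
  const stack = [root]; // currently open headings, outermost first

  for (const { level, text } of headings) {
    // Close every open heading of the same or a deeper level.
    while (stack[stack.length - 1].level >= level) {
      stack.pop();
    }
    const node = { level, text, children: [] };
    stack[stack.length - 1].children.push(node);
    stack.push(node);
  }
  return root.children;
}

const outline = outlineFromHeadings([
  { level: 1, text: 'Web Design' },
  { level: 2, text: 'Layout' },
  { level: 3, text: 'Grids' },
  { level: 2, text: 'Typography' },
  { level: 2, text: 'Color' },
  { level: 2, text: 'Design Principles' },
]);
```

Nothing in this input says where a section ends, so any content after the last heading can only be attached to the most recent heading.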

HTML5 helps solve the problem above.

Sectioning Content Model

The first tool html5 provides is the section tag we discussed last time. Using the section element we can rewrite the above as

<h1>Web Design</h1>
    <p>Some general info about web design</p>
    <section>
        <h2>Layout</h2>
        <p>Info about layouts</p>
            <h3>Grids</h3>
            <p>Info about grids</p> 
        <h2>Typography</h2>
        <p>Info about typography</p>
        <h2>Color</h2>
        <p>Info about color</p>
        <h2>Design Principles</h2>
        <ul>
            <li>List of</li>
            <li>several different</li>
            <li>design principles</li>
        </ul>
    </section>
<p>Where in the outline does this paragraph belong?</p>

Once again the outline produced is the same as we saw above, but now it’s much clearer where the last paragraph belongs. It sits outside the section, so it belongs to Web Design rather than Design Principles. We can do better though. Let’s mix in the header element and better define the different sections of the document.

<h1>Web Design</h1>
    <p>Some general info about web design</p>
    <section>
        <header>
            <h2>Layout</h2>
        </header>
        <p>Info about layouts</p>
        <section>
            <header>
                <h3>Grids</h3>
            </header>
            <p>Info about grids</p>
        </section>
    </section>
    <section>
        <header>
            <h2>Typography</h2>
        </header>
        <p>Info about typography</p>
    </section>
    <section>
        <header>
            <h2>Color</h2>
        </header>
        <p>Info about color</p>
    </section>
    <section>
        <header>
            <h2>Design Principles</h2>
        </header>
        <ul>
            <li>List of</li>
            <li>several different</li>
            <li>design principles</li>
        </ul>
    </section>
<p>Where in the outline does this paragraph belong?</p>

Once again the above html produces the same outline. So far not much is really new other than the addition of some new tags. We could have done the same thing by using divs instead of section and header.

So where’s the new stuff?

HTML5 Outline Algorithm

In html5 each sectioning element has its own self-contained outline. What that means is we can start each section with an h1 tag and the algorithm will figure out the overall outline.

<h1>Web Design</h1>
    <p>Some general info about web design</p>
    <section>
        <header>
            <h1>Layout</h1>
        </header>
        <p>Info about layouts</p>
        <section>
            <header>
                <h1>Grids</h1>
            </header>
            <p>Info about grids</p>
        </section>
    </section>
    <section>
        <header>
            <h1>Typography</h1>
        </header>
        <p>Info about typography</p>
    </section>
    <section>
        <header>
            <h1>Color</h1>
        </header>
        <p>Info about color</p>
    </section>
    <section>
        <header>
            <h1>Design Principles</h1>
        </header>
        <ul>
            <li>List of</li>
            <li>several different</li>
            <li>design principles</li>
        </ul>
    </section>
<p>Where in the outline does this paragraph belong?</p>

Believe it or not, the above html, where every heading is an h1, still produces the same outline in html5.

  • web design
    • layout
      • grids
    • typography
    • color
    • design principles

Under html 4 the outline would be

  • web design
  • layout
  • grids
  • typography
  • color
  • design principles

Quite a difference. It might seem somewhat strange to have every heading be an h1 tag, but it does have advantages. You won’t have to keep track of your overall hierarchy, only the hierarchy within a section.

Maybe not such a big deal with a single document, but it does allow our content to be more modular and portable, which we’ll get to momentarily.
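
The core idea can be sketched in a few lines of JavaScript. This is a deliberately simplified model, not the full algorithm from the spec: it assumes every section starts with its own heading and takes the depth of a heading purely from how many sectioning elements surround it. The node shape { tag, text, children } and the name sectionOutline are made up for the example.

```javascript
// Simplified model of the idea, not the full algorithm from the spec:
// the depth of a heading is the number of sectioning elements around it.
const SECTIONING = new Set(['section', 'article', 'aside', 'nav']);

// Hypothetical node shape: { tag, text, children }.
function sectionOutline(node, depth = 0, result = []) {
  if (/^h[1-6]$/.test(node.tag)) {
    // The heading's own number (h1 or h3) is ignored on purpose.
    result.push({ depth, text: node.text });
  }
  const childDepth = SECTIONING.has(node.tag) ? depth + 1 : depth;
  for (const child of node.children || []) {
    sectionOutline(child, childDepth, result);
  }
  return result;
}

const page = {
  tag: 'body',
  children: [
    { tag: 'h1', text: 'Web Design' },
    { tag: 'section', children: [
      { tag: 'header', children: [{ tag: 'h1', text: 'Layout' }] },
      { tag: 'section', children: [{ tag: 'h1', text: 'Grids' }] },
    ] },
    { tag: 'section', children: [{ tag: 'h1', text: 'Typography' }] },
    { tag: 'section', children: [{ tag: 'h1', text: 'Color' }] },
    { tag: 'section', children: [{ tag: 'h1', text: 'Design Principles' }] },
  ],
};

// Indent each heading by its depth to print the outline.
const lines = sectionOutline(page).map(
  (entry) => '  '.repeat(entry.depth) + entry.text
);
console.log(lines.join('\n'));
```

Even though every heading in the input is an h1, Grids ends up two levels below Web Design because it sits inside two nested sections. The header wrapper makes no difference, since header is not a sectioning element.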

Other Sectioning Elements

Above I mentioned that the sectioning content model includes all the structural tags we talked about last time. It’s not only the section tag that creates its own self-contained outline.

Tags like aside, article, and nav also do the same.

It wouldn’t have been appropriate here, but had I used article tags instead of section tags in the above code, the same outline would have been produced.

The hgroup Element

Note: The W3C removed the hgroup element from the html5 spec in the spring of 2013 citing little real world use. It’s no longer recommended for use.

Sometimes you may want to use headings so you can better show and style visual hierarchy, but you don’t want the heading to be part of the document outline.

hgroup allows us to do just that. For example say you have the following markup:

<hgroup>
    <h1>Main heading</h1>
    <h2>Tagline</h2>
</hgroup>

Only the h1 above would be included in the document outline. The h2 wouldn’t be. However many headings an hgroup contains, only the highest-ranked one is included in the outline.

The hgroup element can only contain h1–h6 tags and it’s meant to be used for subtitles, alternative titles, and tag lines.

Do we need hgroup? The above could have been coded as:

<h1>Main heading</h1>
<p class="tagline">Tagline</p>

This would produce the same outline and allow for the same visual styles, however the hgroup probably adds more semantic meaning and certainly uses a bit less code.

In addition to using hgroup to hide some headings from the document outline, there are a few elements whose contents are by default invisible to the document outline. These are called sectioning roots and include:

  • blockquote
  • fieldset
  • td

Even if you use headings inside the above elements those headings won’t be part of the document outline under html5.
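
As a rough illustration, the sketch below collects the headings that would reach the document outline and skips anything inside a sectioning root. The node shape { tag, text, children } and the function name outlineHeadings are assumptions for the example, and only the three elements listed above are treated as roots.

```javascript
// Only the three sectioning roots named in the list above; the spec defines more.
const SECTIONING_ROOTS = new Set(['blockquote', 'fieldset', 'td']);

// Hypothetical node shape: { tag, text, children }.
function outlineHeadings(node, found = []) {
  if (SECTIONING_ROOTS.has(node.tag)) {
    return found; // headings in here stay out of the document outline
  }
  if (/^h[1-6]$/.test(node.tag)) {
    found.push(node.text);
  }
  for (const child of node.children || []) {
    outlineHeadings(child, found);
  }
  return found;
}

const article = {
  tag: 'article',
  children: [
    { tag: 'h1', text: 'Post title' },
    { tag: 'blockquote', children: [{ tag: 'h2', text: 'Quoted heading' }] },
    { tag: 'h2', text: 'Conclusion' },
  ],
};

const visible = outlineHeadings(article);
```

Here visible holds Post title and Conclusion, while the heading quoted inside the blockquote is left out.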

Modular Content

The new outline algorithm helps us create content that is more modular. The idea of not needing to keep track of your hierarchy might not seem like such a big deal until you consider what happens when you move a piece of content around.

For example, many blogs display the title and a short excerpt of several posts on the main blog page. In the individual posts the title would be marked up with an h1. On the main blog page you might have an h1 for the page and then have each of the blog post titles as an h2.

With the new outline algorithm you can wrap each post in a sectioning element such as article, keep the same h1 heading in both places, and let the outline algorithm figure out the hierarchy.

This makes any section of content more portable as we can mix it in with other content without worrying that it might break the hierarchy of the page.
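
Here is a small JavaScript sketch of what that portability could look like in a template. The function names renderPost and renderIndex and the post data are hypothetical; the point is only that the post markup is generated once and reused unchanged on both pages.

```javascript
// Hypothetical helpers, for illustration only.
// No HTML escaping is done here; real code would need it.
function renderPost(post) {
  return [
    '<article>',
    `    <h1>${post.title}</h1>`,
    `    <p>${post.excerpt}</p>`,
    '</article>',
  ].join('\n');
}

// The index page has its own h1 and embeds each post as is,
// without rewriting the post headings to h2.
function renderIndex(pageTitle, posts) {
  return [`<h1>${pageTitle}</h1>`, ...posts.map(renderPost)].join('\n');
}

const posts = [
  { title: 'Understanding Content Models', excerpt: 'A short excerpt.' },
  { title: 'Semantic Elements', excerpt: 'Another short excerpt.' },
];

const postPage = renderPost(posts[0]);
const indexPage = renderIndex('Blog', posts);
```

Under the html5 outline algorithm the post titles on the index page fall one level below the page title because each sits inside an article. In browsers without support for the algorithm, every one of those headings is still treated as a top-level h1.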

While you’ll probably never need to, you can now also structure a document with more than 6 levels. Ultimately we can now create an unlimited number of levels using the same h1–h6 elements in nested sections.

Scoped Styles

Being able to move content around from document to document creates a new problem, and that’s in the styles that get applied to that content.

Our modular content will inherit the styles of the parent document, which may not be what we want. html5 offers a solution with the boolean scoped attribute that can be applied to the style element as seen below.

<article>
    <style scoped>
        h1 { /* styles here */ }
    </style>
    <h1></h1>
    <p></p>
</article>

In the above code the h1 of our article will get the scoped styles regardless of where the article is displayed. This allows us to easily move not only content, but also the styles associated with that content.

Browser Support

In order to use the new semantic elements we defined those elements in our stylesheet as display: block to ensure they won’t break our layouts. We should now add the hgroup element.

section, article, header, footer, nav, aside, hgroup {
    display: block;
}

We’d of course need to create the element for IE as we did with the other elements or include the html5shiv script.

document.createElement('hgroup');
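
The single call above covers only hgroup. If you aren’t using the html5shiv script, one way to register all of the elements from the stylesheet rule in one place is sketched below. The function name registerHtml5Elements is made up for this example, and the code sticks to var and a plain for loop because it has to run in old versions of IE.

```javascript
// Element names taken from the display: block rule above.
var HTML5_ELEMENTS = ['section', 'article', 'header', 'footer', 'nav', 'aside', 'hgroup'];

// Hypothetical helper. The document is passed in as a parameter
// so the function can also be tried outside a browser.
function registerHtml5Elements(doc) {
  for (var i = 0; i < HTML5_ELEMENTS.length; i++) {
    // Creating an element once is enough for old IE to recognize the tag.
    doc.createElement(HTML5_ELEMENTS[i]);
  }
}

// Only run automatically when a real document is available.
if (typeof document !== 'undefined') {
  registerHtml5Elements(document);
}
```

For this to help, the script has to run in the head of the page, before the browser parses any of the new elements.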

Now for the bad news. Browser support for the html5 outline algorithm is currently not good.

However the good news is you don’t have to use an h1 to start each new section. You can continue to use h2 and h3 tags inside sections to produce the outline you want.

We’ll lose the portability benefits until browsers are supporting the new algorithm, but we can start preparing for when they do offer support.

For now it’s probably better to stick with using headings as you always have, though it is safe to enclose your headings in the new semantic elements.

Summary

HTML5’s sectioning content model gives us greater control over the hierarchy of our documents. The new outline algorithm provides for an unlimited number of heading levels and helps make our content more modular and portable.

At the moment there’s limited browser support for the html5 outline algorithm, but we can still prepare for it while using h1–h6 tags as we do now.

We won’t be able to take advantage of some of the benefits the new outline algorithm will give us, but we can prepare our documents for when browser support is more robust.

It will probably feel a little strange to mark up a document with multiple h1 tags and leave it to the browser to sort out the hierarchy, but hopefully you can see the advantages in such an approach.

4 comments

  1. Scoped styles get me pumped! But, you’ve only shown an inline style. Is there any way to embed an external stylesheet that is scoped instead?

    • Good question. As far as I know you can’t link to an external stylesheet using the scoped attribute. At least I haven’t been able to find anything suggesting it’s possible.

      Keep in mind though that browsers aren’t supporting this yet and it’s always possible that by the time they do linking to an external stylesheet may be possible.

      I’m thinking given what scoped styles are meant for that the styles will always be there and not linked to, but you never know.

  2. Scoped style isn’t something I can see myself getting into as I prefer to do all my styling in the css file for a clean style/content separation. You can use some simple selectors like ‘article h1’ instead of scoped style to maintain the look you want.

    A good tool to see how a page’s HTML5 Outline looks:
    http://gsnedders.html5.org/outliner/
    One note is if you use that nav element the outline will return an Untitled Section where the nav is.

    • I think it has some limited use cases. It seems mainly for when your content is going to be embedded on another site and you want the styles to go along with it. For example, say you code a badge for your site and want to keep the branding. You might not be able to count on the other site owner styling things to your liking.

      It could also make it easier to move some content within your site without having to deal with styling them for different places. Probably not the thing most of us will be excited about, but useful in some cases.

      Thanks for the link. I thought I linked to the outliner in this post or maybe I did in one of the other html5 posts. I know I meant to link to it in one of them.
