November 2008 in the Life of Ben (Blog)


Smart Headers & HTML5 (30th November 2008)

This comparison is for HTMLWG, to inform further changes to the header association features for data tables in HTML5. The idea arose during TPAC 2008. It is tracked as Action 85 in HTMLWG.

Quick Reference

  1. Similarities
  2. Differences
  3. Ben’s Advice

That’s All, Folks!

This comparison has influenced HTML5:

Ben's research was instrumental to the changes made here; his research probably had more of an effect on the spec than all the other discussions put together. I cannot emphasise enough how much more important objective analysis and logical argumentation is compared to opinions and assertions.

Ian Hickson

Final update was on 20th December 2008.

Feedback Finished

Feedback was welcomed until 15th December 2008. I extended this to 20th December 2008.

Spread the word to any other lists, websites or individuals you think are relevant.


Smart Headers is compared with HTML5. James Graham has built a prototype for both.

HTML4 is not included due to ambiguities which become significant when assessing exactly what happens in genuine data tables.


3 Mechanisms Applied

Data Cells without Header Cells

Table Header Cells are <th>

Regular Associations are Header Cell → Data Cells

Optimised for Regular Associations

A regular data table has each header cell directly above or to the left of all the data cells it must be associated with. Smart Headers and HTML5 are optimised for this case by making associations automatically, without requiring scope or headers+id. Specifically:

Irregular Associations are Header Cells ← Data Cell

Upon finding a data cell with the headers attribute, both algorithms will search for the corresponding header cells:

  1. Split the headers attribute value into its constituent tokens.
  2. For each token, both algorithms scan the document (via getElementById) for the first element with matching id.
  3. For each token, search for a header cell with matching id in that table.
  4. If the element is a <th> in the current <table>, associate it with the data cell.
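
The lookup above can be sketched in Python. This is a minimal illustration, not the code of either algorithm: `Cell`, its `table_id` field, `get_element_by_id` and `resolve_headers` are hypothetical names, and the document is modelled as a flat list of cells in document order. Step 3 is not modelled separately; the same-table check in step 4 stands in for it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    tag: str                  # "th" or "td"
    table_id: str             # hypothetical: identifies the <table> the cell is in
    id: Optional[str] = None
    headers: str = ""         # raw value of the headers attribute


def get_element_by_id(document: List[Cell], wanted: str) -> Optional[Cell]:
    """Return the first element in document order with a matching id."""
    for cell in document:
        if cell.id == wanted:
            return cell
    return None


def resolve_headers(document: List[Cell], data_cell: Cell) -> List[Cell]:
    """Follow a data cell's headers attribute to its header cells."""
    found = []
    for token in data_cell.headers.split():          # step 1: split into tokens
        target = get_element_by_id(document, token)  # step 2: first matching id
        # Step 4: only a <th> in the same table is associated.
        if (target is not None and target.tag == "th"
                and target.table_id == data_cell.table_id):
            found.append(target)
    return found
```

Because only the first element with a given id is considered, a duplicate id earlier in the document can hide the intended header cell in this sketch.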

Irregular Associations via headers+id

An irregular data table has one or more header cells in a position which is not directly above or to the left of all the data cells it must be associated with. Smart Headers and HTML5 support these cases but require headers+id. Specifically:

Incremental Association

James Graham: “In principle I believe either algorithm could be written to run incrementally.”

Arbitrary Levels of Header Cells

Specialist Markup Takes Precedence

Differently Spanned Header Cells (Adjacent or Distant)

Data Cells Spanning Different Spans of Header Cells

Both HTML5 and Smart Headers use Forming a table to determine which slots in a table each cell covers, including spanned cells. As such, each header cell is associated with all cells that cover one or more slots in the area that header cell applies to.
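
To show what "slots" means here, the following is a much simplified sketch in Python, not the real Forming a table algorithm. The names `cell_slots` and `cells_in_area` are hypothetical, each cell is assumed to be a `(name, colspan, rowspan)` tuple with a unique name, and a wide cell running into slots already taken from above is not handled.

```python
from typing import Dict, List, Set, Tuple

Slot = Tuple[int, int]            # (column, row)
Row = List[Tuple[str, int, int]]  # cells as (name, colspan, rowspan)


def cell_slots(rows: List[Row]) -> Dict[str, List[Slot]]:
    """Map each cell name to the slots it covers, including spanned ones."""
    occupied: Set[Slot] = set()
    slots: Dict[str, List[Slot]] = {}
    for y, row in enumerate(rows):
        x = 0
        for name, colspan, rowspan in row:
            # Skip slots taken by a cell spanning down from an earlier row.
            while (x, y) in occupied:
                x += 1
            covered = [(x + dx, y + dy)
                       for dy in range(rowspan) for dx in range(colspan)]
            occupied.update(covered)
            slots[name] = covered
            x += colspan
    return slots


def cells_in_area(slots: Dict[str, List[Slot]], area: Set[Slot]) -> List[str]:
    """Names of cells covering at least one slot in the given area."""
    return [name for name, covered in slots.items()
            if any(slot in area for slot in covered)]
```

A header cell's area is just a set of slots, so a data cell that spans into that area is picked up even if it starts outside it.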


Header Cells for Header Cells

Header Cells Blocking Header Cells

Equally Spanned Header Cells, Adjacent

Equally Spanned Header Cells, Data Cells Between

Empty Cells

Broken Tables are Unsupported

Tables with incorrect semantics inevitably lead to incorrect results.

Ben’s Advice for HTML5

Both algorithms reflect extensive feedback, research and testing. Each makes design choices which are not universally agreed on. Below are my suggested changes for HTML5. They are informed by:

Header Cells for Header Cells

Tables can have multiple levels of header cells for columns or rows. Sometimes, the higher-level header cell is the only way to tell the lower-level header cells apart when moving along them.

<td> with Header Cell Semantics

Let <td> act the same as <th> when given header cell semantics via the scope or headers+id features.

Equally Spanned Header Cells, Adjacent

Ignore the sizes of adjacent header cells until you break out of header cells and into data cells.

Equally Spanned Header Cells, Distant

Block the current header cell from associating any further along that axis if you find a header cell with the same span after one or more data cells.
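
One possible reading of this rule, together with the adjacent case above, is sketched below in Python. It is an interpretation for illustration only: `associated_indexes` is a hypothetical name, and the cells further along the axis are assumed to be given as `(kind, span)` pairs.

```python
from typing import List, Tuple

# A cell further along the axis: (kind, span), where kind is "th" or "td".
Cell = Tuple[str, int]


def associated_indexes(header_span: int, following: List[Cell]) -> List[int]:
    """Indexes of the data cells a header cell reaches before it is blocked."""
    reached = []
    seen_data = False
    for index, (kind, span) in enumerate(following):
        if kind == "td":
            seen_data = True
            reached.append(index)
        elif seen_data and span == header_span:
            # A header cell with the same span after data cells blocks the rest.
            break
        # Other header cells (adjacent ones of any size, or later ones with
        # a different span) do not block in this sketch.
    return reached
```

For example, a header cell with span 1 followed by `th, td, td, th, td` (all span 1) reaches only the two data cells before the second distant header cell.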

Equally Spanned Header Cells Using scope, Distant

Should have the same blocking logic as the auto state.

Emptiness of Cells

Define emptiness the same way for header cells and data cells.

Empty Data Cells

These must get header cells associated with them.

Empty Header Cells

These should not create associations, since associating an empty header cell is effectively a no-op.


Early on, I considered <td><b> and <td><strong> as aliases of <th>. I now think the algorithm should not use heuristics due to their unreliability in real data tables.

Wide Header Cell Heuristic

Do not make a header cell at the start of a row with empty data cells act as if it spanned all those empty data cells.

scope, colspan & rowspan (15th November 2008)

In HTML4, scope is unaffected by colspan and rowspan. Let me walk you through the spec lawyering which leads me to this conclusion.



HTML4 defines 4 modes for scope:

row: The current cell provides header information for the rest of the row that contains it (see also the section on table directionality).
col: The current cell provides header information for the rest of the column that contains it.
rowgroup: The header cell provides header information for the rest of the row group that contains it.
colgroup: The header cell provides header information for the rest of the column group that contains it.

Already we see that scope only applies to:

Special treatment of header cells using colspan or rowspan is conspicuous by its absence.


scope="row" limits the association to “the rest of the row that contains it”. While talking about row groups, HTML4 says:

Each row group must contain at least one row, defined by the TR element.

So a row corresponds to a <tr> element. The header cell is only contained by the first <tr>.

There is no prose to say a cell with rowspan creates a special row. So “the row” doesn’t gain special meaning for spanned header cells.

Therefore, scope="row" only applies the header cell to the first row it spans.


scope="col" limits the association to “the rest of the column that contains it”. However, what exactly is a column in HTML4 data tables? This is a surprisingly slippery question to answer.

Column ≠ <col>!

Bit of a red herring but worth mentioning:

The COL does not group columns together structurally [...].

Number of Columns

Calculating the number of columns in a table specifies what to do in the absence of <col> elements:

The number of columns is equal to the number of columns required by the row with the most columns, including cells that span multiple columns.

A cell with colspan simply spans many columns. So “the column” doesn’t gain special meaning for header cells using colspan.

It goes on to say:

For any row that has fewer than this number of columns, the end of that row should be padded with empty cells.

It seems one cell adds one column, although this is described rather fuzzily.
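
The two quoted rules can be written out as a small Python sketch. It makes simplifying assumptions: each row is given as a list of its cells' colspan values, rowspan is ignored, and the function names `column_count` and `padding_needed` are hypothetical.

```python
from typing import List


def column_count(rows: List[List[int]]) -> int:
    """Columns required by the widest row; each entry is one cell's colspan."""
    return max((sum(row) for row in rows), default=0)


def padding_needed(rows: List[List[int]]) -> List[int]:
    """Empty cells to append to each row so it reaches the column count."""
    total = column_count(rows)
    return [total - sum(row) for row in rows]
```

A cell with colspan="3" simply counts as 3 towards its row's total, which fits the reading that a spanned cell covers many columns rather than forming one wide column.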

Contained by a Column

Cells cannot be a descendant of columnar markup in HTML4. So “the column” must take its meaning from “Calculating the number of columns in a table”. Namely, that one cell adds one column.

When a cell uses colspan, the additional cells it spans across are counted as additional columns in “Calculating the number of columns in a table”. Given that only “the column that contains it” is affected by scope="col", that must only be the first column.

Therefore, scope="col" only applies the header cell down the first column it spans.
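
The strict reading reached for scope="row" and scope="col" can be expressed as a short Python sketch. It assumes a left-to-right table laid out on a grid of `width` by `height` slots; `strict_scope_slots` and its parameters are hypothetical names, and this shows the spec-lawyered HTML4 interpretation, not what authors expect.

```python
from typing import Set, Tuple

Slot = Tuple[int, int]  # (column, row)


def strict_scope_slots(x: int, y: int, colspan: int, rowspan: int,
                       scope: str, width: int, height: int) -> Set[Slot]:
    """Slots a header cell at (x, y) applies to under the strict reading."""
    if scope == "row":
        # Only the first row the header spans, after the header ends.
        return {(cx, y) for cx in range(x + colspan, width)}
    if scope == "col":
        # Only the first column the header spans, below the header.
        return {(x, cy) for cy in range(y + rowspan, height)}
    raise ValueError("this sketch only covers scope='row' and scope='col'")
```

So a header cell with rowspan="2" and scope="row" at the top left of a 3 by 3 table reaches only the rest of the first row, and a header cell with colspan="2" and scope="col" reaches only the cells below its first column.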


scope="rowgroup" extends the association across “the rest of the row group that contains it.” From row groups:

When present, each THEAD, TFOOT, and TBODY contains a row group.

It also tells us a table without explicit row groups has an implied <tbody>:

<!ELEMENT TBODY    O O (TR)+           -- table body -->


The TBODY start tag is always required except when the table contains only one table body and no table head or foot sections.

Therefore, scope="rowgroup" applies the header cell to an area:

Using scope="rowgroup" in a table with an implied <tbody> (or in an explicit <thead>, <tfoot> or <tbody> which includes the rest of the rows in that table) applies that cell to the rest of the table, even when the rowspan finishes before that.


scope="colgroup" extends the association across “the rest of the column group that contains it.” From column groups:

Column groups allow authors to create structural divisions within a table. [...] The COL element allows authors to share attributes among several columns without implying any structural grouping.

So the <colgroup> element is all that scope="colgroup" is affected by. But how many columns does it cover? The specification for <colgroup span> explains:

This attribute, which must be an integer > 0, specifies the number of columns in a column group. Values mean the following:

User agents must ignore this attribute if the COLGROUP element contains one or more COL elements.

Calculating the number of columns in a table tells us that <col> items inside a <colgroup> do affect the number of columns that <colgroup> spans:

(“Step 1” is actually the parent item of that list, so this doesn’t make sense. Treating the first item of this bulleted list as “step 1” makes sense, though.)

You need <colgroup> elements “to create structural divisions within a table” and <colgroup span> can do this by itself. But if <col> or <col span> are also present, they set the number of columns spanned by their <colgroup> elements. These elements work in tandem.
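
How the two work in tandem can be sketched in Python. The function name `colgroup_columns` is hypothetical; the default of 1 for a missing span attribute follows HTML4, and each `<col>` is represented only by its span value.

```python
from typing import Sequence


def colgroup_columns(span: int = 1, col_spans: Sequence[int] = ()) -> int:
    """Number of columns covered by one <colgroup>."""
    if col_spans:
        # Any <col> children win: the colgroup's own span is ignored.
        return sum(col_spans)
    return span
```

For example, `<colgroup span="3">` with no children covers 3 columns, but the same element containing `<col>`, `<col span="2">` and `<col span="2">` covers 5.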

Column groups tells us that a <colgroup> is always present:

A table may either contain a single implicit column group (no COLGROUP element delimits the columns) or any number of explicit column groups (each delimited by an instance of the COLGROUP element).

Therefore scope="colgroup" applies the header cell to an area:

Using scope="colgroup" in a table with an implied <colgroup> (or an explicit <colgroup> which includes the rest of the columns in that table) applies that header cell to the rest of the table, even when the colspan finishes before that.


Yet tables I find on the web expect spanned cells to work with these values. Plain <th> cells which span are also expected to Just Work. Tutorials I’ve seen on the subject advise this, too.

HTML5’s header association algorithm follows a research-oriented design process. So did HTML4’s, of course. But HTML5 has the luxury of studying how tables ended up being authored on the web at large. This lets it optimise the common cases while making it more robust.

HTML5 makes the common cases fully accessible with trivial markup: use <th> for header cells and <td> for data cells.


Eventually, tutorials should advise plain <th> for header association in the common cases.

7th Blood Donation (14th November 2008)

Same location and use of my jumper to keep warm as before. Donation went really fast, although taking the needle out was pretty painful this time. It remained sore for an hour or so afterwards.

All part of being a hero, I guess! c{8¬)

TPAC Cost (11th November 2008)

Looks like the trip cost me getting on for £600 overall. Well worth spending an hour on the Excel expense sheet Mozilla sent me, with half a dozen receipts scanned and attached.

On the TGV I had a surprisingly tasty and filling lunch. A long baguette with cheese, ham and lettuce followed by a small but very chocolatey dessert and washed down with hot chocolate. This came to under €10 but I had it both ways, so that’s €20 all told.

Not really worth claiming, I thought. But that’s around £15 which is twice what I make per hour as Calthorpe Park School’s part-time webmaster. Scanning in the receipt and making a sensibly sized PNG out of it was a cinch. Well, it certainly took less than 2 hours!

Leaving Accessify Forums (7th November 2008)

After splitting 78 threads individually to get the spam out, I’m leaving Accessify Forum.

Deleting a user and all their messages from the Admin Control Panel probably takes 10 clicks. There have been many requests behind the scenes to give moderators access to it. Somehow, the owner of the parent site has consistently refused to do this.

It boils down to an absentee owner who won’t let other people help.

— Ben Millard

It was just like the slow, grinding halt which made me leave MISA. There’s an entry from Fantasai which describes the feeling well.

Calthorpe Wins “Most Accessible Website” (4th November 2008)

We were announced as being in the final 10 of the Hantsweb Awards way back on 15th September 2008. Some time during October we were listed as finalists.

On 4th November 2008, at the awards ceremony, we were finally announced as the winners of the Most Accessible Website category.

The Awards Ceremony

Just like a miniature TV awards ceremony, there’s a stage in front of an audience with professional lighting, sound and a local radio DJ as the host.

The venue was inside a planetarium. A massive digital screen forms a hemisphere above the audience.

Before proceedings really began, they ran an amazing CGI trailer type thing celebrating space exploration. Quite unexpected and very well produced! The moving camera angles coupled with the screen filling my peripheral vision made for a real sense of motion, even though we were all seated in very comfortable theatre chairs.

Some local politicians opened proceedings and the awards got underway promptly.

Our Chance in the Spotlight

“Most Accessible” was the first category. The big screen showed the award names and finalist websites as the host announced the category. All the finalists were invited to a set of chairs beside the stage so the audience could see us when the winner was announced.

It was Calthorpe Park School. We’d done it!

We went onto the stage and stood on our marks to get a picture taken. The audience were all very willing to applaud. It’s quite a nerve-racking experience, being the centre of attention like that. I was on a high but didn’t want to look insane for the photo.

While we were on stage, the host was saying how the judges summarised our website. I was so busy trying not to trip over, making sure I stood in the right place and was looking squarely into the camera lens that I didn’t hear a word of it!

We applauded the runners up, who were announced after we sat back down.

Other Finalists

The categories continued for an hour or so.

A lot of visibly well-designed websites became finalists. Mainly the winners had kept things simple. A couple stood out for marrying wonderfully clear layouts with great finesse and subtlety in their graphics and palette. That’s what I’ve wanted for Calthorpe. Alas, I haven’t the skill and they haven’t the budget.

Warren and I occasionally noted the hyperlink styles websites use. I think he still dislikes the “blue and underlined” mantra I enforce on Calthorpe…

Success at Mingling

With the last prizes awarded, everyone headed for the buffet area. I had a handful of crisps and 2 sausage rolls, washed down with glasses of pleasant orange juice.

People seemed to be standing in the same groups rather than diffusing throughout the room. I had a go at finding some other school type places and chatted to Wyvern Technology College for a while. They use Joomla under the hood, as I’ve now confirmed:

<meta name="Generator" content="Joomla! - Copyright (C) 2005 - 2007 Open Source Matters. All rights reserved." />

Joomla is used by sDesign1 as well, such as Tower College. I’ve just noticed the devastating news they’ll be moving to a Sharepoint-powered portal. There goes the neighbourhood. :*(

Dude who does the Wyvern Technology College site said it took a bit of customising to get it all to W3C standards. He knew about CSS layout and did cross-browser compatibility firefighting. Studied web design at university, although it was mostly graphic design.

We’re both hosted by Hampshire County Council and are both dead impressed by their service. They keep hitting the 150MB disk space limit. I suggested paying to get more. Not sure how much Calthorpe uses but we’ve not had issues with disk space, bandwidth or CPU usage.

Future Plans

At some point, I want Calthorpe to look as good as it works. I mentioned the requirements the current “design” has to accommodate to a Hampshire County Council communications dude, who interviewed us.

Our audience ranges from Year 5 kids all the way up to grandparents of Year 11 students and all points in between. It has to be many things at once:

Being the archetype for “accessible but boring” is kinda lame, especially considering what I do with sDesign1. It’s a challenge worthy of a great designer. But great designers cost great piles of money which Calthorpe doesn’t have. Maybe we’ll find one next year.

Still, it’s a labour of love and I’m proud to win against 100 other entrants for 2 years in a row!