scope, colspan & rowspan (15th November 2008)

In HTML4, scope is unaffected by colspan and rowspan. Let me walk you through the spec lawyering which leads me to this conclusion.

RTFM

HTML4 defines four values for scope:

row:
The current cell provides header information for the rest of the row that contains it (see also the section on table directionality).
col:
The current cell provides header information for the rest of the column that contains it.
rowgroup:
The header cell provides header information for the rest of the row group that contains it.
colgroup:
The header cell provides header information for the rest of the column group that contains it.

Already we see that scope only applies to “the rest of” the row, column, row group or column group that contains the header cell.

Special treatment of header cells using colspan or rowspan is conspicuous by its absence.

scope="row"

Limits the association to “the rest of the row that contains it”. While talking about row groups, the spec says:

Each row group must contain at least one row, defined by the TR element.

So a row corresponds to a <tr> element. The header cell is only contained by the first <tr>.

There is no prose to say a cell with rowspan creates a special row. So “the row” doesn’t gain special meaning for spanned header cells.

Therefore, scope="row" only applies the header cell to the first row it spans.
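
As a minimal sketch of that reading, consider the table below. The fruit data is invented purely for illustration. Under the interpretation above, the “Apples” header would only be associated with the cells in the first <tr>, even though it visually spans both rows.

```html
<table>
  <tr>
    <!-- Header cell spans two rows but is contained only by this first TR -->
    <th scope="row" rowspan="2">Apples</th>
    <td>Red</td>
    <td>12</td>
  </tr>
  <tr>
    <!-- On a strict HTML4 reading, these cells are not headed by "Apples" -->
    <td>Green</td>
    <td>7</td>
  </tr>
</table>
```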

scope="col"

Limits the association to “the rest of the column that contains it”. However, what exactly is a column in HTML4 data tables? This is a surprisingly slippery question to answer.

Column ≠ <col>!

Bit of a red herring but worth mentioning:

The COL does not group columns together structurally [...].

Number of Columns

“Calculating the number of columns in a table” specifies what to do in the absence of <col> elements:

The number of columns is equal to the number of columns required by the row with the most columns, including cells that span multiple columns.

A cell with colspan simply spans many columns. So “the column” doesn’t gain special meaning for header cells using colspan.

It goes on to say:

For any row that has fewer than this number of columns, the end of that row should be padded with empty cells.

It seems one cell adds one column, although this is described rather fuzzily.
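
To make the counting concrete, here is a small hypothetical table with no <col> elements. If I read the quoted rules correctly, the first row requires three columns, so the table has three columns and the second row should be padded with one empty cell.

```html
<table>
  <tr>
    <!-- 1 + 2 = 3 columns required: this row needs the most columns -->
    <td>A</td>
    <td colspan="2">B</td>
  </tr>
  <tr>
    <!-- Only 2 columns here: the end of this row should be padded
         with one empty cell to reach 3 -->
    <td>C</td>
    <td>D</td>
  </tr>
</table>
```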

Contained by a Column

Cells cannot be descendants of columnar markup in HTML4. So “the column” must take its meaning from “Calculating the number of columns in a table”. Namely, that one cell adds one column.

When a cell uses colspan, the additional cells it spans across are counted as additional columns in “Calculating the number of columns in a table”. Given that only “the column that contains it” is affected by scope="col", that must only be the first column.

Therefore, scope="col" only applies the header cell down the first column it spans.
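
A sketch of what that means in practice, using made-up cell contents: the “Price” header spans two columns, but on this reading it only heads the cells in the first of them.

```html
<table>
  <tr>
    <!-- Header cell spans two columns; "the column that contains it"
         is taken to be the first one only -->
    <th scope="col" colspan="2">Price</th>
  </tr>
  <tr>
    <!-- First column: headed by "Price" -->
    <td>Net</td>
    <!-- Second column: not headed by "Price" on a strict HTML4 reading -->
    <td>Gross</td>
  </tr>
</table>
```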

scope="rowgroup"

Extends the association across “the rest of the row group that contains it.” From row groups:

When present, each THEAD, TFOOT, and TBODY contains a row group.

It also tells us a table without explicit row groups has an implied <tbody>:

<!ELEMENT TBODY    O O (TR)+           -- table body -->

[...]

The TBODY start tag is always required except when the table contains only one table body and no table head or foot sections.

Therefore, scope="rowgroup" applies the header cell to an area:

Using scope="rowgroup" in a table with an implied <tbody> (or in an explicit <thead>, <tfoot> or <tbody> which includes the rest of the rows in that table) applies that cell to the rest of the table, even when the rowspan finishes before that.
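
Here is an illustrative sketch with invented data. There are no <thead>, <tfoot> or <tbody> tags, so every row sits in the single implied <tbody>. By the reasoning above, “Fruit” would still apply to the last row, even though its rowspan ended a row earlier.

```html
<table>
  <!-- No THEAD, TFOOT or TBODY tags: all rows are in one implied TBODY -->
  <tr>
    <th scope="rowgroup" rowspan="2">Fruit</th>
    <td>Apples</td>
  </tr>
  <tr>
    <td>Pears</td>
  </tr>
  <tr>
    <!-- The rowspan finished above, but this row is in the same row group,
         so on this reading "Fruit" still applies here -->
    <td>Other</td>
    <td>Carrots</td>
  </tr>
</table>
```

Following the same logic, wrapping the first two rows in their own explicit <tbody> would confine the header to those rows.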

scope="colgroup"

Extends the association across “the rest of the column group that contains it.” From column groups:

Column groups allow authors to create structural divisions within a table. [...] The COL element allows authors to share attributes among several columns without implying any structural grouping.

So scope="colgroup" is affected only by the <colgroup> element. But how many columns does it cover? The specification for <colgroup span> explains:

This attribute, which must be an integer > 0, specifies the number of columns in a column group. Values mean the following:

User agents must ignore this attribute if the COLGROUP element contains one or more COL elements.

“Calculating the number of columns in a table” tells us that <col> elements inside a <colgroup> do affect the number of columns that <colgroup> spans:

(“Step 1” is actually the parent item of that list, so this doesn’t make sense. Treating the first item of this bulleted list as “step 1” makes sense, though.)

You need <colgroup> elements “to create structural divisions within a table” and <colgroup span> can do this by itself. But if <col> or <col span> are also present, they set the number of columns spanned by their <colgroup> elements. These elements work in tandem.
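
A short sketch of the two cases, with hypothetical span values chosen only to show the difference. The first column group gets its width from its own span attribute. The second contains <col> elements, so its span attribute must be ignored and the <col> elements decide the width.

```html
<table>
  <!-- span alone: this column group covers 2 columns -->
  <colgroup span="2"></colgroup>
  <!-- COL children present: span="5" must be ignored.
       The group covers 1 + 2 = 3 columns instead -->
  <colgroup span="5">
    <col>
    <col span="2">
  </colgroup>
  <tr>
    <td>1</td><td>2</td><td>3</td><td>4</td><td>5</td>
  </tr>
</table>
```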

Column groups tells us that a <colgroup> is always present:

A table may either contain a single implicit column group (no COLGROUP element delimits the columns) or any number of explicit column groups (each delimited by an instance of the COLGROUP element).

Therefore scope="colgroup" applies the header cell to an area:

Using scope="colgroup" in a table with an implied <colgroup> (or an explicit <colgroup> which includes the rest of the columns in that table) applies that header cell to the rest of the table, even when the colspan finishes before that.
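
For contrast, here is a sketch (the years and half-year labels are invented) where explicit <colgroup> elements match the colspan of each header. On this reading, each header is confined to its own column group. Remove the two <colgroup> elements and the single implicit column group would cover all four columns, so “2007” would apply to the rest of the table.

```html
<table>
  <!-- Two explicit column groups of 2 columns each -->
  <colgroup span="2"></colgroup>
  <colgroup span="2"></colgroup>
  <tr>
    <!-- Each header sits in, and is limited to, its own column group -->
    <th scope="colgroup" colspan="2">2007</th>
    <th scope="colgroup" colspan="2">2008</th>
  </tr>
  <tr>
    <td>H1</td><td>H2</td><td>H1</td><td>H2</td>
  </tr>
</table>
```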

Innovation

Yet tables I find on the web expect spanned cells to work with these values. Plain <th> spanning cells are also expected to Just Work. Tutorials I’ve seen on the subject advise this, too.

HTML5’s header association algorithm follows a research-oriented design process. So did HTML4’s, of course. But HTML5 has the luxury of studying how tables ended up being authored on the web at large. This lets it optimise the common cases while making the algorithm more robust.

HTML5 makes the common cases fully accessible with trivial markup: use <th> for header cells and <td> for data cells.
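
The sort of trivial markup meant here might look like the sketch below. The data is invented, and no scope or headers attributes are used at all.

```html
<table>
  <tr>
    <!-- Column headers: plain TH, no scope attribute -->
    <th>Fruit</th>
    <th>Colour</th>
    <th>Stock</th>
  </tr>
  <tr>
    <!-- Row header: plain TH again, followed by ordinary data cells -->
    <th>Apples</th>
    <td>Red</td>
    <td>12</td>
  </tr>
</table>
```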

Conclusion

Eventually, tutorials should advise plain <th> for header association in the common cases.