HTML-compliant ToC and ToC data structure available to plugins

Hi everyone,

Recently I noticed that the HTML standard still doesn't allow a <ul> or
<ol> element to have another <ul> or <ol> directly inside it.
They can only have <li> elements as children, and nested lists should
go inside those <li>'s.

Then I noticed that a lot of tables of contents (and nested lists in
general) around the web are non-compliant as well.
However, HTML-compliant nesting in fact has advantages for automated
processing.
It's also quite a bit harder to produce if you work with a flat list of
headings, which may be the reason why so many ToC's use invalid HTML.

If you have a tree of document sections, generating valid nested lists
from it is trivial. Many other things are also much simpler.
However, making a tree from a flat list isn't easy. It's pretty much an
LR(1) parser that parser generators can't help you with.
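
To illustrate the "trivial" half of that claim, here is a minimal Python sketch (not soupault code) that renders a section tree as valid nested lists. The node shape, a (title, children) pair, and the name render_toc are assumptions made for this example only.

```python
from html import escape

def render_toc(nodes):
    """Render a list of (title, children) nodes as nested, valid HTML lists."""
    if not nodes:
        return ""
    items = []
    for title, children in nodes:
        # The nested <ul> goes inside the parent's <li>, never next to it.
        items.append("<li>" + escape(title) + render_toc(children) + "</li>")
    return "<ul>" + "".join(items) + "</ul>"

# Hypothetical sample tree for illustration.
tree = [
    ("Chapter one", [("Section one", [])]),
    ("Chapter two", []),
]
print(render_toc(tree))
```

Because the recursion happens while the parent's <li> is still open, the output can't end up with a <ul> as a direct child of another <ul>.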

I've made up a general algorithm for building a headings tree.
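
The actual implementation isn't reproduced in this message. As a rough illustration of the problem only, one common approach keeps a stack of currently open sections. The Python sketch below assumes the input is a flat list of (level, title) pairs with levels starting at 1; build_tree and the [title, children] node shape are hypothetical names for this example, not soupault's.

```python
def build_tree(headings):
    """Turn a flat list of (level, title) pairs into nested [title, children] nodes."""
    root = []
    # Stack of (level, children_list) for the sections that are still open.
    # The level-0 sentinel stands for the document itself and is never popped,
    # because real heading levels are assumed to be >= 1.
    stack = [(0, root)]
    for level, title in headings:
        # Close every open section that is not a strict ancestor of this heading.
        while stack[-1][0] >= level:
            stack.pop()
        node = [title, []]
        stack[-1][1].append(node)        # attach to the nearest ancestor
        stack.append((level, node[1]))   # this heading is now the open section
    return root

# Hypothetical flat list of headings in document order.
flat = [
    (1, "Chapter one"),
    (2, "Section one"),
    (3, "Detail"),
    (2, "Section two"),
    (1, "Chapter two"),
]
print(build_tree(flat))
```

In this sketch a skipped level (say, an <h3> directly after an <h1>) simply becomes a child of the nearest shallower heading. That is a design choice of the example, not necessarily what soupault does.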

First, there's now a "valid_html" boolean option in the toc widget config:

  widget = "toc"
  valid_html = true

It's false by default for now, but I'm tempted to make it true by
default. Hristos and I tested it on our websites and couldn't see any
visual problems.
It may break some styling, but I don't think I've seen any CSS that
relies on the old nesting to work, so it may be better to change it
before anyone does that.

For a live test, I've made a collapsible tree ToC on
The new option forces correct nesting. Then a plugin converts <li>
elements that have a <ul> in them to HTML5 <details>/<summary> to make
them collapsible.
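
The real plugin is a soupault Lua plugin and isn't shown here. To give an idea of the transformation, here is a Python sketch using the standard library's xml.etree.ElementTree. It assumes well-formed markup and that each item's label is plain text placed before the nested list; make_collapsible is a hypothetical name.

```python
import xml.etree.ElementTree as ET

def make_collapsible(root):
    """Wrap the content of every <li> that has a nested <ul> in <details>/<summary>."""
    # Collect the items first, since the tree is modified while we work on it.
    for li in list(root.iter("li")):
        if li.find("ul") is None:
            continue  # leaf item, leave it as is
        details = ET.Element("details")
        summary = ET.SubElement(details, "summary")
        # Assumption: the item label is plain text preceding the nested list.
        summary.text = (li.text or "").strip()
        li.text = None
        # Move the nested list(s) into <details>, after the <summary>.
        for child in list(li):
            li.remove(child)
            details.append(child)
        li.append(details)
    return root

toc = ET.fromstring(
    "<ul><li>Chapter one<ul><li>Section one</li></ul></li><li>Chapter two</li></ul>"
)
print(ET.tostring(make_collapsible(toc), encoding="unicode"))
```

This only works reliably with the compliant nesting: when the nested <ul> is a child of its <li>, "item with sub-items" is a simple parent/child check rather than a guess based on sibling order.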

But wait, there's more! To make it easier to write custom ToC plugins,
the headings tree is also available from Lua.

A new HTML.get_headings_tree function returns a nested table like

  {{"<h1>Chapter one</h1>", {{"<h2>Section one</h2>", {}}}},
   {"<h1>Chapter two</h1>", {}}}

The first element is actually a reference to an element tree item, not a
string, but you get the idea.
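
To show how such a structure can be consumed, here is a Python sketch that walks a tree of the same shape and produces an indented outline. It assumes each node is a pair of a heading and a list of child nodes, and it uses plain strings where the real table holds element references; outline is a hypothetical helper, not part of the plugin API.

```python
def outline(tree, depth=0):
    """Return one line per heading, indented by two spaces per nesting level."""
    lines = []
    for heading, children in tree:
        lines.append("  " * depth + heading)
        # Children have the same shape as the top-level list, so just recurse.
        lines.extend(outline(children, depth + 1))
    return lines

# Python mirror of the nested table above, with strings instead of elements.
tree = [
    ["<h1>Chapter one</h1>", [["<h2>Section one</h2>", []]]],
    ["<h1>Chapter two</h1>", []],
]
print("\n".join(outline(tree)))
```

A custom ToC plugin would do the same kind of recursion, just emitting list elements instead of text lines.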

Let me know if you think valid_html should be the new default, and
whether you have any issues with that code.