As we all know, documentation is one of the most opinionated aspects of every workspace. It usually settles on a common denominator of team preferences, available tools, and downstream requirements.
I would like to raise a principle that I personally think applies to documentation, and open it for discussion so we can benefit from our shared experience on the topic.
I believe that the “single source of truth” principle should apply to documentation in some sense, though I am sure we each look at this from a different angle.
Those angles often relate to stages of the processing pipeline, which broadly looks like this:
- (sometimes) Populated or partially populated from the source code.
- (sometimes) Transformed from a more expressive form to a more portable one, like `rst/md -> md/html`.
- (often) Serialized into a more transportable form, like `md/html -> json`.
- (always) Rendered from a trans/portable form into a stateful presentational structure, like `md/json/html -> #fragment`.
- (always) Captured for syndication in resynthesized form(s), like readers, sharing, and end-user data transfer operations (clipboard, drag and drop).
Opinion: the single source of truth principle can apply to how we represent the parsed structure, irrespective of the serialized form. Imho there is technically already an ideal model for that (the DOM). It has sometimes been avoided for historical reasons, but changing paradigms (like accessibility) mean it may no longer be avoidable.
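To make that a bit more concrete, here is a minimal Python sketch of what I mean, not a proposal for any particular tool. The `Node` class is a hypothetical stand-in for a DOM-like tree, and `to_html`, `to_json`, and `from_json` are names I made up for illustration. The only point is that both serialized forms are derived from one parsed structure, so neither of them has to be treated as the source of truth.

```python
import json
from dataclasses import dataclass, field
from html import escape
from typing import Union


@dataclass
class Node:
    """Hypothetical DOM-like element: a tag, attributes, and ordered children."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # items are Node or str


def to_html(node: Union[Node, str]) -> str:
    """Serialize the tree to HTML (one possible portable form)."""
    if isinstance(node, str):
        return escape(node)
    attrs = "".join(f' {k}="{escape(v)}"' for k, v in node.attrs.items())
    inner = "".join(to_html(child) for child in node.children)
    return f"<{node.tag}{attrs}>{inner}</{node.tag}>"


def to_dict(node: Union[Node, str]):
    """Convert the tree to plain dicts, lists, and strings."""
    if isinstance(node, str):
        return node
    return {
        "tag": node.tag,
        "attrs": node.attrs,
        "children": [to_dict(child) for child in node.children],
    }


def to_json(node: Union[Node, str]) -> str:
    """Serialize the same tree to JSON (one possible transportable form)."""
    return json.dumps(to_dict(node))


def from_dict(data) -> Union[Node, str]:
    """Rebuild the tree from the plain structure produced by to_dict."""
    if isinstance(data, str):
        return data
    children = [from_dict(child) for child in data["children"]]
    return Node(data["tag"], dict(data["attrs"]), children)


def from_json(text: str) -> Union[Node, str]:
    """Parse the JSON form back into the tree."""
    return from_dict(json.loads(text))


# A nested list item containing a block quote, built once as a tree.
doc = Node("ol", {}, [
    Node("li", {}, ["First", Node("blockquote", {}, ["Nested quote"])]),
    Node("li", {}, ["Second"]),
])

# Round-trip through JSON, then render both trees to HTML.
restored = from_json(to_json(doc))
print(to_html(doc))
print(to_html(restored) == to_html(doc))
```

If the tree is what every stage agrees on, then the JSON and the HTML are just views of it, and a round trip through either should not change what the reader ends up with.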
Over the years, documentation pipelines were designed around certain necessities, but those necessities are no longer justifiable today (i.e. clients are now capable enough, if not the preferred place, to do this work).
Factors to consider:
- Generation (i.e. backend rendering): like GitHub’s, which many can relate to, since it has complicated people's workflows in different ways. Even GitHub Pages still rendered the same markdown file differently (as of late 2018) and in fact failed to render more deeply nested structures (I believe it was layered `<li>` and `<blockquote>` that broke).
- Styling: content-specific typographic hints (beyond `<b>` or `<i>`) and layout, which are separate from page-specific styling but must align with it, plus the different schools of thought on separation of concerns. It is messy, to say the least.
- Notation: I personally think of this as the author-referred abstractions of syntax in the case of documentation. It is only approximated by the tokenizer and then realized by sometimes hacky simulation logic; that’s imho how most solutions try to give the façade of syntax tokens.
- Substance: I personally think of this as the reader-referred abstractions of semantics in the case of documentation. This is a far more complicated topic to bullet, but think of the `alt` of an `<img>` as the simplest example, along with all meta-content aspects and all meaningful content structures (e.g. `<ol><li>`); that imho is where creative pipelines often get tangled.
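As a rough illustration of the "substance" and nesting points above, here is a small Python sketch that reads rendered HTML and reports what a reader would actually depend on: image `alt` text and how deeply lists and block quotes are nested. It uses the standard library's `html.parser`. The `SubstanceAudit` class, its attribute names, and the sample markup are all hypothetical, and this is nowhere near a real accessibility checker.

```python
from html.parser import HTMLParser


class SubstanceAudit(HTMLParser):
    """Collects reader-facing facts from rendered HTML: image alt text and
    how deeply list and block quote elements are nested."""

    TRACKED = {"ol", "ul", "li", "blockquote"}

    def __init__(self):
        super().__init__()
        self.stack = []               # currently open tracked tags
        self.max_depth = 0            # deepest nesting of tracked tags seen
        self.alt_texts = []           # alt values of images that have one
        self.images_missing_alt = []  # src of images with no alt attribute

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            if "alt" in attr_map:
                # An empty alt is valid for decorative images, so keep it.
                self.alt_texts.append(attr_map["alt"] or "")
            else:
                self.images_missing_alt.append(attr_map.get("src") or "")
        elif tag in self.TRACKED:
            self.stack.append(tag)
            self.max_depth = max(self.max_depth, len(self.stack))

    def handle_endtag(self, tag):
        if tag in self.TRACKED and tag in self.stack:
            # Pop back to the matching open tag, tolerating unclosed children.
            while self.stack.pop() != tag:
                pass


# Hypothetical rendered output with a layered <li> and <blockquote>.
sample = (
    "<ol><li>Step<blockquote><ul><li>Detail "
    '<img src="a.png" alt="Diagram of the pipeline">'
    "</li></ul></blockquote></li>"
    '<li><img src="b.png"></li></ol>'
)

audit = SubstanceAudit()
audit.feed(sample)
audit.close()

print(audit.alt_texts)
print(audit.images_missing_alt)
print(audit.max_depth)
```

Running the same audit on the output of two different renderers for the same source file would be one cheap way to notice when nested structure or alt text gets lost along the way.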
Sorry, my own thoughts are a little scattered, and I am using this as an opportunity to frame them better. They are, however, the result of a year-long dive into the topic, trying to gain perspective and to solve problems that relate to more diverse notions of accessibility.
Thoughts?