jgart: 1 Various fixes. 1 file changed, 112 insertions(+), 115 deletions(-)
---
Hi Andrew,

Here is a patch with various fixes and some suggestions. Feel free to take all of it or just what you like.

all best,
jgart

Hello,

Thank you very much for such deep copyediting, jgart! I updated it in a few places, reworded the commit message, and merged it.

pages/posts/2023-05-10-scheme-ssgs-review.md | 227 +++++++++---------- 1 file changed, 112 insertions(+), 115 deletions(-) diff --git a/pages/posts/2023-05-10-scheme-ssgs-review.md b/pages/posts/2023-05-10-scheme-ssgs-review.md index 967698e..93f406f 100644 --- a/pages/posts/2023-05-10-scheme-ssgs-review.md +++ b/pages/posts/2023-05-10-scheme-ssgs-review.md @@ -32,7 +32,7 @@ and provides a lot of functionality outside of SSG scope, so we don't cover it in this writing. Basically we have only one option left at the moment: Haunt and the -further discussion will be ralated to it, but before exploring it, we +further discussion will be related to it, but before exploring it, we need to get to common ground and cover the topic of different markup languages. @@ -40,7 +40,7 @@ languages. Markup languages are used for defining documentation structure, formatting, and relationship between its parts. They play an important role in SSGs, different languages can suite better for -different tasks: simple and expressive for human convinience, powerful +different tasks: simple and expressive for human convenience, powerful and capable for intermediate representation and manipulation, compatible and wide-spread for distribution. @@ -61,9 +61,9 @@ it's defined in the SGML Doctype language. Often used for representing and exchange data. HTML is a more user friendly markup language, it's defined in plain -english, has more forgiving parses and interpreters, allows things +english, has more forgiving parsers and interpreters, allows things like uppercased tags, tags without matching closing tag. Such -flexibilities can be convinient for users, but it makes it harder to +flexibilities can be convenient for users, but it makes it harder to programmaticaly operate on it (parse, process and serialize). XHTML (XML serialization of HTML) is a version of HTML, which is @@ -75,7 +75,7 @@ relationship between them and tools for XML can't be used for HTML in general case. 
### Lightweight Markup Languages -This is another family of markup languages, which are simplier, less +This is another family of markup languages, which are simpler, less verbose, and more human-oriented in general. The notable members are Wiki, Markdown, Org-mode, reStructuredText, BBCode, AsciiDoc. @@ -85,9 +85,9 @@ final output is produced, usually in the form of (X)HTML documents. ### Other Markup Languages There are a number of languages and typesetting systems, which are not -covered by previous two sections: Texinfo, LaTeX, Skribe, Hiccup, +covered by the previous two sections: Texinfo, LaTeX, Skribe, Hiccup, SXML. The goals for them can be different: preparing hardcopies, -using as intermediate format, or just more suitable for specific needs +use as an intermediate format, or better suitability for specific needs like writing documentation. ## Haunt Overview @@ -100,13 +100,13 @@ HTML. Let's discuss various parts of this process in more details. ### SXML SXML is a representation of XML using S-expressions: lists, symbols -and strings, which can be less verbose than original representation +and strings, which can be less verbose than the original representation and much easier to work with in Scheme. -[SXML](https://okmij.org/ftp/Scheme/xml.html#SXML-spec) is used as -intermediate format for pages and their parts in Haunt, which +[SXML](https://okmij.org/ftp/Scheme/xml.html#SXML-spec) is used as an +intermediate format for pages and their parts in Haunt, which is relatively easy to process, manipulate and later serialize to target -formats like XHTML. It can be crafted by creating s-expression from +formats like XHTML. It can be crafted by creating s-expressions from Scheme code manually, or programmatically, or with a mix of both. 
It looks like this: @@ -126,7 +126,7 @@ looks like this: As it was mentioned in the introduction there is no direct relationship between XML and HTML, and while we usually can parse arbitrary HTML and convert it to SXML without losing significant -information, we can't directly use XML parses for that. For example +information, we can't directly use XML parsers for that. For example this HTML is not valid XML: ```html @@ -136,28 +136,28 @@ this HTML is not valid XML: Luckily, we can present boolean attributes in full form as `hidden="hidden"`, which is valid both in HTML[^4] and XML. -Most lightweight markup languages as well as SSGs usually targeting -HTML, but SSG needs to combine the content, templates and data from -various sources and merge them together, so SXML looks as a solid +Most lightweight markup languages (including SSGs) usually target +HTML. But SSGs needs to combine the content, templates and data from +various sources and merge them together, so SXML looks like a solid choice for intermediate representation. ### The Transformation Workflow -Each site page is built out of a series of consequently applied -trasformations, the transformation is basically a function, which +Each site page is built out of a series of subsequently applied +trasformations. The transformation is basically a function, which accepts some metadata and data and returns another data (usually SXML) -and sometimes additional metadata. Because transformation is a basic -pure function, a few transformations can be composed in one bigger +and sometimes additional metadata. Because this transformation is +a pure function, a few transformations can be composed in one bigger transformation. We will cover it in more details in the next section, but readers, -templates, layouts, serializers, builders are all just -transformations. For example the top level template, called layout -just produces SXML for the final page, which can be serialized to the -target format. 
To demonstrate the workflow we will go bottom up. +templates, layouts, serializers, builders are all just transformations. +For example the top level template, called layout just produces SXML +for the final page, which can be serialized to the target format. +To demonstrate the workflow we will take a bottom-up approach. Let's take a simple Markdown file, where one wants to write the content of a blog post in human-friendly markup langugage and let's -add a metadata to the top of this file: title, publish date, tags. +add metadata to the top of this file: title, publish date, tags. ```Markdown title: Hello, CommonMark! @@ -168,7 +168,7 @@ tags: markdown, commonmark ## This is a CommonMark post CommonMark is a **strongly** defined, *highly* compatible -specification of Markdown, learn more about CommomMark +specification of Markdown. Learn more about CommomMark [here](http://commonmark.org/). ``` @@ -190,29 +190,29 @@ data (SXML). Metadata+data representing one post is a good unit of operation. With one more transformation (it can be just a template, function adding `html`, `head`, `body` tags and a few more minor things) SSG can -produce almost ready for serialization SXML. Decide on resulting file -name, one more serialization step and final HTML is here. +produce almost ready for serialization SXML. After deciding on the +resulting file name and serialization step, the final HTML is produced. -Some additional transformation can be desirable in between: substitute -relative links to source markup files to finally generated html files -or something else, but overall it fits this general trasformation +Some additional transformations can be desirable. For example, +substituting relative links to source markup files in the generated html +files or something else, but overall it fits this general trasformation workflow well. -Let's zoom out a little and take a look at the directory, rather than -a single file. 
Usually, SSGs operate on a number of files and in +Let's zoom out a little and take a look at the directory structure, rather +than a single file. Usually, SSGs operate on a number of files and in addition to simple pages can generate composite pages like a list of -articles, rss feeds or something else. For this purpose our unit of -operation becomes a list of data+metadata objects: instead of parsing -one markup file SSG traverses the whole directory and generates a list -of objects for future transformation, overall idea still the same, but -instead many output files for many input files, SSG produces a list -containing only a few or even one output file. +articles, rss feeds, etc. For this purpose our unit of operation become +a list of data+metadata objects: instead of parsing one markup fil, +SSGs traverses the whole directory and generate a list of objects for +future transformation. The overall idea is still the same, but instead +many output files get produced from many input files. SSGs produces a +list containing only a few or even one output file. ### The Implementation #### The Entry Point -The entry point is a `site` record, which can be created with a -function having the following docstring: +The entry point in haunt is a `site` record, which can be created with +a function that has the following docstring: ``` Create a new site object. All arguments are optional: @@ -232,35 +232,33 @@ READERS: A list of reader objects for processing posts BUILDERS: A list of procedures for building pages from posts ``` -The primary thing here is a list of builders, as previously mentioned -a builder is a special case of complex transformation, it's a thing, -which do all the work including parsing, templating, generating -collections, serialization, etc. +The primary thing here is a list of builders. 
As previously mentioned, +a builder is a special case of complex transformation, which does all the +work of parsing, templating, generating collections, serialization, etc. -The rest of the list is basically metadata or auxiliary functions, -while many of those values can be useful, almost none of them are -needed in many cases. `scheme` and `domain` used for rss/atom feeds, -which are rare for personal or landing pages, the similiar logic is -applicable for the rest of function arguments, except maybe -`build-directory`, which almost always make sense. +The rest of the list is basically metadata or auxiliary functions. +While many of those values can be useful, almost none of them are needed +in many cases. `scheme` and `domain` are used for rss/atom feeds, which +are rare for personal or landing pages. Similiar logic is applicable +to the rest of the function arguments, except for maybe `build-directory`, +which almost always make sense. -Providing default values for them is convinient, but making them to be -fields of `site` records incorporates unecessary assumptions about -blog nature of the site, which can negatively impact the rest of the -implementation by adding unwanted coupling and reducing composability. -One of the options to avoid it is to make them to be values in -default-metadata rather than fields in the record. +Providing default values for them is convenient, but making them fields +of `site` records incorporates unecessary assumptions about the nature +of the blog and can negatively impact the rest of the implementation by +adding unwanted coupling as well as reducing its composability. One of +the options to avoid it is to make them values in the default-metadata +rather than fields in the record. #### Builders, Themes and Readers Builders are functions, which accept `site` and `posts`, apply series -of transformations and returns a list of artifacts. Themes and -Readers are basically transformations used somewhere in the build -process. 
Artifacts are records, which have `artifact-writer` field, -containing a closure writing actual output file. There are a number -of different builders provided out of the box, but the most basic one -(static-page) is missing, luckily it's not hard to implement it, so -let's do it. +of transformations and returns a list of artifacts. Themes and Readers +are basically transformations used in the build process. Artifacts are +records, which have `artifact-writer` field, containing a closure writing +the actual output file. There are a number of different builders provided +out of the box, but the most basic one (static-page) is missing, luckily +it's not hard to implement it, so let's do it. ```scheme (define* (page-theme #:key (footer %default-footer)) @@ -296,26 +294,26 @@ path." ``` As described in a section about transformations, the series of -transmorations happens here: -- `read-post` basically prases markdown and returns SXML + metadata. +transformations happens here: +- `read-post` basically parses markdown and returns SXML + metadata. - `render-post` uses post-template from `theme` to produce SXML post body. - `render-post` uses layout from `theme` to produce SXML post body. - `serialized-artifact` creates a closure, which wraps `sxml->html` and will later serialize obtained SXML for the page to HTML. -The implementation using already existing API is quite easy, but +The implementation using already existing APIs is quite easy, but unfortunately not perfect. While functions and records are composable enough to produce desired results, names are quite confusing and tightly related to blogs, but doesn't make much sense in the context of other site types. Every builder always accepts a list of posts, which were read and -transformed into sxml before ahead, this is imlpicit and again blog -related, which makes implementation less generic. 
It could be -implemented in the `blog` builder, but this way other builders like -atom-feed won't be able to reuse readed posts from from `blog` builder -and would need to read them again. This is due to the fact, that -build process has 3 primary steps and looks like this: +transformed into sxml ahead of time. This transformation is implicit +and again blog related, which makes the implementation less generic. +It could be implemented in the `blog` builder, but this way other +builders like atom-feed won't be able to reuse readed posts from from +`blog` builder and would need to read them again. This is due to the +fact, that the build process has three primary steps and looks like this: ```scheme ;; 1. Prepare site and posts @@ -351,7 +349,7 @@ this: ``` Just a series of transformations, which enriches one associative data -structures. Moreover it makes the implementation of such +structure. Moreover it makes the implementation of such transformations much more composable: ```scheme @@ -409,81 +407,80 @@ page, which relies on the content of previous steps, for example a collection of generated rss/atom links. However, such implementation has its own flaws: more flexibility and -less rigid structure can lead to more user mistakes and steeper -learning curve, original implementation theoretically could run -builders in parallel, but here one will need to implement it on the +less rigid structure can lead to more user mistakes and a steeper +learning curve. The original implementation could theoretically run +builders in parallel, but one will need to implement it on the user or builder side. ### Readers -As a component of the build process we encountered a step, where file -with in markup language is read by readers. There are two parts for -it: reading metadata and reading actual content. Let's cover +As a component of the build process we encountered a step, where the +file within the markup language is read by readers. 
There are two +parts for it: reading metadata and reading actual content. Let's cover implementation details for them. #### Metadata -As show in the example code snippet in the section related to -transformation, one can provide additional metadata in simple +As shown in the example code snippet in the section related to +transformation, one can provide additional metadata in a simple key-value format delimited by `---` from the content of the markup file. There are two main issues with the implementation, let's discuss them. -The metadata is required for built-in readers and even if one don't +The metadata is required for built-in readers and even if one doesn't want to set any values, they have to add `---` at the beginning of the file. This requirement is not needed and could be easily avoided. -Metadata reader accepts only simple `:` delimited key-value pairs. It -maybe not as flexible as yaml frontmatter. Metadata in such format -usually is not a part of the markup grammar and that means files are -written in the invalid markup. However, it's not a big deal, as -readers can use custom metadata parsers. +The metadata reader simply accepts colon-delimited key-value pairs. +It is potentially not be as flexible as yaml frontmatter. Metadata in +such format usually is not a part of the markup grammar and that means +files are written in an invalid markup. However, it's not a big deal, +as readers can use custom metadata parsers. #### Guile-Commonmark and Tree-Sitter Guile-Commonmark is used in Haunt by default to parse markdown files -in SXML, it doesn't support embeded html, tables, footnotes and -comments, so it can be quite inconvinient for many use cases. It's +in SXML, it doesn't support embedded html, tables, footnotes and +comments, so it can be quite inconvenient for many use cases. 
It somehow works and serves basic needs and more advanced use cases can be potentially implemented with more feature full libraries like -hypotetical `guile-ts-markdown` +a hypothetical `guile-ts-markdown` ([tree-sitter](https://tree-sitter.github.io/) based markdown parser). ## Conclusion -Haunt is the primary player in Scheme static site generators field at -the moment of writing. It gives all the basics to get up and running. -The number of available learning resources in the wild much smaller -than for similiar solutions from other languages ecosystems, but -provided documentation and source code is enough for seasoned schemer -to start with it and more importantly to learn everything about it in -a matter of hours, which is not possible for projects like `hugo`, -`jekyll`. +Haunt is the primary player in the Scheme static site generators arena +at the moment of this writing. It gives all the basics to get up and +running. The number of available learning resources in the wild are much +smaller than for similiar solutions from other languages ecosystems, but +provided documentation and source code is enough for a seasoned schemer +to start with learn in just a matter of hours. This is not possible with +projects like `hugo`, `jekyll`. -The functionality can be lacking in some cases, but due to hackable -nature of the project it's possible to gradually build upon basics and -add all the things needed. Unfortunatelly, the current state of -Scheme ecosystem and Guile in particular feels to be behind more -mainstream languages, but luckily the popularity of Guile reached the -critical level and the ecosystem will start growing in the nearest -future. +The functionality can be lacking in some cases, but due to the hackable +nature of the project, it is possible to gradually build upon the basics +and as well as any future needs. 
Unfortunately, the current state of the +Scheme ecosystem and Guile in particular feels behind more mainstream +languages, but hopefully the popularity of Guile will reach a higher +level and the ecosystem will start growing in the nearest future. ### Future Work -There is a number of improvements points for Haunt in particular and -Guile and Scheme in general. More complete tooling for working with -markup languages: org, md, html, yaml, etc. As a generic solution -tree-sitter seems a good candidate to quickly cover this huge area. +There are a number of improvement points for Haunt in particular, and +Guile Scheme in general. We need more complete tooling for working with +markup languages like org, md, html, yaml, etc. As a generic solution, +tree-sitter seems like a good candidate to quickly cover this huge area. -More streamlined and composable build process for Haunt described in -Builders section could be a good thing as well to make SSG to be more -flexible and components more reusable. +More streamlined and composable build processes for Haunt as described +in the Builders section could add to haunt's flexibility in general as +well as encouraging the use of reusable components. Possible integrations with other tools like Guix, REPL, Emacs for easier deployment, better caching, more interactive development and other goodies. -More documenation, materials and tool for possible workflows and use +More documenation, materials and tools for possible workflows and use cases from citation capabilites and automatic url resolution to on-huge-file workflows and org-roam integration. -**Aknowledgments.** Kudos to [David -Thompson](https://dthompson.us/about.html) for making Haunt. +**Acknowledgments.** Kudos to [David +Thompson](https://dthompson.us/about.html) for making Haunt and [Erik +Edrosa](http://www.erikedrosa.com/) for making guile-commonmark. [^1]: https://jamstack.org/generators/haunt/ -- 2.40.1
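
For readers following the thread, the transformation workflow that the patched post describes (a page as metadata plus SXML, passed through composable pure functions, then serialized) can be sketched roughly as below. This is only an illustration in Guile Scheme under stated assumptions: `make-page`, `page-ref`, `post-template`, `layout`, `pipeline`, and `serialize` are hypothetical names, not Haunt's API, and Guile's `sxml->xml` from `(sxml simple)` stands in for Haunt's own HTML serializer.

```scheme
;; Hypothetical sketch of the "series of transformations" idea.
;; None of the names defined here belong to Haunt.
(use-modules (srfi srfi-1)    ; fold
             (sxml simple))   ; sxml->xml

;; A page is an association list holding metadata and an SXML body.
(define (make-page metadata sxml)
  `((metadata . ,metadata) (sxml . ,sxml)))

(define (page-ref page key)
  (assq-ref page key))

;; Transformation 1: wrap the parsed content in a post template.
(define (post-template page)
  (let ((title (assq-ref (page-ref page 'metadata) 'title)))
    (make-page (page-ref page 'metadata)
               `(article (h1 ,title) ,@(page-ref page 'sxml)))))

;; Transformation 2: wrap the post in the top-level layout.
(define (layout page)
  (let ((title (assq-ref (page-ref page 'metadata) 'title)))
    (make-page (page-ref page 'metadata)
               `(html (head (title ,title))
                      (body ,(page-ref page 'sxml))))))

;; Compose pure transformations left to right into one bigger one.
(define (pipeline . transformations)
  (lambda (page)
    (fold (lambda (transform acc) (transform acc))
          page
          transformations)))

;; Final step: serialize the SXML of a page to a string.
(define (serialize page)
  (call-with-output-string
    (lambda (port)
      (sxml->xml (page-ref page 'sxml) port))))

(define render (pipeline post-template layout))

;; Stand-in for what a reader would return for the example post.
(define example-page
  (make-page '((title . "Hello, CommonMark!"))
             '((h2 "This is a CommonMark post"))))

;; (page-ref (render example-page) 'sxml)
;; => (html (head (title "Hello, CommonMark!"))
;;          (body (article (h1 "Hello, CommonMark!")
;;                         (h2 "This is a CommonMark post"))))
```

Calling `(serialize (render example-page))` then yields the markup as a string. Because every step takes a page and returns a page, extra steps such as link rewriting could be added to `pipeline` without touching the others, which is the composability point the post makes.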