Peter Marinov: 1 Add syntax coloring for Gemini markup via `awk` 4 files changed, 90 insertions(+), 2 deletions(-)
Hi Peter,

You should know that, since my last email, I’ve made a number of changes to gmi inspired by your contributions. I’m quite pleased with the results even if the program is some 13 lines longer for it.
Peter Marinov <pmar21@sonic.net> writes:
> Hello Greg,
>
> I was thinking that the main problem that the formatting has to solve is
> wrapping of long lines. The coloring is nice but the wrapping is what
> makes the pages pleasant to navigate and read.
>
> Go to this page for example:
> gemini://guardian.shit.cx/uk-news/business/2021/may/19/uk-rail-overhaul-privatised-great-british-railways-/index.gmi
>
> Another good one combines code sections, quote sections and long lines
> in all of them:
> gemini://drewdevault.com/2021/05/03/awk-is-the-coolest-tool-you-dont-know.gmi
>
> Try with and without the gawk script (maximize your terminal window for
> maximum effect).

Hmm, good point. It is much more readable with your gawk script.

> Now, if you want an even simpler solution, instead of coloring you can
> pipe through `fold` (although that would be indiscriminate: it would fold
> even code sections and mangle the quote sections, but you can try it to
> get a sense of what it is).

I don’t think fold can repeat the "> " prefix, so Awk is probably the right tool for that job.

> A bug: I guess you have to mention `uuencode` in the README as required:
>
>   $ ./gmi gemini://drewdevault.com
>   ./gmi: 58: uuencode: not found

I found POSIX man pages for uuencode and uudecode on my system, so I assumed they were required utilities. It seems base64(1) may be more portable so I may switch gmi over to that instead.

> A bug: you have to invoke `less` with option --raw for the coloring to
> show up.
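The difference between `fold` and a prefix-aware wrapper can be seen with a small experiment. The following is only a sketch, not the code that ended up in gmi: `wrap_quote` is a hypothetical helper name, the sample text is made up, and the width argument counts the text after the "> " marker.

```shell
#!/bin/sh
# Hypothetical sample: one long Gemtext quote line.
quote='> The quick brown fox jumps over the lazy dog and keeps running'

# fold breaks the line but knows nothing about the "> " prefix,
# so only the first output line keeps its quote marker.
printf '%s\n' "$quote" | fold -s -w 30

# wrap_quote WIDTH: hypothetical helper. Wraps quote lines on word
# boundaries and repeats "> " on every output line. Other lines pass
# through unchanged.
wrap_quote() {
    awk -v width="$1" '
    /^>/ {
        sub(/^>[ \t]*/, "")          # drop the original marker
        n = split($0, words, " ")    # words[1..n], in order
        line = ""
        for (i = 1; i <= n; i++) {
            if (line != "" && length(line) + 1 + length(words[i]) > width) {
                print "> " line      # current line is full, flush it
                line = words[i]
            } else {
                line = (line == "" ? words[i] : line " " words[i])
            }
        }
        print "> " line              # last (or empty) line
        next
    }
    { print }'
}

printf '%s\n' "$quote" | wrap_quote 28
```

The counting loop over the return value of split() is deliberate: awk does not specify the iteration order of `for (i in array)`, so a portable script should not rely on it to keep words in sequence.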
Gemtext documents are now rendered with syntax colouring using a portable Awk script embedded in gmi itself. It’s enabled by default, but can be disabled by setting the PAGER environment variable. See the Git log for more details about this.
Peter Marinov <pmar21@sonic.net> writes:
> I think the variable PAGER is too generic; it is used to point to `less`
> or equivalents. `lessgmi` itself is not a universal pager but a pager for
> GMI pages. I think GMI_PAGER is a better name, with less chance of a
> conflict.
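One way the naming concern could be handled is to prefer a dedicated variable and fall back to the generic one. This is a sketch only: `choose_pager` is a hypothetical helper, and the fallback order (GMI_PAGER, then PAGER, then less) is an assumption, not something gmi is confirmed to do.

```shell
#!/bin/sh
# choose_pager: print the pager command to run.
# Assumed order: GMI_PAGER if set and non-empty, else PAGER, else less.
choose_pager() {
    printf '%s\n' "${GMI_PAGER:-${PAGER:-less}}"
}

echo "pager: $(choose_pager)"
```

With this arrangement a user can keep `PAGER=less` for everything else and set `GMI_PAGER=lessgmi` only for Gemtext.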
As documented in the readme, gmi still lacks rendering of bold and italic text inline. My Awk skills are limited so, if anyone is able to share a patch for this, that would be ace. Until then I will be chipping away at the wishlist/to-do list in the readme.
Peter Marinov <pmar21@sonic.net> writes:
> I guess I can look into that. Basically a coloring script without
> wrapping of lines.

I’ve added "pretty line wrapping" to the wishlist/to-do list. It is an important feature I had overlooked. I’ll have another read of your gawk script and see if I can rewrite it in standardised Awk as part of the embedded script -- or feel free to share a patch.

Peter Marinov <pmar21@sonic.net> writes:
> Hello Greg,
>
> After your last two messages I got into action and this evening I'm
> sending you a self-contained `lessgmi` pager.
>
> I think it will be alright if `gmi` installs two files -- the main script
> itself and the pager `lessgmi`. Both are small and individually easy to
> reason about and understand.
>
> Because `lessgmi` now works with any `awk` (well, the 2 I could test
> with), I suggest this be plugged in as the default pager for `gmi`. Then
> diverting to `less` could be an option.
>
> I've also cleaned up the script code and I hope it is easier to
> understand and manipulate. I hope with this it has crystallized to its
> final form :-)
>
> A PROBLEM: `lessgmi` can only receive content via a pipe. I simply
> couldn't make it operate on a file given on the command line. I hope you
> can make that work so it is a proper pager (I imagine people might use it
> to open local .gmi files).
>
> Example:
>   cat README.gmi | lessgmi   = Works
>   lessgmi README.gmi         = Doesn't
>
> --pe
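The pipe-versus-file problem has a common shell answer: awk reads the files named on its command line and falls back to standard input when there are none, so forwarding "$@" covers both cases. The sketch below assumes that approach. It is not the actual `lessgmi` source: `render` is a stand-in that only makes headings bold, `lessgmi_sketch` is a hypothetical name, and `less -R` is this sketch's choice of flag for passing colour escapes through.

```shell
#!/bin/sh
# render: stand-in for the colouring script. It reads the files given as
# arguments, or standard input when called without arguments.
render() {
    awk '/^#/ { print "\033[1m" $0 "\033[0m"; next } { print }' "$@"
}

# lessgmi_sketch: hypothetical wrapper that accepts both forms:
#   cat README.gmi | lessgmi_sketch
#   lessgmi_sketch README.gmi
lessgmi_sketch() {
    render "$@" | less -R
}
```

Because the file names go to awk rather than to the pager, the pager always sees a pipe and needs no special handling.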
Regards, Greg.
Peter Marinov <pmar21@sonic.net> writes:
Hadn’t thought of it that way. Much of that is owed to the limited scope of the Gemini protocol and the simplicity of text/gemini. My initial reaction to Gemtext markup was that it was way too restrictive for authors, especially since it lacks inline links. Writing a client has given me a new perspective on that. Greg.
Copy & paste the following snippet into your terminal to import this patchset into git:
curl -s https://lists.sr.ht/~chambln/public-inbox/patches/22751/mbox | git am -3
* Requires GNU Awk
* To try quickly inside the repo:
  $ cat README.gmi | gawk -f gmi_color.awk
* This commit DOESN'T change the `make install` procedure, it is not clear
  what is a suitable location for the file gmi_color.awk
---
 README.gmi    |  1 +
 README.md     |  1 +
 gmi           |  4 +--
 gmi_color.awk | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 90 insertions(+), 2 deletions(-)
 create mode 100644 gmi_color.awk

diff --git a/README.gmi b/README.gmi
index 3599001..82515cb 100644
--- a/README.gmi
+++ b/README.gmi
@@ -11,6 +11,7 @@ Tiny Gemini browser written in POSIX-compliant shell.
 * less
 * fzf
 * xdg-open (for following non-Gemini links)
+* GNU Awk
 
 ## Installation
 
diff --git a/README.md b/README.md
index 79a0257..c44f2f3 100644
--- a/README.md
+++ b/README.md
@@ -11,6 +11,7 @@ Tiny Gemini browser written in POSIX-compliant shell.
 * less
 * fzf
 * xdg-open (for following non-Gemini links)
+* GNU Awk
 
 ## Installation
 
diff --git a/gmi b/gmi
index a28275c..aab3ee5 100755
--- a/gmi
+++ b/gmi
@@ -75,8 +75,8 @@ main() {
     case $mimetype in
         text/*)
             printf %s "$body" |
-                iconv -f "$charset" |
-                less -Ps"$(printf %s "$uri" | sed 's/\./\\./g')"
+                iconv -f "$charset" | gawk -f gmi_color.awk |
+                less --raw -Ps"$(printf %s "$uri" | sed 's/\./\\./g')"
             {
                 [ "$2" ] && printf '<= %s Back\n' "$2"
                 printf '=> .. Up\n'
diff --git a/gmi_color.awk b/gmi_color.awk
new file mode 100644
index 0000000..d4b0c9c
--- /dev/null
+++ b/gmi_color.awk
@@ -0,0 +1,86 @@
+# gmi_color.awk
+#
+# Markup highlighting of Gemini pages for ANSI terminal
+#
+# IMPORTANT:
+# It uses GNU Awk syntax, it won't work with mawk for example
+
+func print_folded(long_line, prefix) {
+  # Split words into an array
+  split(long_line, list_of_words)
+
+  line_len = 0
+  for(i in list_of_words) {
+    # Check if we go beyond width of the block
+    if ((line_len + length(list_of_words[i])) > 78) {
+      line_len = 0
+      printf("\n")
+    }
+    # Print prefix
+    if (line_len == 0)
+      printf(prefix)
+    line_len += length(list_of_words[i]) + 1
+    printf("%s ", list_of_words[i])
+  }
+
+  # Handle case of empty lines
+  if (length(long_line) == 0)
+    printf(prefix)
+
+  # The line ends with a new-line
+  printf("\n")
+}
+
+BEGIN {
+  code_section = 0
+}
+
+# URL
+match($0, /^(=>) ([^ \t]+).(.*)/, arr) {
+  if (code_section == 0)
+  {
+    print "\033[0;36m" arr[1], arr[2] "\033[0;35m,\n\t\033[1m", arr[3] "\033[0m"
+    next
+  }
+}
+
+# Quote
+match($0, /^>(.*)/, arr) {
+  if (code_section == 0)
+  {
+    printf("\033[0;34m")
+    print_folded(arr[1], "> ")
+    printf("\033[0m")
+    next
+  }
+}
+
+# Code section
+/^```/ {
+  if (code_section == 0)
+    # Activate color for code section
+    print "\033[0;32m```"
+  else
+    # Remove color at the end of the code section
+    print "```\033[0m"
+  code_section = 1 - code_section
+  next
+}
+
+# Heading
+/^#.*/ {
+  if (code_section == 0)
+  {
+    print "\033[0m\033[1m" $0 "\033[0m"
+    next
+  }
+}
+
+# Everything else -- the plain text of the file
+{
+  # Outaide code sections: Wrap long lines, print short lines and as-is
+  if (code_section == 0)
+    print_folded($0, "\033[0m")
+  else
+    print("\033[0;32m" $0)
+}
-- 
2.25.1
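The patch depends on GNU Awk mainly through the three-argument form of match(), which fills an array with the captured groups. As a sketch of how the link rule might look in standard awk, the same split can be done with sub(), the two-argument match(), and the RSTART and RLENGTH variables. `format_links` and the `URL=`/`LABEL=` output format are hypothetical and chosen only to make the result easy to inspect. Colouring is left out.

```shell
#!/bin/sh
# format_links: hypothetical helper showing how a Gemtext link line
# ("=> URL optional label") can be split without gawk extensions.
format_links() {
    awk '
    /^=>/ {
        line = $0
        sub(/^=>[ \t]*/, "", line)       # strip marker and leading blanks
        url = line
        label = ""
        if (match(line, /[ \t]+/)) {     # first run of blanks splits URL from label
            url = substr(line, 1, RSTART - 1)
            label = substr(line, RSTART + RLENGTH)
        }
        printf "URL=%s LABEL=%s\n", url, label
        next
    }
    { print }'
}

printf '=> gemini://example.org/ Example capsule\n' | format_links
```

Unlike the regular expression in the patch, this version also accepts links without a label and links with no space after "=>".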
This is really cool. Thanks for sharing!

In the interest of portability and customisability, I'd like to make this feature optional. I have an idea for how to achieve this fairly tidily, but I'll explain it later in this message.

Last night I rewrote gmi, almost from scratch, with the help of various caffeinated beverages. I hope to publish the results as soon as I get it cleaned up. It will be the same in spirit but more robust. This was part of an effort to fix a number of issues, not least of which was a bug where downloads of binary files, such as images, became corrupt on disk. I discovered this was due to the shell's command substitution deleting null bytes in the body. So that's fixed in this new, rewritten version which I'll push to the repo soon.

Anyway, here's my idea: we write a separate program -- a pager specifically designed for text/gemini -- that invokes less(1) but with your gmi_color.awk as a kind-of preprocessor. Then we patch gmi to optionally use any pager the user desires, perhaps via an environment variable like PAGER or GMI_PAGER.

What do you think?

Greg.
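The null-byte problem described above is easy to reproduce. The sketch below is an illustration, not the fix that went into gmi: it sends three bytes through a command substitution and then through a temporary file. The exact count after the substitution depends on the shell, since the handling of NUL bytes there is not specified, but it will be less than three. mktemp is assumed to be available.

```shell
#!/bin/sh
# Three bytes: "a", NUL, "b".
printf 'a\000b' | wc -c

# Command substitution cannot hold NUL bytes in a shell variable, so the
# body comes back shorter (some shells also print a warning).
body=$(printf 'a\000b')
printf %s "$body" | wc -c

# Writing straight to a file keeps every byte intact.
tmp=$(mktemp)
printf 'a\000b' > "$tmp"
wc -c < "$tmp"
rm -f "$tmp"
```

For binary downloads this suggests streaming the response body to a file or pipe and never storing it in a variable.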