From: Andrey Orst
Date: Sat, 30 Jan 2021 21:55:07
 +0300
Subject: Raw string syntax proposal for Fennel
To: ~technomancy/fennel@lists.sr.ht

I often write strings that contain double quotes. This includes
messages to be displayed with `print` or `io`, and especially
docstrings. While such strings are printed correctly in the REPL or in
a program log, they are hard to read and edit in the sources.
Fortunately, Emacs has a lot of facilities, like the separedit.el [1]
package, that allow editing nested strings with everything escaped
automatically. But not everyone uses Emacs, and this still leaves the
problem of reading such strings.

Lua has raw strings denoted with the `[[]]` syntax, which can contain
nested square brackets too. This is done by adding `=` symbols between
the opening brackets and matching the same number between the closing
ones: the raw string `[==[[[string]]]==]` contains `[[string]]`. Such
strings also have some useful properties, like ignoring escape
sequences and skipping the first newline. This makes it easier to write
multi-line strings in deeply indented code, because the first line of
text does not have to sit on the same line as the string start:

    some_very_long_function_name[[
    string with some long lines that exceed 80 character recommended width.
    Especially when first line is placed on the same line as the call]]

So in Fennel, when we put strings in tables, or write function
docstrings that contain inner strings, it would be handy to have such
a syntax to make things easier to read and write. Unfortunately we
can't use Lua's `[[]]` directly, because square brackets are already
used for sequential collections and for destructuring, so reusing them
may be very confusing.

So I propose several delimiter variants for raw strings. Here are a few
to choose from:

1. r" "r  - r is for raw string.
2. @r" "r - parser macro style.
3. r#" "# - Rust style [2].
4. <" ">  - quote tag style.

Options 1 and 2 can be nested deeper by increasing the number of `r`
symbols around the string.
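The matching rule for option 1 can be sketched in Lua roughly as
follows. This is only an illustration under assumptions, not Fennel's
parser: `read_raw_string` is a hypothetical helper name, and the rule
"the string ends at the first double quote followed by the same run of
`r` symbols" is one possible reading of the proposal.

```lua
-- Sketch of a reader for proposal variant 1 (r"..."r).
-- read_raw_string is a hypothetical helper, not part of Fennel.
-- Returns the content and the index just past the closing delimiter,
-- or nil plus a message when no raw string starts at `pos`.
local function read_raw_string(src, pos)
  pos = pos or 1
  -- Capture the run of `r` symbols that is followed by a double quote.
  local rs = src:match('^(r+)"', pos)
  if not rs then
    return nil, "not a raw string"
  end
  local content_start = pos + #rs + 1
  -- Assumed rule: the first `"` followed by the same run of `r` closes it.
  local closer = '"' .. rs
  local close_start, close_end = src:find(closer, content_start, true)
  if not close_start then
    return nil, "unterminated raw string"
  end
  local content = src:sub(content_start, close_start - 1)
  -- Mirror Lua long strings: drop a newline right after the opening quote.
  content = content:gsub("^\r?\n", "", 1)
  return content, close_end + 1
end

-- The extra parentheses keep only the first return value.
print((read_raw_string('r"String "with quotes" in raw string"r')))
-- prints: String "with quotes" in raw string
```

With this first-match rule, a depth-1 string would end early at any
inner double quote that is directly followed by `r`, so such content
would need one more `r` on each side.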
For example, here's how documentation for raw strings can be written
in a raw string:

    rrrr"
    A raw string starts with one or more r symbols followed by a double
    quote, and ends with a double quote followed by the same number of r
    symbols as before the opening quote.

    For example, here's a raw string that contains an ordinary string:

        r"String "with quotes" in raw string"r

    Note that there's no need to escape the inner double quotes. That
    would result in this Lua raw string:

        [[String "with quotes" in raw string]]

    A raw string can contain a raw string as well:

        rr"a r"raw string with "with" string"r within a raw string"rr

    Escaping is not needed, as we start and end the raw string with a
    matching number of r symbols. The string above would result in the
    following Lua raw string:

        [[a r"raw string with "with" string"r within a raw string]]

    Note that the printed variant can be copied back to Fennel and read
    without any modifications.
    "rrrr

This would be almost the same if we used variant 2, except each string
would have to be prefixed with @, which is not ideal in my opinion.

Another example, with raw strings in tables:

    (local raw-strings
      {:_VERSION "v0.0.1"
       :_DESCRIPTION rr"
    Raw string syntax proposal for Fennel language.
    With raw strings we can have "strings" within raw strings,
    and r"raw strings"r too"rr})

This is another example of how raw strings allow us to start a string
on a new line: the resulting string will be printed without the leading
newline, as per Lua's `[[]]` string implementation.

Variant 3 is taken directly from Rust, just to show how this problem is
tackled in a completely different language. I don't think we should use
this variant, as the `#` symbol is already used for `hash-fn` and
`auto-gensym`.
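As a side note on the Lua output shown in the examples above, the `=`
level of a Lua long string has to be high enough that the closing
delimiter never occurs in the content. A minimal sketch of picking that
level, assuming a hypothetical helper `to_long_string` (this is not how
the Fennel compiler is known to do it):

```lua
-- Sketch: wrap text in a Lua long string, using the lowest bracket level
-- whose closing delimiter does not occur in the text.
-- to_long_string is a hypothetical helper, not part of the Fennel compiler.
-- Carriage returns are not handled by this sketch.
local function to_long_string(text)
  local level = 0
  local closer = "]]"
  -- Probe the text plus one `]`, so that text ending in `]` or `]=` cannot
  -- combine with the first bracket of the closing delimiter.
  local probe = text .. "]"
  while probe:find(closer, 1, true) do
    level = level + 1
    closer = "]" .. string.rep("=", level) .. "]"
  end
  local opener = "[" .. string.rep("=", level) .. "["
  -- Lua drops a newline directly after the opener, so emit an extra one
  -- when the text itself starts with a newline.
  if text:sub(1, 1) == "\n" then
    opener = opener .. "\n"
  end
  return opener .. text .. closer
end

print(to_long_string('String "with quotes" in raw string'))
-- prints: [[String "with quotes" in raw string]]
print(to_long_string("[[]]"))
-- prints: [=[[[]]]=]
```

The lowest level is used here only to keep the output short; any higher
level that avoids a clash would be read back by Lua as the same string.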
Variant 4 is another option to consider, as we can increase the nesting
level by adding more angle brackets around the string:

    <<"raw string with <"raw string">">>
    rr"raw string with r"raw string"r"rr

It is a bit easier to see where the string ends, but I don't really
like this variant, because most raw strings are of depth 1, and at that
depth it looks like the string is being compared with `<`.

The 1st variant is, in my opinion, also easier to type, and should be
as easy to parse as any other variant, probably even easier than
variant 2. It also doesn't require a parser macro implementation at
all.

There is one concern, though. As always, I guess. If a raw string
contains square brackets, the Fennel compiler can no longer produce a
simple `[[]]` string, and has to check the raw string for at least one
`[` or `]`. But as far as I can see this is already done, because
docstrings are already produced as `[[]]` strings, so the same can be
done for raw strings too. So a raw string with brackets like this:

    r"[[]]"r

should produce this Lua string:

    [=[[[]]]=]

I'm open to suggestions and criticism, so if you have any thoughts I
would be glad to hear them!

[1]: https://github.com/twlz0ne/separedit.el
[2]: https://doc.rust-lang.org/rust-by-example/std/str.html#literals-and-escapes

--
Best regards,
Andrey Listopadov