The road to self-hosting

Message ID: <87r1u5shvf.fsf@hagelb.org>

Hello Fennelers!

One of the longer-term goals of Fennel has been to achieve self-hosting;
that is, we want Fennel's compiler to be written in Fennel. This is
commonly seen as a big milestone among languages; it's a sign of
maturity if the language can achieve it. But it's often tricky for
reasons that have less to do with language maturity and more to do with
practicality. As you can imagine, there are some serious circularity
puzzles you need to solve in order for a compiler to be implemented in
itself. But I think I have the beginnings of a plan.

The first step is to ensure the compiler is modularized. Right now
fennel.lua is just one big file, which makes it easy for anyone to embed
Fennel in their programs. This is a really important property that we
are committed to keeping, but in its current state it's a bit difficult
to follow how some parts interact with other parts. We have top-level
locals that technically aren't globals, but might as well be, since any
piece of the compiler can read and modify them.

I have a branch in which I've separated out the compiler into four
distinct "pseudo-modules". It's all still in one file, but I've used
inline functions to isolate each module's scope so that its internals
aren't visible to other sections, and it's very explicit about what it
exports by returning a table from the inline function. The four
pseudo-modules are:

* utils
* parser
* compiler
* specials

And then there's an "outer" bit at the bottom that is not isolated but
ties together the four modules and exports the public API.

    https://github.com/bakpakin/Fennel/pull/297
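
As a rough illustration of the pattern described above, here is a
minimal sketch in Lua. Every name in it (isTable, map, splitWords,
parse, api) is made up for the example and is not taken from
fennel.lua; only the shape is the point: each section is wrapped in an
immediately-invoked function and exports a table.

```lua
-- Sketch of the "pseudo-module" pattern; all names are hypothetical.

local utils = (function()
    -- Private to this pseudo-module; invisible to the rest of the file.
    local function isTable(x) return type(x) == "table" end

    local function map(t, f)
        local out = {}
        for i, v in ipairs(t) do out[i] = f(v) end
        return out
    end

    -- The returned table is the only thing other sections can see.
    return { map = map, isTable = isTable }
end)()

local parser = (function()
    -- Dependencies are taken explicitly from another module's exports.
    local map = utils.map

    local function splitWords(str)
        local words = {}
        for w in str:gmatch("%S+") do table.insert(words, w) end
        return words
    end

    local function parse(str)
        return map(splitWords(str), string.upper)
    end

    return { parse = parse }
end)()

-- The "outer" bit: not isolated, ties the modules together and builds
-- the public API table.
local api = { parse = parser.parse }
```

Because splitWords is local to the parser's function, nothing outside
that section can read or modify it, which is the isolation that shared
top-level locals do not give.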

This is nice from a code organization perspective; it makes it easier to
understand the compiler. But it also allows us to move towards
self-hosting in a step-by-step way. We start by porting the "outer" bit
to Fennel and loading utils, parser, compiler, and specials from
separate Lua files where we store their current implementation. Fennel
has an "include" feature that lets us just dump the Lua files directly into
the compiled output.
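
To make "dump the Lua files directly into the compiled output" more
concrete, here is a sketch of one way a module can be inlined into a
single Lua file, by registering a loader in package.preload. This is an
assumption about the general shape, not a copy of what Fennel's include
emits, and the module name "example.utils" and its contents are
hypothetical.

```lua
-- Register the module's body as a loader instead of reading it from a
-- separate file. Hypothetical module name and contents.
package.preload["example.utils"] = package.preload["example.utils"] or function(...)
    local function double(n) return n * 2 end
    return { double = double }
end

-- Code further down the same file can require the module as usual; no
-- file has to exist on package.path because require consults
-- package.preload before searching the filesystem.
local utils = require("example.utils")
print(utils.double(21)) -- prints 42
```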

Barring mistakes in the process of porting "outer", this should get us
1:1 correspondence with the existing implementation, with one important
difference: the file fennel.fnl is now canonical, and fennel.lua is now
compiler output. We would still check fennel.lua into the repo, since
that checked-in copy would be used to compile fennel.fnl -> fennel.lua.

At that point, we can begin the process of porting each of the four
pseudo-modules from Lua to Fennel. In fact, Benaiah has already done the
considerable work of porting the parser in his excellent "Fennel: the
Book" literate program:

    https://benaiah.me/pastes/fennel-the-book.html

The other modules vary in their complexity; utils (180 lines) consists
largely of small self-contained pure functions, while compiler (790
lines) is massive and extremely intertangled. The specials module (909
lines) is longer than the compiler but easier to understand, since each
individual special form tends to be somewhat isolated and not rely on
others. There might also be an opportunity to convert some of these
specials into macros.

Breaking down the task in this way makes me feel like self-hosting is a
lot more doable. One question is that of timing; I think that we should
merge the modularization changes for 0.5.0, but not port "outer" to
Fennel until 0.6.0. It's tempting to keep this work on a long-lived
branch, but I'm hesitant to do that because it means every single change
on master would need to be ported over, which would be pretty error-prone.

I'm very interested in hearing if anyone else has feedback about this
plan, or feedback about the "pseudo-modules" patch I linked to above.

-Phil

Message ID: <87eepylahd.fsf@hagelb.org>
In-Reply-To: <87r1u5shvf.fsf@hagelb.org>

Phil Hagelberg <phil@hagelb.org> writes:

> I have a branch in which I've separated out the compiler into four
> distinct "pseudo-modules". It's all still in one file, but I've used
> inline functions to isolate each module's scope so that its internals
> aren't visible to other sections, and it's very explicit about what it
> exports by returning a table from the inline function.

Great progress on the self-hosting front: I wrote a Lua->Fennel compiler
called Antifennel (the opposite of Fennel, which compiles Fennel->Lua):

    https://git.sr.ht/~technomancy/antifennel/

Even though it is pretty immature, it quickly became capable of
compiling its own parser, which is written in Lua and taken from this
fascinating self-hosting Lua->LuaJIT-bytecode compiler[1]. Just
recently it also became capable of compiling Fennel itself into Fennel.
You can see the results after only three days here:

    https://p.hagelb.org/fennel.fnl.html

This is very far from being idiomatic Fennel code, but it does pass the
complete test suite on all four supported Lua versions.

The next step is to cut a release with the modularized compiler, then
start taking the ported modules and refining them. Certain features of
Lua, such as early returns and repeat loops, do not translate very well
to Fennel. The Antifennel compiler also does not check whether locals
are mutated, so it emits them all as "var". It also doesn't check
whether assignments are to globals, locals, or function params, so it
uses "set-forcibly!" for all of them. Tracking these factors in
Antifennel will allow it to emit much more readable code.
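
For a sense of what that analysis has to look at, here is a small,
made-up Lua function (not from fennel.lua) that contains each of the
constructs mentioned above. The comments mark what a translator would
need to notice; they describe the input only and make no claim about
what Antifennel currently emits for it.

```lua
-- Hypothetical input for a Lua->Fennel translator.
local function firstNegative(numbers)
    local count = #numbers  -- never reassigned: does not need "var"
    local i = 0             -- reassigned in the loop: does need "var"

    if count == 0 then
        return nil          -- early return
    end

    repeat                  -- repeat loop: the body runs before the test
        i = i + 1           -- assignment to a local, not a global or param
        if numbers[i] < 0 then return i end  -- early return inside a loop
    until i >= count

    return nil
end
```

Telling count apart from i is what would let a translator pick between
an immutable local and "var", and knowing that i is a local is what
would let it avoid "set-forcibly!".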

Anyway, if you'd like to help out with the process please let me know!
I'm very excited to see where this is going.

-Phil

[1] - https://github.com/franko/luajit-lang-toolkit