Hi all,
I had a very interesting meeting with Owen Lynch earlier this week, and
he gave me the idea that forests would ideally be queryable using
something like datalog. This seems like a great idea indeed…
I have done some experimentation with the OCaml datalog library
(https://ocaml.org/p/datalog/0.6), which needs a bit of a tweak to build
on the current compiler (so I will send a patch soon). On my own
machine, I have a version of forester (which you can see in the
‘datalog’ branch) in which most of the builtin queries are computed
via datalog queries — and the performance is acceptable.
The model for integrating datalog into Forester is that we will
distinguish between two kinds of datalog snippets:
1. “Scripts”, which add rules and facts to the global database.
These scripts are executed once at “planting-time” — so they
would get executed after evaluating your own forest, or after implanting
foreign trees (when that feature is eventually surfaced to the user).
2. “Queries”, which are basically “Find all X such that P(X),
Q(X), … hold”. These are executed at render_time (we can in the
future consider some caching, but for now, I’m just keeping it
simple).
——
Overall there are a couple benefits to adopting datalog as the future of
Forester’s querying support. The first benefit is expressivity and
simplicity. But the major benefit is that, because datalog is a pretty
canonical point in the design space, we are far less likely to futz
around with it in the future and cause churn. Although queries have been
treated as “experimental, use at your own risk” currently, in the
relatively near future I will want the query language to be
well-solidified so that serialised forests will not be broken every time
I mess around with the code.
Finally, Owen made another useful observation: we can treat datalog as
the “core” language format for queries that we expect to be stable,
but that does not prevent us from experimenting (in the source language)
with useful shortcuts and high-level operations that can ultimately be
elaborated to datalog. But I do want to keep things pretty simple
nonetheless.
Thanks to Owen for his suggestion… I’m eager to turn this prototype
into a full user-facing implementation.
Best,
Jon
This is really cool.
I wonder how you think the graph analysis phase will be impacted by
these changes?
From what you described in the previous mail, it did not seem that the
graph analysis phase would be changed much, but then you said
https://mathstodon.xyz/@jonmsterling/113317954999338267
so I am curious.
The reason I am asking is because I am working on improving and
specializing the [Asai lsp](https://github.com/RedPRL/asai/tree/d38ae1a9a79eee7f9aee89df3edeb884607e0398/src-lsp)
implementation for our use case. This means that I need to work
with intermediate stages and unevaluated trees at a finer granularity
than what the current API allows for. For example it would be nice to be
able to load a single file and parse the transitive dependencies, and
then amend the import graph once new files are opened which might open
new paths to the already constructed import graph. I am tediously
writing these functions right now, but if the API for that will change
significantly, I would rather wait and also ask that you keep this use
case in mind while you refactor :)
Best, Kento
Hi Jon,
(This may be the second time you are receiving this email, because it was rejected from the list for not being plain text before.)
First of all, incredibly excited to receive this less than 24 hours after idly proposing it; I wish this happened with all of my harebrained ideas!
Another thought I had related to this: all of the metadata commands (\date, \author, etc.) could simply be macros that expanded to datalog fact declarations. Then the rendering phase could use queries like date(ocl-0023, ?) to pull out metadata from trees. This could reduce the surface area of the Forester language, while simultaneously opening up a bunch of new ways to query!
Looking forward to the alarmingly close future!
-Owen