~jonsterling/forester-devel

3 2

Coming soon: datalog

Details
Message ID
<DC551F52-6827-42EA-9100-B4B63126A4D4@jonmsterling.com>
DKIM signature
pass
Download raw message
Hi all,

I had a very interesting meeting with Owen Lynch earlier this week, and 
he gave me the idea that forests would ideally be queryable using 
something like datalog. This seems like a great idea indeed…

I have done some experimentation with the OCaml datalog library 
(https://ocaml.org/p/datalog/0.6), which needs a bit of a tweak to build 
on the current compiler (so I will send a patch soon). On my own 
machine, I have a version of forester (which you can see in the 
‘datalog’ branch) in which most of the builtin queries are computed 
via datalog queries — and the performance is acceptable.

The model for integrating datalog into Forester is that we will 
distinguish between two kinds of datalog snippets:

1. “Scripts”, which add rules and facts to the global database. 
These scripts are executed once at “planting-time” — so they 
would get executed after evaluating your own forest, or after implanting 
foreign trees (when that feature is eventually surfaced to the user).

2. “Queries”, which are basically “Find all X such that P(X), 
Q(X), … hold”.  These are executed at render_time (we can in the 
future consider some caching, but for now, I’m just keeping it 
simple).

——

Overall there are a couple benefits to adopting datalog as the future of 
Forester’s querying support. The first benefit is expressivity and 
simplicity. But the major benefit is that, because datalog is a pretty 
canonical point in the design space, we are far less likely to futz 
around with it in the future and cause churn. Although queries have been 
treated as “experimental, use at your own risk” currently, in the 
relatively near future I will want the query language to be 
well-solidified so that serialised forests will not be broken every time 
I mess around with the code.

Finally, Owen made another useful observation: we can treat datalog as 
the “core” language format for queries that we expect to be stable, 
but that does not prevent us from experimenting (in the source language) 
with useful shortcuts and high-level operations that can ultimately be 
elaborated to datalog. But I do want to keep things pretty simple 
nonetheless.

Thanks to Owen for his suggestion… I’m eager to turn this prototype 
into a full user-facing implementation.

Best,
Jon
Details
Message ID
<D4XF444AXCIK.3OFWL81I9U1IT@gmail.com>
In-Reply-To
<DC551F52-6827-42EA-9100-B4B63126A4D4@jonmsterling.com> (view parent)
DKIM signature
pass
Download raw message
This is really cool.

I wonder how you think the graph analysis phase will be impacted by
these changes?

From what you described in the previous mail, it did not seem that the
graph analysis phase would be changed much, but then you said

https://mathstodon.xyz/@jonmsterling/113317954999338267

so I am curious.

The reason I am asking is because I am working on improving and
specializing the [Asai lsp](https://github.com/RedPRL/asai/tree/d38ae1a9a79eee7f9aee89df3edeb884607e0398/src-lsp)
implementation for our use case. This means that I need to work 
with intermediate stages and unevaluated trees at a finer granularity
than what the current API allows for. For example it would be nice to be
able to load a single file and parse the transitive dependencies, and
then amend the import graph once new files are opened which might open
new paths to the already constructed import graph. I am tediously
writing these functions right now, but if the API for that will change
significantly, I would rather wait and also ask that you keep this use
case in mind while you refactor :)

Best, Kento
Details
Message ID
<D4XF6YC34ANP.1XWRBAQW8LRST@gmail.com>
In-Reply-To
<DC551F52-6827-42EA-9100-B4B63126A4D4@jonmsterling.com> (view parent)
DKIM signature
pass
Download raw message
Oh, I conflated the graph analysis phase with the phase that constructs
the import graph... Long day...
Details
Message ID
<99CC25AB-6638-466A-AFA9-09118562A234@owenlynch.org>
In-Reply-To
<DC551F52-6827-42EA-9100-B4B63126A4D4@jonmsterling.com> (view parent)
DKIM signature
pass
Download raw message
Hi Jon,

(This may be the second time you are receiving this email, because it was rejected from the list for not being plain text before.)

First of all, incredibly excited to receive this less than 24 hours after idly proposing it; I wish this happened with all of my harebrained ideas!

Another thought I had related to this: all of the metadata commands (\date, \author, etc.) could simply be macros that expanded to datalog fact declarations. Then the rendering phase could use queries like date(ocl-0023, ?) to pull out metadata from trees. This could reduce the surface area of the Forester language, while simultaneously opening up a bunch of new ways to query!

Looking forward to the alarmingly close future!

-Owen
Reply to thread Export thread (mbox)