Hello!
I'm a FLOSS search engine enthusiast (along with several other
people).
I have been researching the subject, more or less, for 10 years. It
really started around the time the term 'filter bubble' [0] popped up
on the Internet. It is also a matter of intellectual curiosity and a
lot of technical ambition: what is the most complex project or
unsolved problem around?
I just updated my blog at:
https://hyper.dev/
I am experimenting with notion.so as the backend, but the site is
self-hosted at Hetzner [1].
I have made several attempts at building single-node (single-machine)
or desktop search engines. My primary goals are: easy to run, easy to
operate, easy to maintain. I used to do all my experiments with
CPython, but because of the GIL and performance problems in general, I
tried Guile Scheme, and nowadays Chez Scheme, which according to my
benchmarks can be as fast as C code doing the same thing.
Last year I built another prototype of my single-server search engine,
which I call babelia, using GNU Guile on top of WiredTiger. It was at
that time that I started benchmarking Chez Scheme against Guile, and
discovered that Chez Scheme was two to three times faster than Guile.
I also try to take part in the standardization effort for the Scheme
programming language [2][3][4].
As you will soon figure out, I am a great fan of Ordered Key-Value
Stores, and in particular FoundationDB, for which I organize a meetup.
In September, I wrote a program called `ccse` that can scan through
Common Crawl data as fast as C. I think I lost the Alpine and Debian
binaries, but I can point you to the code if you find that interesting.
More recently, I experimented with an algorithm I devised myself to do
spell checking over a dictionary that is bigger than RAM. The only
problem is that bigger-than-RAM dictionaries are difficult to find. But
you can try it with:
curl "https://search.hyper.dev/??q=jdango+saerch"
Yes, it is written in Python, single-threaded, using the SQLite LSM
extension (there are around 200K package names in the database).
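As an illustration only, and not the algorithm behind the service
above, here is a minimal sketch of one generic way to spell check
against a dictionary kept on disk: index every word by its character
trigrams in SQLite, fetch a short list of candidates that share
trigrams with the query, then rank that short list by edit distance.
The table name `gram` and the helpers `trigrams`, `levenshtein`,
`build_index` and `suggest` are hypothetical names chosen for this
example, and it uses plain `sqlite3` rather than the LSM extension.

```python
import sqlite3


def trigrams(word):
    """Return the set of character trigrams of a padded, lowercased word."""
    padded = f"$${word.lower()}$"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]


def build_index(db, words):
    """Store one (trigram, word) row per trigram of every word."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS gram ("
        "gram TEXT, word TEXT, PRIMARY KEY (gram, word))"
    )
    db.executemany(
        "INSERT OR IGNORE INTO gram VALUES (?, ?)",
        ((g, w) for w in words for g in trigrams(w)),
    )
    db.commit()


def suggest(db, query, limit=3, shortlist=50):
    """Return up to `limit` dictionary words closest to `query`."""
    grams = list(trigrams(query))
    marks = ",".join("?" * len(grams))
    # Only the short list of candidates is loaded into memory; the
    # dictionary itself stays in the database.
    rows = db.execute(
        f"SELECT word, COUNT(*) AS shared FROM gram "
        f"WHERE gram IN ({marks}) "
        f"GROUP BY word ORDER BY shared DESC LIMIT ?",
        (*grams, shortlist),
    ).fetchall()
    ranked = sorted(
        rows,
        key=lambda r: (levenshtein(query.lower(), r[0].lower()), -r[1], r[0]),
    )
    return [word for word, _ in ranked[:limit]]


if __name__ == "__main__":
    # ":memory:" keeps the demo self-contained; a file path works the same.
    db = sqlite3.connect(":memory:")
    build_index(db, ["django", "search", "flask", "jinja2", "requests"])
    print(suggest(db, "jdango"), suggest(db, "saerch"))
```

The trade-off in this sketch is that the trigram lookup is cheap but
approximate, so the exact edit distance is only computed on the short
list; a query sharing no trigram with any word returns nothing.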
And I am the first to introduce myself to the peacesear.ch community :)
[0] https://en.wikipedia.org/wiki/Filter_bubble
[1] https://hyper.dev/22db4369d7254513bb9e668004a920da.html
[2] https://srfi.schemers.org/srfi-167/
[3] https://srfi.schemers.org/srfi-168/
[4] https://srfi.schemers.org/srfi-180/