~peacesearch/peacesearch-discuss

Introduction

Hello!


I'm a FLOSS search engine enthusiast (along with several other 
people).

I have been researching the subject, more or less, for 10 years. It 
really started around the time the term 'filter bubble' [0] popped up 
on the Internet. It is also a matter of intellectual curiosity and a 
lot of technical ambition: what is the most complex project or 
unsolved problem around?

I just updated my blog at:

   https://hyper.dev/

I am experimenting with notion.so as a backend, but it is self-hosted 
at Hetzner [1].

I made several attempts at building single-node (single machine or 
desktop) search engines. My primary goals are: easy to run, easy to 
operate, easy to maintain. I used to do all my experiments with 
CPython, but because of the GIL and performance problems in general, 
I tried Guile Scheme, and nowadays Chez Scheme, which according to my 
benchmarks can be as fast as C code doing the same thing.

Last year I built another prototype of my single-server search engine, 
which I call babelia, using GNU Guile on top of WiredTiger. It was at 
that time that I started benchmarking Chez Scheme against Guile and 
discovered that Chez Scheme was two to three times faster than Guile.

I also try to take part in the standardization efforts of the Scheme 
programming language [2][3][4].

As you will soon figure out, I am a great fan of Ordered Key-Value 
Stores, and in particular FoundationDB, for which I organize a meetup.
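
For readers new to the idea, here is a toy sketch of what "ordered" 
buys you. It is an in-memory stand-in written in Python, not 
FoundationDB's or WiredTiger's API; the class name `OrderedKV`, the 
helper `strinc`, and the key layout (term, zero byte, document id) are 
all made up for illustration.

```python
import bisect


def strinc(prefix):
    """Return the smallest byte string that sorts after every key
    starting with `prefix` (strip trailing 0xff, bump the last byte)."""
    stripped = prefix.rstrip(b"\xff")
    if not stripped:
        raise ValueError("no key sorts after this prefix")
    return stripped[:-1] + bytes([stripped[-1] + 1])


class OrderedKV:
    """Toy ordered key-value store: byte-string keys kept in sorted order."""

    def __init__(self):
        self._keys = []      # sorted list of keys
        self._values = {}    # key -> value

    def set(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def range(self, start, end):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]

    def prefix(self, prefix):
        """Yield every pair whose key starts with `prefix`."""
        return self.range(prefix, strinc(prefix))


# Hypothetical inverted-index layout: term + b"\x00" + document id.
store = OrderedKV()
store.set(b"search\x00doc2", b"")
store.set(b"django\x00doc1", b"")
store.set(b"django\x00doc3", b"")
django_docs = [key.split(b"\x00")[1] for key, _ in store.prefix(b"django\x00")]
```

Because keys are sorted, "all documents containing a term" becomes one 
contiguous range read instead of a full scan, which is the property a 
search engine index leans on.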

In September, I wrote a program called `ccse` that can scan through 
Common Crawl data as fast as a C program. I think I lost the Alpine 
and Debian binaries, but I can point you to the code if you find that 
interesting.
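
To give an idea of what "scanning" means here: Common Crawl publishes 
its data as gzip-compressed WARC files. The sketch below is not `ccse` 
and makes no speed claims; it is a minimal Python reader, assuming 
well-formed records without folded header lines, and the function name 
`iter_warc_records` is made up for illustration.

```python
import gzip
import io


def iter_warc_records(stream):
    """Yield (headers, body) for each record of an uncompressed WARC
    byte stream. Header names are lower-cased; body is raw bytes."""
    while True:
        line = stream.readline()
        if not line:
            return                      # end of stream
        if not line.strip():
            continue                    # blank lines separating records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line, got %r" % line)
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break                   # blank line ends the header block
            name, _, value = line.partition(b":")
            headers[name.strip().decode("utf-8").lower()] = (
                value.strip().decode("utf-8")
            )
        # The body length is declared, so the body may itself contain
        # blank lines without confusing the reader.
        body = stream.read(int(headers["content-length"]))
        yield headers, body


def make_record(uri, body):
    """Build one tiny response record, for demonstration only."""
    head = (
        b"WARC/1.0\r\nWARC-Type: response\r\nWARC-Target-URI: " + uri
        + b"\r\nContent-Length: " + str(len(body)).encode("ascii")
        + b"\r\n\r\n"
    )
    return head + body + b"\r\n\r\n"


# Each record is its own gzip member, concatenated into one file;
# Python's gzip module reads the members back as one stream.
archive = gzip.compress(make_record(b"http://a.example/", b"hello")) \
    + gzip.compress(make_record(b"http://b.example/", b"search engines"))
uris = [
    headers["warc-target-uri"]
    for headers, _ in iter_warc_records(gzip.open(io.BytesIO(archive)))
]
```

On a real file you would pass `gzip.open(path, "rb")` instead of the 
in-memory archive.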

More recently, I experimented with an algorithm I devised myself to do 
spell checking over a dictionary that is bigger than RAM. The only 
problem is that bigger-than-RAM dictionaries are difficult to find. 
But you can try it with:

   curl "https://search.hyper.dev/??q=jdango+saerch"

Yeah, it is written in Python, single-threaded, on top of the SQLite 
LSM extension (there are around 200K package names in the database).
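
My algorithm is not described in this message, so the sketch below is 
NOT it. It is a different, well-known approach (a deletion 
neighborhood index) that shows how fuzzy lookup can work when the 
dictionary lives on disk. It uses the standard `sqlite3` module rather 
than the LSM extension, and the table and function names are made up 
for illustration.

```python
import sqlite3


def deletes(word):
    """Return `word` plus every string made by deleting one character."""
    return {word} | {word[:i] + word[i + 1:] for i in range(len(word))}


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, used only for ranking."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def build_index(conn, words):
    """Store every (variant, word) pair in a table ordered by variant."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS variants ("
        " variant TEXT, word TEXT, PRIMARY KEY (variant, word)"
        ") WITHOUT ROWID"
    )
    with conn:  # one transaction for the whole batch
        for word in words:
            conn.executemany(
                "INSERT OR IGNORE INTO variants VALUES (?, ?)",
                [(variant, word) for variant in deletes(word)],
            )


def suggest(conn, query, limit=3):
    """Return up to `limit` dictionary words close to `query`."""
    candidates = set()
    for variant in deletes(query):
        rows = conn.execute(
            "SELECT word FROM variants WHERE variant = ?", (variant,)
        )
        candidates.update(word for (word,) in rows)
    ranked = sorted(candidates, key=lambda w: (levenshtein(query, w), w))
    return ranked[:limit]


conn = sqlite3.connect(":memory:")  # a file path would keep it on disk
build_index(conn, ["django", "search", "flask", "requests"])
corrected = [suggest(conn, token) for token in "jdango saerch".split()]
```

Each lookup is a handful of exact key reads, so only the touched pages 
need to be in memory. The trade-off is index size: every word is stored 
once per variant, and only a single insertion, deletion, substitution, 
or adjacent swap is caught.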

And I am the first to introduce myself to the peacesear.ch community :)


[0] https://en.wikipedia.org/wiki/Filter_bubble
[1] https://hyper.dev/22db4369d7254513bb9e668004a920da.html
[2] https://srfi.schemers.org/srfi-167/
[3] https://srfi.schemers.org/srfi-168/
[4] https://srfi.schemers.org/srfi-180/