~danskeren/ask.moe


Funding to develop an independent crawler. Deadline: 1st June 2020

toogley
Message ID
<C2WBOGSP6A71.3LD93LMQ5ALM6@unicorn.openbsd.amsterdam>
Hey,

Do you know nlnet.nl? There, you could ask for funding to develop, for
instance, an in-house crawler.

They already support searx with funding (or did, I don't know), so you
could explain that many search engines block access from searx
instances. That is at least the experience of Mike Kuketz, who created
a really popular searx instance in Germany. It was so popular that he
had to change the IP address regularly to avoid being blocked.

I think a federated network of crawlers would be great. There would be
many different, independent instances, each running its own crawler,
and all of them would talk to each other and exchange their crawling
results. Kind of like YaCy, just with a few powerful instances instead
of many weak ones.
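
To make the exchange idea a bit more concrete, here is a minimal
sketch. Everything in it is an assumption for illustration only: the
names CrawlRecord, Instance and exchange are made up, the "network" is
just two in-memory objects, and the rule "the newest crawl of a URL
wins" is one possible way to merge results, not a worked-out protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlRecord:
    """One crawled page (hypothetical record layout)."""
    url: str
    title: str
    fetched_at: int  # Unix timestamp of the crawl


class Instance:
    """One hypothetical crawler instance holding its own results."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # url -> CrawlRecord

    def add(self, record):
        # Keep only the most recent crawl of each URL.
        known = self.records.get(record.url)
        if known is None or record.fetched_at > known.fetched_at:
            self.records[record.url] = record

    def exchange(self, peer):
        # Two-way sync: afterwards both instances hold the same records.
        for record in list(peer.records.values()):
            self.add(record)
        for record in list(self.records.values()):
            peer.add(record)


a = Instance("a")
b = Instance("b")
a.add(CrawlRecord("https://example.org/", "Example", 100))
b.add(CrawlRecord("https://example.org/", "Example (updated)", 200))
b.add(CrawlRecord("https://example.net/", "Another page", 150))
a.exchange(b)
```

After the exchange, both instances know both URLs, and the older crawl
of https://example.org/ has been replaced by the newer one.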

I think this is the most stable solution to the search engine problem.
It is the most independent from other search engines and probably the
most successful because of the computing power behind it. At least
compared to YaCy, which has the problem that its results are limited
because its crawlers are not powerful.

Here is the open call for funding; the deadline is the 1st of June.
https://nlnet.nl/news/2020/20200401-call.html
toogley
Message ID
<C2WBXCNC6Y0G.3MI0T56YB3B0U@unicorn.openbsd.amsterdam>
In-Reply-To
<C2WBOGSP6A71.3LD93LMQ5ALM6@unicorn.openbsd.amsterdam>
On Thu May 21, 2020 at 2:07 PM CEST, toogley wrote:
> Hey,
>
> Do you know nlnet.nl ? There, you could ask for funding to develop for
> instance an in-house crawler.

I think the most sensible solution would be to develop it independently
of other backends/frontends. That way, it could be used as part of
searx instances, as part of YaCy instances, and also on its own.


I think we (in the free software scene) have the computing power to
create an independent crawler; we are just missing the software to run.


I also think it would be great if it were possible to focus on just
the crawling. Many people probably have some spare computing time on
their servers but don't want to run a search engine on them.

So with this idea, they could just run the crawler from time to time,
and feed the results back into the network.
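
A crawl-only contributor could be as small as the sketch below. It is
only meant as an illustration: LinkExtractor and make_batch are made-up
names, the JSON batch layout is an assumption, and the pages are passed
in as already-fetched HTML so that the example needs no network access.
How a batch would actually be submitted to the network is left open.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute link targets from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def make_batch(pages):
    """Turn {url: html} into a JSON batch (hypothetical format)."""
    batch = []
    for url, html in pages.items():
        parser = LinkExtractor(url)
        parser.feed(html)
        batch.append({"url": url, "links": sorted(set(parser.links))})
    return json.dumps(batch)


sample = {
    "https://example.org/":
        '<a href="/about">About</a> <a href="https://example.net/">Other</a>',
}
batch = make_batch(sample)
```

The resulting JSON string is what such a contributor would hand over;
it would not need to run an index or a frontend itself.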