I made this list for myself; maybe it can help others figure out what to
think about when writing about search engines:
- crawler, e.g. Apache Nutch, possibly with headless browser support
and feeds (Atom, RSS, sitemap, robots.txt)
- from boolean keyword and full-text search to conceptual search
- entity linking (aka wikification) and disambiguation
- one-way or two-way synonyms and other lexicographic features
- search restricted to the last few days, weeks or months (like Google tools)
- trends and trending topics (z-score)
- alerting
- domain search, i.e. restrict search to a given domain / subdomain
- spell checking, transliteration and soundex
- multi-lingual
- autocomplete, query suggestion
- more-like-this
- network topology, e.g. incoming links, outgoing links, PageRank
- caching, offline use, archiving
- image and video search
- code search
- question answering, e.g. how big is Everest?
- "response extraction": extract an answer to the query from the top results
- summary
- oembed
- readability metrics, privacy metrics (no third-party cookies or
analytics, no JavaScript, small size...)
- maps
- book OCR
- safe search
- voice search (Mozilla Common Voice)
- deep search / full index scan
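
To make the z-score item a bit more concrete, here is a minimal Python sketch of one way trending terms could be ranked. The term names, the daily counts and the `trend_zscore` helper are all made up for illustration; a real engine would probably use a rolling window and treat low-volume terms separately.

```python
from statistics import mean, pstdev


def trend_zscore(history, current):
    """Z-score of the current count against a list of past counts."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Flat history: no variance to compare against, so report no trend.
        return 0.0
    return (current - mu) / sigma


# Hypothetical daily query counts per term (assumed data, not real logs).
history = {
    "everest": [10, 12, 11, 9, 13],
    "nutch": [2, 8, 1, 9, 5],
}
today = {"everest": 30, "nutch": 9}

scores = {term: trend_zscore(history[term], today[term]) for term in history}

# Highest z-score first: the terms that deviate most from their own baseline.
trending = sorted(scores, key=scores.get, reverse=True)
print(trending)
```

The point of normalizing by each term's own standard deviation is that a term with a noisy history needs a much bigger jump to count as trending than a term that is usually stable.
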
Chime in!