This content has been automatically translated from Ukrainian.
Imagine you have a million documents. You want to find every one that mentions "coffee", but only where the word is used in the context of "brewing", not "store". Then sort the results by relevance. And do all of this in 50 milliseconds.
A relational database struggles here. It can run WHERE body LIKE '%coffee%', but a pattern with a leading wildcard cannot use an ordinary B-tree index, so the database scans every row. It has no understanding of language, no ranking, and no tolerance for typos. On a million records this can take seconds; on a billion, minutes.
This is why search engines exist - specialized systems built around one idea: to make searching large volumes of text fast, flexible, and intelligent.

Elasticsearch: the de facto standard

Elasticsearch is a distributed search engine and analytics platform built on Apache Lucene. Lucene is a Java library that has existed since 1999 and implements classic information retrieval algorithms. Elasticsearch appeared in 2010 as a convenient HTTP wrapper over Lucene with clustering support.

How it works internally

At its core is the concept of an inverted index. Instead of storing the text of each document and searching through it, the system builds a dictionary: "which word appears in which documents." This is similar to a subject index in a book.
"coffee"  → [doc_3, doc_7, doc_42, doc_100]
"brewing" → [doc_7, doc_15, doc_42]
For the query "brewing coffee", the system looks up both terms and intersects their document lists (here doc_7 and doc_42). This typically takes milliseconds, because the cost depends on the length of those lists rather than on the total amount of text stored.
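The lookup-and-intersect idea can be sketched in a few lines of Python. This is a toy illustration, not how Lucene stores its index: the document ids and texts are made up, and the helper names build_inverted_index and search_all_terms are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_all_terms(index, query):
    """Return the ids of documents that contain every word of the query."""
    postings = [index.get(word, set()) for word in query.lower().split()]
    if not postings:
        return set()
    # Start from the shortest list so intermediate results stay small.
    postings.sort(key=len)
    return set.intersection(*postings)


# Made-up sample documents.
docs = {
    "doc_3": "coffee shop opening hours",
    "doc_7": "brewing coffee at home",
    "doc_15": "brewing tea",
    "doc_42": "cold brewing coffee guide",
}
index = build_inverted_index(docs)
print(sorted(search_all_terms(index, "brewing coffee")))
```

Real engines keep the document lists sorted and compressed on disk, but the principle is the same: the query touches only the lists for its own terms.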
On top of this, Elasticsearch adds:
  • Analyzers - text processing pipelines: tokenization, stop-word removal, stemming (reducing words to their root form), transliteration.
  • Scoring - the BM25 algorithm (and previously TF-IDF) assigns a numerical relevance score to each result.
  • Shards and replicas - the index is split into parts (shards), which are distributed across the cluster nodes. Replicas provide fault tolerance.
  • REST API - all interaction occurs via JSON over HTTP. No specific client protocols.
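To make the first two items more concrete, here is a minimal sketch of a toy analyzer followed by BM25 scoring. It uses the textbook Okapi BM25 formula with the commonly cited defaults k1=1.2 and b=0.75; it is not Lucene's exact implementation, and the stop-word list, function names, and sample documents are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list, far shorter than a real one.
STOP_WORDS = {"a", "an", "the", "of", "for", "to", "and", "in"}


def analyze(text):
    """Toy analyzer: lowercase, split on non-alphanumerics, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with the Okapi BM25 formula."""
    if not docs:
        return {}
    analyzed = {doc_id: analyze(text) for doc_id, text in docs.items()}
    n_docs = len(analyzed)
    avg_len = sum(len(tokens) for tokens in analyzed.values()) / n_docs or 1.0
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in analyzed.values() for term in set(tokens))
    query_terms = analyze(query)

    scores = {}
    for doc_id, tokens in analyzed.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


sample = {
    "a": "Brewing coffee at home",
    "b": "The coffee store",
    "c": "Tea time",
}
print(bm25_scores("brewing coffee", sample))
```

In this sample, document "a" matches both query terms and scores highest, "b" matches only the more common term, and "c" scores zero.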

What Elasticsearch does well

Full-text search is the obvious one. But there is more:
Aggregations - real-time analytics. "How many orders per city in the last 7 days, broken down by hours?" - one query, instant response.
Geo-search - "find all cafes within a 2 km radius of the coordinates." Elasticsearch supports geospatial indexes natively.
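As a rough sketch, the aggregation and geo examples above could be expressed as the following search request bodies, built here as Python dictionaries and printed as JSON. The field names city, created_at, and location are hypothetical and assume a mapping with keyword, date, and geo_point types; the coordinates are arbitrary.

```python
import json

# "How many orders per city in the last 7 days, broken down by hours?"
orders_per_city_by_hour = {
    "size": 0,  # return only aggregation buckets, no individual hits
    "query": {"range": {"created_at": {"gte": "now-7d"}}},
    "aggs": {
        "by_city": {
            "terms": {"field": "city"},
            "aggs": {
                "by_hour": {
                    "date_histogram": {
                        "field": "created_at",
                        "fixed_interval": "1h",
                    }
                }
            },
        }
    },
}

# "Find all cafes within a 2 km radius of the coordinates."
cafes_within_2km = {
    "query": {
        "bool": {
            "filter": {
                "geo_distance": {
                    "distance": "2km",
                    "location": {"lat": 50.4501, "lon": 30.5234},
                }
            }
        }
    }
}

print(json.dumps(orders_per_city_by_hour, indent=2))
print(json.dumps(cafes_within_2km, indent=2))
```

Each body would be sent as JSON to the _search endpoint of the relevant index.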
Vector search (kNN) - starting from version 8.x, support for dense vectors allows semantic search based on embeddings. That is, finding documents by meaning, not by exact word match.
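The idea behind kNN search can be shown with a brute-force sketch: compare the query vector with every document vector and keep the closest ones. Production engines use approximate indexes instead of a full scan, and real embeddings have hundreds of dimensions; the three-dimensional vectors and document names below are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(query_vec, doc_vecs, k=2):
    """Exact (brute-force) k nearest neighbours by cosine similarity."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Made-up "embeddings"; a real model would produce these from text.
doc_vecs = {
    "espresso_guide": [0.9, 0.1, 0.0],
    "pour_over_tips": [0.8, 0.3, 0.1],
    "store_hours": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]
print(knn(query_vec, doc_vecs, k=2))
```

The two coffee-related documents come out on top even though no keywords are compared at all.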
Observability stack (ELK) - Elasticsearch + Logstash + Kibana. A classic combination for collecting and analyzing logs. Thousands of companies run it for infrastructure monitoring.

The dark side: licensing drama

In 2021, Elastic NV changed the license for Elasticsearch and Kibana from Apache 2.0 to a dual license: SSPL (Server Side Public License) or Elastic License 2.0. Elastic License 2.0 prohibits offering the software to others as a managed service, and SSPL requires anyone who does so to publish the source code of the entire service stack. In practice, both rule out a cloud Elasticsearch offering without a commercial agreement with Elastic.
The reason: Amazon Web Services had launched Amazon Elasticsearch Service (later renamed Amazon OpenSearch Service) and, in Elastic's view, was profiting from the open-source code without contributing enough back. Elastic decided to close this loophole.
The open-source community had mixed reactions. The OSI does not recognize SSPL as an open-source license. For many, this meant that Elasticsearch was no longer truly open.

OpenSearch: a fork from Amazon

In response to the licensing change, AWS forked Elasticsearch 7.10 (the last version under Apache 2.0) in 2021 and created OpenSearch. Kibana was forked at the same time and became OpenSearch Dashboards.
OpenSearch remains under Apache License 2.0. Amazon continues to offer it as a managed service - Amazon OpenSearch Service.

How does OpenSearch differ from Elasticsearch?

At the time of the fork - practically nothing. The APIs were 95% compatible. Over time, they diverged:
Aspect        | Elasticsearch              | OpenSearch
License       | Elastic License 2.0 / SSPL | Apache 2.0
Vector search | kNN since version 8.x      | k-NN plugin (since early versions)
ML features   | Elastic ML (commercial)    | ML Commons (open)
Security      | X-Pack (free since 7.x)    | Security plugin (always free)
Governance    | Elastic NV (USA)           | OpenSearch Foundation (Linux Foundation)
Important: in 2024, Elastic announced that it would add AGPL-3.0, an OSI-approved open-source license, as a licensing option for Elasticsearch and Kibana. This is partly a response to the rise of OpenSearch. But AGPL comes with copyleft obligations that some companies avoid in commercial products.

When to choose OpenSearch?

  • If you are already on AWS and using a managed service
  • If license purity (Apache 2.0) is important
  • If you need built-in security features for free
  • If your team does not want to depend on a commercial company

Alternatives: when Elasticsearch is too much

Elasticsearch is powerful, but it is also heavy. A fault-tolerant production cluster usually means at least 3 nodes, several gigabytes of RAM each, and non-trivial configuration. For small and medium projects, this is often overkill.

Typesense

Typesense is a modern search engine written in C++, focused on simplicity and speed.
Key features:
  • Instant setup: one binary file, config with 5 lines
  • Typo tolerance out of the box - "cafe" will find "cafe," "café," "кафэ"
  • Vector search and hybrid search (text + vectors)
  • Built-in support for faceted search
  • License: GPL-3.0 (self-hosted), there is a cloud SaaS
Typesense is great for searching products, articles, documentation.
Limitations: not suitable for log analytics, lacks Elasticsearch-level aggregations, smaller ecosystem.
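Typo tolerance in general is usually built on edit distance: how many single-character changes separate the query from an indexed word. The sketch below shows that idea with plain Levenshtein distance. It is not Typesense's actual implementation, and the vocabulary and function names are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character inserts,
    deletes, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def fuzzy_match(query, vocabulary, max_typos=1):
    """Return vocabulary words within max_typos edits of the query."""
    return [
        word
        for word in vocabulary
        if edit_distance(query.lower(), word.lower()) <= max_typos
    ]


# Made-up vocabulary of indexed words.
vocabulary = ["coffee", "cafe", "café", "toffee", "tea"]
print(fuzzy_match("cofee", vocabulary))
print(fuzzy_match("cafe", vocabulary))
```

A real engine avoids comparing the query with every word and also takes word length and ranking into account when deciding how many typos to allow.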

Meilisearch

Meilisearch is an open-source (MIT) search engine in Rust, with a focus on developer experience.
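# Start the server (takes about 10 seconds)
docker run -p 7700:7700 getmeili/meilisearch
# In another terminal: add a document to the "movies" index
curl -X POST 'http://localhost:7700/indexes/movies/documents' \
  -H 'Content-Type: application/json' \
  -d '[{"id": 1, "title": "Star Wars"}]'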
Features:
  • Extremely simple REST API
  • Instant search out of the box
  • Filters, facets, sorting
  • Multilingual support
  • Vector search (from version 1.3)
Limitations: less scalable than Elasticsearch, not suitable for petabyte data, limited analytical capabilities.

Solr

Apache Solr is the "older brother" of Elasticsearch. Also built on Lucene, it appeared in 2004. For a long time, it was the industry standard.
Today, Solr lags behind Elasticsearch in API convenience, documentation, and cloud-native features. But there are niches where Solr excels: classic enterprise search, very complex faceting scenarios, integration with Hadoop.

PostgreSQL Full-Text Search

Yes, your favorite relational database can do full-text search. And it’s not bad.
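-- Assumes a text search configuration named 'ukrainian' has been created
-- and that articles.search_vector is a tsvector column.
SELECT title, ts_rank(search_vector, query) AS rank
FROM articles, to_tsquery('ukrainian', 'search & engine') AS query
WHERE search_vector @@ query
ORDER BY rank DESC;
PostgreSQL ships text search configurations for many languages, can be extended with Hunspell dictionaries for others (Ukrainian, for example, via uk_hunspell), and offers GIN/GiST indexes for fast searching plus ranking by relevance.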
When it’s enough: up to a few million documents, if typo tolerance and complex aggregations are not needed, if you want to minimize infrastructure.
When it’s not enough: large volumes, fuzzy search needed, synonyms, complex ranking, real-time analytics.

SQLite FTS5

For very small projects or mobile applications, there is the FTS5 module of SQLite: full-text search without any external dependencies.
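A minimal sketch with Python's built-in sqlite3 module shows how little setup this needs. It assumes the underlying SQLite library was compiled with FTS5, which is common but not guaranteed; the table name, columns, and sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table; requires an SQLite build with FTS5 enabled.
conn.execute("CREATE VIRTUAL TABLE articles USING fts5(title, body)")
conn.executemany(
    "INSERT INTO articles (title, body) VALUES (?, ?)",
    [
        ("Brewing coffee at home", "A guide to brewing pour-over coffee."),
        ("Coffee store opening hours", "Our store is open daily."),
        ("Tea basics", "Brewing green tea without bitterness."),
    ],
)


def search(query):
    """Run a full-text query; ORDER BY rank puts the best matches first."""
    rows = conn.execute(
        "SELECT title FROM articles WHERE articles MATCH ? ORDER BY rank",
        (query,),
    )
    return [title for (title,) in rows]


print(search("brewing coffee"))
```

In FTS5 query syntax, two words separated by a space must both be present, so only the first article matches "brewing coffee".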

Qdrant, Weaviate, Pinecone - vector databases

These are a separate class of systems that is gaining popularity with the rise of LLMs. Instead of searching by keywords, they search by semantic proximity between vectors (embeddings).
  • Qdrant (Rust, Apache 2.0) - fast, production-ready, self-hosted or cloud.
  • Weaviate (Go, BSD-3) - hybrid search + built-in integration with OpenAI/Cohere.
  • Pinecone - cloud SaaS, the easiest to set up.
These systems do not replace Elasticsearch - they complement it. A hybrid approach (BM25 + vector search) often yields the best results.
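One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The text above does not say which fusion method is meant, so treat this as one possible sketch; the document ids and both input rankings are made up, and k=60 is the conventional default constant.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked well by several systems rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Made-up results from a BM25 query and a vector query.
bm25_ranking = ["doc_7", "doc_42", "doc_3"]
vector_ranking = ["doc_42", "doc_15", "doc_7"]
print(reciprocal_rank_fusion([bm25_ranking, vector_ranking]))
```

RRF only needs rank positions, so it sidesteps the problem that BM25 scores and vector similarities live on different scales.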

Comparison table

               | Elasticsearch    | OpenSearch    | Typesense      | Meilisearch | PG FTS
License        | Elastic/AGPL     | Apache 2.0    | GPL-3          | MIT         | PostgreSQL
Language       | Java             | Java          | C++            | Rust        | C
Typo tolerance | Yes (fuzzy)      | Yes           | Excellent      | Excellent   | No
Aggregations   | Advanced         | Advanced      | Basic          | Basic       | Limited
Complexity     | High             | High          | Low            | Low         | -
RAM (min.)     | 1–2 GB           | 1–2 GB        | 256 MB         | 256 MB      | -
Vector search  | Yes (8.x+)       | Yes           | Yes            | Yes         | pgvector
Ideal for      | Enterprise, logs | AWS, openness | Products, docs | Startups    | Small projects

How to choose?

Choose Elasticsearch if:
  • You need log analytics (ELK stack)
  • Scale - tens of millions of documents or more
  • You need complex aggregations and Kibana dashboards
  • Your team is ready to invest time in configuration
Choose OpenSearch if:
  • You are on AWS or license purity of Apache 2.0 is important
  • You need the same functionality as Elasticsearch
Choose Typesense or Meilisearch if:
  • Startup or medium project
  • Priority is fast start and developer experience
  • You need typo tolerance without fine-tuning
Stick with PostgreSQL FTS if:
  • You have fewer than a million documents
  • You don’t want additional infrastructure
  • Basic search quality is sufficient
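The checklist above can be read as a small decision function. This is only a rough encoding of the bullets: the function name, the parameters, and the exact thresholds (10 million and 1 million documents) are illustrative assumptions, and real decisions involve more factors.

```python
def suggest_engine(
    doc_count,
    needs_log_analytics=False,
    needs_complex_aggregations=False,
    on_aws_or_needs_apache_license=False,
    wants_no_extra_infrastructure=False,
    needs_typo_tolerance=False,
):
    """Rough encoding of the checklist; thresholds are illustrative."""
    heavy = (
        needs_log_analytics
        or needs_complex_aggregations
        or doc_count >= 10_000_000
    )
    if heavy:
        # Same feature class; the license or hosting decides between the two.
        return "OpenSearch" if on_aws_or_needs_apache_license else "Elasticsearch"
    if (
        doc_count < 1_000_000
        and wants_no_extra_infrastructure
        and not needs_typo_tolerance
    ):
        return "PostgreSQL FTS"
    return "Typesense or Meilisearch"


print(suggest_engine(50_000_000))
print(suggest_engine(500_000, wants_no_extra_infrastructure=True))
print(suggest_engine(200_000, needs_typo_tolerance=True))
```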

Real-world use cases

GitHub ran code and repository search on Elasticsearch for years - billions of documents, sub-second response (its current code search uses an in-house engine).
Wikipedia - Elasticsearch for searching articles. About 60 million pages, 300+ languages.
Netflix - ELK stack for log analysis. Petabytes of data.
Shopify - Elasticsearch for product search in millions of stores.
At the same time, thousands of smaller products thrive on Meilisearch or Typesense - and do not require the complexity of Elasticsearch.
Elasticsearch is the Photoshop of search engines: extremely powerful, but with a steep learning curve and serious system requirements. OpenSearch is its license-clean twin.
But 2024–2026 shows: competition is alive. Typesense and Meilisearch are nibbling at the audience in the mid-segment. Vector databases open a new dimension of semantic search. PostgreSQL with pgvector is quietly creeping up from below.
The right choice is not "the most powerful solution," but "the simplest solution that solves your problem." Sometimes it’s Elasticsearch. Sometimes - LIKE '%search%' in PostgreSQL.
The main thing is to understand what you are building and not to use a cannon to shoot sparrows.