| | Elasticsearch | OpenSearch | Typesense | Meilisearch | PG FTS |
|---|---|---|---|---|---|
| License | Elastic License / AGPL | Apache 2.0 | GPL-3 | MIT | PostgreSQL License |
| Language | Java | Java | C++ | Rust | C |
| Typo tolerance | Yes (fuzzy) | Yes | Excellent | Excellent | No |
| Aggregations | Advanced | Advanced | Basic | Basic | Limited |
| Complexity | High | High | Low | Low | - |
| RAM (min.) | 1–2 GB | 1–2 GB | 256 MB | 256 MB | - |
| Vector search | Yes (8.x+) | Yes | Yes | Yes | pgvector |
| Ideal for | Enterprise, logs | AWS, openness | Products, docs | Startups | Small projects |
This content has been automatically translated from Ukrainian.
Imagine you have a million documents. You want to find every one that mentions "coffee," but only where the word is used in the context of "brewing," not "store." You want the results sorted by relevance. And you want all of this in 50 milliseconds.
A relational database struggles here. It can run WHERE body LIKE '%coffee%', but that is a scan of every row with no understanding of language, no ranking, and no tolerance for typos. On a million records this takes seconds; on a billion, minutes.
This is why search engines exist - specialized systems built around one idea: to make searching large volumes of text fast, flexible, and intelligent.
Elasticsearch: the de facto standard
Elasticsearch is a distributed search engine and analytics platform built on Apache Lucene. Lucene is a Java library that has existed since 1999 and implements classic information retrieval algorithms. Elasticsearch appeared in 2010 as a convenient HTTP wrapper over Lucene with clustering support.
How it works internally
At its core is the concept of an inverted index. Instead of storing the text of each document and searching through it, the system builds a dictionary: "which word appears in which documents." This is similar to a subject index in a book.
"coffee" → [doc_3, doc_7, doc_42, doc_100]
"brewing" → [doc_7, doc_15, doc_42]
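When querying "brewing coffee," the system looks up both words and intersects their document lists. The cost depends on the length of those lists, not on the total size of the collection, which is why the answer arrives in milliseconds.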
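The idea can be sketched in a few lines of Python. This is only an illustration of the data structure, not how Lucene stores its index on disk; the documents and the function names `build_inverted_index` and `search_all_terms` are made up for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search_all_terms(index, query):
    """Return ids of documents that contain every term of the query."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    if not postings:
        return set()
    # Start from the shortest posting list so the intersection stays small.
    postings.sort(key=len)
    result = set(postings[0])
    for other in postings[1:]:
        result &= other
    return result


# Hypothetical toy collection.
docs = {
    "doc_3": "coffee store opening hours",
    "doc_7": "brewing coffee at home",
    "doc_15": "brewing beer in small batches",
    "doc_42": "coffee brewing methods compared",
}
index = build_inverted_index(docs)
hits = search_all_terms(index, "brewing coffee")
print(sorted(hits))
```

Only the posting lists for "brewing" and "coffee" are touched; the other documents are never read.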
On top of this, Elasticsearch adds:
- Analyzers - text processing pipelines: tokenization, stop-word removal, stemming (reducing to root), transliteration.
- Scoring - the BM25 algorithm (and previously TF-IDF) assigns a numerical relevance score to each result.
- Shards and replicas - the index is split into parts (shards), which are distributed across the cluster nodes. Replicas provide fault tolerance.
- REST API - all interaction occurs via JSON over HTTP. No special client protocol is required.
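To make the scoring step less abstract, here is a minimal sketch of a BM25 score in Python. It assumes whitespace tokenization and the commonly used parameters k1 = 1.2 and b = 0.75; the function name `bm25_scores` and the sample documents are invented for illustration, and a real engine layers analyzers, per-field statistics and per-shard statistics on top of this.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Longer-than-average documents are penalized.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores[doc_id] = score
    return scores


# Hypothetical toy collection.
docs = {
    "a": "brewing coffee at home",
    "b": "coffee store opening hours",
    "c": "brewing beer in small batches",
    "d": "tea ceremony basics",
}
scores = bm25_scores("brewing coffee", docs)
best = max(scores, key=scores.get)
print(best, scores)
```

The document that contains both query terms ranks first, and a document with neither term scores zero.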
What Elasticsearch does well
Full-text search - this is obvious. But besides that:
Aggregations - real-time analytics. "How many orders per city in the last 7 days, broken down by hours?" - one query, instant response.
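A query like that could look roughly as follows. The sketch only builds the JSON request body in Python; the field names `created_at` and `city` and the helper `orders_per_city_by_hour` are assumptions, and the `city` field is assumed to be mapped as a keyword.

```python
import json


def orders_per_city_by_hour(days=7, time_field="created_at", city_field="city"):
    """Build a search body: bucket by city, then by hour, over the last N days."""
    return {
        "size": 0,  # return only aggregation buckets, no individual hits
        "query": {"range": {time_field: {"gte": f"now-{days}d/d"}}},
        "aggs": {
            "by_city": {
                "terms": {"field": city_field},
                "aggs": {
                    "per_hour": {
                        "date_histogram": {"field": time_field, "fixed_interval": "1h"}
                    }
                },
            }
        },
    }


body = orders_per_city_by_hour()
print(json.dumps(body, indent=2))
```

The body would be sent to the `_search` endpoint of whichever index holds the orders. The nested `aggs` key is what produces the per-hour breakdown inside each city bucket.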
Geo-search - "find all cafes within a 2 km radius of the coordinates." Elasticsearch supports geospatial indexes natively.
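Conceptually, a radius filter keeps the points whose great-circle distance to the center is below a threshold. The sketch below does this with the haversine formula and a linear scan; a real engine uses a spatial index instead of checking every point, and the coordinates and place names here are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(places, center, radius_km):
    """Return names of places no farther than radius_km from center."""
    lat0, lon0 = center
    return [
        name
        for name, (lat, lon) in places.items()
        if haversine_km(lat0, lon0, lat, lon) <= radius_km
    ]


# Hypothetical cafes around a hypothetical center point.
places = {
    "cafe_near": (50.4547, 30.5238),
    "cafe_far": (50.5000, 30.6000),
}
nearby = within_radius(places, center=(50.4501, 30.5234), radius_km=2.0)
print(nearby)
```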
Vector search (kNN) - starting from version 8.x, k-nearest-neighbor search over dense vectors allows for semantic search based on embeddings. That is, finding documents by meaning, not by exact word match.
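As a rough illustration of what "nearest neighbors" means, here is a brute-force version in Python using cosine similarity. The three-dimensional vectors are invented stand-ins for real embeddings, and production engines rely on approximate indexes rather than comparing the query with every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(query_vec, vectors, k=2):
    """Return the ids of the k vectors most similar to query_vec."""
    ranked = sorted(
        vectors,
        key=lambda doc_id: cosine_similarity(query_vec, vectors[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
vectors = {
    "espresso_guide": [0.9, 0.1, 0.0],
    "pour_over_tips": [0.8, 0.3, 0.1],
    "tax_report": [0.0, 0.1, 0.9],
}
neighbors = knn([1.0, 0.2, 0.0], vectors, k=2)
print(neighbors)
```

Neither of the two returned documents has to share a single word with the query; they are close only in the embedding space.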
Observability stack (ELK) - Elasticsearch + Logstash + Kibana. A classic combination for collecting and analyzing logs. Thousands of companies run it for infrastructure monitoring.
The dark side: licensing drama
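In 2021, Elastic NV changed the license for Elasticsearch and Kibana from Apache 2.0 to a dual license: SSPL (Server Side Public License) and Elastic License 2.0. Both restrict offering Elasticsearch as a cloud service: the Elastic License forbids providing it as a managed service, and SSPL requires publishing the source code of the entire service around it. In 2024 Elastic added AGPLv3 as an additional license option.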
The reason: Amazon Web Services had launched Amazon Elasticsearch Service (later renamed Amazon OpenSearch Service) and, in Elastic's view, was profiting from someone else's open-source code without contributing back. Elastic decided to close this loophole.
The open-source community's reaction was mixed. SSPL is not recognized as an open-source license by the Open Source Initiative (OSI). For many, this meant that Elasticsearch was no longer truly open.
OpenSearch: a fork from Amazon
In response to the licensing change, AWS forked Elasticsearch 7.10 (the last version under Apache 2.0) in 2021 and created OpenSearch. Kibana was forked at the same time and became OpenSearch Dashboards.
OpenSearch remains under Apache License 2.0. Amazon continues to offer it as a managed service - Amazon OpenSearch Service.
How does OpenSearch differ from Elasticsearch?
At the time of the fork - practically nothing. The APIs were about 95% compatible. Over time, the two projects have diverged.
When to choose OpenSearch?
- If you are already on AWS and using a managed service
- If license purity (Apache 2.0) is important
- If you need built-in security features for free
- If your team does not want to depend on a commercial company
Alternatives: when Elasticsearch is too much
Elasticsearch is powerful, but it is also heavy. A fault-tolerant production cluster typically means at least 3 nodes, several gigabytes of RAM, and complex configuration. For small and medium projects, this is often overkill.
Typesense
Typesense is a modern search engine written in C++, focused on simplicity and speed.
Key features:
- Instant setup: one binary file, config with 5 lines
- Typo tolerance out of the box - "cafe" will also find "café" and misspellings such as "caffe"
- Vector search and hybrid search (text + vectors)
- Built-in support for faceted search
- License: GPL-3.0 (self-hosted), there is a cloud SaaS
Typesense is great for searching products, articles, documentation.
Limitations: not suitable for log analytics, lacks Elasticsearch-level aggregations, smaller ecosystem.
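Typo tolerance is usually explained through edit distance: how many single-character insertions, deletions or substitutions separate the query from an indexed word. The sketch below is a plain Levenshtein implementation with a linear scan over a tiny vocabulary; it is an assumption-level illustration of the idea, not Typesense's actual implementation, and the names `edit_distance` and `fuzzy_match` are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query, vocabulary, max_typos=1):
    """Return the words within max_typos edits of the query."""
    return [word for word in vocabulary if edit_distance(query, word) <= max_typos]


# Hypothetical vocabulary of indexed words.
vocabulary = ["cafe", "caffe", "cave", "coffee", "tea"]
matches = fuzzy_match("cafe", vocabulary, max_typos=1)
print(matches)
```

With one allowed typo, "caffe" and "cave" match "cafe", while "coffee" (three edits away) does not.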
Meilisearch
Meilisearch is an open-source (MIT) search engine in Rust, with a focus on developer experience.
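# Run in 10 seconds
docker run -p 7700:7700 getmeili/meilisearch
# In another terminal: add a document (JSON payloads need the Content-Type header)
curl -X POST 'http://localhost:7700/indexes/movies/documents' \
  -H 'Content-Type: application/json' \
  -d '[{"id": 1, "title": "Star Wars"}]'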
Features:
- Extremely simple REST API
- Instant search out of the box
- Filters, facets, sorting
- Multilingual support
- Vector search (from version 1.3)
Limitations: less scalable than Elasticsearch, not suitable for petabyte data, limited analytical capabilities.
Solr
Apache Solr is the "older brother" of Elasticsearch. Also built on Lucene, it appeared in 2004. For a long time, it was the industry standard.
Today, Solr lags behind Elasticsearch in API convenience, documentation, and cloud-native features. But there are niches where Solr excels: classic enterprise search, very complex faceting scenarios, integration with Hadoop.
PostgreSQL Full-Text Search
Yes, your favorite relational database can do full-text search. And it’s not bad.
SELECT title, ts_rank(search_vector, query) AS rank
FROM articles, to_tsquery('ukrainian', 'search & engine') query
WHERE search_vector @@ query
ORDER BY rank DESC;
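PostgreSQL supports dictionaries for various languages, GIN/GiST indexes for fast searching, and ranking by relevance. Note that a 'ukrainian' text search configuration is not built in: it has to be created first, for example on top of a Hunspell dictionary such as uk_hunspell.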
When it’s enough: up to a few million documents, if typo tolerance and complex aggregations are not needed, if you want to minimize infrastructure.
When it’s not enough: large volumes, fuzzy search needed, synonyms, complex ranking, real-time analytics.
SQLite FTS5
For very small projects or mobile applications, there is the FTS5 module of SQLite. It provides full-text search without any external dependencies.
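A minimal sketch with Python's built-in `sqlite3` module shows how little is needed. It assumes the SQLite library bundled with your Python was compiled with FTS5, which is common but not guaranteed; the table name and sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# An FTS5 virtual table indexes every listed column for full-text search.
conn.execute("CREATE VIRTUAL TABLE articles USING fts5(title, body)")
conn.executemany(
    "INSERT INTO articles (title, body) VALUES (?, ?)",
    [
        ("Brewing guide", "How to start brewing coffee at home"),
        ("Store hours", "The coffee store opens at nine"),
        ("Beer basics", "Brewing beer in small batches"),
    ],
)
# MATCH runs the full-text query; ORDER BY rank sorts best matches first.
rows = conn.execute(
    "SELECT title FROM articles WHERE articles MATCH ? ORDER BY rank",
    ("coffee AND brewing",),
).fetchall()
titles = [title for (title,) in rows]
print(titles)
```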
Qdrant, Weaviate, Pinecone - vector databases
These are a separate class of systems that has gained popularity with the rise of LLMs. Instead of searching by keywords, they search by semantic proximity between vectors (embeddings).
- Qdrant (Rust, Apache 2.0) - fast, production-ready, self-hosted or cloud.
- Weaviate (Go, BSD-3) - hybrid search + built-in integration with OpenAI/Cohere.
- Pinecone - cloud SaaS, the easiest to set up.
These systems do not replace Elasticsearch - they complement it. A hybrid approach (BM25 + vector search) often yields the best results.
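One common way to combine the two result lists is reciprocal rank fusion (RRF). The article does not say which fusion method is meant, so treat this as one possible sketch; the constant k = 60 is a frequently used default, and the two rankings are invented.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one, best first."""
    scores = {}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks higher.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + position)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical results from a keyword (BM25) query and a vector query.
bm25_ranking = ["doc_7", "doc_42", "doc_3"]
vector_ranking = ["doc_42", "doc_15", "doc_7"]
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

Documents that appear in both lists rise to the top, and the method only needs ranks, so the BM25 and vector scores never have to be put on the same scale.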
Comparison table
How to choose?
Choose Elasticsearch if:
- You need log analytics (ELK stack)
- Scale - tens of millions of documents or more
- You need complex aggregations and Kibana dashboards
- Your team is ready to invest time in configuration
Choose OpenSearch if:
- You are on AWS, or an Apache 2.0 license is important to you
- You need the same functionality as Elasticsearch
Choose Typesense or Meilisearch if:
- Startup or medium project
- Priority is fast start and developer experience
- You need typo tolerance without fine-tuning
Stick with PostgreSQL FTS if:
- You have up to a few million documents
- You don’t want additional infrastructure
- Basic search quality is sufficient
Real-world use cases
GitHub has relied on Elasticsearch for search across repositories - billions of documents, sub-second response (its newer code search runs on a custom engine).
Wikipedia - Elasticsearch for searching articles. About 60 million pages, 300+ languages.
Netflix - ELK stack for log analysis. Petabytes of data.
Shopify - Elasticsearch for product search in millions of stores.
At the same time, thousands of smaller products thrive on Meilisearch or Typesense - and do not require the complexity of Elasticsearch.
Elasticsearch is the Photoshop of search engines: extremely powerful, but with a steep learning curve and serious system requirements. OpenSearch is its license-clean twin.
But 2024–2026 shows: competition is alive. Typesense and Meilisearch are nibbling at the audience in the mid-segment. Vector databases open a new dimension of semantic search. PostgreSQL with pgvector is quietly creeping up from below.
The right choice is not "the most powerful solution," but "the simplest solution that solves your problem." Sometimes it’s Elasticsearch. Sometimes - LIKE '%search%' in PostgreSQL.
The main thing is to understand what you are building and not to use a cannon to shoot sparrows.