pg_search: Bringing Elasticsearch-Grade Search to PostgreSQL
Presented by:
Mithun Chicklore Yogendra
Mithun Chicklore Yogendra is a software engineer at ParadeDB working on PostgreSQL search extensions. He previously worked at Oracle MySQL and EnterpriseDB, where he contributed to PostgreSQL core, including autoprewarm and hash index performance improvements. His expertise spans database internals, SQL query processing, and PostgreSQL extension development.
Developers building on PostgreSQL face a painful dilemma: PostgreSQL's native full-text search lacks BM25 ranking and powerful query composition and performs poorly at scale (millions of rows or more), while adding Elasticsearch means running a second system and keeping it synchronized through complex ETL pipelines. pg_search removes this trade-off by delivering dramatically faster search inside PostgreSQL itself across a range of use cases. This talk explores how pg_search leverages PostgreSQL's extension architecture to deliver Elasticsearch-grade performance without forking PostgreSQL.
What is pg_search: It is a Rust-based PostgreSQL extension that provides BM25 relevance ranking, sub-100 millisecond queries over millions of documents, faceted search and aggregations, and hybrid search combining BM25 with vector similarity. It offers full ACID guarantees without requiring separate infrastructure.
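To make the BM25 ranking mentioned above concrete, here is a minimal sketch of the textbook Okapi BM25 formula in Python. It is not pg_search or Tantivy code: the function name, the toy documents, and the parameter values k1=1.2 and b=0.75 (common defaults) are assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    Assumes `docs` is a non-empty list of non-empty token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # IDF variant with +1 inside the log, so it is never negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates with k1 and is normalized by length via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus (hypothetical data).
docs = [
    ["postgres", "search", "extension"],
    ["postgres", "postgres", "search", "search",
     "index", "storage", "pages", "heap"],
    ["vector", "similarity"],
]
scores = bm25_scores(["postgres"], docs)
```

In this toy corpus the short first document outranks the longer second one even though the second mentions the term twice, which is the length normalization that plain term counting lacks.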
Technical Architecture: pg_search uses PostgreSQL’s official extension hooks instead of forking the database. These include set_rel_pathlist_hook for intercepting base table scans, create_upper_paths_hook for taking over aggregate planning, and the Custom Scan API for replacing the execution path. The extension intercepts queries that use the @@@ operator and routes them to Tantivy, a Rust-based alternative to Lucene, while maintaining full PostgreSQL compatibility.
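The hook mechanism described above can be pictured with a small Python simulation. This is only a sketch of the general pattern (a hook inspects a relation's quals and offers an extra path, and the planner keeps the cheapest one); the real hooks are C function pointers inside PostgreSQL, and every name, data shape, and cost value below is hypothetical.

```python
# Hypothetical stand-in for PostgreSQL's hook slot (a C function pointer there).
set_rel_pathlist_hook = None

def install_hook(new_hook):
    """Chain a new hook after whatever hook was installed before it."""
    global set_rel_pathlist_hook
    previous = set_rel_pathlist_hook

    def chained(rel, paths):
        if previous is not None:
            previous(rel, paths)
        new_hook(rel, paths)

    set_rel_pathlist_hook = chained

def search_hook(rel, paths):
    """Offer a custom scan path when a qual uses the @@@ operator."""
    if any(op == "@@@" for _column, op, _value in rel["quals"]):
        # The cost is an arbitrary illustrative number.
        paths.append({"type": "CustomScan", "cost": 1.0})

def plan(rel):
    """Toy planner: start from a sequential scan, let hooks add paths,
    then keep the cheapest path."""
    paths = [{"type": "SeqScan", "cost": 100.0}]
    if set_rel_pathlist_hook is not None:
        set_rel_pathlist_hook(rel, paths)
    return min(paths, key=lambda p: p["cost"])

install_hook(search_hook)
search_plan = plan({"name": "docs", "quals": [("body", "@@@", "shoes")]})
plain_plan = plan({"name": "docs", "quals": [("id", "=", 7)]})
```

Queries without the @@@ operator are untouched by the hook and fall through to the ordinary path, which is what lets an extension coexist with the stock planner.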
Specialized Execution Methods:
TopN Execution: Optimizes ORDER BY and LIMIT queries. Performs an index scan into a quickselect buffer, ordering results during search. Achieves over 100x speedup compared to native PostgreSQL. Example: 81 ms vs 38,797 ms in Neon benchmarks.
Aggregate Scan: Pushes COUNT and GROUP BY operations directly into the index using Tantivy’s fast fields, avoiding heap access and separate aggregation phases.
Fast Fields Mode: Provides columnar storage for field retrieval, serving data directly from Tantivy.
Normal Mode: Standard execution path where Tantivy search results map to CTIDs followed by heap fetch.
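The TopN idea of scanning into a quickselect buffer can be sketched as follows. This is an illustrative Python version under stated assumptions, not the extension's implementation: the class and function names are hypothetical, and the buffer capacity of 2N with truncation back to N is one common way to bound memory, assumed here rather than taken from the talk.

```python
import random

def quickselect_top(items, n):
    """Return the n highest-scoring (score, doc_id) pairs, in no particular order."""
    if n >= len(items):
        return list(items)
    pivot = random.choice(items)[0]
    higher = [x for x in items if x[0] > pivot]
    equal = [x for x in items if x[0] == pivot]
    lower = [x for x in items if x[0] < pivot]
    if n <= len(higher):
        return quickselect_top(higher, n)
    if n <= len(higher) + len(equal):
        return higher + equal[: n - len(higher)]
    return higher + equal + quickselect_top(lower, n - len(higher) - len(equal))

class TopNBuffer:
    """Collect scored hits in a bounded buffer and keep only the best n."""

    def __init__(self, n):
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self.buf = []
        self.threshold = None  # lowest score kept at the last truncation

    def push(self, score, doc_id):
        # Hits at or below the threshold cannot improve the top n; skip them.
        if self.threshold is not None and score <= self.threshold:
            return
        self.buf.append((score, doc_id))
        if len(self.buf) >= 2 * self.n:
            self.buf = quickselect_top(self.buf, self.n)
            self.threshold = min(score for score, _ in self.buf)

    def results(self):
        """Final top n, highest score first (ties broken by doc id)."""
        top = quickselect_top(self.buf, self.n)
        return sorted(top, key=lambda pair: (-pair[0], pair[1]))

# Usage with hypothetical scores: keep the 3 best of 6 hits.
buffer = TopNBuffer(3)
for doc_id, score in enumerate([0.2, 0.9, 0.5, 0.7, 0.1, 0.8]):
    buffer.push(score, doc_id)
top_hits = buffer.results()
```

Because the buffer never grows past 2N entries and only the final N are sorted, the cost of ORDER BY with LIMIT stays close to a single pass over the matching documents instead of a full sort.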
Reference: https://neon.com/blog/pgsearch-on-neon
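The difference between Normal Mode and Aggregate Scan described above can be sketched with in-memory stand-ins. All names and data here are hypothetical: a dictionary plays the role of the heap keyed by CTID (block number, offset), and a plain list plays the role of a columnar fast field.

```python
from collections import Counter

# Hypothetical heap: CTID (block number, offset) -> full row.
heap = {
    (0, 1): {"id": 1, "category": "shoes", "title": "running shoes"},
    (0, 2): {"id": 2, "category": "hats", "title": "sun hat"},
    (1, 1): {"id": 3, "category": "shoes", "title": "trail shoes"},
    (1, 2): {"id": 4, "category": "bags", "title": "shoe bag"},
}

# Hypothetical index-side data, addressed by document id.
doc_ctids = [(0, 1), (0, 2), (1, 1), (1, 2)]
category_fast_field = ["shoes", "hats", "shoes", "bags"]  # one value per doc

def normal_mode(matching_doc_ids):
    """Map each search hit to its CTID, then fetch the row from the heap."""
    return [heap[doc_ctids[doc_id]] for doc_id in matching_doc_ids]

def aggregate_scan(matching_doc_ids):
    """COUNT(*) grouped by category, computed from the fast field alone."""
    return Counter(category_fast_field[doc_id] for doc_id in matching_doc_ids)

# Suppose a search for "shoe" matched documents 0, 2 and 3.
hits = [0, 2, 3]
rows = normal_mode(hits)
counts = aggregate_scan(hits)
```

The aggregate path never touches the `heap` dictionary, which mirrors the point that counting and grouping from columnar data avoids heap access.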
MVCC Integration: The Tantivy index is stored in PostgreSQL’s block storage with an MVCC-aware directory. Snapshot-based visibility checking ensures full transactional consistency through extension APIs.
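Snapshot-based visibility can be illustrated with a heavily simplified model. The sketch below assumes each indexed entry carries the inserting and deleting transaction ids, and it ignores details such as hint bits, subtransactions, a transaction's own writes, and id wraparound; the class names and fields are hypothetical rather than pg_search's actual structures.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Snapshot:
    xmax: int                          # ids at or above this had not started yet
    in_progress: frozenset = frozenset()  # ids still running at snapshot time

    def sees(self, xid, committed):
        """True if transaction xid committed before the snapshot was taken."""
        if xid >= self.xmax or xid in self.in_progress:
            return False
        return xid in committed

@dataclass(frozen=True)
class IndexedTuple:
    ctid: tuple
    xmin: int                   # inserting transaction
    xmax: Optional[int] = None  # deleting transaction, if any

def visible(tup, snapshot, committed):
    """Visible if the insert is seen and no seen transaction deleted it."""
    if not snapshot.sees(tup.xmin, committed):
        return False
    return tup.xmax is None or not snapshot.sees(tup.xmax, committed)

# Hypothetical state: 100, 101 and 103 committed; 102 was still running.
committed = {100, 101, 103}
snapshot = Snapshot(xmax=104, in_progress=frozenset({102}))

tuples = [
    IndexedTuple((0, 1), xmin=100),            # committed insert
    IndexedTuple((0, 2), xmin=102),            # insert still in progress
    IndexedTuple((0, 3), xmin=100, xmax=101),  # deleted by a committed xact
    IndexedTuple((0, 4), xmin=100, xmax=102),  # delete not yet committed
    IndexedTuple((0, 5), xmin=105),            # inserted after the snapshot
]
visible_ctids = [t.ctid for t in tuples if visible(t, snapshot, committed)]
```

Filtering search hits through a check of this kind is what lets two concurrent transactions run the same search and each see only the rows their own snapshot allows.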
Why This Matters: pg_search demonstrates that PostgreSQL’s extension architecture can integrate specialized search engines while preserving ACID guarantees. It removes the need for a separate Elasticsearch deployment, simplifying infrastructure while achieving similar performance, all without forking PostgreSQL.
- Date:
- Duration: 45 min
- Room:
- Conference: PGConf India, 2026
- Language:
- Track: Application Developer
- Difficulty: Medium