Presented by:

Gunjan Juyal

from Google

Eeshan Gupta

from Google

Eeshan Gupta is a Software Engineer on the Cloud SQL for PostgreSQL team in Bangalore, India. He has been on this team, developing for PostgreSQL, for around 4 years. During this time, Eeshan has worked on various facets of PostgreSQL, including developing and supporting the PostgreSQL kernel and extensions, automated vacuum management, and connection pooling, and has recently been working on vector search using PostgreSQL. He also has expertise in extensions such as plv8 and pgvector, and is interested in contributing to PostgreSQL and the adjacent ecosystem, having been a Google Summer of Code participant in 2018.

Summary

Vector databases are revolutionizing information-retrieval applications by connecting the semantic information unlocked by LLMs with the power and convenience of semantic search. But efficiently managing and querying vector embeddings within a relational database like PostgreSQL requires specialized knowledge, especially at the scale and accuracy expected in production. This talk demystifies the process of scaling pgvector for large workloads, focusing on practical techniques, with brief dives into the algorithms powering this awesome extension.

What We Will Cover

  • Index-creation scaling techniques: Quantization (with re-ordering), filtered indexes for low-cardinality dimensions, parallel index building
  • Tuning index-type specific indexing parameters for recall (aka accuracy) vs TPS tradeoff: IVFFlat (lists) and HNSW (ef_construction)
  • Tuning index-type specific querying parameters for recall (aka accuracy) vs TPS tradeoff: IVFFlat (probes) and HNSW (ef_search)
  • Query scaling techniques: Distributed queries using foreign-data-wrapper-based sharding
  • Handling data-drift: Mitigate performance drift caused by evolving datasets in IVFFlat and HNSW index types.
  • pgvector weak spots: Open areas where pgvector currently struggles, such as specific query shapes (e.g. high selectivity), data size limitations, and vector-size limitations
  • A brief overview of techniques that partly mitigate these limitations: Quantization to support higher-dimensional vectors or larger datasets, handling of high-selectivity queries, etc.
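
The recall vs TPS tradeoff mentioned for IVFFlat (lists at build time, probes at query time) can be illustrated with a toy sketch in pure Python. This is not pgvector's implementation or API: the function names (`build_ivf`, `ivf_search`, `recall_at_k`, `make_demo`), the random Gaussian data, and the dataset sizes are all assumptions made for illustration. Only the parameter names `lists` and `probes` are borrowed from the talk outline. The idea shown is that scanning more lists per query examines more candidates, which costs more work but can only keep or improve recall.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(v, centroids):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda j: sq_dist(v, centroids[j]))


def assign(vectors, centroids):
    """Put every vector id into the bucket of its nearest centroid."""
    buckets = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        buckets[nearest_centroid(v, centroids)].append(i)
    return buckets


def build_ivf(vectors, lists, iters=5, seed=0):
    """Toy inverted-file index: a few k-means rounds, then one bucket per list."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, lists)]
    dim = len(vectors[0])
    for _ in range(iters):
        for j, bucket in enumerate(assign(vectors, centroids)):
            if bucket:  # keep the old centroid if a bucket ends up empty
                centroids[j] = [
                    sum(vectors[i][d] for i in bucket) / len(bucket)
                    for d in range(dim)
                ]
    # Final assignment against the final centroids.
    return centroids, assign(vectors, centroids)


def exact_search(vectors, query, k):
    """Brute-force top-k, used as ground truth (ties broken by id)."""
    order = sorted(range(len(vectors)), key=lambda i: (sq_dist(vectors[i], query), i))
    return order[:k]


def ivf_search(vectors, index, query, k, probes):
    """Scan only the `probes` lists whose centroids are closest to the query."""
    centroids, buckets = index
    by_distance = sorted(
        range(len(centroids)), key=lambda j: (sq_dist(centroids[j], query), j)
    )
    candidates = [i for j in by_distance[:probes] for i in buckets[j]]
    candidates.sort(key=lambda i: (sq_dist(vectors[i], query), i))
    return candidates[:k]


def recall_at_k(vectors, index, queries, k, probes):
    """Average fraction of the true top-k that the approximate search returns."""
    total = 0.0
    for q in queries:
        truth = set(exact_search(vectors, q, k))
        found = set(ivf_search(vectors, index, q, k, probes))
        total += len(truth & found) / k
    return total / len(queries)


def make_demo(n=400, dim=8, n_queries=20, lists=16, seed=42):
    """Hypothetical random dataset, purely for illustration."""
    rng = random.Random(seed)
    vectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_queries)]
    return vectors, queries, build_ivf(vectors, lists)


if __name__ == "__main__":
    vectors, queries, index = make_demo()
    for probes in (1, 2, 4, 8, 16):
        recall = recall_at_k(vectors, index, queries, k=5, probes=probes)
        print(f"probes={probes:2d} recall@5={recall:.3f}")
```

With `probes` equal to `lists`, every bucket is scanned and the search degenerates to an exact scan, so recall reaches 1.0 at the cost of the most work per query. Smaller values trade recall for throughput, which is the tuning knob the outline refers to.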

Key Takeaways

This intermediate-level session empowers GenAI application developers and DBAs with the tools and strategies needed to scale pgvector beyond prototype stage to unlock its full potential for high-performance vector search.

Target Audience

This talk is aimed at GenAI application developers, DBAs and those interested in vector-related algorithms who already have a basic understanding of concepts such as vectors and ANN search, and who are interested in digging deeper into pgvector to understand its feasibility for their large-scale datasets and workloads.

Presenters

  • Gunjan Juyal is a Software Engineer at the Cloud SQL for PostgreSQL team in Google Cloud and is based out of Bangalore. (LinkedIn Profile)
  • Eeshan Gupta is a Software Engineer at the Cloud SQL for PostgreSQL team in Google Cloud and is based out of Bangalore. (LinkedIn Profile)

Date:
2025 March 7 - 15:30
Duration:
45 min
Room:
Grand Ballroom 2
Conference:
PGConf India, 2025
Track:
Application Developer
Difficulty:
Hard

Happening at the same time:

  1. Failover Slots in PostgreSQL-17: Ensuring High Availability with Logical Replication
     Start Time: 2025 March 7 15:30
     Room: Grand Ballroom 1

  2. Sharded and Distributed Are Not the Same: What You Must Know When PostgreSQL Is Not Enough
     Start Time: 2025 March 7 15:30
     Room: Jupiter