Presented by:

Apoorva Aggarwal

from Grofers India Pvt Ltd

Works as a data platform engineer

Is a SQL query on a database table running slow? Let's create an index on the table. The query is still slow, even though the query planner says the index is being used. What do we do next?

This talk is about a time when we were building personalized recommendations for our customers and ran into high latencies in our systems. We precomputed relevant item recommendations for each user and loaded the results into a table in our OLTP PostgreSQL database so that we could serve them through our APIs. The underlying cause of the high latencies was a slow query on a PostgreSQL table, even though the query was using the index. We will explore what having an index really means for the database, how indexes make queries faster, and when they do not. In our case, it came down to how the data was distributed on the database disk, which in turn forced us to look into how the data was being written to disk. This talk walks through our Apache Spark based recommendation system to show how Spark writes to the database, and the changes we made to our Spark jobs so that the on-disk layout suits our query patterns.
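To make the point about on-disk distribution concrete, here is a toy model of it. This is not code from the talk. It is a minimal sketch that assumes a fixed number of rows per heap page and counts how many distinct pages hold one user's rows under two write orders. The names `ROWS_PER_PAGE` and `pages_touched`, and the user and row counts, are hypothetical values chosen for illustration.

```python
import random

ROWS_PER_PAGE = 100  # hypothetical: rows that fit in one heap page


def pages_touched(row_order, user_id):
    """Count distinct pages holding rows for user_id, given the physical row order."""
    return len({pos // ROWS_PER_PAGE
                for pos, uid in enumerate(row_order)
                if uid == user_id})


# Hypothetical data set: 1,000 users with 50 recommendation rows each.
rows = [uid for uid in range(1000) for _ in range(50)]

clustered = sorted(rows)              # rows written in user_id order
scattered = rows[:]                   # same rows, written in arbitrary order
random.Random(0).shuffle(scattered)

print("clustered:", pages_touched(clustered, 42))
print("scattered:", pages_touched(scattered, 42))
```

In this model an index lookup finds the same 50 rows either way, but when the rows are scattered they sit on many more pages, so the same query has to read far more of the table to return them.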

Date:
2020 February 27 - 11:45
Duration:
40 min
Room:
Grand Victoria 2
Conference:
PGConf India, 2020
Language:
Track:
Application Developer
Difficulty:
Medium

Happening at the same time:

  1. Journey of the Query from SELECT to Result set.
    Start Time:
    2020 February 27 11:45

    Room:
    Robusta + Arabica

  2. 25 Interesting Features of PostgreSQL 12
    Start Time:
    2020 February 27 11:45

    Room:
    Grand Victoria 1