Presented by:

792f72b59c0d05f18afc47fff7c7edce

Robins Tharakan

from Amazon Web Services

I have been with Amazon Web Services for the past 6 years and currently work as a Database Engineer with the RDS Postgres team. Before this, I was a Postgres DBA managing a multi-terabyte database and prior to that, a web developer mostly working on the PHP / ASP platform.

I have been using Postgres since 8.3 days when I first reviewed Postgres & MySQL as a replacement for an ageing MsAccess database. Prior to that, I’ve been a linux user since mid ‘98 dabbling with open-source projects along the way.

My current work involves working with the RDS Postgres team to launch new features, such as FDWs (recently mysql_fdw/tds_fdw), extensions (recently pg_bigm), generic features and (pertinent here) bug-finding in Postgres core and some popular Postgres extensions. Over the last few years, working with RDS customers has also led me to submit patches upstream to assist Postgres users in general.

Checkout the slides

Bug-hunting in the postgres ecosystem

Duration: It would be good to have at least 45min to go into some depth. More would help in answering queries, if any.

Brief

The talk would focus on my experience using SQLSmith (and other tools) exploring bugs in the postgres ecosystem using a test-framework that I've been developing / using. The talk tries to cover some learnings, best-practices & (in general) pitfalls of fuzz-testing with Postgres. The idea is to present a basic construct of the test-framework used and help a database user / developer recreate a similar test-setup within their organization.

Experience on the subject

To date the test setup has resulted in contribution to 5 postgres threads, with 2 bugs getting fixed in Postgres core. Most of the other bugs reported involved popular postgres extensions, such as PostGIS, pg_repack, plprofiler, mysql_fdw, rdkit, orafce etc. Considering that most of these bugs had easy to reproduce engine-crash scenarios, most were taken up promptly and now ~20 bugs are fixed upstream. Similarly, this has also been helpful in finding bugs in multiple postgres based software used within the organization.

A few highlights of this exercise:

  • SQLSmith / SQLReduce and what they can / can't do while finding bugs
  • Touch briefly on how to prepare a basic SQLSmith setup
  • Elaborate the test-setup I use
    • Testing all branches
    • Checking new commits frequently
    • Autotuned workload
    • Auto-generate crash reports
    • 1.3 Billion queries and counting
    • SQLSmith's short-comings
  • Demo
    • How the tests run on a regular day
    • Sample a crash + see how a crash-report looks like

Reference - Short list of bug-reports that came out of this project. (Listing only 1 bug-report per category):

  1. postgres - FIXED: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=03361a368e7bf909283cc7721af004317fdabd3d
  2. postgres - FIXED: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ae4fc52ae24785f51e3dc5a4c7e5b4d206356bdc
  3. postgres - FIXED: https://www.postgresql.org/message-id/flat/c0c149af17b358883e95a49984a30360b65f8118.camel%40j-davis.com#c8e75b9dcb6a58d13332a194f90ae6ce
  4. postgres - REPORTED: https://www.postgresql.org/message-id/flat/CAEP4nAzwokLht2Bi3RCHNcCJBdOa8d7LnqJkDO1xiaH%2BbiW8fw%40mail.gmail.com
  5. pg_repack: https://github.com/reorg/pg_repack/issues/321
  6. mysql_fdw: https://github.com/EnterpriseDB/mysql_fdw/issues/226
  7. rdkit: https://github.com/rdkit/rdkit/issues/3687
  8. postgis: https://trac.osgeo.org/postgis/ticket/4770
  9. pg_similarity: https://github.com/eulerto/pg_similarity/issues/28
  10. plprofiler: https://github.com/bigsql/plprofiler/issues/2

Date:
2023 February 24 - 12:00
Duration:
45 min
Room:
Grand Victoria 2
Conference:
PGConf India, 2023
Language:
Track:
Application Developer
Difficulty:
Medium

Happening at the same time:

  1. Rare but extremely challenging Postgres Performance Problems. How to diagnose and overcome
  2. Start Time:
    2023 February 24 12:00

    Room:
    Grand Victoria 1

  3. Migrate logical replication during physical cluster switchover
  4. Start Time:
    2023 February 24 12:00

    Room:
    Robusta + Arabica