Interview with Tomas Vondra

Talks

BRIN improvements and new opclasses
Wednesday, 12:10
Berlin 1

Could you briefly introduce yourself?

I’m 43, I live in Prague, and I’m a PostgreSQL developer and contributor. I have worked for EDB for a little over 2 years; before that I worked for 2ndQuadrant for a number of years, until it was acquired in 2020.

How do you engage with the PostgreSQL Community?

I engage in multiple ways. The one that takes most of my time is being a committer—so I’m developing features, reviewing stuff, getting it committed, that sort of stuff. This also includes discussions on the mailing lists, etc. Then there are conferences, of course—I attend them, and do give talks once in a while. And I’m also organizing a conference in Prague—2023 will be the 15th year of that event.

Have you enjoyed previous PostgreSQL Europe conferences, either as an attendee or as a speaker? (PGConf.EU, FOSDEM PGDay, Nordic PGDay, pgDay Paris, PGConf.DE)

Sure, I enjoyed them, both as an attendee and a speaker; otherwise I wouldn’t come back. I think the only one of those events I have never attended is pgDay Paris, but I hope to fix that soon. I also used to help with some of the events in the past.

What will your talk be about, exactly? Why this topic?

I’ll be talking about recent improvements in BRIN indexes, and also about some ideas for possible improvements. I think BRIN indexes are not known well enough, and should be more widely known and used. And some of those ideas also seem like a fairly good topic for a first patch, which is the other reason why I’m giving this talk: to help people interested in contributing to PostgreSQL for the first time.

What is the audience for your talk?

I think it’s primarily for users who want (and need) to understand how BRIN indexes work, their strengths and weaknesses, and how PG14 addresses at least some of the issues. The second group that might be interested are people who want to contribute a patch and are looking for ideas to work on.
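For readers new to BRIN, the basic idea behind the classic min/max flavor can be sketched roughly as follows. This is a simplified illustration in Python, not PostgreSQL's actual implementation: the names `RangeSummary`, `build_summaries` and `candidate_ranges` are hypothetical, each block is modeled as a non-empty list of integers, and two blocks per range is an arbitrary choice for the demo.

```python
from dataclasses import dataclass


@dataclass
class RangeSummary:
    first_block: int
    last_block: int
    min_value: int
    max_value: int


def build_summaries(blocks, blocks_per_range):
    """Summarize each run of consecutive blocks by its min and max value."""
    summaries = []
    for start in range(0, len(blocks), blocks_per_range):
        chunk = blocks[start:start + blocks_per_range]
        values = [v for block in chunk for v in block]  # assumes non-empty blocks
        summaries.append(
            RangeSummary(start, start + len(chunk) - 1, min(values), max(values))
        )
    return summaries


def candidate_ranges(summaries, lo, hi):
    """Ranges whose [min, max] overlaps [lo, hi]; all others can be skipped."""
    return [s for s in summaries if s.max_value >= lo and s.min_value <= hi]


# Values stored in physical order that follows their sort order.
sorted_blocks = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
# The same values, scattered across the blocks.
shuffled_blocks = [[1, 12, 5], [9, 2, 7], [3, 10, 6], [11, 4, 8]]

sorted_hits = candidate_ranges(build_summaries(sorted_blocks, 2), 4, 5)
shuffled_hits = candidate_ranges(build_summaries(shuffled_blocks, 2), 4, 5)

print(len(sorted_hits), len(shuffled_hits))
```

With the sorted layout, only one of the two block ranges has to be scanned for values between 4 and 5. With the shuffled layout, every summary covers nearly the whole value domain, so nothing can be skipped. That illustrates both the strength (a tiny index) and the weakness (dependence on physical ordering) of this kind of summary.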

What existing knowledge should the attendee have?

I think it’s enough to have some basic user-level experience with indexes in PostgreSQL—basic knowledge of what BTREE indexes do should be enough. Some experience with BRIN indexes is an advantage, but it’s not a requirement. I’m not going to talk about implementation, but of course if someone is interested in developing a patch in this area, that’d be helpful.

What is the one feature in PostgreSQL 15 which you like most?

Oh, this is really hard. There are so many great features in 15, so I’ll take the liberty of giving you three instead of one:

Allow WAL processing to pre-fetch needed file contents (Thomas Munro)

The recovery in Postgres is single-threaded, which used to be fine but is becoming a bottleneck more and more often. Not because of the CPU (it does only very simple things), but because of the I/O it needs to do on data files, resulting in replication lag (or even making it impossible for the replica to keep up with the primary). Prefetching the data asynchronously will make this a non-issue in most cases.

Allow GROUP BY sorting to optimize column order (Dmitry Dolgov, Teodor Sigaev, Tomas Vondra)

A simple optimization that can significantly improve a lot of GROUP BY queries with multiple columns.

Store cumulative statistics system data in shared memory (Kyotaro Horiguchi, Andres Freund, Melanie Plageman)

I think this is a great improvement over the system we had until now, storing statistics in files on disk. We have improved it over the years to address various performance issues with many databases / objects, but that just made it pretty complicated and difficult to understand. Moving the data into shared memory fixes all of that, which is great.

Which other talk at this year’s conference would you like to see?

Again, impossible to give just one talk, so I’ll give you three:

I’ll definitely go to the keynote by Peter Boncz. I’ve read so many great papers written by him, it’ll be a delight to attend an actual talk.
Administering large scale PostgreSQL installations / Jan Birk seems interesting too. As a developer I’m doing less and less practical work “with” the database, so it’s good to hear what users struggle with.
The two talks about Neon / database branching by Heikki Linnakangas and Nikolay Samokhvalov seem very interesting, and they move the Postgres code in a completely new direction.

Which measure, action, feature or activity would—in your eyes—help to accelerate the adoption of PostgreSQL?

In the past, or now? Looking back, I think the growth in adoption lines up pretty nicely with the introduction of JSONB. True, it’s correlation and not causation, but I think it was a great feature that helped many people to discover and adopt Postgres.

Overall, I think the biggest strength of the Postgres project is the community, and the truly open source and transparent nature of the community and the development process.

For the future, I think it’ll be crucial to continue making it easier to operate in cloud environments, exploiting the elasticity, etc.