Interview with Nikolay Samokhvalov

Talks

Database branching to scale and speed up application development and performance optimization
Wednesday, 17:20
Zurich

Social Media

Twitter: samokhvalov
LinkedIn: samokhvalov
Website: https://postgres.ai/
GitHub: NikolayS
GitLab: NikolayS

Could you briefly introduce yourself?

My name is Nikolay, and I’ve been a Postgresoholic since 2005.

I have two things that I hate – guessing and waiting. The former means I fight the “guessing” approach to development and operations involving Postgres, and rely on experiments for every decision. The latter is about wait time – when we need a clone for our database experiments and the database size is not trivial, we might need to wait for hours, sometimes days. This waiting is unacceptable.

Guessing and waiting often play together in the work with large databases. That’s why we need to get rid of them together. For that, I founded a company – Postgres.ai – that helps Postgres users eliminate wait time when a fresh copy of a large Postgres database is needed for various kinds of experiments, to eliminate the guesswork – in development, testing, and various kinds of troubleshooting. So every engineer can get their own independent database clone in just a few seconds. Full-size copy. And with zero extra dollars spent.

Fast database cloning and database branching transform development processes, leveling them up. I like to say that we, at Postgres.ai, optimize various processes around Postgres, including testing in CI/CD, and Postgres query optimization itself. We optimize the optimization process, yes.

How do you engage with the PostgreSQL Community?

Besides talking all things Postgres on Twitter every day, I’m involved in two projects:

Postgres.tv with Ilya Kosmodemyansky, DataEgret – a YouTube channel running various online events, in which many famous Postgres people have participated. Our new idea is to re-do interesting talks from Postgres conferences that do not record talks, because good talks should be open and available to everyone who needs them. Not everyone has the opportunity to attend conferences. We call this series “Open Talks” – and since this very conference is not recording talks again, I’m inviting everyone to re-do their talks with us.
Postgres.fm with Michael Christofides, pgMustard – a podcast discussing various Postgres topics every week.

Both these activities are just starting, so we need your support – please subscribe and participate.

Have you enjoyed previous PostgreSQL Europe conferences, either as an attendee or as a speaker? (PGConf.EU, FOSDEM PGDay, Nordic PGDay, pgDay Paris, PGConf.DE)

Not yet!

What will your talk be about, exactly? Why this topic?

Database branching is a very modern topic. As I already mentioned, I truly believe that during development, testing, and troubleshooting, engineers need their own clones to get their work done fast and with good quality, without disturbing others.

But branching is more than just cloning: if we consider Git, for example, Git branches have knowledge about hierarchy, we can see the diff between branches and merge our changes from development branches to the master/main branch.

Observing that many companies – such as Neon, OrioleDB, Supabase, PlanetScale – started to talk (and sometimes work) in this area recently, and understanding that the “database branching” term is not well established yet, I decided to shift our efforts in this direction.

So, I’m talking about what database branching is, in my vision, how it differs from cloning, and what the best developer experience is that we, as companies developing tools for engineers, should theoretically provide. Of course, I’ll cover our work at Postgres.ai as well.
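
To make the cloning-versus-branching distinction above a little more concrete, here is a minimal toy sketch in Python. It is purely illustrative and is not how Database Lab Engine or any other product works: the `Branch` class, its methods, and the key names are all hypothetical, and a plain dictionary stands in for database rows. It only shows the three properties mentioned in the answer: a branch knows its parent, it can report a diff against that parent, and its changes can be merged back.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class Branch:
    """Toy copy-on-write branch (hypothetical): stores only its own changes.

    Simplification: the parent is assumed not to change while a child exists.
    """
    name: str
    parent: Optional["Branch"] = None
    changes: Dict[str, Any] = field(default_factory=dict)

    def get(self, key: str) -> Any:
        # Walk up the hierarchy until some ancestor has a value for the key.
        node: Optional[Branch] = self
        while node is not None:
            if key in node.changes:
                return node.changes[key]
            node = node.parent
        return None

    def put(self, key: str, value: Any) -> None:
        # Writes stay local to this branch; the parent is untouched.
        self.changes[key] = value

    def branch(self, name: str) -> "Branch":
        # Creating a branch copies no data; it only records the parent.
        return Branch(name=name, parent=self)

    def diff(self) -> Dict[str, Tuple[Any, Any]]:
        # Map each changed key to (value in parent, value in this branch).
        if self.parent is None:
            return {}
        return {
            key: (self.parent.get(key), value)
            for key, value in self.changes.items()
            if self.parent.get(key) != value
        }

    def merge_into_parent(self) -> None:
        # Apply this branch's changes to the parent, then clear them locally.
        if self.parent is None:
            raise ValueError("root branch has no parent to merge into")
        self.parent.changes.update(self.changes)
        self.changes.clear()


main = Branch("main")
main.put("users:1", "alice")

dev = main.branch("dev")           # instant: no data is copied
dev.put("users:2", "bob")          # visible in dev only
changed_keys = sorted(dev.diff())  # keys that differ from main
dev.merge_into_parent()            # main now sees users:2
```

A plain clone would correspond to a `Branch` without the `diff` and `merge_into_parent` methods: an independent copy with no notion of where it came from.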

What is the audience for your talk?

Any engineer who works with Postgres. I’m sure database branching is something that will be used by literally everyone in the future. We cannot imagine working in teams without Git branching today. Well, even if you’re alone, you probably still prefer using Git branches, right? We do need a similar thing for databases. Postgres.ai’s Database Lab Engine makes database branching available already today, without any need to change Postgres on production.

Come join me at the conference (or watch the recording on YouTube later – as I said, I plan to record it), and learn how database branching is going to transform the tech industry, how it is already transforming it, and how to benefit from it in your systems and processes.

What existing knowledge should the attendee have?

It’s intended for a very wide audience, so I’d say any engineer who has worked with Postgres and Git at least a little bit will understand the material.

What is the one feature in PostgreSQL 15 which you like most?

Let me mention two of them – at first glance, both look quite small, but I consider them both important:

A new extension, pg_walinspect, allows seeing the content of WALs using SQL – in addition to good old pg_waldump which is a console utility. I see several good uses for this module in the future.
log_checkpoints is now “on” by default. Oh yes. Please continue. Many defaults need to be reconsidered.

Which other talk at this year’s conference would you like to see?

BRIN improvements and new opclasses – a few years ago, when BRIN was a new thing, it didn’t look attractive to me: all my attempts to use BRIN indexes were unsuccessful, and btree indexes always won the battle. But I’ve heard that many things have changed since then and many improvements have been made, and I want to learn more.
PostgreSQL at GitLab.com – should be interesting, GitLab is running quite large and heavily loaded Postgres databases (disclaimer: GitLab is a Postgres.ai’s client)
Timescale Cloud: Scale further, build faster, and stay under budget – more and more companies build their own Postgres clouds and I do want to learn more about challenges on this path

(all 3 talks above will be running in parallel, unfortunately, so to attend in person I’ll need to choose)

Performance tips you have never seen before – Hans-Jürgen is a champion in giving brief, concise, and very practical pieces of Postgres advice, I’m a big fan of Cybertec’s blog posts, so I have very good expectations here
Understanding Postgres HOT Updates plus using Prometheus and Grafana to track and tune issues – HOT updates just got documented in the official Postgres docs, with the PG15 release (finally), and I think this feature is underappreciated by DBAs/DBREs, so I’d like to learn about the methodology for increasing the chances of having more HOT updates.

(again, conflict, again will need to choose)

Postgres community panel: Upgradability – Postgres major upgrade is a tough topic, and there are many challenges DBAs/DBREs are facing. Going to attend, also because I was invited as a participant
Cloud friendly COPY – interested because of the speaker, first of all. Hannu’s talk “Do you vacuum every day?” presented at Postgres.tv was amazing and extremely helpful. COPY is an important command, for our (Postgres.ai’s) work too – in many cases, we build non-production environments using logical copying of data.
Party tricks for PostgreSQL: perf, ftrace and bpftrace – always interested in learning more details about performance analysis tooling
How to handle 1000 application users – should be interesting, we all know that a connection pooler is a must for large installations, but there are many cases when people tend to use a lot of direct connections to Postgres
plProfiler: where is my PL/pgSQL code spending time? – maybe, one day, PLpgSQL will be more appreciated and developing using it will become as convenient as writing code in Go, Ruby, or Python. I have good expectations here because companies like Hasura or Supabase should eventually promote writing more Postgres-side code, including complex code using PLpgSQL.
Neon, cloud-native storage backend for PostgreSQL and Why we built Neon – interested also because Neon, as I learned from talking to its founders, is considering moving the focus of development to database branching and testing in CI/CD, which is, as I explained above, among my key interests.
How do you put an elephant in a container in 3 steps? – the main open source tool we (Postgres.ai) develop—Database Lab Engine—uses containers in non-production environments (of course), so I’m interested in all the details associated with running Postgres in containers
Google AlloyDB vs. Amazon Aurora vs. Azure Hyperscale: comparison of databases build for clouds – an interesting competition between the largest cloud providers building new Postgres-based systems, good for understanding how the SQL database landscape is changing
Practical transactions theory for PostgreSQL users – Ilya, my co-host on Postgres.tv – can explain hard problems so everyone can understand. Transaction processing is at the heart of Postgres, so I expect new good material from Ilya here.
Table Partitioning - Transparent but No Magic – partitioning is a must if you have (or plan to have) 100+ GiB tables. There is a lack of good materials in this area, and Boriss is among those who can create one.

Which measure, action, feature or activity would—in your eyes—help to accelerate the adoption of PostgreSQL?

I don’t see any problems with current adoption and the current speed of its growth – for example, Postgres is already the #1 DBMS for startups.

But if Postgres development continues to be based on mailing lists and file attachments, which is very unfriendly to new contributors, this can become a problem in the future, affecting adoption. Postgres hackers already use CI/CD tools; it’s time to use something like GitLab or GitHub, to use the full power of Git branches, to have code discussions and reviews in a structured way (MRs/PRs), to track statuses in a reliable way, and finally to make contributing to Postgres convenient for more people. Otherwise, one day Postgres might start losing to one of the new DBMSs that are built on top of it or, even worse, to a completely fresh player.