Interview with Kshitij Purwar
- What happens when stack overflow doesn’t have an answer? comparing ST_within & H3 for spatial queries
- Thursday, 15:40
- Berlin 2+3
- Could you briefly introduce yourself?
I am Kshitij Purwar, Founder and CTO of Blue Sky Analytics, a Climate-Tech start-up using satellite data, cloud computing, open source technology and AI to build environmental monitoring and climate-risk assessment products. We have a fully remote team distributed across India, the Netherlands & the US!
I am a college dropout & self-taught developer with over nine years of software development experience. I began working professionally in the first semester of college & joined an early-stage tech startup as a core engineer after dropping out in the third semester.
In 2018, along with my elder sister Abhilasha, I founded Blue Sky Analytics to help fight climate change with data. At Blue Sky, I lead a team of young developers and data scientists who analyze terabytes of satellite data to deliver sophisticated environmental datasets and build “SpaceTime”, a data visualisation platform to support open source & collaboration for Climate Action.
- How do you engage with the PostgreSQL Community?
I am fairly new to the PostgreSQL community. I have been working with PostgreSQL for only the last couple of years, primarily as a consumer, and I mostly engage on Twitter & Stack Overflow as a silent observer.
I learnt a lot from the good folks at TimescaleDB in their Slack community, as we were among the earliest to put it in production, especially in combination with PostGIS to run spatio-temporal queries.
I am looking forward to making more contributions to the community by publishing a few blogs/tutorials in the coming months.
- Have you enjoyed previous PostgreSQL Europe conferences, either as an attendee or as a speaker? (PGConf.EU, FOSDEM PGDay, Nordic PGDay, pgDay Paris, PGConf.DE)
No, this will be my first PostgreSQL Europe conference, as I moved to the Netherlands only a few months ago.
- What will your talk be about, exactly? Why this topic?
My talk is about how we solved our issues of spatial joins across a large amount of geospatial data using H3 indexing.
We deal with large geospatial datasets and run many spatio-temporal queries over them; the spatial join (point-in-polygon query) was one of the bottlenecks we encountered in many use cases. Some PostGIS functions like ST_Within solve this problem for us, but they are very slow when there are millions of points & the shapes are complex. My talk is about how to optimise these specific kinds of queries using H3.
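As a rough illustration of the general idea (not the implementation from the talk), the sketch below replaces an exact geometric test per point with a lookup on precomputed cell ids. It uses a plain square grid as a stand-in for H3 hexagons, and every name in it (`cell_of`, `cover_cells`, `CELL_SIZE`) is hypothetical. Like any cell-based cover, the result is approximate near the polygon boundary.

```python
from math import floor

CELL_SIZE = 1.0  # hypothetical cell size; H3 would use a hexagon resolution instead


def cell_of(x, y, size=CELL_SIZE):
    """Map a point to the id of the square grid cell that contains it."""
    return (floor(x / size), floor(y / size))


def point_in_polygon(x, y, polygon):
    """Exact ray-casting test: the expensive check we want to avoid per point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def cover_cells(polygon, size=CELL_SIZE):
    """Precompute the cells whose centre lies inside the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    cells = set()
    for i in range(floor(min(xs) / size), floor(max(xs) / size) + 1):
        for j in range(floor(min(ys) / size), floor(max(ys) / size) + 1):
            centre_x, centre_y = (i + 0.5) * size, (j + 0.5) * size
            if point_in_polygon(centre_x, centre_y, polygon):
                cells.add((i, j))
    return cells


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
points = [(1.2, 3.4), (5.0, 1.0), (-0.5, 2.0)]

# The polygon is covered once; each point then needs only a set lookup.
polygon_cells = cover_cells(square)
matches = [p for p in points if cell_of(*p) in polygon_cells]
```

In a database, the same pattern would amount to storing a cell id next to each point and joining on equality against the polygon's precomputed cells, which an ordinary index can serve.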
- What is the audience for your talk?
Anybody handling large amounts of geospatial data with an interest in analyzing and optimising point-in-polygon queries.
- What existing knowledge should the attendee have?
Basic knowledge of PostgreSQL, PostGIS (i.e. spatial joins) & H3 is enough to understand the talk. I’ll be giving a 2-minute crash course on each anyway!
- What is the one feature in PostgreSQL 15 which you like most?
Truth be told, we are yet to upgrade to version 15, but my team & I have been really excited about the latest release.
I really like the server-side compression & client-side decompression feature in pg_basebackup, which moves the native backup and restore functionality of PostgreSQL 15 in a more efficient and robust direction. For our larger databases, server-side compression makes a lot of difference.
But the in-memory and on-disk sorting performance improvements are amazing as well since they affect a lot of our queries and give us out of the box “just works” improvements.
- Which other talk at this year’s conference would you like to see?
Lots of interesting topics, but here are my top 4 (I hope the timings don’t clash 🤞)
- PostgreSQL at GitLab.com, always wondered how things worked in Big Companies
- Hands-on Benchmarking, you can’t solve what you can’t measure
- A comparison of PostgreSQL backup tools, I am glad someone tried them all & compared so others don’t have to
- Performance tips you have never seen before, you can never be fast enough!
- Which measure, action, feature or activity would—in your eyes—help to accelerate the adoption of PostgreSQL?
Making it more beginner-friendly & less scary. I have seen people jump to NoSQL for SQL-suited use cases because of the ease & simplicity around NoSQL DBs like MongoDB.
I wish the PostgreSQL documentation had an “Explain like I am 5” version!