Citus Blog

Thoughts about the Citus database—as well as PostgreSQL, sharding, distributed databases, and other open source extensions to Postgres.

With so many Postgres conferences coming up soon, it seemed fitting to share some highlights from a past episode of the Path To Citus Con podcast about why to give talks at Postgres conferences. This episode was recorded back in May 2023 and shares an hourlong conversation between two wonderful Postgres engineers, Álvaro Herrera and Boriss Mejías, along with my co-host Pino de Candia and me.

The guests both have deep roots in the community: Álvaro as a Postgres committer and Boriss as a frequent conference speaker as well as the organizer for the PgBE PostgreSQL User Group Belgium. And they have known each other for decades, since university days. As much as Álvaro and Boriss have in common, it’s interesting to hear them talk about their totally different approaches to giving talks at conferences.

There’s also a point in the podcast where we explore whether it helps to be an introvert, or an extrovert, when it comes to giving conference talks. And how speaking at conferences can make it easier to meet people… after you’ve given a talk, people will often walk up to you and say “hey I saw your talk, I want to ask you about <insert PG topic here>”.

Keep reading

The latest episode of Path To Citus Con—the monthly podcast for developers who love Postgres—is now out. This episode featured guests Paul Ramsey and Regina Obe on the topic “Why people care about PostGIS and Postgres”.

The conversation was all about PostGIS, a geospatial extension to Postgres which just happens to be one of the most popular Postgres extensions. This episode was fairly technical, but still fascinating. The discussion ranged all the way from Cartesian math at one point to how it’s very difficult to construct a database these days without a location component. This episode of Path To Citus Con focuses on the geospatial world of Postgres and shows how “where” is one of the fundamental things we all want to know about.

In this post, you’ll get a bit of backstory on the topic and the guests (both with a long history with PostGIS) of this episode of Path To Citus Con; and you’ll get a peek at key moments from this show, including the extensibility of Postgres demonstrated by PostGIS, “where” as the universal foreign key, and more. At the end of the post, you’ll find links to where you can listen to this and every episode of the podcast. We hope you love these “human side of Postgres” podcast episodes.

Keep reading

What’s new with Postgres at Microsoft (August 2023)

Written by Claire Giordano | August 31, 2023

On one of the Postgres community chat forums, a friend asked me: "Is there a blog post that outlines all the work that is being done on Postgres at Microsoft? It's hard to keep track these days."

And my friend is right: it is hard to keep track. Probably because there are multiple Postgres workstreams at Microsoft, spread across a few different teams.

In this post, you'll get a bird's eye view of all the Postgres work the Microsoft team has done over the last year. Our work includes some pretty significant improvements to the Postgres managed services on Azure, as well as contributions across the entire open source ecosystem, including commits to the Postgres core; new releases of Postgres open source extensions like Citus and pg_cron; plus ecosystem work on Patroni, PgBouncer, and pgcopydb. And more.

Keep reading

The latest episode of Path To Citus Con—the monthly podcast for developers who love Postgres—is now out. This 6th episode featured guests Chelsea Dole and Floor Drees on the topic “You’re probably already using Postgres: What you need to know”.

The conversation explored the app developer perspective on Postgres. Many of you app developers are already using Postgres, but perhaps on top of an ORM. If you want to optimize your application, if you want to better understand the underlying database: what do you need to know? And how do you remove the fear? This episode of Path To Citus Con, our new podcast for developers who love Postgres, focuses on opportunities for building your knowledge in the database internals space—whether you want to go breadth-first or depth-first.

In this post, you’ll get a bit of backstory on the topic and the guests of Episode 06 of Path To Citus Con; and you’ll get a peek at a few interesting moments from the show. At the end of the post, you’ll find all the links to where you can listen to this and every episode of the podcast. Some people love these “human side of Postgres” podcast episodes, and hopefully you will, too.

Keep reading

The topic of this month's PGSQL Phriday #011 community blogging event is partitioning vs. sharding in PostgreSQL. It seemed right to share a perspective on the question of "partitioning vs. sharding" from someone in the Citus open source team, since we eat, sleep, and breathe sharding for Postgres.

Postgres built-in "native" partitioning—and sharding via PG extensions like Citus—are both tools to grow your Postgres database, scale your application, and improve your application's performance.

What is partitioning and what is sharding? In Postgres, database partitioning and sharding are techniques for splitting collections of data into smaller sets, so the database only needs to process smaller chunks of data at a time. And as you might imagine, work gets done faster when you're processing less data.
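To make the "smaller chunks" idea concrete, here is a minimal Python sketch. It is an illustration only, not how Postgres or Citus implements partitioning or sharding: the function names, the `tenant` column, the CRC32 hash, and the shard count of 4 are all assumptions chosen for the example.

```python
import zlib
from collections import defaultdict


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard number with a stable hash (illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def build_shards(rows, key_column, shard_count):
    """Split one big collection of rows into smaller per-shard lists."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row[key_column], shard_count)].append(row)
    return shards


def lookup(shards, key_column, key, shard_count):
    """Scan only the single shard that can contain the key."""
    candidates = shards.get(shard_for(key, shard_count), [])
    return [row for row in candidates if row[key_column] == key]


# 1000 hypothetical rows spread over 10 tenants.
rows = [{"tenant": f"tenant-{i % 10}", "value": i} for i in range(1000)]
shards = build_shards(rows, "tenant", shard_count=4)
matches = lookup(shards, "tenant", "tenant-3", shard_count=4)
```

The lookup touches one of the four lists instead of all 1000 rows, which is the intuition behind "work gets done faster when you're processing less data".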

In this post, you'll learn what partitioning and sharding are, why they matter, and when to use them. The table of contents:

Keep reading

Schema-based sharding comes to PostgreSQL with Citus

Written by Onur Tirtir | July 31, 2023

Citus, a database scaling extension for PostgreSQL, is known for its ability to shard data tables and efficiently distribute workloads across multiple nodes. With Citus 12.0, Citus introduces a very exciting feature called schema-based sharding. The new schema-based sharding feature gives you a choice of how to distribute your data across a cluster, and for some data models (think: multi-tenant apps, microservices, etc.) this schema-based sharding approach may be significantly easier!

In this blog post, we will take a deep dive into the new schema-based sharding feature, and you will learn:

Keep reading

The Postgres community released a new feature in Postgres 15, MERGE, that modifies rows in a target table using data from a source. MERGE provides a single SQL statement that can conditionally INSERT, UPDATE, or DELETE rows, a task that would otherwise require multiple procedural language statements or workarounds such as INSERT with an ON CONFLICT clause.
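The conditional insert, update, or delete behavior can be sketched in a few lines of Python. This is a rough model of the semantics only, not Postgres code: the `merge` function, the dictionary-based "tables", and the stock-quantity rules are hypothetical and chosen purely for illustration.

```python
def merge(target, source):
    """Apply source rows to target in one pass, loosely mimicking MERGE.

    target: dict mapping item -> quantity (stands in for the target table)
    source: dict mapping item -> quantity change (stands in for the source)
    """
    for key, delta in source.items():
        if key in target:                 # roughly: WHEN MATCHED
            new_qty = target[key] + delta
            if new_qty <= 0:
                del target[key]           # ... THEN DELETE
            else:
                target[key] = new_qty     # ... THEN UPDATE
        elif delta > 0:                   # roughly: WHEN NOT MATCHED
            target[key] = delta           # ... THEN INSERT
    return target


stock = {"apple": 5, "pear": 2}
changes = {"apple": -5, "pear": 3, "plum": 7}
result = merge(stock, changes)
# result == {"pear": 5, "plum": 7}
```

Each source row triggers exactly one action on the target, which is the part that would otherwise need several separate statements.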

In this blog post, you will get a high-level overview of how Postgres MERGE works. The post then delves into some practical use cases and elaborates on the different strategies Citus employs for handling MERGE in a distributed environment.

Keep reading

Citus 12: Schema-based sharding for PostgreSQL

Written by Marco Slot | July 18, 2023

What if you could automatically shard your PostgreSQL database across any number of servers and get industry-leading performance at scale without any special data modelling steps?

Our latest Citus open source release, Citus 12, adds a new and easy way to transparently scale your Postgres database: Schema-based sharding, where the database is transparently sharded by schema name.

Schema-based sharding gives an easy path for scaling out several important classes of applications that can divide their data across schemas:

  • Multi-tenant SaaS applications
  • Microservices that use the same database
  • Vertical partitioning by groups of tables

Each of these scenarios can now be enabled on Citus using regular CREATE SCHEMA commands. That way, many existing applications and libraries (e.g. django-tenants) can scale out without any changes, and developing new applications can be much easier. Moreover, you keep all the other benefits of Citus, including distributed transactions, reference tables, rebalancing, and more.
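As a rough mental model of sharding by schema name, the Python sketch below keeps every table of a schema on the same node and routes by the schema part of a qualified table name. Everything here is an assumption made for illustration: the `Cluster` class, the node names, and especially the least-loaded placement rule are hypothetical and do not describe how Citus actually places schemas.

```python
class Cluster:
    """Toy model: each schema lives as a whole on exactly one node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.schema_to_node = {}

    def create_schema(self, schema):
        # Hypothetical placement rule: pick the node that currently holds
        # the fewest schemas (ties go to the earlier node in the list).
        if schema not in self.schema_to_node:
            load = {node: 0 for node in self.nodes}
            for node in self.schema_to_node.values():
                load[node] += 1
            self.schema_to_node[schema] = min(self.nodes, key=lambda n: load[n])
        return self.schema_to_node[schema]

    def route(self, qualified_table):
        # "tenant_1.orders" -> schema "tenant_1" -> the node holding that schema
        schema, _, _table = qualified_table.partition(".")
        return self.schema_to_node[schema]


cluster = Cluster(["node-a", "node-b"])
for tenant in ["tenant_1", "tenant_2", "tenant_3"]:
    cluster.create_schema(tenant)
```

In this model, `cluster.route("tenant_1.orders")` and `cluster.route("tenant_1.customers")` return the same node, so queries and joins within one tenant's schema stay on a single node.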

Keep reading

Introducing Path To Citus Con, a podcast for developers who love Postgres. Why? Because sometimes, something you build gets bigger than you thought it would. The monthly podcast Path To Citus Con (renamed to Talking Postgres in July 2024) was originally meant to be a “pre-event” to build excitement and give a hands-on experience for people who would be attending Citus Con: An Event for Postgres. The audience would get a chance to talk to speakers for the conference and hear a deep dive conversation.

It’s now its own monthly podcast with guests from around the world. Guests have included people deep in the world of databases and the Citus database extension to Postgres, as well as people from the Postgres community and technology more generally. It’s the human side of open source, PostgreSQL, and the many PG extensions (including Citus).

In this blog post, you’ll learn what Path To Citus Con is; how you can participate in, listen to, and read each episode; and about episodes like “Working in public on open source,” “Why giving talks at Postgres conferences matters,” and more (details below).

Keep reading

Distributed PostgreSQL has become a hot topic. Several distributed database vendors have added support for the PostgreSQL protocol as a convenient way to gain access to the PostgreSQL ecosystem. Others (like us) have built a distributed database on top of PostgreSQL itself.

For the Citus database team, distributed PostgreSQL is primarily about achieving high performance at scale. The unique thing about Citus, the technology powering Azure Cosmos DB for PostgreSQL, is that it is fully implemented as an open-source extension to PostgreSQL. It also leans entirely on PostgreSQL for storage, indexing, low-level query planning and execution, and various performance features. As such, Citus inherits the performance characteristics of a single PostgreSQL server but applies them at scale.

That all sounds good in theory, but to see whether this holds up in practice, you need benchmark numbers. We therefore asked GigaOM to run performance benchmarks comparing Azure Cosmos DB for PostgreSQL to other distributed implementations. GigaOM compared the transaction performance and price-performance of these popular managed services of distributed PostgreSQL, using the HammerDB benchmark software:

Keep reading
