Citus Blog

Articles tagged: Postgres

Nate Barbettini

Multi-tenant web apps with ASP.NET Core and Postgres

Written byBy Nate Barbettini | January 22, 2018Jan 22, 2018

When it comes to building large-scale, multi-tenant applications, Microsoft's ASP.NET platform is a strong choice. Like other popular web frameworks such as Express and Django, ASP.NET is used to build web applications and APIs. It's been around for a while, but don't let that fool you: ASP.NET packs some serious muscle. After all, it powers one of the biggest Q&A networks on the web: Stack Exchange!

In the past, ASP.NET apps could only run on Windows servers. That's changed with the latest version, ASP.NET Core, which is fully open source and cross-platform. ASP.NET Core runs anywhere you need it to (Windows, Mac, Linux, Docker) and features a modern middleware pipeline, a rich package ecosystem, and blazing-fast performance.

My experience working on multi-tenant enterprise apps has taught me that it's never too early to design for scale. How you architect your code matters, as does how you architect your data. In the past, the apps I worked on were designed around a database-per-tenant model—unfortunately, the database-per-tenant model didn’t scale and caused problems once our app reached thousands of customers (aka tenants). In this post, I’ll show you a different approach to scale the underlying database with ASP.NET: sharding. With sharding you can leave behind the drawbacks of the database-per-tenant model and can scale infinitely.

In this blog post, I'll show you how to build your multi-tenant app with scale in mind. You'll learn how to use ASP.NET Core's middleware pipeline plus the sharding features of Postgres and Citus to build a scalable multi-tenant application on ASP.NET Core. Along the way we’ll start to build the MVP of our very own StackExchange. Let's get started!

Keep reading
Craig Kerstiens

Database sharding explained in plain English

Written byBy Craig Kerstiens | January 10, 2018Jan 10, 2018

Sharding is one of those database topics that most developers have a distant understanding of, but the details aren't always perfectly clear unless you've implemented sharding yourself. In building the Citus database (our extension to Postgres that shards the underlying database), we've followed a lot of the same principles you'd follow if you were manually sharding Postgres yourself. The main difference of course is that with Citus, we’ve done the heavy lifting to shard Postgres and make it easy to adopt, whereas if you were to shard at the application layer then there’s a good bit of of work needed to re-architect your application.

I've found myself explaining how sharding works to many people over the past year and realized it would be useful (and maybe even interesting) to break it down in plain English.

Keep reading
Craig Kerstiens

PGConf EU: HyperLogLog, Eclipse, and Distributed Postgres

Written byBy Craig Kerstiens | December 11, 2017Dec 11, 2017

We're big fans of Postgres and enjoy getting around to the various community conferences to give talks on relevant topics as well as learn from others. A few months ago we had a good number of Citus team members over at the largest Postgres conference in Europe. Additionally, three of our Citus team members gave talks at the conference. We thought for those of you that couldn't make the conference you might still enjoy getting a glimpse of some of the content. You can browse the full set of talks that were given and slides for them on the PGConf EU website or flip through the presentations from members of the Citus team below.

Keep reading
Sai Krishna Srirampur

Scaling out your Django Multi-tenant App

Written byBy Sai Srirampur | November 14, 2017Nov 14, 2017

There are a number of data architectures you could use when building a multi-tenant app. Some, such as using one database per customer or one schema per customer, have trade-offs when it comes to larger scale. The other option is to build the notion of tenancy directly into the logic of your SaaS application. With django-multitenant and Citus, built-in tenancy becomes much easier to put in place for your application without having to re-invent the wheel yourself.

Our django-multitenant Python library, enables easy scale out of applications that are built on top of Django and follow a multi tenant data model. This Python library has evolved from our experience working with SaaS customers, scaling out their multi-tenant apps.

Keep reading
Craig Kerstiens

Faster bulk loading in Postgres with copy

Written byBy Craig Kerstiens | November 8, 2017Nov 8, 2017

If you've used a relational database, you understand basic INSERT statements. Even if you come from a NoSQL background, you likely grok inserts. Within the Postgres world, there is a utility that is useful for fast bulk ingestion: \copy. Postgres \copy is a mechanism for you to bulk load data in or out of Postgres.

First, lets pause. Do you even need to bulk load data and what's it have to do with Citus? We see customers leverage Citus for a number of different uses. When looking at Citus for a transactional workload, say as the system of record for some kind of multi-tenant SaaS app, your app is mostly performing standard insert/updates/deletes.

But when you're leveraging Citus for real-time analytics, you may already have a separate ingest pipeline. In this case you might be looking at event data, which can be higher in volume than a standard web app workload. If you already have an ingest pipeline that reads off Apache Kafka or Kinesis, you could be a great candidate for bulk ingest.

Back to our feature presentation: Postgres \copy. Copy is interesting because you can achieve much higher throughput than with single row inserts.

Keep reading
Jason Petersen

What it means to be a Postgres extension

Written byBy Jason Petersen | October 25, 2017Oct 25, 2017

Nearly 18 months ago, we open sourced our Citus distributed database and "unforked it" from PostgreSQL by refactoring Citus into a PostgreSQL extension. Seasoned PostgreSQL users likely already know of and use popular PostgreSQL extensions, such as hstore, PostGIS, and pg_stat_statements; however, we realized some of you might appreciate a recap of our journey from fork to extension and beyond.

Keep reading
Craig Kerstiens

Monitoring your bloat in Postgres

Written byBy Craig Kerstiens | October 20, 2017Oct 20, 2017

Postgres under the covers in simplified terms is one giant append only log. When you insert a new record that gets appended, but the same happens for deletes and updates. For a delete a record is just flagged as invisible, it's not actually removed from disk immediately. For updates the old record is flagged as invisible and a new record is written. What then later happens is Postgres comes through and looks at all records that are invisible and actually frees up the disk storage. This process is known as vacuum.

There are a couple of key levels to VACUUM within Postgres:

  • VACUUM ANALYZE - This one is commonly run when you've recently loaded data into your database and want Postgres to update it's statistics about the data
  • VACUUM FULL - This will take a lock during the operation, but will scan the full table and reclaim all the space it can from dead tuples.
Keep reading
Craig Kerstiens

A tour of Postgres Index Types

Written byBy Craig Kerstiens | October 17, 2017Oct 17, 2017

At Citus we spend a lot of time working with customers on data modeling, optimizing queries, and adding indexes to make things snappy. My goal is to be as available for our customers as we need to be, in order to make you successful. Part of that is keeping your Citus cluster well tuned and performant which we take care of for you. Another part is helping you with everything you need to know about Postgres and Citus. After all a healthy and performant database means a fast performing app and who wouldn’t want that. Today we’re going to condense some of the information we’ve shared directly with customers about Postgres indexes.

Postgres has a number of index types, and with each new release seems to come with another new index type. Each of these indexes can be useful, but which one to use depends on 1. the data type and then sometimes 2. the underlying data within the table, and 3. the types of lookups performed. In what follows we'll look at a quick survey of the index types available to you in Postgres and when you should leverage each. Before we dig in, here’s a quick glimpse of the indexes we’ll walk you through:

  • B-Tree
  • Generalized Inverted Index (GIN)
  • Generalized Inverted Seach Tree (GiST)
  • Space partitioned GiST (SP-GiST)
  • Block Range Indexes (BRIN)
  • Hash

Now onto the indexing

Keep reading
Craig Kerstiens

Indexing all the things in Postgres

Written byBy Craig Kerstiens | October 11, 2017Oct 11, 2017

Postgres indexes make your application fast. And while one option is to analyze each of your relational database queries with pg_stat_statements to see where you should add indexes... an alternative fix (and a quick one at that) could be to add indexes to each and every database table—and every column—within your database. To make this easy for you, here's a query you can run that will create the CREATE INDEX commands for every table and column in your Postgres database.

Disclaimer: Adding an index to every column of every table is not a best practice for production databases. And it’s certainly not something we recommend at Citus Data, where we take scaling out Postgres and database performance very seriously. But indexing all the things is a fun “what if” thought exercise that is well worth thinking about, to understand the value of indexes in Postgres. What if you indexed all the things?

With that disclaimer out of the way, onto my advice of how to index all the things as a way to understand your database’s performance.

Keep reading
Samay Sharma

Fermi Estimates On Postgres Performance

Written byBy Samay Sharma | September 29, 2017Sep 29, 2017

A lot of people look to Citus for a solution that scales out their Postgres database, whether on-prem or as open source or in the cloud, as a fully-managed database as a service. And yet, a common question even before looking at Citus is: "what kind of performance can I get with Postgres?" The answer is: it depends. The performance you can expect from single node Postgres comes down to your workload, both on inserts and on the query side and how large that single node is. Unfortunately, "it depends" often leaves people a bit dissatisfied.

Fortunately, there are some fermi estimates, or in laymans terms ballpark, of what performance single node Postgres can deliver. These ballparks apply both to single-node Postgres, but from there you can start to get estimates of how much further you can go when scaling out with Citus. Let's walk through a simplified guide for what you should expect in terms of the read performance and ingest performance for queries in Postgres.

Keep reading

Page 12 of 16