Citus Blog

Thoughts about the Citus database—as well as PostgreSQL, sharding, distributed databases, and other open source extensions to Postgres.


Yubikeys and U2F make two-factor authentication easier

By Craig Kerstiens | February 1, 2017

We're excited to announce FIDO U2F (Yubikey) support for Citus Cloud, making it even easier to keep your account and data secure. Within the Account Security section of the Citus Cloud Console you'll now see a section to add your new device. If you already have a U2F device, click Register New Device and you'll be prompted to activate it. Then you're done.

If you already have a Yubikey then you know all the benefits it brings. However, when testing we found that many of our customers were unaware of these benefits or weren't using one already. We felt it would be worth spending some time explaining why they're great, as well as creating a few guides for how to set them up on the most common services you may be using.


Getting started with GitHub event data on Citus

By Craig Kerstiens | January 27, 2017

Getting an example schema and data is often one of the more time-consuming parts of testing a database. To make that easier for you, we're going to walk through Citus with an open data set which almost any developer can relate to: GitHub event data. If you already have your own schema, data, and queries you want to test with, by all means use them. If you need any help with getting set up, join us in our Slack channel and we'll be happy to talk through different data modeling options for your own data.

An overview of the schema and queries

The data model we're going to work with here is simple: we have users and events. An event can be a fork or a commit related to an organization, among many other types.


Postgres Parallel indexing in Citus

By Marco Slot | January 17, 2017

Indexes are an essential tool for optimizing database performance and are becoming ever more important with big data. However, as the volume of data increases, index maintenance often becomes a write bottleneck, especially for advanced index types which use a lot of CPU time for every row that gets written. Index creation may also become prohibitively expensive, as it may take hours or even days to build a new index on terabytes of data in Postgres. As of Citus 6.0, we’ve made creating and maintaining indexes much faster through parallelization.


Scale Out Multi-Tenant Apps based on Ruby on Rails

By Lukas Fittl | January 5, 2017

Today we’re happy to announce our new activerecord-multi-tenant Ruby library, which enables easy scale-out of applications that are built on top of Ruby on Rails and follow a multi-tenant data model.

This Ruby library has evolved from our experience working with customers, scaling out their multi-tenant apps, and patching some restrictions that ActiveRecord and Rails currently have when it comes to automatic query building. It is based on the excellent acts_as_tenant library, and extends it for the particular use case of a distributed multi-tenant database like Citus.


Lessons learned from Postgres schema sharding

By Craig Kerstiens | December 18, 2016

We talk with a number of Postgres users each week who are looking to scale out their database. First, we would never recommend scaling out until you truly have to; it’s always easier to scale your database up rather than out. It’s often not until you have more than 100 GB of data that you need to think about sharding.

When you want to scale out though, you want it to be simple. For scaling a multi-tenant database, there are three common approaches:


Citus' Replication Model: Today and Tomorrow

By Ozgun Erdogan | December 15, 2016

Citus is a distributed database that extends (not forks) PostgreSQL. Citus does this by transparently sharding database tables across the cluster and replicating those shards.

After open sourcing Citus, one question that we frequently heard from users related to how Citus replicated data and automated node failovers. In this blog post, we intend to cover the two replication models available in Citus: statement-based and streaming replication. We also plan to describe how these models evolved over time for different use cases.


Real-time event aggregation at scale using Postgres w/ Citus

By Marco Slot | November 29, 2016

Citus is commonly used to scale out event data pipelines on top of PostgreSQL. Its ability to transparently shard data and parallelise queries over many machines makes it possible to have real-time responsiveness even with terabytes of data. Users with very high data volumes often store pre-aggregated data to avoid the cost of processing raw data at run-time. With Citus 6.0 this type of workflow became even easier using a new feature that enables pre-aggregation inside the database in a massively parallel fashion using standard SQL. For large datasets, querying pre-computed aggregation tables can be orders of magnitude faster than querying the facts table on demand.
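To make the idea of pre-aggregation concrete, here is a minimal sketch of a rollup built with a standard SQL `INSERT ... SELECT`. It runs against an in-memory SQLite database purely so that it is self-contained; it is not Citus-specific and does not show the parallel execution described above. The table names (`events`, `events_by_minute`), the columns, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical raw events: (page_id, minute bucket, latency in ms).
RAW_EVENTS = [
    (1, "2016-11-29 10:00", 120),
    (1, "2016-11-29 10:00", 80),
    (1, "2016-11-29 10:01", 100),
    (2, "2016-11-29 10:00", 50),
]


def build_rollup(conn):
    """Create the raw and rollup tables, then aggregate with INSERT ... SELECT."""
    conn.execute(
        "CREATE TABLE events (page_id INTEGER, minute TEXT, latency_ms INTEGER)"
    )
    conn.execute(
        "CREATE TABLE events_by_minute ("
        "page_id INTEGER, minute TEXT, "
        "event_count INTEGER, total_latency_ms INTEGER, "
        "PRIMARY KEY (page_id, minute))"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", RAW_EVENTS)
    # The pre-aggregation step: one row per (page, minute) instead of one per event.
    conn.execute(
        "INSERT INTO events_by_minute "
        "SELECT page_id, minute, COUNT(*), SUM(latency_ms) "
        "FROM events GROUP BY page_id, minute"
    )
    conn.commit()


def rollup_rows(conn):
    """Read the pre-computed aggregates instead of scanning the raw events."""
    return conn.execute(
        "SELECT page_id, minute, event_count, total_latency_ms "
        "FROM events_by_minute ORDER BY page_id, minute"
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_rollup(conn)
```

Queries that only need per-minute counts or totals can then read `events_by_minute` through `rollup_rows(conn)` rather than re-aggregating the raw `events` table each time.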


Citus 6.0 allows you to scale out your transactional relational database with minimal changes to your application, reducing complexity compared with other alternatives while still allowing scale. If you're building a multi-tenant application and outgrow single-node Postgres, sharding by tenant with Citus 6.0 lets you linearly add more memory and processing power to your database without a large re-architecting of your application. You can still maintain referential integrity, and to your application it's still just standard Postgres.
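As a rough illustration of what sharding by tenant means, the sketch below routes every row to a shard based only on its tenant ID, so all of a tenant's data ends up together. This is an assumption-laden toy, not how Citus itself hashes or places shards: the `shard_for_tenant` and `group_rows_by_shard` helpers, the CRC32 hash, and the shard count are all hypothetical choices made for the example.

```python
import zlib
from collections import defaultdict


def shard_for_tenant(tenant_id: int, shard_count: int) -> int:
    """Map a tenant ID to a shard index with a stable hash (CRC32 here)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return zlib.crc32(str(tenant_id).encode("utf-8")) % shard_count


def group_rows_by_shard(rows, shard_count):
    """Group (tenant_id, payload) rows by the shard their tenant maps to."""
    shards = defaultdict(list)
    for tenant_id, payload in rows:
        shards[shard_for_tenant(tenant_id, shard_count)].append((tenant_id, payload))
    return dict(shards)


# Hypothetical rows from two tables that share the same tenant column.
rows = [
    (42, "order #1"),
    (42, "invoice #7"),
    (7, "order #2"),
    (42, "order #3"),
]
placement = group_rows_by_shard(rows, shard_count=4)
```

Because the shard is a function of the tenant ID alone, rows for tenant 42 always land on the same shard, which is what keeps a single tenant's queries and joins local to one node.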
