Welcome to another post in the series of Percona Live featured session blogs! In these blogs, we’ll highlight some of the session speakers that will be at this year’s Percona Live conference. We’ll also discuss how these sessions can help you improve your database environment. Make sure to read to the end to get a special Percona Live 2017 registration bonus!

In this Percona Live featured session, we’ll meet Casper Kejlberg-Rasmussen, Software Developer at Uber. His session is Placing Databases @ Uber. Uber has many thousands of MySQL databases running inside of Docker containers on thousands of hosts. When deciding exactly which host a database should run on, it is important that you avoid hosts running databases of the same cluster as the one you are placing, and that you avoid placing all databases of a cluster on the same rack or in the same data center.

I had a chance to talk to Casper about Uber and database placement:

Percona: How did you get into database technology? What do you love about it?

Casper: When I did my Ph.D., my thesis was on dynamic data structures. During my bachelor’s, master’s and Ph.D. studies, I took all the algorithms and data structures classes I could. So it was natural for me to also work with databases in my professional career. Databases are a prime example of a very useful dynamic data structure.

Percona: Your talk is called Placing Databases @ Uber. What do you mean by placing databases, and why is it important?

Casper: At Uber, the storage team manages all of Uber’s storage offerings. Our main database technology is an in-house NoSQL database called Schemaless. Schemaless builds on top of MySQL (specifically, we use our own fork of Percona’s MySQL variant, which is available on GitHub). We have many thousands of databases that run inside of Docker containers. Whenever we need to create a new Schemaless instance for our internal customers, or add capacity to an existing Schemaless instance, we need to place new Docker containers with Percona Server for MySQL running inside. For our Schemaless instances to be reliable, durable and highly available, we need to place databases in at least two different data centers. We also want to avoid placing two databases of the same instance on the same rack or the same host. So care needs to be taken when deciding where to place the databases.
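As a rough illustration of the rules Casper describes, here is a minimal Python sketch that checks a proposed set of locations for one instance. The `Location` class, the `placement_violations` function and the assumption that rack names are unique across data centers are ours, made up for this post. This is not Uber's code.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Where one database of an instance would run (hypothetical model)."""
    host: str
    rack: str        # assumed to be unique across data centers
    datacenter: str


def placement_violations(locations):
    """Return a list of rule violations for the databases of one instance."""
    violations = []
    # Rule 1: the instance must span at least two data centers.
    if len({loc.datacenter for loc in locations}) < 2:
        violations.append("fewer than two data centers")
    # Rules 2 and 3: no two databases on the same host or the same rack.
    for attr in ("host", "rack"):
        counts = Counter(getattr(loc, attr) for loc in locations)
        for value, n in sorted(counts.items()):
            if n > 1:
                violations.append(f"{n} databases share {attr} {value}")
    return violations
```

An empty list means the proposed placement satisfies all three rules. Anything else names the rule that was broken, which is handy when deciding which database to move.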

Percona: What are some of the conditions that affect where you place databases?

Casper: When we place a database, we have to take into account the labels on the hosts we consider. A label can state which data center or rack the host is part of, or which Clusto pools the host belongs to (Clusto is an internal hardware management tool we have). A pool can mark the host as a testing, staging or production host, etc. A host also has “relations.” A relation is also a label, but instead of stating facts about the host, a relation states what other databases are running on the host. An example of a relation label is schemaless.instance.mezzanine, which indicates that the host is running a Schemaless database from the Mezzanine instance. Another example is schemaless.cluster.percona-cluster-mezzanine-us1-db01, which indicates that the host is running a Schemaless database belonging to the cluster percona-cluster-mezzanine-us1-db01.
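To make the difference between labels and relations concrete, here is a small Python sketch of how a host could be modeled. The `Host` class, the label keys (`datacenter`, `rack`, `pool`) and the idea of deriving relations from a list of (instance, cluster) pairs are our own assumptions. Only the two relation label formats come from Casper's examples.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical host model: labels are facts, relations are derived."""
    name: str
    labels: dict                                   # facts about the host itself
    databases: list = field(default_factory=list)  # (instance, cluster) pairs

    def relations(self):
        """Build relation labels from the databases running on this host."""
        rels = set()
        for instance, cluster in self.databases:
            rels.add(f"schemaless.instance.{instance}")
            rels.add(f"schemaless.cluster.{cluster}")
        return rels


host = Host(
    name="host-a",  # made-up host name
    labels={"datacenter": "us1", "rack": "r12", "pool": "production"},
    databases=[("mezzanine", "percona-cluster-mezzanine-us1-db01")],
)

# A placement rule can then be phrased as a simple membership test.
already_runs_mezzanine = "schemaless.instance.mezzanine" in host.relations()
```

In this model the labels change only when the hardware changes, while the relations change every time a database is added to or removed from the host.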

Percona: What do you want attendees to take away from your session? Why should they attend?

Casper: I want the attendees to remember that there are three steps when placing a database or any container:

  1. Filter out any host that fails the hard requirements (like not having enough resources) or fails the label or relation requirements (like already running databases of the same instance as the one we want to place).
  2. Rank the remaining hosts to select the best one, for example the host with the most free space left or the host with the fewest other databases on it.
  3. Relocate databases over time: as databases consume more resources, we want to move them to other hosts where it makes more sense to place them.

People should attend my session to see how to get good database placements in a simple way.
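The first two steps above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `Host` fields, the `choose_host` function and the particular ranking (most free space first, then fewest databases) are hypothetical choices, not the implementation Casper will present.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical view of a host as seen by the placement logic."""
    name: str
    free_gb: int
    relations: set = field(default_factory=set)
    db_count: int = 0


def eligible(host, needed_gb, forbidden_relation):
    """Step 1: hard requirement (space) plus relation requirement."""
    return host.free_gb >= needed_gb and forbidden_relation not in host.relations


def choose_host(hosts, needed_gb, forbidden_relation):
    """Filter the hosts, then rank the survivors and return the best one."""
    candidates = [h for h in hosts if eligible(h, needed_gb, forbidden_relation)]
    if not candidates:
        return None  # no host satisfies the requirements
    # Step 2: prefer the most free space, break ties by fewest databases.
    return max(candidates, key=lambda h: (h.free_gb, -h.db_count))


hosts = [
    Host("a", 500, {"schemaless.instance.mezzanine"}, 1),  # same instance
    Host("b", 200, set(), 3),
    Host("c", 300, set(), 2),
    Host("d", 50, set(), 0),                               # too small
]
best = choose_host(hosts, 100, "schemaless.instance.mezzanine")
```

Here host a is filtered out because it already runs a Mezzanine database, host d is filtered out for lack of space, and host c wins the ranking over host b. Step 3 is not shown. One plausible approach would be to rerun the same filter and rank periodically and move a database when a clearly better host turns up.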

Percona: What are you most looking forward to at Percona Live 2017?

Casper: I look forward to hearing about other NoSQL solutions, and hearing about different storage engines for MySQL systems. And of course meeting other exciting people from the database community! 🙂

Register for Percona Live Data Performance Conference 2017, and see Casper present Placing Databases @ Uber. Use the code FeaturedTalk and receive $100 off the current registration price!

Percona Live Data Performance Conference 2017 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community, as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Data Performance Conference will be April 24-27, 2017 at the Hyatt Regency Santa Clara and the Santa Clara Convention Center.