MySQL on Autopilot

February 22, 2016 - by Tim Gross

The autopilot pattern is an approach to application and infrastructure design that pushes automation for each component of our systems into the application. Each container that makes up the application has its own lifecycle, and we package those lifecycle behaviors into the application container rather than relying on external infrastructure.

Let's see how we can apply this pattern to help us deploy and operate one of the kinds of sophisticated stateful applications usually considered tough to run in Docker containers: MySQL.

Operating MySQL

We'll start with a very common MySQL deployment: asynchronous replication from a primary to one or more replicas. Clients send read queries to the replicas and writes to the primary. This architecture raises some questions about service discovery and topology:

  • How does the replica know where to find the primary?
  • How does the primary tell the replicas where to start replication?
  • How does the client know where to find nodes and which nodes accept writes?

And then after deployment we have another set of ongoing concerns:

  • How do we do backups?
  • How do we promote a replica if the primary fails?
  • How do the other replicas know where to find a new primary during failover?
  • How does the client know that we failed over?

There are established answers to some of these questions, of course, but each has drawbacks. Configuration management tools often lead to splitting configuration of the infrastructure from the application. They also can't respond to application topology changes at runtime. Database as a service (DBaaS) handles the management, but configuration is now largely out of your control and costs are much higher.

The alternative is for applications to run on autopilot. In this pattern, operationalizing an application means giving it the responsibility for how it fits within the overall system: startup, shutdown, scaling, discovery, and recovery. This minimizes human intervention, which means fewer mistakes and more time to spend on more important parts of your business.

Obviously we're not going to rewrite MySQL to do this, so we need a way to provide this functionality to an existing application, and for that we're going to lean on ContainerPilot.

Architecture

We'll need the following components to create our MySQL deployment:

  • MySQL: we're using MySQL 5.6 via Percona Server, and XtraBackup for running hot snapshots.
  • Consul: used to coordinate replication and failover.
  • Manta: the Joyent object store, for securely and durably storing our MySQL snapshots.
  • ContainerPilot: included in our MySQL containers to orchestrate bootstrap behavior and coordinate replication, using keys and checks stored in Consul, in the preStart (formerly onStart), health, and onChange handlers.
  • manage.py: a small Python application that ContainerPilot's lifecycle hooks will call to bootstrap MySQL, perform health checks, manage replication setup, and perform coordinated failover.

All the code and configuration described here can be found on GitHub.

Architecture diagram

When a new MySQL node is started, ContainerPilot's preStart handler (formerly onStart) will call into manage.py. ContainerPilot will then fork Percona Server and wait. Meanwhile, it'll run its health and onChange handlers concurrently. This leaves us with a process tree in the MySQL container that looks like this:

root@993acf351cd9:/# ps axo uid,pid,ppid,stime,cmd

UID    PID  PPID  STIME  CMD
root     1     0  19:02  /bin/containerpilot
mysql   94     1  19:02  |_ mysqld --console --gtid-mode=ON...
root   107     1  19:04  |_ python /bin/manage.py health
root   109     1  19:04  |  |_ /usr/bin/innobackupex --no-timestamp...
root   120     1  19:06  |_ python /bin/manage.py health
root   121     1  19:06     |_ mysql -u repl -p...

Self-assembling

Because we're just using a few Docker images, and there's no need for a separate scheduler to manage discovery and bootstrapping, we can launch the stack simply by running:

docker-compose up -d

When the first node comes up, it checks in with the Consul discovery service to try to find a primary. This first node sees that there is no primary yet, so it'll run itself as a primary and initialize the database. It writes a key with an atomic lock using a Consul session so that one and only one node becomes the primary.
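The election described above can be sketched in Python. This is a minimal illustration and not the code from manage.py: FakeConsulKV is a hypothetical in-memory stand-in that mimics only the acquire semantics of Consul's key/value store (a key can be held by one session at a time), and the key name mysql-primary, the session IDs, and the function names are assumptions made for the example.

```python
class FakeConsulKV:
    """Hypothetical in-memory stand-in for Consul's KV store.

    It mimics only the lock semantics: a put with `acquire` succeeds
    when the key is unlocked or already held by the same session.
    """

    def __init__(self):
        self._values = {}   # key -> value
        self._holders = {}  # key -> session id holding the lock

    def put(self, key, value, acquire):
        holder = self._holders.get(key)
        if holder is not None and holder != acquire:
            return False  # another session already holds the lock
        self._holders[key] = acquire
        self._values[key] = value
        return True

    def get(self, key):
        return self._values.get(key)


PRIMARY_KEY = "mysql-primary"  # assumed key name, for illustration only


def elect_primary(kv, session_id, node_name):
    """Try to take the primary lock; return the role this node should run as."""
    if kv.put(PRIMARY_KEY, node_name, acquire=session_id):
        return "primary"
    return "replica"


kv = FakeConsulKV()
roles = {
    node: elect_primary(kv, session_id="session-" + node, node_name=node)
    for node in ("mysql_1", "mysql_2", "mysql_3")
}
print(roles)
```

Because the write and the lock acquisition happen in a single operation, two nodes that start at the same moment cannot both conclude that they are the primary.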

Lifecycle preStart handler

The primary will also snapshot itself using Percona XtraBackup and push this snapshot and its most recent binlog to the Manta object store. (It'll do this again periodically or whenever its binlog rotates.) It records these paths in Consul, and we'll use those paths to set up replication next.
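As a rough sketch of the "periodically or whenever its binlog rotates" rule, the decision could look like the function below. The function name, its parameters, and the 24-hour interval are assumptions for illustration; the post does not state the actual interval or how manage.py tracks this.

```python
import time

# Assumed value for illustration; the interval is not stated in this post.
BACKUP_INTERVAL_SECS = 24 * 60 * 60


def needs_snapshot(last_backup_at, last_binlog, current_binlog, now=None):
    """Decide whether the primary should take and upload a new snapshot.

    last_backup_at: Unix time of the last recorded snapshot, or None.
    last_binlog:    binlog file name recorded with that snapshot, or None.
    current_binlog: binlog file name MySQL is writing to right now.
    """
    now = time.time() if now is None else now
    if last_backup_at is None:
        return True  # no snapshot has been recorded yet
    if current_binlog != last_binlog:
        return True  # the binlog has rotated since the last snapshot
    # Otherwise back up only when the interval has elapsed.
    return (now - last_backup_at) >= BACKUP_INTERVAL_SECS


print(needs_snapshot(None, None, "mysql-bin.000001", now=0))
print(needs_snapshot(0, "mysql-bin.000001", "mysql-bin.000002", now=60))
print(needs_snapshot(0, "mysql-bin.000001", "mysql-bin.000001", now=60))
```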

Once the primary is up and healthy we can scale up replicas just as easily:

docker-compose scale mysql=3

During their preStart (formerly onStart) handlers, each replica node will ask Consul where to find the primary, and then set up replication from that primary. They'll ask Consul where to find the latest snapshot, download it from Manta, and then use the Global Transaction Identifiers to sync up with the primary. Once this is done, they'll register themselves as healthy with Consul.
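To make the last step concrete, here is a sketch of the SQL a replica could issue once the snapshot is restored. With GTIDs enabled (the mysqld process above runs with --gtid-mode=ON), MySQL 5.6 accepts MASTER_AUTO_POSITION = 1, so the replica does not need a binlog file name and offset. The helper name, the host address, and the password are hypothetical, and the values are interpolated without escaping purely to keep the example short.

```python
def replication_statements(primary_host, repl_user, repl_password, port=3306):
    """Return the SQL a replica could run to follow `primary_host` using GTIDs.

    Illustration only: values are not escaped, so do not pass untrusted input.
    """
    change_master = (
        "CHANGE MASTER TO "
        "MASTER_HOST = '{host}', "
        "MASTER_PORT = {port}, "
        "MASTER_USER = '{user}', "
        "MASTER_PASSWORD = '{password}', "
        "MASTER_AUTO_POSITION = 1"
    ).format(host=primary_host, port=port, user=repl_user, password=repl_password)
    return [change_master, "START SLAVE"]


# Hypothetical address and credentials; in the blueprint the primary's
# address would come from Consul.
for statement in replication_statements("192.168.1.100", "repl", "secret"):
    print(statement + ";")
```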

Self-monitoring

While the MySQL process is running, ContainerPilot will perform periodic health checks using the mysql client bundled in the container. In this case we're using a simple SELECT 1, but this could just as easily be checking on replication status or the number of queries in flight. If the health check passes, ContainerPilot will write a heartbeat into Consul with a TTL.
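The interaction between the poll loop and the TTL can be modeled in a few lines. This is a simplified sketch, not ContainerPilot's or Consul's implementation: TTLCheck and poll_once are hypothetical names, and the 10-second poll interval and 25-second TTL are assumed values.

```python
class TTLCheck:
    """Hypothetical model of a TTL check: passing only while heartbeats keep arriving."""

    def __init__(self, ttl_secs):
        self.ttl_secs = ttl_secs
        self.last_pass = None  # time of the most recent heartbeat

    def heartbeat(self, now):
        self.last_pass = now

    def is_passing(self, now):
        if self.last_pass is None:
            return False
        return (now - self.last_pass) <= self.ttl_secs


def poll_once(health_check, ttl_check, now):
    """One poll cycle: send a heartbeat only if the health check succeeds."""
    if health_check():
        ttl_check.heartbeat(now)
        return True
    return False


check = TTLCheck(ttl_secs=25)

# MySQL answers the probe at t=0 and t=10, then stops answering.
poll_once(lambda: True, check, now=0)
poll_once(lambda: True, check, now=10)
poll_once(lambda: False, check, now=20)
poll_once(lambda: False, check, now=30)

print(check.is_passing(now=30))  # last heartbeat was 20 seconds ago
print(check.is_passing(now=40))  # last heartbeat was 30 seconds ago
```

Choosing a TTL longer than the poll interval means a single slow or failed check does not immediately mark the node as unhealthy.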

Lifecycle heartbeats

Self-healing

If the primary is removed from service (say by running docker stop on it), ContainerPilot within the container will immediately deregister itself from Consul and the replicas will pick this up as an onChange event.

Lifecycle onChange handler

Although the replicas see this change within a few seconds, they're forced to wait for the lock in Consul to expire. Once it does, all replicas will attempt to obtain the lock. Whichever node wins will mark itself as the primary. The remaining nodes will automatically reconfigure their replication to come from the new primary.
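The failover sequence can be sketched as a small simulation. ExpiringLock is a hypothetical, simplified stand-in for a Consul session-backed lock (real Consul sessions also have a lock delay, which is ignored here), and the 60-second TTL, the node names, and the on_change signature are assumptions for illustration.

```python
class ExpiringLock:
    """Hypothetical stand-in for a session-backed lock that expires after a TTL."""

    def __init__(self):
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, node, now, ttl_secs):
        if self.holder is not None and now < self.expires_at:
            return False  # the current holder's lock has not expired yet
        self.holder = node
        self.expires_at = now + ttl_secs
        return True


def on_change(lock, node, failed_primary, now, ttl_secs=60):
    """Return what this replica should do after the primary disappears."""
    if lock.try_acquire(node, now, ttl_secs):
        return "promote"  # this node won the lock and becomes primary
    if lock.holder == failed_primary:
        return "wait"  # the old primary's lock is still live
    return "follow " + lock.holder  # replicate from the new primary


lock = ExpiringLock()
lock.try_acquire("mysql_1", now=0, ttl_secs=60)  # original primary

# mysql_1 is stopped at t=10; replicas notice within a few seconds.
print(on_change(lock, "mysql_2", "mysql_1", now=12))
# After the lock expires, the first replica to ask wins.
print(on_change(lock, "mysql_3", "mysql_1", now=61))
print(on_change(lock, "mysql_2", "mysql_1", now=61))
```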

Try it yourself!

Percona Server with the autopilot pattern can be used wherever you need a high-availability, high-performance, MySQL-compatible database. You can use it as a database to power any of a number of open source applications that depend on MySQL-compatible servers, including WordPress, Drupal, Joomla, TYPO3, MODx, phpBB, MyBB, and many others.

Check out the code on GitHub, or read the FAQ below.


What is this?

The autopilot pattern Percona Server is an implementation of the autopilot pattern for Percona's MySQL-compatible Server in Docker. This is not a product or service of Joyent, but a blueprint of the autopilot pattern applied to a sophisticated, stateful application.

What is the autopilot pattern?

From autopilotpattern.io:

The autopilot pattern automates in code the repetitive and boring operational tasks of an application, including startup, shutdown, scaling, and recovery from anticipated failure conditions for reliability, ease of use, and improved productivity.

Is this a DBaaS?

No. It is an implementation of the autopilot pattern for Percona's MySQL-compatible Server in Docker. Though it is designed for automated operations and great simplicity, it is not an officially supported product or service of Joyent.

What's required to run this?

Most autopilot pattern apps have similar base requirements, including:

  • The ability to run Docker containers across multiple container hosts (physical or virtual).
  • Seamless networking between containers on multiple container hosts (physical or virtual).
  • The ability for apps inside the container to detect their own IP address(es), and for other containers running on other hosts to be able to connect to the container using the detected IP address(es).

These features are (or can be) available in the following environments:

Other than the general requirements for all autopilot pattern applications, this implementation requires RAM, disk, and CPU sufficient to run Percona Server at the size and scale the user desires.

Does it require Triton?

No! This pattern is tested on Triton, but will run anywhere that meets the requirements listed above.

Can I run it in the Joyent Public Cloud? Can I run it in my Triton on-prem installation or private cloud?

Yes! Both Triton on-prem and the Joyent Public Cloud meet the requirements of being able to deploy Docker containers across multiple hosts and providing seamless networking for containers.

How do I get it?

The autopilot pattern code is available on GitHub and the container images are available from the Docker Hub. See the README for how to use and operate it.

How much does it cost?

The code is completely free and open source. Users of the software need only pay for the infrastructure required to run it, with no additional fees or license costs.

Is it open source?

Yes! Percona Server is licensed under the GPLv2 (source repo), and the autopilot pattern code that operates it is licensed under the MPLv2. Additional components and dependencies are licensed individually.

The autopilot pattern for Percona Server is available on GitHub, where we welcome pull requests and bug reports. Please note, however, that this is not a supported product or service of Joyent and Joyent cannot offer support via that channel. See the support options described below.

Why Percona Server and not "official" MySQL?

This blueprint is built against Percona Server because it depends on Percona XtraBackup, which works best and is most thoroughly tested with Percona Server.

Percona XtraBackup is notable for supporting consistent backups against a live database, making 100% uptime possible while delivering consistent and reliable backups.

Does Percona endorse the autopilot pattern for Percona Server?

Let's hear from Percona founder Peter Zaitsev:

"Percona built its business on maximizing database performance and reliability, so we are excited to see Percona Server at the center of this new pattern that makes DBaaS ease of management available to all, without locking users into a platform or locking them out of the details that advanced users require for performance optimization."

Can I get support from Percona?

Percona supports MySQL users with a wide range of support services. Percona's EVP of sales & marketing, Jim Doherty, detailed their support position for Percona products in Docker, including the autopilot implementation, as follows:

"Percona is planning to support this, but we have yet to announce an 'official' date. We can provide 'best effort' support at this time and work with customers to ensure proper expectations are set."

Are there special advantages to running on Triton?

Yes! Running on Triton provides unique advantages. By running containers on bare metal, you get the performance you need for resource-intensive applications like MySQL without the overhead of hardware virtualization. And you don't need to separately provision a VM to serve as the underlying Docker host, which reduces the deployment effort.

Please contact the Joyent sales team if you'd like to learn more.

How does this compare to running on RDS?

                        Autopilot pattern MySQL  AWS RDS
Easy to administer      Yes                      Yes
Scalable                Yes                      Yes
Available and durable   Yes                      Yes
Fast                    Yes                      Yes
Secure                  Yes                      Yes
Inexpensive             Yes                      No. It's 40-50% more expensive than the infrastructure
User-configurable       Yes                      No
Runs on-prem            Yes                      No
Full access to all data Yes                      No
Avoids vendor lock-in   Yes                      No