No more locks with Xtrabackup

Posted by on August 7, 2012

Percona’s Xtrabackup is a great product. It performs binary backups of heavily loaded Mysql servers amazingly fast. We’ve used it here at Etsy for years and works very well for us. Mysqldump is another way of doing backups, but for servers with hundreds of gigabytes of data, it’s too slow to be useful, especially for restoring a backup. It can literally take days to restore a couple of hundred of gigabytes generated with Mysqldump. Usually when you need to restore from a backup, you’re in some sort of emergency, and waiting days is not an option.

But Xtrabackup has a significant shortcoming: it needs a global lock at the end of its procedure. Not good for a server doing hundreds or thousands of queries per second. Percona touts Xtrabackup as “non-blocking”, which is for the most part true, but not entirely.

Restoring a Mysql server, most of the time, involves installing an OS, Mysql, and any additional packages you want. Then restoring the latest Xtrabackup data and finally give the new server replication coordinates where it left off before the crash so it can get to the point where its data is consistent with its master. Here’s where the lock comes in: to get a reliable reading of replication coordinates, Xtrabackup issues a “FLUSH TABLES WITH READ LOCK” in the final stage of its process. When the lock is granted, Xtrabackup reads the “MASTER STATUS” and releases the lock. FTWRL is very disruptive to a busy server. It tells Mysql to start the process of read/write locking all tables, which in turn causes all new connections to wait for the lock to be released. Mysql then waits for all outstanding queries to finish, and then grants the lock. If there’s a long running query when FTWRL is requested, you will undoubtedly get a query pile-up which can quickly overwhelm the maximum number of connections your server is configured to accept and your application will stall.

Percona’s documentation states that there is a –no-lock option for Xtrabackup. It also states “Use this option to disable table lock with FLUSH TABLES WITH READ LOCK. Use it only if ALL your tables are InnoDB and you DO NOT CARE about the binary log position of the backup”. We don’t want any locks, but we do want the “binary log position of the backup”, aka replication coordinates.

It turns out that replication coordinates are hidden in the backup files when you run it with the –no-lock option. You just have to know how to get to them. Xtrabackup’s backup procedure involves copying Mysql’s data files to another location, knowing the exact point-in-time when the copying started, and creating an extra file, named xtrabackup_logfile, which contains all writes that occurred during the copying time. Restoring a backup with Xtrabackup requires a “prepare” phase, which is basically applying all writes from the xtrabackup_log file onto data files. When you do the prepare phase, a new file named xtrabackup_binlog_pos_innodb will appear in the restore directory. This file contains the replication coordinates that we need to reestablish replication. With or without the –no-lock option, xtrabackup_binlog_pos_innodb is created in your restore directory!

Even a 1-second stall can be disruptive for a busy server, but our locks used to last around 30 seconds. That was before we adopted the –no-lock option combined with getting replication coordinates from the xtrabackup_binlog_pos_innodb for restores.

An important thing to note is that this works if you only use Innodb tables. You shouldn’t be using MyISAM tables anyway. Use them only for the unavoidable: the `mysql` db which contains grants and other internal metadata.

Posted by on August 7, 2012
Category: databases, operations Tags:

Related Posts

8 Comments

Thanks good stuff in this post.

I’m actually right in the middle of setting up Xtrabackup (innobackupex), using some of the Percona blog posts on the subject as reference.

I’m using the –export command to create and prepare the backup so that I can recover per table (innodb_file_per_table).

There seems to be a restriction that you can’t recover a table if the table schema changes, or the table is truncated or dropped. This seems like a very inconvenient restriction as this is exactly kind of thing one might want to recover from.

How would you use Xtrabackup to get around this restriction?

Thanks.

Hello,
You should be making daily backups so if you drop a table or change its schema you can recover it from a previous day.

Hi,

Just a caveat I would like to drop here, when we say no non-transactional work during –no-lock includes DDL to transactional tables as well (something that is not explicitly mentioned). This is because DDL is non-transactional in MySQL. I have raised a bug to document this aspect — https://bugs.launchpad.net/percona-xtrabackup/+bug/1075444

So if you do create / alter / drop innodb tables during xtrabackup with –no-lock, it is possible that you don’t have the ibd file but the frm file corresponding to the table (if newly created) and that can lead to failure during restore.

Just ran across this post. We do things a bit differently with xtrabackups. We’re actually running the XtraDB solution (One of these days I’ll get around to writing a blog post about scaling with it) and what we do is simply have a non-critical node which no front-end connections persist to (it’s only a failover node essentially). Through this we run backups transparently to user connections.

I have no idea if XtraDB is a reasonable option for you guys, but it’s worth a look if you have the resources (and if for some reason you guys read this post and have questions, feel free to e-mail).

Grish,

If you don’t revision control your schema or might need to revert a particular table just after upgrading off-the-shelf software, it may be prudent to export schema along with your scheduled runs of xtrabackup:

mysqldump -all-databases –no-data –skip-lock-tables

Note that, in the case your table was dropped or the definition changed, you would need to grab the individual CREATE TABLE statements from the resulting file before attempting an “import tablespace” type restore. Generally, the resulting file will be small enough to open with less and search around in, since it contains only DDL.

It is also worth mention that this strategy will only work if innodb_file_per_table was set before you created or rebuilt via ‘alter table t1 engine=innodb’. If the table was previously not in an individual .ibd file, there was “free space” left in the system tablespace ibdata1. The only way to shrink an ibdata1 file is to import a mysqldump into an empty datadir (and promote this slave)

Girish, I apologize for the typo

Hey Arie,

You might be happy to see Percona has implemented LOCK TABLES FOR BACKUP so that you can take virtually lock-less backups: http://www.mysqlperformanceblog.com/2014/03/11/introducing-backup-locks-percona-server-2/

Will only lock DDL operations, but not all the operations.

Percona Xtrabackup 2.2 has implemented counterpart, and you can test the alpha already: http://www.mysqlperformanceblog.com/2014/03/28/percona-xtrabackup-2-2-1-alpha-release-is-now-available/

Marcos,
This is great! Thanks for the heads-up.