Etsy Icon>

Code as Craft

Data Corruption To Go: The Perils Of sql_mode = NULL main image

Data Corruption To Go: The Perils Of sql_mode = NULL

  image

A little while back, we hit an extremely embarrassing issue: 32-bit signed integer overflow on some primary keys.

In our architecture, an unsigned 64-bit key is generated by a global ticket server, but the sharded tables’ schemas did not use BIGINT UNSIGNED as the column data-type. Inserts into the affected tables started failing en masse with "duplicate key" errors on key 2147483647.

We quickly realized the issue and ALTERed the impacted tables, but the MySQL behavior of silently truncating all values larger than 2147483647 with the only fail-safe being the primary key constraint was worrying. Any columns that lacked a similar constraint would be experiencing silent data corruption.

Some digging led us to the MySQL variable sql_mode.

What is sql_mode anyway?

By default MySQL is very accepting of bad data and bad queries. At best, it may emit a WARNING, but these are almost always ignored by client libraries such as PHP's PDO.

sql_mode is a run-time setting for MySQL and enabling it makes the database much more vocal by changing most WARNINGs to ERRORs. This helps prevent silent data corruption on write, and unexpected results on read. Here are a few examples of issues it helped fix within our stack and the settings we used:

PROBLEM: Missing default values, data-type and data-length mismatches

CREATE TABLE `sql_test`
    `id` TINYINT UNSIGNED NOT NULL PRIMARY KEY,
    `value` VARCHAR(3) NOT NULL,
    `column_not_null` VARCHAR(1) NOT NULL
) Engine = InnoDB;

Consider the following inserts:

  1. INSERT INTO sql_test (id, value, column_not_null) VALUES (10000, 'z', 'y');
  2. INSERT INTO sql_test (id, value, column_not_null) VALUES (1, 'abcde', 'x');
  3. INSERT INTO sql_test (id, value) VALUES (2, 'qwerty');

Here's what the table would look like when sql_mode is disabled:

+-----+-------+-----------------+
| id  | value | column_not_null |
+-----+-------+-----------------+
|   1 | abc   | x               |
|   2 | qwe   |                 |
| 255 | z     | y               |
+-----+-------+-----------------+

Massively different from the expected values!

When sql_mode='STRICT_ALL_TABLES' is set though, each of those inserts will fail hard so that you can recover and handle the problem in your application.

  1. ERROR 1264 (22003): Out of range value for column 'id' at row 1
  2. ERROR 1406 (22001): Data too long for column 'value' at row 1
  3. ERROR 1364 (HY000): Field 'column_not_null' doesn't have a default value

PROBLEM: Character-set issues

CREATE TABLE `charset_test` (
    `message` VARCHAR(3000) NOT NULL -- NB: this could also be `message` TEXT NOT NULL
) Engine = InnoDB DEFAULT CHARSET=UTF8;

Inserting a supplementary UTF-8 character string like 

τέχνη

would seem to work, but on a subsequent SELECT, the returned string would be quite different:

?????

This is because MySQL's UTF8 implementation can only accommodate characters up to 3 bytes in length, but Unicode characters could be up to 4 bytes long. The data stored in the message column is subtly truncated and corrupted.

With the stricter sql_mode='STRICT_ALL_TABLES', inserts containing 4-byte characters will fail hard with: ERROR 1366 (HY000): Incorrect string value

Note that sql_mode setting here is only an alarm-bell, it is not fixing the underlying issue. The real solution is twofold:

  1. Change the character-set of the connection to MySQL to UTF8MB4. If using PHP’s PDO, pass it in via the connection DSN. Or when running queries by hand, make sure to execute SET NAMES utf8mb4 before executing anything.
  2. Modify the column and the table definitions to UTF8MB4:
ALTER TABLE `charset_test`
    DEFAULT CHARSET utf8mb4,
    MODIFY `message` VARCHAR(3000) CHARACTER SET utf8mb4 NOT NULL;

One caveat to note when using the UTF8MB4 charset: the rule-of-thumb for an indexed VARCHAR column being at most 255 characters is no longer valid. MySQL has a length limit of 767 bytes for the index prefix and it has to accommodate the worst case. Thus when using the 3-byte UTF8 character set, 255 characters fit in, but when using 4-byte UTF8MB4 only 191 characters fit.

If we absolutely need 255 characters indexed (for example if the column holds email addresses), a workaround is to SHA-1 hash the value, and store and index that column instead.

PROBLEM: Unsigned int decrements

CREATE TABLE `decrement_test` (
    `id` int(11) NOT NULL,
    `value` int(11) unsigned NOT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
INSERT INTO decrement_test(id, value)
    VALUES(1, 20) ON DUPLICATE KEY UPDATE
    value = value + VALUES(value);
SELECT * FROM decrement_test;
+----+-------+
| id | value |
+----+-------+
| 1  | 20    |
+----+-------+
INSERT INTO decrement_test(id, value)
    VALUES(1, -1) ON DUPLICATE KEY UPDATE
    value = value + VALUES(value);

If you were expecting the value to now be 19, you would be wrong:

SELECT * FROM decrement_test;
+----+-------+
| id | value |
+----+-------+
| 1  | 20    |
+----+-------+

When sql_mode='STRICT_ALL_TABLES' is enabled, the decrement INSERT statement will fail with: ERROR 1264 (22003): Out of range value for column 'value' at row 1

Note that in order to do atomic decrements as shown above, the column data type would need to be signed-integer rather than unsigned-integer.

PROBLEM: Incorrect group-by

CREATE TABLE `group_by_test`
    `id` INT NOT NULL PRIMARY KEY,
    `name` VARCHAR(30) NOT NULL,
    `status` INT NOT NULL,
    `region` INT NOT NULL
) Engine = InnoDB DEFAULT CHARSET = UTF8MB4;
INSERT INTO group_by_test(id, name, status, region) VALUES (1, 'etsy', 10, 100);
INSERT INTO group_by_test(id, name, status, region) VALUES (2, 'ebay', 12, 100);
INSERT INTO group_by_test(id, name, status, region) VALUES (3, 'amazon', 13, 150);
INSERT INTO group_by_test(id, name, status, region) VALUES (4, 'fab', 10, 100);
SELECT COUNT(1), status, name FROM group_by_test GROUP BY status;
+----------+--------+--------+
| COUNT(1) | status | name   |
+----------+--------+--------+
|   2      | 10     | etsy   |
|   1      | 12     | ebay   |
|   1      | 13     | amazon |
+----------+--------+--------+

The name column isn't in the GROUP BY</code. clause, but is in the SELECT and MySQL pulls the first matching value and puts it into the result set. The output in the name column depends on relative order of rows in the table and may not be repeatable and this is almost never desirable.

By setting sql_mode='ONLY_FULL_GROUP_BY', we can prevent such unreliable SQL from executing. The query would instead fail with: ERROR 1055 (42000): 'etsy.group_by_test.name' isn't in GROUP BY

Just throw the switch and test in Production?

Unfortunately our application was continually executing some variations of the bad queries shown above, especially the bad GROUP BY and bad DEFAULT/NOT NULL, and we couldn’t just enable the mode in Production without causing a lot of disruption.

We had to find the bad queries and table definitions first. In order to do that, we set up a test database with the Production schema on it and enabled the strict sql_mode='ONLY_FULL_GROUP_BY,STRICT_ALL_TABLES' on it.

We slurped all queries from our Production DBs using a local tcpdump and netcat’ed them over to the test database:

prodb-shard> sudo /usr/sbin/tcpdump -i eth0
  "port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2"
  -s 65535 -x -n -q -tttt |
  nc testdb.etsy.com 20000

On the test box, we used the combination of Percona Toolkit’s pt-query-digest and a custom Go script to execute these queries (NB: Percona Playback did not exist at the time). Our Production servers execute ~4000 queries/sec (~30000 packets/sec in and out) and while pt-query-digest has a built-in execute mode, it cannot keep up with the volume and the tcpdump capture would drop packets. The Go script helped multiplex this query volume and keep packet loss low (~1%).

testdb> nc -l 20000 | pt-query-digest --type tcpdump
  --no-report --print | grep -v "SET NAMES utf8" |
  go run shard_executor.go -db-user=username
  -db-password=password -db-host=localhost
  -db-charset=utf8mb4 -log "/var/tmp/shard.log"
  -db-name=etsy_shard -threads=8

Note that pt-query-digest leaks a small amount of memory continually and will need to be restarted every so often.

All errors were logged to disk and we now had visibility into exactly what needed fixing:

testdb> cut -c 21- /var/tmp/shard.log | grep "^Error" |
  sort | uniq

Along with fixing bad queries, we also needed to modify a large number of tables to change the character-set for VARCHAR and TEXT columns from UTF8 to UTF8MB4 in order to accommodate 4-byte UTF-8 characters. We generated the ALTERs with this one-off script and Schemanator allowed us to execute all of these without any downtime!

Note that the script does not generate ALTER statements for indexed VARCHAR/TEXT columns longer than 191 characters for reasons explained above. In our case, we worked around it by reducing the column lengths where feasible, and using the hashing method otherwise.

A few weeks of bug squashing and schema changes later, our databases and applications were finally ready to accept sql_mode= 'ONLY_FULL_GROUP_BY,STRICT_ALL_TABLES'.

CHECK YOUR FLAGS!

We strongly encourage everyone running MySQL to review the various sql_mode flags and choose the ones that make the most sense for their set up.

'ONLY_FULL_GROUP_BY,STRICT_ALL_TABLES' is the absolute minimum we’d recommend.

We don’t use any of MySQL’s DATE/TIME data-types nor any Engine other than InnoDB in our schemas, but if you do, we strongly suggest also using 'NO_ZERO_IN_DATE,NO_ZERO_DATE,NO_ENGINE_SUBSTITUTION'

References and resources

You can follow Keyur on Twitter at @keyurdg