Okay, I'll admit it: SQL and big data things aren't at the top of my skillset.

No earlier than today, I resolved an issue we had in our customer databases: adding a missing index.
Seems easy at first but if the table is HUGE (yea, 8GB+ is big for me) with about 246MM rows (!!), it gets damn slow and the downtime doesn't last for an eye blink.

Reminder: ALTER TABLE allows read operations while altering the table then blocks reads and writes ops when the alteration is done and needs to go live. This may vary according to the configuration but the point is: altering a table has a downtime impact. RTFM.

Altering a ~246MM rows table takes time, unless you sold your soul to the devil to own a kickin'-ass database cluster that is able to stand any kind load.

Besides, in this table, most of the rows were junk rows that needed to be cleared but even performing a massive deletion would take hours (this time per customer, yey!).

With some help, we came up with this first idea:

  1. Create a temporary table
  2. Fill it with rows we want to keep from the big-ass table
  3. Empty the big-ass table
  4. Add the index to the big-ass table
  5. Refill the indexed big-ass table with the temporary table rows
  6. Drop the temporary table

I wrapped the whole thing within a transaction and tested it. Of course, it failed (because of a syntax error). As a result and also because the TRUNCATE is not an atomic operation, I was good to reimport the whole database *table-flip*.

Let's go for another 20min of database import... *cereal-eating*
This left me plenty time to research how to do this correctly.

Since TRUNCATE is not atomic and RENAME is, I tweaked our first idea to this:

  1. Create a big-ass-2 table from current big-ass-1 table schema
  2. Add the index to big-ass-2 table
  3. Fill the big-ass-2 table with rows we want to keep from big-ass-1 table
  4. Rename big-ass-1 table into big-ass-3 table
  5. Rename big-ass-2 table into big-ass-1 table
  6. Drop big-ass-3 table

I could test that the transaction would successfully rollback (since my SQL syntax isn't perfect) and surprisingly, when successful, the whole operation took about 2 seconds on my MacBook Pro (to handle ~246MM rows).

Impressive!

If you are interested into the SQL code:

DROP TABLE IF EXISTS `foo_new`;
CREATE TABLE `foo_new` LIKE `foo`;
ALTER TABLE `foo_new` ADD INDEX `indexName` (`fieldName`);
INSERT INTO `foo_new`
SELECT `foo`.*
FROM `foo`
INNER JOIN `bar` ON `bar`.`pkField` = `foo`.`fkField`
AND `bar`.`stringField` NOT LIKE "pattern%";
RENAME TABLE `foo` TO `foo_old`, `foo_new` TO `foo`;
DROP TABLE `foo_old`;

Thanks for reading and don't hesitate to share.


Joris Berthelot