Flash balances the books with atomic writes
Let’s say you have $50,000 in your bank account. One key to making sure your accounting system and your bank agree that you really have that amount is to ensure that every transaction in your accounting system uses atomic writes.
Atomic writes in this case means that a transaction has to be indivisible. If a transaction deducts $50,000 from your Bank of America account and deposits $50,000 in your account at the Grand Cayman Community Bank, all or none of the transaction should be saved to the database.
If the system didn’t use a journal, double-write buffer or some other method for ensuring atomicity, a crash after the deduction was saved but before the deposit was recorded would make the $50,000 just disappear.
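To make the disappearing money concrete, here is a minimal sketch in Python. The account names, the `Crash` exception and both transfer functions are made up for illustration, and the final `update` call only stands in for whatever single commit point a real database would use.

```python
class Crash(Exception):
    """Simulated power failure in the middle of a transaction."""


def transfer_unsafe(accounts, src, dst, amount, crash_midway=False):
    # Each half of the transfer is written straight to the live data.
    accounts[src] -= amount
    if crash_midway:
        raise Crash()
    accounts[dst] += amount


def transfer_atomic(accounts, src, dst, amount, crash_midway=False):
    # Both changes are staged on a private copy and published together.
    staged = dict(accounts)
    staged[src] -= amount
    if crash_midway:
        raise Crash()
    staged[dst] += amount
    accounts.update(staged)  # stands in for a single commit point


def run(transfer):
    accounts = {"bank_of_america": 50_000, "grand_cayman": 0}
    try:
        transfer(accounts, "bank_of_america", "grand_cayman", 50_000,
                 crash_midway=True)
    except Crash:
        pass  # the system restarts and looks at what was saved
    return accounts


unsafe_result = run(transfer_unsafe)   # the deduction was saved, the deposit was not
atomic_result = run(transfer_atomic)   # neither half was saved
print(unsafe_result)
print(atomic_result)
```

In the unsafe version the crash leaves both balances at zero. In the atomic version the original $50,000 is still where it started.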
Atomic Writes and Flash
Flash is a whole different animal. The flash controller in an SSD is always writing to blank pages across the flash it manages. After it writes new data to fresh pages, the flash controller updates the metadata that maps the logical block addresses it presents to the outside world to the physical pages that now hold the data.
Because the flash controller doesn’t overwrite the current contents when it writes to a range of logical blocks, say blocks 124-130, atomicity comes almost for free. As long as the controller doesn’t update the metadata until all the changes for a transaction are complete, an interrupted transaction is never posted, and the drive continues to return the old data.
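The idea can be sketched with a toy model of the controller's mapping table. This is a simplified illustration, not how any particular SSD firmware works; the `ToyFlash` class and its method names are invented for the example.

```python
class ToyFlash:
    """Toy flash controller: data always lands on fresh pages."""

    def __init__(self):
        self.pages = []      # physical pages, only ever appended to
        self.mapping = {}    # logical block number -> index into self.pages

    def write_atomic(self, updates, fail_before_commit=False):
        staged = {}
        for block, data in updates.items():
            self.pages.append(data)              # new data goes to a blank page
            staged[block] = len(self.pages) - 1
        if fail_before_commit:
            return False                         # power lost: mapping untouched
        self.mapping.update(staged)              # the metadata update is the commit
        return True

    def read(self, block):
        return self.pages[self.mapping[block]]


flash = ToyFlash()
blocks = range(124, 131)                          # logical blocks 124-130
flash.write_atomic({b: "old" for b in blocks})

# The new data is written, but the failure hits before the mapping changes.
flash.write_atomic({b: "new" for b in blocks}, fail_before_commit=True)
after_failure = [flash.read(b) for b in blocks]

# The same write, completed this time.
flash.write_atomic({b: "new" for b in blocks})
after_success = [flash.read(b) for b in blocks]
```

After the interrupted write, every block still reads back as "old", even though the new pages were physically written. Only the completed write flips all seven blocks to "new" at once.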
The storage community has been working to better integrate atomic writes and flash. For instance, Fusion-io promotes a dedicated atomic write API for its products. Several MySQL implementations, including the MariaDB fork, now support the use of Fusion-io’s atomic write API. Meanwhile, the INCITS T10 committee, which defines the SCSI command set, is working on an extension of the SCSI standard to support atomic writes.
Both the Fusion-io API and the T10 SCSI command set extensions simply provide a mechanism for applications to tell the flash controller which set of updates should be performed as an atomic entity. The flash controller can then hold its metadata updates until all those writes are complete.
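From the application's side, that grouping might look something like the sketch below. To be clear, this is neither the Fusion-io API nor the T10 command set; `HypotheticalDevice` and `atomic_group` are invented names that only illustrate the idea of marking a set of writes as one unit.

```python
from contextlib import contextmanager


class HypotheticalDevice:
    """Stand-in for an SSD; not the Fusion-io or T10 interface."""

    def __init__(self):
        self.visible = {}     # what reads return: logical block -> data
        self._pending = None  # staged writes while a group is open

    def write(self, block, data):
        if self._pending is not None:
            self._pending[block] = data   # held until the group closes
        else:
            self.visible[block] = data    # ordinary, ungrouped write

    @contextmanager
    def atomic_group(self):
        self._pending = {}
        try:
            yield self
            self.visible.update(self._pending)  # publish the whole group
        finally:
            self._pending = None  # on error, the staged writes are dropped


dev = HypotheticalDevice()
dev.write(1, "old-a")
dev.write(2, "old-b")

try:
    with dev.atomic_group():
        dev.write(1, "new-a")
        raise RuntimeError("application failed mid-transaction")
except RuntimeError:
    pass
after_failure = dict(dev.visible)   # both blocks still hold the old data

with dev.atomic_group():
    dev.write(1, "new-a")
    dev.write(2, "new-b")
after_success = dict(dev.visible)   # both blocks changed together
```

The application never has to write the data twice or keep its own journal for these blocks; it only tells the device where the group begins and ends.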
Benchmark data shows that with MariaDB, atomic writes can provide a 30 percent to 50 percent performance boost and reduce CPU utilization. In addition to the performance boost, using atomic writes eliminates the need to write data twice, which will significantly improve SSD life.
I hope the T10 committee finishes its work on a standard set of atomic write commands as quickly as possible, and that enterprise SSD vendors get on the atomic write bandwagon right away. Then we can start thinking about how SAN vendors, whose systems use the same SCSI commands as disk drives, can implement atomicity.