Revisiting Database Storage Optimizations on Flash
Mohit Saxena and Michael M. Swift
The database storage hierarchy has been heavily optimized for the performance characteristics of disks. Storage managers typically employ row- or column-oriented storage layouts, or a combination, to improve the I/O performance of different query workloads with disks. The recent rise of flash memory-based solid-state drives (SSDs) significantly change the performance characteristics of storage: these drives provide an order of magnitude lower read/access latencies, significantly higher read bandwidths, and most importantly, negligible seek overheads.
In light of these differences, we analyze major storage optimizations for read-optimized databases. We examine the benefits of row and column-oriented storage layouts on flash SSDs. Our measurements span through different workload variations, including selectivity, projectivity and concurrency that affect query processing on flash. Further, we also investigate the cost and benefits of a set of database optimizations, including data compression, prefetching, and indexes on flash SSDs. We back our experimental evaluation with analytical models of the performance tradeoffs of these optimizations.
Three of our key findings are: (1) SSDs scale up linearly with concurrent execution of database queries and outperform disks by up to a factor of two, (2) the low seek cost on SSDs makes column-storage a better choice for laying out data on a variety of flash devices, (3) and that while data compression is useful to further leverage the bandwidth of flash, database prefetching has less benefit for flash storage. Finally, we present a list of design implications of our findings on future database and operating systems for effectively embracing flash storage.
Download this report (PDF)
Return to tech report index