A Comparison of C-Store and Row-Store in a Common Framework
Alan Halverson, Jennifer L. Beckmann, Jeffrey F. Naughton, and David J. DeWitt
Recently, a “column store” system called C-Store has shown significant performance benefits by utilizing storage optimizations for a read-mostly query workload. The authors of the C-Store paper compared their optimized column store to a commercial row store RDBMS that is optimized for a mixture of reads and writes, which obscures the relative benefits of row and column stores. In this paper, we describe two storage optimizations for a row store architecture given a read-mostly query workload – “super tuples” and “column abstraction.” We implement both our optimized row store and C-Store in a common framework in order to perform an “apples-to-apples” comparison of the optimizations in isolation and in combination. We also develop a detailed cost model for sequential scans to break down time spent into three categories – disk I/O, iteration cost, and local tuple reconstruction cost. We conclude that, while the C-Store system offers tremendous performance benefits for scanning a small fraction of columns from a table, our optimized row store provides disk storage savings, reduced sequential scan times, and low additional CPU overheads while requiring only evolutionary changes to a standard row store.