Clustera is an integrated computation and data management system. In contrast to traditional cluster-management systems that target specific types of workloads, Clustera is designed for extensibility, so it can readily handle a wide variety of job types, ranging from computationally intensive, long-running jobs with minimal I/O requirements to complex SQL queries over massive relational tables. Another unique feature of Clustera is the way its architecture exploits modern software building blocks, including application servers and relational database systems, to realize important performance, scalability, portability, and usability benefits.

People

David DeWitt
Jeffrey Naughton
Eric Robinson
Srinath Shankar
Erik Paulson
Andrew Krioukov
Joshua Royalty

Publications

Turning Cluster Management into Data Management: A Systems Overview
Eric Robinson and David DeWitt
2007 Conference on Innovative Data Systems Research

Data driven workflow planning in cluster management systems
Srinath Shankar and David J. DeWitt
HPDC '07: The 16th International Symposium on High Performance Distributed Computing

Clustera: An Integrated Computation And Data Management System
David J. DeWitt, Eric Robinson, Srinath Shankar, Erik Paulson, Jeffrey Naughton, Joshua Royalty, Andrew Krioukov
To appear, VLDB08

Computer Sciences Technical Report 16something - Clustera: An Integrated Computation and Data Management System
This is an extended version of the VLDB08 paper, with additional experimental results.