Themis: I/O Efficient MapReduce
(11/25/2013) 11 minutes

description:

Themis tackles the inefficiencies inherent in big data processing engines such as Hadoop. Existing systems sacrifice orders of magnitude of performance for the sake of scalability. For example, Hadoop can sort 100 TB of data using thousands of machines; Themis makes problems of that size feasible with only around 50 machines. While other solutions focus on smaller problems that can be solved with in-memory techniques, Themis is designed for larger problems, where efficient use of secondary storage is critical to good performance.

more on this subject: