Hadoop MapReduce in the Cassandra Cluster

Post Details

Company

DataStax

Date Published

Oct. 4, 2011

Author

Eric Gilmore

Word Count

542

Language

English

Hacker News Points

-

Source URL

www.datastax.com/blog/hadoop-mapreduce-cassandra-cluster

Summary

The post explains how to deploy and distribute Hadoop components within a Cassandra cluster so that jobs are processed efficiently. It recommends overlaying Hadoop on Cassandra by installing a Hadoop TaskTracker on each Cassandra node and dedicating one server to Hadoop components such as JobTracker, NameNode, and DataNode. Cassandra's Hadoop support rests on the input and output formats available in Cassandra 0.6 and 0.7, which allow MapReduce jobs to read data from Cassandra and write results back into it. The post also notes that work is underway to add support for Pig and Hive, and that setup tasks may be automated in the future.
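
The overlay layout described in the summary can be sketched as a simple role plan. This is only an illustration, not tooling from the post: the hostnames and the `plan_roles` helper are hypothetical, and the sketch assumes the dedicated Hadoop server is a separate machine from the Cassandra nodes.

```python
from typing import Dict, List

# Role names come from the summary; everything else here is illustrative.
DEDICATED_ROLES = ["JobTracker", "NameNode", "DataNode"]


def plan_roles(cassandra_nodes: List[str], dedicated_host: str) -> Dict[str, List[str]]:
    """Map each host to the daemons it would run under the overlay layout."""
    # Assumption of this sketch: the dedicated server is not a Cassandra node.
    if dedicated_host in cassandra_nodes:
        raise ValueError("dedicated Hadoop host is also listed as a Cassandra node")
    # Every Cassandra node also runs a TaskTracker, so map tasks run next to the data.
    plan = {host: ["Cassandra", "TaskTracker"] for host in cassandra_nodes}
    # One server carries the remaining Hadoop components.
    plan[dedicated_host] = list(DEDICATED_ROLES)
    return plan


# Hypothetical hostnames, for illustration only.
plan = plan_roles(["cass1", "cass2", "cass3"], "hadoop-master")
for host, roles in plan.items():
    print(f"{host}: {', '.join(roles)}")
```

Placing a TaskTracker on every Cassandra node is what lets each map task work on locally stored data, which is the reason the summary gives for the overlay.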