Hadoop MapReduce in the Cassandra Cluster
This post explains how to deploy and distribute Hadoop components within a Cassandra cluster to keep processing times low. It recommends overlaying Hadoop on Cassandra: install a Hadoop TaskTracker on each Cassandra node, and dedicate one server to the remaining Hadoop components (JobTracker, NameNode, and DataNode). The input and output formats available in Cassandra 0.6 and 0.7 are the core of Cassandra's Hadoop support, allowing MapReduce jobs to read data from Cassandra and write results back into it. Work is also underway to add support for Pig and Hive, and setup tasks may be automated in the future.
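As a rough illustration of the recommended layout, the sketch below maps hosts to the daemons they would run. It is not taken from the original post: the function name `plan_daemons` and the host names are hypothetical, and the code only models the placement described above rather than installing or configuring anything.

```python
from typing import Dict, List


def plan_daemons(cassandra_nodes: List[str], hadoop_master: str) -> Dict[str, List[str]]:
    """Map each host to the daemons it would run under the overlay layout.

    Hypothetical helper for illustration only; it does not touch a real cluster.
    """
    plan: Dict[str, List[str]] = {}
    for host in cassandra_nodes:
        # A TaskTracker is co-located with every Cassandra node.
        plan[host] = ["Cassandra", "TaskTracker"]
    # One dedicated server carries the remaining Hadoop components.
    plan.setdefault(hadoop_master, []).extend(["JobTracker", "NameNode", "DataNode"])
    return plan


if __name__ == "__main__":
    # Host names are placeholders, not values from the original post.
    layout = plan_daemons(["cass1", "cass2", "cass3"], "hadoop-master")
    for host, daemons in layout.items():
        print(host, daemons)
```

Co-locating a TaskTracker with each Cassandra node is presumably what lets tasks run close to the data they read, which is the likely source of the processing-time benefit the summary mentions.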
Company
DataStax
Date published
Oct. 4, 2011
Author(s)
Eric Gilmore
Word count
542
Hacker News points
None found.
Language
English