Hadoop MapReduce in the Cassandra Cluster

What's this blog post about?

This post explains how to deploy and distribute Hadoop components within a Cassandra cluster to keep processing times low. It recommends overlaying Hadoop on Cassandra by installing a Hadoop TaskTracker on each Cassandra node and setting aside one dedicated server for the remaining Hadoop components: the JobTracker, NameNode, and DataNode. Cassandra's Hadoop support rests on the input and output formats available in Cassandra 0.6 and 0.7, which let jobs read data from Cassandra and write results back into it. The post also notes that work is underway to add support for Pig and Hive, and that setup tasks may be automated in the future.
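
The overlay layout described above can be pictured as a simple host-to-daemon map. The following is a minimal illustrative sketch, not code from the original post: the `plan_overlay` helper and the host names are hypothetical, and only the daemon names come from the summary.

```python
from typing import Dict, List

# Daemon names as listed in the summary above.
CASSANDRA = "Cassandra"
TASK_TRACKER = "TaskTracker"
MASTER_ROLES = ["JobTracker", "NameNode", "DataNode"]


def plan_overlay(cassandra_nodes: List[str], hadoop_master: str) -> Dict[str, List[str]]:
    """Return a host -> daemons map for the overlay layout (hypothetical helper)."""
    if hadoop_master in cassandra_nodes:
        # The summary calls for one server dedicated to the Hadoop components.
        raise ValueError("hadoop_master must not also be a Cassandra node")
    # Every Cassandra node also runs a TaskTracker.
    layout = {node: [CASSANDRA, TASK_TRACKER] for node in cassandra_nodes}
    # The dedicated server hosts the remaining Hadoop daemons.
    layout[hadoop_master] = list(MASTER_ROLES)
    return layout


if __name__ == "__main__":
    # Host names here are placeholders chosen for illustration.
    for host, daemons in plan_overlay(["cass1", "cass2", "cass3"], "hadoop1").items():
        print(host, daemons)
```

In this sketch each Cassandra host carries a TaskTracker alongside Cassandra itself, while the JobTracker, NameNode, and DataNode sit together on the one dedicated host.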

Company
DataStax

Date published
Oct. 4, 2011

Author(s)
Eric Gilmore

Word count
542

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.