Look Ma! No ETL!

What's this blog post about?

The post describes how Solr and Hadoop are integrated with Cassandra in DataStax Enterprise (DSE), a big data platform built on open source technology. Because search and analytics run against the same Cassandra data, you can run MapReduce jobs and full-text searches over one dataset without having to ETL it between two clusters. The example dataset is a Pew Research Center survey on Facebook habits and attitudes. The post walks through creating a Solr schema file, uploading it to DSE Solr, and importing the survey CSV data. It then shows how to search the data through Solr's HTTP API and how to run MapReduce jobs over the same imported data using Hive.
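As a rough illustration of the search step, the sketch below builds a query URL for Solr's standard /select handler and fetches the matching documents. It is not taken from the original post: the host, port, core name (pew.facebook_survey), and field name (age) are hypothetical placeholders, and it assumes the core is exposed under the usual /solr/<core>/select path.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(base_url, core, query, rows=10):
    """Build a URL for Solr's standard /select handler, asking for JSON."""
    params = urlencode({"q": query, "wt": "json", "rows": rows})
    return f"{base_url}/solr/{core}/select?{params}"


def search(base_url, core, query, rows=10):
    """Run the query and return the matching documents as a list of dicts.

    Requires a reachable Solr endpoint; base_url and core are assumptions
    and must be replaced with the values of a real deployment.
    """
    with urlopen(build_select_url(base_url, core, query, rows)) as resp:
        return json.load(resp)["response"]["docs"]
```

For example, build_select_url("http://localhost:8983", "pew.facebook_survey", "age:[18 TO 29]") yields a URL that can be pasted into a browser or passed to curl, while search(...) with the same arguments returns the parsed documents.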

Company
DataStax

Date published
July 16, 2013

Author(s)
Hamilton Tran

Word count
702

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.