Look Ma! No ETL!
The post covers the integration of Solr and Hadoop with Cassandra in DataStax Enterprise (DSE), which builds on open source big data technology. It explains how MapReduce jobs and searches can run over the same data without an ETL step to move that data between two separate clusters. The example dataset is a survey from the Pew Research Center about Facebook habits and attitudes. The post shows how to create a Solr schema file, upload it to DSE Solr, and import the survey CSV data for processing. It then shows how to search that data through Solr's HTTP API and how to run MapReduce jobs over it using Hive.
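The search step described above can be sketched as a small helper that builds a request URL for Solr's standard `select` handler. This is a minimal illustration, not the post's own code: the host, the core name `pew.facebook_survey`, and the field `age` are hypothetical placeholders, and the port assumes Solr's usual default of 8983.

```python
from urllib.parse import urlencode


def build_solr_select_url(host, core, query, rows=10, port=8983):
    """Build a URL for Solr's select handler that asks for JSON results.

    The core name and query are supplied by the caller; nothing here is
    specific to the survey dataset.
    """
    params = urlencode({"q": query, "wt": "json", "rows": rows})
    return f"http://{host}:{port}/solr/{core}/select?{params}"


# Hypothetical core and field names, used only for illustration.
url = build_solr_select_url("localhost", "pew.facebook_survey", "age:[18 TO 29]")
print(url)
```

The resulting URL can be fetched with any HTTP client. Building it with `urlencode` keeps characters such as `:` and `[` in the query properly escaped.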
Company
DataStax
Date published
July 16, 2013
Author(s)
Hamilton Tran
Word count
702
Language
English
Hacker News points
None found.