
Powers of Ten – Part II

What's this blog post about?

This article discusses strategies for bulk loading data into Titan at the scale of hundreds of millions and billions of edges, using Faunus as the loading tool. It gives a step-by-step guide to loading the DocGraph dataset, a graph of approximately 1 million vertices and 154 million edges, on a single Hadoop node running in pseudo-distributed mode. It then shows how to load the Friendster social network dataset, a graph of 117 million vertices and 2.5 billion edges, on a four-node Hadoop cluster. It emphasizes that although some loading strategies are common across scales, the actual approach must be adapted to the specific data and domain.

Company
DataStax

Date published
June 2, 2014

Author(s)
Stephen Mallette

Word count
2202

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.