Powers of Ten – Part I
The post discusses strategies for loading data into Titan, a distributed graph database. It outlines different approaches based on the size of the data, ranging from millions to billions of edges. For smaller datasets (up to tens of millions of edges), it recommends common Gremlin operations and scripts executed through the Gremlin REPL. Larger datasets may require more advanced techniques such as BatchGraph, which handles intermediate commits and maintains a vertex cache. The post also works through examples using real-world datasets, namely the Wikipedia Vote Network and the DocGraph dataset, to demonstrate these strategies.
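The idea of intermediate commits combined with a vertex cache can be illustrated with a minimal, in-memory sketch. This is not the Titan or Blueprints API: `ToyBatchLoader`, `load_edge_list`, and the `batch_size` parameter are hypothetical names, and the input is assumed to be a whitespace-separated edge list with `#` comment lines. The toy also counts only edges toward the commit threshold, which is a simplification.

```python
from typing import Dict, Iterable, List, Tuple


class ToyBatchLoader:
    """In-memory stand-in for a batch-loading wrapper (hypothetical, not a real graph API)."""

    def __init__(self, batch_size: int = 1000) -> None:
        self.batch_size = batch_size
        self.vertex_cache: Dict[str, int] = {}     # external id -> internal vertex id
        self.edges: List[Tuple[int, int]] = []     # edges that have been "committed"
        self.pending: List[Tuple[int, int]] = []   # edges added since the last commit
        self.commits = 0

    def _get_or_create_vertex(self, external_id: str) -> int:
        # The cache avoids a lookup against the store for every edge endpoint.
        if external_id not in self.vertex_cache:
            self.vertex_cache[external_id] = len(self.vertex_cache)
        return self.vertex_cache[external_id]

    def add_edge(self, out_id: str, in_id: str) -> None:
        out_v = self._get_or_create_vertex(out_id)
        in_v = self._get_or_create_vertex(in_id)
        self.pending.append((out_v, in_v))
        # Commit once the pending buffer reaches the batch size.
        if len(self.pending) >= self.batch_size:
            self.commit()

    def commit(self) -> None:
        # Flush pending edges; an empty buffer does not count as a commit.
        if self.pending:
            self.edges.extend(self.pending)
            self.pending.clear()
            self.commits += 1


def load_edge_list(lines: Iterable[str], batch_size: int = 1000) -> ToyBatchLoader:
    """Load 'out in' pairs, skipping blank lines and '#' comments."""
    loader = ToyBatchLoader(batch_size)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        out_id, in_id = line.split()[:2]
        loader.add_edge(out_id, in_id)
    loader.commit()  # flush whatever remains after the last full batch
    return loader
```

Calling `load_edge_list(open(path), batch_size=1000)` on such a file would commit in chunks rather than holding the whole load in a single transaction, which is the behavior the summary attributes to BatchGraph.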
Company
DataStax
Date published
May 29, 2014
Author(s)
Stephen Mallette
Word count
1342
Hacker News points
None found.
Language
English