Zen and the Art of Spark Maintenance
This blog post looks in detail at the inner workings of Apache Spark and at how to shape an application to take advantage of the interactions between Spark and Apache Cassandra. It covers key components of Spark, such as its four processes, executor JVMs, heap memory allocation, and RDDs. The post also discusses troubleshooting connections between the driver and executors, minimizing shuffles, caching RDDs, leveraging Cassandra's advantages within Spark, and using metrics to monitor throughput to and from Cassandra.
Company
DataStax
Date published
June 10, 2015
Author(s)
Russell Spitzer
Word count
2971
Language
English
Hacker News points
None found.