
How to Monitor Spark on Kubernetes

What's this blog post about?

Data processing with Apache Spark can be optimized by monitoring its performance, especially when Spark runs on Kubernetes. Kubernetes support in Spark became generally available in March 2021, and companies are adopting this deployment model to make their infrastructure more efficient. Monitoring tools such as Acceldata's Pulse integration provide an overview of Spark jobs, including job status, memory usage, and other metrics. Spark metrics can be accessed through the Spark UI or the REST API, which returns JSON that is easy to visualize and feed into monitoring tools. The Kubernetes Dashboard offers basic resource metrics but does not directly link them to specific Spark jobs. Pulse lets users build custom dashboards with a variety of metrics and visualizations, improving Spark observability on Kubernetes.
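As a rough illustration of the REST API approach mentioned above, the sketch below parses the JSON that Spark's monitoring API serves at `/api/v1/applications` (typically on the driver UI port, 4040 by default) into the fields a dashboard might display. The field names follow Spark's documented API, but the sample payload and the `summarize_applications` helper are illustrative assumptions, not part of the original post.

```python
import json

# Illustrative sample of the JSON returned by Spark's monitoring REST API
# (GET http://<driver>:4040/api/v1/applications). Field names match the
# documented API; the values here are made up for demonstration.
SAMPLE_RESPONSE = """
[
  {
    "id": "spark-abc123",
    "name": "etl-job",
    "attempts": [
      {"completed": false, "sparkUser": "analytics", "duration": 120000}
    ]
  }
]
"""

def summarize_applications(raw_json: str) -> list:
    """Reduce the applications payload to the fields a dashboard might show."""
    apps = json.loads(raw_json)
    return [
        {
            "id": app["id"],
            "name": app["name"],
            # An application is still running if any attempt is incomplete.
            "running": any(not a["completed"] for a in app.get("attempts", [])),
        }
        for app in apps
    ]

print(summarize_applications(SAMPLE_RESPONSE))
```

In a real deployment you would fetch the payload with an HTTP client instead of the inline sample; the same JSON shape is served whether Spark runs on Kubernetes, YARN, or standalone.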

Company
Acceldata

Date published
Sept. 21, 2021

Author(s)
Ashwin Rajeev

Word count
715

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.