Getting started with HDFS on Kubernetes

What's this blog post about?

This post walks through running the Hadoop Distributed File System (HDFS) on Kubernetes. It first explains the basic architecture of HDFS, then shows how to map that architecture onto Kubernetes. The author highlights the main challenge of deploying HDFS on Kubernetes: pods can go down and come back up with different IP addresses. To address this, the author proposes wrapping the namenode in a Service resource and using StatefulSets to give datanodes stable identities. The post also demonstrates how to run a fully distributed HDFS on a single node using Kubernetes PersistentVolume (PV) resources. It concludes by noting that a follow-up post will cover deploying Apache Spark on Kubernetes to process data stored in the new Kubernetes-hosted HDFS.
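As a rough illustration of the addressing idea summarized above, here is a minimal Python sketch, not taken from the original post. It builds a bare-bones Service manifest for the namenode and lists the stable DNS names that StatefulSet pods receive behind a headless Service. The resource names (`hdfs-namenode`, `hdfs-datanode`), the RPC port 8020, the `default` namespace, and the `cluster.local` cluster domain are all assumptions chosen for illustration.

```python
def namenode_service(name="hdfs-namenode", port=8020):
    """Return a minimal Service manifest that fronts the namenode pod.

    The name and port are hypothetical; clients would address the
    namenode by the Service name instead of a changing pod IP.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": name},
            "ports": [{"name": "rpc", "port": port}],
        },
    }


def datanode_hostnames(statefulset="hdfs-datanode", service="hdfs-datanode",
                       namespace="default", replicas=3,
                       cluster_domain="cluster.local"):
    """List the stable DNS names of StatefulSet pods.

    StatefulSet pods are named <statefulset>-<ordinal>, and a headless
    Service gives each one a DNS record that survives pod restarts.
    """
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


if __name__ == "__main__":
    print(namenode_service()["metadata"]["name"])
    for host in datanode_hostnames():
        print(host)
```

Because these names stay fixed when a pod is rescheduled, the namenode and datanodes can keep referring to each other by hostname even though the underlying IP addresses change.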

Company
Hasura

Date published
Feb. 13, 2018

Author(s)
Tirumarai Selvan

Word count
1041

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.