Troubleshooting Cassandra File System
Cassandra File System (CFS) is an HDFS-compatible file system implemented on top of Cassandra. It is fully distributed and has no single point of failure (SPOF). CFS uses two tables, inode and sblocks, which store file metadata and data blocks respectively. When using CFS, you may occasionally encounter two problems: orphan blocks and lost blocks. Orphan blocks occur when a file stream is not closed properly, leaving data blocks in the sblocks table that no inode references. To remove them, use the dsetool repaircfs command. Lost blocks occur when a file's inode exists but one or more of the data blocks it references cannot be read. This can be caused by writing to CFS with an insufficient Consistency Level or by corruption of CFS data files. To diagnose CFS inconsistencies, use the dsetool checkcfs tool, which has two modes of operation: recursively checking directories and checking single files. Running nodetool repair cfs can fix some issues, but permanent file corruption may require deleting and re-saving the file. To avoid these problems, a Replication Factor of at least 3 and a Consistency Level of at least CL.QUORUM are recommended.
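The difference between orphan blocks and lost blocks, and the reasoning behind the Replication Factor recommendation, can be illustrated with a small toy model. This is a minimal sketch, not how dsetool or CFS work internally: the dictionary and set standing in for the inode and sblocks tables, the block IDs, the paths, and the function names are all hypothetical, and the real CFS schema is more involved. The quorum formula is the standard Cassandra one (a majority of replicas).

```python
from typing import Dict, List, Set, Tuple


def classify_blocks(
    inodes: Dict[str, List[str]], sblocks: Set[str]
) -> Tuple[Set[str], Dict[str, List[str]]]:
    """Toy classification of CFS-style inconsistencies.

    inodes  maps a file path to the block IDs it references (hypothetical layout).
    sblocks is the set of block IDs that can actually be read.
    """
    referenced = {block for blocks in inodes.values() for block in blocks}

    # Orphan blocks: stored, but no inode points at them.
    orphans = sblocks - referenced

    # Lost blocks: an inode points at them, but they cannot be read.
    lost = {}
    for path, blocks in inodes.items():
        missing = [block for block in blocks if block not in sblocks]
        if missing:
            lost[path] = missing
    return orphans, lost


def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that may be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


# Hypothetical sample data: b2 is referenced but unreadable, b9 is unreferenced.
sample_inodes = {"/data/a.txt": ["b1", "b2"], "/data/b.txt": ["b3"]}
sample_sblocks = {"b1", "b3", "b9"}

orphans, lost = classify_blocks(sample_inodes, sample_sblocks)
print("orphan blocks:", orphans)
print("files with lost blocks:", lost)

for rf in (1, 2, 3):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

In this model, b9 is reported as an orphan and /data/a.txt as a file with a lost block. The loop shows why a Replication Factor below 3 is fragile at QUORUM: with 1 or 2 replicas no node can be unavailable, whereas with 3 replicas one node can be down and QUORUM reads and writes still overlap on at least one replica.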
Company
DataStax
Date published
Oct. 1, 2013
Author(s)
Piotr Kołaczkowski
Word count
603
Language
English
Hacker News points
None found.