Hadoop: How to find which file is healthy

Hadoop provides a file system health check utility called "fsck". It checks the health of all the files under a given path. Run against '/' (the root), it checks every file in the file system.

    bin/hadoop fsck /         checks the health of all the files
    bin/hadoop fsck /test/    checks the health of the files under that path

By default, the fsck utility does nothing about under-replicated and over-replicated blocks. Hadoop heals those blocks itself.

How to find which file is healthy

fsck prints a dot for each healthy file. For each file that is not healthy, it prints a message instead. This covers files with under-replicated blocks, over-replicated blocks, mis-replicated blocks, and corrupted blocks.

How to delete corrupted blocks

fsck takes a path rather than block names. With the -delete option, it deletes all the corrupted files under that path:

    bin/hadoop fsck / -delete

    bin/hadoop fsck -m
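Since fsck prints one dot per healthy file and a message for each unhealthy one, the output can be tallied with a small filter. The following is only a sketch, not part of Hadoop: the function name summarize_fsck is made up for illustration, and it assumes that dots for healthy files appear at the start of a line and that each per-file message starts with the file's path (a leading '/'). The exact output format may differ between Hadoop versions.

```shell
# Summarize fsck output read from standard input.
# Assumptions (hypothetical, check against your Hadoop version):
#   - each healthy file is shown as one '.' in a run at the start of a line
#   - a message about an unhealthy file starts with the file path ('/')
summarize_fsck() {
    awk '
        {
            # Length of the run of dots at the start of the line (0 if none).
            n = match($0, /^\.+/) ? RLENGTH : 0
            healthy += n
            rest = substr($0, n + 1)
            # Text that starts with a path is treated as a per-file report.
            if (rest ~ /^\//) {
                problems++
                print "PROBLEM: " rest
            }
        }
        END {
            printf "healthy files: %d\n", healthy
            printf "files with problems: %d\n", problems
        }
    '
}
```

It would be used by piping the fsck output into it, for example: bin/hadoop fsck /test/ | summarize_fsck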