Posts

Showing posts with the label Hadoop Utilities

HBASE Vs. RDBMS Top Differences You can Unlock Now

HBase has many benefits over an RDBMS in the big data context. The differences listed below help explain why HBase is popular on the Hadoop (big data) platform. Let us check them one by one.

HBASE Vs. RDBMS Differences

Random Accessing

HBase handles large amounts of data stored in a distributed manner in a column-oriented format and supports random access to it. An RDBMS stores data systematically in a row-oriented format and is not designed for random access to data at that scale.

Database Rules

An RDBMS is built around Codd's 12 rules, with fixed schemas and row-oriented storage, and it follows ACID properties. HBase follows BASE properties and does not natively support complex queries. Secondary indexes, complex inner and outer joins, count, sum, sort, group, and access to data by page and table are all readily available in an RDBMS.

Storage

For small to medium storage applications, an RDBMS such as MySQL or PostgreSQL provides the solution, with size increasing along with concurrency and performance.  Codd'
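To make the row-oriented versus column-oriented contrast concrete, here is a minimal Python sketch. It is only a toy in-memory model and not HBase client code: the `rdbms_rows` and `hbase_like` structures, the `get_cell` helper, and the sample values are all hypothetical, chosen purely for illustration.

```python
# Toy illustration only: plain Python structures, no real database involved.

# Row-oriented (RDBMS-style): fixed schema, every row carries every column,
# so missing values still occupy a slot (here None).
rdbms_rows = [
    {"id": 1, "name": "alice", "city": "Pune", "phone": None},
    {"id": 2, "name": "bob", "city": None, "phone": "555-0100"},
]

# Column-oriented (HBase-style): row key -> column family -> qualifier -> value.
# Cells without a value are simply absent, so rows can be sparse.
hbase_like = {
    "row1": {"info": {"name": "alice", "city": "Pune"}},
    "row2": {"info": {"name": "bob"}, "contact": {"phone": "555-0100"}},
}


def get_cell(table, row_key, family, qualifier):
    """Look up a single cell directly by row key; return None if it is absent."""
    return table.get(row_key, {}).get(family, {}).get(qualifier)


print(get_cell(hbase_like, "row2", "contact", "phone"))
print(get_cell(hbase_like, "row1", "contact", "phone"))
```

The nested dictionary mirrors the idea that a cell is addressed by row key, column family, and qualifier, while the list of fixed-shape records mirrors a table with a rigid schema.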

Hadoop: How to find which file is healthy

Hadoop provides a file system health check utility called "fsck". It checks the health of all the files under a given path, including all the files under '/' (root).

bin/hadoop fsck / - checks the health of all the files
bin/hadoop fsck /test/ - checks the health of the files under that path

By default, the fsck utility does nothing about under-replicated and over-replicated blocks. Hadoop itself heals those blocks.

How to find which file is healthy

It prints a dot for each healthy file. For each file that is not healthy, it prints a message. This covers under-replicated blocks, over-replicated blocks, mis-replicated blocks, and corrupted blocks.

How to delete corrupted blocks

bin/hadoop fsck / -delete - deletes the corrupted files under the path

bin/hadoop fsck -m
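The fsck invocations described in this post can also be assembled from a script. The following is a minimal Python sketch under stated assumptions: `build_fsck_command` and `run_fsck` are hypothetical helper names, and the default `bin/hadoop` launcher path is an assumption that should be adjusted to your installation. The path is passed first and the `-delete` option after it.

```python
import subprocess


def build_fsck_command(path="/", delete_corrupted=False, hadoop_bin="bin/hadoop"):
    """Return the argument list for an fsck run on `path`.

    hadoop_bin is an assumption: point it at wherever the hadoop launcher lives.
    """
    cmd = [hadoop_bin, "fsck", path]
    if delete_corrupted:
        # -delete removes corrupted files, so use it with care.
        cmd.append("-delete")
    return cmd


def run_fsck(path="/", **kwargs):
    """Run fsck and return its raw text output (needs a working Hadoop install)."""
    result = subprocess.run(
        build_fsck_command(path, **kwargs),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Only build and show the commands here; nothing is executed against a cluster.
print(build_fsck_command("/test/"))
print(build_fsck_command("/", delete_corrupted=True))
```

Keeping command construction separate from execution makes it possible to review exactly what would run before touching a live cluster, which matters for a destructive option such as `-delete`.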