Posts

Showing posts with the label MapReduce

HBASE Vs. RDBMS Top Differences You can Unlock Now

Image
HBASE in the Big data context has a lot of benefits over RDBMS. The listed differences below make you understandable why HBASE is popular in Hadoop (or Bigdata) platform. Let us check one by one quickly. HBASE Vs. RDBMS Differences Random Accessing HBase handles a large amount of data that is store in a distributed manner in the column-oriented format while RDBMS is systematic storage of a database that cannot support a random manner for accessing the database. Database Rules RDBMS strictly follow Codd's 12 rules with fixed schemas and row-oriented manner of database and also follow ACID properties. HBase follows BASE properties and implement complex queries. Secondary indexes, complex inner and outer joins, count, sum, sort, group, and data of page and table can easily be accessible by RDBMS. Storage From small to medium storage application there is the use of RDBMS that provide the solution with MySQL and PostgreSQL whose size increase with concurrency and performance.  Codd'

6 Stages of MapReduce Here to Understand Now

Image
Here are the six stages of MapReduce . The MapReduce is critical for your data processing needs. Traditionally, the whole file needs to read once then divided manually, but it is not convenient. With that respect, Hadoop provides the facility to read files (ignoring their size) line-for-line by using offset and key-value. Hadoop MapReduce Data Flowchart MapReduce Process 6 Stages of MapReduce - Know How It Process Data MapReduce receives input and processes it. Here are the six stages of processing . It is helpful for your interviews and project. Step 1:  Take the file as input for processing purposes. Any file will consist of a group of lines. These lines containing key-value pairs of data. The whole file can be read out with this method. Step 2:  In the next step, the file will be in "splitting" mode. This mode will divide the file into key, value pair of data. This time key will be offset and data will be a valuable part of the program. Each line will be read individually