The Ultimate Cheat Sheet On Hadoop

The cheat sheet below lists the top 20 frequently asked questions for testing your Hadoop knowledge. Try to work out your own answers first, then compare them with the answers given here.

Question #1 

You have written a MapReduce job that will process 500 million input records and generate 500 million key-value pairs. The data is not uniformly distributed. Your MapReduce job will create a significant amount of intermediate data that it needs to transfer between mappers and reducers, which is a potential bottleneck. A custom implementation of which of the following interfaces is most likely to reduce the amount of intermediate data transferred across the network?

A. Writable
B. WritableComparable
C. InputFormat
D. OutputFormat
E. Combiner
F. Partitioner
Ans: E
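
The effect behind this answer can be sketched in a few lines of plain Python. This is only a simulation under stated assumptions, not Hadoop code: the function names (map_phase, combine, reduce_phase) and the tiny skewed input are hypothetical, chosen for illustration. In a real job, the combiner is a Reducer subclass registered with Job.setCombinerClass, and Hadoop may run it zero, one, or several times, so it is only safe for operations such as sums that are associative and commutative.

```python
from collections import Counter

def map_phase(split):
    """Emit one (word, 1) pair per input word, as a word-count mapper would."""
    return [(word, 1) for word in split]

def combine(pairs):
    """Locally sum values per key on the mapper side, before the shuffle."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

def reduce_phase(shuffled):
    """Sum all values per key across every mapper's output."""
    totals = Counter()
    for key, value in shuffled:
        totals[key] += value
    return dict(totals)

# Hypothetical skewed input: two splits dominated by one hot key.
splits = [
    ["a", "a", "a", "a", "b"],
    ["a", "a", "a", "c", "c"],
]

# Records that would cross the network in each case.
without_combiner = [pair for s in splits for pair in map_phase(s)]
with_combiner = [pair for s in splits for pair in combine(map_phase(s))]

print(len(without_combiner), "records shuffled without a combiner")
print(len(with_combiner), "records shuffled with a combiner")

# The final result is the same either way; only the shuffle volume changes.
assert reduce_phase(without_combiner) == reduce_phase(with_combiner)
```

The more skewed the keys are within each split, the more pairs the local combine step collapses, which is why a combiner helps most on data like that described in the question.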

Question #2 

Where is the Hive metastore stored by default?

B. On the client machine, in the form of a flat file.
C. On the client machine, in a Derby database.
D. In the lib directory of HADOOP_HOME, and requires HADOOP_CLASSPATH to be modified.
Ans: C


Internet of Things Awesome Basics You Need to Read Now: Part 5

The Internet of Things (IoT) can be applied both vertically and horizontally. IoT applications have spread across an enormous number of industry sectors, and the development of vertical applications in these sectors is unbalanced.

It is very important to sort out those vertical applications and identify the common underpinning technologies that can be used across the board, so that interconnected, interrelated, and synergized integration, along with new, creative, and disruptive applications, can be achieved.

One of the common characteristics of the Internet of Things is that objects in an IoT world have to be instrumented.

The reason we need IoT is a fundamental change in the way information is generated: from mostly manual input to massive machine generation without human intervention.

To achieve such 5A (anything, anywhere, anytime, anyway, anyhow) and 3I (instrumented, interconnected, and intelligent) capabilities, some common, horizontal, general-purpose technologies, standards, and platforms have to be established, especially middleware platforms based on common data representations. These play the same role as the three-tiered application server middleware, HTML, and HTTP in the Internet/web arena: they support various vertical applications cost-effectively and allow new applications to be added to the platform without limit.

Four pillars of IoT
  • RFID - The internet of devices. Examples: radio wave, NFC, IC cards
  • WSN - The internet of transducers. Examples: wireless mesh, Bluetooth, networks, ZigBee
  • M2M - Machine to machine. Examples: cellular, fixed networks, WAN, GPRS
  • SCADA - The network of controllers. Examples: wired field buses, CAN bus, BACnet


