The Ultimate Cheat Sheet On Hadoop

The top 20 frequently asked questions in this Hadoop cheat sheet will test your Hadoop knowledge. Try to work out your own answer to each question first, then compare it with the answer given here.

Question #1 

You have written a MapReduce job that will process 500 million input records and generate 500 million key-value pairs. The data is not uniformly distributed. Your MapReduce job will create a significant amount of intermediate data that it needs to transfer between mappers and reducers which is a potential bottleneck. A custom implementation of which of the following interfaces is most likely to reduce the amount of intermediate data transferred across the network?

A. Writable
B. WritableComparable
C. InputFormat
D. OutputFormat
E. Combiner
F. Partitioner
Ans: E
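
To see why a combiner cuts down the intermediate data, here is a minimal plain-Java sketch of a word count. It does not use the Hadoop API: the class `CombinerSketch` and its `map` and `combine` methods are hypothetical names made up for this illustration. In a real job the combiner is normally a `Reducer` subclass registered with `Job.setCombinerClass`, and it only gives correct results when the operation is associative and commutative (a sum, for example), because Hadoop may run it zero, one, or several times.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative simulation only; not Hadoop code.
public class CombinerSketch {

    // Map phase: emit one (word, 1) pair per input token.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(Map.entry(token, 1));
            }
        }
        return out;
    }

    // Combine phase: sum the values per key on the mapper side,
    // so only one record per distinct key has to cross the network.
    public static Map<String, Integer> combine(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            combined.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapped = map("a b a a c b a");
        Map<String, Integer> combined = combine(mapped);
        System.out.println("records without combiner: " + mapped.size());
        System.out.println("records with combiner: " + combined.size());
        System.out.println(combined);
    }
}
```

In this toy input the mapper emits 7 records, but only 3 leave the node after combining. The more often a key repeats within one mapper's output, the bigger the saving.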

Question #2 

Where is the Hive metastore stored by default?

B. On the client machine, in the form of a flat file.
C. On the client machine, in a Derby database.
D. In the lib directory of HADOOP_HOME, and requires HADOOP_CLASSPATH to be modified.
Ans: C


Alternative technologies that suit mainframe developers

Read my part-1 post first. Programmers who work on mainframes usually have very good business knowledge, and people who also have the skills listed below are a valuable asset to any organization.
Mainframe programmers who learn other skills and apply for new jobs can earn more money.

Role of Programmer

  1. "A programmer who can do analysis, create database structures, write clean code, create testing structures and clearly communicate all that has been done is a very valuable asset."
  2. Mainframes have been a leading platform since the 1950s, and many of the world's big companies run their business on them. Many American universities now include mainframe technology in their curriculum, because a shortage of mainframe skills is possible in the future.

Alternative Technologies

  • It may become more important for IT professionals to gain experience with analytics technology, as research firm Gartner predicted a surge in demand in this area over the next two years. According to CIO contributor Hamish Barwick, the big data trend alone is expected to create 4.4 million jobs worldwide.
  • Analysts warned that only a third of those jobs are likely to be filled, due to difficulties in recruiting analytics talent.
  • An opportunity for every IT professional: "Dark data is the data being collected, but going unused despite its value and leading organizations of the future will be distinguished by the quality of their predictive algorithms."


