Featured post

The Ultimate Cheat Sheet On Hadoop

The cheat sheet below collects the top 20 frequently asked questions for testing your Hadoop knowledge. Try to work out your own answer to each question first, then compare it with the answer given here.




Question #1 

You have written a MapReduce job that will process 500 million input records and generate 500 million key-value pairs. The data is not uniformly distributed. Your MapReduce job will create a significant amount of intermediate data that it needs to transfer between mappers and reducers which is a potential bottleneck. A custom implementation of which of the following interfaces is most likely to reduce the amount of intermediate data transferred across the network?



A. Writable
B. WritableComparable
C. InputFormat
D. OutputFormat
E. Combiner
F. Partitioner
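Ans: E. A combiner aggregates the map output locally on each mapper node, so fewer key-value pairs have to cross the network to the reducers.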
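To see why a combiner cuts the intermediate data, here is a minimal Python simulation of a word-count style job. It is only a sketch and not Hadoop code: the function names (map_phase, combine, shuffle_and_reduce) and the two sample splits are made up for illustration. It counts how many key-value pairs would leave the mappers with and without local aggregation.

```python
from collections import Counter, defaultdict

def map_phase(split):
    """Emit one (key, 1) pair per input record, as a word-count mapper would."""
    return [(record, 1) for record in split]

def combine(pairs):
    """Sum the values per key on the mapper side, before anything is shuffled."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

def shuffle_and_reduce(mapper_outputs):
    """Group the pairs from all mappers by key and sum them, as a reducer would."""
    grouped = defaultdict(int)
    for pairs in mapper_outputs:
        for key, value in pairs:
            grouped[key] += value
    return dict(grouped)

# Two hypothetical input splits with a skewed key distribution.
splits = [["a", "a", "a", "b"], ["a", "a", "c", "a"]]

without_combiner = [map_phase(s) for s in splits]
with_combiner = [combine(map_phase(s)) for s in splits]

# Number of key-value pairs that would be sent from mappers to reducers.
records_without = sum(len(pairs) for pairs in without_combiner)
records_with = sum(len(pairs) for pairs in with_combiner)

print(records_without, records_with)
print(shuffle_and_reduce(with_combiner))
```

Both runs give the reducer the same totals, but the combined run ships fewer pairs. This only works when the reduce operation is commutative and associative (a sum or a maximum, for example), because the framework may apply the combiner zero, one, or several times.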




Question #2 

Where is the Hive metastore stored by default?


A. In HDFS
B. On the client machine, in the form of a flat file
C. On the client machine, in a Derby database
D. In lib directory of HADOOP_HOME, and requires HADOOP_CLASSPATH to be modified.
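Ans: C. By default, Hive keeps its metastore in an embedded Derby database on the machine where Hive is started.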




Question…

Cloud Storage: The Real Points You Need to Read Now

There are hundreds of different cloud storage systems, and some are very specific in what they do.

Some are niche-oriented and store just email or digital pictures, while others store any type of data. Some providers are small, while others are so large that their equipment fills an entire warehouse.

In this post, you will learn about:
  1. Storage in the cloud
  2. The inside details of cloud storage
  3. What is new in cloud storage

Storage in the Cloud

One of Google’s data centers in Oregon is the size of a football field and houses thousands of servers. 

The inside details of Cloud Storage

  • At the most rudimentary level, a cloud storage system just needs one data server connected to the Internet. 
  • A subscriber copies files over the Internet to the server, which then records the data. 
  • When a client wants to retrieve the data, he or she accesses the data server with a web-based interface, and the server then either sends the files back to the client or allows the client to access and manipulate the data itself.
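The three steps above can be sketched as a tiny in-memory model. This is only an illustration under simple assumptions: the DataServer class and its upload, download, and append methods are hypothetical names, and a real service would add networking, authentication, and persistent disks.

```python
class DataServer:
    """Hypothetical single data server that keeps each file's bytes by name."""

    def __init__(self):
        self._files = {}

    def upload(self, name, data):
        """Record a copy of the subscriber's file on the server."""
        self._files[name] = bytes(data)

    def download(self, name):
        """Send the stored file back to the client."""
        return self._files[name]

    def append(self, name, more):
        """Let the client manipulate the data in place, without downloading it."""
        self._files[name] = self._files[name] + bytes(more)


server = DataServer()
server.upload("notes.txt", b"hello")
server.append("notes.txt", b" world")
print(server.download("notes.txt"))
```

The download method covers the case where the server sends the file back, while append stands in for the case where the client works on the data where it is stored.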

What is new in Cloud Storage

In practice, cloud storage systems use dozens or hundreds of data servers. Because servers occasionally need maintenance or repair, the same data has to be stored on multiple machines. This is called redundancy.

Without that redundancy, cloud storage systems couldn’t assure clients that they could access their information at any given time.

Most systems store the same data on servers using different power supplies. That way, clients can still access their data even if a power supply fails.
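Here is a rough Python sketch of that idea: each file is copied to servers on different power supplies, so a read still succeeds when one supply goes down. The Server and ReplicatedStore classes, the server names, and the placement rule are assumptions made up for this example, not how any particular provider works.

```python
from dataclasses import dataclass, field

@dataclass
class Server:
    name: str
    power_supply: str
    files: dict = field(default_factory=dict)
    online: bool = True

class ReplicatedStore:
    """Hypothetical store that places each copy on a different power supply."""

    def __init__(self, servers, copies=2):
        self.servers = servers
        self.copies = copies

    def put(self, key, data):
        chosen, used_supplies = [], set()
        for server in self.servers:
            if len(chosen) == self.copies:
                break
            # Skip servers that share a power supply with a copy already placed.
            if server.power_supply not in used_supplies:
                chosen.append(server)
                used_supplies.add(server.power_supply)
        if len(chosen) < self.copies:
            raise ValueError("not enough independent power supplies")
        for server in chosen:
            server.files[key] = data

    def get(self, key):
        # Read from the first server that is online and holds a copy.
        for server in self.servers:
            if server.online and key in server.files:
                return server.files[key]
        raise KeyError(key)

    def fail_power_supply(self, power_supply):
        """Simulate a power failure: every server on that supply goes offline."""
        for server in self.servers:
            if server.power_supply == power_supply:
                server.online = False


store = ReplicatedStore([
    Server("s1", "supply-A"),
    Server("s2", "supply-A"),
    Server("s3", "supply-B"),
])
store.put("photo.jpg", b"pixels")
store.fail_power_supply("supply-A")
print(store.get("photo.jpg"))
```

In this run the copies land on s1 and s3, so the file is still readable from s3 after supply-A fails. If both supplies failed, get would raise a KeyError, which is the situation redundancy is meant to make unlikely.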

Many clients use cloud storage not because they’ve run out of room locally, but for safety. If something happens to their building, then they haven’t lost all their data. 
