Featured post

The Ultimate Cheat Sheet On Hadoop

Top 20 frequently asked questions to test your Hadoop knowledge given in the below Hadoop cheat sheet. Try finding your own answers and match the answers given here.

Question #1 

You have written a MapReduce job that will process 500 million input records and generate 500 million key-value pairs. The data is not uniformly distributed. Your MapReduce job will create a significant amount of intermediate data that it needs to transfer between mappers and reducers which is a potential bottleneck. A custom implementation of which of the following interfaces is most likely to reduce the amount of intermediate data transferred across the network?

A. Writable
B. WritableComparable
C. InputFormat
D. OutputFormat
E. Combiner
F. Partitioner
Ans: e

Question #2 

Where is Hive metastore stored by default ?

B. In client machine in the form of a flat file.
C. In client machine in a derby database
D. In lib directory of HADOOP_HOME, and requires HADOOP_CLASSPATH to be modified.
Ans: c


Cloud Security the real challenges you need to look in

cloud challenges
You may curious that cloud computing will solve all your problems. In reality, there is security problem. That is- data security problem. I am just explaining my thoughts on data security in cloud. Really many security professional need in future.

Data is crucial to any organization. When data leaks into public, organization may collapse and loose its business.

Try to understand where we lack on data security.

Challenges in data security


Existing Methods

The challenge in addressing this threats of data loss and data leakage is that "the measures you put in place to mitigate one can exacerbate the other," according to the report.

You could encrypt your data to reduce the impact of a breach, but if you lose your encryption key, you'll lose your data. However, if you opt to keep offline backups of your data to reduce data loss, you increase your exposure to data breaches.

Cloud security problems

  1. Data loss: the prospect of seeing your valuable data disappear into the ether without a trace. A malicious hacker might delete a target's data out of spite -- but then, you could lose your data to a careless cloud service provider or a disaster, such as a fire, flood, or earthquake. Compounding the challenge, encrypting your data to ward off theft can backfire if you lose your encryption key.
  2. Legal issues: Data loss isn't only problematic in terms of impacting relationships with customers, the report notes. You could also get into hot water with the feds if you're legally required to store particular data to remain in compliance with certain laws, such as HIPAA.
  3. Hijacking: The third-greatest cloud computing security risk is account or service traffic hijacking. Cloud computing adds a new threat to this landscape, according to CSA. If an attacker gains access to your credentials, he or she can eavesdrop on your activities and transactions, manipulate data, return falsified information, and redirect your clients to illegitimate sites. "Your account or services instances may become a new base for the attacker. From here, they may leverage the power of your reputation to launch subsequent attacks," according to the report. As an example, CSA pointed to an XSS attack on Amazon in 2010 that let attackers hijack credentials to the site.

Also read


Popular posts from this blog

Hadoop fs (File System) Commands List

Hyperledger Fabric: 20 Real Interview Questions

AWS Vs Azure Load Balancers Top Insights

4 Important Skills You Need for Data Scientists