Python Command Line Options List

A list of the Python 3 command-line options and what each one does

-b

Issue warnings for calling str() with a bytes or bytearray object and no encoding argument, and comparing a bytes or bytearray with a str. Option -bb issues errors instead.
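One way to see the difference between no flag, -b, and -bb is to run the same snippet in a child interpreter each time. This is a minimal sketch, assuming CPython 3.7 or later; the helper name run_snippet is made up for illustration.

```python
import subprocess
import sys

def run_snippet(flags, code):
    """Run `code` in a child interpreter started with the given flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True,
        text=True,
    )

# Comparing bytes with str is always False; the flags only change the reporting.
code = "data = b'abc'; print(data == 'abc')"

plain = run_snippet([], code)        # no -b: the comparison is silently False
warned = run_snippet(["-b"], code)   # -b: a BytesWarning is written to stderr
strict = run_snippet(["-bb"], code)  # -bb: the BytesWarning is raised as an error

print("plain :", plain.stdout.strip())
print("-b    :", warned.stderr.strip())
print("-bb   : exit code", strict.returncode)
```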

-B

Do not write .pyc byte-code files on imports (or .pyo files, in Python versions before 3.5).
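The flag is visible inside the program as sys.dont_write_bytecode. A small sketch, assuming CPython 3.7 or later, that asks a child interpreter started with -B for that value:

```python
import subprocess
import sys

# Start a child interpreter with -B and print the matching sys attribute.
result = subprocess.run(
    [sys.executable, "-B", "-c", "import sys; print(sys.dont_write_bytecode)"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```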

-d

Turn on parser debugging output (for developers of the Python core).

-E

Ignore all PYTHON* environment variables that might be set (such as PYTHONPATH and PYTHONHOME).
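As a rough illustration, the sketch below sets PYTHONPATH for a child interpreter and checks whether the directory reaches sys.path with and without -E. It assumes CPython 3.7 or later; the directory name hypothetical_libs and the helper in_path are invented for the example, and the directory does not need to exist.

```python
import os
import subprocess
import sys

# A made-up directory, used only so that PYTHONPATH has a recognisable entry.
fake_dir = os.path.abspath("hypothetical_libs")
env = dict(os.environ, PYTHONPATH=fake_dir)

# The child reports the -E flag and whether the directory is on sys.path.
code = "import sys; print(sys.flags.ignore_environment, sys.argv[1] in sys.path)"

def in_path(flags):
    """Run the check in a child interpreter started with the given flags."""
    out = subprocess.run(
        [sys.executable, *flags, "-c", code, fake_dir],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()

honoured = in_path([])     # PYTHONPATH is read
ignored = in_path(["-E"])  # PYTHONPATH is skipped

print("default:", honoured)
print("-E     :", ignored)
```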

-h

Print help message and exit.

-i

Enter interactive mode after executing a script. Hint: useful for postmortem debugging; see also pdb.pm(), described in Python’s library manuals.

-O

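Optimize generated byte code: remove assert statements and any code conditional on the value of __debug__. Since Python 3.5, optimized byte code is cached in .pyc files with an opt-1 tag in the file name rather than in .pyo files. Currently yields a minor performance improvement.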

-OO

Operates like -O, the previous option, but also removes docstrings from byte code.
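The effect of -O and -OO can be observed by running one snippet at each optimization level. This is a sketch under the assumption of CPython 3.7 or later with PYTHONOPTIMIZE unset; greet and run are illustrative names.

```python
import subprocess
import sys

# The child program reports __debug__, whether its assert ran, and its docstring.
code = '''
def greet():
    "Return a greeting."
    assert False, "only checked when asserts are enabled"
    return "hello"

try:
    greet()
    asserts = "skipped"
except AssertionError:
    asserts = "checked"
print(__debug__, asserts, greet.__doc__)
'''

def run(flags):
    """Run the snippet in a child interpreter started with the given flags."""
    out = subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()

default = run([])         # asserts run, docstring kept
optimized = run(["-O"])   # asserts removed, docstring kept
stripped = run(["-OO"])   # asserts removed, docstring removed

print("default:", default)
print("-O     :", optimized)
print("-OO    :", stripped)
```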

-q

Do not print version and copyright message on interactive startup (as of Python 3.2).

-s

Do not add the user site directory to the sys.path module search path.

-S

Do not imply “import site” on initialization.
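Both -s and -S are reflected in sys.flags, and -S also leaves the site module unimported. A short sketch, assuming CPython 3.7 or later with PYTHONNOUSERSITE unset; the helper name run is illustrative.

```python
import subprocess
import sys

# Report the two flags and whether the site module was imported at startup.
code = (
    "import sys; "
    "print(sys.flags.no_user_site, sys.flags.no_site, 'site' in sys.modules)"
)

def run(flags):
    """Run the check in a child interpreter started with the given flags."""
    out = subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()

no_user_site = run(["-s"])  # site is still imported, user site dir is skipped
no_site = run(["-S"])       # site is not imported at all

print("-s:", no_user_site)
print("-S:", no_site)
```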

-u

Force the stdout and stderr streams to be unbuffered. This option has no effect on stdin.

-v

Print a message each time a module is initialized, showing the place from which it is loaded; repeat this flag for more verbose output.

-V

Print Python version number and exit (also available as --version).

-W arg

Warnings control: arg takes the form action:message:category:module:lineno, where trailing fields may be omitted (for example, -W error or -W ignore::DeprecationWarning). See also the warnings module documentation in the Python Library Reference manual (available at http://www.python.org/doc/).
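To make the action and category fields concrete, the sketch below runs a snippet that emits a DeprecationWarning, once with the warning ignored and once with it turned into an error. It assumes CPython 3.7 or later; the warning text and the helper name run are invented for the example.

```python
import subprocess
import sys

# The child emits one DeprecationWarning and then prints a marker line.
code = (
    "import warnings; "
    "warnings.warn('old API', DeprecationWarning); "
    "print('done')"
)

def run(flags):
    """Run the snippet in a child interpreter started with the given flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True,
        text=True,
    )

ignored = run(["-W", "ignore::DeprecationWarning"])  # warning is suppressed
errored = run(["-W", "error::DeprecationWarning"])   # warning becomes an exception

print("ignore:", ignored.returncode, ignored.stdout.strip())
print("error :", errored.returncode)
```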

-x

Skip first line of source, allowing use of non-Unix forms of #!cmd.

-X option

Set implementation-specific option (as of Python 3.2); see implementation documentation for supported option values.
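In CPython, values passed with -X can be read back through the sys._xoptions dictionary, and -X dev (Python 3.7 and later) switches on development mode. The sketch below assumes CPython; hypothetical_key is a made-up option name used only to show the round trip.

```python
import subprocess
import sys

# Read back a custom -X value and check whether development mode is on.
code = (
    "import sys; "
    "print(sys._xoptions.get('hypothetical_key'), sys.flags.dev_mode)"
)

result = subprocess.run(
    [sys.executable, "-X", "hypothetical_key=42", "-X", "dev", "-c", code],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Other implementations may not provide sys._xoptions, so treat it as CPython-specific.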
