
9 Recommended Steps to Create a Hadoop Cluster

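Follow these 9 steps to set up a Hadoop cluster on your PC or laptop, either directly on a local Linux installation or in a virtual machine. The commands below use apt-get, so they apply as written to Debian/Ubuntu-style systems; on CentOS, substitute the equivalent yum commands.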

Step 1:  Install Sun Java on Linux. Run the following commands:
$ sudo apt-add-repository ppa:flexiondotorg/java
$ sudo apt-get update
$ sudo apt-get install sun-java6-jre sun-java6-plugin
$ sudo update-java-alternatives -s java-6-sun
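
Before moving on, it can help to confirm that a java binary is really on the PATH and to see where it lives, since the same location is needed for JAVA_HOME in the later steps. The helper below is only a sketch: guess_java_home is a hypothetical name, and it assumes GNU readlink, which is standard on Linux.

```shell
# Print the directory two levels above the real `java` binary found on PATH.
# guess_java_home is a hypothetical helper name, not part of Java or Hadoop.
guess_java_home() {
  java_bin="$(command -v java)" || { echo "java not found on PATH" >&2; return 1; }
  real="$(readlink -f "$java_bin")"   # follow the alternatives symlinks
  dirname "$(dirname "$real")"        # strip the trailing /bin/java
}

# Usage:
#   java -version
#   guess_java_home
```

On a Sun JRE layout the printed path may end in /jre; in that case the parent directory is the one to compare with the JAVA_HOME value used later in this post.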

Step 2:  Create a Hadoop user. Run the following commands:
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser

Step 3:  Install the SSH server if it is not already present, then create a key with an empty passphrase for hduser. The commands are:
$ sudo apt-get install openssh-server
$ su - hduser
$ ssh-keygen -t rsa -P ""
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
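
Hadoop's start scripts log in over SSH to every node, including localhost, so the key created above has to work without any prompt. sshd commonly ignores authorized_keys when the file or its directory is writable by other users. A minimal sketch of tightening the permissions follows; fix_ssh_perms is a hypothetical helper name.

```shell
# Restrict ~/.ssh and authorized_keys to the owner so sshd will accept the key.
# fix_ssh_perms is a hypothetical helper; the directory defaults to $HOME/.ssh.
fix_ssh_perms() {
  ssh_dir="${1:-$HOME/.ssh}"
  chmod 700 "$ssh_dir" || return 1
  if [ -f "$ssh_dir/authorized_keys" ]; then
    chmod 600 "$ssh_dir/authorized_keys"
  fi
}

# Usage as hduser:
#   fix_ssh_perms
#   ssh localhost        # first login: confirm the host key, then exit
#   ssh -o BatchMode=yes localhost true && echo "passwordless SSH works"
```

BatchMode=yes makes ssh fail instead of asking for a password, which is a quick way to see whether the key is actually being used.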

Step 4:  Install Hadoop. Download the release into the home directory of hduser and unpack it there:
$ cd /home/hduser
$ wget http://www.eng.lsu.edu/mirrors/apache/hadoop/core/hadoop-0.22.0/hadoop-0.22.0.tar.gz
$ tar xzf hadoop-0.22.0.tar.gz
$ mv hadoop-0.22.0 hadoop
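
A quick sanity check after unpacking can catch a failed download or a wrongly named directory early. The sketch below looks for conf/hadoop-env.sh, which Step 6 edits, and for bin/hadoop, which is an assumption about the release layout. check_hadoop_layout is a hypothetical helper name.

```shell
# Report files that are expected under the Hadoop directory but are not there.
# check_hadoop_layout is a hypothetical helper; bin/hadoop is an assumed path.
check_hadoop_layout() {
  hadoop_dir="${1:-/home/hduser/hadoop}"
  status=0
  for f in bin/hadoop conf/hadoop-env.sh; do
    if [ ! -e "$hadoop_dir/$f" ]; then
      echo "not found: $hadoop_dir/$f"
      status=1
    fi
  done
  return $status
}

# Usage:
#   check_hadoop_layout /home/hduser/hadoop && echo "layout looks fine"
```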

Step 5:  Edit .bashrc. Add the following lines to $HOME/.bashrc of hduser:
# Set Hadoop-related environment variables
export HADOOP_HOME=/home/hduser/hadoop
# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-6-sun
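
Changes to .bashrc only reach new shells, or the current one after running source ~/.bashrc. A small check like the following sketch shows whether each variable is set and points to a directory that exists. check_dir_var is a hypothetical helper name.

```shell
# Check that a variable has a value and that the value is an existing directory.
# check_dir_var is a hypothetical helper: pass the name and the current value.
check_dir_var() {
  name="$1"
  value="$2"
  if [ -z "$value" ]; then
    echo "$name is not set"
    return 1
  fi
  if [ ! -d "$value" ]; then
    echo "$name points to a missing directory: $value"
    return 1
  fi
  echo "$name OK: $value"
}

# Usage after `source ~/.bashrc`:
#   check_dir_var HADOOP_HOME "${HADOOP_HOME:-}"
#   check_dir_var JAVA_HOME "${JAVA_HOME:-}"
```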

Step 6:  Update hadoop-env.sh
Here we only need to update the JAVA_HOME variable. Open the file in a text editor:

$ vi /home/hduser/hadoop/conf/hadoop-env.sh
And update the following line:
export JAVA_HOME=/usr/lib/jvm/java-6-sun
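
If you prefer not to edit the file by hand, the same change can be scripted. The sketch below assumes GNU sed and assumes the file either has an export JAVA_HOME line (possibly commented out) or none at all, in which case the line is appended. set_java_home is a hypothetical helper name, and the path must not contain the | or & characters.

```shell
# Set JAVA_HOME in a hadoop-env.sh style file, uncommenting the line if needed.
# set_java_home is a hypothetical helper; it relies on GNU sed (-i and -E).
set_java_home() {
  env_file="$1"
  java_home="$2"
  pattern='^[[:space:]]*#?[[:space:]]*export JAVA_HOME='
  if grep -Eq "$pattern" "$env_file"; then
    sed -i -E "s|$pattern.*|export JAVA_HOME=$java_home|" "$env_file"
  else
    printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
  fi
}

# Usage:
#   set_java_home /home/hduser/hadoop/conf/hadoop-env.sh /usr/lib/jvm/java-6-sun
```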

Step 7:  Update the files residing in the conf folder.  The three main files are:
core-site.xml
mapred-site.xml
hdfs-site.xml
Here we need to set the NameNode and JobTracker addresses to the hosts where we want these daemons to run.
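
As an illustration of what these files might contain on a single machine, the sketch below writes minimal versions of all three. The property names fs.default.name, mapred.job.tracker and dfs.replication are the ones used by Hadoop releases of this era (later releases rename some of them). The ports 54310 and 54311 and the replication factor of 1 are assumptions for a one-node setup, not values from this post. write_site_conf is a hypothetical helper name.

```shell
# Write minimal single-node versions of the three site files.
# write_site_conf is a hypothetical helper; it overwrites existing files.
write_site_conf() {
  conf_dir="$1"     # e.g. /home/hduser/hadoop/conf
  namenode="$2"     # host:port of the NameNode (assumed value)
  jobtracker="$3"   # host:port of the JobTracker (assumed value)

  cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$namenode</value>
  </property>
</configuration>
EOF

  cat > "$conf_dir/mapred-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>$jobtracker</value>
  </property>
</configuration>
EOF

  cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}

# Usage (back up the originals first):
#   write_site_conf /home/hduser/hadoop/conf localhost:54310 localhost:54311
```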

Step 8:  Start the Hadoop cluster using the command:
$ $HADOOP_HOME/bin/start-all.sh

Step 9:  Check that all the processes are up and running using the command:

$ jps
……………………………………………………………………………

That’s it. The cluster should be up and running with the following 5 processes:

  • NameNode
  • SecondaryNameNode
  • DataNode
  • JobTracker
  • TaskTracker
If we see all these processes, it means our installation is successful and the cluster is up and running.
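
Reading the jps output by eye works, but the check is easy to script. The sketch below reads jps output on standard input and lists any of the five daemons that are missing. It assumes the usual two-column jps format (process id, then class name), and check_daemons is a hypothetical helper name.

```shell
# Read `jps` output on stdin and report which of the five daemons are missing.
# check_daemons is a hypothetical helper; it returns 1 if anything is missing.
check_daemons() {
  names="$(awk '{print $2}')"   # second column holds the process name
  missing=0
  for d in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
    # -x matches whole lines, so SecondaryNameNode does not count as NameNode
    if ! printf '%s\n' "$names" | grep -qx "$d"; then
      echo "missing: $d"
      missing=1
    fi
  done
  return $missing
}

# Usage:
#   jps | check_daemons && echo "all five daemons are running"
```

If NameNode is the one missing on the very first start, a common cause is that HDFS has not been formatted yet; releases of this era do that with hadoop namenode -format, run once before starting the cluster.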
