Step 1: Install Sun Java on Linux. Run the following commands:
sudo apt-add-repository ppa:flexiondotorg/java
sudo apt-get update
sudo apt-get install sun-java6-jre sun-java6-plugin
sudo update-java-alternatives -s java-6-sun
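To confirm that the Sun JVM is now the system default, we can check the reported version (the exact version string will vary by machine):

```shell
# Verify that the Sun JVM is active; the output should mention
# "Java(TM) SE Runtime Environment"
java -version
```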
Step 2: Create a dedicated Hadoop user. Run the following commands:
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser
Step 3: Install an SSH server if one is not already present, then set up passwordless SSH for hduser. Run the following commands:
$ sudo apt-get install openssh-server
$ su - hduser
$ ssh-keygen -t rsa -P ""
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
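To verify that passwordless SSH now works for hduser (the very first connection will ask to accept the host key, but it should not ask for a password):

```shell
# Should log in to the local machine without a password prompt
ssh localhost
exit
```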
Step 4: Install Hadoop. Run the following commands:
$ cd /home/hduser
$ tar xzf hadoop-0.22.2.tar.gz
$ mv hadoop-0.22.2 hadoop
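Since the daemons will run as hduser, it is usually a good idea to make that user own the installation directory. This assumes the tarball was already downloaded into /home/hduser and unpacked as above:

```shell
# Give the hduser account ownership of the Hadoop installation
sudo chown -R hduser:hadoop /home/hduser/hadoop
```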
Step 5: Edit hduser's .bashrc to set the Hadoop-related environment variables:
# Set Hadoop-related environment variables
# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
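A minimal sketch of the .bashrc additions, assuming Hadoop was unpacked to /home/hduser/hadoop and Java lives under /usr/lib/jvm/java-6-sun (adjust both paths to your system):

```shell
# Set Hadoop-related environment variables
export HADOOP_HOME=/home/hduser/hadoop

# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-6-sun

# Add Hadoop's bin directory to PATH so the hadoop command is found
export PATH=$PATH:$HADOOP_HOME/bin
```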
Step 6: Update hadoop-env.sh
Here we only need to update the JAVA_HOME variable: open the file in a text editor and point it at the root of the Java installation.
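The change amounts to replacing the commented-out default in conf/hadoop-env.sh with the actual Java location (the path below assumes the Sun Java 6 install from Step 1):

```shell
# conf/hadoop-env.sh
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
```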
Step 7: Update the configuration files residing in the conf folder. The three main files are core-site.xml, mapred-site.xml, and hdfs-site.xml.
Here we need to set the NameNode and JobTracker addresses to the hosts where we want them to run.
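A minimal single-node sketch of the three files, assuming HDFS on port 54310 and the JobTracker on port 54311 on localhost (these ports and hostnames are conventional choices, not requirements):

```xml
<!-- conf/core-site.xml: where the NameNode runs -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>

<!-- conf/mapred-site.xml: where the JobTracker runs -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>
</configuration>

<!-- conf/hdfs-site.xml: HDFS replication factor (1 for a single node) -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```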
Step 8: Start the Hadoop cluster.
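On the very first start, HDFS must be formatted once; after that, a single script starts all the daemons. Paths below are relative to the Hadoop installation directory (/home/hduser/hadoop, as assumed above):

```shell
# One-time step: format the HDFS namenode (destroys any existing HDFS data)
bin/hadoop namenode -format

# Start the NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker
bin/start-all.sh
```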
Step 9: Check that all the processes are up and running.
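The running Java processes can be listed with jps, a tool that ships with the JDK (so the JDK, not just the JRE, must be installed):

```shell
# Lists the JVM processes owned by the current user
jps
```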
That’s it. The cluster should be up and running with the following 5 processes: NameNode, SecondaryNameNode, DataNode, JobTracker, and TaskTracker.
If we see all of these processes, our installation was successful and the cluster is up and running.