Apache Hadoop 3.3.1 Installation Steps on Ubuntu (Part 1)
Bhavesh
In this tutorial, we will walk through the complete process of installing Hadoop 3.3.1 on Ubuntu 20.
Supported Java Versions
Apache Hadoop 3.3 and later supports Java 8 and Java 11 (runtime only).
Please compile Hadoop with Java 8. Compiling Hadoop with Java 11 is not supported; it is tracked in HADOOP-16795 (Java 11 compile support), which is listed as open.
Apache Hadoop from 3.0.x to 3.2.x now supports only Java 8
Apache Hadoop from 2.7.x to 2.10.x supports both Java 7 and 8
Required software for Linux includes:
Java must be installed. Recommended Java versions are described at HadoopJavaVersions.
ssh must be installed and sshd must be running if the optional start and stop scripts, which manage remote Hadoop daemons, are to be used.
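Before going further, it can help to see which of these prerequisites are already present. The following is a minimal sketch, not part of Hadoop itself: check_cmd is a hypothetical helper name, and the sshd check assumes the pgrep utility is available (it ships with Ubuntu by default).

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the two prerequisites named above; '|| true' keeps going if one is missing.
check_cmd java || true
check_cmd ssh || true

# sshd must be running, not just installed; pgrep exits non-zero when nothing matches.
if pgrep -x sshd >/dev/null 2>&1; then
    echo "sshd is running"
else
    echo "sshd is not running"
fi
```

Anything reported as missing or not running is covered by the installation steps that follow.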
Steps for Installing Java 8 on Ubuntu
Step 1 – Install Java 8 on Ubuntu
OpenJDK 8 is available in the default apt repositories, so you can install Java 8 on an Ubuntu system using the following commands.
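The commands would typically look like the sketch below. It assumes the package is named openjdk-8-jdk in the default Ubuntu repositories; install_java8 is a hypothetical wrapper name, used here only so the steps can be reviewed before anything is installed.

```shell
# Hypothetical wrapper: nothing is installed until the function is called.
install_java8() {
    # Refresh the package index so apt sees the current package list.
    sudo apt-get update || return 1
    # Install the OpenJDK 8 JDK (assumed package name: openjdk-8-jdk).
    sudo apt-get install -y openjdk-8-jdk || return 1
    # Print the installed Java version to confirm the result.
    java -version
}
```

Run install_java8 to perform the installation, or type the three commands one by one. The final command should report a 1.8.0 build if OpenJDK 8 is the active Java.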
Secure Shell (SSH) is a cryptographic network protocol for operating network services securely over an unsecured network. Typical applications include remote command-line login and remote command execution, but any network service can be secured with SSH.
Install ssh on your system using the following command:
sudo apt-get install ssh
Type the password for the sudo user and then press Enter.
Install pdsh on your system using the following command:
sudo apt-get install pdsh
Type ‘Y’ and then press Enter to continue with the installation process.
Open the .bashrc file in your home directory with the nano editor using the following command:
nano ~/.bashrc
Now set the PDSH_RCMD_TYPE environment variable to ssh.
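The line to add at the end of .bashrc would look like the sketch below. It assumes a Bash shell; the value ssh tells pdsh to run remote commands over ssh, since on many systems it would otherwise fall back to rsh.

```shell
# Tell pdsh to use ssh as its remote command type.
export PDSH_RCMD_TYPE=ssh
```

After saving the file, run source ~/.bashrc (or open a new terminal) so the variable takes effect, and check it with echo $PDSH_RCMD_TYPE.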
Steps for Installing Hadoop on Ubuntu
Create a directory for Hadoop, for example:
mkdir /home/bigdata/hadoop
Move to the hadoop directory:
cd /home/bigdata/hadoop
Download Hadoop. The mirror link varies by country, so please get the download link from the Hadoop website: https://hadoop.apache.org/releases.html