Apache Hive Installation Steps on Ubuntu

In this tutorial, we will walk through the complete process of installing Apache Hive 3.1.2 on Ubuntu 20.

The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive.

Steps for Installing Hive on Ubuntu

Step 1 – Create a directory for the installation, for example:

$mkdir /home/bigdata/apachehive

Step 2 – Move to the new directory

$cd /home/bigdata/apachehive

Step 3 – Download Apache Hive. The download link changes depending on your country, so get the current link from the Apache Hive website, i.e. https://hive.apache.org/downloads.html. For example:

https://downloads.apache.org/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz

$wget https://downloads.apache.org/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
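
Before extracting the archive, it can be worth confirming that the download is intact. The sketch below is one way to do that, assuming sha256sum (part of GNU coreutils on Ubuntu) is available. The function name verify_sha256 is hypothetical, and the expected digest has to be copied by hand from the checksum published alongside the release on the Apache site.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# $1 = file to check, $2 = expected digest (hex, upper or lower case).
# Returns 0 when the digests match, 1 otherwise.
verify_sha256() {
  file="$1"
  # Normalize the expected digest to lower case, as sha256sum prints lower case.
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
    return 0
  fi
  echo "MISMATCH: $file" >&2
  return 1
}
```

It would be called as verify_sha256 apache-hive-3.1.2-bin.tar.gz followed by the published digest, and the extraction step should only go ahead if it prints OK.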

Step 4 – Extract this tar file

$tar -xzf apache-hive-3.1.2-bin.tar.gz

Step 5 – Open the .bashrc file in the nano editor using the following command:

$nano ~/.bashrc

The .bashrc file is located in the user’s home directory. Add the following lines to it:

export HIVE_HOME="/home/bigdata/apachehive/apache-hive-3.1.2-bin"
export PATH=$PATH:$HIVE_HOME/bin

Press CTRL+O and Enter to save the changes, then press CTRL+X to exit the editor. Run source ~/.bashrc (or open a new terminal) so that the new variables take effect.
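
If you prefer to script this step instead of editing the file by hand, the sketch below appends the two lines only when they are not already there, so running it twice does no harm. The function name add_hive_env is hypothetical, and the check assumes the HIVE_HOME line starts at the beginning of a line in the rc file.

```shell
# Hypothetical helper: append the Hive variables to a shell rc file only once.
# $1 = rc file (for example ~/.bashrc), $2 = Hive installation directory.
add_hive_env() {
  rc_file="$1"
  hive_home="$2"
  # Skip if a HIVE_HOME export is already present, so re-running is harmless.
  if grep -q '^export HIVE_HOME=' "$rc_file" 2>/dev/null; then
    return 0
  fi
  {
    printf 'export HIVE_HOME="%s"\n' "$hive_home"
    # Single quotes keep $PATH and $HIVE_HOME literal in the rc file.
    printf 'export PATH=$PATH:$HIVE_HOME/bin\n'
  } >> "$rc_file"
}
```

For this tutorial's layout the call would be add_hive_env ~/.bashrc /home/bigdata/apachehive/apache-hive-3.1.2-bin, followed by source ~/.bashrc.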

Step 6 – Open the core-site.xml file in the nano editor. The file is located in /home/bigdata/hadoop/hadoop-3.3.1/etc/hadoop/ (Hadoop Configuration Directory).

This location will differ based on your Hadoop installation.

Add the following properties inside the <configuration> element of core-site.xml. The user name in hadoop.proxyuser.<user> must match the user that runs HiveServer2, which is bigdata in this tutorial.

<configuration>
  <property>
   <name>hadoop.proxyuser.bigdata.groups</name>
   <value>*</value>
  </property>
  <property>
   <name>hadoop.proxyuser.bigdata.hosts</name>
   <value>*</value>
  </property>
  <property>
   <name>hadoop.proxyuser.server.hosts</name>
   <value>*</value>
  </property>
  <property>
   <name>hadoop.proxyuser.server.groups</name>
   <value>*</value>
  </property>
</configuration>

Press CTRL+O and Enter to save the changes, then press CTRL+X to exit the editor. Restart Hadoop so that the new proxy-user settings take effect.
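
The user name embedded in the hadoop.proxyuser.<user>.groups and hadoop.proxyuser.<user>.hosts property names is what ties these settings to an account. As a rough sketch, the hypothetical function proxyuser_props below prints the pair of properties for whichever user name it is given, so the output can be pasted between the <configuration> tags.

```shell
# Hypothetical helper: print the two proxy-user properties for one user name.
# $1 = user name to embed in the property names.
proxyuser_props() {
  user="$1"
  for kind in groups hosts; do
    printf '  <property>\n'
    printf '   <name>hadoop.proxyuser.%s.%s</name>\n' "$user" "$kind"
    printf '   <value>*</value>\n'
    printf '  </property>\n'
  done
}
```

Running proxyuser_props "$(whoami)" as the account that will start HiveServer2 produces the entries for that account.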

Step 7 – Create the following directories in HDFS (make sure Hadoop is up and running). The -p option keeps the command from failing if a directory already exists.

$hadoop fs -mkdir -p /tmp
$hadoop fs -mkdir -p /user
$hadoop fs -mkdir -p /user/hive
$hadoop fs -mkdir -p /user/hive/warehouse

Step 8 – Give group write permission to these directories

$hadoop fs -chmod g+w /tmp
$hadoop fs -chmod g+w /user/hive/warehouse
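
Steps 7 and 8 can also be combined into one small function. This is only a sketch: the name prepare_hive_dirs and the HADOOP_CMD variable are hypothetical, and HADOOP_CMD exists purely so the commands can be previewed (for example with HADOOP_CMD=echo) before they are run against a live cluster.

```shell
# Command used to talk to HDFS; override it (e.g. HADOOP_CMD=echo) for a dry run.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Hypothetical helper: create the Hive directories and make them group-writable.
prepare_hive_dirs() {
  for dir in /tmp /user/hive/warehouse; do
    # -p creates missing parents (/user, /user/hive) and tolerates existing dirs.
    $HADOOP_CMD fs -mkdir -p "$dir" || return 1
    $HADOOP_CMD fs -chmod g+w "$dir" || return 1
  done
}
```

Calling prepare_hive_dirs with Hadoop running performs the same directory creation and permission changes as the individual commands above.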

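Step 9 – Initialize the metastore schema. By default, Hive uses an embedded Derby database. Run the command from the Hive installation directory:

$cd /home/bigdata/apachehive/apache-hive-3.1.2-bin
$bin/schematool -dbType derby -initSchema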

When the initialization succeeds, schematool prints a completion message.

Step 10 – Start Apache Hive

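Move to the Hive installation directory (/home/bigdata/apachehive/apache-hive-3.1.2-bin) if you are not already there, then start HiveServer2:

$bin/hiveserver2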

Step 11 – Open a new terminal, move to the Hive installation directory, and run the command below to launch the Beeline command shell.

$bin/beeline -n bigdata -u jdbc:hive2://localhost:10000
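
HiveServer2 can take some time after startup before it accepts connections on port 10000, so the Beeline command may be refused if it is run too early. The sketch below polls the port first. The function name wait_for_port is hypothetical, and it relies on bash's /dev/tcp redirection, so it assumes the shell is bash rather than plain sh.

```shell
# Hypothetical helper: wait until host:port accepts TCP connections.
# $1 = host, $2 = port, $3 = maximum number of attempts (one per second).
# Returns 0 once a connection succeeds, 1 if every attempt fails.
wait_for_port() {
  host="$1"; port="$2"; tries="$3"
  while [ "$tries" -gt 0 ]; do
    # The subshell opens a connection attempt and closes it again on exit.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

One way to use it is wait_for_port localhost 10000 120 && bin/beeline -n bigdata -u jdbc:hive2://localhost:10000, which only starts Beeline once the port is reachable.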

Check that the connection works by listing the databases with the show databases; statement at the Beeline prompt.
By Bhavesh