In this tutorial, we will walk through the complete process of installing Apache Hive 3.1.2 on Ubuntu 20.
The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive.
Steps for Installing Hive on Ubuntu
Step 1 – Create a directory for the installation, for example a directory named hadoop
Step 2 – Move to the hadoop directory
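Steps 1 and 2 can be sketched as the commands below. The location (a directory named hadoop under the home directory) is an assumption, inferred from Step 2 and from the /home/bigdata/hadoop path used later in this tutorial; set INSTALL_DIR to match your own layout.

```shell
# Assumed location: a "hadoop" directory in the current user's home directory
INSTALL_DIR="${INSTALL_DIR:-$HOME/hadoop}"

# Step 1: create the directory (-p: no error if it already exists)
mkdir -p "$INSTALL_DIR"

# Step 2: move into it
cd "$INSTALL_DIR"
```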
Step 3 – Download Apache Hive. The download link varies by country, so get the current link from the Apache Hive website, i.e. https://hive.apache.org/downloads.html. For example:
https://downloads.apache.org/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
[Screenshot: Hive Download Page]
[Screenshot: Download Hive]
[Screenshot: Download Hive Complete]
Step 4 – Extract this tar file
[Screenshot: Unzip Hive]
[Screenshot: Hive Install]
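Steps 3 and 4 might look like the sketch below. The URL is the one quoted in Step 3 and may have moved since; the variable and function names are illustrative, not part of Hive. The commands are wrapped in a function so that pasting the snippet does not start a download by itself.

```shell
HIVE_VERSION="3.1.2"
HIVE_TARBALL="apache-hive-${HIVE_VERSION}-bin.tar.gz"
# URL as given in Step 3; take the current link from the downloads page if it has moved
HIVE_URL="https://downloads.apache.org/hive/hive-${HIVE_VERSION}/${HIVE_TARBALL}"

fetch_and_extract_hive() {
    # Step 3: download the tarball into the current directory
    wget "$HIVE_URL" || return 1
    # Step 4: unpack it; this creates apache-hive-<version>-bin/
    tar -xzf "$HIVE_TARBALL"
}
```

Run fetch_and_extract_hive from inside the hadoop directory created in Step 1.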
Step 5 – Open the .bashrc file, located in the user's home directory, in the nano editor and add the following parameters:
export PATH=$PATH:$HIVE_HOME/bin
[Screenshot: BASH Config]
[Screenshot: Hive Setting]
Press CTRL+O and enter to save changes. Then press CTRL+X to exit the editor.
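The PATH line above relies on HIVE_HOME being set. A minimal sketch of the two lines together follows; the install path is an assumption (the hadoop directory from Step 1 plus the folder name the Hive 3.1.2 tarball unpacks to), so point it at wherever you extracted Hive.

```shell
# Assumed install path (hypothetical): adjust to where Step 4 unpacked Hive
export HIVE_HOME="$HOME/hadoop/apache-hive-3.1.2-bin"
# Line from Step 5: make the Hive launch scripts available on the PATH
export PATH=$PATH:$HIVE_HOME/bin
```

After saving the file, run source ~/.bashrc (or open a new terminal) so the current shell picks up the change.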
Step 6 – Open the core-site.xml file in the nano editor. The file is located in /home/bigdata/hadoop/hadoop-3.3.1/etc/hadoop/ (Hadoop Configuration Directory).
This location will differ based on your Hadoop installation.
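Add the following configuration properties to the core-site.xml file, inside the existing configuration element and just before its closing tag. In the property names, dataflair and server are user names; replace them with the user that runs Hive on your system.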
<property>
  <name>hadoop.proxyuser.dataflair.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.dataflair.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.server.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.server.groups</name>
  <value>*</value>
</property>
</configuration>
[Screenshot: Hive Conf 1]
[Screenshot: Hive Conf 2]
[Screenshot: Hive Conf 3]
Press CTRL+O and enter to save changes. Then press CTRL+X to exit the editor.
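One way to confirm that the edited file parses and contains the new properties is to read them back with hdfs getconf. The helper below is only a sketch: the function name is made up for illustration, and it assumes the hdfs command is on your PATH. It reads the local configuration files, so running Hadoop daemons may still need a restart before they use the new values.

```shell
show_proxyuser_settings() {
    # $1: the user name used in the hadoop.proxyuser.<user>.* property names
    user="$1"
    for suffix in hosts groups; do
        key="hadoop.proxyuser.${user}.${suffix}"
        printf '%s = ' "$key"
        # Print the value Hadoop resolves for this key from its config files
        hdfs getconf -confKey "$key"
    done
}
```

For example, show_proxyuser_settings dataflair prints the hosts and groups values configured for the dataflair user.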
Step 7 – Create the following directories in HDFS (make sure Hadoop is up and running first):
$ hadoop fs -mkdir /user
$ hadoop fs -mkdir /user/hive
$ hadoop fs -mkdir /user/hive/warehouse
Step 8 – Give group write permission to the warehouse directory
$ hadoop fs -chmod g+w /user/hive/warehouse
[Screenshot: Hadoop files for Hive]
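Steps 7 and 8 can also be combined into a single re-runnable helper, sketched below. The function name is illustrative; the -p flag replaces the three separate mkdir calls and keeps the command from failing if a directory already exists.

```shell
create_hive_warehouse() {
    # Step 7: -p creates /user and /user/hive as needed and
    # does not fail if a directory already exists
    hadoop fs -mkdir -p /user/hive/warehouse || return 1
    # Step 8: give the group write permission on the warehouse directory
    hadoop fs -chmod g+w /user/hive/warehouse || return 1
    # List the directory entry itself to confirm the permission bits
    hadoop fs -ls -d /user/hive/warehouse
}
```

Call create_hive_warehouse once Hadoop is running; the final listing should show the w bit set for the group.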
Step 9 – Initialize Hive. By default, Hive uses the Derby database.
[Screenshot: Hive Initialize]
[Screenshot: Hive Initialize final]
If the initialization succeeds, the message shown above appears.
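The initialization command itself is only visible in the screenshots. A sketch using Hive's schematool with the Derby database type is given below; it assumes $HIVE_HOME/bin is on the PATH as set up in Step 5, and the wrapper function name is made up for illustration.

```shell
init_hive_metastore() {
    # Create the metastore schema in the embedded Derby database.
    # With the default settings, Derby writes a metastore_db directory
    # into the current working directory.
    schematool -dbType derby -initSchema
}
```

Because of the metastore_db location, it is safest to run init_hive_metastore from the same directory you will later start Hive from.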
Step 10 – Start Apache Hive
First, change to the Hive directory.
[Screenshot: Loading Hive]
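The exact start command is not preserved in the text. Since the next step connects with Beeline from a second terminal, the sketch below assumes that this step starts HiveServer2; it also assumes HIVE_HOME is set as in Step 5. The function name is illustrative.

```shell
start_hiveserver2() {
    # Move to the Hive installation directory (HIVE_HOME from Step 5)
    cd "$HIVE_HOME" || return 1
    # Run HiveServer2 in the foreground; keep this terminal open
    hive --service hiveserver2
}
```

HiveServer2 can take a little while to start accepting connections, so wait for it to finish starting before moving on.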
Step 11 – Open a new terminal and type the command below to launch the Beeline command shell.
[Screenshot: Beeline Hive 1]
Verify the installation by listing the available databases.
[Screenshot: Beeline Hive 2]
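As a rough sketch of this step, the helper below connects with Beeline and lists the databases in one go. The JDBC URL assumes HiveServer2 is running on the same machine on its default port 10000, and the function name is made up for illustration. The user name passed with -n should normally be one of the users configured as a proxy user in Step 6.

```shell
# Assumed connection settings: HiveServer2 on this machine, default port 10000
HIVE_JDBC_URL="${HIVE_JDBC_URL:-jdbc:hive2://localhost:10000}"

list_hive_databases() {
    # $1: user name to connect as (defaults to the current login user)
    user="${1:-$(whoami)}"
    # -u: JDBC URL, -n: user name, -e: run one statement and exit
    beeline -u "$HIVE_JDBC_URL" -n "$user" -e 'show databases;'
}
```

To get an interactive Beeline prompt instead, run the same beeline command without the -e option and type show databases; at the prompt.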