Running Hive on Windows Using Docker Desktop: Everything You Need to Know

Apache Hive is a powerful data warehouse infrastructure built on top of Apache Hadoop, providing SQL-like querying capabilities for big data processing. Running Hive on Docker simplifies the setup process and ensures a consistent environment across different systems. This guide will walk you through setting up Apache Hive on Docker Desktop on a Windows operating system.

Prerequisites

Before you start, ensure you have the following installed on your Windows system:

  • Docker Desktop (with WSL 2 backend enabled)

  • At least 8GB of RAM for smooth performance
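
To confirm Docker Desktop is installed and the engine is running, you can check the client and server versions from PowerShell before continuing:

docker version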

Step 1: Pull the Required Docker Images

Pull the 4.0.1 image from the Apache Hive repository on Docker Hub (the latest release as of April 2025):

docker pull apache/hive:4.0.1

This image comes with all necessary components, including Hive Metastore and HiveServer2.
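
To double-check that the download succeeded, you can list the Hive images available locally:

docker images apache/hive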

Step 2: Start the Hive Containers

Launch HiveServer2 with an embedded Metastore.
This is a lightweight option for a quick setup, as it uses Derby as the Metastore database.

docker run -d -p 10000:10000 -p 10002:10002 --env SERVICE_NAME=hiveserver2 --name hive4 apache/hive:4.0.1
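
To confirm that the container started and HiveServer2 is initializing, you can check its status and follow its logs (hive4 is the container name chosen above):

docker ps --filter "name=hive4"
docker logs -f hive4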

Step 3: Access Apache Hive

To interact with Hive, use the Hive command-line interface (CLI) inside the HiveServer2 container. In Docker Desktop:

  1. Click on the hive4 container.

  2. Click on the Exec tab.

  3. Execute all the required commands in that Exec tab (a command-line alternative is sketched after this list).
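
If you prefer working from a terminal instead of the Docker Desktop UI, you can open a shell inside the container from PowerShell. This is a minimal sketch, assuming the container name hive4 used above and that a shell is available in the image:

docker exec -it hive4 /bin/bash

From there, the commands in Step 4 can be run exactly as shown.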

Step 4: Running Sample Queries in Hive

Once inside the container's Exec tab, launch the Hive CLI and connect to HiveServer2:

hive

!connect jdbc:hive2://localhost:10000/default "" "" ""

SELECT CURRENT_DATE;

SELECT CURRENT_TIMESTAMP;

Once connected, create a database and a table, insert a couple of rows, and query them:

CREATE DATABASE test_db;
USE test_db;
CREATE TABLE sample_table (id INT, name STRING);
INSERT INTO sample_table VALUES (1, 'Alice'), (2, 'Bob');
SELECT * FROM sample_table;

This verifies that Hive is functioning correctly.
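
When you are done experimenting, you can clean up the sample objects (CASCADE also drops the table inside the database):

USE default;
DROP DATABASE test_db CASCADE;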

Step 5: Connecting Hive to External BI Tools

If you want to connect Hive to BI tools like Tableau or Power BI, use the Hive JDBC driver. Configure the JDBC URL as follows:

jdbc:hive2://localhost:10000/default

With this setup, external applications can query data stored in Hive.
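
Before pointing a BI tool at this URL, it can help to confirm that port 10000 is actually published on the Windows host:

docker port hive4 10000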

Conclusion

Running Apache Hive on Docker Desktop simplifies big data analytics on Windows without complex manual configurations. This method provides a containerized, portable solution for data engineers and analysts working with large datasets. By following the above steps, you can easily set up and run Hive, integrate it with external tools, and start querying your big data efficiently.

Happy querying! 🚀

By Bhavesh