Running Apache Zeppelin on Docker Desktop (Windows OS)

Apache Zeppelin is an open-source web-based notebook that enables interactive data analytics. It supports multiple languages like Scala, Python, SQL, and more, making it an excellent choice for data engineers, analysts, and scientists working with big data frameworks like Apache Spark, Flink, and Hadoop.

Setting up Zeppelin directly on Windows can be tricky because of dependency and configuration issues. Docker Desktop makes the process simpler, reproducible, and fast. In this post, we’ll walk through running Apache Zeppelin on Docker Desktop for Windows, step by step.

✅ Prerequisites

Before you begin, make sure the following are installed on your Windows machine:

Docker Desktop for Windows (ensure WSL 2 backend is enabled)
At least 8 GB RAM for smooth operation
Basic understanding of Docker commands
Windows Terminal, PowerShell, or Git Bash

🐳 Step 1: Pull the Official Apache Zeppelin Docker Image

Apache Zeppelin provides an official Docker image that comes preloaded with everything you need.

Open a terminal and run:

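docker pull apache/zeppelin:latest

To pin a specific release instead of the moving latest tag, pull it by version. At the time of writing (2025), the newest release is 0.12.0:

docker pull apache/zeppelin:0.12.0

Either command downloads the Zeppelin image to your machine. If you pull a pinned version, use the same tag in the docker run commands in the steps below.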

📦 Step 2: Run Zeppelin Container

After the image is pulled, you can run Zeppelin with a simple command:

docker run -d -p 8080:8080 --name zeppelin apache/zeppelin:latest

🔍 Explanation:

-d: Runs the container in detached mode.
-p 8080:8080: Maps port 8080 of your host machine to port 8080 inside the container.
--name zeppelin: Names your container “zeppelin” for easier management.

After a few seconds, Zeppelin should be running.
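Startup can take longer than a few seconds on a first run, so it can help to poll the port instead of guessing. The following is a minimal sketch for Git Bash (or any POSIX shell with curl available); wait_for_zeppelin is a hypothetical helper name, and the attempt count and delay are arbitrary defaults.

```shell
# Poll a URL until it answers or the attempts run out.
# wait_for_zeppelin is a hypothetical helper, not part of Docker or Zeppelin.
# $1: URL to poll, $2: maximum attempts, $3: seconds between attempts
wait_for_zeppelin() {
  url="${1:-http://localhost:8080}"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failure, -o /dev/null: discard the body
    if curl -sf -o /dev/null "$url"; then
      echo "Zeppelin is answering at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "No answer from $url after $attempts attempts" >&2
  return 1
}
```

After pasting the function into your shell, run wait_for_zeppelin with no arguments to poll http://localhost:8080 for up to about a minute.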

🌐 Step 3: Access Zeppelin Notebook in Browser

Open your browser and go to:

http://localhost:8080

You should see the Zeppelin web interface where you can create and run notebooks using SQL, Spark, Python, and more.

🧠 Step 4: Configure Apache Spark Interpreter (Optional)

If you’re planning to use Apache Spark inside Zeppelin, it can be configured via the interpreter settings.

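Open the menu in the top-right corner of the Zeppelin UI (it shows the user name, anonymous by default) and select Interpreter.
Search for spark to find the Spark interpreter, then click edit.
Modify the Spark interpreter settings (such as spark.master, called master in older releases, and spark.executor.memory) based on your local setup, then save. Zeppelin restarts the interpreter to apply the change.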

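By default, Zeppelin runs Spark in local mode, which is great for development. Spark paragraphs need a Spark distribution inside the container; if they fail to start, Zeppelin’s Spark interpreter documentation describes mounting one into the container and setting SPARK_HOME.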
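The interpreter settings shown in the UI can also be read over Zeppelin’s REST API, which is handy for checking what the Spark interpreter is currently configured with. This is a small sketch for Git Bash, assuming authentication is off and Zeppelin is on port 8080; zeppelin_get is a hypothetical helper name.

```shell
# Fetch a path under Zeppelin's REST API and print the JSON response.
# zeppelin_get is a hypothetical helper, not a Zeppelin command.
# $1: API path below /api/, $2: base URL of the Zeppelin server
zeppelin_get() {
  path="$1"
  base="${2:-http://localhost:8080}"
  curl -s "${base}/api/${path}"
}
```

For example, zeppelin_get version returns the server version, and zeppelin_get interpreter/setting returns every interpreter setting, including the Spark one, as JSON.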

💾 Step 5: Persist Notebooks (Optional)

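By default, notebooks are stored inside the container, so they are lost when the container is removed. To keep your work, mount a local folder into the container and point Zeppelin’s notebook directory at it. If the container from Step 2 still exists, remove it first with docker rm -f zeppelin, since two containers cannot share a name. Then, in PowerShell:

docker run -d -p 8080:8080 `
-v "${PWD}/zeppelin-data:/notebook" `
-e ZEPPELIN_NOTEBOOK_DIR=/notebook `
--name zeppelin `
apache/zeppelin:latest

${PWD} expands to the current directory in PowerShell. In Command Prompt, use %cd% instead and put the command on a single line. You can also replace ${PWD}/zeppelin-data with the full path to any local folder on Windows. The /notebook mount point and the ZEPPELIN_NOTEBOOK_DIR variable follow the pattern in Zeppelin’s own Docker instructions; setting the variable explicitly avoids depending on where a given image version installs Zeppelin.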
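Git Bash needs extra care here, because it rewrites arguments that look like POSIX paths before handing them to docker, which can mangle the container side of a -v mapping. The sketch below assumes Git for Windows, where setting MSYS_NO_PATHCONV=1 turns that rewriting off, and assumes Docker Desktop accepts the /c/... style path that pwd prints. start_zeppelin_persistent is a hypothetical helper name, and the /notebook mount point with ZEPPELIN_NOTEBOOK_DIR follows the pattern in Zeppelin’s Docker instructions.

```shell
# Start Zeppelin with notebooks stored in a host folder.
# start_zeppelin_persistent is a hypothetical helper, not a Docker command.
# $1: host folder for notebooks (default: zeppelin-data under the current directory)
start_zeppelin_persistent() {
  host_dir="${1:-$(pwd)/zeppelin-data}"
  # Create the folder up front so it is owned by your user.
  mkdir -p "$host_dir"
  # MSYS_NO_PATHCONV=1 stops Git Bash from rewriting /notebook into a Windows path.
  MSYS_NO_PATHCONV=1 docker run -d -p 8080:8080 \
    -v "${host_dir}:/notebook" \
    -e ZEPPELIN_NOTEBOOK_DIR=/notebook \
    --name zeppelin \
    apache/zeppelin:0.12.0
}
```

Run start_zeppelin_persistent from the folder where you want zeppelin-data to live, or pass a folder explicitly. If a container named zeppelin already exists, remove it first with docker rm -f zeppelin.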

🔄 Step 6: Managing the Container

Here are some helpful Docker commands to manage your Zeppelin instance:

Stop the container:

docker stop zeppelin

Start the container again:

docker start zeppelin

View logs:

docker logs -f zeppelin

Remove the container:

docker rm -f zeppelin
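Before stopping, starting, or removing the container, it is often useful to check what state it is in. A minimal sketch using docker inspect with a Go template follows; zeppelin_state is a hypothetical helper name.

```shell
# Print the lifecycle state of a container (for example running or exited).
# zeppelin_state is a hypothetical helper, not a Docker command.
# $1: container name (default: zeppelin)
zeppelin_state() {
  name="${1:-zeppelin}"
  # -f selects a single field from the inspect output via a Go template.
  docker inspect -f '{{.State.Status}}' "$name"
}
```

If no container with that name exists, docker inspect reports an error and returns a non-zero exit code.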

🔐 Bonus: Secure Your Zeppelin Notebook

When Zeppelin is only reachable from your own machine, security is less of a concern. If you expose it beyond your local machine, take these steps:

Enable authentication with Apache Shiro by creating conf/shiro.ini from the bundled shiro.ini.template
Configure HTTPS using a reverse proxy like Nginx or Traefik
Limit network exposure (e.g., bind to 127.0.0.1)
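Binding to 127.0.0.1 matters because a plain -p 8080:8080 publishes the port on all host interfaces. Prefixing the mapping with an IP address restricts it to that interface. The sketch below shows the idea; start_zeppelin_local is a hypothetical helper name, and the pinned 0.12.0 tag is only a default.

```shell
# Start Zeppelin so that it is reachable only from the local machine.
# start_zeppelin_local is a hypothetical helper, not a Docker command.
# $1: host port (default: 8080), $2: image (default: apache/zeppelin:0.12.0)
start_zeppelin_local() {
  host_port="${1:-8080}"
  image="${2:-apache/zeppelin:0.12.0}"
  # 127.0.0.1:<host port>:<container port> publishes on the loopback interface only.
  docker run -d -p "127.0.0.1:${host_port}:8080" --name zeppelin "$image"
}
```

With this mapping, http://localhost:8080 still works in your own browser, but other machines on the network cannot connect to the port.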

🧪 Test Drive Zeppelin Features

After setup, explore these cool features:

Connect to Apache Spark and run distributed analytics
Query CSV/JSON/Parquet files
Use the Helium plugin system to visualize data using charts and maps
Share your notebooks with team members as HTML or JSON

🎯 Conclusion

Running Apache Zeppelin on Docker Desktop is an efficient way to explore big data tools and workflows without complex setups. Whether you’re experimenting with Spark or visualizing SQL data, Zeppelin provides an intuitive interface that works seamlessly inside containers.

If you’re a data engineer, scientist, or analyst looking to build and share notebooks, Dockerizing Zeppelin is the fastest route to productivity on Windows.