
Hortonworks Sandbox on Ubuntu using docker


In a previous post, I talked about installing the Hortonworks Sandbox on a Mac. That was straightforward.

In this post, I'm going to talk about installing the Hortonworks Sandbox manually on a Linux VM.

A little background about Docker base device size

The first time Docker starts with the devicemapper storage driver, it sets up a base device with a default size of 10GB. All later images and containers are snapshots of this base device.

The base size is the maximum size that a container or image can grow to, so by default Docker limits containers to 10GB. With devicemapper, new containers and images start at zero size and grow up to that maximum. Changing the base size does not change the physical disk usage of containers unless they grow larger than 10GB.

You can view this information with the following command:

sudo docker info

Since the Hortonworks image is larger than 10GB, we need to increase the storage base size. We can do this by changing the storage-driver to overlay and increasing dm.basesize.

Increase the docker base device size on Ubuntu

For this, we need to set storage-opt=dm.basesize and change storage-driver to overlay.

Assuming a systemd-based setup:

  • First, create a directory for customising the Docker service:
sudo mkdir /etc/systemd/system/docker.service.d/
  • Then edit /etc/systemd/system/docker.service.d/docker.conf so that it contains the lines below:
sudo vi /etc/systemd/system/docker.service.d/docker.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --graph="/mnt/docker-data" --storage-driver=overlay --storage-opt=dm.basesize=30G

Here, the first line with the empty ExecStart is necessary to clear the previous configuration. Also note that we are increasing dm.basesize to 30G.
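If you would rather not type the file into vi by hand, here is a minimal sketch that prints the same drop-in so it can be piped into place. The function name dropin_conf and its two optional arguments are made up for illustration; the defaults are simply the values used above.

```shell
# Hypothetical helper: print the systemd drop-in for a given data directory
# and base size. Defaults match the values used in this post.
dropin_conf() {
    data_dir=${1:-/mnt/docker-data}
    base_size=${2:-30G}
    cat <<EOF
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --graph="$data_dir" --storage-driver=overlay --storage-opt=dm.basesize=$base_size
EOF
}

# Usage (the directory must exist first):
#   dropin_conf | sudo tee /etc/systemd/system/docker.service.d/docker.conf
```

Printing to stdout and piping through sudo tee keeps the function itself free of root privileges, so you can inspect the output before writing it.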

IMPORTANT: Make sure there is enough space in /mnt/docker-data
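A quick way to check the available space is with df. This is only a rough sketch: has_free_gb is a made-up helper name, and the 30 in the usage line simply mirrors the base size chosen above; it is not an official requirement.

```shell
# Hypothetical helper: succeed if DIR has at least MIN_GB gigabytes available.
has_free_gb() {
    dir=$1
    min_gb=$2
    # df -P prints one line per filesystem; with -k, column 4 is available KB.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $(( $min_gb * 1024 * 1024 )) ]
}

# Usage:
#   has_free_gb /mnt/docker-data 30 || echo "not enough space in /mnt/docker-data"
```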

Reload systemd daemon and restart docker

After editing the Docker service definition for systemd, we need to reload the systemd daemon and restart Docker for the new settings to take effect.

sudo systemctl daemon-reload
sudo systemctl restart docker

If you check sudo docker info, Storage Driver should now be overlay and Docker Root Dir should be /mnt/docker-data.
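That check can also be scripted. The sketch below reads the docker info output on standard input and looks for both values; check_docker_info is a hypothetical name, and the expected values are just the ones configured in this post.

```shell
# Hypothetical helper: read `docker info` output on stdin and confirm that the
# storage driver and root directory match what was configured above.
check_docker_info() {
    info=$(cat)
    printf '%s\n' "$info" | grep -q 'Storage Driver: overlay$' \
        || { echo "unexpected storage driver"; return 1; }
    printf '%s\n' "$info" | grep -q 'Docker Root Dir: /mnt/docker-data$' \
        || { echo "unexpected Docker root dir"; return 1; }
    echo "docker settings look right"
}

# Usage:
#   sudo docker info | check_docker_info
```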


Now Docker is set up to work with the Hortonworks Sandbox.

Download the Hortonworks Sandbox Docker image

Download the Docker image. It's a huge one (12GB), so use a high-bandwidth network :)

wget https://downloads-hortonworks.akamaized.net/sandbox-hdp-2.6.1/HDP_2_6_1_docker_image_28_07_2017_14_42_40.tar

Download the Hortonworks startup script

wget https://raw.githubusercontent.com/hortonworks/data-tutorials/master/tutorials/hdp/sandbox-port-forwarding-guide/assets/start-sandbox-hdp.sh

Load the sandbox image

docker load -i HDP_2_6_1_docker_image_28_07_2017_14_42_40.tar


This is a heavy task. Go get your coffee now because it will take some time to load the docker image.

This step fails with a "no space left on device" error if you did not increase the base device size.

After it is loaded you should see sandbox-hdp in your docker images list.
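To confirm this without scanning the list by eye, a small sketch like the following can help. It reads the docker images output on standard input; has_sandbox_image is a made-up name for illustration.

```shell
# Hypothetical helper: read `docker images` output on stdin and report whether
# a repository containing "sandbox-hdp" is listed.
has_sandbox_image() {
    if grep -q 'sandbox-hdp'; then
        echo "sandbox-hdp image is loaded"
    else
        echo "sandbox-hdp image not found"
        return 1
    fi
}

# Usage:
#   docker images | has_sandbox_image
```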


Update /etc/hosts file

Add the following line to /etc/hosts:

127.0.0.1   localhost   sandbox.hortonworks.com
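If you run this setup more than once, it is easy to end up with duplicate entries. Here is a minimal sketch that appends the line only when the hostname is not already mapped. The function name add_sandbox_host is hypothetical, and the optional file argument exists so you can try it on a copy before touching the real /etc/hosts.

```shell
# Hypothetical helper: append the sandbox entry to a hosts file unless the
# hostname already appears in it. Defaults to /etc/hosts.
add_sandbox_host() {
    hosts_file=${1:-/etc/hosts}
    if grep -q 'sandbox\.hortonworks\.com' "$hosts_file"; then
        echo "entry already present in $hosts_file"
    else
        echo '127.0.0.1   localhost   sandbox.hortonworks.com' >> "$hosts_file"
        echo "entry added to $hosts_file"
    fi
}

# Usage (dry run on a copy):
#   cp /etc/hosts /tmp/hosts.test && add_sandbox_host /tmp/hosts.test
```

Writing to the real /etc/hosts needs root, so call the function from a root shell when you apply it for real.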

Run the Hortonworks Sandbox startup script

chmod +x ./start-sandbox-hdp.sh
./start-sandbox-hdp.sh


Again, this will take some time. Once the script completes, you should be able to access the sandbox.

Access Hortonworks Sandbox

Go to https://sandbox.hortonworks.com:8888/

The Sandbox comes with many components, such as Ambari, Ranger, Hive and Spark, already installed and configured.
