
Sandbox Deployment and Install Guide

Deploying Hortonworks Sandbox on Docker


Introduction

This tutorial walks through the general approach for installing the Hortonworks Sandbox (HDP or HDF) onto Docker on your computer.

Prerequisites

Outline

Memory Configuration

Memory For Linux

No special configuration is needed on Linux.

Memory For Windows

After installing Docker For Windows, open the application and click on the Docker icon in the notification area (system tray). Select Settings.

Docker Settings

Select the Advanced tab and adjust the dedicated memory to at least 10240 MB (10 GB) of RAM.

Configure Docker RAM

Memory For Mac

After installing Docker For Mac, open the application and click on the Docker icon in the menu bar. Select Preferences.

docker-mac-preferences

Select the Advanced tab and adjust the dedicated memory to at least 12GB of RAM.

docker-mac-configure

HDP Deployment

Deploy HDP Sandbox

Install/Deploy/Start HDP Sandbox

docker-download-hdp

In the decompressed folder, you will find the shell script docker-deploy-{version}.sh. From the command line (Linux, Mac, or Git Bash on Windows), run the script:

cd /path/to/script
sh docker-deploy-{HDPversion}.sh

Note: You only need to run the script once. It will set up and start the sandbox for you, creating the sandbox Docker container in the process if necessary.

Note: The decompressed folder has other scripts and folders. We will ignore those for now. They will be used later in advanced tutorials.
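Because the script name includes a version, it can be handy to locate it with a glob instead of typing it out. The following is a minimal sketch, not part of the Hortonworks download: find_deploy_script is a hypothetical helper, and it assumes the script follows the docker-deploy-{version}.sh naming described above.

```shell
# Hypothetical helper: print the first docker-deploy-*.sh found in a directory.
# Defaults to the current directory when no argument is given.
find_deploy_script() {
    dir="${1:-.}"
    for f in "$dir"/docker-deploy-*.sh; do
        # When nothing matches, the glob stays literal, so confirm the file exists.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    echo "no docker-deploy-*.sh found in $dir" >&2
    return 1
}

# Usage (commented out so that sourcing this file does not start a deployment):
# script="$(find_deploy_script /path/to/script)" && sh "$script"
```

If several versions sit in the same folder, the helper returns the first match in glob order, so check the printed path before running it.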

The script output will be similar to:

docker-start-hdp-output

Verify HDP Sandbox

Verify HDP sandbox was deployed successfully by issuing the command:

docker ps

You should see something like:

docker-ps-hdp-output
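The same check can be scripted rather than read by eye. This sketch assumes the container names used elsewhere in this guide (sandbox-hdp, sandbox-proxy); container_listed is a hypothetical helper that only inspects a list of names, so the Docker call itself stays in the usage line.

```shell
# Hypothetical helper: succeed if the given container name appears on stdin,
# one name per line, as an exact whole-line match.
container_listed() {
    grep -qxF -- "$1"
}

# Usage: docker ps --format '{{.Names}}' prints one running container name per line.
# docker ps --format '{{.Names}}' | container_listed sandbox-hdp && echo "sandbox-hdp is running"
# docker ps --format '{{.Names}}' | container_listed sandbox-proxy && echo "sandbox-proxy is running"
```

Matching the whole line avoids a false positive from a container whose name merely contains the sandbox name.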

Stop HDP Sandbox

When you want to stop (shut down) your HDP sandbox, run the following commands:

docker stop sandbox-hdp
docker stop sandbox-proxy

Restart HDP Sandbox

When you want to restart your HDP sandbox, run the following commands:

docker start sandbox-hdp
docker start sandbox-proxy

Remove HDP Sandbox

A container is an instance of the Sandbox image. You must stop a container and its dependencies before removing it. Issue the following commands:

docker stop sandbox-hdp
docker stop sandbox-proxy
docker rm sandbox-hdp
docker rm sandbox-proxy
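The ordering matters here: every container is stopped before any is removed. As a sketch of that ordering, the hypothetical helper below prints the commands for any list of container names so they can be reviewed before being run. It does not call Docker itself.

```shell
# Hypothetical helper: print "docker stop" for every name, then "docker rm"
# for every name, preserving the stop-before-remove order.
teardown_commands() {
    for action in stop rm; do
        for name in "$@"; do
            printf 'docker %s %s\n' "$action" "$name"
        done
    done
}

# Usage:
# teardown_commands sandbox-hdp sandbox-proxy        # review the commands
# teardown_commands sandbox-hdp sandbox-proxy | sh   # run them
```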

If you want to remove the HDP Sandbox image, issue the following command after stopping and removing the containers:

docker rmi hortonworks/sandbox-hdp:{release}

HDF Deployment

Deploy HDF Sandbox

Install/Deploy/Start HDF Sandbox

docker-download-hdf

In the decompressed folder, you will find the shell script docker-deploy-{version}.sh. From the command line (Linux, Mac, or Git Bash on Windows), run the script:

cd /path/to/script
sh docker-deploy-{HDFversion}.sh

Note: You only need to run the script once. It will set up and start the sandbox for you, creating the sandbox Docker container in the process if necessary.

Note: The decompressed folder has other scripts and folders. We will ignore those for now. They will be used later in advanced tutorials.

The script output will be similar to:

docker-start-hdf-output

Verify HDF Sandbox

Verify HDF sandbox was deployed successfully by issuing the command:

docker ps

You should see something like:

docker-ps-hdf-output

Stop HDF Sandbox

When you want to stop (shut down) your HDF sandbox, run the following commands:

docker stop sandbox-hdf
docker stop sandbox-proxy

Restart HDF Sandbox

When you want to restart your HDF sandbox, run the following commands:

docker start sandbox-hdf
docker start sandbox-proxy

Remove HDF Sandbox

A container is an instance of the Sandbox image. You must stop a container and its dependencies before removing it. Issue the following commands:

docker stop sandbox-hdf
docker stop sandbox-proxy
docker rm sandbox-hdf
docker rm sandbox-proxy

If you want to remove the HDF Sandbox image, issue the following command after stopping and removing the containers:

docker rmi hortonworks/sandbox-hdf:{release}

Enable Connected Data Architecture (CDA) – Advanced Topic

Prerequisites:

  • A computer with a minimum of 22 GB of RAM dedicated to the virtual machine
  • Have already deployed the latest HDP/HDF sandbox
  • Update Docker settings to use minimum 16 GB (16384 MB)
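The 16 GB figure converts to 16384 MB by multiplying by 1024, and the requirement can be checked from the command line. The sketch below uses two hypothetical helpers, gb_to_mb and has_min_memory. The usage line assumes that docker info exposes total memory in bytes through the MemTotal template field; the reported value may be slightly below the configured limit, so treat a near miss with caution.

```shell
# Hypothetical helper: convert whole gigabytes to megabytes (1 GB = 1024 MB).
gb_to_mb() {
    echo $(( $1 * 1024 ))
}

# Hypothetical helper: succeed if a byte count is at least the given number of MB.
has_min_memory() {
    bytes="$1"
    min_mb="$2"
    [ $(( bytes / 1024 / 1024 )) -ge "$min_mb" ]
}

# Usage (assumes the MemTotal field of "docker info" is reported in bytes):
# has_min_memory "$(docker info --format '{{.MemTotal}}')" "$(gb_to_mb 16)" \
#     || echo "Docker has less than 16 GB available; raise the memory setting"
```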

Hortonworks Connected Data Architecture (CDA) allows you to play with both data-in-motion (HDF) and data-at-rest (HDP) sandboxes simultaneously.

HDF (Data-In-Motion)

Data-In-Motion is the idea that data is ingested from all sorts of devices into a flow or stream. While the data moves through this flow, components (which NiFi calls “processors”) act on the data to modify, transform, aggregate, and route it. Data-In-Motion covers much of the preprocessing stage of building a Big Data application. For instance, in data preprocessing, Data Engineers work with the raw data to format it into a better schema, so Data Scientists can focus on analyzing and visualizing the data.

HDP (Data-At-Rest)

Data-At-Rest is the idea that data is not moving and is stored in a database or robust datastore across distributed storage such as the Hadoop Distributed File System (HDFS). Instead of sending the data to the queries, the queries are sent to the data to find meaningful insights. Data processing and analysis occur at this stage of building a Big Data application.

Update Docker Memory

Select Docker -> Preferences… -> Advanced and set the memory to at least 16 GB (16384 MB). Restart Docker.

docker-memory-settings

Run Script to Enable CDA

When you first deployed the sandbox, a suite of deployment scripts was downloaded (refer to Deploy HDP Sandbox as an example).

In the decompressed folder, you will find the shell script enable-native-cda.sh. From the command line (Linux, Mac, or Git Bash on Windows), run the script:

cd /path/to/script
sh enable-native-cda.sh

The script output will be similar to:

docker-enable-cda-output

Further Reading

Appendix A: Troubleshooting

Drive not shared

docker-drive-not-shared

  • Docker needs write access to the drive where docker-deploy-{version}.sh is executed.

  • The easiest solution is to execute the script from the Downloads folder.

  • Otherwise, go to Docker Preferences/Settings -> File Sharing/Shared Drives, add or select the path/drive where the deploy scripts are located, and try again.

No space left on device

Port Conflict

While running the deployment script, you may run into conflicting port issue(s) similar to:

docker-conflicting-port

In the picture above, we had a port conflict with 6001.

Go to the location where you saved the Docker deployment scripts – refer to Deploy HDP Sandbox as an example. You will notice a new directory sandbox was created.

  • Edit file sandbox/proxy/proxy-deploy.sh
  • Modify the conflicting host port (the first number in the pair). For example, change 6001:6001 to 16001:6001
  • Save/Exit the File
  • Run bash script: bash sandbox/proxy/proxy-deploy.sh
  • Repeat steps for continued port conflicts
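The manual edit above can be sketched as a small text filter. This is a hedged illustration, not part of the deployment scripts: remap_port is a hypothetical helper, and it assumes the mappings appear in proxy-deploy.sh as plain HOST:CONTAINER text such as 6001:6001, as in the example. It changes only the host side of the pair.

```shell
# Hypothetical helper: on stdin, rewrite PORT:PORT to NEWPORT:PORT.
# The surrounding non-digit checks keep 6001:6001 from matching inside 16001:6001.
remap_port() {
    port="$1"
    new_host_port="$2"
    sed -E "s/(^|[^0-9])${port}:${port}([^0-9]|\$)/\1${new_host_port}:${port}\2/g"
}

# Usage: write the result to a new file, review it, then replace the original.
# remap_port 6001 16001 < sandbox/proxy/proxy-deploy.sh > proxy-deploy.sh.new
```

Writing to a separate file first keeps the original script intact in case the pattern does not match the way the file is actually laid out.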

Verify sandbox was deployed successfully by issuing the command:

docker ps

You should see something like:

docker-ps-hdf-output

User Reviews

To ask a question, or find an answer, please visit the Hortonworks Community Connection.

Sandbox Deployment and Install Guide
by Patrick Hagan on August 9, 2018 at 3:13 am

The instructions were written well, except at the end where you have to put in the URL. It would have been better with a screen prints of the browser before and after initial URL is entered and the result. Right now it is not clear, which browser - outside VM or inside VM and which URL - the ones on the top screen or the ones on the bottom. My guess is the URL on the bottom on a browser outside the VM. Thank you.