Sandbox Deployment and Install Guide

Deploying Hortonworks Sandbox on VirtualBox

Introduction

This tutorial walks through the general approach for installing the Hortonworks Sandbox (HDP or HDF) onto VirtualBox on your computer.

Prerequisites

Outline

Import the Hortonworks Sandbox

Start by importing the Hortonworks Sandbox into VirtualBox:

  • Open VirtualBox and navigate to File -> Import Appliance. Select the sandbox image you downloaded and click Open.

You should end up with a screen like this:

Appliance Settings

Note: Make sure to allocate at least 10 GB (10240 MB) of RAM for the sandbox.

Click Import and wait for VirtualBox to import the sandbox.
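The same import can also be done from a terminal with VirtualBox's `VBoxManage` command-line tool. The following is a minimal sketch, not part of the original tutorial: the image path is a hypothetical placeholder, and the helper `sandbox_import_cmd` is an illustrative name. It only builds and prints the command so you can review it before running it.

```shell
# Hypothetical path to the downloaded sandbox image; adjust to your download.
SANDBOX_OVA="${SANDBOX_OVA:-$HOME/Downloads/sandbox.ova}"

# Minimum RAM from the note above: 10 GB = 10240 MB.
SANDBOX_RAM_MB=10240

# Build the import command as text (illustrative helper, nothing is executed).
# --vsys 0 selects the first (and here, only) virtual system in the appliance.
sandbox_import_cmd() {
  printf 'VBoxManage import "%s" --vsys 0 --memory %s' "$1" "$2"
}

# Print the command for review.
sandbox_import_cmd "$SANDBOX_OVA" "$SANDBOX_RAM_MB"
echo
```

Copy the printed line into your terminal to perform the import. Adding `--dry-run` to the printed command should show what VirtualBox would import without changing anything.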

Start the Hortonworks Sandbox

Once the import finishes, start the sandbox by selecting it in VirtualBox and clicking “Start”.

virtualbox_start_windows

A console window opens and displays the boot process, which takes a few minutes. When you see the following screen, you may begin using the sandbox.

vbox-splash-screen
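If you prefer the command line, the sandbox can also be located and started with `VBoxManage`. The sketch below is an assumption-laden illustration: it assumes the imported VM's name contains the word "sandbox", and `find_sandbox_vms` is a hypothetical helper that filters the output of `VBoxManage list vms`, which prints one `"name" {uuid}` line per VM.

```shell
# Read `VBoxManage list vms` output on stdin and print the names of VMs
# whose line mentions "sandbox" (case-insensitive). Illustrative helper.
find_sandbox_vms() {
  grep -i 'sandbox' | sed -n 's/^"\(.*\)" {.*}$/\1/p'
}

# Usage (run these yourself; they are left commented out on purpose):
#   VBoxManage list vms | find_sandbox_vms
#   VBoxManage startvm "<name printed above>" --type gui
```

Using `--type headless` instead of `--type gui` starts the VM without a console window, in which case you will not see the splash screen shown above.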

Welcome to the Hortonworks Sandbox!

Enable Connected Data Architecture (CDA) – Advanced Topic

Prerequisites:

  • A computer with at least 22 GB of RAM that can be dedicated to the virtual machine
  • The latest HDP or HDF sandbox, already deployed
  • Virtual machine memory set to at least 22 GB (22528 MB)

Hortonworks Connected Data Architecture (CDA) lets you run the data-in-motion (HDF) and data-at-rest (HDP) sandboxes simultaneously.

HDF (Data-In-Motion)

Data-In-Motion refers to data that is being ingested from many different devices into a flow or stream. As the data moves through this flow, components (which NiFi calls “processors”) act on it to modify, transform, aggregate, and route it. Data-In-Motion covers much of the preprocessing stage of building a Big Data application. For instance, preprocessing is where Data Engineers shape the raw data into a better schema so that Data Scientists can focus on analyzing and visualizing it.

HDP (Data-At-Rest)

Data-At-Rest refers to data that is not moving: it is stored in a database or a robust datastore on distributed storage such as the Hadoop Distributed File System (HDFS). Instead of sending the data to the queries, the queries are sent to the data to find meaningful insights. This is the stage of building a Big Data application where data processing and analysis occur.

Update Virtual Machine Memory

With the sandbox powered off, select it in VirtualBox Manager and open Settings:

VirtualBox Manager -> Settings

vbox-manager-settings

Then set Base Memory to at least 22528 MB and confirm:

System -> Motherboard -> Base Memory -> OK

vbox-system-settings
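The memory change can be sketched on the command line as well. This is an illustration under stated assumptions: the helper names (`gb_to_mb`, `meets_cda_minimum`, `set_vm_memory_cmd`) and the VM name are hypothetical, and the VM must be powered off before `VBoxManage modifyvm` can change its memory. The arithmetic shows where the 22528 MB figure comes from.

```shell
# Convert whole gigabytes to megabytes (1 GB = 1024 MB).
gb_to_mb() {
  echo $(( $1 * 1024 ))
}

# CDA minimum from the prerequisites: 22 GB.
CDA_MIN_MB=$(gb_to_mb 22)

# Succeed only if the given memory size (in MB) satisfies the CDA minimum.
meets_cda_minimum() {
  [ "$1" -ge "$CDA_MIN_MB" ]
}

# Build (but do not run) the command that sets a VM's base memory.
# The VM name is whatever name the sandbox has in your VirtualBox Manager.
set_vm_memory_cmd() {
  printf 'VBoxManage modifyvm "%s" --memory %s' "$1" "$2"
}

echo "CDA minimum: ${CDA_MIN_MB} MB"
```

For example, `set_vm_memory_cmd "My Sandbox VM" "$CDA_MIN_MB"` prints the command to run, where "My Sandbox VM" stands in for the actual VM name.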

Run Script to Enable CDA

The sandbox comes prepackaged with the script needed to enable CDA. Assuming you have already deployed the HDP sandbox, SSH into the sandbox VM as root, using the password hadoop:

  • Issue command: ssh root@sandbox-hdp.hortonworks.com -p 2200

Note: If you originally deployed the HDF sandbox, replace sandbox-hdp with sandbox-hdf in the ssh command above.

  • Run bash script:
cd /sandbox/deploy-scripts/
sh enable-vm-cda.sh

The script output will be similar to:

enable-vm-cda-output
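The SSH login and the script run above can be combined into a single command. The sketch below is one possible way to do that, not the tutorial's own method: `sandbox_host` and `enable_cda_cmd` are hypothetical helpers that pick the hostname for the sandbox flavor you deployed and print the resulting command for review. The hostnames, port, script path, and script name are taken from the steps above.

```shell
# SSH port used by the sandbox, as given in the ssh command above.
SANDBOX_SSH_PORT=2200

# Map the deployed sandbox flavor (hdp or hdf) to its hostname.
sandbox_host() {
  case "$1" in
    hdp|hdf) echo "sandbox-$1.hortonworks.com" ;;
    *) echo "usage: sandbox_host hdp|hdf" >&2; return 1 ;;
  esac
}

# Build (but do not run) a one-shot command that logs in as root and
# runs the CDA script from its directory.
enable_cda_cmd() {
  host=$(sandbox_host "$1") || return 1
  printf "ssh -p %s root@%s 'cd /sandbox/deploy-scripts/ && sh enable-vm-cda.sh'" \
    "$SANDBOX_SSH_PORT" "$host"
}

# Print the command for an HDP deployment.
enable_cda_cmd hdp
echo
```

Run the printed command and enter the root password when prompted. Pass `hdf` instead of `hdp` if you originally deployed the HDF sandbox.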

Further Reading

User Reviews

To ask a question, or find an answer, please visit the Hortonworks Community Connection.

Sandbox Deployment and Install Guide
by Patrick Hagan on August 9, 2018 at 3:13 am

The instructions were written well, except at the end where you have to put in the URL. It would have been better with a screen prints of the browser before and after initial URL is entered and the result. Right now it is not clear, which browser - outside VM or inside VM and which URL - the ones on the top screen or the ones on the bottom. My guess is the URL on the bottom on a browser outside the VM. Thank you.