
Hortonworks Sandbox Tutorials
for Apache Hadoop

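Get started with Hadoop using these tutorials based on the Hortonworks Sandbox.

Develop with Hadoop

Start developing with Hadoop. These tutorials are designed to ease your way into developing with Hadoop:

Apache Spark on HDP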

Introduction Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs in Scala, Java, Python, and R that allow developers to execute a variety of data intensive workloads. In this tutorial, we will use an Apache Zeppelin notebook for our development environment to keep things simple and elegant. Zeppelin will […]

Introduction This tutorial will get you started with Apache Spark and will cover: how to run Spark on YARN with the SparkPi and WordCount examples; how to use the Spark DataFrame & Dataset API; how to use the SparkSQL Thrift Server for JDBC/ODBC access; and how to use SparkR. You will mostly use Spark 1.6.x with some Spark […]
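The SparkPi example named in this teaser estimates π by Monte Carlo sampling. The following is a minimal plain-Python sketch of that idea only. It does not use Spark and is not the tutorial's code. The function name `estimate_pi`, the seed, and the sample count are illustrative assumptions.

```python
import random


def estimate_pi(num_samples: int, seed: int = 0) -> float:
    """Estimate pi by sampling random points in the unit square.

    Hypothetical helper for illustration; not part of Spark or the tutorial.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        # Count points that land inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # (quarter-circle area) / (unit-square area) = pi / 4
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    print(estimate_pi(100_000))
```

In a distributed run, the sampling loop would be split across workers and only the per-worker counts would be summed, which is why this kind of job is a common first test for a cluster.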

Introduction Apache Zeppelin is a web-based notebook that enables interactive data analytics. With Zeppelin, you can make beautiful data-driven, interactive and collaborative documents with a rich set of pre-built language backends (or interpreters) such as Scala (with Apache Spark), Python (with Apache Spark), SparkSQL, Hive, Markdown, Angular, and Shell. With a focus on Enterprise, Zeppelin […]

Introduction In this two-part lab-based tutorial, we will first introduce you to Apache Spark SQL. Spark SQL is a higher-level Spark module that allows you to operate on DataFrames and Datasets, which we will cover in more detail later. In the second part of the lab, we will explore an airline dataset using high-level SQL […]

Introduction This tutorial will teach you how to build sentiment analysis algorithms with Apache Spark. We will be doing data transformation using Scala and Apache Spark 2, and we will be classifying tweets as happy or sad using a Gradient Boosting algorithm. Although this tutorial is focused on sentiment analysis, Gradient Boosting is a versatile […]

Introduction This tutorial will teach you how to set up a full development environment for developing and debugging Spark applications. For this tutorial we’ll be using Java, but Spark also supports development with Scala, Python, and R. The Scala version of this tutorial can be found here, and the Python version here. We’ll be using […]

Introduction This tutorial will teach you how to set up a full development environment for developing and debugging Spark applications. For this tutorial we’ll be using Python, but Spark also supports development with Java, Scala, and R. The Scala version of this tutorial can be found here, and the Java version here. We’ll be using […]

Introduction This tutorial will teach you how to set up a full development environment for developing and debugging Spark applications. For this tutorial we’ll be using Scala, but Spark also supports development with Java, Python, and R. The Java version of this tutorial can be found here, and the Python version here. We’ll be using […]

Introduction In this tutorial, we will introduce you to Machine Learning with Apache Spark. The hands-on lab for this tutorial is an Apache Zeppelin notebook that has all the steps necessary to ingest and explore data, train, test, visualize, and save a model. We will cover a basic Linear Regression model that will allow us […]

Introduction This is the third tutorial in a series about building and deploying machine learning models with Apache NiFi and Spark. In Part 1 of the series we learned how to use NiFi to ingest and store Twitter Streams. In Part 2 we ran Spark from a Zeppelin notebook to design a machine learning model […]

Hello World

Introduction In this tutorial, you will learn about the different features available in the HDF sandbox. HDF stands for Hortonworks DataFlow. HDF was built to make processing data-in-motion easier while also directing the data from source to destination. You will learn about quick links to access these tools, so that when you […]

Introduction This tutorial is aimed at users who do not have much experience in using the Sandbox. We will install and explore the Sandbox on virtual machine and cloud environments. We will also navigate the Ambari user interface. Let’s begin our Hadoop journey. Prerequisites Downloaded and Installed Hortonworks Sandbox Allow yourself around one hour to […]

This tutorial will help you get started with Hadoop and HDP.

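Traffic congestion is a problem for commuters. A team of city planners is working together to choose a location for a new highway based on traffic patterns. Because they work with historical, aggregated traffic volume data, raw real-time data poses a problem for traffic analysis. They chose NiFi for real-time data integration because it can ingest, filter, and store data in motion. Watch how the team uses NiFi to gain deeper insight into traffic patterns and decide where to build the new highway.

This tutorial introduces Apache HBase and Apache Phoenix, along with the new Backup and Restore utility in HBase that was introduced in HDP 2.5. Enjoy HADOOPING!!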

This Hadoop tutorial shows how to Process Data with Hive using a set of driver data statistics.

This Hadoop tutorial shows how to Process Data with Apache Pig using a set of driver data statistics.

In this tutorial, we will load and review data for a fictitious web retail store in what has become an established use case for Hadoop: deriving insights from large data sources such as web logs.

How to get started with Cascading and Hortonworks Data Platform using the Word Count example.

If you run into any errors while completing this tutorial, please ask questions or let us know through Hortonworks Community Connection! This is the second tutorial for Java developers learning Cascading and Hortonworks Data Platform (HDP). Other tutorials include: Word Count with Cascading on HDP 2.3 Sandbox; Log Parsing with Cascading on HDP […]

Learn how to use Cascading Pattern to quickly migrate predictive models (PMML) from SAS, R, and MicroStrategy to Hadoop and deploy them at scale.

Introduction Apache HBase is a NoSQL database in the Hadoop ecosystem. Many business intelligence and data analytics tools lack the ability to work with HBase data directly. Apache Phoenix enables you to interact with HBase using SQL. In HDP 2.5, we have introduced support for ODBC drivers. With this, you can connect any ODBC […]

How to use Apache Storm to process real-time streaming data in Hadoop with Hortonworks Data Platform.

How to use Apache Tez and Apache Hive for Interactive Query with Hadoop and Hortonworks Data Platform 2.5

In this tutorial, we will show how to run Solr in Hadoop with the index (Solr data files) stored on HDFS, and how to use a MapReduce job to index files.

Define an end-to-end data pipeline and policies for Hadoop and Hortonworks Data Platform 2.1 with Apache Falcon.

Standard SQL provides ACID operations through INSERT, UPDATE, DELETE, transactions, and the more recent MERGE operations. These have proven to be robust and flexible enough for most workloads. Hive offers INSERT, UPDATE and DELETE, with more capabilities on the roadmap.

Introduction In this tutorial for Hadoop developers, we will dive into the core concepts of Apache Hadoop and examine the process of writing a MapReduce program. Prerequisites Downloaded and installed the latest Hortonworks Sandbox Learning the Ropes of the Hortonworks Sandbox Outline Hadoop Step 1: Explore the Core Concepts of Apache Hadoop 1.1 What is MapReduce? 1.2 […]
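To make the "What is MapReduce?" step concrete, here is a minimal plain-Python sketch of the map, shuffle, and reduce phases applied to word counting. It runs in a single process and does not use Hadoop. All function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration; this is not the tutorial's program.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

On a real cluster the map and reduce functions run on many nodes and the shuffle moves data between them; the sketch only shows how the three phases fit together.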

Real World Examples

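A very common request from many customers is to be able to index text in image files; for example, text in scanned JPEG files. In this tutorial, we will show how to do this with Solr. Prerequisites Download the Hortonworks Sandbox Complete the Learning the Ropes of the HDP Sandbox tutorial. Step-by-step guide […]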

This tutorial will cover the core concepts of Storm and the role it plays in an environment where real-time, low-latency and distributed data processing is important.

Introduction Note: This tutorial is deprecated and meant for the HDP 2.5 Sandbox. That sandbox can be found under the “Hortonworks Sandbox Archive” section of the Hortonworks Download Page. Apache Falcon simplifies the configuration of data motion with: replication; lifecycle management; lineage and traceability. This provides data governance consistency across Hadoop components. Scenario In this […]

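Learn to ingest real-time data from car sensors with NiFi and send it to Hadoop. Use Apache Kafka to capture that data between NiFi and Storm for scalability and reliability. Deploy a Storm topology that pulls the data from Kafka and performs complex transformations to combine geolocation data from trucks with sensor data from trucks and roads. Once all subprojects are completed, deploy the driver monitor demo web application to see driver behavior, predictions and Drools data in 3 different map visualizations.

How do you improve the chances that your online customers will complete a purchase? Hadoop makes it easier to analyze, and change, how visitors behave on your website. Here you can see how an online retailer optimized buying paths to reduce bounce rate and improve conversion. HDP can help you capture and refine website clickstream data to exceed your company's e-commerce goals. The tutorial that accompanies this video describes how to refine raw clickstream data using HDP.

When a security breach occurs, server log analysis helps you identify the threat and then better protect yourself in the future. See how Hadoop takes server log analysis to the next level by speeding up prediction, retaining log data for longer, and demonstrating compliance with IT policies. The tutorial that accompanies this video describes how to refine raw server log data using HDP.

With Hadoop, you can mine Twitter, Facebook, and other social media conversations to analyze customer sentiment about you and your competitors. With more social big data, you can make more targeted, real-time decisions. The tutorial that accompanies this video describes how to refine raw tweet data using HDP.

Machines know a lot. Sensors stream low-cost, always-on data. Hadoop makes it easier to store and refine that data and identify meaningful patterns, giving you the insight to make proactive business decisions using predictive analytics. See how Hadoop can be used to analyze heating, ventilation, and air conditioning data to maintain ideal office temperatures and minimize expenses.

RADAR is a software solution for retailers built using ITC Handy tools (NLP and Sentiment Analysis engine) and utilizing Hadoop technologies …

Introduction H2O is the open source in-memory solution from 0xdata for predictive analytics on big data. It is a math and machine learning engine that brings distributed, parallel processing to powerful algorithms, enabling you to make better predictions and more accurate models faster. With familiar APIs like R and JSON, as well as […]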

Hadoop Administration

Get started with Hadoop administration. These tutorials are designed to ease your way into administering Hadoop:

Hortonworks Sandbox

The Hortonworks Sandbox is delivered as a Dockerized container with the most common ports already opened and forwarded for you. If you would like to open even more ports, check out this tutorial.

Welcome to the Hortonworks Sandbox! Look at the attached sections for sandbox documentation.

The Hortonworks Sandbox can be installed on a variety of virtualization platforms, including VirtualBox, Docker, VMware, and Azure.

Operations

Introduction The Azure cloud infrastructure has become a common place for users to deploy virtual machines on the cloud due to its flexibility, ease of deployment, and cost benefits. Microsoft has expanded Azure to include a marketplace with thousands of certified, open source, and community software applications and developer services, pre-configured for Microsoft Azure. This […]

Introduction The Hortonworks Sandbox running on Azure requires opening ports a bit differently than when the sandbox is running locally on VirtualBox or Docker. We’ll walk through how to open a port in Azure so that outside connections make their way into the sandbox, which is a Docker container inside an Azure virtual machine. Note: […]

Introduction Note: This tutorial is deprecated and meant for the HDP 2.5 Sandbox. That sandbox can be found under the “Hortonworks Sandbox Archive” section of the Hortonworks Download Page. Apache Falcon is a framework to simplify data pipeline processing and management on Hadoop clusters. It makes it much simpler to onboard new workflows/pipelines, with support […]

Introduction Note: This tutorial is deprecated and meant for the HDP 2.5 Sandbox. That sandbox can be found under the “Hortonworks Sandbox Archive” section of the Hortonworks Download Page. Apache Falcon is a framework to simplify data pipeline processing and management on Hadoop clusters. It provides data management services such as retention, replications across clusters, […]

Introduction In this tutorial, we will explore how to quickly and easily deploy Apache Hadoop with Apache Ambari. We will spin up our own VM with Vagrant and Apache Ambari. Vagrant is very popular with developers as it lets one mirror the production environment in a VM while staying with all the IDEs and tools in the comfort […]

Introduction Apache Falcon is a framework to simplify data pipeline processing and management on Hadoop clusters. It makes it much simpler to onboard new workflows/pipelines, with support for late data handling and retry policies. It also lets you easily define relationships between various data and processing elements and integrate with a metastore/catalog such as Hive/HCatalog. Finally, […]

Introduction In this tutorial we are going to explore how we can configure YARN Capacity Scheduler from Ambari. YARN’s Capacity Scheduler is designed to run Hadoop applications in a shared, multi-tenant cluster while maximizing the throughput and the utilization of the cluster. Traditionally each organization has its own private set of compute resources that have […]

Apache Hadoop clusters grow and change with use. Maybe you used Apache Ambari to build your initial cluster with a base set of Hadoop services targeting known use cases and now you want to add other services for new use cases. Or you may just need to expand the storage and processing capacity of the […]

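In this tutorial, we will walk through many of the common, basic Hadoop Distributed File System (HDFS) commands you will need to manage files on HDFS. The datasets we will use to learn HDFS file management are San Francisco salaries from 2011-2014.

Previously, we introduced the ability to create snapshots to protect important enterprise data assets against user or application errors. HDFS Snapshots are read-only point-in-time copies of the file system. Considerations when using Snapshots on a subtree of the file system or the entire file system: Performance and reliability: snapshot creation is atomic, and […]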

Real World Examples

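Introduction This tutorial is aimed at users who do not have much experience in using the Sandbox. We will install and explore the Sandbox on virtual machine and cloud environments. We will also navigate the Ambari user interface. Let’s begin our Hadoop journey. Prerequisites Downloaded and Installed Hortonworks Sandbox Allow yourself around one hour to […]

Security

In this tutorial, we will explore how to use policies in HDP Advanced Security to protect your enterprise data lake, and how to audit user access to resources in HDFS, Hive, and HBase from a centralized HDP Security Administration console.

Introduction Apache Ranger delivers a comprehensive approach to security for a Hadoop cluster. It provides central security policy administration across the core enterprise security requirements of authorization, accounting, and data protection. Apache Ranger already extends baseline features for coordinated enforcement across Hadoop workloads, from batch and interactive SQL to real-time in Hadoop. In this tutorial, […]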

Introduction Hortonworks has recently announced the integration of Apache Atlas and Apache Ranger, and introduced the concept of tag or classification based policies. Enterprises can classify data in Apache Atlas and use the classification to build security policies in Apache Ranger. This tutorial walks through an example of tagging data in Atlas and building a […]

Protegrity Avatar™ for Hortonworks® extends the capabilities of HDP native security with Protegrity Vaultless Tokenization (PVT), Extended HDFS Encryption, and the Protegrity Enterprise Security Administrator for advanced data protection policy, key management, and auditing. In the Protegrity Avatar for Hortonworks Sandbox add-on and tutorial, you will learn how to: protect and unprotect field-level data using policy-based […]

The hosted Hortonworks Sandbox from Bit Refinery provides an easy way to experience and learn Hadoop. All the tutorials available from HDP work just as if you were running a local version of the Sandbox. Here is how our “flavor” of Hadoop interacts with the Hortonworks platform: Our new tutorial will […]

Introduction Hortonworks introduced Apache Atlas as part of the Data Governance Initiative, and has continued to deliver on the vision for an open source solution for centralized metadata store, data classification, data lifecycle management and centralized security. Atlas is now offering, as a tech preview, cross component lineage functionality, delivering a complete view of data movement […]

Introduction In this tutorial we will walk through the process of: configuring Apache Knox and LDAP services on the HDP Sandbox; running a MapReduce program using the Apache Knox Gateway Server. Prerequisites Download Hortonworks 2.5 Sandbox. Complete the Learning the Ropes of the Hortonworks Sandbox tutorial; you will need it for logging into Ambari. Outline Concepts 1: […]

Introduction HDP 2.5 ships with Apache Knox 0.6.0. This release of Apache Knox supports WebHDFS, WebHCAT, Oozie, Hive, and HBase REST APIs. Apache Hive is a popular component used for SQL access to Hadoop, and the Hive Server 2 with Thrift supports JDBC access over HTTP. The following steps show the configuration to enable a […]

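Securing any system requires you to implement layers of protection. Access control lists (ACLs) are typically applied to data to restrict access to approved entities. Applying ACLs at every layer of data access is critical to securing a system. The layers of Hadoop are depicted in this diagram, and in this […]

Security and Governance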

Introduction Hortonworks has recently announced the integration of Apache Atlas and Apache Ranger, and introduced the concept of tag or classification based policies. Enterprises can classify data in Apache Atlas and use the classification to build security policies in Apache Ranger. This tutorial walks through an example of tagging data in Atlas and building a […]

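Introduction Hortonworks introduced Apache Atlas as part of the Data Governance Initiative, and has continued to deliver on the vision for an open source solution for centralized metadata store, data classification, data lifecycle management and centralized security. Atlas is now offering, as a tech preview, cross component lineage functionality, delivering a complete view of data movement […]

Hadoop for Data Scientists and Analysts

Get started analyzing data in Hadoop. These tutorials are designed to help you make the most of your data with Hadoop:

From our partners

Introduction JReport is an embedded BI reporting tool that can easily extract and visualize data from Hortonworks Data Platform 2.3 using the Apache Hive JDBC driver. You can then create reports, dashboards, and data analysis that can be embedded into your own applications. In this tutorial, we will walk through the following steps to […]

Pivotal HAWQ provides strong support for low-latency analytic SQL queries, coupled with massively parallel machine learning capabilities on Hortonworks Data Platform (HDP). HAWQ is the world's most advanced SQL-on-Hadoop engine. It provides the richest SQL dialect, with an extensive data science library called MADlib, at millisecond query response times. HAWQ enables discovery-based analysis […]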

Introduction to Data Analysis with Hadoop

Introduction R is a popular tool for statistics and data analysis. It has rich visualization capabilities and a large collection of libraries that have been developed and maintained by the R developer community. One drawback to R is that it’s designed to run on in-memory data, which makes it unsuitable for large datasets. Spark is […]

This Hadoop tutorial shows how to Process Data with Hive using a set of driver data statistics.

This Hadoop tutorial shows how to Process Data with Apache Pig using a set of driver data statistics.

How to use Apache Tez and Apache Hive for Interactive Query with Hadoop and Hortonworks Data Platform 2.5

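This Hadoop tutorial gives you a working knowledge of Pig and hands-on experience creating Pig scripts to carry out essential data operations and tasks.

This Hadoop tutorial shows how to use HCatalog, Pig and Hive to load and process data using a set of driver data statistics.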

Learn how to visualize data using Microsoft BI and HDP with 10 years of raw stock ticker data from NYSE.

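In this tutorial, you will learn how to connect the Sandbox to Talend to quickly build test data for your Hadoop environment.

In this tutorial, users are introduced to Revolution R Enterprise and how it works with the Hortonworks Sandbox. A data file is first extracted from the Sandbox using ODBC and then analyzed using R functions inside Revolution R Enterprise.

Introduction Welcome to the QlikView (Business Discovery Tools) tutorial developed by Qlik™. This tutorial is designed to help you connect QlikView in minutes to access data from the Hortonworks Sandbox or Hortonworks Data Platform (HDP). QlikView lets you immediately personalize analysis and discover insights in the data residing in the Sandbox […]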

Real World Examples

This tutorial will cover the core concepts of Storm and the role it plays in an environment where real-time, low-latency and distributed data processing is important.

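How do you improve the chances that your online customers will complete a purchase? Hadoop makes it easier to analyze, and change, how visitors behave on your website. Here you can see how an online retailer optimized buying paths to reduce bounce rate and improve conversion. HDP can help you capture and refine website clickstream data to exceed your company's e-commerce goals. The tutorial that accompanies this video describes how to refine raw clickstream data using HDP.

When a security breach occurs, server log analysis helps you identify the threat and then better protect yourself in the future. See how Hadoop takes server log analysis to the next level by speeding up prediction, retaining log data for longer, and demonstrating compliance with IT policies. The tutorial that accompanies this video describes how to refine raw server log data using HDP.

With Hadoop, you can mine Twitter, Facebook, and other social media conversations to analyze customer sentiment about you and your competitors. With more social big data, you can make more targeted, real-time decisions. The tutorial that accompanies this video describes how to refine raw tweet data using HDP.

Machines know a lot. Sensors stream low-cost, always-on data. Hadoop makes it easier to store and refine that data and identify meaningful patterns, giving you the insight to make proactive business decisions using predictive analytics. See how Hadoop can be used to analyze heating, ventilation, and air conditioning data to maintain ideal office temperatures and minimize expenses.

RADAR is a software solution for retailers built using ITC Handy tools (NLP and Sentiment Analysis engine) and utilizing Hadoop technologies …

Introduction H2O is the open source in-memory solution from 0xdata for predictive analytics on big data. It is a math and machine learning engine that brings distributed, parallel processing to powerful algorithms, enabling you to make better predictions and more accurate models faster. With familiar APIs like R and JSON, as well as […]

Integration Guides from Partners

These tutorials illustrate key integration points with partner applications.

In this tutorial, you will learn how to build a 360-degree view of a retail business's customers using the Datameer Playground, which is built on the Hortonworks Sandbox.

In this tutorial, you will learn how to run ETL and build MapReduce jobs inside the Hortonworks Sandbox.

In this tutorial, you will learn how to connect the Sandbox to Talend to quickly build test data for your Hadoop environment.

Learn how to use Cascading Pattern to quickly migrate predictive models (PMML) from SAS, R, and MicroStrategy to Hadoop and deploy them at scale.

Learn to configure Business Intelligence and Reporting Tools (BIRT) to access data from the Hortonworks Sandbox. More than 2.5 million developers use BIRT to quickly gain personalized insights and analytics in Java/J2EE applications.

Connect Hortonworks Sandbox V2.0 with Hortonworks Data Platform 2.0 to Hunk™: Splunk Analytics for Hadoop. Hunk offers an integrated platform to rapidly explore, analyze, and visualize data that resides natively in Hadoop.

Learn how to set up the SAP portfolio of products (SQL Anywhere, Sybase IQ, BusinessObjects BI, HANA, and Lumira) with the Hortonworks Sandbox to tap into big data at the speed of business.

MicroStrategy uses Apache Hive (via an ODBC connection) as the de facto standard for SQL access in Hadoop. Establishing a connection from MicroStrategy to Hadoop and the Hortonworks Sandbox is illustrated here.

In this tutorial, users are introduced to Revolution R Enterprise and how it works with the Hortonworks Sandbox. A data file is first extracted from the Sandbox using ODBC and then analyzed using R functions inside Revolution R Enterprise.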

Learn how to visualize data using Microsoft BI and HDP with 10 years of raw stock ticker data from NYSE.

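Introduction Welcome to the QlikView (Business Discovery Tools) tutorial developed by Qlik™. This tutorial is designed to help you connect QlikView in minutes to access data from the Hortonworks Sandbox or Hortonworks Data Platform (HDP). QlikView lets you immediately personalize analysis and discover insights in the data residing in the Sandbox […]

How to get started with Cascading and Hortonworks Data Platform using the Word Count example.

Introduction H2O is the open source in-memory solution from 0xdata for predictive analytics on big data. It is a math and machine learning engine that brings distributed, parallel processing to powerful algorithms, enabling you to make better predictions and more accurate models faster. With familiar APIs like R and JSON, as well as […]

RADAR is a software solution for retailers built using ITC Handy tools (NLP and Sentiment Analysis engine) and utilizing Hadoop technologies …

In this tutorial, we will walk you through loading and analyzing graph data with Sqrrl and HDP. Sqrrl just announced the availability of the latest Sqrrl Test Drive VM for use with the Hortonworks Sandbox, running HDP 2.1! This gives users a convenient way to try out Sqrrl's features without having to […]

This use case covers sentiment analysis and sales analysis with Hadoop and MySQL. It uses one Hortonworks Data Platform VM for the Twitter sentiment data and one MySQL database for the sales data.

Protegrity Avatar™ for Hortonworks® extends the capabilities of HDP native security with Protegrity Vaultless Tokenization (PVT), Extended HDFS Encryption, and the Protegrity Enterprise Security Administrator for advanced data protection policy, key management, and auditing. In the Protegrity Avatar for Hortonworks Sandbox add-on and tutorial, you will learn how to: protect and unprotect field-level data using policy-based […]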

Download the turn-key Waterline Data Sandbox preloaded with HDP, Waterline Data Inventory and sample data with tutorials in one package. Waterline Data Inventory enables users of Hadoop to find, understand, and govern data in their data lake. How do you get the Waterline Data advantage? It’s a combination of automated profiling and metadata discovery, and […]

The hosted Hortonworks Sandbox from Bit Refinery provides an easy way to experience and learn Hadoop. All the tutorials available from HDP work just as if you were running a local version of the Sandbox. Here is how our “flavor” of Hadoop interacts with the Hortonworks platform: Our new tutorial will […]

Hadoop is fast emerging as a mainstay in enterprise data architectures. To meet the increasing demands of business owners and resource constraints, IT teams are challenged to provide an enterprise grade cluster that can be consistently and reliably deployed. The complexities of the varied Hadoop services and their requirements make it more onerous and time […]