Enterprise Spark: Big Data Solutions at Scale

Hortonworks delivers Spark for enterprise deployment

Apache™ Spark Overview

Hortonworks is unleashing the power of the Apache Spark big data processing framework for enterprise scale, unifying the capabilities of open enterprise Apache Hadoop® and the in-memory analytic capabilities of Apache Spark to maximize organizational value.

Spark is Better as Part of the Platform
Spark is certified as YARN-ready and is part of Hortonworks Data Platform (HDP). Memory- and CPU-intensive enterprise Spark applications can coexist with other workloads deployed in a YARN-enabled cluster. Spark has first-class support for external data sources and can run directly on the cluster under YARN, which is where enterprises want to perform their data analysis. This approach avoids the need to create and manage dedicated enterprise Spark clusters and allows for more efficient resource use within a single cluster.

Spark Requires Enterprise-Grade Security and Governance
As part of HDP, Spark has access to the same governance, security and management policies as other components of the HDP stack. The Spark big data processing framework is one of the fastest-moving projects in the big data ecosystem, and its libraries remain at different levels of maturity. Hortonworks investigates, validates, certifies and then supports each of the components in the Spark project. This approach is key to the way we add value for our customers.

Notebooks Make Spark and Data Science Easier to Consume & Share
Web-based notebooks bring data ingestion, exploration, visualization, sharing and collaboration capabilities to Hadoop and Spark. Hortonworks is making a substantial investment in Apache Zeppelin; we plan to make Zeppelin ready for production use by making it easier to use, while adding security, stability and R support.

By delivering a unified Apache Spark and Hadoop, we combine Spark-driven agile analytic workflows with the vast data sets and economics of Hadoop. With Hortonworks, enterprises can deploy the Apache Spark big data processing framework with the industry’s best security, governance, and operations capabilities.

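How Committed Is Hortonworks to Spark?

With the release of Spark 1.6, Hortonworks is committed to helping customers accelerate data science, maintain seamless data access, and drive core innovation.

As part of open enterprise Hadoop, Spark lets organizations scale Spark for enterprise value.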

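Data Science Acceleration

Improve data science productivity by enhancing Apache Zeppelin and by contributing additional Spark algorithms and packages that simplify the deployment of key solutions.

For example: the Magellan project, geospatial analytics on Apache Spark. Magellan is an open source library for geospatial analytics that simplifies geospatial queries. Built on Spark, it addresses the hard problem of processing geospatial data at scale.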

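Seamless Data Access

Spark SQL provides SQL and DataFrame APIs for accessing structured data, while Spark Streaming lets developers easily build scalable, high-throughput, fault-tolerant stream processing of live data streams.

Hortonworks continues to improve Spark's integration with YARN, HDFS, Hive, HBase and ORC. In particular, we believe we can further optimize data access through the new Data Source API.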

Core Innovation

Enable RDD sharing through the HDFS memory tier

Contribute additional machine learning algorithms

Enhance enterprise Spark’s security, governance, operations, and readiness

To learn more about all of the exciting Spark innovations, see our Apache Spark page.

How Do You Get Started with Apache Spark at Scale?

Listen to our latest webinar: Spark at Scale with Hadoop