
Data Science @ Scale

HDP and IBM Data Science Experience

Improve the success of your data science initiatives

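Download the white paper

Data science is a game changer for enterprises

Data science is an interdisciplinary field that combines machine learning, statistics, advanced analytics, and programming. It is a new art form that draws out hidden insights and puts data to work in the cognitive era.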

IBM Data Science Experience (DSX) is an enterprise platform for data scientists and data engineers. It offers out-of-the-box open-source and commercial data science tools including RStudio, Apache Spark, Jupyter, and Zeppelin notebooks. DSX supports the entire data science lifecycle from data preparation and ETL to model development and deployment. With DSX, companies can build predictive and machine learning models using their favorite tools, technologies, and libraries, while leveraging the scale, security and governance of the HDP platform.


The data science lifecycle

Benefits

Access to community

DSX provides a social environment where data scientists can research and share articles, data sets, notebooks, and tutorials. DSX helps data scientists and analysts come up to speed by taking courses in R, Python, or Scala, copying content into a Jupyter or Zeppelin notebook, or working in an embedded RStudio environment.

  • Find tutorials and datasets
  • Connect with data scientists and ask questions
  • Research articles and papers
  • Fork and share projects
Blog: Certification of IBM Data Science Experience (DSX) on HDP is a Win-Win for Customers

Use familiar open source tools and libraries

With DSX, data scientists have the flexibility to create new Jupyter or Zeppelin notebooks in R, Python, or Scala, or to import an existing notebook. DSX includes popular open source libraries such as PySpark, matplotlib, and SparkML, along with machine learning and deep learning APIs. Data scientists can use DSX to tell a compelling story with the help of open source visualization libraries like Brunel and PixieDust, and they can install other open source libraries of their choice.

  • Code in Scala, Python, R, Apache Spark and SQL
  • Visualize and share code using Zeppelin & Jupyter Notebooks
  • Leverage RStudio IDE and Shiny
  • Use your favorite libraries including Scikit-learn, XGBoost, Spark MLlib, TensorFlow, Caffe, Keras and MXNet
Webinar: From Data Science to Enterprise Data Science @ Scale

Operationalize models with one click

With DSX, administrators can deploy models with one click and monitor all runtime environments and services.

  • Data Shaping Pipeline UI
  • Auto-data preparation & modeling
  • Advanced Visualizations
  • Model management & deployment
  • Documented Model APIs
Solution Brief: Data Science Machine Learning

Scale and enterprise security

The combination of HDP and DSX empowers enterprises to run data science at scale by leveraging all the data in the data lake while applying enterprise-grade security, governance, and operations.

  • Data Science at Scale - Run Spark Jobs on HDP Cluster
  • Secure Hadoop Support using Apache Ranger
  • Support for attribute-based access control (ABAC) using Apache Ranger
Blog: An Exciting Data Science Experience on HDP