Schema Registry in Trucking IoT on HDF

Benefits of a Schema Registry

Introduction

So what is Schema Registry and what benefits does it provide? Would using it make a data pipeline more robust and maintainable? Let us explore exactly what Schema Registry is and how it fits into modern data architectures.

What is Schema Registry?

Schema Registry provides a centralized repository for schemas and metadata, allowing services to flexibly interact and exchange data with each other without the challenge of managing and sharing schemas between them.

Schema Registry supports multiple underlying schema representations (Avro, JSON, etc.) and can store a schema's corresponding serializer and deserializer.
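To make the idea of a centralized repository concrete, here is a minimal in-memory sketch. It is not the HDF Schema Registry API: the class `ToySchemaRegistry`, its `register` and `lookup` methods, and the `Vitals` schema are all hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToySchemaRegistry:
    """Hypothetical in-memory stand-in for a schema registry (not the HDF API)."""

    _by_id: dict = field(default_factory=dict)
    _ids_by_text: dict = field(default_factory=dict)

    def register(self, schema: dict) -> int:
        # Canonicalize the schema text so key order does not create duplicates.
        text = json.dumps(schema, sort_keys=True)
        if text not in self._ids_by_text:
            schema_id = len(self._by_id) + 1
            self._by_id[schema_id] = schema
            self._ids_by_text[text] = schema_id
        return self._ids_by_text[text]

    def lookup(self, schema_id: int) -> dict:
        # Any service holding the ID can fetch the full schema from one place.
        return self._by_id[schema_id]


# An Avro-style schema, invented for this example.
vitals_schema = {
    "type": "record",
    "name": "Vitals",
    "fields": [
        {"name": "patient_id", "type": "string"},
        {"name": "heart_rate", "type": "int"},
    ],
}

registry = ToySchemaRegistry()
schema_id = registry.register(vitals_schema)
print(schema_id, registry.lookup(schema_id)["name"])
```

Producers and consumers then share only the small integer ID, and each side resolves it against the same registry instead of exchanging schema files directly.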

Smaller Payloads

Typically, when data is serialized for transmission using a schema, the full schema text has to be transmitted along with the data, which increases the payload size.

With Schema Registry, all schemas are registered with a central system. Data producers no longer need to include the full schema text in the payload; they include only the ID of that schema, which also speeds up serialization.

Figure: Payload differences
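The size difference can be illustrated with a rough sketch. It assumes JSON encoding purely for readability (a real pipeline would more likely use a binary format such as Avro), and the schema, the record, and `SCHEMA_ID` are made up for the example.

```python
import json

# Avro-style schema and a sample record, both invented for illustration.
schema = {
    "type": "record",
    "name": "Vitals",
    "fields": [
        {"name": "patient_id", "type": "string"},
        {"name": "heart_rate", "type": "int"},
        {"name": "spo2", "type": "int"},
    ],
}
record = {"patient_id": "p-001", "heart_rate": 72, "spo2": 98}

# Hypothetical ID that a registry would have assigned to the schema.
SCHEMA_ID = 1

# Without a registry: the full schema text travels with every message.
with_schema = json.dumps({"schema": schema, "data": record}).encode("utf-8")

# With a registry: only the schema ID travels with the message.
with_id = json.dumps({"schema_id": SCHEMA_ID, "data": record}).encode("utf-8")

print("payload with full schema:", len(with_schema), "bytes")
print("payload with schema ID:  ", len(with_id), "bytes")
```

The saving applies to every message, so it grows with message volume and with the size of the schema.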

Differing Schemas

Consider the case where thousands of medical devices are reading the vitals of patients and relaying information back to a server.

The services and applications in your pipeline expect data in the specific format, and with the specific fields, that these medical devices use.

What about when medical devices from a different vendor are added to the system? Data in a different format carrying a different set of fields would typically require updates to the different components of your data pipeline.

Schema Registry enables generic format conversion and generic routing, allowing you to build a resilient pipeline that can handle data in different formats with varying sets of fields.
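One way to picture "generic" conversion and routing is a single function that takes its field names from a looked-up schema instead of hardcoding them per vendor. The sketch below is an assumption about how such logic could look, not how HDF implements it; `SCHEMAS`, `to_record`, `route`, and the vendor field names are hypothetical.

```python
# Hypothetical registry contents: schema ID -> ordered field names.
SCHEMAS = {
    1: {"name": "VendorAVitals", "fields": ["patient_id", "heart_rate"]},
    2: {"name": "VendorBVitals", "fields": ["patient_id", "pulse", "temperature_c"]},
}


def to_record(schema_id, values):
    """Convert a row of raw values into a dict using the registered schema."""
    fields = SCHEMAS[schema_id]["fields"]
    if len(values) != len(fields):
        raise ValueError(
            f"schema {schema_id} expects {len(fields)} values, got {len(values)}"
        )
    return dict(zip(fields, values))


def route(record):
    """Pick a destination based on which fields the record carries."""
    if "temperature_c" in record:
        return "temperature-checks"
    return "default"


vendor_a = to_record(1, ["p-001", 72])
vendor_b = to_record(2, ["p-002", 68, 37.9])

for rec in (vendor_a, vendor_b):
    print(route(rec), rec)
```

Adding a third vendor in this sketch means registering one more schema; `to_record` and `route` stay unchanged.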

Schema Evolution

Following the use case above, consider what happens when the software in some of the medical devices you collect data from is updated. Some devices now collect new data points, while others report the same limited set of fields as before. Similarly, consider what happens when a processing step in the pipeline is altered to output data with more or fewer fields than its previous version. Typically, in either case, the rest of your pipeline would need to be updated to handle these changes.

With Schema Registry, the different components in your architecture (IoT devices, routing logic, processing nodes, etc.) can evolve at different rates. A component can change the shape of its data while Schema Registry handles the translation from one schema to another, ensuring compatibility with downstream services.
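The translation between schema versions can be sketched as follows. This is a simplified illustration modeled on Avro-style resolution rules (fields missing from the incoming record take the reader's default, and fields unknown to the reader are dropped); the function `resolve` and both schema versions are hypothetical and not part of any Schema Registry API.

```python
def resolve(record, reader_schema):
    """Reshape a record to match the schema a downstream reader expects."""
    resolved = {}
    for spec in reader_schema["fields"]:
        name = spec["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in spec:
            # Field added in a newer schema: fall back to its declared default.
            resolved[name] = spec["default"]
        else:
            raise ValueError(f"no value or default for field {name!r}")
    # Fields in the record that the reader does not declare are dropped.
    return resolved


# Version 1: what older devices send and older services expect.
vitals_v1 = {
    "name": "Vitals",
    "fields": [{"name": "patient_id"}, {"name": "heart_rate"}],
}

# Version 2: adds an optional data point with a default.
vitals_v2 = {
    "name": "Vitals",
    "fields": [
        {"name": "patient_id"},
        {"name": "heart_rate"},
        {"name": "spo2", "default": None},
    ],
}

old_device = {"patient_id": "p-001", "heart_rate": 72}
new_device = {"patient_id": "p-002", "heart_rate": 80, "spo2": 97}

upgraded = resolve(old_device, vitals_v2)    # old data, new reader
downgraded = resolve(new_device, vitals_v1)  # new data, old reader
print(upgraded)
print(downgraded)
```

In both directions the reader receives exactly the fields it declares, which is what lets devices and services be upgraded independently.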

Next: A Closer Look At The Architecture

Next, we’ll go a bit more in depth and look at what different components make up Schema Registry and what they do for us.
