As growing volumes of diverse data are channeled into the data lake, it becomes a centralized source of vital and often highly sensitive business data. In recognition of the need to protect business and customer data, enterprise-grade data lake security has been an integral part of Hortonworks Data Platform (HDP). Apache Ranger and Apache Knox are the two components that form the basis of centralized security administration and governance for enterprise big data in HDP. Ranger provides centralized Hadoop security administration and management, while Knox streamlines security for the services and users that access cluster data and execute jobs.
Deploy the Most Comprehensive Authorization Model Across the Big Data Ecosystem
Apache Ranger provides the most comprehensive centralized platform to define, administer and manage security policies consistently across Hadoop components, including HDFS, YARN, Hive, HBase, Kafka, Storm, Solr, Knox, NiFi and Atlas.
Using Ranger, administrators can define Hadoop security policies at the database, table, column, and file levels, and can administer permissions for specific LDAP-based groups or individual users.
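To make the policy model concrete, here is an illustrative sketch (not an official client) of the JSON payload shape that Ranger's public REST API (`POST /service/public/v2/api/policy`) accepts for a resource-based Hive policy. The service, policy, database, table, column and group names below are hypothetical.

```python
# Hedged sketch: build a Ranger-style resource-based policy for Hive that
# grants an LDAP-backed group access at database/table/column granularity.
# All concrete names (service, group, table) are illustrative assumptions.

def build_hive_policy(service, name, database, table, columns, group, accesses):
    """Return a Ranger-style policy dict granting `accesses` on the given
    database/table/columns to one group."""
    return {
        "service": service,
        "name": name,
        "isEnabled": True,
        "resources": {
            "database": {"values": [database], "isExcludes": False},
            "table": {"values": [table], "isExcludes": False},
            "column": {"values": columns, "isExcludes": False},
        },
        "policyItems": [
            {
                "groups": [group],
                "accesses": [{"type": a, "isAllowed": True} for a in accesses],
            }
        ],
    }

policy = build_hive_policy(
    "cl1_hive", "sales_readonly",             # hypothetical service/policy names
    "sales", "orders", ["order_id", "total"],
    "analysts", ["select"],
)
```

In practice the payload would be posted to the Ranger Admin server, which then distributes it to the relevant component plugin for enforcement.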
Implement Dynamic Security Policies
The integration of Apache Atlas and Apache Ranger represents a paradigm shift for enterprise big data governance and security in Apache Hadoop. By integrating Atlas with Ranger, enterprises can now implement dynamic classification-based security policies in addition to role-based security. Ranger’s centralized platform empowers data administrators to define security policy based on Atlas metadata tags or attributes and apply this policy in real time to the entire hierarchy of data assets, including databases, tables and columns.
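A tag-based policy differs from the resource-based one in that it names no databases or tables: it targets an Atlas classification, and Ranger applies it to every asset carrying that tag. A hedged sketch of such a payload, using a hypothetical "PII" tag and tag-service name:

```python
# Hedged sketch: a Ranger tag-based policy that denies a group access to any
# asset classified "PII" in Atlas. The tag service name "cl1_tag" and group
# "contractors" are illustrative assumptions; in tag policies, access types
# are prefixed with the component they apply to (e.g. "hive:select").

def build_tag_policy(tag, group, accesses, service="cl1_tag"):
    return {
        "service": service,
        "name": f"deny_{tag.lower()}_access",
        "isEnabled": True,
        "resources": {"tag": {"values": [tag], "isExcludes": False}},
        "denyPolicyItems": [
            {
                "groups": [group],
                "accesses": [{"type": a, "isAllowed": True} for a in accesses],
            }
        ],
    }

tag_policy = build_tag_policy("PII", "contractors", ["hive:select"])
```

Because the policy follows the tag rather than the asset, newly ingested tables that Atlas classifies as "PII" are covered automatically, with no policy change.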
Blog: Apache Ranger Graduates to a Top Level Project – Part 2
Conduct Scalable and Authoritative Audits
Ranger provides administrators with deep visibility into the security administration process, a prerequisite for complying with enterprise data governance audits. The combination of a rich user interface and deep audit visibility answers auditors' questions and enhances the productivity of data lake security administrators. Ranger also provides a centralized framework to collect and report on access audit history, including the ability to filter on various parameters. The audit information aggregated from the different components of HDP provides granular insights through centralized reporting.
White Paper: Path to GDPR Compliance Begins with Data Governance
Balance the Need for Data Access Without Sacrificing Strong Security
Dynamic data masking via Apache Ranger enables Hadoop security administrators to ensure that authorized users see sensitive data in the clear, while for other users or groups the same data is masked or anonymized to protect its sensitive content. Dynamic data masking neither physically alters the data nor makes a copy of it; the original sensitive data never leaves the data store, and is obfuscated only when it is presented to the user.
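The semantics can be illustrated with a small sketch: the stored value is never modified, and a masked view is computed per request depending on who asks. The mask behaviors below loosely mirror Ranger/Hive mask types such as MASK_SHOW_LAST_4 and MASK_HASH, but they are simplified stand-ins, not the actual implementations.

```python
# Simplified illustration of dynamic masking semantics (not Ranger's code):
# the original value stays untouched in the store; only the returned
# representation differs by caller.

import hashlib

def mask_show_last_4(value: str) -> str:
    """Obscure all but the last four characters (simplified stand-in)."""
    if len(value) <= 4:
        return value
    return "x" * (len(value) - 4) + value[-4:]

def mask_hash(value: str) -> str:
    """Replace the value with a one-way hash (simplified stand-in)."""
    return hashlib.sha256(value.encode()).hexdigest()

def present(value: str, user_is_authorized: bool) -> str:
    # Per-request decision: clear text for authorized users, masked otherwise.
    return value if user_is_authorized else mask_show_last_4(value)

ssn = "123-45-6789"
assert present(ssn, True) == "123-45-6789"
assert present(ssn, False) == "xxxxxxx6789"
```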
Row-level security enables security administrators to define precisely which rows in a database table a user can access, based on user attributes such as membership in a specific group or the runtime context in which the data is queried. This functionality strengthens access control over Apache Hive tables in HDP.
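Conceptually, row-level filtering behaves like a per-user query rewrite: the enforcement layer, in effect, injects a filter predicate into queries against protected tables. A hedged sketch of that idea, with a hypothetical group-to-predicate mapping:

```python
# Sketch of row-level filtering as a query rewrite. The table, group names,
# and predicates are illustrative assumptions, not real Ranger configuration.

ROW_FILTERS = {
    ("sales.orders", "us_analysts"): "region = 'US'",
    ("sales.orders", "emea_analysts"): "region = 'EMEA'",
}

def apply_row_filter(query: str, table: str, group: str) -> str:
    """Wrap the user's query so only rows matching the group's predicate
    are visible; queries with no matching filter pass through unchanged."""
    predicate = ROW_FILTERS.get((table, group))
    if predicate is None:
        return query
    return f"SELECT * FROM ({query}) t WHERE {predicate}"

q = apply_row_filter("SELECT * FROM sales.orders", "sales.orders", "us_analysts")
```

Two analysts running the identical query against the same table thus see different row sets, with no change to the table or the query they wrote.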
Blog: Dynamic Column Masking & Row-Level Filtering in HDP 2.5
Streamline Security for Users and Services
The Apache Knox Gateway (“Knox”) extends the reach of Apache Hadoop® services to users outside of a Hadoop cluster without compromising security. Knox is designed as a reverse proxy that can be deployed in the cloud or on-premises. Knox also simplifies Hadoop security for users who access cluster data and execute jobs by providing a centralized gateway. Knox integrates with the identity management and SSO systems used in enterprises and allows identities from these systems to be used for access to Hadoop clusters. The Knox Gateway provides security for multiple Hadoop clusters by:
Extending Hadoop’s REST/HTTP services by encapsulating Kerberos within the cluster
Exposing Hadoop’s REST/HTTP services without revealing network details and providing SSL out of the box
Enforcing REST API security centrally and routing requests to multiple Hadoop clusters
Supporting LDAP, Active Directory, SSO, SAML and other authentication systems
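The reverse-proxy pattern above means clients address a single gateway endpoint of the form `https://<knox-host>:8443/gateway/<topology>/...` rather than individual cluster hosts. A small sketch of that URL mapping for a proxied WebHDFS call; the host and topology names are assumptions:

```python
# Sketch of Knox's gateway URL pattern for WebHDFS. The client talks only to
# the gateway over SSL; it never needs the NameNode's address or direct
# Kerberos credentials. "knox.example.com" and "default" are assumptions.

from urllib.parse import quote

def knox_webhdfs_url(gateway_host, topology, hdfs_path, op="OPEN", port=8443):
    """Build the Knox gateway URL that proxies a WebHDFS request."""
    return (f"https://{gateway_host}:{port}/gateway/{topology}"
            f"/webhdfs/v1{quote(hdfs_path)}?op={op}")

url = knox_webhdfs_url("knox.example.com", "default", "/data/sales.csv")
```

Pointing the same function at a different topology name would route the request to a different backing cluster, which is how one gateway fronts multiple clusters.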
Blog: Simplifying the Fortification of your Data Lake with Apache Knox