© 2016 IBM Corporation, Hadoop Summit – San Jose 2016
Apache Ranger Hive Metastore Security
Yan Zhou (zhouya@us.ibm.com),
Tanping Wang (wangta@us.ibm.com)
IBM BigInsights Product Lead Architects, Silicon Valley Lab, IBM

Apache Ranger
• Provides centralized policy definition for authorizing and auditing access to resources in a consistent manner.
[Architecture diagram. Components shown: Ranger Agents inside HBase, Hive, YARN, Knox, Storm, Solr, Kafka and HDFS; Audit Server; Policy Server; Administration Portal; REST APIs; DB; Solr; HDFS; KMS; LDAP/AD user/group sync; Log4j.]

HiveServer2 Ranger Authorization Model
[Workflow diagram. Components shown: Ranger Policy Manager, HiveServer2 with its Ranger Agent, Ranger Audit Database, user application, Beeline. The flows are numbered 1 to 5 in the original diagram; their labels are:]
• Admin sets policies for Hive databases/tables/columns in the Ranger Policy Manager.
• Users access Hive data through an application that connects to HiveServer2.
• IT/analysis users access HiveServer2 through Beeline.
• HiveServer2 uses the Ranger Agent for authorization.
• Audit logs are pushed to the Ranger Audit Database.
• HiveServer2 provides table data access to the user/client.
• Policy refreshing between the Ranger Policy Manager and the Ranger Agent.
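The authorization flow on this slide can be sketched roughly as follows. This is a minimal Python model, not Ranger's actual code: `Policy`, `PolicyManager`, `HiveAgent` and every field name are hypothetical, and real Ranger policies carry far more detail (groups, wildcards, columns). It only illustrates three ideas from the diagram: the agent decides from a locally cached copy of the policies, the cache is refreshed from the central Policy Manager, and every decision is audited.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Hypothetical, simplified policy: one user, one table, a set of allowed actions."""
    user: str
    database: str
    table: str
    actions: frozenset


@dataclass
class PolicyManager:
    """Stands in for the central policy store that the admin edits."""
    policies: list = field(default_factory=list)
    version: int = 0

    def add(self, policy):
        self.policies.append(policy)
        self.version += 1


@dataclass
class HiveAgent:
    """Stands in for the agent embedded in HiveServer2."""
    manager: PolicyManager
    cached: list = field(default_factory=list)
    cached_version: int = -1
    audit_log: list = field(default_factory=list)

    def refresh(self):
        # Policy refreshing: pull a new copy only when the central version changed.
        if self.cached_version != self.manager.version:
            self.cached = list(self.manager.policies)
            self.cached_version = self.manager.version

    def is_allowed(self, user, database, table, action):
        # The decision is made from the local cache, then recorded for auditing.
        allowed = any(
            p.user == user and p.database == database
            and p.table == table and action in p.actions
            for p in self.cached
        )
        self.audit_log.append((user, f"{database}.{table}", action, allowed))
        return allowed


manager = PolicyManager()
agent = HiveAgent(manager)
manager.add(Policy("hr1", "default", "employee", frozenset({"select"})))

before_refresh = agent.is_allowed("hr1", "default", "employee", "select")
agent.refresh()
after_refresh = agent.is_allowed("hr1", "default", "employee", "select")
denied_update = agent.is_allowed("hr1", "default", "employee", "update")
```

In this sketch a policy added by the admin takes effect only after the agent's next refresh, which is why the first check is denied and the second is allowed.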

Motivation:
Gaps for the Current Hive Ranger Authorization Model
Hive CLI
  DO NOT: Hive CLI does not work with Ranger.
HiveServer2
  DO:
  • Provides ACLs for databases, tables, columns and locks.
  • Supports Ranger policy creation or deletion from Hive GRANT or REVOKE statements.
  DO NOT: Does not adjust Hive-created policies as a result of DDL:
  • Once a DB object is renamed by DDL, the Hive-created policy in Ranger is out of sync.
  • Once a DB object is dropped, the Hive-created policy in Ranger becomes orphaned.

Motivation:
Gaps for the Current HiveServer2 Ranger Authorization Model (cont'd)
Resource ACL Sync Up
Storage-based Authorization
  GOOD: Consistent access control across Hive and HDFS.
  NOT GOOD: Poor at controlling SQL data access at finer granularity, such as columns.
SQL Standard-based Authorization
  GOOD: Fits well with the SQL standard privilege model.
  NOT GOOD: Does not provide consistent privileges across Hive and HDFS, and potentially prevents sharing Hive data with other Hadoop applications.
A holistic view of the HDFS and Hive ACLs is needed to provide consistent privilege control.

We Introduce:
The New Hive Metastore Ranger Security Agent
Hive CLI
  Provides:
  • ACLs for Hive CLI.
  Use case:
    hive> SELECT * FROM employee;
  Before: Hive decides the ACL on its own.
  After: Hive invokes the Hive Metastore Ranger security agent to get the ACL from Ranger.
HiveServer2
  Provides:
  • Authorization for the Metastore objects.
  • ACLs that stay in sync with the SQL objects at all times.
  Use case:
    hive> GRANT SELECT ON TABLE employee TO USER hr1;
    hive> ALTER TABLE employee RENAME TO employees;
  Before: The Ranger policy for user hr1 on table employee is left unchanged.
  After: The Ranger policy for hr1 is changed to apply to employees.
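The policy object sync described for HiveServer2 can be illustrated with a small sketch. It is written in Python with policies as plain dictionaries, which is an assumption made for readability; the function names `sync_on_rename` and `sync_on_drop` are hypothetical and do not come from the Ranger patch. The drop case is included to mirror the orphaned-policy gap named in the motivation.

```python
def sync_on_rename(policies, database, old_table, new_table):
    """Return policies with every reference to old_table retargeted to new_table."""
    return [
        {**p, "table": new_table}
        if p["database"] == database and p["table"] == old_table
        else p
        for p in policies
    ]


def sync_on_drop(policies, database, table):
    """Return policies without those that targeted the dropped table."""
    return [
        p for p in policies
        if not (p["database"] == database and p["table"] == table)
    ]


# State after: GRANT SELECT ON TABLE employee TO USER hr1;
policies = [
    {"user": "hr1", "database": "default", "table": "employee", "actions": ["select"]},
    {"user": "hr2", "database": "default", "table": "payroll", "actions": ["select"]},
]

# ALTER TABLE employee RENAME TO employees;
renamed = sync_on_rename(policies, "default", "employee", "employees")

# DROP TABLE employees;  (illustrative, not part of the slide's use case)
dropped = sync_on_drop(renamed, "default", "employees")
```

Both helpers return new lists rather than mutating their input, so the "before" and "after" states can be compared side by side.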

We Introduce:
The New Hive Metastore Ranger Security Agent (cont'd)
Resource ACL Sync Up
  Provides:
  • Consistent access control between Hive and HDFS for the SQL standard-based privilege model.
  Use case:
    beeline> CREATE TABLE employee(name STRING);  -- run by user hr1
    beeline> LOAD DATA LOCAL INPATH '/data/input.txt' OVERWRITE INTO TABLE employee;
    pig> A = LOAD '/user/hive/warehouse/employee' USING PigStorage() AS (name:chararray);
  Before: The Pig read is not allowed for user hr1.
  After: The Pig read is allowed for user hr1.

Ranger Hive Metastore Security Workflow – Hive CLI
[Workflow diagram. Components shown: Ranger Policy Manager, Hive CLI with the Ranger Metastore Agents, Ranger Audit Database, user application. The flows are numbered 1 to 5 in the original diagram; their labels are:]
• Admin sets policies for Hive databases/tables/columns in the Ranger Policy Manager.
• Users access Hive data through an application that invokes Hive CLI.
• IT/analysis users access Hive data through the CLI.
• Hive CLI uses the Ranger Metastore Agents for authorization and for policy object sync from DDL.
• Audit logs are pushed to the Ranger Audit Database.
• Hive CLI provides table data access to the user/client.
• Policy refreshing between the Ranger Policy Manager and the agents.

Ranger Hive Metastore Security Workflow – HiveServer2
[Workflow diagram. Components shown: Ranger Policy Manager, HiveServer2 with the Ranger HiveServer2 Agent and the Ranger Metastore Agents, Ranger Audit Database, user application, Beeline. The flows are numbered 1 to 6 in the original diagram; their labels are:]
• Admin sets policies for Hive databases/tables/columns in the Ranger Policy Manager.
• Users access Hive data through an application that connects to HiveServer2.
• IT/analysis users access HiveServer2 through Beeline.
• HiveServer2 uses the Ranger Agent for authorization.
• HiveServer2 uses the Ranger Metastore Agent for ACL object sync on DDL.
• Audit logs are pushed to the Ranger Audit Database.
• HiveServer2 provides table data access to the user/client.
• Policy refreshing between the Ranger Policy Manager and the agents.

Metastore Security Workflow – HDFS ACL Sync (Ongoing)
[Workflow diagram. Components shown: Ranger Policy Manager, HiveServer2 with the Ranger Metastore Agents, HDFS NameNode with the Ranger HDFS Agent, Pig. The flows are numbered 1 to 6 in the original diagram; their labels are:]
• Admin sets policies for Hive databases/tables/columns in the Ranger Policy Manager.
• IT/analysis user Joe issues "create table t1" through HiveServer2.
• HiveServer2 passes Hive metadata to the Metastore Agents.
• Passes HDFS security info to the Policy Manager.
• Sets new HDFS policy for Joe on /user/hive/warehouse/t1.
• Policy refreshing.
• Joe uses Pig to read Hive data in /user/hive/warehouse/t1.
• HDFS uses the Ranger HDFS Agent for authorization.
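The HDFS ACL sync step can be sketched as a mapping from a newly created table to an HDFS path policy for its creator. The Python below is only an illustration under stated assumptions: the policy dictionary layout, the `recursive` flag and the granted access types are hypothetical, and the function names are invented for this example. The path layout follows Hive's usual convention for managed tables (tables of the default database directly under the warehouse root, other databases under a `<database>.db` directory).

```python
import posixpath

WAREHOUSE_ROOT = "/user/hive/warehouse"  # warehouse path used on the slide


def table_location(database, table, warehouse_root=WAREHOUSE_ROOT):
    """Default location of a managed table, following Hive's usual layout."""
    if database == "default":
        return posixpath.join(warehouse_root, table)
    return posixpath.join(warehouse_root, f"{database}.db", table)


def hdfs_policy_for_new_table(owner, database, table):
    """Hypothetical HDFS policy giving the table creator access to its files."""
    return {
        "service": "hdfs",
        "path": table_location(database, table),
        "recursive": True,                          # assumption: cover partitions too
        "user": owner,
        "accesses": ["read", "write", "execute"],   # assumption: full access for the owner
    }


# Joe runs "create table t1" in the default database.
policy = hdfs_policy_for_new_table("joe", "default", "t1")

# A table in a non-default database lands under <database>.db.
sales_path = table_location("sales", "orders")
```

A table created with an explicit LOCATION clause would not follow this layout, so a real agent would need to read the location from the table metadata rather than derive it.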

Hive Security Hooks and Their Ranger Implementation/Extensions
[Class diagram: Hive hook on the left, Ranger class on the right.]
• HiveAuthorizer: implemented by RangerHiveAuthorizer
• MetaStorePreEventListener: extended by RangerHiveMetastoreAuthorizer
• MetaStoreEventListener: extended by RangerHiveMetastorePrivilegeHandler

Ranger Implementation/Extensions of Hive Security Hooks
• RangerHiveAuthorizer
  o Existing Ranger Hive Agent
  o Methods: check/grant/revokePrivileges
  o Handles: HiveServer2 authorization; Grant/Revoke
• RangerHiveMetastoreAuthorizer
  o New Ranger Hive Metastore Agent
  o Methods: on(Create/Drop/Alter)(Table/Database/Index/…)
  o Handles: CLI authorization
• RangerHiveMetastorePrivilegeHandler
  o New Ranger Hive Metastore Agent
  o Methods: (create/drop/alter)(Table/Database/Index/…)
  o Handles: Sync of Hive ACL objects and Resource ACLs
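The division of labor between the two new metastore hooks can be sketched as a pre-event check that may veto an operation and a post-event handler that updates privileges once the DDL has succeeded. The Python below is a toy model, not the Hive or Ranger API: `FakeMetastore`, `MetastoreAuthorizer`, `PrivilegeHandler`, `AccessDenied` and their method names are hypothetical stand-ins chosen to show the call order only.

```python
class AccessDenied(Exception):
    """Raised by the pre-event hook to stop an operation."""


class MetastoreAuthorizer:
    """Pre-event hook: vetoes an operation before the metastore applies it."""

    def __init__(self, grants):
        self.grants = grants  # {(user, table): {"select", "alter", ...}}

    def on_event(self, user, operation, table):
        if operation not in self.grants.get((user, table), set()):
            raise AccessDenied(f"{user} may not {operation} {table}")


class PrivilegeHandler:
    """Post-event hook: keeps grants aligned with the objects after DDL succeeds."""

    def __init__(self, grants):
        self.grants = grants

    def on_alter_table(self, old_table, new_table):
        # Iterate over a copy of the keys because the dict is modified in the loop.
        for (user, table) in list(self.grants):
            if table == old_table:
                self.grants[(user, new_table)] = self.grants.pop((user, table))

    def on_drop_table(self, table):
        for key in [k for k in self.grants if k[1] == table]:
            del self.grants[key]


class FakeMetastore:
    """Tiny stand-in for the metastore, only to show the call order of the hooks."""

    def __init__(self, tables, pre, post):
        self.tables, self.pre, self.post = set(tables), pre, post

    def rename_table(self, user, old, new):
        self.pre.on_event(user, "alter", old)   # 1. authorize
        self.tables.remove(old)                 # 2. apply the DDL
        self.tables.add(new)
        self.post.on_alter_table(old, new)      # 3. sync privileges

    def drop_table(self, user, table):
        self.pre.on_event(user, "drop", table)
        self.tables.remove(table)
        self.post.on_drop_table(table)


grants = {("hr1", "employee"): {"select", "alter"}}
store = FakeMetastore({"employee"}, MetastoreAuthorizer(grants), PrivilegeHandler(grants))

# Allowed: hr1 holds "alter", and the grant follows the table to its new name.
store.rename_table("hr1", "employee", "employees")

# Denied: hr1 holds no "drop" privilege, so the table and its grants are untouched.
try:
    store.drop_table("hr1", "employees")
    drop_denied = False
except AccessDenied:
    drop_denied = True
```

Running the check before the DDL and the sync after it means a denied operation never reaches the privilege handler, so policies change only for operations that actually took effect.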

Status, Future Plan and References
• Patch Ready:
  o CLI access control
  o Policy Object Sync from DDL
• Ongoing Work:
  o Resource ACL Sync
• References:
  o https://issues.apache.org/jira/browse/RANGER-768
  o https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Authorization
  o https://cwiki.apache.org/confluence/display/Hive/Storage+Based+Authorization+in+the+Metastore+Server

Demo
• Software Versions: Ranger 0.6.0 + Hadoop 2.7.0 + Hive 1.2.1
• Test Cases:
  With the Ranger HiveServer2 Agent but without the Ranger Hive Metastore Security Agents
  • CLI: SQL is not subject to Ranger ACLs
  • HiveServer2: no object sync of Ranger ACLs as a result of SQL DDL
  With the Ranger HiveServer2 Agent and the Ranger Hive Metastore Security Agents
  • CLI: SQL is subject to Ranger ACLs
  • HiveServer2: object sync of Ranger ACLs as a result of SQL DDL

Q & A

More Related Content

What's hot

Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark Summit
 
Getting Started with Databricks SQL Analytics
Getting Started with Databricks SQL AnalyticsGetting Started with Databricks SQL Analytics
Getting Started with Databricks SQL AnalyticsDatabricks
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 
How Impala Works
How Impala WorksHow Impala Works
How Impala WorksYue Chen
 
Streaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache KafkaStreaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache Kafkaconfluent
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...Databricks
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta LakeDatabricks
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala InternalsDavid Groozman
 
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...StreamNative
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsAlluxio, Inc.
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowJulien Le Dem
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Sadayuki Furuhashi
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheDremio Corporation
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotXiang Fu
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine kiran palaka
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architectureAdam Doyle
 
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...Databricks
 

What's hot (20)

Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Getting Started with Databricks SQL Analytics
Getting Started with Databricks SQL AnalyticsGetting Started with Databricks SQL Analytics
Getting Started with Databricks SQL Analytics
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
How Impala Works
How Impala WorksHow Impala Works
How Impala Works
 
Streaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache KafkaStreaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache Kafka
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala Internals
 
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Presto: SQL-on-anything
Presto: SQL-on-anythingPresto: SQL-on-anything
Presto: SQL-on-anything
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
 
Apache flink
Apache flinkApache flink
Apache flink
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
 

Viewers also liked

End-to-End Security and Auditing in a Big Data as a Service Deployment
End-to-End Security and Auditing in a Big Data as a Service DeploymentEnd-to-End Security and Auditing in a Big Data as a Service Deployment
End-to-End Security and Auditing in a Big Data as a Service DeploymentDataWorks Summit/Hadoop Summit
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache RangerDataWorks Summit
 
Intro to Spark with Zeppelin Crash Course Hadoop Summit SJ
Intro to Spark with Zeppelin Crash Course Hadoop Summit SJIntro to Spark with Zeppelin Crash Course Hadoop Summit SJ
Intro to Spark with Zeppelin Crash Course Hadoop Summit SJDaniel Madrigal
 
Machine Learning for Any Size of Data, Any Type of Data
Machine Learning for Any Size of Data, Any Type of DataMachine Learning for Any Size of Data, Any Type of Data
Machine Learning for Any Size of Data, Any Type of DataDataWorks Summit/Hadoop Summit
 
A New "Sparkitecture" for modernizing your data warehouse
A New "Sparkitecture" for modernizing your data warehouseA New "Sparkitecture" for modernizing your data warehouse
A New "Sparkitecture" for modernizing your data warehouseDataWorks Summit/Hadoop Summit
 
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on HiveFaster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on HiveDataWorks Summit/Hadoop Summit
 
The Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture ViewThe Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture ViewDataWorks Summit/Hadoop Summit
 
Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success DataWorks Summit/Hadoop Summit
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Precisely
 
Big Data for Managers: From hadoop to streaming and beyond
Big Data for Managers: From hadoop to streaming and beyondBig Data for Managers: From hadoop to streaming and beyond
Big Data for Managers: From hadoop to streaming and beyondDataWorks Summit/Hadoop Summit
 
Bridging the gap of Relational to Hadoop using Sqoop @ Expedia
Bridging the gap of Relational to Hadoop using Sqoop @ ExpediaBridging the gap of Relational to Hadoop using Sqoop @ Expedia
Bridging the gap of Relational to Hadoop using Sqoop @ ExpediaDataWorks Summit/Hadoop Summit
 
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & TrifactaExtend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & TrifactaDataWorks Summit/Hadoop Summit
 

Viewers also liked (20)

End-to-End Security and Auditing in a Big Data as a Service Deployment
End-to-End Security and Auditing in a Big Data as a Service DeploymentEnd-to-End Security and Auditing in a Big Data as a Service Deployment
End-to-End Security and Auditing in a Big Data as a Service Deployment
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 
Intro to Spark with Zeppelin Crash Course Hadoop Summit SJ
Intro to Spark with Zeppelin Crash Course Hadoop Summit SJIntro to Spark with Zeppelin Crash Course Hadoop Summit SJ
Intro to Spark with Zeppelin Crash Course Hadoop Summit SJ
 
File Format Benchmark - Avro, JSON, ORC & Parquet
File Format Benchmark - Avro, JSON, ORC & ParquetFile Format Benchmark - Avro, JSON, ORC & Parquet
File Format Benchmark - Avro, JSON, ORC & Parquet
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
Toward Better Multi-Tenancy Support from HDFS
Toward Better Multi-Tenancy Support from HDFSToward Better Multi-Tenancy Support from HDFS
Toward Better Multi-Tenancy Support from HDFS
 
Stream Processing made simple with Kafka
Stream Processing made simple with KafkaStream Processing made simple with Kafka
Stream Processing made simple with Kafka
 
Machine Learning for Any Size of Data, Any Type of Data
Machine Learning for Any Size of Data, Any Type of DataMachine Learning for Any Size of Data, Any Type of Data
Machine Learning for Any Size of Data, Any Type of Data
 
A New "Sparkitecture" for modernizing your data warehouse
A New "Sparkitecture" for modernizing your data warehouseA New "Sparkitecture" for modernizing your data warehouse
A New "Sparkitecture" for modernizing your data warehouse
 
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on HiveFaster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
 
The Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture ViewThe Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture View
 
Accelerating Data Warehouse Modernization
Accelerating Data Warehouse ModernizationAccelerating Data Warehouse Modernization
Accelerating Data Warehouse Modernization
 
Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success
 
Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
 
Big Data for Managers: From hadoop to streaming and beyond
Big Data for Managers: From hadoop to streaming and beyondBig Data for Managers: From hadoop to streaming and beyond
Big Data for Managers: From hadoop to streaming and beyond
 
Bridging the gap of Relational to Hadoop using Sqoop @ Expedia
Bridging the gap of Relational to Hadoop using Sqoop @ ExpediaBridging the gap of Relational to Hadoop using Sqoop @ Expedia
Bridging the gap of Relational to Hadoop using Sqoop @ Expedia
 
Apache Hive ACID Project
Apache Hive ACID ProjectApache Hive ACID Project
Apache Hive ACID Project
 
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & TrifactaExtend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta
 
From Zero to Data Flow in Hours with Apache NiFi
From Zero to Data Flow in Hours with Apache NiFiFrom Zero to Data Flow in Hours with Apache NiFi
From Zero to Data Flow in Hours with Apache NiFi
 

Similar to Apache Ranger Hive Metastore Security

Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?DataWorks Summit
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...DataWorks Summit
 
Big Data Security on Microsoft Azure - HDInsight and HortonWorks
Big Data Security on Microsoft Azure - HDInsight and HortonWorksBig Data Security on Microsoft Azure - HDInsight and HortonWorks
Big Data Security on Microsoft Azure - HDInsight and HortonWorksLuan Moreno Medeiros Maciel
 
Lessons Learned on How to Secure Petabytes of Data
Lessons Learned on How to Secure Petabytes of DataLessons Learned on How to Secure Petabytes of Data
Lessons Learned on How to Secure Petabytes of DataDataWorks Summit
 
Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?DataWorks Summit/Hadoop Summit
 
Classification based security in Hadoop
Classification based security in HadoopClassification based security in Hadoop
Classification based security in HadoopMadhan Neethiraj
 
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSDiscover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSHortonworks
 
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies DataWorks Summit/Hadoop Summit
 
Apache Eagle in Action
Apache Eagle in ActionApache Eagle in Action
Apache Eagle in ActionHao Chen
 
HBaseCon 2013: Multi-tenant Apache HBase at Yahoo!
HBaseCon 2013: Multi-tenant Apache HBase at Yahoo! HBaseCon 2013: Multi-tenant Apache HBase at Yahoo!
HBaseCon 2013: Multi-tenant Apache HBase at Yahoo! Sumeet Singh
 
Best Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop EnvironmentBest Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop EnvironmentDataWorks Summit/Hadoop Summit
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseDataWorks Summit
 
Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017alanfgates
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseHao Chen
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Rangertrihug
 
Building Big Data Applications using Spark, Hive, HBase and Kafka
Building Big Data Applications using Spark, Hive, HBase and KafkaBuilding Big Data Applications using Spark, Hive, HBase and Kafka
Building Big Data Applications using Spark, Hive, HBase and KafkaAshish Thapliyal
 

Similar to Apache Ranger Hive Metastore Security (20)

Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 
Big Data Security on Microsoft Azure - HDInsight and HortonWorks
Big Data Security on Microsoft Azure - HDInsight and HortonWorksBig Data Security on Microsoft Azure - HDInsight and HortonWorks
Big Data Security on Microsoft Azure - HDInsight and HortonWorks
 
Lessons Learned on How to Secure Petabytes of Data
Lessons Learned on How to Secure Petabytes of DataLessons Learned on How to Secure Petabytes of Data
Lessons Learned on How to Secure Petabytes of Data
 
Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?
 
Classification based security in Hadoop
Classification based security in HadoopClassification based security in Hadoop
Classification based security in Hadoop
 
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSDiscover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
 
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
 
Apache Eagle in Action
Apache Eagle in ActionApache Eagle in Action
Apache Eagle in Action
 
HBaseCon 2013: Multi-tenant Apache HBase at Yahoo!
HBaseCon 2013: Multi-tenant Apache HBase at Yahoo! HBaseCon 2013: Multi-tenant Apache HBase at Yahoo!
HBaseCon 2013: Multi-tenant Apache HBase at Yahoo!
 
PRAFUL_HADOOP
PRAFUL_HADOOPPRAFUL_HADOOP
PRAFUL_HADOOP
 
Best Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop EnvironmentBest Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop Environment
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
 
Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017
 
Apache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real TimeApache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real Time
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San Jose
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Ranger
 
Why is my Hadoop* job slow?
Why is my Hadoop* job slow?Why is my Hadoop* job slow?
Why is my Hadoop* job slow?
 
Api manager preconference
Api manager preconferenceApi manager preconference
Api manager preconference
 
Building Big Data Applications using Spark, Hive, HBase and Kafka
Building Big Data Applications using Spark, Hive, HBase and KafkaBuilding Big Data Applications using Spark, Hive, HBase and Kafka
Building Big Data Applications using Spark, Hive, HBase and Kafka
 





Apache Ranger Hive Metastore Security

  • 1. Apache Ranger Hive Metastore Security. Yan Zhou (zhouya@us.ibm.com), Tanping Wang (wangta@us.ibm.com), IBM Big Insights Product Lead Architects, Silicon Valley Lab, IBM. Hadoop Summit – San Jose 2016. © 2016 IBM Corporation.
  • 2. Apache Ranger: provides centralized policy definition for authorizing and auditing access to resources in a consistent manner. [Architecture diagram. Labels: an agent in each of HBase, Hive, YARN, Knox, Storm, Solr, Kafka and HDFS; Policy Server; Audit Server; Administration Portal; REST APIs; DB; SOLR; HDFS; KMS; LDAP/AD user/group sync; Log4j.]
  • 3. HiveServer2 Ranger Authorization Model. [Workflow diagram, steps numbered 1 to 5. Labels: the admin sets policies for Hive databases/tables/columns in the Ranger Policy Manager; users access Hive data through an application that calls HiveServer2; IT/analysis users access HiveServer2 through Beeline; HiveServer2 uses the Ranger agent for authorization; policy refreshing; audit logs are pushed to the Ranger Audit Database; HiveServer2 provides table data access to the user/client.]
  • 4. Motivation: Gaps in the Current Hive Ranger Authorization Model
      Hive CLI
        DO: (nothing listed)
        DO NOT: Hive CLI does not work with Ranger.
      HiveServer2
        DO: Provides ACLs on databases, tables, columns and locks. Supports Ranger policy creation or deletion from Hive GRANT or REVOKE statements.
        DO NOT: Does not adjust Hive-created policies as a result of DDL. Once a database object is renamed by DDL, the Hive-created policy in Ranger is out of sync; once the object is deleted, the Hive-created policy in Ranger becomes an orphan.
  • 5. Motivation: Gaps in the Current HiveServer2 Ranger Authorization Model (cont'd): Resource ACL Sync Up
      Storage-based Authorization
        GOOD: Consistent access controls across Hive and HDFS.
        NOT GOOD: Cannot control SQL data access at finer granularity, such as COLUMN.
      SQL Standard-based Authorization
        GOOD: Fits well with the SQL standard privilege model.
        NOT GOOD: Does not provide consistent privileges across Hive and HDFS, and potentially forbids sharing Hive data with other Hadoop apps.
      Needs a holistic view of the HDFS and Hive ACLs to provide consistent privilege control.
  • 6. We Introduce: The New Hive Metastore Ranger Security Agent
      Hive CLI
        Provides: ACLs for Hive CLI.
        Use case:
          hive> SELECT * FROM employee;
        Before: Hive decides the ACL on its own.
        After: Hive invokes the Hive Metastore Ranger security agent to get the ACL from Ranger.
      HiveServer2
        Provides: Authorization for the Metastore objects; ACLs stay in sync with the SQL objects at all times.
        Use case:
          hive> GRANT SELECT ON TABLE employee TO USER hr1;
          hive> ALTER TABLE employee RENAME TO employees;
        Before: No change to the Ranger policy for user hr1 on table employee.
        After: The Ranger policy for hr1 is changed to apply to employees.
  • 7. We Introduce: The New Hive Metastore Ranger Security Agent (cont'd)
      Resource ACL Sync Up
        Provides: Consistent access control between Hive and HDFS for the SQL standard-based privilege model.
        Use case:
          beeline> CREATE TABLE employee(name STRING); // by user "hr1"
          beeline> LOAD DATA LOCAL INPATH '/data/input.txt' OVERWRITE INTO TABLE employee;
          pig> LOAD '/user/hive/warehouse/employee' USING PigStorage() AS (name:chararray)
        Before: The Pig load is not allowed for user hr1.
        After: The Pig load is allowed for user hr1.
  • 8. Ranger Hive Metastore Security Workflow – Hive CLI. [Workflow diagram, steps numbered 1 to 5. Labels: the admin sets policies for Hive databases/tables/columns in the Ranger Policy Manager; users access Hive data through an application invoking Hive CLI; IT/analysis users access Hive data through the CLI; Hive CLI uses the Ranger Metastore agents for authorization and for policy object sync from DDL; policy refreshing; audit logs are pushed to the Ranger Audit Database; Hive CLI provides table data access to the user/client.]
  • 9. Ranger Hive Metastore Security Workflow – HiveServer2. [Workflow diagram, steps numbered 1 to 6. Labels: the admin sets policies for Hive databases/tables/columns in the Ranger Policy Manager; users access Hive data through an application that calls HiveServer2; IT/analysis users access HiveServer2 through Beeline; HiveServer2 uses the Ranger HiveServer2 agent for authorization; HiveServer2 uses the Ranger Metastore agents for ACL object sync on DDL; policy refreshing; audit logs are pushed to the Ranger Audit Database; HiveServer2 provides table data access to the user/client.]
  • 10. Metastore Security Workflow – HDFS ACL Sync (Ongoing). [Workflow diagram, steps numbered 1 to 6. Labels: the admin sets policies for Hive databases/tables/columns in the Ranger Policy Manager; IT/analysis user Joe creates table t1 through HiveServer2; HiveServer2 passes Hive metadata to the Ranger Metastore agents; the agents pass HDFS security info to the Policy Manager, which sets a new HDFS policy for Joe on /user/hive/warehouse/t1; policy refreshing; the HDFS NameNode uses the Ranger HDFS agent for authorization; Joe uses Pig to read Hive data in /user/hive/warehouse/t1.]
  • 11. Hive Security Hooks and Their Ranger Implementation/Extensions. [Class diagram. Hive hooks: HiveAuthorizer, MetaStorePreEventListener, MetaStoreEventListener. RangerHiveAuthorizer implements HiveAuthorizer; RangerHiveMetastoreAuthorizer extends MetaStorePreEventListener; RangerHiveMetastorePrivilegeHandler extends MetaStoreEventListener.]
  • 12. Ranger Implementation/Extensions of Hive Security Hooks
      RangerHiveAuthorizer
        Existing Ranger Hive agent.
        Methods: check/grant/revokePrivileges
        Handles: HiveServer2 authorization; Grant/Revoke
      RangerHiveMetastoreAuthorizer
        New Ranger Hive Metastore agent.
        Methods: on(Create/Drop/Alter)(Table/Database/Index/…)
        Handles: CLI authorization
      RangerHiveMetastorePrivilegeHandler
        New Ranger Hive Metastore agent.
        Methods: (create/drop/alter)(Table/Database/Index/…)
        Handles: Sync of Hive ACL objects and resource ACLs
  • 13. Status, Future Plan and References
      Patch ready:
        CLI access control
        Policy object sync from DDL
      Ongoing work:
        Resource ACL sync
      References:
        https://issues.apache.org/jira/browse/RANGER-768
        https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Authorization
        https://cwiki.apache.org/confluence/display/Hive/Storage+Based+Authorization+in+the+Metastore+Server
  • 14. Demo
      Software versions: Ranger 6.0 + Hadoop 2.7.0 + Hive 1.2.1
      Test cases:
        With the Ranger HiveServer2 agent but without the Ranger Hive Metastore security agents
          CLI: SQL is not subject to Ranger ACLs.
          HiveServer2: No object sync of Ranger ACLs as a result of SQL DDL.
        With the Ranger HiveServer2 agent and the Ranger Hive Metastore security agents
          CLI: SQL is subject to Ranger ACLs.
          HiveServer2: Object sync of Ranger ACLs as a result of SQL DDL.
  • 15. Q & A
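
The split between a pre-event authorizer and a post-event privilege handler (slides 6, 11 and 12) can be pictured with a small self-contained model. This is a sketch only, not Ranger or Hive code: `PolicyStore`, `MetastoreAuthorizer`, `MetastorePrivilegeHandler` and all of their methods are hypothetical Python stand-ins, and it assumes a policy is simply keyed by (database, table). It replays the GRANT and RENAME use case from slide 6.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical policy record: privileges per user on one Hive table."""
    database: str
    table: str
    grants: dict = field(default_factory=dict)  # user -> set of privileges


class PolicyStore:
    """In-memory stand-in for the Ranger Policy Manager (assumption, not its REST API)."""

    def __init__(self):
        self._policies = {}  # (database, table) -> Policy

    def grant(self, user, privilege, database, table):
        key = (database, table)
        policy = self._policies.setdefault(key, Policy(database, table))
        policy.grants.setdefault(user, set()).add(privilege)

    def is_allowed(self, user, privilege, database, table):
        policy = self._policies.get((database, table))
        return policy is not None and privilege in policy.grants.get(user, set())

    def rename_table(self, database, old_name, new_name):
        # Move the policy to the new key so it is never left out of sync.
        policy = self._policies.pop((database, old_name), None)
        if policy is not None:
            policy.table = new_name
            self._policies[(database, new_name)] = policy

    def drop_table(self, database, table):
        # Remove the policy so it is never left as an orphan.
        self._policies.pop((database, table), None)


class MetastoreAuthorizer:
    """Pre-event hook: checks the policy before the metastore acts (CLI path)."""

    def __init__(self, store):
        self.store = store

    def check(self, user, privilege, database, table):
        if not self.store.is_allowed(user, privilege, database, table):
            raise PermissionError(f"{user} lacks {privilege} on {database}.{table}")


class MetastorePrivilegeHandler:
    """Post-event hook: keeps policies in step with completed DDL."""

    def __init__(self, store):
        self.store = store

    def on_alter_table(self, database, old_name, new_name):
        self.store.rename_table(database, old_name, new_name)

    def on_drop_table(self, database, table):
        self.store.drop_table(database, table)


store = PolicyStore()
authorizer = MetastoreAuthorizer(store)
handler = MetastorePrivilegeHandler(store)

# GRANT SELECT ON TABLE employee TO USER hr1;
store.grant("hr1", "select", "default", "employee")
# ALTER TABLE employee RENAME TO employees;
handler.on_alter_table("default", "employee", "employees")

# Passes without raising: the policy followed the rename.
authorizer.check("hr1", "select", "default", "employees")
# The old table name no longer carries a policy.
print(store.is_allowed("hr1", "select", "default", "employee"))  # prints False
```

Keeping the check and the sync in separate hooks mirrors the slides: the authorizer can veto an operation before it happens, while the handler only reacts to DDL that has already succeeded.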