SlideShare una empresa de Scribd logo
1 de 35
1 © Hortonworks Inc. 2011–2018. All rights reserved
Apache Knox
- Hadoop Security Swiss Knife
Krishna Pandey, CISSP , CCSP Larry McCay
Staff Software Engineer (Platform Security) Senior Manager (Security Architecture) &
Knox PMC chair, Ranger, Metron and Hadoop committer
2 © Hortonworks Inc. 2011–2018. All rights reserved
• Over 53,000 Incidents [1]
• 2,216 Confirmed Data Breaches [2]x
• Formjacking – 48,000 unique websites on average/month [3]
• Crypto-jacking [reduced by 52% in 2018] [4]
• Penalties for non-compliance – GDPR
• More CVEs to patch – CVE-2019-5736, CVE-2018-8009, …
• Meltdown, Spectre, Foreshadow, …, Spoiler
Source: [1] [2] https://www.verizonenterprise.com/verizon-insights-lab/dbir/
[3] [4]: https://www.symantec.com/content/dam/symantec/docs/reports/istr-24-executive-summary-en-apj.pdf
3 © Hortonworks Inc. 2011–2018. All rights reserved
What This Means to Big Data Landscape?
4 © Hortonworks Inc. 2011–2018. All rights reserved
Fig.2 Publicly
accessible
WebHDFS
cluster
5 © Hortonworks Inc. 2011–2018. All rights reserved
Apache Knox
6 © Hortonworks Inc. 2011–2018. All rights reserved
• Knox Gateway is a reverse proxy for the Hadoop Ecosystem
• A trusted proxy within Hadoop platform deployments
• Proxies HTTP APIs and UIs
• Encapsulates and minimizes the burden of Kerberos on HTTP clients
• Highly flexible with pluggable provider chains
7 © Hortonworks Inc. 2011–2018. All rights reserved
Apache Knox – What is it?
3 Primary Sets of Services from Knox:
● Proxying Services
○ Protected Access to Platform
Resources
○ Pluggable Provider Pipeline
● Authentication Services
○ KnoxSSO
○ KnoxToken
● Client Services
○ SDK Client Classes
○ Groovy based DSL for Scripting
○ Knox Token Sessions
○ Credential Collectors
8 © Hortonworks Inc. 2011–2018. All rights reserved
Why Knox?
Simplified Access
• Kerberos encapsulation
• Extends API reach
• Single access point
• Single SSL certificate
• Multi-cluster support
Centralized Control
• Auditing
• Service-level authorization
• Knox Admin UI
• Service Discovery and Topology Generation
Framework
Enterprise Integration
• LDAP/AD integration
• Support for SAMLv2, OAuth and related
IdPs: eg. Okta
• SSO integration
Enhanced Security
• Proxy to abstract network details
• TLS Termination for non-SSL services
9 © Hortonworks Inc. 2011–2018. All rights reserved
Knox - Hadoop Trusted Proxy & Authn Service
10 © Hortonworks Inc. 2011–2018. All rights reserved
Encapsulation of Kerberos
11 © Hortonworks Inc. 2011–2018. All rights reserved
Apache Knox Providers
Web App
Security
Protections against common web application vulnerabilities
Authentication /
Federation
Authenticate a user’s credentials or federate an external
authentication event
Authorization Verify that the authenticated user has access to requested
resources
Audit Detailed accounting of resource access
12 © Hortonworks Inc. 2011–2018. All rights reserved
Web Application Security
13 © Hortonworks Inc. 2011–2018. All rights reserved
• Knox is a Web API (REST) and UI Gateway for Hadoop. The fact that REST interactions are
HTTP based means that they are vulnerable to a number of web application security
vulnerabilities.
• CSRF
• CORS
• X-Frame-Options
• X-Content-Type-Options
• HTTP Strict Transport Security
• X-XSS-Protection
Web App Security Provider
14 © Hortonworks Inc. 2011–2018. All rights reserved
Authentication Mechanisms
15 © Hortonworks Inc. 2011–2018. All rights reserved
• PAM based Authentication
• provide the ability to use the configuration provided by existing PAM modules
• remove limitations with the KnoxLdapRealm
• Supported for nested OUs and nested groups
• Faster lookups
• Support more complex LDAP queries
• Reduce load on the LDAP/AD server (caching by SSSD)
• HadoopAuth Authentication Provider
• allows clients to using the strong authentication and SSO capabilities of Kerberos.
• integrates the use of the Apache Hadoop module for SPNEGO and delegation token based
authentication
Apache Knox - Supported Authentication Mechanisms
16 © Hortonworks Inc. 2011–2018. All rights reserved
• Header based Preauthenticated SSO Provider
• This provider was designed for use with identity solutions such as those provided by CA’s
SiteMinder and IBM’s Tivoli Access Manager.
• CAUTION: The use of this provider requires that proper network security and identity provider
configuration and deployment does not allow requests directly to the Knox gateway. Otherwise,
this provider will leave the gateway exposed to identity spoofing.
• Pac4j Provider
• used as a federation provider to support the OAuth, CAS, SAML and OpenID Connect protocols.
• It must be used for SSO, in association with the KnoxSSO service and optionally with the
SSOCookieProvider for access to REST APIs.
Supported Authentication Mechanisms (Contd.)
17 © Hortonworks Inc. 2011–2018. All rights reserved
• SSO Cookie Provider
• a typical SP initiated websso mechanism that sets a cookie to be presented by browsers to
participating applications and cryptographically verified.
• enables the federation of the authentication event that occurred through KnoxSSO
• KnoxSSO Service
• Provides WebSSO capabilities to the Hadoop cluster*
• KnoxToken Service
• Using the acquired token, a client is able to access REST APIs that are protected with the
JWTProvider federation provider.
• TLS Client Certificate Provider
• enables establishing the user based on the client provided TLS certificate.
• The user will be the DN from the certificate.
* not all components are supported
Supported Authentication Mechanisms (Contd.)
18 © Hortonworks Inc. 2011–2018. All rights reserved
Authorization
19 © Hortonworks Inc. 2011–2018. All rights reserved
• Service Level Authorization
• Service based ACL definition – simple Knox authorization
• Apache Ranger Integration
• for enterprise/centralized authorization policies
• for finer grained policies on services across the platform (service level for Knox)
Authorization Support in Knox
20 © Hortonworks Inc. 2011–2018. All rights reserved
Apache Ranger Centralized Policy Management
21 © Hortonworks Inc. 2011–2018. All rights reserved
Audit
22 © Hortonworks Inc. 2011–2018. All rights reserved
Audit – Centralized Logging
• All Service access through the Knox Gateway is Audited with detailed Entries
for:
• Original Access Request
• Authentication Event
• Identity Assertion: principal mapping, group lookup
• Authorization Event
• Dispatch to backend service
23 © Hortonworks Inc. 2011–2018. All rights reserved
Apache Knox to the
Rescue…
24 © Hortonworks Inc. 2011–2018. All rights reserved
• Which category Knox has protection against?
• Security headers
• XFS
• CSRF (limited to REST APIs)
• SSL Downgrade
• MIME Sniffing
• XSS
• Open Redirect – validated redirect & forwards
• Whitelisting
Knox defense against Web Attack Vectors
25 © Hortonworks Inc. 2011–2018. All rights reserved
Let’s Protect with
Apache Knox
26 © Hortonworks Inc. 2011–2018. All rights reserved
• Only expose Knox Port through Cloud VPC
• All other Service Access goes through Knox
Limit Attack Surface
27 © Hortonworks Inc. 2011–2018. All rights reserved
• Without Knox or Kerberos enablement, there is no Authentication.
• Proxy Access to the UI through Knox Gateway require HTTP BASIC Auth against LDAP/AD
• Calls to WebHDFS URLs will include the Authenticated User as a user.name Query Param
• Only Authorized Users will have access.
Add Authenticated Access
28 © Hortonworks Inc. 2011–2018. All rights reserved
• Enable Authorization Providers e.g. Ranger
• Restrict Access to WebHDFS to Users in certain LDAP groups
• Only a subset of your Authenticated Users will have Access to WebHDFS
Add Service Level Authorization
29 © Hortonworks Inc. 2011–2018. All rights reserved
End Result - Authentication
30 © Hortonworks Inc. 2011–2018. All rights reserved
End Result - Authorization
31 © Hortonworks Inc. 2011–2018. All rights reserved
Security Controls - Apache Knox
Issue Categories (OWASP Top 10 – 2017) Apache Knox
A1 Injection No
A2 Broken Authentication Yes
A3 Sensitive Data Exposure Yes (Encrypts params & locations)
A4 XML External Entities (XXE) No
A5 Broken Access Control Partially (Service level ACLs)
A6 Security Misconfiguration Partially (Less exposed Services)
A7 XSS Partially*
A8 Insecure Deserialization No
A9 Using Components with Known Vulnerabilities Partially
A10 Insufficient Logging and Monitoring Partially**
32 © Hortonworks Inc. 2011–2018. All rights reserved
What else you can do?
• Security around your Perimeter
• Web Application Firewall
• Intrusion Detection & Prevention System
• Network (Packet filtering) Firewall
• API Gateway
• Hardening everything you can
• O/S - Secure PAM Configuration, SELinux, remove unused packages, disable unused services, etc.
• Application Server – Jetty, Tomcat, etc.
• Database Server – enable SSL, restrict IP addresses, etc.
• KDC Server – master_key_type, supported_enctypes, ticket_lifetime, etc.
• Set *.proxyuser.*.hosts/groups only to needed hosts/groups
33 © Hortonworks Inc. 2011–2018. All rights reserved
Take Away
• Kerberos Encapsulation – Web Browser friendly.
• Knox - Eliminates need for developing central Authentication solution for Hadoop
ecosystem
• Provides ready integration with Enterprise Authentication mechanisms like AD, Okta,
etc.
• Prevents against several OWASP Top 10 threats
34 © Hortonworks Inc. 2011–2018. All rights reserved
Questions?
35 © Hortonworks Inc. 2011–2018. All rights reserved
Thank you

Más contenido relacionado

La actualidad más candente

Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?
Kai Wähner
 
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
confluent
 
Hadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowHadoop Security Today and Tomorrow
Hadoop Security Today and Tomorrow
DataWorks Summit
 

La actualidad más candente (20)

Apache NiFi Crash Course Intro
Apache NiFi Crash Course IntroApache NiFi Crash Course Intro
Apache NiFi Crash Course Intro
 
Kafka Practices @ Uber - Seattle Apache Kafka meetup
Kafka Practices @ Uber - Seattle Apache Kafka meetupKafka Practices @ Uber - Seattle Apache Kafka meetup
Kafka Practices @ Uber - Seattle Apache Kafka meetup
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
OpenStack Architecture and Use Cases
OpenStack Architecture and Use CasesOpenStack Architecture and Use Cases
OpenStack Architecture and Use Cases
 
Apache Spark Streaming in K8s with ArgoCD & Spark Operator
Apache Spark Streaming in K8s with ArgoCD & Spark OperatorApache Spark Streaming in K8s with ArgoCD & Spark Operator
Apache Spark Streaming in K8s with ArgoCD & Spark Operator
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
 
Getting Started with Confluent Schema Registry
Getting Started with Confluent Schema RegistryGetting Started with Confluent Schema Registry
Getting Started with Confluent Schema Registry
 
Apache Hadoop Security - Ranger
Apache Hadoop Security - RangerApache Hadoop Security - Ranger
Apache Hadoop Security - Ranger
 
Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
 
Red Hat Satellite 6 - Automation with Puppet
Red Hat Satellite 6 - Automation with PuppetRed Hat Satellite 6 - Automation with Puppet
Red Hat Satellite 6 - Automation with Puppet
 
Using Kafka to scale database replication
Using Kafka to scale database replicationUsing Kafka to scale database replication
Using Kafka to scale database replication
 
Hadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowHadoop Security Today and Tomorrow
Hadoop Security Today and Tomorrow
 
Kafka to the Maxka - (Kafka Performance Tuning)
Kafka to the Maxka - (Kafka Performance Tuning)Kafka to the Maxka - (Kafka Performance Tuning)
Kafka to the Maxka - (Kafka Performance Tuning)
 
NiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseNiFi Best Practices for the Enterprise
NiFi Best Practices for the Enterprise
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
 

Similar a Apache Knox - Hadoop Security Swiss Army Knife

Saving the elephant—now, not later
Saving the elephant—now, not laterSaving the elephant—now, not later
Saving the elephant—now, not later
DataWorks Summit
 
Hadoop Operations - Past, Present, and Future
Hadoop Operations - Past, Present, and FutureHadoop Operations - Past, Present, and Future
Hadoop Operations - Past, Present, and Future
DataWorks Summit
 
Hadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and FutureHadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and Future
DataWorks Summit
 
Whats new in was liberty security and cloud readiness
Whats new in was liberty   security and cloud readinessWhats new in was liberty   security and cloud readiness
Whats new in was liberty security and cloud readiness
sflynn073
 
Kubernetes Ingress to Service Mesh (and beyond!)
Kubernetes Ingress to Service Mesh (and beyond!)Kubernetes Ingress to Service Mesh (and beyond!)
Kubernetes Ingress to Service Mesh (and beyond!)
Christian Posta
 

Similar a Apache Knox - Hadoop Security Swiss Army Knife (20)

Saving the elephant—now, not later
Saving the elephant—now, not laterSaving the elephant—now, not later
Saving the elephant—now, not later
 
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache KnoxFortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
 
Hadoop Operations - Past, Present, and Future
Hadoop Operations - Past, Present, and FutureHadoop Operations - Past, Present, and Future
Hadoop Operations - Past, Present, and Future
 
Hadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and FutureHadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and Future
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 
Protecting your data at rest with Apache Kafka by Confluent and Vormetric
Protecting your data at rest with Apache Kafka by Confluent and VormetricProtecting your data at rest with Apache Kafka by Confluent and Vormetric
Protecting your data at rest with Apache Kafka by Confluent and Vormetric
 
Web application vulnerability assessment
Web application vulnerability assessmentWeb application vulnerability assessment
Web application vulnerability assessment
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
 
Brocade vADC Portfolio Overview 2016
Brocade vADC Portfolio Overview 2016Brocade vADC Portfolio Overview 2016
Brocade vADC Portfolio Overview 2016
 
OCS LIA
OCS LIAOCS LIA
OCS LIA
 
Building multi tenant highly secured applications on .net for any cloud - dem...
Building multi tenant highly secured applications on .net for any cloud - dem...Building multi tenant highly secured applications on .net for any cloud - dem...
Building multi tenant highly secured applications on .net for any cloud - dem...
 
Techcello hp-arch workshop
Techcello hp-arch workshopTechcello hp-arch workshop
Techcello hp-arch workshop
 
Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...
Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...
Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...
 
2017 02-17 rsac 2017 tech-f02
2017 02-17 rsac 2017 tech-f022017 02-17 rsac 2017 tech-f02
2017 02-17 rsac 2017 tech-f02
 
SUGCON EU 2023 - Secure Composable SaaS.pptx
SUGCON EU 2023 - Secure Composable SaaS.pptxSUGCON EU 2023 - Secure Composable SaaS.pptx
SUGCON EU 2023 - Secure Composable SaaS.pptx
 
Whats new in was liberty security and cloud readiness
Whats new in was liberty   security and cloud readinessWhats new in was liberty   security and cloud readiness
Whats new in was liberty security and cloud readiness
 
PLNOG 17 - Grzegorz Kornacki - F5 and OpenStack
PLNOG 17 - Grzegorz Kornacki - F5 and OpenStackPLNOG 17 - Grzegorz Kornacki - F5 and OpenStack
PLNOG 17 - Grzegorz Kornacki - F5 and OpenStack
 
An Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache KnoxAn Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache Knox
 
Kubernetes Ingress to Service Mesh (and beyond!)
Kubernetes Ingress to Service Mesh (and beyond!)Kubernetes Ingress to Service Mesh (and beyond!)
Kubernetes Ingress to Service Mesh (and beyond!)
 
Cloud security what to expect (introduction to cloud security)
Cloud security   what to expect (introduction to cloud security)Cloud security   what to expect (introduction to cloud security)
Cloud security what to expect (introduction to cloud security)
 

Más de DataWorks Summit

HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 

Más de DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Último

Último (20)

Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!
 
Using IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandUsing IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & Ireland
 
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jYour enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4j
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoft
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024
 
ECS 2024 Teams Premium - Pretty Secure
ECS 2024   Teams Premium - Pretty SecureECS 2024   Teams Premium - Pretty Secure
ECS 2024 Teams Premium - Pretty Secure
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 

Apache Knox - Hadoop Security Swiss Army Knife

  • 1. 1 © Hortonworks Inc. 2011–2018. All rights reserved Apache Knox - Hadoop Security Swiss Knife Krishna Pandey, CISSP , CCSP Larry McCay Staff Software Engineer (Platform Security) Senior Manager (Security Architecture) & Knox PMC chair, Ranger, Metron and Hadoop committer
  • 2. 2 © Hortonworks Inc. 2011–2018. All rights reserved • Over 53,000 Incidents [1] • 2,216 Confirmed Data Breaches [2]x • Formjacking – 48,000 unique websites on average/month [3] • Crypto-jacking [reduced by 52% in 2018] [4] • Penalties for non-compliance – GDPR • More CVEs to patch – CVE-2019-5736, CVE-2018-8009, … • Meltdown, Spectre, Foreshadow, …, Spoiler Source: [1] [2] https://www.verizonenterprise.com/verizon-insights-lab/dbir/ [3] [4]: https://www.symantec.com/content/dam/symantec/docs/reports/istr-24-executive-summary-en-apj.pdf
  • 3. 3 © Hortonworks Inc. 2011–2018. All rights reserved What This Means to Big Data Landscape?
  • 4. 4 © Hortonworks Inc. 2011–2018. All rights reserved Fig.2 Publicly accessible WebHDFS cluster
  • 5. 5 © Hortonworks Inc. 2011–2018. All rights reserved Apache Knox
  • 6. 6 © Hortonworks Inc. 2011–2018. All rights reserved • Knox Gateway is a reverse proxy for the Hadoop Ecosystem • A trusted proxy within Hadoop platform deployments • Proxies HTTP APIs and UIs • Encapsulates and minimizes the burden of Kerberos on HTTP clients • Highly flexible with pluggable provider chains
  • 7. 7 © Hortonworks Inc. 2011–2018. All rights reserved Apache Knox – What is it? 3 Primary Sets of Services from Knox: ● Proxying Services ○ Protected Access to Platform Resources ○ Pluggable Provider Pipeline ● Authentication Services ○ KnoxSSO ○ KnoxToken ● Client Services ○ SDK Client Classes ○ Groovy based DSL for Scripting ○ Knox Token Sessions ○ Credential Collectors
  • 8. 8 © Hortonworks Inc. 2011–2018. All rights reserved Why Knox? Simplified Access • Kerberos encapsulation • Extends API reach • Single access point • Single SSL certificate • Multi-cluster support Centralized Control • Auditing • Service-level authorization • Knox Admin UI • Service Discovery and Topology Generation Framework Enterprise Integration • LDAP/AD integration • Support for SAMLv2, OAuth and related IdPs: eg. Okta • SSO integration Enhanced Security • Proxy to abstract network details • TLS Termination for non-SSL services
  • 9. 9 © Hortonworks Inc. 2011–2018. All rights reserved Knox - Hadoop Trusted Proxy & Authn Service
  • 10. 10 © Hortonworks Inc. 2011–2018. All rights reserved Encapsulation of Kerberos
  • 11. 11 © Hortonworks Inc. 2011–2018. All rights reserved Apache Knox Providers Web App Security Protections against common web application vulnerabilities Authentication / Federation Authenticate a user’s credentials or federate an external authentication event Authorization Verify that the authenticated user has access to requested resources Audit Detailed accounting of resource access
  • 12. 12 © Hortonworks Inc. 2011–2018. All rights reserved Web Application Security
  • 13. 13 © Hortonworks Inc. 2011–2018. All rights reserved • Knox is a Web API (REST) and UI Gateway for Hadoop. The fact that REST interactions are HTTP based means that they are vulnerable to a number of web application security vulnerabilities. • CSRF • CORS • X-Frame-Options • X-Content-Type-Options • HTTP Strict Transport Security • X-XSS-Protection Web App Security Provider
  • 14. 14 © Hortonworks Inc. 2011–2018. All rights reserved Authentication Mechanisms
  • 15. 15 © Hortonworks Inc. 2011–2018. All rights reserved • PAM based Authentication • provide the ability to use the configuration provided by existing PAM modules • remove limitations with the KnoxLdapRealm • Supported for nested OUs and nested groups • Faster lookups • Support more complex LDAP queries • Reduce load on the LDAP/AD server (caching by SSSD) • HadoopAuth Authentication Provider • allows clients to using the strong authentication and SSO capabilities of Kerberos. • integrates the use of the Apache Hadoop module for SPNEGO and delegation token based authentication Apache Knox - Supported Authentication Mechanisms
  • 16. 16 © Hortonworks Inc. 2011–2018. All rights reserved • Header based Preauthenticated SSO Provider • This provider was designed for use with identity solutions such as those provided by CA’s SiteMinder and IBM’s Tivoli Access Manager. • CAUTION: The use of this provider requires that proper network security and identity provider configuration and deployment does not allow requests directly to the Knox gateway. Otherwise, this provider will leave the gateway exposed to identity spoofing. • Pac4j Provider • used as a federation provider to support the OAuth, CAS, SAML and OpenID Connect protocols. • It must be used for SSO, in association with the KnoxSSO service and optionally with the SSOCookieProvider for access to REST APIs. Supported Authentication Mechanisms (Contd.)
  • 17. 17 © Hortonworks Inc. 2011–2018. All rights reserved • SSO Cookie Provider • a typical SP initiated websso mechanism that sets a cookie to be presented by browsers to participating applications and cryptographically verified. • enables the federation of the authentication event that occurred through KnoxSSO • KnoxSSO Service • Provides WebSSO capabilities to the Hadoop cluster* • KnoxToken Service • Using the acquired token, a client is able to access REST APIs that are protected with the JWTProvider federation provider. • TLS Client Certificate Provider • enables establishing the user based on the client provided TLS certificate. • The user will be the DN from the certificate. * not all components are supported Supported Authentication Mechanisms (Contd.)
  • 18. 18 © Hortonworks Inc. 2011–2018. All rights reserved Authorization
  • 19. 19 © Hortonworks Inc. 2011–2018. All rights reserved • Service Level Authorization • Service based ACL definition – simple Knox authorization • Apache Ranger Integration • for enterprise/centralized authorization policies • for finer grained policies on services across the platform (service level for Knox) Authorization Support in Knox
  • 20. 20 © Hortonworks Inc. 2011–2018. All rights reserved Apache Ranger Centralized Policy Management
  • 21. 21 © Hortonworks Inc. 2011–2018. All rights reserved Audit
  • 22. 22 © Hortonworks Inc. 2011–2018. All rights reserved Audit – Centralized Logging • All Service access through the Knox Gateway is Audited with detailed Entries for: • Original Access Request • Authentication Event • Identity Assertion: principal mapping, group lookup • Authorization Event • Dispatch to backend service
  • 23. 23 © Hortonworks Inc. 2011–2018. All rights reserved Apache Knox to the Rescue…
  • 24. 24 © Hortonworks Inc. 2011–2018. All rights reserved • Which category Knox has protection against? • Security headers • XFS • CSRF (limited to REST APIs) • SSL Downgrade • MIME Sniffing • XSS • Open Redirect – validated redirect & forwards • Whitelisting Knox defense against Web Attack Vectors
  • 25. 25 © Hortonworks Inc. 2011–2018. All rights reserved Let’s Protect with Apache Knox
  • 26. 26 © Hortonworks Inc. 2011–2018. All rights reserved • Only expose Knox Port through Cloud VPC • All other Service Access goes through Knox Limit Attack Surface
  • 27. 27 © Hortonworks Inc. 2011–2018. All rights reserved • Without Knox or Kerberos enablement, there is no Authentication. • Proxy Access to the UI through Knox Gateway require HTTP BASIC Auth against LDAP/AD • Calls to WebHDFS URLs will include the Authenticated User as a user.name Query Param • Only Authorized Users will have access. Add Authenticated Access
  • 28. 28 © Hortonworks Inc. 2011–2018. All rights reserved • Enable Authorization Providers e.g. Ranger • Restrict Access to WebHDFS to Users in certain LDAP groups • Only a subset of your Authenticated Users will have Access to WebHDFS Add Service Level Authorization
  • 29. 29 © Hortonworks Inc. 2011–2018. All rights reserved End Result - Authentication
  • 30. 30 © Hortonworks Inc. 2011–2018. All rights reserved End Result - Authorization
  • 31. 31 © Hortonworks Inc. 2011–2018. All rights reserved Security Controls - Apache Knox Issue Categories (OWASP Top 10 – 2017) Apache Knox A1 Injection No A2 Broken Authentication Yes A3 Sensitive Data Exposure Yes (Encrypts params & locations) A4 XML External Entities (XXE) No A5 Broken Access Control Partially (Service level ACLs) A6 Security Misconfiguration Partially (Less exposed Services) A7 XSS Partially* A8 Insecure Deserialization No A9 Using Components with Known Vulnerabilities Partially A10 Insufficient Logging and Monitoring Partially**
  • 32. 32 © Hortonworks Inc. 2011–2018. All rights reserved What else you can do? • Security around your Perimeter • Web Application Firewall • Intrusion Detection & Prevention System • Network (Packet filtering) Firewall • API Gateway • Hardening everything you can • O/S - Secure PAM Configuration, SELinux, remove unused packages, disable unused services, etc. • Application Server – Jetty, Tomcat, etc. • Database Server – enable SSL, restrict IP addresses, etc. • KDC Server – master_key_type, supported_enctypes, ticket_lifetime, etc. • Set *.proxyuser.*.hosts/groups only to needed hosts/groups
  • 33. 33 © Hortonworks Inc. 2011–2018. All rights reserved Take Away • Kerberos Encapsulation – Web Browser friendly. • Knox - Eliminates need for developing central Authentication solution for Hadoop ecosystem • Provides ready integration with Enterprise Authentication mechanisms like AD, Okta, etc. • Prevents against several OWASP Top 10 threats
  • 34. 34 © Hortonworks Inc. 2011–2018. All rights reserved Questions?
  • 35. 35 © Hortonworks Inc. 2011–2018. All rights reserved Thank you