Building Secure Hadoop Environments

© 2012 Datameer, Inc. All rights reserved.
View the full recording
You can view the full recording of this on-demand webinar with slides at:
http://info.datameer.com/Slideshare-BuildingSecure-Hadoop-Environments.html

About our Speaker
Karen Hsu
With over 15 years of experience in enterprise software, Karen Hsu has co-authored 4 patents and worked in a variety of engineering, marketing and sales roles. Most recently she came from Informatica, where she worked with the start-ups Informatica purchased to bring data quality, master data management, B2B and data security solutions to market. Karen has a Bachelor of Science degree in Management Science and Engineering from Stanford University.

About our Speaker
Filip Slunecko
Filip is part of the customer support team at Datameer. He is a Linux professional and Python enthusiast. Before joining Datameer, he was on the Hadoop team at AVG, an antivirus/security company. Filip now uses his 8 years of experience with Linux servers and Hadoop security to help Datameer customers.

Agenda
Challenges and use cases
Hadoop security landscape
Components for building successful Hadoop environments
Call to Action

Hadoop Data Security Challenges
Architectural issues
Hadoop security is developing
Vendors offer bolt-on solutions
"To add security capabilities into a big data environment, the capabilities need to scale with the data… Most security tools fail to scale and perform with big data environments."
- Adrian Lane, Securosis, Oct 12, 2012
Hadoop Security Use Cases

Use case: Role-based access
Requirement description: Data access is restricted through the abstraction layer.
Example: Users have a view of data in Hadoop that they can manipulate.

Use case: Transformation of sensitive values during load
Requirement description: Data is transformed, masked, or encrypted.
Example: The cluster is copied and then masked/transformed so that analysts work on anonymized data.

Role Based Access
[Diagram: Data Access, Pig / Hive, Map-Reduce, Restrict View, HDFS]
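The "restrict view" idea in the role-based access use case can be pictured as a filter that sits between the user and the stored records. The following is only a minimal sketch under stated assumptions: the role names, the `ROLE_COLUMNS` mapping and the in-memory records are hypothetical, and this is not how Datameer, Pig or Hive actually implement access restrictions.

```python
# Hypothetical mapping from a role to the columns that role may see.
ROLE_COLUMNS = {
    "analyst": {"order_id", "amount"},
    "admin": {"order_id", "amount", "customer_name", "ssn"},
}


def restricted_view(records, role):
    """Return copies of the records containing only the columns allowed for the role."""
    allowed = ROLE_COLUMNS.get(role, set())  # unknown roles see no columns
    return [{k: v for k, v in rec.items() if k in allowed} for rec in records]


# Illustrative in-memory data standing in for records stored in HDFS.
records = [
    {"order_id": 1, "amount": 25.0, "customer_name": "Ann", "ssn": "123-45-6789"},
]
analyst_rows = restricted_view(records, "analyst")
```

The underlying data is left untouched; only the view handed to each role differs, which is the property the use case relies on.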

Transformation of Sensitive Values
[Diagram: Data Access, Load, Map-Reduce, Transform Data, HDFS]
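Transforming sensitive values during load can be illustrated with a small pseudonymization step applied to each record before it is written. This is a sketch under assumptions, not a description of any vendor's masking product: the field names, the `MASKING_KEY` value and the choice of a keyed SHA-256 digest (HMAC) are all illustrative.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a key store, not source code.
MASKING_KEY = b"example-key-not-for-production"
SENSITIVE_FIELDS = {"ssn", "email"}  # assumed field names, for illustration only


def pseudonymize(value, key=MASKING_KEY):
    """Replace a string with a keyed SHA-256 digest so equal inputs map to equal tokens."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def transform_on_load(record):
    """Return a copy of the record with the sensitive fields pseudonymized."""
    return {
        field: pseudonymize(value) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


raw = {"user": "u42", "ssn": "123-45-6789", "email": "ann@example.com"}
masked = transform_on_load(raw)
```

Because the digest is deterministic for a given key, analysts can still join or group on the masked column without seeing the original values.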

Hybrid of Role Based Access and Transformation of Sensitive Values
[Diagram: Data Access, Load, Map-Reduce, Transform, Restrict View, HDFS]

Hadoop Security Offerings

Type: Role-based access control
Description: Use LDAP / Active Directory (AD) authentication to identify and manage users, leveraging Kerberos to provide mutual authentication.

Type: Encryption
Description:
• File encryption
• Disk encryption
• Format-preserving encryption

Type: Masking
Description: Data masking performed before load.

Type: Block-level encryption
Description: Linux directory-level encryption with an external key store.

Example vendors: [not recoverable from the slide text]

Components for Building Secure Hadoop Environment
Secure access – SSL
Access controls
Secure authentication
Kerberos
Logging – auditing
File Encryption
Disk encryption

Secure access

Access Controls
Datameer Example
Object permission
Roles
LDAP
Kerberos
Impersonation

Object Permission
Datameer Example
Object types
Import jobs
Data links
Workbooks
Export jobs
Infographics

Roles
Datameer Example

Remote Authenticator
Datameer Example

Integrating into an existing infrastructure
Active Directory support
Import groups and users to Datameer
Centralized user management
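Importing groups and users from a central directory usually ends with directory groups being mapped to application roles. The sketch below shows one way that mapping could look. Everything in it is hypothetical: the group names, the role names and the `GROUP_TO_ROLE` table are made up for illustration and do not reflect Datameer's actual data model or any LDAP client API.

```python
# Hypothetical mapping from directory group names (lowercased) to application roles.
GROUP_TO_ROLE = {
    "cn=hadoop-admins,ou=groups,dc=example,dc=com": "ADMIN",
    "cn=analysts,ou=groups,dc=example,dc=com": "ANALYST",
}


def roles_for_user(directory_groups):
    """Map the groups a directory reports for a user to a sorted list of roles."""
    roles = set()
    for group in directory_groups:
        # Compare lowercased names; treating names as case-insensitive is an assumption.
        role = GROUP_TO_ROLE.get(group.lower())
        if role is not None:
            roles.add(role)
    return sorted(roles)


user_roles = roles_for_user([
    "CN=Analysts,OU=Groups,DC=example,DC=com",
    "CN=Printers,OU=Groups,DC=example,DC=com",  # no role mapped to this group
])
```

Keeping the mapping in one place means that removing a user from a directory group also removes the corresponding role at the next import, which is the point of centralized user management.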

Kerberos

Impersonation

Demonstration

Disk Encryption
Why it’s important
• 1 year - 2%
• 2 year - 6-8%
Criteria for success
• Encryption per process
• Key management
• Safe and in full compliance with HIPAA, PCI DSS, FERPA

File Encryption
Emerging Technology

Intel Hadoop
Project Rhino
• Encryption and key management.
• A common authorization framework.
• Token-based authentication and single sign-on.
• Improved audit logging.

Logging and Auditing
Datameer: UI access, job execution
Hadoop: file access, job runs
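File-access auditing on the Hadoop side typically means reading audit log lines and summarizing who touched what. The sketch below parses one such line. The sample line is made up, and its key=value layout is an assumption modeled on HDFS NameNode audit logs; real log formats vary by Hadoop version and configuration.

```python
import re

# Made-up sample line, modeled on the key=value layout of HDFS NameNode audit logs.
SAMPLE = (
    "2012-10-12 10:15:02,123 INFO FSNamesystem.audit: "
    "ugi=alice\tip=/10.0.0.5\tcmd=open\tsrc=/data/sales.csv\tdst=null\tperm=null"
)

# A key made of word characters, an equals sign, then a run of non-whitespace.
PAIR = re.compile(r"(\w+)=(\S+)")


def parse_audit_line(line):
    """Extract the key=value pairs from one audit line into a dict."""
    return dict(PAIR.findall(line))


def accesses_by_user(lines):
    """Count audit events per user, taken from the 'ugi' field."""
    counts = {}
    for line in lines:
        user = parse_audit_line(line).get("ugi", "unknown")
        counts[user] = counts.get(user, 0) + 1
    return counts


event = parse_audit_line(SAMPLE)
```

A per-user count like the one from `accesses_by_user` is the kind of summary that a centralized logging pipeline would then store, search and visualize.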

Logging and Auditing
Centralized logging
Collectors: Datameer, Splunk, Flume, Graylog
Storage: Datameer*, Splunk, Elasticsearch, Solr, Hive
Real-time search: Katta, Elasticsearch, Solr
Visualization: Datameer, Splunk, Graylog, Graphite

Recap
Challenges and use cases
Hadoop security landscape
Components for building successful Hadoop environments
• Secure access – SSL
• Access controls
• Secure authentication
• Kerberos
• Logging – auditing
• File Encryption
• Disk encryption
Call to Action
Contact
• Filip Slunecko, fslunecko@datameer.com
• Karen Hsu, khsu@datameer.com

Implementing Hadoop Security Workshop
• Contact marketing@datameer.com for more details

Meet us at Discover Big Data 8 City Workshop near you!
http://info.datameer.com/Discover-Big-Data-RoadShow.html

www.datameer.com

Online Resources




Try Datameer: www.datameer.com
Follow us on Twitter @datameer

Is Your Hadoop Environment Secure?

  • 1. Building Secure Hadoop Environments © 2012 Datameer, Inc. All rights reserved. © 2012 Datameer, Inc. All rights reserved.
  • 2. View the full recording You can view the full recording of this ondemand webinar with slides at: http://info.datameer.com/Slideshare-BuildingSecure-Hadoop-Environments.html © 2012 Datameer, Inc. All rights reserved.
  • 3. About our Speaker Karen Hsu With over 15 years of experience in enterprise software, Karen Hsu has coauthored 4 patents and worked in a variety of engineering, marketing and sales roles. Most recently she came from Informatica where she worked with the start-ups Informatica purchased to bring data quality, master data management, B2B and data security solutions to market. Karen has a Bachelors of Science degree in Management Science and Engineering from Stanford University. © 2012 Datameer, Inc. All rights reserved.
  • 4. About our Speaker Filip Slunecko Filip is part of the Customer support team at Datameer. He is a Linux professional and Python enthusiast. Before joining Datameer, he was on the Hadoop team at AVG, an antivirus/security company. Filip now uses his 8 years experience with Linux servers and Hadoop security to help Datameer customers. © 2012 Datameer, Inc. All rights reserved.
  • 5. Building Secure Hadoop Environments © 2012 Datameer, Inc. All rights reserved. © 2012 Datameer, Inc. All rights reserved.
  • 6. Agenda Challenges and use cases Hadoop security landscape Components for building successful Hadoop environments Call to Action © 2012 Datameer, Inc. All rights reserved.
  • 7. Hadoop Data Security Challenges Architectural issues Hadoop security is developing Vendors offer bolt-on solutions To add security capabilities into a big data environment, the capabilities need to scale with the data… Most security tools fail to scale and perform with big data environments. - Adrian Lane, Securosis Securosis, Oct 12, 2012 © 2012 Datameer, Inc. All rights reserved.
  • 8. Hadoop Security Use Cases Use Case Requirement Example Description Role based access Data access is restricted through the abstraction layer Users have a view of data in Hadoop they can manipulate Transformation of sensitive values during load Data is transformed, masked, or encrypted. Cluster is copied and then masked/transformed so that analysts work on anonymized data © 2012 Datameer, Inc. All rights reserved.
  • 9. Role Based Access (diagram labels: Data Access; Pig / Hive; Map-Reduce; Restrict View; HDFS)
  • 10. Transformation of Sensitive Values (diagram labels: Data Access; Load; Map-Reduce; Transform Data; HDFS)
  • 11. Hybrid of Role Based Access and Transformation of Sensitive Values (diagram labels: Data Access; Load; Map-Reduce; Transform; Restrict View; HDFS)
  • 12. Hadoop Security Offerings (columns: Type, Description, Example vendors). Role based access control: use LDAP / Active Directory (AD) authentication to identify and manage users; leverage Kerberos to provide mutual authentication. Encryption: block level encryption; Linux directory level encryption with external key store; file encryption; disk encryption; format preserving encryption. Masking: data masking performed before load.
  • 13. Components for Building Secure Hadoop Environment: Secure access – SSL; Access controls; Secure authentication; Kerberos; Logging – auditing; File encryption; Disk encryption
  • 14. Secure access
  • 15. Access Controls (Datameer Example): Object permission; Roles; LDAP; Kerberos; Impersonation
  • 16. Object Permission (Datameer Example). Object types: Import jobs; Data links; Workbooks; Export job; Info graphics
  • 17. Roles (Datameer Example)
  • 18. Remote Authenticator (Datameer Example): Integrating into an existing infrastructure; Active Directory support; Import groups and users to Datameer; Centralized user management
  • 19. Kerberos
  • 20. Impersonation
  • 21. Demonstration
  • 22. Disk Encryption. Why it’s important: 1 year - 2%; 2 year - 6-8%. Criteria for success: encryption per process; key management; safe and in full compliance with HIPAA, PCI DSS, FERPA.
  • 23. File Encryption. Emerging technology: Intel Hadoop Project Rhino. Encryption and key management; a common authorization framework; token based authentication and single sign on; improved audit logging.
  • 24. Logging and Auditing. Datameer: UI access; job execution. Hadoop: file access; job runs.
  • 25. Logging and Auditing: Centralized logging. Collectors: Datameer, Splunk, Flume, Greylog. Storage: Datameer*, Splunk, Elasticsearch, Solr, Hive. Real-time search: Katta, Elasticsearch, Solr. Visualization: Datameer, Splunk, Greylog, Graphite.
  • 26. Recap: Challenges and use cases; Hadoop security landscape; Components for building successful Hadoop environments (Secure access – SSL; Access controls; Secure authentication; Kerberos; Logging – auditing; File encryption; Disk encryption)
  • 27. Call to Action. Contact: Filip Slunecko fslunecko@datameer.com; Karen Hsu khsu@datameer.com. Implementing Hadoop Security Workshop: contact marketing@datameer.com for more details. Meet us at a Discover Big Data 8 City Workshop near you! http://info.datameer.com/Discover-Big-Data-RoadShow.html www.datameer.com
  • 28. Online Resources. Try Datameer: www.datameer.com. Follow us on Twitter @datameer
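
The "transformation of sensitive values during load" use case from slides 8 and 10 can be illustrated with a small sketch. This is not Datameer's implementation; it assumes records arrive as Python dictionaries, uses hypothetical field names (`ssn`, `email`), and swaps in keyed hashing (HMAC-SHA256) as one possible masking technique so that the same input always maps to the same pseudonym.

```python
import hashlib
import hmac

# Hypothetical field names; the slides do not name any specific fields.
SENSITIVE_FIELDS = {"ssn", "email"}


def mask_value(value: str, key: bytes) -> str:
    """Return a deterministic 16-character pseudonym for value (HMAC-SHA256)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_record(record: dict, key: bytes) -> dict:
    """Copy record, replacing sensitive fields with pseudonyms before load."""
    return {
        name: mask_value(str(val), key) if name in SENSITIVE_FIELDS else val
        for name, val in record.items()
    }


if __name__ == "__main__":
    # Demo key only; a real deployment would fetch the key from a key store.
    key = b"demo-key-not-for-production"
    row = {"user": "alice", "ssn": "123-45-6789", "purchases": 3}
    print(mask_record(row, key))
```

Because the pseudonym is deterministic for a given key, analysts can still join and group on the masked column without seeing the original value. Note that keyed hashing is not format preserving, unlike the format preserving encryption listed on slide 12.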

Editor's notes

  1. Architectural issues: Big data environments typically offer no finer granularity of access than schema level. Lack of secure inter-node communication (create a separate layer so customers don’t have to worry about this). Hadoop security is developing: Improvements to role-based HDFS security are in progress. Open-source projects are just beginning. Vendors offer immature solutions: They try to apply traditional methods to Hadoop. They create a central chokepoint and do not operate at the node level. Lack of solutions in production.
  2. The big data users I have spoken with about data security agreed that data masking at that scale is infeasible. Given the rate of data insertion (also called ‘velocity’), masking sensitive data before loading it into a cluster would require “an entire ETL cluster to front the Hadoop cluster”. But apparently it’s doable, and Netflix did just that – fronted its analytics cluster with a data transformation cluster, all within EC2. 500 nodes massaging data for another 500 nodes. While the ETL cluster is not used for masking, note that it is about the same size as the analysis cluster. It’s this one-to-one mapping that I often worry about with security. Ask yourself, “Do we need another whole cluster for masking?” No? Then what about NoSQL activity monitoring? What about IAM, application monitoring, and any other security tasks? Do you start to see the problem with bolting on security? Logging and auditing are embeddable – most everything else is not.
  3. Kerberos provides mutual authentication: both the user and the server verify each other’s identity. Gazzang – block level encryption for big data.
  4. A proper PKI infrastructure inside an organization. Cert. The warning screen – users are used to certificate warnings. CDH4.1 – Kerberos SSL. Disable Hadoop web access.
  5. Object types. Unix-based permissions. Results sharing. Easy to understand and audit.
  6. Granular roles. Per type of object, not just per object. Example: the Hadoop admin role can access Hadoop settings and create import jobs but does not have access to data.
  7. Different groups in an organization – more security – Hadoop admins do not have rights to add/remove people from groups.
  8. Join a company AD infrastructure. Adopted by Hadoop as an authentication mechanism. Integration with other services across platforms, for example ZooKeeper and MSSQL services.
  9. Delegation – Datameer can run jobs as the owner of the job. With impersonation, only the owner can access his own files. When a user is deleted from the system …… Jobs are run as the owner of the job and stored
  10. Show access rights, role screen, LDAP screen, Kerberos setup.
  11. Intel – implemented in the Hadoop API. Young project – the future will show if others participate – Cloudera …. Others: Volateg. Preterit – not open source and not widely used.
  12. Detailed information about user access. Detailed information about job runs – dependent on Hadoop logs.
  13. Datameer for big data. Use Datameer to analyze Datameer access logs. Abnormality detection. Security breach detection. Behavior analysis. * HDFS - Hadoop
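
The abnormality detection mentioned in note 13 can be sketched in a few lines. This is only an illustration under stated assumptions, not how Datameer analyzes its logs: it assumes access events have already been parsed into hypothetical `(user, path)` tuples, and it uses a simple z-score on per-user access counts as a stand-in for whatever analysis is actually applied.

```python
from collections import Counter
from statistics import mean, pstdev


def flag_unusual_users(events, threshold=2.0):
    """Return users whose access count is far above the average.

    events: iterable of (user, path) tuples (a hypothetical parsed log format).
    threshold: number of standard deviations above the mean that counts
    as unusual.
    """
    counts = Counter(user for user, _path in events)
    if not counts:
        return []
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every user has the same number of accesses; nothing stands out.
        return []
    return sorted(u for u, c in counts.items() if (c - mu) / sigma > threshold)


if __name__ == "__main__":
    log = [(f"user{i}", "/data/sales") for i in range(5) for _ in range(2)]
    log += [("mallory", "/data/hr")] * 50
    print(flag_unusual_users(log))
```

A count-based z-score is deliberately crude; it mainly shows why centralized, structured access logs (slides 24 and 25) are a prerequisite for any breach or behavior analysis.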