© 2019,Amazon Web Services, Inc. or its affiliates. All rights reserved.
Protect customer privacy with AWS
Rohit Pujari
Solutions Architecture
Amazon Web Services
GRC351
Anhad Preet Singh
Enterprise Architect
Dataguise
Let’s look at some metrics
Dangers of holding PII
• Rogue agents
• External hacking
• Second-party misuse
• Breach
• Spyware
• Unsecured devices
• Espionage
• Botnets
• Consumer consent violation
Compliance and regulations
• GDPR (EU): General Data Protection Regulation
• CCPA (California): California Consumer Privacy Act
• PIPEDA (Canada): Personal Information Protection and Electronic Documents Act
• PCI-DSS (payment cards)
• FSA, ICO, DPA, Payment Schemes, EU Member State laws, and US and other foreign
regulators (e.g., SEC)
Compliance rules and regulations are constantly evolving. As such, we are moving toward true
data privacy law/regulation.
Then and now
Then                          Now
Directives                    Laws
Best practices/good ethics    Regulatory requirements
No consequences               Heavy fines
Overhead                      In design and necessity
What problems are customers trying to solve?
• What type of data am I collecting?
• Where do I collect it?
• Where do I store it?
• Do I have the appropriate legal collection
statements?
• How and when do I delete data?
• How do I secure the data?
• What responsibility do I have?
• Why do I collect the data?
• What is my legal basis for processing and using
the data?
• Where is a list of all my data?
• Do I communicate with the subject I am
collecting from?
• Who do I share it with?
• Who has access to my data? How do I control it?
• What are the use cases for the data? Are they
permitted? Who provided permission?
• How do I find my data?
Data lakes architecture
• Relational and non-relational data
• Schema defined during analysis
• Unmatched durability and availability
• Security, compliance, and audit capabilities
• Run any analytics on the same data without
movement
• Scale storage and compute independently
• Pay for what is used
Services shown: AWS Snowball, AWS Snowmobile, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon S3, Amazon Redshift, Amazon EMR, Amazon Athena, Amazon Kinesis, Amazon Elasticsearch Service, Amazon Kinesis Video Streams, AI Services
Pay only for the resources you use as you scale
• Pay as you go for the resources you consume
• As low as $5 per TB scanned (about $0.005/GB) with Amazon Athena
• Amazon EMR and Amazon Athena can automatically scale down resources after a job completes, reducing cost
• Commit to a set term and save up to 75% with Reserved Instances
• Run on spare compute capacity with Amazon EMR and
save up to 90% with Amazon EC2 Spot
Traditional approach leads to wasted capacity
[Chart: capacity vs. demand. Traditional (rigid): unmet demand (upset players, missed revenue) and excess capacity (wasted $$$). AWS (elastic): servers track demand.]
AWS approach: Pay for the capacity you use
Benefits of AWS for secure data storage
• Security and compliance: Three different forms of encryption; encrypts data in transit when replicating across Regions; log and monitor with AWS CloudTrail; use ML to discover and protect sensitive data with Amazon Macie
• Flexible management: Classify, report, and visualize data usage trends; objects can be tagged to see storage consumption, cost, and security; build lifecycle policies to automate tiering and retention
• Durability, availability, & scalability: Built for eleven nines of durability; data distributed across 3 physical facilities in an AWS Region; can be automatically replicated to any other AWS Region
• Query in place: Run analytics & ML on the data lake without data movement; Amazon S3 Select can retrieve a subset of data, improving analytics performance by up to 400%
Storing is not enough; data needs to be discoverable
“Dark data are the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).”
Gartner IT Glossary, 2018
https://www.gartner.com/it-glossary/dark-data
Data sources: CRM, ERP, data warehouse, mainframe data, web, social, log files, machine data, semi-structured, unstructured
AWS Glue Data Catalog
Grant permissions to securely share data
AWS Lake Formation security workflow
User
IAM users, roles
Active Directory Amazon S3
Compliance: Virtually every regulatory agency
Global
• CSA: Cloud Security Alliance Controls
• ISO 9001: Global Quality Standard
• ISO 27001: Security Management Controls
• ISO 27017: Cloud Specific Controls
• ISO 27018: Personal Data Protection
• PCI DSS Level 1: Payment Card Standards
• SOC 1: Audit Controls Report
• SOC 2: Security, Availability, & Confidentiality Report
• SOC 3: General Controls Report
United States
• CJIS: Criminal Justice Information Services
• DoD SRG: DoD Data Processing
• FedRAMP: Government Data Standards
• FERPA: Educational Privacy Act
• FIPS: Government Security Standards
• FISMA: Federal Information Security Management
• GxP: Quality Guidelines and Regulations
• FFIEC: Financial Institutions Regulation
• HIPAA: Protected Health Information
• ITAR: International Arms Regulations
• MPAA: Protected Media Content
• NIST: National Institute of Standards and Technology
• SEC Rule 17a-4(f): Financial Data Standards
• VPAT/Section 508: Accountability Standards
Asia Pacific
• FISC [Japan]: Financial Industry Information Systems
• IRAP [Australia]: Australian Security Standards
• K-ISMS [Korea]: Korean Information Security
• MTCS Tier 3 [Singapore]: Multi-Tier Cloud Security Standard
• My Number Act [Japan]: Personal Information Protection
Europe
• C5 [Germany]: Operational Security Attestation
• Cyber Essentials Plus [UK]: Cyber Threat Protection
• G-Cloud [UK]: UK Government Standards
• IT-Grundschutz [Germany]: Baseline Protection Methodology
How data normally flows…
Source data (database or API) → Extraction process → Amazon S3 data lake → Load process → Amazon Redshift staging table → Transformation process → Amazon Redshift destination table → Reporting process → Reports and extracts
Transforming sensitive data
The key to building a de-identified system is adding a sensitive data transformation step to the data extraction process.
Source data (database or API) → Extraction and transformation process → Amazon S3 data lake → Load process → Amazon Redshift staging table → Post-load transformation → Amazon Redshift destination table → Reporting process → Reports and extracts
Dataguise combines privacy and security
Detect: Find sensitive data in structured, unstructured, and semi-structured content
Protect: Remediate your sensitive data exposure for risk and compliance obligations
Monitor: Track how and where sensitive data is being accessed
• De-identify personal data
• Encrypt at the element level
• Track cross-border transfers
• Track third-party disclosures
• Discover and classify sensitive data
• Inventory identities and requirements
• Process data subject access requests
• Notify on retention limits
• Alert on compliance violations
• Alert on inappropriate user access
The problem: Regulatory resistance
Hadoop Database
File shares Data warehouse
On-premises
AWS
The solution
On-premises (start):
• Scan on-premises data repositories for personal and sensitive data
• Sensitive data? Yes: Remediate (notify, mask, encrypt, tokenize, access control, DLP). No: Migrate data
AWS (end):
• Scan the migrated data in AWS for personal and sensitive data
• Sensitive data? Yes: Remediate (notify, mask, encrypt, tokenize, access control, DLP). No: End
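The scan, remediate, migrate, and rescan flow can be sketched as follows. This is an illustration under stated assumptions, not the Dataguise implementation: a bare 9-digit run stands in for "personal and sensitive data", masking stands in for the full set of remediation options, and all function names are hypothetical.

```python
import re

# Assumption: a 9-digit run stands in for "personal and sensitive data".
SSN_PATTERN = re.compile(r"\b\d{9}\b")


def scan(records):
    """Return indexes of records that appear to contain sensitive data."""
    return [i for i, text in enumerate(records) if SSN_PATTERN.search(text)]


def remediate(records):
    """Mask every detected value (one of several remediation options)."""
    return [SSN_PATTERN.sub("XXXXXXXXX", text) for text in records]


def migrate_with_checks(on_prem_records):
    """Scan and remediate on-premises, migrate, then scan and remediate again."""
    staged = list(on_prem_records)
    if scan(staged):
        staged = remediate(staged)
    migrated = list(staged)  # stand-in for the real migration step
    if scan(migrated):
        migrated = remediate(migrated)
    return migrated
```

The second scan is deliberate: it catches anything missed, or introduced, between the on-premises check and arrival in the cloud.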
Solution architecture
Hadoop Cluster with
Hadoop IDP on the edge node
DgSecure controller
LDAP/AD
On-premises Data Center 1 On-premises Data Center 2
Target databases
File shares with files IDP
DBMS IDP
AWS Cloud
Target databases/Redshift/RDS
EC2 instance with
DBMS IDP and Cloud IDP
Amazon S3 buckets
Amazon EMR cluster
with S3 compute IDP
Validating the knowns & finding the unknowns: Structured and semi-structured data
Email             Customer ID   Transcript
vcorleon@gf.com   19664         Just talked to Vito Corleone
fred@gf.com       23423         Fredo’s SSN is 716905534
sonny@gf.com      99644         Sonny is moving to Nevada
NA                02945         It is expected to rain tomorrow
Detected elements: Email; ID; Name, SSN, State
Email             Customer ID   Transcript
4t23gttt          7462391       Just talked to Lebron James
44e5325           1239474       Melo’s SSN is 983441298
0we&yrw           9983487       Manu is moving to Texas
NA                3344325       It is expected to rain tomorrow
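One way to read "validating the knowns" and "finding the unknowns" is sketched below, using the sample rows from the slide. Checking that a labeled column matches its expected pattern validates a known; searching a free-text column for values nobody declared finds an unknown. The regular expressions and function names are illustrative assumptions, not the detection logic of any product.

```python
import re

# Assumed patterns, for illustration only.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
SSN_IN_TEXT = re.compile(r"\b\d{9}\b")

ROWS = [
    {"Email": "vcorleon@gf.com", "Customer ID": "19664",
     "Transcript": "Just talked to Vito Corleone"},
    {"Email": "fred@gf.com", "Customer ID": "23423",
     "Transcript": "Fredo’s SSN is 716905534"},
    {"Email": "sonny@gf.com", "Customer ID": "99644",
     "Transcript": "Sonny is moving to Nevada"},
    {"Email": "NA", "Customer ID": "02945",
     "Transcript": "It is expected to rain tomorrow"},
]


def validate_known(rows, column, pattern):
    """Fraction of non-"NA" values in a labeled column that match the pattern."""
    values = [r[column] for r in rows if r[column] != "NA"]
    if not values:
        return 0.0
    return sum(bool(pattern.match(v)) for v in values) / len(values)


def find_unknown(rows, column, pattern):
    """Row indexes where a free-text column contains an undeclared sensitive value."""
    return [i for i, r in enumerate(rows) if pattern.search(r[column])]
```

Name and state detection in the transcript column would need dictionaries or a trained model and is left out of this sketch.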
Finding the needles in the haystack: unstructured data
Customers Call center
Your call will be recorded
for quality assurance
…this is Jonathan Franklin and my social is six one two one four five three zero nine is there any more information you need for my app…
Detected: Full name, Social Security Number
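In a call transcript the digits are spelled out, so a plain digit pattern would miss them. A minimal sketch, assuming a run of exactly nine consecutive spoken digit words is treated as a candidate Social Security Number; the helper name and the nine-word rule are assumptions made for illustration.

```python
import re

WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def find_spoken_ssns(transcript):
    """Return digit strings for each run of exactly nine spoken digit words."""
    words = re.findall(r"[a-z]+", transcript.lower())
    found, run = [], []
    for word in words + [""]:  # the empty sentinel flushes a run at the end
        if word in WORD_TO_DIGIT:
            run.append(WORD_TO_DIGIT[word])
        else:
            if len(run) == 9:
                found.append("".join(run))
            run = []
    return found


if __name__ == "__main__":
    text = ("this is Jonathan Franklin and my social is six one two one four "
            "five three zero nine is there any more information you need")
    print(find_spoken_ssns(text))
```

Full-name detection is a harder problem and is not attempted here.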
Protecting the needles in the haystack: unstructured data
…this is Aaron Rodgers and my social is two three one six four zero nine one two is there any more information you need for my app…
Scale and accuracy
• Scanning strategies
• Sampling
• Location: Top, bottom, random, etc.
• Amount: By percentage, by size, etc.
• Machine learning
• Low to no false positives
• Intelligent detection
• Parallel execution
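The sampling options listed above (location and amount) can be illustrated with a small helper. This is a sketch under assumed semantics: `location` picks the top, bottom, or a random subset of lines, and `percent` sets the amount. The function name and parameters are hypothetical.

```python
import random


def sample_lines(lines, location="top", percent=10, seed=None):
    """Select a sample of lines to scan, by location and percentage."""
    if not lines:
        return []
    # Always scan at least one line, never more than the input holds.
    count = min(len(lines), max(1, len(lines) * percent // 100))
    if location == "top":
        return lines[:count]
    if location == "bottom":
        return lines[-count:]
    if location == "random":
        rng = random.Random(seed)
        indexes = sorted(rng.sample(range(len(lines)), count))
        return [lines[i] for i in indexes]
    raise ValueError(f"unknown location: {location}")
```

Sampling trades coverage for speed: a 10% sample scans a tenth of the data but can miss sensitive values that appear only in the lines left out.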
What Dataguise scans and protects on-premises
SQL Server
Supported databases
Other supported repositories
Supported Hadoop distributions
What Dataguise scans and protects in AWS
Amazon S3 Amazon Aurora
Amazon RDS Amazon EMR Amazon Redshift
Sleeping disorder device manufacturer
• Continuous ingestion of data from sleeping devices used by their patients, in CSV and Parquet files
• Customer created a data lake on Amazon S3
• PHI data needs to be masked before landing in data lake
• Customer uses microbatches: each microbatch covers 10 min., during which almost 1.5 GB of data is ingested
• Dataguise needs to identify and mask the data in each microbatch in < 5 min. (+1 min. tolerance)
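As a rough check of what that deadline implies, the required sustained throughput is the batch size divided by the time allowed. The sketch below assumes decimal units (1 GB = 1000 MB); the function name is hypothetical.

```python
def required_throughput_mb_per_s(batch_gb, deadline_min):
    """Sustained MB/s needed to process batch_gb within deadline_min minutes."""
    return batch_gb * 1000 / (deadline_min * 60)


if __name__ == "__main__":
    # 1.5 GB in 5 minutes, then with the extra 1 minute of tolerance.
    print(required_throughput_mb_per_s(1.5, 5))
    print(required_throughput_mb_per_s(1.5, 6))
```

Under these assumptions the 5-minute target works out to about 5 MB/s of sustained detection and masking, and roughly 4.2 MB/s if the full 6 minutes are used.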
The solution
Landing zone
Safe zone
Devices
Amazon S3
AWS Cloud
Large pharmaceutical company
• Continuous ingestion of data from various healthcare companies in the form of JSON and CSV files in
Amazon S3
• Customer created a data lake on Amazon S3
• AWS Lambda functions detect when a new file lands in the staging area and kick off all the APIs
• PHI data needs to be detected and masked before landing in data lake
• Customer uses a staging area where Dataguise and other tools are used to identify sensitive data, profile data, run antivirus scans, etc. before the data is moved into the data lake
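A minimal sketch of the Lambda side of this flow, assuming the function is triggered by an Amazon S3 event notification for the staging bucket. `new_objects` reads the bucket name and object key from the event payload. `start_checks` is a hypothetical placeholder for the calls that would start sensitive-data detection, profiling, and antivirus scanning; the presentation does not describe those APIs.

```python
from urllib.parse import unquote_plus


def new_objects(event):
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in S3 event notifications.
            pairs.append((bucket, unquote_plus(key)))
    return pairs


def start_checks(bucket, key):
    """Hypothetical placeholder: a real version would call the scanning APIs."""
    return {"bucket": bucket, "key": key,
            "checks": ["sensitive-data", "profile", "antivirus"]}


def lambda_handler(event, context):
    """Entry point: start checks for every object named in the event."""
    return [start_checks(bucket, key) for bucket, key in new_objects(event)]
```

Only files that pass every check would then be moved from the staging area into the data lake.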
The solution
Staging area
Safe zone
Source
Amazon S3
AWS Cloud
Data profile Antivirus
Source
Source
Thank you!

Protect customer privacy with AWS - GRC351 - AWS re:Inforce 2019

  • 1. © 2019,Amazon Web Services, Inc. or its affiliates. All rights reserved. Protect customer privacy with AWS Rohit Pujari Solutions Architecture Amazon Web Services G R C 3 5 1 Anhad Preet Singh Enterprise Architect Dataguise
  • 2. © 2019,Amazon Web Services, Inc. or its affiliates. All rights reserved. Let’s look at some metrics
  • 3. © 2019,Amazon Web Services, Inc. or its affiliates. All rights reserved. PII Rogue agents External Hacking Second-party misuse Breach Spyware Unsecured devices Espionage Botnets Consumer consent violation Dangers of holding PII
  • 4. © 2019,Amazon Web Services, Inc. or its affiliates. All rights reserved. Compliance and regulations • GDPR (EU): General Data Protection Regulation • CCPA (California): California Consumer Privacy Act • PIPEDA (Canada): Personal Information Protection and Electronic Documents Act • PCI-DSS (payment cards) • FSA, ICO, DPA, Payment Schemes, EU Member State laws, and US and other foreign regulators (e.g., SEC) Compliance rules and regulations are constantly evolving. As such, we are moving toward true data privacy law/regulation.
  • 5. © 2019,Amazon Web Services, Inc. or its affiliates. All rights reserved. Then and now Directives Laws Best practices/ good ethics Regulatory requirements No consequences Heavy fines Overhead In design and necessity
  • 6. © 2019,Amazon Web Services, Inc. or its affiliates. All rights reserved. What problems are customers are trying to solve? • What type of data am I collecting? • Where do I collect it? • Where do I store it? • Do I have the appropriate legal collection statements? • How and when do I delete data? • How do I secure the data? • What responsibility do I have? • Why do I collect the data? • What is my legal basis for processing and using the data? • Where is a list of all my data? • Do I communicate with the subject I am collecting from? • Who do I share it with? • Who has access to my data? How do I control it? • What are the use cases for the data? Are they permitted? Who provided permission? • How do I find my data?
  • 8. Data lakes architecture • Relational and non-relational data • Schema defined during analysis • Unmatched durability and availability • Security, compliance, and audit capabilities • Run any analytics on the same data without movement • Scale storage and compute independently • Pay for what is used • Services shown: AWS Snowball, AWS Snowmobile, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon S3, Amazon Redshift, Amazon EMR, Amazon Athena, Amazon Kinesis, Amazon Elasticsearch Service, Amazon Kinesis Video Streams, AI Services
  • 9. Pay only for the resources you use as you scale • Pay as you go for the resources you consume • As low as $0.05/GB scanned with Amazon Athena • Amazon EMR and Amazon Athena can automatically scale down resources after a job completes, saving you costs • Commit to a set term and save up to 75% with Reserved Instances • Run on spare compute capacity with Amazon EMR and save up to 90% with Amazon EC2 Spot • Traditional approach (rigid) leads to wasted capacity: when demand exceeds servers, unmet demand (upset players, missed revenue); when servers exceed demand, excess capacity (wasted $$$) • AWS approach (elastic): Pay for the capacity you use
  • 10. Benefits of AWS for secure data storage • Security and compliance: Three different forms of encryption; encrypts data in transit when replicating across Regions; log and monitor with AWS CloudTrail; use ML to discover and protect sensitive data with Amazon Macie • Flexible management: Classify, report, and visualize data usage trends; objects can be tagged to see storage consumption, cost, and security; build lifecycle policies to automate tiering and retention • Durability, availability, & scalability: Built for eleven nines of durability; data distributed across 3 physical facilities in an AWS Region; automatically replicated to any other AWS Region • Query in place: Run analytics & ML on data lake without data movement; Amazon S3 Select can retrieve a subset of data, improving analytics performance by 400%
  • 11. Storing is not enough; data needs to be discoverable • “Dark data are the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).” Gartner IT Glossary, 2018, https://www.gartner.com/it-glossary/dark-data • Sources shown: CRM, ERP, data warehouse, mainframe data, web, social, log files, machine data, semi-structured, unstructured
  • 12. AWS Glue Data Catalog
  • 15. Grant permissions to securely share data
  • 16. AWS Lake Formation security workflow User IAM users, roles Active Directory Amazon S3
  • 17. Compliance: Virtually every regulatory agency • Global: CSA (Cloud Security Alliance Controls); ISO 9001 (Global Quality Standard); ISO 27001 (Security Management Controls); ISO 27017 (Cloud Specific Controls); ISO 27018 (Personal Data Protection); PCI DSS Level 1 (Payment Card Standards); SOC 1 (Audit Controls Report); SOC 2 (Security, Availability, & Confidentiality Report); SOC 3 (General Controls Report) • United States: CJIS (Criminal Justice Information Services); DoD SRG (DoD Data Processing); FedRAMP (Government Data Standards); FERPA (Educational Privacy Act); FIPS (Government Security Standards); FISMA (Federal Information Security Management); GxP (Quality Guidelines and Regulations); FFIEC (Financial Institutions Regulation); HIPAA (Protected Health Information); ITAR (International Arms Regulations); MPAA (Protected Media Content); NIST (National Institute of Standards and Technology); SEC Rule 17a-4(f) (Financial Data Standards); VPAT/Section 508 (Accountability Standards) • Asia Pacific: FISC [Japan] (Financial Industry Information Systems); IRAP [Australia] (Australian Security Standards); K-ISMS [Korea] (Korean Information Security); MTCS Tier 3 [Singapore] (Multi-Tier Cloud Security Standard); My Number Act [Japan] (Personal Information Protection) • Europe: C5 [Germany] (Operational Security Attestation); Cyber Essentials Plus [UK] (Cyber Threat Protection); G-Cloud [UK] (UK Government Standards); IT-Grundschutz [Germany] (Baseline Protection Methodology)
  • 18. How data normally flows… Source data (database or API) → Extraction process → Amazon S3 data lake → Load process → Amazon Redshift staging table → Transformation process → Amazon Redshift destination table → Reporting process → Reports and extracts
  • 19. Transforming sensitive data • The key to building a de-identified system is adding a sensitive data transformation step to the data extraction process. • Stages shown: Extraction and transformation process; Load process; Post-load transformation; Amazon S3 data lake; Amazon Redshift staging table; Reporting process; Amazon Redshift destination table; Source data (database or API); Reporting process; Reports and extracts
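The transformation step described on this slide can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation: the field names, the hard-coded key, and the choice of a keyed hash (HMAC-SHA256) as the de-identification method are all assumptions made for the example.

```python
import hashlib
import hmac

# Assumptions for illustration only: which fields are sensitive and where the
# key comes from are not specified in the slides.
SENSITIVE_FIELDS = {"email", "ssn"}
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value: str) -> str:
    """Return a keyed, deterministic token for a sensitive value."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def extract_and_transform(records):
    """Yield records with sensitive fields replaced before they are written out."""
    for record in records:
        yield {
            field: pseudonymize(value)
            if field in SENSITIVE_FIELDS and value is not None
            else value
            for field, value in record.items()
        }


source_rows = [{"customer_id": 19664, "email": "vcorleon@gf.com", "ssn": None}]
safe_rows = list(extract_and_transform(source_rows))
```

Because the token is deterministic for a given key, the same source value maps to the same token on every run, so joins on the de-identified column still work downstream.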
  • 21. Dataguise combines privacy and security • Detect: Find sensitive data in structured, unstructured, and semi-structured content • Protect: Remediate your sensitive data exposure for risk and compliance obligations • Monitor: Track how and where sensitive data is being accessed • Capabilities: De-identify personal data; Encrypt at the element level; Track cross-border transfers; Track third-party disclosures; Discover and classify sensitive data; Inventory identities and requirements; Process data subject access requests; Notify on retention limits; Alert on compliance violations; Alert on inappropriate user access
  • 22. The problem: Regulatory resistance • On-premises: Hadoop, database, file shares, data warehouse • AWS
  • 23. The solution • On-premises: Start → Scan on-premises data repositories for personal and sensitive data → Sensitive data? Yes: Remediate (notify, mask, encrypt, tokenize, access control, DLP); No: continue → Migrate data • AWS: Scan the migrated data in AWS for personal and sensitive data → Sensitive data? Yes: Remediate (notify, mask, encrypt, tokenize, access control, DLP); No: continue → End
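The scan, remediate, migrate, and re-scan loop in this flowchart can be sketched as below. The detector (a single nine-digit SSN regex), the masking remediation, and the in-memory dictionaries standing in for repositories are all assumptions for illustration; the slides do not describe how Dataguise implements these steps.

```python
import re

# Assumption: a simple SSN pattern stands in for a real sensitive-data detector.
SSN_PATTERN = re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b")


def contains_sensitive(text: str) -> bool:
    """Scan step: report whether the text holds a value that looks like an SSN."""
    return bool(SSN_PATTERN.search(text))


def remediate(text: str) -> str:
    """Remediation step: masking is used here; the slide also lists other options."""
    return SSN_PATTERN.sub("[MASKED]", text)


def migrate(on_prem_docs: dict) -> dict:
    """Scan and remediate on-premises, migrate, then scan again in the target."""
    cloud_docs = {}
    for name, text in on_prem_docs.items():
        if contains_sensitive(text):      # first "Sensitive data?" decision
            text = remediate(text)
        cloud_docs[name] = text           # "Migrate data"
    for name, text in list(cloud_docs.items()):
        if contains_sensitive(text):      # second decision, after migration
            cloud_docs[name] = remediate(text)
    return cloud_docs


docs = {
    "call.txt": "Fredo's SSN is 716905534",
    "weather.txt": "It is expected to rain tomorrow",
}
migrated = migrate(docs)
```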
  • 25. Solution architecture Hadoop Cluster with Hadoop IDP on the edge node DgSecure controller LDAP/AD On-premises Data Center 1 On-premises Data Center 2 Target databases File shares with files IDP DBMS IDP AWS Cloud Target databases/Redshift/RDS EC2 instance with DBMS IDP and Cloud IDP Amazon S3 buckets Amazon EMR cluster with S3 compute IDP
  • 26. Validating the knowns & finding the unknowns: Structured and semi-structured data • Columns: Email | Customer ID | Transcript • vcorleon@gf.com | 19664 | Just talked to Vito Corleone • fred@gf.com | 23423 | Fredo’s SSN is 716905534 • sonny@gf.com | 99644 | Sonny is moving to Nevada • NA | 02945 | It is expected to rain tomorrow • Sensitive elements flagged: Email (Email column); ID (Customer ID column); Name, SSN, State (Transcript column)
  • 27. Validating the knowns & finding the unknowns: Structured and semi-structured data • Columns: Email | Customer ID | Transcript • Original: vcorleon@gf.com | 19664 | Just talked to Vito Corleone • fred@gf.com | 23423 | Fredo’s SSN is 716905534 • sonny@gf.com | 99644 | Sonny is moving to Nevada • NA | 02945 | It is expected to rain tomorrow • Sensitive elements flagged: Email (Email column); ID (Customer ID column); Name, SSN, State (Transcript column) • After masking: 4t23gttt | 7462391 | Just talked to Lebron James • 44e5325 | 1239474 | Melo’s SSN is 983441298 • 0we&yrw | 9983487 | Manu is moving to Texas • NA | 3344325 | It is expected to rain tomorrow
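As a rough illustration of "validating the knowns and finding the unknowns", the sketch below profiles each column of the sample table with two regex detectors. The detectors and function name are assumptions for the example. Names and states, which the slide also flags, would need dictionaries or ML models that are out of scope here.

```python
import re

# Assumed detectors; a real scanner would cover many more element types.
DETECTORS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{9}\b"),
}

rows = [
    {"Email": "vcorleon@gf.com", "Customer ID": "19664",
     "Transcript": "Just talked to Vito Corleone"},
    {"Email": "fred@gf.com", "Customer ID": "23423",
     "Transcript": "Fredo's SSN is 716905534"},
    {"Email": "sonny@gf.com", "Customer ID": "99644",
     "Transcript": "Sonny is moving to Nevada"},
    {"Email": "NA", "Customer ID": "02945",
     "Transcript": "It is expected to rain tomorrow"},
]


def profile_columns(table):
    """Map each column name to the set of sensitive element types found in it."""
    findings = {}
    for row in table:
        for column, value in row.items():
            for label, pattern in DETECTORS.items():
                if pattern.search(value):
                    findings.setdefault(column, set()).add(label)
    return findings


findings = profile_columns(rows)
```

The Email column confirms a "known" (it holds what its name says), while the SSN inside the free-text Transcript column is an "unknown" that column names alone would not reveal.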
  • 28. Finding the needles in the haystack: unstructured data • Customers → Call center • Transcript: “Your call will be recorded for quality assurance … this is Jonathan Franklin and my social is six one two one four five three zero nine is there any more information you need for my app …” • Detected: Social Security Number, Full name
  • 29. Protecting the needles in the haystack: unstructured data • Transcript after protection: “… this is Aaron Rodgers and my social is two three one six four zero nine one two is there any more information you need for my app …”
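One way to catch an SSN that is spoken as words in a call transcript is to look for a run of nine number words. The sketch below assumes that approach and redacts the match, whereas the slide shows the values being substituted with fictitious ones. All names are hypothetical, and full-name detection is not attempted.

```python
import re

WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
NUMBER_WORD = "|".join(WORD_TO_DIGIT)

# Nine number words in a row, separated by whitespace.
SPOKEN_SSN = re.compile(
    rf"\b(?:(?:{NUMBER_WORD})\s+){{8}}(?:{NUMBER_WORD})\b", re.IGNORECASE
)


def find_spoken_ssns(transcript: str) -> list:
    """Return each spoken nine-digit sequence converted to a digit string."""
    return [
        "".join(WORD_TO_DIGIT[word.lower()] for word in match.group().split())
        for match in SPOKEN_SSN.finditer(transcript)
    ]


def mask_spoken_ssns(transcript: str) -> str:
    """Replace each spoken nine-digit sequence with a redaction marker."""
    return SPOKEN_SSN.sub("[SSN REDACTED]", transcript)


transcript = (
    "this is Jonathan Franklin and my social is six one two one four five "
    "three zero nine is there any more information you need"
)
found = find_spoken_ssns(transcript)
masked = mask_spoken_ssns(transcript)
```

A run longer than nine number words would match only on its first nine, so a production detector would need additional context checks.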
  • 30. Scale and accuracy • Scanning strategies • Sampling • Location: Top, bottom, random, etc. • Amount: By percentage, by size, etc. • Machine learning • Low to no false positives • Intelligent detection • Parallel execution
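The sampling options listed on this slide (location: top, bottom, random; amount: by percentage) could look like the sketch below. The function name, parameters, and defaults are hypothetical; sampling by size and parallel execution are not shown.

```python
import random


def sample_lines(lines, location="top", percent=10.0, seed=None):
    """Pick a subset of lines to scan, by location and by percentage."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    # Scan at least one line whenever there is any data at all.
    count = max(1, round(len(lines) * percent / 100)) if lines else 0
    if location == "top":
        return lines[:count]
    if location == "bottom":
        return lines[-count:] if count else []
    if location == "random":
        rng = random.Random(seed)
        indices = sorted(rng.sample(range(len(lines)), count))
        return [lines[i] for i in indices]
    raise ValueError(f"unknown location: {location}")


data = [f"row {i}" for i in range(100)]
top = sample_lines(data, "top", 10)
bottom = sample_lines(data, "bottom", 10)
rand = sample_lines(data, "random", 10, seed=42)
```

Sampling trades coverage for speed: scanning 10% of a file is faster but can miss sensitive values that appear only in the unsampled rows.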
  • 31. What Dataguise scans and protects on-premises • Supported databases (SQL Server shown) • Supported Hadoop distributions • Other supported repositories
  • 32. What Dataguise scans and protects in AWS: Amazon S3, Amazon Aurora, Amazon RDS, Amazon EMR, Amazon Redshift
  • 34. Sleeping disorder device manufacturer • Continuous ingestion of data from sleeping devices used by their patients, in CSV and Parquet files • Customer created a data lake on Amazon S3 • PHI data needs to be masked before landing in data lake • Customer uses a concept of microbatches, where each microbatch = 10 min., and in this time it ingests almost 1.5 GB of data • Dataguise needs to identify and mask data in < 5 min. (+1 min. tolerance)
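The numbers on this slide imply a minimum sustained throughput for the identify-and-mask step. The quick calculation below assumes decimal units (1 GB = 1000 MB), treats 1.5 GB as the full microbatch size, and reads "+1 min. tolerance" as a hard limit of 6 minutes. These readings are assumptions, not statements from the presenters.

```python
BATCH_SIZE_GB = 1.5        # "almost 1.5 GB" per 10-minute microbatch
TARGET_MINUTES = 5         # identify and mask in under 5 minutes
TOLERANCE_MINUTES = 1      # assumed to extend the hard limit to 6 minutes


def required_throughput_mb_per_s(size_gb: float, minutes: float) -> float:
    """Throughput needed to process size_gb within the given number of minutes."""
    return size_gb * 1000 / (minutes * 60)


target = required_throughput_mb_per_s(BATCH_SIZE_GB, TARGET_MINUTES)
worst_case = required_throughput_mb_per_s(
    BATCH_SIZE_GB, TARGET_MINUTES + TOLERANCE_MINUTES
)
print(f"target: {target:.2f} MB/s, with tolerance: {worst_case:.2f} MB/s")
```

Under these assumptions the pipeline has to sustain about 5 MB/s to meet the 5-minute target, or a little over 4 MB/s if the full tolerance is used.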
  • 35. The solution Landing zone Safe zone Devices Amazon S3 AWS Cloud
  • 36. Large pharmaceutical company • Continuous ingestion of data from various healthcare companies in the form of JSON and CSV files in Amazon S3 • Customer created a data lake on Amazon S3 • AWS Lambda functions detect when a new file lands in the staging area and kick off all the APIs • PHI data needs to be detected and masked before landing in data lake • Customer uses a staging area where Dataguise and other tools are used to identify sensitive data, profile data, run antivirus scans, etc. before the data is moved into the data lake
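A Lambda function triggered by an S3 object-created notification could be shaped roughly like the sketch below. The event layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`, with URL-encoded keys) follows the standard S3 notification format. The `start_staging_checks` helper, the bucket name, and the list of checks are hypothetical placeholders, since the slides do not say which APIs are called.

```python
from urllib.parse import unquote_plus


def start_staging_checks(bucket: str, key: str) -> dict:
    """Hypothetical placeholder for kicking off the staging-area tools."""
    return {
        "bucket": bucket,
        "key": key,
        "checks": ["sensitive-data scan", "data profile", "antivirus"],
    }


def lambda_handler(event, context):
    """Handle an S3 notification: one staging job per newly landed object."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        jobs.append(start_staging_checks(bucket, key))
    return jobs


# Local usage example with a hand-built event (bucket and key are made up).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-staging-bucket"},
                "object": {"key": "incoming/patient+data.csv"}}}
    ]
}
jobs = lambda_handler(sample_event, None)
```

Only after every check passes would the file be moved from the staging area into the data lake; that move is left out of the sketch.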
  • 37. The solution Staging area Safe zone Source Amazon S3 AWS Cloud Data profile Antivirus Source Source
  • 38. Thank you!