© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
An Architecture for Trade Capture
and Regulatory Reporting
Ashish Majmundar, Global Financial Services Principal Solutions Architect
John Kain, Capital Markets Business Development Lead
FSV302
November 27, 2017
What to Expect From This Session
• Identifying the challenges in architecting a data lake that meets the
unique requirements for regulatory reporting
• A pattern for ingestion, processing, and transformation of
semistructured data in a secure and auditable data repository that can
be used for a variety of reporting and analytics applications
• An implementation of consolidated audit trail (CAT) reporting using
AWS services integrated with herd, an open-source unified data catalog
framework
Agenda
• Review of Regulatory Reporting Challenges
• Consolidated Audit Trail
• Architecture for Consolidated Audit Trail Reporting
• Security and Lineage Framework
• FIX Message Ingestion
• Message Transformation and Optimization
• Reporting and Analytics Tools
• Recap
Regulatory Reporting and the
Consolidated Audit Trail
Today’s Regulatory Reporting Landscape
Financial institutions face challenges capturing, cleaning, organizing, and reporting for an array of
regulators and regulatory frameworks along with new expectations of fine-grained, n-dimensional reporting
with data lineage and governance controls.
[Word cloud of regulators and frameworks: EMA, PRA, Treasury, FDIC, FFIEC, BASEL, Dodd-Frank, NMS, MiFID II, BCBS 239, CCAR, ESMA, RDA, FR Y-9C]
Current Architecture Challenges
Legacy System Fragmentation
• Data stored in multiple disconnected data silos
• Silos don’t provide lineage back to source data
• Distributed ETL processes at multiple levels, inconsistent
transformation between silos
Static Infrastructure vs. Dynamic Data
• Slow to onboard new data sources
• Slow to adapt to data format changes
• Slow to build new types of reports
• Slow to share data across teams and with regulators
Regulatory Reporting Challenges
• Diversity of sources and formats
• Massive data volumes
• Stringent SLAs (and fines)
• Security
• Single record of truth with lineage and recreatability
A More Strategic Approach to Reporting
Financial institutions are viewing their reporting obligations as a catalyst to pursue
broader data management objectives that can help unlock the value of their data.
• Business benefits
• Enhanced data governance
• Improved efficiency
Real-Life Example: SEC Rule 613
“… plan to create, implement, and maintain a consolidated order tracking system, or consolidated audit trail, with respect to the trading of reportable securities …”
Trading to CAT
[Diagram components: Client/Firm, Broker Dealer, Exchanges, FIX Protocol, CAT Consolidator, Regulators]
8=FIX.4.0|9=0298|35=8|49=AWSHUB|56=BRAES|50=OR68|57=PFDR|34=427628|52=20141011-15:22:35|6=38.15500|11=B17238605x1c75s1|14=100|17=32480427|30=13|31=38.15500|32=100|37=5000914|38=600|39=1|40=P|44=0.00000|54=1|55=GYMB|59=0|60=20141011-15:22:35|63=0|75=20141011|76=AWS|20=0|7100=C,M|7101=M|7107=38.20,A,L|7108=100
{
  "type": "MEOT",
  "reporter": "AWSHUB",
  "eventTimestamp": "20141011T152235.023471",
  "sequenceNumber": 1199,
  "symbol": "GYMB",
  "tradeID": "32480427",
  "quantity": 100,
  "price": 38.155,
  "buyDetails": {
    "side": "Buy",
    "leavesQty": 0,
    "orderID": "B17238605x1c75s1",
    "capacity": "Agency",
    "clearingNumber": "0002",
    "liquidityCode": "A"
  },
  "nbbPrice": 38.16,
  "nbbQty": 200,
  "nboPrice": 38.15,
  "nboQty": 500,
  "nbboSource": "SIP",
  "nbboTimestamp": "20141011T152235.023317"
}
FIX – CAT/JSON
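The FIX-to-JSON step on this slide can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation: the tag numbers are standard FIX tags, but the tag-to-field mapping, the flat output shape, and the use of `|` in place of the non-printable SOH delimiter are assumptions made for readability.

```python
import json

SOH = "\x01"  # standard FIX field delimiter (non-printable)

# Standard FIX tags mapped to the field names used in the JSON sample.
# The mapping itself is an assumption made for this sketch.
TAG_TO_FIELD = {
    "49": "reporter",   # SenderCompID
    "55": "symbol",     # Symbol
    "17": "tradeID",    # ExecID
    "32": "quantity",   # LastShares
    "31": "price",      # LastPx
    "11": "orderID",    # ClOrdID
}
SIDE = {"1": "Buy", "2": "Sell"}  # FIX tag 54 (Side)


def parse_fix(raw, delimiter=SOH):
    """Split a raw FIX message into a {tag: value} dictionary."""
    fields = {}
    for pair in raw.split(delimiter):
        if not pair:
            continue  # skip the empty piece after a trailing delimiter
        tag, _, value = pair.partition("=")
        fields[tag] = value
    return fields


def to_cat_event(fields):
    """Build a flat, CAT-style record from parsed FIX fields."""
    event = {name: fields[tag] for tag, name in TAG_TO_FIELD.items() if tag in fields}
    if "quantity" in event:
        event["quantity"] = int(event["quantity"])
    if "price" in event:
        event["price"] = float(event["price"])
    side = SIDE.get(fields.get("54"))
    if side is not None:
        event["side"] = side
    return event


# A shortened version of the slide's message, with "|" standing in for SOH.
sample = "8=FIX.4.0|35=8|49=AWSHUB|55=GYMB|17=32480427|32=100|31=38.15500|54=1|11=B17238605x1c75s1"
event = to_cat_event(parse_fix(sample, delimiter="|"))
print(json.dumps(event, indent=2))
```

A production converter would also need the enrichment fields shown in the JSON sample (NBBO prices, capacity, sequence numbers), which are not present in the FIX message itself.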
Consolidated Audit Trail Reporting
Architecture
CAT Reporting Pipeline on AWS
[Pipeline diagram. Data flow labels: FIX Messages, Single Source of Truth, Transform and Optimize, Optimized Data Repository, Transaction Linking and Transformation, Regulatory Report, Business Intelligence, Ad-hoc Data Analysis. Stage labels: FIX Ingestion, Transform FIX to Parquet, CAT Reporting, Trade Analytics.]
CAT Reporting Architecture on AWS
[Architecture diagram. On premises: Internal App, On-premises HSM (optional, BYO Key). In the AWS Region: multipart upload of encrypted data, Amazon S3 data lake, Amazon Glacier (WORM storage), transient Amazon EMR clusters for ETL (cleansed, formatted, split, compressed output), Amazon S3 Data Warehouse, transient Amazon EMR clusters for event sequencing, CAT output, herd Metadata Store, AWS KMS, CloudWatch Alarm, AWS CloudTrail.]
Core Services Being Used
• Amazon S3
• Amazon Glacier
• Amazon CloudWatch
• AWS CloudTrail
• AWS KMS
• Amazon EMR
• Amazon Athena
• Amazon QuickSight
• AWS Direct Connect
Security
Security Framework
• Network Isolation (AWS Direct Connect, VPC, VPN)
• Encryption (data in transit, data at rest)
• Auditing (CloudTrail, CloudWatch)
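One way to enforce the encryption items above at the storage layer is a bucket policy, sketched here as a Python dictionary. The deck does not say which mechanism the presenters used, so this is an assumption: the bucket name is hypothetical, and the policy relies on the `aws:SecureTransport` and `s3:x-amz-server-side-encryption` condition keys to reject plain-HTTP requests and uploads that do not request SSE-KMS.

```python
import json


def encryption_policy(bucket):
    """Build a bucket policy that denies non-TLS access and non-KMS uploads."""
    arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Data in transit: reject any request not made over TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [arn, f"{arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Data at rest: reject uploads that do not ask for SSE-KMS.
                "Sid": "DenyUploadsWithoutKms",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{arn}/*",
                "Condition": {
                    "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
                },
            },
        ],
    }


# "example-cat-data-lake" is a placeholder bucket name.
policy = encryption_policy("example-cat-data-lake")
print(json.dumps(policy, indent=2))
```

If the data is instead encrypted client-side before upload, as the "BYO Key" label may suggest, the second statement would need to be adapted or dropped.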
Lineage
Lineage Framework – herd
• Unified data catalog: a centralized, auditable catalog for operational usage and data governance
• Track lineage: capture data ancestry for regulatory, forensic, and analytical purposes
herd is a FINRA-built, open-source framework that tracks and catalogs data in a
unified data repository in order to capture audit and data lineage information
Integrating herd
[Diagram components: ETL Import/Export, ETL Transformation, herd Metadata Store]
• All ETL applications update the herd store with input, output, and ETL application version
• herd usage validated by CloudTrail logs
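The registration pattern in the first bullet can be illustrated with a small in-memory stand-in. This is not herd's API: the `Catalog` class, its methods, and the S3 paths are hypothetical, and the sketch only shows why recording input, output, and application version per job is enough to walk lineage back to the source data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageRecord:
    output: str        # dataset produced by the ETL job
    inputs: tuple      # datasets the job read
    app: str           # ETL application name
    app_version: str   # ETL application version


class Catalog:
    """Hypothetical in-memory metadata store (a stand-in, not herd itself)."""

    def __init__(self):
        self._records = {}

    def register(self, output, inputs, app, app_version):
        """Record what a job read, what it wrote, and which version ran."""
        self._records[output] = LineageRecord(output, tuple(inputs), app, app_version)

    def ancestry(self, dataset):
        """Return every upstream dataset, nearest ancestor first."""
        ancestors = []
        pending = [dataset]
        while pending:
            record = self._records.get(pending.pop(0))
            if record is None:
                continue  # a raw source with no registered producer
            for parent in record.inputs:
                if parent not in ancestors:
                    ancestors.append(parent)
                    pending.append(parent)
        return ancestors


catalog = Catalog()
catalog.register("s3://example/parquet/2014-10-11", ["s3://example/fix/2014-10-11"],
                 app="fix-to-parquet", app_version="1.0")
catalog.register("s3://example/cat/2014-10-11", ["s3://example/parquet/2014-10-11"],
                 app="event-sequencing", app_version="1.2")
print(catalog.ancestry("s3://example/cat/2014-10-11"))
```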
FIX Ingestion
FIX Ingestion – End of Day
[Diagram components: Internal App, AWS Direct Connect, multipart upload of encrypted data, S3 data lake, Amazon Glacier (WORM storage)]
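Planning the parts for an end-of-day multipart upload might look like the sketch below. It only computes part boundaries and does not call any AWS API. The 64 MiB default part size is an assumption; the 5 MiB minimum part size and the 10,000-part ceiling are S3's documented multipart limits.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum part size (the last part may be smaller)
MAX_PARTS = 10_000        # S3 maximum number of parts per multipart upload


def plan_parts(total_size, part_size=64 * MIB):
    """Return (part_number, offset, length) tuples covering the whole file."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    # Respect the minimum part size, then grow the part size if the file
    # would otherwise need more than MAX_PARTS parts.
    part_size = max(part_size, MIN_PART_SIZE)
    smallest_allowed = -(-total_size // MAX_PARTS)  # ceiling division
    part_size = max(part_size, smallest_allowed)

    parts = []
    offset = 0
    number = 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


# A hypothetical 150 MiB end-of-day FIX file.
parts = plan_parts(150 * MIB)
for number, offset, length in parts:
    print(number, offset, length)
```

Each tuple could then drive one part upload, with the parts sent in parallel and retried independently.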
Message Transformation and Optimization
Message Optimization
[Diagram components: FIX data in the S3 data lake, EMRFS, transient EMR clusters for ETL (core nodes and task nodes), EMRFS, Parquet data in the S3 Data Repository]
Message Optimization
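[Bar chart: average scan rate (kHz), on an axis from 0 to 600, for kudu, parquet, hbase, avro, and mapfile, with series for no compression, Snappy, and Gzip/BZip2. Data labels as extracted, series mapping not recoverable: 136, 237, 100, 232, 44, 345, 488, 130, 215, 260, 435, 109, 62]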
Reporting and Analytics
Reporting
[Diagram components: S3 data lake, S3 Data Warehouse (Parquet), EMRFS, transient EMR clusters, Athena, JSON output to the CAT Consolidator]
Athena Creating Tables – Parquet
-- clearingNumber is declared only as a partition column, because a column
-- name cannot appear in both the column list and PARTITIONED BY.
CREATE EXTERNAL TABLE db_name.transactions (
  reporter STRING,
  event_timestamp TIMESTAMP,
  symbol STRING,
  tradeID STRING,
  quantity INT,
  price DOUBLE,
  side INT,
  liquidity INT
)
PARTITIONED BY (YEAR INT, MONTH INT, DAY INT, CLEARINGNUMBER STRING)
STORED AS PARQUET
LOCATION 's3://fsi-sandbox/catarch/parquet'
TBLPROPERTIES ('has_encrypted_data'='true');
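For the partitions declared in the table definition to be discoverable, the Parquet files are commonly laid out under Hive-style `key=value` prefixes. The sketch below derives such a prefix from an event timestamp in the format used by the CAT JSON sample. Whether the presenters' ETL jobs used exactly this layout is an assumption, and `partition_prefix` is a hypothetical helper.

```python
from datetime import datetime


def partition_prefix(base, event_timestamp, clearing_number):
    """Build a Hive-style partition prefix for one trade event.

    event_timestamp uses the format from the JSON sample,
    for example "20141011T152235.023471".
    """
    ts = datetime.strptime(event_timestamp, "%Y%m%dT%H%M%S.%f")
    return (
        f"{base.rstrip('/')}"
        f"/year={ts.year}/month={ts.month}/day={ts.day}"
        f"/clearingnumber={clearing_number}/"
    )


prefix = partition_prefix(
    "s3://fsi-sandbox/catarch/parquet", "20141011T152235.023471", "0002"
)
print(prefix)
```

With this layout, a query that filters on year, month, day, or clearing number only needs to read the matching prefixes rather than the whole dataset.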
Analytics
[Diagram components: Parquet data in the S3 data lake, EMRFS, Athena, Amazon QuickSight, analytics users]
Ad-hoc Data Analysis: A Typical Situation
Request: provide all the trades in ABC Corp in the last five years
Data to search: 9 TB, covering 2012 through 2016
Options?
What Are Your Options?
Option 1: Keep it online all the time
Option 2: Archive the data, and upon request, stand up the database server, restore the data, and then query the data
Option 3: Query data at rest using Amazon Athena or Amazon Redshift Spectrum
[Diagram: ad-hoc queries from Amazon Athena against the Amazon S3 data lake, labeled "$45 for 9 TB scanned"]
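The "$45 for 9 TB scanned" figure implies a rate of $5 per TB scanned, and the arithmetic can be sketched as below. The one-year example assumes the 9 TB is spread evenly across the five years and that partitioning lets the query skip the other four, neither of which the slide states.

```python
from decimal import Decimal

# Rate implied by the slide: $45 / 9 TB = $5 per TB scanned.
PRICE_PER_TB = Decimal("5")


def scan_cost(tb_scanned, price_per_tb=PRICE_PER_TB):
    """Cost in dollars of a query that scans tb_scanned terabytes."""
    return Decimal(str(tb_scanned)) * price_per_tb


full_scan = scan_cost(9)                   # all five years
one_year = scan_cost(Decimal("9") / 5)     # assumes an even 1.8 TB per year
print(f"full scan: ${full_scan}, one year: ${one_year}")
```

Because the charge follows the bytes scanned, columnar formats and partition pruning reduce cost as well as query time.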
Amazon QuickSight: Import Dataset
Amazon QuickSight: One-click Visualization
Analytics
Parquet-formatted S3 Data Warehouse
Recap
• Identified the challenges in architecting a data lake that meets the
unique requirements for regulatory reporting: security, lineage, scale,
and elasticity
• Reviewed an architecture for ingestion, processing, and transformation
of a FIX dataset into a data repository that can be used for a variety of
reporting and analytics applications
• Demonstrated a reference implementation of CAT reporting using AWS
services integrated with herd
THANK YOU!

Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
 



FSV302_An Architecture for Trade Capture and Regulatory Reporting

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. An Architecture for Trade Capture and Regulatory Reporting. Ashish Majmundar, Global Financial Services Principal Solutions Architect; John Kain, Capital Markets Business Development Lead. FSV302, November 27, 2017.
  • 2. What to Expect From This Session
      • Identifying the challenges in architecting a data lake that meets the unique requirements for regulatory reporting
      • A pattern for ingestion, processing, and transformation of semistructured data in a secure and auditable data repository that can be used for a variety of reporting and analytics applications
      • An implementation of consolidated audit trail (CAT) reporting using AWS services integrated with herd, an open-source unified data catalog framework
  • 3. Agenda
      • Review of Regulatory Reporting Challenges
      • Consolidated Audit Trail
      • Architecture for Consolidated Audit Trail Reporting
      • Security and Lineage Framework
      • FIX Message Ingestion
      • Message Transformation and Optimization
      • Reporting and Analytics Tools
      • Recap
  • 4. Regulatory Reporting and the Consolidated Audit Trail
  • 5. Today’s Regulatory Reporting Landscape. Financial institutions face challenges capturing, cleaning, organizing, and reporting for an array of regulators and regulatory frameworks, along with new expectations of fine-grained, n-dimensional reporting with data lineage and governance controls. Regulators and frameworks shown: EMA, PRA, Treasury, FDIC, FFIEC, BASEL, Dodd-Frank, NMS, MiFID II, BCBS 239, CCAR, ESMA, RDA, FR Y-9C.
  • 6. Current Architecture Challenges
      Legacy System Fragmentation
      • Data stored in multiple disconnected data silos
      • Silos don’t provide lineage back to source data
      • Distributed ETL processes at multiple levels, with inconsistent transformation between silos
      Static Infrastructure vs. Dynamic Data
      • Slow to onboard new data sources
      • Slow to adapt to data format changes
      • Slow to build new types of reports
      • Slow to share data across teams and with regulators
  • 7. Regulatory Reporting Challenges
      • Diversity of sources and formats
      • Massive data volumes
      • Stringent SLAs (and fines)
      • Security
      • Single record of truth with lineage and recreatability
  • 8. A More Strategic Approach to Reporting. Financial institutions are viewing their reporting obligations as a catalyst to pursue broader data management objectives that can help unlock the value of their data. Outcomes shown: business benefits, enhanced data governance, improved efficiency.
  • 9. Real-Life Example: SEC Rule 613. “… plan to create, implement, and maintain a consolidated order tracking system, or consolidated audit trail, with respect to the trading of reportable securities …”
  • 10. Trading to CAT (diagram labels): Client/Firm, Broker Dealer, Exchanges, FIX Protocol, CAT Consolidator, Regulators
  • 11. FIX – CAT/JSON
      FIX execution report (SOH field delimiters shown as "|"):
      8=FIX.4.0|9=0298|35=8|49=AWSHUB|56=BRAES|50=OR68|57=PFDR|34=427628|52=20141011-15:22:35|6=38.15500|11=B17238605x1c75s1|14=100|17=32480427|30=13|31=38.15500|32=100|37=5000914|38=600|39=1|40=P|44=0.00000|54=1|55=GYMB|59=0|60=20141011-15:22:35|63=0|75=20141011|76=AWS|20=0|7100=C,M|7101=M|7107=38.20,A,L|7108=100
      Corresponding CAT event in JSON:
      {
        "type": "MEOT",
        "reporter": "AWSHUB",
        "eventTimestamp": "20141011T152235.023471",
        "sequenceNumber": 1199,
        "symbol": "GYMB",
        "tradeID": "32480427",
        "quantity": 100,
        "price": 38.155,
        "buyDetails": {
          "side": "Buy",
          "leavesQty": 0,
          "orderID": "B17238605x1c75s1",
          "capacity": "Agency",
          "clearingNumber": "0002",
          "liquidityCode": "A"
        },
        "nbbPrice": 38.16,
        "nbbQty": 200,
        "nboPrice": 38.15,
        "nboQty": 500,
        "nbboSource": "SIP",
        "nbboTimestamp": "20141011T152235.023317"
      }
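The FIX-to-JSON conversion on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' implementation: the tag numbers are standard FIX 4.0 tags, but the hard-coded "MEOT" event type, the `sellDetails` key, and the choice of which tags feed which JSON fields are assumptions made for the example.

```python
import json

SOH = "\x01"  # standard FIX field delimiter


def parse_fix(raw: str, delimiter: str = SOH) -> dict:
    """Split a raw FIX message into a {tag_number: value} dict."""
    fields = {}
    for pair in raw.strip(delimiter).split(delimiter):
        tag, _, value = pair.partition("=")
        fields[int(tag)] = value
    return fields


def to_cat_event(fix: dict) -> dict:
    """Map an execution report (35=8) to a CAT-style event (illustrative mapping)."""
    if fix.get(35) != "8":
        raise ValueError("expected an execution report (MsgType 35=8)")
    is_buy = fix[54] == "1"  # FIX Side: 1 = Buy, 2 = Sell
    # "sellDetails" is a hypothetical key; the slide only shows "buyDetails".
    details_key = "buyDetails" if is_buy else "sellDetails"
    return {
        "type": "MEOT",  # assumed constant, copied from the slide
        "reporter": fix[49],  # SenderCompID
        "eventTimestamp": fix[60].replace("-", "T").replace(":", ""),  # TransactTime
        "symbol": fix[55],
        "tradeID": fix[17],  # ExecID
        "quantity": int(fix[32]),  # LastShares
        "price": float(fix[31]),  # LastPx
        details_key: {
            "side": "Buy" if is_buy else "Sell",
            "orderID": fix[11],  # ClOrdID
        },
    }


# A shortened version of the slide's message, with "|" standing in for SOH.
sample = (
    "8=FIX.4.0|35=8|49=AWSHUB|11=B17238605x1c75s1|17=32480427|"
    "31=38.15500|32=100|54=1|55=GYMB|60=20141011-15:22:35"
)
event = to_cat_event(parse_fix(sample, delimiter="|"))
print(json.dumps(event, indent=2))
```

Fields such as the NBBO quotes and sequence number on the slide come from sources other than the single FIX message, so they are left out of this sketch.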
  • 12. Consolidated Audit Trail Reporting Architecture
  • 13. CAT Reporting Pipeline on AWS (diagram labels)
      Data: FIX Messages, Single Source of Truth, Transform and Optimize, Optimized Data Repository, Transaction Linking and Transformation, Regulatory Report, Business Intelligence, Ad-hoc Data Analysis
      Steps: FIX Ingestion, Transform FIX to Parquet, CAT Reporting, Trade Analytics
  • 14. CAT Reporting Architecture on AWS (diagram labels)
      On premises: Internal App, On-premises HSM (optional), BYO Key
      Region: Multipart upload of encrypted data; Amazon S3 data lake; Transient Amazon EMR Clusters for ETL; Cleansed, Formatted, Split, Compressed Output; Amazon S3 Data Warehouse; Transient Amazon EMR Clusters for Event Sequencing; CAT output; herd Metadata Store; AWS KMS; AWS CloudTrail; CloudWatch Alarm; Amazon Glacier (WORM storage)
  • 15. Core Services Being Used: Amazon S3, Amazon Glacier, Amazon CloudWatch, AWS CloudTrail, AWS KMS, Amazon EMR, Amazon Athena, Amazon QuickSight, AWS Direct Connect
  • 16. Security (section slide; repeats the CAT reporting architecture diagram from slide 14)
  • 17. Security Framework
      • Network isolation (AWS Direct Connect, VPC, VPN)
      • Encryption (data in transit, data at rest)
      • Auditing (CloudTrail, CloudWatch)
  • 19. Lineage (section slide; repeats the CAT reporting architecture diagram from slide 14)
  • 20. Lineage Framework – herd
      herd is a FINRA-built, open-source framework that tracks and catalogs data in a unified data repository in order to capture audit and data lineage information.
      • Unified data catalog: a centralized, auditable catalog for operational usage and data governance
      • Track lineage: capture data ancestry for regulatory, forensic, and analytical purposes
  • 21. Integrating herd (diagram labels: ETL Import/Export, ETL Transformation, herd Metadata Store)
      • All ETL applications update the herd store with input, output, and ETL application version
      • herd usage validated by CloudTrail logs
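The lineage idea on this slide (every ETL step records its inputs, output, and application version) can be illustrated with a small in-memory catalog. This is a hedged sketch only: `LineageCatalog`, `register`, and `ancestry` are hypothetical names invented for the example and are not the herd API, and the S3 paths are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LineageRecord:
    """One ETL run: which inputs produced which output, and with what code."""
    output: str
    inputs: Tuple[str, ...]
    etl_app: str
    etl_version: str


class LineageCatalog:
    """Hypothetical stand-in for a metadata store; not the herd API."""

    def __init__(self) -> None:
        self._by_output: Dict[str, LineageRecord] = {}

    def register(self, output: str, inputs: List[str],
                 etl_app: str, etl_version: str) -> LineageRecord:
        record = LineageRecord(output, tuple(inputs), etl_app, etl_version)
        self._by_output[output] = record
        return record

    def ancestry(self, dataset: str) -> List[str]:
        """Return every upstream dataset, nearest ancestors first."""
        seen, order = set(), []
        record = self._by_output.get(dataset)
        queue = list(record.inputs) if record else []
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.add(current)
            order.append(current)
            parent = self._by_output.get(current)
            if parent:
                queue.extend(parent.inputs)
        return order


catalog = LineageCatalog()
# Placeholder dataset names; the real bucket layout is not given in the slides.
catalog.register("s3://example/parquet/2014-10-11",
                 ["s3://example/raw-fix/2014-10-11"], "fix-to-parquet", "1.0.0")
catalog.register("s3://example/cat/2014-10-11",
                 ["s3://example/parquet/2014-10-11"], "event-sequencing", "2.3.1")
print(catalog.ancestry("s3://example/cat/2014-10-11"))
```

Walking the records backwards from a CAT output reaches the raw FIX file it was derived from, which is the kind of ancestry question a lineage catalog is meant to answer.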
  • 23. FIX Ingestion (section slide; repeats the CAT reporting architecture diagram from slide 14)
  • 24. FIX Ingestion – End of Day (diagram labels): Internal App, AWS Direct Connect, Multipart upload of encrypted data, S3 data lake, Amazon Glacier (WORM storage)
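The multipart upload step above splits a large end-of-day file into parts. The sketch below only plans the part boundaries in plain Python; it makes no AWS calls and leaves out encryption entirely. The 5 MiB minimum part size and 10,000-part maximum are documented Amazon S3 limits, while the function name and the 64 MiB default are assumptions chosen for illustration.

```python
import math
from typing import List, Tuple

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # S3 minimum for every part except the last
MAX_PARTS = 10_000       # S3 maximum number of parts per upload


def plan_parts(total_size: int, part_size: int = 64 * MIB) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) tuples covering total_size bytes."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    part_size = max(part_size, MIN_PART_SIZE)
    # Grow the part size if the file would otherwise need more than MAX_PARTS.
    part_size = max(part_size, math.ceil(total_size / MAX_PARTS))
    parts = []
    offset, number = 0, 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


# A hypothetical 150 MiB end-of-day FIX file.
for number, offset, length in plan_parts(150 * MIB):
    print(number, offset, length)
```

Each tuple corresponds to one part that an uploader would read from the file and send independently, which is what allows parts to be uploaded in parallel and retried individually.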
  • 26. Message Transformation and Optimization (section slide; repeats the CAT reporting architecture diagram from slide 14)
  • 27. Message Optimization (diagram labels): FIX, S3 data lake, EMRFS, Transient EMR Clusters for ETL (Core Nodes, Task Nodes), EMRFS, Parquet, S3 Data Repository
  • 28. Message Optimization (bar chart): average scan rate (kHz) for kudu, parquet, hbase, avro, and mapfile, with series for no compression, Snappy, and Gzip/BZip2. Bar values as extracted: 136, 237, 100, 232, 44, 345, 488, 130, 215, 260, 435, 109, 62.
  • 30. Reporting and Analytics (section slide; repeats the CAT reporting architecture diagram from slide 14)
  • 32. Athena Creating Tables – Parquet
      CREATE EXTERNAL TABLE db_name.transactions (
        reporter STRING,
        event_timestamp TIMESTAMP,
        symbol STRING,
        tradeID STRING,
        quantity INT,
        price DOUBLE,
        side INT,
        liquidity INT
      )
      PARTITIONED BY (year INT, month INT, day INT, clearingNumber STRING)
      STORED AS PARQUET
      LOCATION 's3://fsi-sandbox/catarch/parquet'
      TBLPROPERTIES ('has_encrypted_data'='true');
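The table above is partitioned by year, month, day, and clearing number. One common way to lay out such data in S3 is Hive-style `key=value` prefixes, which Athena can map to partitions. The helper below is a sketch under that assumption; the slides do not state the actual key layout, and `partition_prefix` is a hypothetical name.

```python
from datetime import datetime

# Timestamp format used by the CAT JSON events shown earlier in the deck.
CAT_TIMESTAMP_FORMAT = "%Y%m%dT%H%M%S.%f"


def partition_prefix(base: str, event_timestamp: str, clearing_number: str) -> str:
    """Build a Hive-style partition prefix (assumed layout) for one event."""
    ts = datetime.strptime(event_timestamp, CAT_TIMESTAMP_FORMAT)
    return (
        f"{base.rstrip('/')}/year={ts.year}/month={ts.month}/day={ts.day}"
        f"/clearingnumber={clearing_number}/"
    )


prefix = partition_prefix(
    "s3://fsi-sandbox/catarch/parquet",  # LOCATION from the DDL above
    "20141011T152235.023471",
    "0002",
)
print(prefix)
```

With a layout like this, a query that filters on year, month, day, or clearing number only needs to read the matching prefixes rather than the whole repository.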
  • 34. Ad-hoc Data Analysis: A Typical Situation. Request: provide all the trades in ABC Corp in the last five years. That means 9 TB of data spanning 2012 through 2016. What are the options?
  • 35. What Are Your Options?
      • Option 1: Keep it online all the time
      • Option 2: Archive the data, and upon request, stand up the database server, restore the data, and then query the data
      • Option 3: Query data at rest in the Amazon S3 data lake with ad-hoc queries using Amazon Athena or Amazon Redshift Spectrum ($45 for 9 TB scanned)
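The $45 figure on this slide works out to $5 per TB scanned (45 / 9). The arithmetic can be sketched as below; the per-TB rate is derived from the slide and may not match current pricing, and the 1 TB example is a hypothetical illustration of scanning less data, for instance through partition pruning or columnar Parquet files.

```python
PRICE_PER_TB_SCANNED = 45 / 9  # USD per TB, derived from the slide's figure


def scan_cost(tb_scanned: float, price_per_tb: float = PRICE_PER_TB_SCANNED) -> float:
    """Estimated query cost in USD for a given amount of data scanned."""
    if tb_scanned < 0:
        raise ValueError("tb_scanned must be non-negative")
    return tb_scanned * price_per_tb


full_scan = scan_cost(9)    # scanning all five years
pruned_scan = scan_cost(1)  # hypothetical: only 1 TB actually scanned
print(f"full scan: ${full_scan:.2f}, pruned scan: ${pruned_scan:.2f}")
```

Because the charge scales with data scanned rather than data stored, formats and partitioning that reduce the bytes read per query lower the cost of each ad-hoc request directly.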
  • 40. Recap
      • Identified the challenges in architecting a data lake that meets the unique requirements for regulatory reporting: security, lineage, scale, and elasticity
      • Reviewed an architecture for ingestion, processing, and transformation of a FIX dataset into a data repository that can be used for a variety of reporting and analytics applications
      • Demonstrated a reference implementation of CAT reporting using AWS services integrated with herd
  • 41. THANK YOU!