Scalable and Resilient Security Ratings Platform with ScyllaDB
Nguyen Cao, Staff Data Engineer
Nguyen Cao
■ Staff Software Engineer at SecurityScorecard
■ Key member of data migration to ScyllaDB project
■ 8 years of experience building large-scale distributed systems
■ MSc in Computing Science, specialized in Big Data
Presentation Agenda
■ Introduction
■ Challenges & Improvements
■ Results
■ Lessons
Introduction
SecurityScorecard Mission
To make the world a safer place by transforming the way organizations understand, mitigate, and communicate cybersecurity to their Boards, employees, and vendors.
SecurityScorecard Security Rating
A Security Rating is an objective, data-driven, and quantifiable measure of an organization’s overall cybersecurity and cyber risk exposure. Ratings grade vendors and organizations on a scale of A through F.
SecurityScorecard provides quality insights, giving you the confidence to make fast and informed decisions about cybersecurity investments.
Companies with an F rating are 7.7x more likely to suffer a data breach than those with an A rating.
Entities with a Better Security Rating are More Resilient
SecurityScorecard Provides:
■ Continuous Visibility into Statewide Risk
■ Greater Visibility into Cyber Investments
■ Decreased Risk of Breaches Hurting the State and Taxpayers
SecurityScorecard Data Pipeline
Signal Collection
- IPv4 scan
- Malware Sinkholes
- DNS data
- External data feeds
Attribution Engine
- RIR, DNS, SSL data
- Domain discovery
- Subdomains
- IP-domain pairing
Cyber Analytics
- Investigate emerging threats
- CVEs
- Machine Learning
Scoring Engine
- Digital Footprint
- Size normalization
- Factor scores
- Total score
■ Global network of sensors deployed across 50 countries to spot zero-day threats
■ 4.1B IP addresses scanned every week
■ 100B+ vulnerabilities published weekly at trust.securityscorecard.com
■ 12M+ organizations continually scored every day
Risk Factors
■ Application Security
■ Hacker Chatter
■ Cubit Score
■ Social Engineering
■ Patching Cadence
■ DNS Health
■ Network Security
■ Endpoint Security
■ IP Reputation
■ Information Leak
Each detected security issue is measured under its assigned factor. Severity-based weights, the update cadence, and the age-out window determine how the issue enters the score calculation.
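The interplay of severity weights and an age-out window can be sketched as follows. This is only an illustration: the weight values, the `Issue` fields, and the function name `weighted_issue_load` are assumptions made for the example, not SecurityScorecard's actual scoring formula, and update cadence is left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical severity weights; the deck does not give the real values.
SEVERITY_WEIGHTS = {"low": 1.0, "medium": 3.0, "high": 9.0}


@dataclass(frozen=True)
class Issue:
    factor: str      # assigned risk factor, e.g. "Network Security"
    severity: str    # key into SEVERITY_WEIGHTS
    last_seen: date  # last day the issue was observed


def weighted_issue_load(issues, today, age_out_days):
    """Sum severity weights per factor, skipping issues past the age-out window."""
    cutoff = today - timedelta(days=age_out_days)
    load = {}
    for issue in issues:
        if issue.last_seen < cutoff:
            continue  # aged out: the issue no longer counts
        weight = SEVERITY_WEIGHTS[issue.severity]
        load[issue.factor] = load.get(issue.factor, 0.0) + weight
    return load
```

A real scoring engine would then normalize such per-factor loads (the pipeline slide mentions size normalization) before producing factor scores and a total score.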
Technical Challenges
Scoring Architecture (Pre 2022): OVERVIEW
[Architecture diagram. Components: ssc-platform-api, ssc-svc-measurements, Redis, Aurora, Presto Cluster, HDFS Cluster, Scoring Workflow, AWS EMR. Connections labeled: SQL query, SQL query, Redis query.]
■ 12M scorecards
■ 4B measurement stats for domains/IPs
■ 16TB historical measurement details for 1 year
Scoring Architecture (Pre 2022): HIGH LATENCY
[Same architecture diagram, annotated with the statements below.]
SELECT *
FROM measurement_details
WHERE scorecard IN (...)
AND date >= … AND date <= …
INSERT INTO measurement_stats
VALUES (...)
Scoring Architecture (Pre 2022): VERTICAL SCALABILITY
[Same architecture diagram, annotated with the labels below.]
■ largest possible ElastiCache instance
■ ssc-airflow-ops
■ NodeJS/TypeScript
■ Python
Scoring Architecture (Pre 2022): DATA IMMUTABILITY
[Same architecture diagram, annotated with the statements below.]
INSERT INTO measurement_details(...)
VALUES (...)
UPDATE measurement_details(...)
SET (...)
Scoring Architecture (Pre 2022): MAINTAINABILITY
[Same architecture diagram, with additional services: ssc-svc-users, ssc-svc-reports, ….]
Technical Improvements
ScyllaDB Migration
Scoring Architecture (Current): OVERVIEW
[Architecture diagram. Components: ssc-platform-api, ssc-svc-measurements, ssc-scoring-api, Scoring Workflow, S3, Presto Cluster, AWS EMR. Connections labeled: CQL query, REST API, SQL query.]
■ 12M scorecards
■ 4B measurement stats for domains/IPs
■ all historical measurement details
■ historical measurement details for 2 weeks
Scoring Architecture (Current): LOW LATENCY
[Same architecture diagram, annotated with the notes and statements below.]
■ schemas are designed based on access pattern
CREATE TABLE scorecard_detail (
uuid_company_id_key UUID,
total_score DOUBLE,
breach_impact DOUBLE,
…,
effective_date DATE,
PRIMARY KEY ((uuid_company_id_key), effective_date)
) WITH default_time_to_live = 32400000;
■ highly parallel processing tasks
SELECT *
FROM scorecard_detail
WHERE uuid_company_id_key IN (...)
AND effective_date >= … AND effective_date <= …
■ read throughput is stable even under high write workload
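One common way to get "highly parallel processing tasks" out of a multi-key read is to issue one single-partition query per key concurrently and merge the results, instead of sending one large IN query. The sketch below illustrates that pattern only. The deck does not show how ssc-scoring-api does it, and `fetch_partition`, `fetch_many`, and the in-memory `store` dict are hypothetical stand-ins for calls to a real ScyllaDB driver.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date


def fetch_partition(store, company_id, start, end):
    """Stand-in for one single-partition read keyed by uuid_company_id_key."""
    rows = store.get(company_id, [])
    return [r for r in rows if start <= r["effective_date"] <= end]


def fetch_many(store, company_ids, start, end, max_workers=8):
    """Run one read per partition key in parallel and merge the results by key."""
    ids = list(company_ids)  # materialize so the keys can be iterated twice
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_key = pool.map(lambda cid: fetch_partition(store, cid, start, end), ids)
        return dict(zip(ids, per_key))


# Usage with made-up data: the dict plays the role of the scorecard_detail table.
store = {
    "company-a": [
        {"effective_date": date(2024, 1, 1), "total_score": 90.0},
        {"effective_date": date(2024, 2, 1), "total_score": 80.0},
    ],
    "company-b": [],
}
result = fetch_many(store, ["company-a", "company-b"], date(2024, 1, 15), date(2024, 2, 15))
```

Because the partition key is the company ID and the clustering key is the date, each per-key read touches a single partition and a contiguous date range, which is what makes this fan-out cheap.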
Scoring Architecture (Current): HORIZONTAL SCALABILITY
[Same architecture diagram, annotated with the capacity figures below.]
■ 6 ECS instances, 12 GB
■ 12 nodes, 720 GB, 20 TB storage
■ infinite object storage
Scoring Architecture (Current): DATA ACCESS ABSTRACTION
[Same architecture diagram, with additional services: ssc-svc-users, ssc-svc-reports, ….]
■ access data in ScyllaDB for low-latency, high-volume requests
■ redirect all historical or high-latency requests, such as reporting, to Presto/S3
■ REST interface access for all FE services
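The routing rule on this slide can be sketched as a small decision function. This is a hedged illustration rather than the actual ssc-scoring-api logic: the function name `choose_backend`, the `latency_sensitive` flag, and the `hot_window_days` parameter are assumptions introduced for the example.

```python
from datetime import date, timedelta


def choose_backend(start, today, latency_sensitive, hot_window_days):
    """Pick a store for a read that begins at `start`.

    Recent, latency-sensitive reads go to ScyllaDB. Historical or
    latency-tolerant reads (for example reporting) go to Presto over S3.
    """
    oldest_hot_day = today - timedelta(days=hot_window_days)
    if latency_sensitive and start >= oldest_hot_day:
        return "scylladb"
    return "presto"
```

Putting this decision behind a single REST interface means front-end services never need to know which store answered the request.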
Results
Migrating to ScyllaDB brought benefits on several fronts:
■ 90% latency reduction for most service endpoints
■ 80% fewer production incidents related to Presto/Aurora performance
■ $1M in infrastructure cost savings per year
■ 30% faster data pipeline processing
■ Much better customer experience
Lessons
■ Route infrequent, complex, and high-latency-tolerant data access to OLAP engines like Presto or Athena (generating reports, custom analysis, etc.).
■ Build a scalable, highly parallel aggregation component (capable of in-memory JOINs, SELECT-IN queries, etc.) to overcome the current limits of CQL.
■ Design ScyllaDB schemas based on data access patterns to address latency issues.
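Since CQL has no JOIN, an aggregation component has to combine rows from separate tables in application memory. A minimal sketch of such an in-memory hash join follows. The function name `hash_join` and the sample column names are hypothetical, and the deck does not describe the real component's internals.

```python
def hash_join(left_rows, right_rows, key):
    """Inner-join two lists of dict rows on `key` using an in-memory index."""
    # Build the index once over the right side: key value -> matching rows.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        for right in index.get(left[key], []):
            # On a column name clash, the left row's value wins.
            joined.append({**right, **left})
    return joined
```

Building the index over the smaller input keeps memory use down, and the probe loop can be split across workers when the left side is large.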
Thank You
Stay in Touch
Nguyen Cao
ncao@securityscorecard.io
@ducnguyen_cao
https://github.com/nguyencaoduc
https://www.linkedin.com/in/nguyenduccao/

Scalable and Resilient Security Ratings Platform with ScyllaDB

  • 1. Scalable and Resilient Security Ratings Platform with ScyllaDB Nguyen Cao, Staff Data Engineer
  • 2. Nguyen Cao ■ Staff Software Engineer at SecurityScorecard ■ Key member of data migration to ScyllaDB project ■ 8 years of experience building large scale distributed systems ■ MSc in Computing Science, specialized in Big Data
  • 3. Presentation Agenda ■ Introduction ■ Challenges & Improvements ■ Results ■ Lessons
  • 5. SecurityScorecard Mission: To make the world a safer place by transforming the way organizations understand, mitigate, and communicate cybersecurity to their Boards, employees, and vendors.
  • 6. SecurityScorecard Security Rating: Security Rating is an objective, data-driven and quantifiable measure of an organization’s overall cybersecurity and cyber risk exposure. Ratings grade vendors and organizations on a scale of A through F. SecurityScorecard provides quality insights, giving you the confidence to make fast and informed decisions about cybersecurity investments. Companies with an F rating are 7.7x more likely to suffer a data breach versus those with an A rating. Entities with a Better Security Rating are More Resilient. SecurityScorecard Provides: Continuous Visibility into Statewide Risk; Greater Visibility into Cyber Investments; Decreased Risk of Breaches Hurting the State and Taxpayers.
  • 7. SecurityScorecard Data Pipeline. Signal Collection: IPv4 scan, malware sinkholes, DNS data, external data feeds. Attribution Engine: RIR, DNS, SSL data; domain discovery; subdomains; IP-domain pairing. Cyber Analytics: investigate emerging threats, CVEs, machine learning. Scoring Engine: digital footprint, size normalization, factor scores, total score. Global network of sensors deployed across 50 countries to spot zero-day threats. 4.1B IP addresses scanned every week. 100B+ vulnerabilities published weekly at trust.securityscorecard.com. 12M+ organizations continually scored every day. Risk Factors: Application Security, Hacker Chatter, Cubit Score, Social Engineering, Patching Cadence, DNS Health, Network Security, Endpoint Security, IP Reputation, Information Leak. Each detected security issue is measured under its assigned factor, and severity-based weights, update cadence and an age-out window determine how the score is calculated.
  • 9. Scoring Architecture (Pre 2022): OVERVIEW. Components: ssc-platform-api, ssc-svc-measurements, Redis, HDFS Cluster, Presto Cluster, Aurora, Scoring Workflow on AWS EMR. Access paths: SQL query, SQL query, Redis query. Data volumes: 12M scorecards; 4B measurement stats for domains/IPs; 16TB historical measurement details for 1 year.
  • 10. Scoring Architecture (Pre 2022): HIGH LATENCY. Components: ssc-platform-api, ssc-svc-measurements, Redis, HDFS Cluster, Presto Cluster, Aurora, Scoring Workflow on AWS EMR. Access paths: SQL query, SQL query, Redis query. Example queries: SELECT * FROM measurement_details WHERE scorecard IN (...) AND date >= … AND date <= …; INSERT INTO measurement_stats VALUES (...)
  • 11. Scoring Architecture (Pre 2022): VERTICAL SCALABILITY. Components: ssc-platform-api, ssc-svc-measurements, ssc-airflow-ops, Redis, HDFS Cluster, Presto Cluster, Aurora, Scoring Workflow on AWS EMR. Access paths: SQL query, SQL query, Redis query. Redis runs on the largest possible ElastiCache instance. Languages: NodeJS/TypeScript, Python.
  • 12. Scoring Architecture (Pre 2022): DATA IMMUTABILITY. Components: ssc-platform-api, ssc-svc-measurements, Redis, HDFS Cluster, Presto Cluster, Aurora, Scoring Workflow on AWS EMR. Access paths: SQL query, SQL query, Redis query. Example statements: INSERT INTO measurement_details(...) VALUES (...); UPDATE measurement_details(...) SET (...)
  • 13. Scoring Architecture (Pre 2022): MAINTAINABILITY. Components: ssc-platform-api, ssc-svc-measurements, ssc-svc-users, ssc-svc-reports, …, Redis, HDFS Cluster, Presto Cluster, Aurora, Scoring Workflow on AWS EMR. Access paths: SQL query, SQL query, Redis query.
  • 15. Scoring Architecture (Current): OVERVIEW. Components: ssc-platform-api, ssc-svc-measurements, ssc-scoring-api, S3, Presto Cluster, Scoring Workflow on AWS EMR. Access paths: CQL query, REST API, SQL query. Data volumes: 12M scorecards; 4B measurement stats for domains/IPs; all historical measurement details; historical measurement details for 2 weeks.
  • 16. Scoring Architecture (Current): LOW LATENCY. Components: ssc-platform-api, ssc-svc-measurements, ssc-scoring-api, S3, Presto Cluster, Scoring Workflow on AWS EMR. Access paths: CQL query, REST API, SQL query. Schema: scorecard_detail ( uuid_company_id_key UUID, total_score DOUBLE, breach_impact DOUBLE, …, effective_date DATE, PRIMARY KEY ((uuid_company_id_key), effective_date) ) WITH default_time_to_live = 32400000; Example query: SELECT * FROM scorecard_detail WHERE uuid_company_id_key IN (...) AND date >= … AND date <= … Notes: schemas are designed based on access pattern; highly parallel processing tasks; read throughput is stable even under high write workload.
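The "highly parallel processing tasks" and the SELECT ... IN (...) pattern on slide 16 can be illustrated with a small fan-out: rather than one multi-partition IN query, issue one single-partition read per company id concurrently and merge the results. This is only a minimal sketch under stated assumptions, not the speaker's implementation. `fetch_partition`, `fetch_many` and `FAKE_TABLE` are hypothetical names, and an in-memory dict stands in for ScyllaDB so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date

# Hypothetical in-memory stand-in for the scorecard_detail table:
# partition key (company id) -> rows ordered by clustering key effective_date.
FAKE_TABLE = {
    "company-a": [(date(2024, 1, 1), 91.0), (date(2024, 1, 2), 92.5)],
    "company-b": [(date(2024, 1, 1), 70.0), (date(2024, 1, 3), 68.0)],
}

def fetch_partition(company_id, start, end):
    """Hypothetical single-partition read; a real version would run a CQL query."""
    rows = FAKE_TABLE.get(company_id, [])
    return [(company_id, d, score) for d, score in rows if start <= d <= end]

def fetch_many(company_ids, start, end, max_workers=8):
    """Fan out one read per partition key, then merge results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_company = pool.map(lambda cid: fetch_partition(cid, start, end), company_ids)
        return [row for rows in per_company for row in rows]

rows = fetch_many(["company-a", "company-b", "company-c"],
                  date(2024, 1, 1), date(2024, 1, 2))
```

Because each read touches exactly one partition, the work spreads across nodes and a slow partition does not hold up the others; unknown ids simply contribute no rows.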
  • 17. Scoring Architecture (Current): HORIZONTAL SCALABILITY. Components: ssc-platform-api, ssc-svc-measurements, ssc-scoring-api, S3, Presto Cluster, Scoring Workflow on AWS EMR. Access paths: CQL query, REST API, SQL query. Capacity labels: 6 ECS instances, 12 GB; 12 nodes, 720 GB, 20 TB storage; infinite object storage.
  • 18. Scoring Architecture (Current): DATA ACCESS ABSTRACTION. Components: ssc-platform-api, ssc-svc-measurements, ssc-svc-users, ssc-svc-reports, …, ssc-scoring-api, S3, Presto Cluster, Scoring Workflow on AWS EMR. Access paths: CQL query, REST API, SQL query. Access data in ScyllaDB for low-latency, high-volume requests. Redirect all historical or high-latency requests, such as reporting, to Presto/S3. All FE services access data through a REST interface.
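The routing rule on slide 18 can be sketched as a small decision function. This is a hedged illustration only: `choose_backend`, `HOT_WINDOW` and the `is_report` flag are hypothetical, and the 14-day window is an assumption taken from the "2 weeks" figure on the overview slide rather than a documented setting.

```python
from datetime import date, timedelta

# Assumption: the low-latency store keeps about two weeks of measurement
# details; anything older is only reachable through Presto over S3.
HOT_WINDOW = timedelta(days=14)

def choose_backend(start, end, today, is_report=False):
    """Return 'scylladb' for recent low-latency reads, otherwise 'presto'."""
    if start > end:
        raise ValueError("start must not be after end")
    if is_report:
        return "presto"      # reporting tolerates high latency
    if start >= today - HOT_WINDOW:
        return "scylladb"    # the whole range is inside the hot window
    return "presto"          # the range reaches into historical data

today = date(2024, 6, 15)
recent = choose_backend(date(2024, 6, 10), date(2024, 6, 15), today)
historical = choose_backend(date(2024, 1, 1), date(2024, 6, 15), today)
```

Keeping this decision behind one REST service means the front-end services never need to know which store answered the request.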
  • 19. Results: Migration to ScyllaDB brought us benefits from several perspectives: ■ 90% latency reduction for most service endpoints ■ 80% fewer production incidents related to Presto/Aurora performance ■ $1M infrastructure cost saving per year ■ 30% faster data pipeline processing ■ Much better customer experience
  • 20. Lessons: Route infrequent, complex and high latency-tolerant data access to OLAP engines like Presto, Athena (generating reports, custom analysis, etc.). Build a scalable, highly parallel processing aggregation component to overcome current limits of CQL (in-memory JOIN-capable, SELECT-IN queries, etc.). Design ScyllaDB schemas based on data access patterns to address latency issues.
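Since CQL has no JOIN, the "in-memory JOIN-capable" aggregation component named in the lessons has to combine result sets in application code. Below is a minimal sketch of one common way to do that, a hash join over rows already fetched from two tables. It is not the speaker's component; `hash_join` and the `issue_count` field are hypothetical, while `total_score` comes from the schema on slide 16.

```python
from collections import defaultdict

def hash_join(left, right, key):
    """Inner-join two lists of dicts on `key` using an in-memory hash index."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)      # build side: index the right rows by key
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**match, **row})   # probe side: merge matching rows
    return joined

scorecards = [
    {"company_id": 1, "total_score": 91.0},
    {"company_id": 2, "total_score": 70.0},
]
stats = [
    {"company_id": 1, "issue_count": 4},
    {"company_id": 3, "issue_count": 9},
]
result = hash_join(scorecards, stats, "company_id")
```

Building the index on the smaller input keeps memory use down, and each side can be fetched in parallel before the join runs.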
  • 21. Thank You Stay in Touch Nguyen Cao ncao@securityscorecard.io @ducnguyen_cao https://github.com/nguyencaoduc https://www.linkedin.com/in/nguyenduccao/