Using Sequence Statistics to Fight Advanced Persistent Threats
DataWorks Summit/Hadoop Summit
Using Sequence Statistics to Fight Advanced Persistent Threats
1.
© 2016 MapR Technologies
2.
Contact Information
Ted Dunning
Chief Applications Architect at MapR Technologies
Committer & PMC for Apache’s Drill, ZooKeeper & others
VP of Incubator at Apache Foundation
Email: tdunning@apache.org, tdunning@maprtech.com
Twitter: @ted_dunning
Hashtags today: #hs16dublin #mapr
3.
Agenda
• What’s this persistent threat stuff?
  – What attackers do
  – How they do it
• Examples
• Sequence statistics
  – Really geeking with gas now!
• Detection techniques
• Specifics
• Summary
4.
Agenda of All Security Talks
• Terror
• Faint hope
• More terror
• Practical suggestions
• Summary
5.
Operation Ababil – Brobots on Parade
• Dork attack to find unpatched default Joomla sites
  – Especially web servers with high bandwidth connections
  – Basically just Google searches for default strings
  – Joomla compromised into attack Brobot
• C&C network checks in occasionally
  – Note C&C is incoming request and looks like normal web requests
• Later, on command, multiple Brobots direct 50-75 Gb/s of attack
  – Attacks come from white-listed sites
6.
Attack Sequence
[Diagram nodes: Source, First level C&C, Second level C&C]
7.
Attack Sequence
[Diagram nodes: Source, First level C&C, Second level C&C, Google]
8.
Attack Sequence
[Diagram nodes: Source, First level C&C, Second level C&C, Brobot (x3)]
9.
Attack Sequence
[Diagram nodes: Source, First level C&C, Second level C&C, Brobot (x3), Target]
10.
Outline of an Advanced Persistent Threat
• Advanced
  – Common use of zero-day for preliminary attacks
  – Often attributed to state-level actors
  – Modern privateers blur the line
• Persistent
  – Result of first attack is heavily muffled, no immediate exploit
  – Remote access toolset installed (RAT)
• Threat
  – On command, data is exfiltrated covertly or en masse
  – Or the compromised host is used for other nefarious purpose
11.
APT in Summary
• Attack, penetrate, pivot, exfiltrate or exploit
• If you are a high-value target, attack is likely and stealthy
  – High-value = telecom, banks, utilities, retail targets, web100
  – … and all their vendors
  – Conventional multi-factor auth is easily breached
• Penetration and pivot are critical counter-measure opportunities
  – In 2010, RAT would contact command and control (C&C)
  – In 2016, C&C looks like normal traffic
• Once exfiltration or exploit starts, you may no longer have a business
12.
So are we totally screwed?
13.
So are we totally screwed?
Not entirely!
14.
Event Sequences Provide Clues
• Event sequences appear in many places
• Headers
  – Header types, ordering in requests
• IP address accesses
  – Source and destination, sequences of either
• TLS options
  – Which options, which values, which algorithms
• Incoming component request ordering and timing
  – Body first, CSS, scripts and images next
  – But which are cached, what is round-trip time?
15.
Sequences and Cooccurrences
• All of these characteristics form symbolic sequences
• Current systems use hand-crafted rules about particular state
  – But hand-crafting depends on human knowledge
• We can do much, much better by considering cooccurrence and ordering of symbols in these sequences
• Log-likelihood ratio test (jargon alert) is a key tool
16.
A core technique
• Many of these easy problems reduce to finding interesting coincidences
• This can be summarized as a 2 x 2 table
• Actually, many of these tables

          A      Other
  B       k11    k12
  Other   k21    k22
17.
How do you do that?
• This is well handled using the G-test
  – See Wikipedia
  – See http://bit.ly/surprise-and-coincidence
• Original application in linguistics now cited > 2000 times
• Available in Elasticsearch, in Solr, in Mahout
• Available in R, C, Java, Python
18.
Which one is the anomalous co-occurrence?

          A      not A
  B       13     1000
  not B   1000   100,000

          A      not A
  B       1      0
  not B   0      10,000

          A      not A
  B       10     0
  not B   0      100,000

          A      not A
  B       1      0
  not B   0      2
19.
Which one is the anomalous co-occurrence?

          A      not A
  B       13     1000
  not B   1000   100,000
  Score: 0.90

          A      not A
  B       1      0
  not B   0      10,000
  Score: 4.52

          A      not A
  B       10     0
  not B   0      100,000
  Score: 14.3

          A      not A
  B       1      0
  not B   0      2
  Score: 1.95

Dunning, Ted. Accurate Methods for the Statistics of Surprise and Coincidence. Computational Linguistics, vol. 19, no. 1 (1993).
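The scores on this slide look like signed square roots of the log-likelihood ratio (G statistic). A minimal Python sketch under that assumption follows; the function names `x_log_x`, `entropy`, `llr` and `root_llr` are illustrative choices, not something the slides define. It uses the entropy form of the G-test, which handles empty cells without special cases.

```python
from math import log, sqrt


def x_log_x(x):
    """Return x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * log(x)


def entropy(*counts):
    """Shannon entropy of the counts, scaled by their total (N * H)."""
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def llr(k11, k12, k21, k22):
    """G statistic (log-likelihood ratio) for a 2 x 2 contingency table."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


def root_llr(k11, k12, k21, k22):
    """Signed square root of LLR; negative when A and B co-occur less than expected."""
    score = sqrt(llr(k11, k12, k21, k22))
    # Cross-multiplied form of k11/(k11+k12) < k21/(k21+k22), safe for empty rows.
    if k11 * (k21 + k22) < k21 * (k11 + k12):
        score = -score
    return score
```

With this sketch, `root_llr(13, 1000, 1000, 100000)` comes out near 0.90, `root_llr(1, 0, 0, 10000)` near 4.52, and `root_llr(1, 0, 0, 2)` near 1.95, matching three of the printed scores. For the table with 10 and 100,000 it gives roughly 15.8 rather than the 14.3 shown, so either the slide's table or its score may differ from what the transcript preserves.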
20.
How to Count (header-like documents)

For each “document”:
    For each “word” A:
        left[A]++
        For each “word” B after that (within window):
            count[A,B]++
            right[B]++
            total++
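The counting loop can be sketched in Python as follows. This is one reading of the slide, not the author's code: the name `count_pairs` and the default window of 5 are assumptions, and both marginals are incremented once per pair so that `left`, `right` and `total` always describe the same set of pairs.

```python
from collections import Counter


def count_pairs(documents, window=5):
    """Count ordered pairs (A, B) where B follows A within `window` positions."""
    count, left, right = Counter(), Counter(), Counter()
    total = 0
    for words in documents:
        for i, a in enumerate(words):
            for b in words[i + 1 : i + 1 + window]:
                count[a, b] += 1
                left[a] += 1   # pairs with A in the earlier slot
                right[b] += 1  # pairs with B in the later slot
                total += 1
    return count, left, right, total
```

Here a "document" could be the list of header names in one HTTP request, so `count["Host", "Accept"]` would be the number of requests in which `Accept` follows `Host` within the window.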
21.
• We wanted this 2 x 2 table for each A,B
• But we only counted k11 directly
• But we did count
    k*1 = k11 + k21 (how many A’s we saw)
    k1* = k11 + k12 (how many B’s we saw)
    k** = k11 + k21 + k12 + k22 (how many pairs in total)

          A      Other
  B       k11    k12
  Other   k21    k22
22.
How to Count (continued)

Map<PriorityQueue> queue
for each pair (A,B)
    k11 = count[A,B]
    kx1 = left[A]
    k1x = right[B]
    kxx = total
    k12 = k1x - k11
    k21 = kx1 - k11
    k22 = kxx - k11 - k12 - k21
    queue.add(A, (LLR(k11,k12,k21,k22), B))
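A hedged Python sketch of this step: rebuild the table for each pair from k11 and the marginals, then keep the best-scoring partners per A. The names `table_for` and `top_partners` and the parameter `k` are hypothetical, and `score` stands in for whatever LLR function is in use; it is passed in as a callable so the sketch stays self-contained.

```python
import heapq
from collections import defaultdict


def table_for(a, b, count, left, right, total):
    """Rebuild the 2 x 2 table for the pair (A, B) from k11 and the marginals."""
    k11 = count[a, b]
    k21 = left[a] - k11            # A seen without B
    k12 = right[b] - k11           # B seen without A
    k22 = total - k11 - k12 - k21  # neither A nor B
    return k11, k12, k21, k22


def top_partners(count, left, right, total, score, k=10):
    """For each A, keep the k highest-scoring B's as (score, B) tuples."""
    scored = defaultdict(list)
    for (a, b) in count:
        cells = table_for(a, b, count, left, right, total)
        scored[a].append((score(*cells), b))
    return {a: heapq.nlargest(k, pairs) for a, pairs in scored.items()}
```

Using `heapq.nlargest` plays the role of the per-A priority queue on the slide: only the strongest associations for each symbol survive, which keeps the output small even when the pair table is large.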
23.
How to Count (cooccurrence)

for each (C,B) = (“context”, “word”):
    if (!filter(C) && !filter(B)):
        right[B]++
        for each A in history(C):
            count[A,B]++
            left[A]++
        history(C) += B
        total++
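The same loop as a Python sketch. The bounded per-context history (`history_size`) and the `is_filtered` predicate are assumptions added for illustration; the slide does not say how long a history is kept or what the filter rejects.

```python
from collections import Counter, defaultdict, deque


def count_cooccurrence(events, history_size=20, is_filtered=lambda item: False):
    """Count (A, B) pairs where A appeared earlier in the same context as B."""
    count, left, right = Counter(), Counter(), Counter()
    # Each context (for example a source IP) keeps its most recent symbols.
    history = defaultdict(lambda: deque(maxlen=history_size))
    total = 0
    for context, word in events:
        if is_filtered(context) or is_filtered(word):
            continue
        right[word] += 1
        for earlier in history[context]:
            count[earlier, word] += 1
            left[earlier] += 1
        history[context].append(word)
        total += 1
    return count, left, right, total
```

A context might be a source IP address or a card number, and a word might be a URL, a merchant or a header name, so the counts capture what tends to follow what within one actor's stream.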
24.
Seriously... It really can be that simple
25.
Basic techniques
• Counting – often the hardest part
• LLR – the basic tool
• Order models
  – Ordered cooccurrences
  – Transition probabilities
  – Recurrent neural networks
• Ploughing a quiet field
  – Reimage servers often
  – Force attackers to pivot repeatedly
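As an illustration of the "transition probabilities" order model, here is a small first-order Markov sketch in Python. It is an assumption about how such a model might be built, not the method used in the talk: the names `fit_transitions` and `avg_log_prob`, the start and end markers, and the add-alpha smoothing are all illustrative choices.

```python
from collections import Counter, defaultdict
from math import log

START, END = "<start>", "<end>"


def fit_transitions(sequences):
    """Count first-order transitions, including start and end markers."""
    counts = defaultdict(Counter)
    for seq in sequences:
        path = [START, *seq, END]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    return counts


def avg_log_prob(seq, counts, vocab_size, alpha=1.0):
    """Average log-probability per transition with add-alpha smoothing."""
    path = [START, *seq, END]
    logp = 0.0
    for prev, nxt in zip(path, path[1:]):
        seen = counts.get(prev, {})
        num = seen.get(nxt, 0) + alpha
        den = sum(seen.values()) + alpha * vocab_size
        logp += log(num / den)
    return logp / (len(path) - 1)
```

Fitting on header-name sequences from known-good traffic and scoring new requests gives a low average log-probability to orderings that were rarely or never seen, which is one simple way to flag an unusual client.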
26.
Example 1 - Ababil
[Diagram nodes: Source, First level C&C, Second level C&C, Brobot (x3), Target]
Defense has to happen here
27.
Spot the Important Difference?

Attacker request:
GET /personal/comparison-table.jsp?iODg2OQ=51a90 HTTP/1.1
Host: www.sometarget.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;)
Accept-Encoding: deflate
Accept-Charset: UTF-8
Accept-Language: fr
Cache-Control: no-cache
Pragma: no-cache
Connection: Keep-Alive

Real request:
GET /photo.jpg HTTP/1.1
Host: lh4.googleusercontent.com
User-Agent: Mozilla/5.0 (Macint
Accept: image/png,image/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate,
Referer: https://www.google.com
Connection: keep-alive
If-None-Match: "v9"
Cache-Control: max-age=0
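To treat requests like these as symbol sequences, the header names first have to be pulled out in order. The Python sketch below shows one way to do that; the function name `header_sequence` and the choice to lowercase names are assumptions. It does not try to identify which difference the slide highlights, since that is not recoverable from the transcript.

```python
def header_sequence(raw_request):
    """Return header names in the order they appear in a raw HTTP request."""
    lines = raw_request.replace("\r\n", "\n").split("\n")
    names = []
    for line in lines[1:]:  # skip the request line (GET ... HTTP/1.1)
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, _ = line.partition(":")
        if sep:
            # Header names are case-insensitive, so normalize before counting.
            names.append(name.strip().lower())
    return names
```

The resulting lists can be fed to a pair counter or an order model as "documents", so that an unusual set or ordering of headers stands out against a large corpus of ordinary browser traffic.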
29.
This could only be found at scale
30.
Overall Outline Again
[Diagram nodes: Source, First level C&C, Second level C&C, Brobot (x3), Target]
Tradecraft error!
31.
Large corpus analysis of source IP’s wins big
33.
Example 2 - Common Point of Compromise
• Scenario:
  – Merchant 0 is compromised, leaks account data during compromise
  – Fraud committed elsewhere during exploit
  – High background level of fraud
  – Limited detection rate for exploits
• Goal:
  – Find merchant 0
• Meta-goal:
  – Screen algorithms for this task without leaking sensitive data
34.
Example 2 - Common Point of Compromise
[Diagram labels: skim, exploit, Merchant 0, Skimmed data, Merchant n]
Card data is stolen from Merchant 0
That data is used in frauds at other merchants
35.
Simulation Setup
[Plot: daily count (0 to 500) of compromises and frauds against day (0 to 100), with the compromise period and the exploit period marked]
36.
Simulation Strategy
• For each consumer
  – Pick consumer parameters such as transaction rate, preferences
  – Generate transactions until end of sim-time
    • If merchant 0 during compromise time, possibly mark as compromised
    • For all transactions, possibly mark as fraud, probability depends on history
• Merchants are selected using hierarchical Pitman-Yor
• Restate data
  – Flatten transaction streams
  – Sort by time
• Tunables
  – Compromise probability, transaction rates, background fraud, detection probability
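To make the merchant-selection step concrete, here is a minimal Python sketch of sampling from a Pitman-Yor process in its Chinese-restaurant form. It is a simplified, single-level version, not the hierarchical variant the slide names, and the function name `pitman_yor_draws` and the default `discount` and `concentration` values are assumptions chosen only for illustration.

```python
import random


def pitman_yor_draws(n, discount=0.5, concentration=1.0, seed=0):
    """Draw n merchant ids from a Pitman-Yor process (Chinese restaurant form)."""
    rng = random.Random(seed)
    counts = []  # counts[m] = visits to merchant m so far
    draws = []
    for i in range(n):
        # The chance of a never-seen merchant grows with the number of merchants.
        p_new = (concentration + discount * len(counts)) / (concentration + i)
        if rng.random() < p_new:
            counts.append(1)
            draws.append(len(counts) - 1)
        else:
            # Existing merchant m is chosen with weight (counts[m] - discount).
            weights = [c - discount for c in counts]
            m = rng.choices(range(len(counts)), weights=weights)[0]
            counts[m] += 1
            draws.append(m)
    return draws, counts
```

The appeal of this kind of process for a simulation is that it produces a heavy-tailed popularity distribution: a few merchants receive most of the transactions while new, rarely visited merchants keep appearing.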
38.
LLR score for real data
[Plot: Breach Score (LLR), 0 to 80, against Number of Merchants, 10^0 to 10^6 on a log scale. Annotation: Really truly bad guys]
39.
Historical cooccurrence gives high S/N
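One way such a breach score might be computed is sketched below in Python. The slides do not spell out the exact table, so this is an assumption: for each merchant, consumers are cross-tabulated by "has this merchant in their history" against "later reported fraud", and the table is scored with a signed LLR. The names `breach_scores`, `histories` and `fraud_victims` are hypothetical.

```python
from collections import Counter
from math import log


def llr(k11, k12, k21, k22):
    """G statistic of a 2 x 2 table; empty cells contribute zero."""
    n = k11 + k12 + k21 + k22
    rows, cols = (k11 + k12, k21 + k22), (k11 + k21, k12 + k22)
    cells = ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1))
    return 2.0 * sum(k * log(k * n / (rows[r] * cols[c])) for k, r, c in cells if k)


def breach_scores(histories, fraud_victims):
    """Score each merchant by how strongly its visitors overlap with fraud victims."""
    visitors, visitors_with_fraud = Counter(), Counter()
    for consumer, merchants in histories.items():
        for m in set(merchants):
            visitors[m] += 1
            if consumer in fraud_victims:
                visitors_with_fraud[m] += 1
    n = len(histories)
    n_fraud = sum(1 for consumer in histories if consumer in fraud_victims)
    scores = {}
    for m, seen in visitors.items():
        k11 = visitors_with_fraud[m]  # visited m, later defrauded
        k12 = seen - k11              # visited m, no fraud
        k21 = n_fraud - k11           # defrauded, never visited m
        k22 = n - seen - k21          # neither
        # Negative sign when visitors of m see less fraud than expected.
        sign = 1.0 if k11 * n >= seen * n_fraud else -1.0
        scores[m] = sign * llr(k11, k12, k21, k22)
    return scores
```

A merchant that nearly every consumer visits scores close to zero even when many of its visitors are defrauded, because its visitors are no more likely than anyone else to be victims. That is the property that lets a genuine common point of compromise stand out against high background fraud.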
40.
Summary
• The world can be seen as sequences of symbols
• We can find patterns
• Those patterns can nail opponents
• Many patterns only appear at scale
• You can do this
42.
Short Books by Ted Dunning & Ellen Friedman
• Published by O’Reilly in 2014 and 2015
• For sale from Amazon or O’Reilly
• Free e-books currently available courtesy of MapR
  – http://bit.ly/ebook-real-world-hadoop
  – http://bit.ly/mapr-tsdb-ebook
  – http://bit.ly/ebook-anomaly
  – http://bit.ly/recommendation-ebook
43.
Streaming Architecture
by Ted Dunning and Ellen Friedman © 2016 (published by O’Reilly)
Free copies at book signing today (oops… that was earlier)
http://bit.ly/mapr-ebook-streams
44.
Thank You!
45.
Q&A
Engage with us!
@mapr
maprtech
MapR
maprtech
mapr-technologies
tdunning@mapr.tech.com