Looking out for anomalies!
Sevvandi Kandanaarachchi, Rob Hyndman, Hideya Ochiai, Asha Rao
Why anomalies?
• They tell a different story
• Fraudulent credit card transactions amongst billions of legitimate transactions
• Computer network intrusions
• Astronomical anomalies – solar flares
• Weather anomalies – tsunamis
• Stock market anomalies – heralding a crash?
• Important to detect anomalies in a timely manner
Current challenges
Anomaly detection (AD) methods rank observations in terms of anomalousness
• They don’t identify anomalies
• So, the user needs to define a threshold and identify anomalies
High false positives
• Do not want an “alarm factory” – confidence in the system goes down
Parameters need to be defined by the user
• But expert knowledge is needed
Overview
lookout – an anomaly detection method
• Uses topological data analysis/persistent homology
• Extreme value theory
• Kernel density estimates
A real-world application
• Computer network security
Sevvandi Kandanaarachchi, Rob Hyndman
Preprint - https://bit.ly/lookoutliers
Lookout – leave one out kde for outlier detection
Kernel density estimation (KDE)
• A density estimation technique using kernels
• A set of points on the real line
• Placing the kernel at every point
• Kernel density estimate: $f(x, h) = \frac{1}{nh} \sum_{i} K\left(\frac{x - X_i}{h}\right)$
• ℎ - the bandwidth parameter
• https://mathisonian.github.io/kde/
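A minimal sketch of the estimate above, in plain Python. The function names, the sample data and the choice of a Gaussian kernel as the default are assumptions made for illustration; they are not taken from the slides or from the lookout package.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def gaussian(u):
    """Standard normal density, used here as the default kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, data, h, kernel=gaussian):
    """f(x, h) = 1/(n h) * sum_i K((x - X_i) / h)."""
    n = len(data)
    return sum(kernel((x - xi) / h) for xi in data) / (n * h)

# Hypothetical data: four points close together and one far away.
data = [1.0, 1.2, 0.9, 1.1, 5.0]
for x in (1.0, 5.0):
    print(x, kde(x, data, h=0.5))
```

The isolated point at 5.0 receives a lower density than the points in the cluster, which is the property the following slides rely on.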
KDE for anomaly detection
• What do we want?
• Anomalies to have much lower kde values than other points.
• Why?
• Because anomalies are in low density regions.
• The literature on bandwidth selection focusses on representing the data
• Minimize MISE (Mean Integrated Squared Error)
• But, this doesn’t work for us.
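For reference, the MISE criterion is usually written as below. The notation is an assumption: $\hat f(x, h)$ stands for the kernel density estimate with bandwidth $h$, and $f$ for the unknown true density.

```latex
\mathrm{MISE}(h) = \mathbb{E}\left[ \int \left( \hat f(x, h) - f(x) \right)^2 \, dx \right]
```

This measures how well the estimate represents the whole density, which is a different goal from separating anomalies from the rest of the data.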
Bandwidth, KDE and anomalies
• Anomalies in the middle
• Indices 1001-1010
• Increasing bandwidth of KDE
• Lowest 10 KDE points (their indices)
• Want anomalies to have lowest KDE
Columns: bandwidth; rows: indices of the 10 lowest-KDE points
  0.05   0.2  0.35   0.5  0.65   0.8  0.95   1.1  1.25   1.4
   232   232  1010  1010  1006  1006  1006   495   495   495
  1010   446  1001  1001  1009  1009  1009   843   843   843
   424  1010  1008  1008  1005  1005  1005   486   486   486
   359   495  1004  1004  1002  1002  1002  1006   979   166
   963  1001  1003  1002  1004  1004  1004  1009   166   979
   814   975  1002  1003  1007  1007  1007  1005   948   948
    70  1008  1007  1007  1003  1003  1003  1002   964   964
   257   799  1006  1006  1008  1001  1001  1004   832   832
   511   843  1009  1009  1001  1008  1008  1007   110   147
   458   511  1005  1005  1010  1010  1010  1003   147   110
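The experiment behind this table can be sketched as follows. The data set used on the slide is not given, so the synthetic data here (two Gaussian clusters with five anomalies placed between them), the Gaussian kernel and the function names are all assumptions.

```python
import math
import random

def kde_at_points(data, h):
    """Gaussian KDE evaluated at every data point."""
    n = len(data)
    c = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return [c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
            for x in data]

def lowest_k(data, h, k=10):
    """1-based indices of the k points with the lowest KDE values."""
    dens = kde_at_points(data, h)
    order = sorted(range(len(data)), key=dens.__getitem__)
    return [i + 1 for i in order[:k]]

# Hypothetical data: 100 points around -5, 100 around +5,
# and 5 anomalies in the middle (indices 201-205).
random.seed(1)
data = [random.gauss(-5.0, 1.0) for _ in range(100)]
data += [random.gauss(5.0, 1.0) for _ in range(100)]
data += [random.gauss(0.0, 0.1) for _ in range(5)]

for h in (0.05, 0.5, 1.4):
    print(h, lowest_k(data, h, k=5))
```

Running the loop over a range of bandwidths shows how the set of lowest-density points changes with `h`, in the same spirit as the table.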
Bandwidth, KDE and anomalies
• The bandwidth minimising MISE is 0.018
So we want a bigger bandwidth for anomaly detection.
But not too big!
How do we select a bandwidth appropriate for anomaly detection?
In comes persistent homology
• Methodology in topological data analysis
Connected components and holes
Dimension 0 – connected components
Dimension 1 - holes
With an anomaly
Dimension 0 – connected components
We are interested in . . .
• The sequence of end-point diameters (death diameters)
• We want the maximum gap
• Diameter that starts the maximum gap = 𝑑
• ℎ = 5 𝑑 for Epanechnikov kernel
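The gap search can be sketched as follows, under the assumption that the filtration is a Vietoris-Rips filtration measured by diameter. In that setting the dimension-0 death diameters equal the edge lengths of a Euclidean minimum spanning tree, so the sketch uses Prim's algorithm rather than a persistent homology library. The function names are hypothetical, and the final scaling from 𝑑 to ℎ is left out.

```python
import math

def death_diameters(points):
    """Sorted dimension-0 death diameters, computed as the edge lengths
    of a Euclidean minimum spanning tree (Prim's algorithm, O(n^2))."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known edge from the tree to each point
    best[0] = 0.0
    deaths = []
    for step in range(n):
        # Add the point that is closest to the current tree.
        j = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[j] = True
        if step > 0:  # the starting point has no connecting edge
            deaths.append(best[j])
        for i in range(n):
            if not in_tree[i]:
                best[i] = min(best[i], math.dist(points[j], points[i]))
    return sorted(deaths)

def max_gap_diameter(deaths):
    """Death diameter d at which the largest gap between consecutive
    sorted death diameters starts (needs at least two diameters)."""
    gaps = [b - a for a, b in zip(deaths, deaths[1:])]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    return deaths[k]

# Hypothetical data: four evenly spaced points and one far away.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
deaths = death_diameters(points)
print(deaths, max_gap_diameter(deaths))
```

Points are passed as tuples so the same code works in more than one dimension.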
Using this bandwidth
• Compute the kde values
• Anomalies will have very low kde values
• We can rank the anomalies using the low kde values
• Low kde – anomalous
• High kde – not anomalous
But, we want to identify anomalies!
Just because the kde is low, is it an anomaly?
We want to have a cut off!
For that we use Extreme Value Theory!
EVT – Peak Over Threshold method (POT)
• Pick a threshold – 90%
• Model the exceedances
• Generalized Pareto distribution
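A minimal sketch of the three steps above. The slides do not say how the Generalized Pareto distribution (GPD) is fitted; this sketch uses method-of-moments estimates because they need no optimiser, which is a swapped-in simplification and may differ from what the lookout package does. The function names and the simple empirical quantile are also assumptions.

```python
import math
import statistics

def pot_exceedances(values, q=0.90):
    """Return the threshold u (empirical q-quantile) and the exceedances v - u."""
    ordered = sorted(values)
    u = ordered[int(q * (len(ordered) - 1))]
    return u, [v - u for v in ordered if v > u]

def fit_gpd_mom(y):
    """Method-of-moments estimates (xi, sigma) for exceedances y.

    Uses mean = sigma / (1 - xi) and var = sigma^2 / ((1 - xi)^2 (1 - 2 xi)),
    so mean^2 / var = 1 - 2 xi.
    """
    m = statistics.mean(y)
    r = m * m / statistics.variance(y)
    return 0.5 * (1.0 - r), 0.5 * m * (r + 1.0)

def gpd_survival(y, xi, sigma):
    """P(Y > y) for a GPD with shape xi and scale sigma, for y >= 0."""
    if abs(xi) < 1e-12:
        return math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    return base ** (-1.0 / xi) if base > 0.0 else 0.0

# Hypothetical values, for illustration only.
u, y = pot_exceedances([5.0, 1.0, 4.0, 2.0, 3.0], q=0.5)
xi, sigma = fit_gpd_mom(y)
print(u, y, xi, sigma, gpd_survival(1.5, xi, sigma))
```

The survival function returns 0 beyond the upper end of the support, which can happen when the fitted shape `xi` is negative.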
Method lookout
• Fit a GPD using the kde values
• Then use the leave one out kde values to determine the probability of points according to the GPD
• We have a set of probabilities
• Low probabilities are more likely to be anomalies
• Have a pre-defined cut off 𝛼, this is your threshold
• If 𝑝𝑟𝑜𝑏𝑎𝑏𝑖𝑙𝑖𝑡𝑦(𝑥𝑖) < 𝛼, then 𝑥𝑖 is an anomaly.
• So you can identify anomalies.
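The steps above can be put together in a rough end-to-end sketch. Several choices here are assumptions and not taken from the slides: the GPD is fitted to negative log kde values, the kernel is Gaussian, the bandwidth `h` is passed in directly, the fit uses method of moments, and points below the threshold are simply given probability 1. It illustrates the idea and does not reproduce the lookout package.

```python
import math
import statistics

def gauss(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde_and_loo(data, h):
    """KDE and leave-one-out KDE at every data point (1-D data)."""
    n = len(data)
    full, loo = [], []
    for i, x in enumerate(data):
        others = sum(gauss((x - xj) / h) for j, xj in enumerate(data) if j != i)
        full.append((others + gauss(0.0)) / (n * h))
        loo.append(others / ((n - 1) * h))  # the point's own kernel is left out
    return full, loo

def lookout_sketch(data, h, alpha=0.05, q=0.90):
    """Return (probabilities, indices of points with probability < alpha)."""
    full, loo = kde_and_loo(data, h)
    tiny = 1e-300  # floor that keeps the logarithm finite
    score = [-math.log(max(v, tiny)) for v in full]
    loo_score = [-math.log(max(v, tiny)) for v in loo]

    # Peak over threshold: exceedances of the score above its q-quantile.
    ordered = sorted(score)
    u = ordered[int(q * (len(ordered) - 1))]
    exceed = [s - u for s in score if s > u]

    # Method-of-moments GPD fit (assumed; needs at least two exceedances).
    m = statistics.mean(exceed)
    r = m * m / statistics.variance(exceed)
    xi, sigma = 0.5 * (1.0 - r), 0.5 * m * (r + 1.0)

    def survival(y):
        if abs(xi) < 1e-12:
            return math.exp(-y / sigma)
        base = 1.0 + xi * y / sigma
        return base ** (-1.0 / xi) if base > 0.0 else 0.0

    probs = [survival(s - u) if s > u else 1.0 for s in loo_score]
    return probs, [i for i, p in enumerate(probs) if p < alpha]

# Hypothetical data: 50 evenly spaced points and one point far away.
data = [i * 0.1 for i in range(50)] + [20.0]
probs, anomalies = lookout_sketch(data, h=0.5)
print(anomalies)
```

Leaving a point out of its own density estimate matters most for isolated points: their leave-one-out kde drops sharply, which pushes them far into the tail of the fitted GPD.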
Example
• Lookout outliers with 𝛼 = 0.05
Outliers  Probability
1001      0.02344059
1002      0.02513530
1003      0.02501901
1004      0.02504691
1005      0.02654359
1006      0.02636139
1007      0.02625216
1008      0.02452614
1009      0.02644570
1010      0.02283989
Practical advantages of lookout
The user does not need to specify a bandwidth parameter
• The user can be anyone – not necessarily a mathematician
EVT based methods have low false positive rates
• Attractive for many applications
• Not an alarm factory
For the mathematician/statistician in me
• Coming together of
• Topological data analysis
• Extreme Value Theory
• Kernel density estimates
• To find anomalies
Anomaly persistence
• What if a data-point is identified as an anomaly for different bandwidth values?
• Visual representation of anomaly persistence
• Big picture
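One way to picture this is to rerun a detector over a range of bandwidths and record how often each point is flagged. In the sketch below the detector is a deliberately simple, hypothetical stand-in (a point is flagged when no other point lies within distance `h`); it is not the lookout method, and the function names and data are assumptions.

```python
def persistence(detect, data, bandwidths):
    """Fraction of bandwidths at which detect(data, h) flags each point."""
    counts = [0] * len(data)
    for h in bandwidths:
        for i in detect(data, h):
            counts[i] += 1
    return [c / len(bandwidths) for c in counts]

def toy_detect(data, h):
    """Hypothetical stand-in detector: flag points with no neighbour within h."""
    return [i for i, x in enumerate(data)
            if all(abs(x - y) > h for j, y in enumerate(data) if j != i)]

# Hypothetical data: three close points and one isolated point.
data = [0.0, 0.1, 0.2, 5.0]
print(persistence(toy_detect, data, [0.05, 0.5, 1.0, 10.0]))
```

A point that stays flagged across many bandwidths is a more persistent anomaly than one flagged at a single bandwidth.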
Application: Computer Networks Security
Honeyboost: Boosting honeypot performance with data fusion and anomaly detection – Sevvandi Kandanaarachchi, Hideya Ochiai, Asha Rao
Preprint - https://arxiv.org/abs/2105.02526
LAN Security Monitoring
• ‘LAN-Security Monitoring Device’ to capture suspicious/malicious activities that happen inside a LAN.
• LAN: Local Area Network
• Though it is not a real camera, it works like a ‘cyber-space surveillance camera’.
• It captures all the broadcast packets, and direct packets to the monitoring device.
• Honeypot - a trap for attackers
[Figure: a LAN containing smartphones, a printer, smart appliances, a data server and the LAN-Security Monitoring Device]
Honeypot data
• ARP data – a big shout out to everyone (broadcast to the network)
• These nodes do not access the honeypot
• Who has got this address – I need to communicate to you
• Generally not a suspicious activity
• But malicious nodes can also make ARP calls
• TCP and UDP data – targeted at the honeypot
• These nodes have accessed the honeypot using TCP/UDP protocols
• Oooh suspicious!
A bit more on honeypots
• An intruder can be there without accessing the honeypot
• Limited vision of honeypots
• Honeypots are never stand alone security devices
• Identifying anomalous nodes is important - Honeyboost
Generally . . .
• Anomalies detected based on individual packets – packet-based
• Packet features separately for each packet
• Of all the traffic, which packets are anomalous
• Our contribution: we find anomalous nodes – node-based
• Features of nodes using the traffic – using multivariate time series
• Of all the nodes, which nodes are anomalous
Varying-dimensional time series
• Different protocols have different header features
• Finding anomalies from varying dimensional time series
• 200 computers/nodes = 200 varying-dimensional time series
• Which one is anomalous, if at all?
[Figure: processing pipeline. Panel labels: varying-dimensional time series for each node (over time), multivariate time series, compute features, window model and process, feature space for all nodes, Lookout]
Node A – multivariate time series
Timestamp  Protocol  ARP count  ARP degree  TCP PC1  TCP PC2  UDP PC1  UDP PC2
30         ARP       10         12          0        0        0        0
55         TCP       0          0           -2.15    1.75     0        0
85         UDP       0          0           0        0        3.56     0.45
• Compute features: the MV time series for each node gets transformed to a point in $\mathbb{R}^{17}$, the feature space for all nodes
Features
• The total length of line segments in $\mathbb{R}^6$
• The maximum time difference
• Number of protocols used
• Number of TCP calls/UDP calls
• Total length of line segments in each protocol space
• Line of best fit in each protocol space
• Sum of errors squared for the line of best fit
[Figure: axes TCP PC1 and TCP PC2]
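Two of the features listed above can be sketched for a single protocol space, treating a node's observations as a sequence of (PC1, PC2) points. The slides do not say which kind of regression is used, so ordinary least squares is an assumption, as are the function names and the sample points.

```python
import math

def total_segment_length(points):
    """Sum of the lengths of the segments joining consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def best_fit_line(points):
    """Ordinary least squares fit y = a + b * x.

    Returns (a, b, sse), where sse is the sum of squared errors.
    Assumes the x values are not all identical.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in points)
    return a, b, sse

# Hypothetical (TCP PC1, TCP PC2) observations for one node, in time order.
tcp_points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
print(total_segment_length(tcp_points), best_fit_line(tcp_points))
```

Each node contributes one value per feature, so a node's whole traffic history collapses into a single point in the feature space.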
Findings
• Suspicious nodes that do not access the honeypot
[Figure: feature space for all nodes with Lookout results; two flagged nodes are each annotated "This node does not access the honeypot"]
Insights
• Identify some nodes before they access the honeypot
• Gain insights – find anomalies and look back at the original data
• Anomaly has set suspicious flags – PSH flag and URG flag
• PSH flag – PUSH flag – push packet to the application layer
• URG flag – URGENT flag – treat packet as urgent. Why, when accessing the honeypot?
• Can be used to derive new rules
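As an illustration of turning this observation into a rule, the sketch below decodes the flag byte of a TCP header and checks for PSH and URG together. The bit values are the standard TCP flag bits; the rule itself and the function names are hypothetical and are not taken from the Honeyboost paper.

```python
# Standard TCP flag bits (low six bits of the flags byte).
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}

def set_flags(flag_byte):
    """Names of the flags that are set in the given flags byte."""
    return [name for name, bit in TCP_FLAGS.items() if flag_byte & bit]

def psh_and_urg(flag_byte):
    """Hypothetical rule: both PSH and URG are set on the packet."""
    wanted = TCP_FLAGS["PSH"] | TCP_FLAGS["URG"]
    return flag_byte & wanted == wanted

print(set_flags(0x28), psh_and_urg(0x28), psh_and_urg(0x18))
```

A rule like this would flag individual packets, whereas the approach in the talk flags nodes, so it would complement the node-based analysis and not replace it.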
Summary
• Lookout - an EVT-based method to find anomalies (using TDA)
• An application in computer network security
• R package lookout is on CRAN
• Both preprints available
• https://bit.ly/lookoutliers
• https://arxiv.org/abs/2105.02526
Thank you!

Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 

Looking out for anomalies

  • 1. Looking out for anomalies! Sevvandi Kandanaarachchi, Rob Hyndman, Hideya Ochiai, Asha Rao
  • 2. Why anomalies? • They tell a different story • Fraudulent credit card transactions amongst billions of legitimate transactions • Computer network intrusions • Astronomical anomalies – solar flares • Weather anomalies – tsunamis • Stock market anomalies – heralding a crash? • Important to detect anomalies in a timely manner
  • 3. Current challenges • Anomaly detection (AD) methods rank observations in terms of anomalousness: they don’t identify anomalies, so the user needs to define a threshold to identify them • High false positives: we do not want an “alarm factory”, because confidence in the system goes down • Parameters need to be defined by the user, but that takes expert knowledge
  • 4. Overview • A real-world application: computer network security • lookout: an anomaly detection method that uses topological data analysis/persistent homology, extreme value theory and kernel density estimates
  • 5. Lookout: leave-one-out KDE for outlier detection • Sevvandi Kandanaarachchi, Rob Hyndman • Preprint: https://bit.ly/lookoutliers
  • 6. Kernel density estimation (KDE) • A density estimation technique using kernels • A set of points on the real line • Placing the kernel at every point • Kernel function: $f(x, h) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right)$ • $h$ is the bandwidth parameter • https://mathisonian.github.io/kde/
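The estimator on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the Epanechnikov kernel is used because a later slide mentions it, while the sample values and the bandwidth `h = 0.5` are hypothetical.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kde(x, sample, h, kernel=epanechnikov):
    """Evaluate f(x, h) = 1/(n h) * sum_i K((x - X_i) / h)."""
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)

# Hypothetical data: four nearby points and one isolated point at 5.0.
sample = [0.0, 0.1, 0.2, 0.3, 5.0]
h = 0.5  # hypothetical bandwidth, chosen only for illustration
densities = [kde(x, sample, h) for x in sample]
```

With these values the isolated point at 5.0 receives the lowest estimate, because only its own kernel contributes there.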
  • 7. KDE for anomaly detection • What do we want? • Anomalies to have much lower kde values than other points. • Why? • Because anomalies are in low density regions. • The literature on bandwidth selection focusses on representing the data • Minimize MISE (Mean Integrated Square Error) • But, this doesn’t work for us.
  • 8. Bandwidth, KDE and anomalies • Anomalies in the middle • Indices 1001-1010 • Increasing bandwidth of KDE • Lowest 10 KDE points (their indices) • Want anomalies to have lowest KDE
      0.05   0.2  0.35   0.5  0.65   0.8  0.95   1.1  1.25   1.4
       232   232  1010  1010  1006  1006  1006   495   495   495
      1010   446  1001  1001  1009  1009  1009   843   843   843
       424  1010  1008  1008  1005  1005  1005   486   486   486
       359   495  1004  1004  1002  1002  1002  1006   979   166
       963  1001  1003  1002  1004  1004  1004  1009   166   979
       814   975  1002  1003  1007  1007  1007  1005   948   948
        70  1008  1007  1007  1003  1003  1003  1002   964   964
       257   799  1006  1006  1008  1001  1001  1004   832   832
       511   843  1009  1009  1001  1008  1008  1007   110   147
       458   511  1005  1005  1010  1010  1010  1003   147   110
  • 9. Bandwidth, KDE and anomalies • The bandwidth minimising MISE is 0.018 • Increasing bandwidth of KDE • Lowest 10 KDE points (their indices) • Want anomalies to have lowest KDE
      0.05   0.2  0.35   0.5  0.65   0.8  0.95   1.1  1.25   1.4
       232   232  1010  1010  1006  1006  1006   495   495   495
      1010   446  1001  1001  1009  1009  1009   843   843   843
       424  1010  1008  1008  1005  1005  1005   486   486   486
       359   495  1004  1004  1002  1002  1002  1006   979   166
       963  1001  1003  1002  1004  1004  1004  1009   166   979
       814   975  1002  1003  1007  1007  1007  1005   948   948
        70  1008  1007  1007  1003  1003  1003  1002   964   964
       257   799  1006  1006  1008  1001  1001  1004   832   832
       511   843  1009  1009  1001  1008  1008  1007   110   147
       458   511  1005  1005  1010  1010  1010  1003   147   110
  • 10. So we want a bigger bandwidth for anomaly detection. But not too big!
  • 11. How do we select a bandwidth appropriate for anomaly detection?
  • 12. In comes persistent homology • Methodology in topological data analysis
  • 13. Connected components and holes Dimension 0 – connected components Dimension 1 - holes
  • 14. With an anomaly Dimension 0 – connected components
  • 15. We are interested in . . . • The end-point diameter (death diameters) sequences • We want the maximum gap • Diameter that starts the maximum gap = 𝑑 • ℎ = 5 𝑑 for Epanechnikov kernel
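The maximum-gap step can be sketched as follows. The sketch relies on a standard fact: the dimension-0 death diameters of a Vietoris-Rips filtration are the edge lengths of a Euclidean minimum spanning tree. The function names and the toy points are hypothetical, and the slide's final scaling from $d$ to the bandwidth $h$ is deliberately not applied here.

```python
import math

def death_diameters(points):
    """Dimension-0 death diameters, computed as minimum spanning tree edge
    lengths with Prim's algorithm and returned in ascending order."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # shortest distance from each point to the tree
    best[0] = 0.0
    edges = []
    for step in range(n):
        # Pick the point outside the tree that is closest to it.
        j = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[j] = True
        if step > 0:
            edges.append(best[j])
        for i in range(n):
            if not in_tree[i]:
                best[i] = min(best[i], math.dist(points[i], points[j]))
    return sorted(edges)

def diameter_before_max_gap(deaths):
    """Death diameter d at which the largest gap between consecutive
    death diameters begins."""
    gaps = [b - a for a, b in zip(deaths, deaths[1:])]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    return deaths[k]

# Hypothetical 1-D data: four evenly spaced points and one far away.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
deaths = death_diameters(points)
d = diameter_before_max_gap(deaths)
```

Points are passed as tuples so the same code works in any dimension. Here the far-away point merges last, which creates the large gap in the death-diameter sequence.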
  • 16. Using this bandwidth • Compute the kde values • Anomalies will have very low kde values • We can rank the anomalies using the low kde values • Low kde – anomalous • High kde – not anomalous
  • 17. But, we want to identify anomalies! Just because the kde is low, is it an anomaly?
  • 18. We want to have a cut off! For that we use Extreme Value Theory!
  • 19. EVT – Peak Over Threshold method (POT) • Pick a threshold – 90% • Model the exceedances with a generalized Pareto distribution (GPD)
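For reference, the generalized Pareto distribution has the standard distribution function below. The notation is assumed here rather than taken from the slides: $u$ is the threshold, $y$ an exceedance over it, $\xi$ the shape and $\sigma$ the scale.

```latex
% Exceedance over the threshold u: Y = X - u, given X > u.
G(y; \xi, \sigma) =
\begin{cases}
  1 - \left(1 + \dfrac{\xi y}{\sigma}\right)^{-1/\xi}, & \xi \neq 0, \\[1ex]
  1 - \exp\left(-\dfrac{y}{\sigma}\right), & \xi = 0,
\end{cases}
\qquad y \ge 0,\ \sigma > 0 .
```

For $\xi < 0$ the support is bounded above: $0 \le y \le -\sigma/\xi$. The tail probability of an exceedance is $1 - G(y; \xi, \sigma)$.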
  • 20. Method lookout • Fit a GPD using the kde values • Then use the leave-one-out kde values to determine the probability of points according to the GPD • We have a set of probabilities • Low probabilities are more likely to be anomalies • Have a pre-defined cut off $\alpha$, this is your threshold • If $\mathrm{probability}(x_i) < \alpha$, then $x_i$ is an anomaly • So you can identify anomalies.
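The steps on this slide might be sketched as below. This is a rough illustration under several assumptions that are not stated in the slides: a Gaussian kernel, the transform $-\log(\text{kde})$ so that low density becomes a large value, a 90% threshold taken from the previous slide, and a method-of-moments GPD fit swapped in for whatever fitting procedure the lookout package uses. All function names are hypothetical.

```python
import math
import statistics

def kde_at_sample(sample, h, leave_one_out=False):
    """Gaussian KDE at each sample point; optionally exclude the point itself."""
    n = len(sample)
    denom = (n - 1 if leave_one_out else n) * h * math.sqrt(2.0 * math.pi)
    values = []
    for i, x in enumerate(sample):
        total = sum(
            math.exp(-0.5 * ((x - xj) / h) ** 2)
            for j, xj in enumerate(sample)
            if not (leave_one_out and j == i)
        )
        values.append(total / denom)
    return values

def neg_log(density):
    """Map low density to a large value; zero density maps to +inf."""
    return -math.log(density) if density > 0.0 else math.inf

def fit_gpd_moments(exceedances):
    """Method-of-moments estimates (shape, scale) of a GPD (an assumption)."""
    m = statistics.fmean(exceedances)
    v = statistics.variance(exceedances)
    ratio = m * m / v
    return 0.5 * (1.0 - ratio), 0.5 * m * (ratio + 1.0)

def gpd_survival(y, shape, scale):
    """P(Y > y) for a GPD with the given shape and scale, for y >= 0."""
    if abs(shape) < 1e-12:
        return math.exp(-y / scale)
    base = 1.0 + shape * y / scale
    return base ** (-1.0 / shape) if base > 0.0 else 0.0

def lookout_sketch(sample, h, alpha=0.05, quantile=0.9):
    # 1. Fit the GPD to the upper tail of -log(kde).
    y = [neg_log(d) for d in kde_at_sample(sample, h)]
    u = sorted(y)[int(quantile * len(y))]
    shape, scale = fit_gpd_moments([v - u for v in y if v > u])
    # 2. Score each point with its leave-one-out value under that GPD.
    y_loo = [neg_log(d) for d in kde_at_sample(sample, h, leave_one_out=True)]
    probs = [gpd_survival(v - u, shape, scale) if v > u else 1.0 for v in y_loo]
    # 3. Flag points whose probability falls below alpha.
    return probs, [i for i, p in enumerate(probs) if p < alpha]

# Hypothetical data: 200 standard normal quantiles plus one point at 10.0.
normal = statistics.NormalDist()
sample = [normal.inv_cdf((i + 0.5) / 200) for i in range(200)] + [10.0]
probs, flagged = lookout_sketch(sample, h=0.5)  # h = 0.5 is hypothetical
```

The leave-one-out values matter because an isolated point always contributes a kernel to its own density estimate; removing that contribution makes its estimate collapse, so the point at 10.0 (index 200) receives the smallest probability and is flagged.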
  • 21. Example • Lookout outliers with $\alpha = 0.05$
      Outlier  Probability
         1001   0.02344059
         1002   0.02513530
         1003   0.02501901
         1004   0.02504691
         1005   0.02654359
         1006   0.02636139
         1007   0.02625216
         1008   0.02452614
         1009   0.02644570
         1010   0.02283989
  • 22. Practical advantages of lookout • The user does not need to specify a bandwidth parameter, so the user can be anyone, not necessarily a mathematician • EVT-based methods have low false positive rates, which is attractive for many applications: not an alarm factory
  • 23. For the mathematician/statistician in me • Coming together of • Topological data analysis • Extreme Value Theory • Kernel density estimates • To find anomalies
  • 25. Anomaly Persistence • What if a data-point is identified as an anomaly for different bandwidth values? • Visual representation of anomaly persistence • Big picture
  • 26. Application: Computer Networks Security Honeyboost: Boosting honeypot performance with data fusion and anomaly detection – Sevvandi Kandanaarachchi, Hideya Ochiai, Asha Rao Preprint - https://arxiv.org/abs/2105.02526
  • 27. LAN Security Monitoring • ‘LAN-Security Monitoring Device’ to capture suspicious/malicious activities that happen inside a LAN (Local Area Network) • Though it is not a real camera, it works like a ‘cyber-space surveillance camera’: it captures all the broadcast packets, and direct packets to the monitoring device • [Figure: a LAN with smartphones, a printer, smart appliances, a data server and the LAN-Security Monitoring Device]
  • 28. LAN Security Monitoring • ‘LAN-Security Monitoring Device’ to capture suspicious/malicious activities that happen inside a LAN (Local Area Network) • Honeypot: a trap for attackers • [Figure: a LAN with smartphones, a printer, smart appliances, a data server and the LAN-Security Monitoring Device]
  • 29. Honeypot data • ARP data – a big shout out to everyone (broadcast to the network) • These nodes do not access the honeypot • Who has got this address – I need to communicate to you • Generally not a suspicious activity • But malicious nodes can also make ARP calls • TCP and UDP data – targeted at the honeypot • These nodes have accessed the honeypot using TCP/UDP protocols • Oooh suspicious!
  • 30. A bit more on honeypots • An intruder can be there without accessing the honeypot • Limited vision of honeypots • Honeypots are never stand alone security devices • Identifying anomalous nodes is important - Honeyboost
  • 31. Generally . . . • Anomalies detected based on individual packets – packet-based • Packet features separately for each packet • Of all the traffic, which packets are anomalous • Our contribution: we find anomalous nodes – node-based • Features of nodes using the traffic – using multivariate time series • Of all the nodes, which nodes are anomalous
  • 32. Varying-dimensional time series • Different protocols have different header features • Finding anomalies from varying dimensional time series • 200 computers/nodes = 200 varying-dimensional time series • Which one is anomalous, if at all?
  • 33. [Figure: processing pipeline] Varying-dimensional time series for each node → window model and process → multivariate time series → compute features → feature space for all nodes → lookout
  • 34. Varying-dimensional time series for each node → multivariate time series • Node A:
      Timestamp  Protocol  ARP count  ARP degree  TCP PC1  TCP PC2  UDP PC1  UDP PC2
             30  ARP              10          12        0        0        0        0
             55  TCP               0           0    -2.15     1.75        0        0
             85  UDP               0           0        0        0     3.56     0.45
  • 35. Multivariate time series → compute features → feature space for all nodes • The MV time series for each node gets transformed to a point in $\mathbb{R}^{17}$ • Node A:
      Timestamp  Protocol  ARP count  ARP degree  TCP PC1  TCP PC2  UDP PC1  UDP PC2
             30  ARP              10          12        0        0        0        0
             55  TCP               0           0    -2.15     1.75        0        0
             85  UDP               0           0        0        0     3.56     0.45
  • 36. Features • The total length of line segments in $\mathbb{R}^6$ • The maximum time difference • Number of protocols used • Number of TCP calls/UDP calls • Total length of line segments in each protocol space • Line of best fit in each protocol space • Sum of errors squared for the line of best fit • [Figure: axes TCP PC1 and TCP PC2]
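A few of these features can be sketched directly. The code below is illustrative only: the timestamps and protocols are Node A's rows from the earlier table, the TCP trajectory is hypothetical, "maximum time difference" is assumed to mean the largest gap between consecutive timestamps, and the line of best fit is assumed to be an ordinary least-squares line.

```python
import math

def total_segment_length(points):
    """Sum of Euclidean distances between consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def best_fit_line(points):
    """Least-squares line y = a + b*x through 2-D points; returns (a, b, sse)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in points)
    return a, b, sse

# Node A records from the slide: (timestamp, protocol).
records = [(30, "ARP"), (55, "TCP"), (85, "UDP")]
timestamps = [t for t, _ in records]
# Assumption: largest gap between consecutive timestamps.
max_time_difference = max(b - a for a, b in zip(timestamps, timestamps[1:]))
n_protocols = len({proto for _, proto in records})

# Hypothetical (TCP PC1, TCP PC2) trajectory for one node.
tcp_points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
length = total_segment_length(tcp_points)
intercept, slope, sse = best_fit_line(tcp_points)
```

Stacking such numbers for every protocol space gives one fixed-length feature vector per node, which is what lets nodes with varying-dimensional traffic be compared in a common feature space.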
  • 37. Findings • Suspicious nodes that do not access the honeypot • [Figure: feature space for all nodes after lookout, with two flagged nodes each labelled “This node does not access the honeypot”]
  • 38. Insights • Identify some nodes before they access the honeypot • Gain insights – find anomalies and look back at the original data • Anomaly has set suspicious flags – PSH flag and URG flag • PSH flag – PUSH flag – push packet to the application layer • URG flag – URGENT flag – treat packet as urgent; but why, when accessing the honeypot? • Can be used to derive new rules
  • 39. Summary • Lookout - an EVT-based method to find anomalies (using TDA) • An application in computer network security • R package lookout is on CRAN • Both preprints available • https://bit.ly/lookoutliers • https://arxiv.org/abs/2105.02526