PAGE©2018 ZIGHRA | WWW.ZIGHRA.COM 1
Anomaly Detection
through Reinforcement Learning
Dr. Hari Koduvely
Chief Data Scientist
ZIGHRA.COM

Outline of Talk:
● Zighra and SensifyID Platform
● Sequential Anomaly Detection Problem
● Introduction to Reinforcement Learning
● Markov Decision Process and Q-Learning
● Function Approximation using Neural Networks
● Application to Network Intrusion Detection Problem
● Implementation using TensorFlow

ZIGHRA.COM
● Zighra (https://zighra.com) provides solutions for Continuous Behavioural
Authentication & Threat Detection
● Highlights of our SensifyID Platform:
○ Core is an AI based 6-layer Anomaly Detection System combining
behavioral biometrics with contextual, social and other signals
○ Covers use cases such as User Verification, Account Takeover,
Remote Attacks and Bot Attacks
○ Can be integrated into any Web, Mobile & IoT application
○ 2 patents granted and 10+ in application stage

Sequential Anomaly Detection Problem
● Classical Anomaly Detection Problem is to find patterns in a dataset that do
not conform to expected normal behavior
● Formulated as a one-class classification task in machine learning
● In many domains the data distribution changes continuously (concept shift)
● An online learning setting is better suited to deal with concept shifts
[Figure: plot with axes current_week_purchase and average_weekly_purchase. Source of image: https://www.linkedin.com/pulse/part-2-keep-simple-machine-learning-algorithms-big-dr-dinesh/]

Sequential Anomaly Detection Problem
● In the Sequential Anomaly Detection problem, the goal is to find out whether a
subsequence of a sequence of events is anomalous
● Each event in isolation may appear normal; only the sequence
of events indicates an anomaly
○ Username-Password, Username-Password, Username-Password, ...
○ Log in to the corporate network at midnight, access a rarely used DB, download a lot of data,
transfer to USB, ...
● Straightforward supervised learning is not feasible here because of the credit
assignment problem

Introduction to Reinforcement Learning
● In Reinforcement Learning, an autonomous agent interacts with an environment and
takes an action $a_t$ in each state $s_t$
● The environment in return supplies a reward $r_t$ for the action the agent performed, as a
supervision signal, and also a new state $s_{t+1}$
[Figure: agent-environment interaction loop with labels $a_t$, $s_t$, $r_t$, $s_{t+1}$]

Introduction to Reinforcement Learning
● Reinforcement Learning can be formally defined as a Markov Decision Process
● A Markov Decision Process (MDP) is defined by the 5-tuple $\{s_t, a_t, P(s_{t+1} \mid s_t, a_t), \gamma, r_t\}$
○ $s_t$ - State at time t
○ $a_t$ - Action in state $s_t$
○ $P(s_{t+1} \mid s_t, a_t)$ - State transition probabilities
○ $\gamma$ - Discount factor
○ $r_t$ - Reward function
● The objective in an MDP is to find an optimum policy that achieves the maximum
cumulative reward over a long period of time
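One common way to make "cumulative reward over a long period" concrete is the discounted return, $G = \sum_t \gamma^t r_t$. The slides name the discount factor $\gamma$ but do not spell out this sum, so the sketch below is an assumption about the intended objective, and the function name is hypothetical.

```python
def discounted_return(rewards, gamma):
    """Return sum_t gamma**t * rewards[t] for one episode.

    Assumption: the cumulative reward is the standard discounted sum.
    """
    total = 0.0
    # Work backwards so each step is: reward now + gamma * (return from the next step on).
    for r in reversed(rewards):
        total = r + gamma * total
    return total


example_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

With `gamma` close to 1 the agent weighs distant rewards almost as much as immediate ones; with `gamma` close to 0 it becomes short-sighted.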

Q-Learning and Markov Decision Process
● Q-value function Q(s, a): an estimate of the maximum total long-term reward starting from
state s and performing action a
● Bellman equation:
$Q(s, a) = r(s) + \gamma \sum_{s'} P(s' \mid s, a) \max_{a'} Q(s', a')$
The Q-value for a state-action pair is the current reward plus the discounted expected Q-value of its successor states, taking the best action in each successor
● Central theoretical concept used in almost all formulations of reinforcement learning
● It can be proved that, starting from random initial conditions, iterating the Bellman
equation makes Q(s, a) converge to an optimum quality function $Q^*(s, a)$
● The optimum policy is given by
$\pi^*(s) = \arg\max_a Q^*(s, a)$
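The convergence claim can be illustrated on a toy problem. The sketch below iterates the Bellman equation in the standard form, with the expectation over s' taken outside the max over a'. The two-state MDP, its rewards and all names are hypothetical, and the iteration starts from zeros rather than random values.

```python
# Hypothetical toy MDP: 2 states, 2 actions, deterministic transitions.
# P[s][a] is a list of (next_state, probability); R[s] is the reward r(s).
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 1.0)]},
}
R = {0: 0.0, 1: 1.0}
GAMMA = 0.9


def q_iteration(P, R, gamma, sweeps=500):
    """Repeatedly apply Q(s,a) = r(s) + gamma * sum_s' P(s'|s,a) * max_a' Q(s',a')."""
    Q = {s: {a: 0.0 for a in P[s]} for s in P}
    for _ in range(sweeps):
        # Build the new table from the old one (synchronous update).
        Q = {
            s: {
                a: R[s] + gamma * sum(p * max(Q[s2].values()) for s2, p in P[s][a])
                for a in P[s]
            }
            for s in P
        }
    return Q


def greedy_policy(Q):
    """pi*(s) = argmax_a Q*(s, a)."""
    return {s: max(Q[s], key=Q[s].get) for s in Q}


Q_star = q_iteration(P, R, GAMMA)
policy = greedy_policy(Q_star)
```

In this toy case the greedy policy picks action 1 in both states, since that action leads to (or stays in) the rewarding state 1.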

Q-Learning and Markov Decision Process
● It is difficult to know the state transition probabilities $P(s_{t+1} \mid s_t, a_t)$ for a given problem
● Bellman’s equation can be cast in a derivative form where transition probabilities are
not needed
● Only the actual observed state from the environment is used
● Temporal Difference Learning Algorithm:
When an agent makes a transition from state s to state s’ by performing an action a,
its Q-value is updated as follows:
$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r(s) + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$
where $\alpha \ll 1$ is the learning rate
● The Q-values are adjusted towards the ideal local equilibrium at which Bellman’s equation holds.
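The temporal-difference update can be sketched for a tabular Q-function as follows. This is a minimal illustration rather than the talk's implementation: the state labels, the action set and the default values of `alpha` and `gamma` are assumptions.

```python
from collections import defaultdict


def td_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)].

    Only the observed next state s_next is used; no transition
    probabilities are required.
    """
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


# Hypothetical single transition: state "s0", action 1, reward 1.0, next state "s1".
Q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
ACTIONS = (0, 1)
new_value = td_update(Q, s="s0", a=1, r=1.0, s_next="s1", actions=ACTIONS)
```

Starting from an all-zero table, this single update moves Q("s0", 1) a fraction `alpha` of the way towards the observed reward.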

Function Approximation using Neural Networks
● Iterating Bellman’s equation is a deterministic algorithm
● For problems where the state and action spaces are small one can use a table to
represent Q(s, a)
● In many practical applications, state and action spaces are continuous
● One needs an efficient function approximation method for representing Q(s, a)
● Two standard approaches for this are
○ Tile Coding: Partition the continuous space into overlapping sets of tiles.
➢ Success depends upon the number and width of tiles.
➢ It is a linear function approximation
○ Neural Networks: Nonlinear function approximation, more powerful representation
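Tile coding can be sketched for a single continuous variable as below. The number of tiles, the number of tilings and the offset scheme are assumptions chosen for illustration; the slides do not specify them, and the function names are hypothetical.

```python
def active_tiles(x, low, high, n_tiles=4, n_tilings=2):
    """Return the index of the active tile in each tiling for a scalar x in [low, high].

    Each tiling covers the range with tiles of equal width and is shifted by a
    fraction of one tile width, so the tilings overlap.
    """
    width = (high - low) / n_tiles
    indices = []
    for k in range(n_tilings):
        offset = k * width / n_tilings
        idx = int((x - low + offset) / width)
        idx = min(idx, n_tiles)  # shifted tilings need one extra tile at the top edge
        indices.append(k * (n_tiles + 1) + idx)
    return indices


def q_value(weights, tiles):
    """Linear function approximation: Q is the sum of the weights of the active tiles."""
    return sum(weights[i] for i in tiles)


# One weight per tile per tiling; in practice one such vector is kept per action.
weights = [0.5] * (2 * (4 + 1))
tiles = active_tiles(0.3, low=0.0, high=1.0)
estimate = q_value(weights, tiles)
```

Nearby inputs share some but not all active tiles, which is what lets a linear model generalize locally over a continuous space.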

Function Approximation using Neural Networks
● One can use Neural Networks to approximate Q(s, a) as follows:
○ Inputs: State s represented by the D-dimensional vector $\{s_1, s_2, \ldots, s_D\}$
○ Outputs: Q-values for each of the N actions $\{Q_1, Q_2, \ldots, Q_N\}$
[Figure: feed-forward network with inputs $s_1, s_2, s_3, \ldots, s_D$, hidden layers, and outputs $Q_1, Q_2, \ldots, Q_N$]

Function Approximation using Neural Networks
● The loss function for training the NN is the squared difference between the Q-value predicted
by the DNN and the target Q-value given by Bellman’s equation:
$L = \tfrac{1}{2} \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]^2$
● The NN is trained using backpropagation as follows:
1. Start an episode of explorations
2. Initialize the NN and start from a random state s
3. Do a forward pass of state s through the DNN
and get Q-values for all actions
4. Perform an ε-greedy exploration for choosing an
action a for the current state s
5. Get the next state s’ and reward r from the
environment
6. Pass s’ also through the DNN and compute
$\max_{a'} Q(s', a')$
7. Set the target Q-value for the output node
corresponding to action a to be
$r + \gamma \max_{a'} Q(s', a')$
8. For all other nodes, keep the target Q-value
the same as that obtained from the DNN prediction in
step 3
9. Update the weights using backpropagation
10. Set s ← s’ and repeat steps 3-9 till a termination condition
is reached
11. Repeat the episodes till the network is trained
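One pass through the forward pass, action choice, target construction and weight update can be sketched as follows. To stay self-contained, a single linear layer stands in for the DNN, so the "backpropagation" step reduces to one gradient step on the squared loss. All names, the learning rate and the example transition are assumptions.

```python
import random


def forward(W, s):
    """Q-values for all actions; one linear output per action stands in for the DNN."""
    return [sum(w * x for w, x in zip(row, s)) for row in W]


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def build_targets(q_s, q_s_next, action, reward, gamma):
    """Target for the chosen action is r + gamma * max_a' Q(s', a');
    every other output keeps its own prediction, so its error is zero."""
    targets = list(q_s)
    targets[action] = reward + gamma * max(q_s_next)
    return targets


def sgd_step(W, s, targets, lr):
    """One gradient step on L = 1/2 * sum_i (target_i - Q_i)^2 for the linear model."""
    q_s = forward(W, s)
    for i, row in enumerate(W):
        err = targets[i] - q_s[i]
        for j, x in enumerate(s):
            row[j] += lr * err * x
    return W


# Hypothetical transition: 2 state features, 2 actions, observed reward 1.0.
W = [[0.0, 0.0], [0.0, 0.0]]
s, s_next = [1.0, 0.0], [0.0, 1.0]
q_s = forward(W, s)
action = epsilon_greedy(q_s, epsilon=0.0)  # epsilon = 0 makes the choice deterministic
targets = build_targets(q_s, forward(W, s_next), action, reward=1.0, gamma=0.9)
sgd_step(W, s, targets, lr=0.5)
```

Because the non-chosen outputs use their own predictions as targets, only the weights feeding the chosen action's output change in this step.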

Function Approximation using Neural Networks
● High-level TD NN learning iteration flow
[Figure: flow diagram showing the DNN Model, the iteration over episodes, and the iteration over exploration]

Network Intrusion Detection
● Can we use Reinforcement Learning for Network Intrusion Detection?
● Related research works:
○ James Cannady used a CMAC Neural Network and formulated Network Intrusion
Detection as an online learning problem [1]
○ Xin Xu studied the problem of host-based intrusion detection as a multi-stage cyber
attack and applied reinforcement learning [2]
○ Arturo Servin studied the DDoS attack as a traffic anomaly problem and used
reinforcement learning for detection [3]
○ Kleanthis M used a distributed reinforcement network for network intrusion response [4]
● None of these have used a DNN for function approximation

Network Intrusion Detection
● Standard dataset for scientific research: the NSL-KDD dataset [5]
● Dataset contains 4 categories of attacks in a local area network
○ DOS - Denial of Service attacks
○ R2L - Remote to Local, where a remote hacker tries to get local user privileges
○ U2R - User to Root, where a hacker operating as a normal user exploits vulnerabilities
○ Probing - Hacker scans the machine to determine vulnerabilities
● Dataset contains 125,973 connections for training and 22,543 for testing
● Training set has 53.5% normal connections and 46.5% abnormal connections
● There are 41 features (32 continuous, 3 nominal and 6 binary)
● E.g. type of protocol (TCP, UDP), port number, packet size, rate of transmission

Network Intrusion Detection
[Figure omitted. Source of image: https://nycdatascience.com/blog/student-works/network-intrusion-detection-2/]

Network Intrusion Detection
● However, the NSL-KDD dataset cannot be used for sequential anomaly detection
○ There is no timestamp; the dataset is not time-series data
○ There is no way to identify whether different connections are from the same
user/hacker or not
○ One could use it for the standard anomaly detection problem using reinforcement
learning

Network Intrusion Detection
● Reinforcement Learning Formulation with NSL-KDD Dataset
○ The states are characterized by the 41 features in the data set
○ For every state the agent takes one of two actions:
■ Send an alert
■ Do not send an alert
○ The rewards generated by the environment:
■ +1 if the state is normal and the action is "do not send an alert"
■ +1 if the state is malicious and the action is "send an alert"
■ -1 if the state is malicious and the action is "do not send an alert"
■ -1 if the state is normal and the action is "send an alert"
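The reward scheme above fits in a few lines. The integer encoding of the two actions and the function name are assumptions made for illustration.

```python
# Hypothetical action encoding; the slides only name the two actions.
NO_ALERT, SEND_ALERT = 0, 1


def reward(is_malicious, action):
    """+1 when the alert decision matches the true label of the state, -1 otherwise."""
    correct = (action == SEND_ALERT) == bool(is_malicious)
    return 1 if correct else -1
```

The scheme is symmetric: a missed attack and a false alarm cost the same. Changing the two -1 cases to different values would be one way to weigh them differently.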

Implementation using TensorFlow
● Creation of the Environment
○ Goal of the environment is to simulate the reward scheme mentioned for the
NSL-KDD dataset and also supply a new state every time
○ This can be done using the Gym toolkit from OpenAI
https://github.com/openai/gym/tree/master/gym/envs
gym-network_intrusion/
    README.md
    setup.py
    gym_network_intrusion/
        __init__.py
        envs/
            __init__.py
            network_intrusion_env.py

from gym.envs.registration import register

register(
    id='NetworkIntrusion-v0',
    entry_point='gym_network_intrusion.envs:NetworkIntrusionEnv',
)

Implementation using TensorFlow
● Creation of the Environment
import gym
from gym import error, spaces, utils
from gym.utils import seeding

class NetworkIntrusionEnv(gym.Env):
    def __init__(self):
        ...

    def _step(self, action):
        ...
        return new_state, reward, episode_over, details

    def _reset(self):
        ...
        return initial_state

    def _get_reward(self, action):
        ...
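The skeleton above leaves the method bodies open. The sketch below shows one way they could be filled in, written without the Gym dependency so it runs on its own. It is not the code from the talk. Replaying labelled records in shuffled order, ending the episode when the records run out, and all names are assumptions.

```python
import random


class ToyIntrusionEnv:
    """Hypothetical stand-in for NetworkIntrusionEnv: replays labelled records."""

    NO_ALERT, SEND_ALERT = 0, 1

    def __init__(self, records, seed=0):
        # Each record is (feature_vector, is_attack).
        self.records = list(records)
        self.rng = random.Random(seed)
        self.order = []
        self.pos = 0

    def reset(self):
        """Shuffle the records and return the first state."""
        self.order = list(range(len(self.records)))
        self.rng.shuffle(self.order)
        self.pos = 0
        return self.records[self.order[0]][0]

    def step(self, action):
        """Score the action on the current record, then move to the next one."""
        _, is_attack = self.records[self.order[self.pos]]
        reward = 1 if (action == self.SEND_ALERT) == bool(is_attack) else -1
        self.pos += 1
        done = self.pos >= len(self.order)
        next_state = None if done else self.records[self.order[self.pos]][0]
        return next_state, reward, done, {}


# Usage with three made-up records and a hand-written threshold policy.
env = ToyIntrusionEnv([([0.1], 0), ([0.9], 1), ([0.2], 0)])
state, total, done = env.reset(), 0, False
while not done:
    action = env.SEND_ALERT if state[0] > 0.5 else env.NO_ALERT
    state, r, done, _ = env.step(action)
    total += r
```

The `step` return value mirrors the `new_state, reward, episode_over, details` tuple of the skeleton, so an agent written against this toy class should need few changes to run against a Gym environment.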

Implementation using TensorFlow
● Implementation using TensorFlow
● Two architectures:
○ Deep NN architecture:
■ Discretize continuous variables and use one hot representation
○ Deep and Wide NN architecture:
■ Useful for combining continuous and discrete variables into one NN model
■ Also combines the power of memorization and generalization
■ https://www.tensorflow.org/tutorials/wide_and_deep

Implementation using TensorFlow
● Implementation of a simple NN using TensorFlow
○ Discretize continuous variables and use one-hot representation
○ Used binning (#bins = 5) to convert continuous variables to categorical
○ The one-hot encoded input vector has 226 components
○ 3-layer feed-forward neural network (226 x 10 x 1)
● Code available at https://github.com/harik68/RL4AD
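The discretization step can be sketched as follows. Equal-width bins are an assumption, since the slides only state that 5 bins were used, and the function names and the example feature values are hypothetical.

```python
def bin_index(x, low, high, n_bins=5):
    """Equal-width bin of x within [low, high], clamped to the valid range."""
    if high <= low:
        return 0
    idx = int((x - low) / (high - low) * n_bins)
    return max(0, min(idx, n_bins - 1))


def one_hot(index, size):
    """Vector of zeros with a single 1 at the given index."""
    vec = [0] * size
    vec[index] = 1
    return vec


def encode_record(continuous, ranges, nominal, vocab, n_bins=5):
    """Concatenate one-hot bins for continuous features and one-hot codes for nominal ones."""
    features = []
    for x, (low, high) in zip(continuous, ranges):
        features += one_hot(bin_index(x, low, high, n_bins), n_bins)
    for value, values in zip(nominal, vocab):
        features += one_hot(values.index(value), len(values))
    return features


# One continuous feature scaled to [0, 1] and one nominal feature (protocol type).
encoded = encode_record(
    continuous=[0.5],
    ranges=[(0.0, 1.0)],
    nominal=["udp"],
    vocab=[["tcp", "udp", "icmp"]],
)
```

Each continuous feature contributes `n_bins` input components and each nominal feature contributes one component per distinct value, which is how the input width of the network is determined.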

Implementation using TensorFlow
● Model performance (work in progress!)
[Figure: TPR and FPR for Baseline vs. DNN-RL Model V0.1. Source of image for baseline: https://nycdatascience.com/blog/student-works/network-intrusion-detection-2/]

Next Steps
● Experiment with different discretization schemes or even tile coding
● Experiment with different NN architectures (Deep and Wide)

References
1. Next Generation Intrusion Detection: Autonomous Reinforcement Learning of Network Attacks,
J. Cannady, 23rd National Information Systems Security Conference (2000)
2. Sequential anomaly detection based on temporal-difference learning: Principles,
models and case studies, Xin Xu, Applied Soft Computing 10 (2010) 859-867
3. Towards Traffic Anomaly Detection via Reinforcement Learning and Data Flow, A. Servin
(PDF, york.ac.uk)
4. Distributed response to network intrusions using multiagent reinforcement learning, Engineering
Applications of Artificial Intelligence, Volume 41 Issue C, May 2015, Pages 270-284
5. NSL-KDD dataset, Canadian Institute for Cyber Security, University of New Brunswick
(http://www.unb.ca/cic/datasets/nsl.html)
6. Artificial Intelligence: A Modern Approach, Stuart J. Russell and Peter Norvig, Prentice Hall
(2009)
THANK YOU !
We are hiring Data Scientists, Machine Learning Engineers and Mobile Developers
Apply at career@zighra.com
Anomaly Detection through Reinforcement Learning

Más contenido relacionado

Similar a Anomaly Detection through Reinforcement Learning

Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5
Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5
Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5TigerGraph
 
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 ClassificationUsing Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 ClassificationTigerGraph
 
A Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement LearningA Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement LearningGiancarlo Frison
 
Lifecycle Inference on Unreliable Event Data
Lifecycle Inference on Unreliable Event DataLifecycle Inference on Unreliable Event Data
Lifecycle Inference on Unreliable Event DataDatabricks
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fittingWush Wu
 
DDPG algortihm for angry birds
DDPG algortihm for angry birdsDDPG algortihm for angry birds
DDPG algortihm for angry birdsWangyu Han
 
Machine Learning with Python
Machine Learning with PythonMachine Learning with Python
Machine Learning with PythonGLC Networks
 
Designing States, Actions, and Rewards for Using POMDP in Session Search
Designing States, Actions, and Rewards for Using POMDP in Session SearchDesigning States, Actions, and Rewards for Using POMDP in Session Search
Designing States, Actions, and Rewards for Using POMDP in Session SearchGrace Yang
 
Machine Learning with Python
Machine Learning with Python Machine Learning with Python
Machine Learning with Python GLC Networks
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImageryRAHUL BHOJWANI
 
MUM Melbourne : Build Enterprise Wireless with CAPsMAN
MUM Melbourne : Build Enterprise Wireless with CAPsMANMUM Melbourne : Build Enterprise Wireless with CAPsMAN
MUM Melbourne : Build Enterprise Wireless with CAPsMANGLC Networks
 
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...TigerGraph
 
Ad Click Prediction - Paper review
Ad Click Prediction - Paper reviewAd Click Prediction - Paper review
Ad Click Prediction - Paper reviewMazen Aly
 
An introduction to deep reinforcement learning
An introduction to deep reinforcement learningAn introduction to deep reinforcement learning
An introduction to deep reinforcement learningBig Data Colombia
 
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...Gobinath Loganathan
 
Deep learning approach for network intrusion detection system
Deep learning approach for network intrusion detection systemDeep learning approach for network intrusion detection system
Deep learning approach for network intrusion detection systemAvinash Kumar
 
Botnet detection in SDN by DL techniques
Botnet detection in SDN by DL techniquesBotnet detection in SDN by DL techniques
Botnet detection in SDN by DL techniquesIvan Letteri
 
Policy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detectionPolicy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detectionKishor Datta Gupta
 

Similar a Anomaly Detection through Reinforcement Learning (20)

Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5
Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5
Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5
 
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 ClassificationUsing Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
 
A Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement LearningA Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement Learning
 
Lifecycle Inference on Unreliable Event Data
Lifecycle Inference on Unreliable Event DataLifecycle Inference on Unreliable Event Data
Lifecycle Inference on Unreliable Event Data
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fitting
 
IDS for IoT.pptx
IDS for IoT.pptxIDS for IoT.pptx
IDS for IoT.pptx
 
DDPG algortihm for angry birds
DDPG algortihm for angry birdsDDPG algortihm for angry birds
DDPG algortihm for angry birds
 
Machine Learning with Python
Machine Learning with PythonMachine Learning with Python
Machine Learning with Python
 
Designing States, Actions, and Rewards for Using POMDP in Session Search
Designing States, Actions, and Rewards for Using POMDP in Session SearchDesigning States, Actions, and Rewards for Using POMDP in Session Search
Designing States, Actions, and Rewards for Using POMDP in Session Search
 
Machine Learning with Python
Machine Learning with Python Machine Learning with Python
Machine Learning with Python
 
Reinforcement Learning - DQN
Reinforcement Learning - DQNReinforcement Learning - DQN
Reinforcement Learning - DQN
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite Imagery
 
MUM Melbourne : Build Enterprise Wireless with CAPsMAN
MUM Melbourne : Build Enterprise Wireless with CAPsMANMUM Melbourne : Build Enterprise Wireless with CAPsMAN
MUM Melbourne : Build Enterprise Wireless with CAPsMAN
 
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...
 
Ad Click Prediction - Paper review
Ad Click Prediction - Paper reviewAd Click Prediction - Paper review
Ad Click Prediction - Paper review
 
An introduction to deep reinforcement learning
An introduction to deep reinforcement learningAn introduction to deep reinforcement learning
An introduction to deep reinforcement learning
 
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...
 
Deep learning approach for network intrusion detection system
Deep learning approach for network intrusion detection systemDeep learning approach for network intrusion detection system
Deep learning approach for network intrusion detection system
 
Botnet detection in SDN by DL techniques
Botnet detection in SDN by DL techniquesBotnet detection in SDN by DL techniques
Botnet detection in SDN by DL techniques
 
Policy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detectionPolicy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detection
 

Último

Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 

Último (20)

Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 

Anomaly Detection through Reinforcement Learning

  • 1. PAGE©2018 ZIGHRA | WWW.ZIGHRA.COM 1 Anomaly Detection through Reinforcement Learning . . Dr. Hari Koduvely Chief Data Scientist ZIGHRA.COM
  • 2. PAGE©2018 ZIGHRA | WWW.ZIGHRA.COM 2 Outline of Talk: . ● Zighra and SensifyID Platform ● Sequential Anomaly Detection Problem ● Introduction to Reinforcement Learning ● Markov Decision Process and Q-Learning ● Function Approximation using Neural Networks ● Application to Network Intrusion Detection Problem ● Implementation using TensorFlow
  • 3. PAGE©2018 ZIGHRA | WWW.ZIGHRA.COM 3 ZIGHRA.COM . ● Zighra (https://zighra.com) provides solutions for Continuous Behavioural Authentication & Threat Detection ● Highlights of our SensifyID Platform: ○ Core is an AI based 6-layer Anomaly Detection System combining behavioral biometrics with contextual, social and other signals ○ Cover uses cases such as User Verification, Account Takeover, Remote Attacks and Bot Attacks ○ Can be integrated to any Web, Mobile & IoT application ○ 2 patents granted and 10+ in application stage
  • 4. PAGE©2018 ZIGHRA | WWW.ZIGHRA.COM 4 Sequential Anomaly Detection Problem . ● Classical Anomaly Detection Problem is to find patterns in a dataset that do not conform to expected normal behavior ● Formulated as a one-class classification task in machine learning ● In many domains the data distribution changes continuously (concept shift) ● An online learning setting is more ideal to deal with concept shifts current_week_purchase average_weekly_purchase Source of image https://www.linkedin.com/pulse/part-2-keep-simple-machine-learning-algorithms-big-dr-dinesh/
  • 5. PAGE©2018 ZIGHRA | WWW.ZIGHRA.COM 5 Sequential Anomaly Detection Problem . ● In Sequential Anomaly Detection problem the goal is to find out if a subsequence of a sequence of events shows anomaly or not ● Each event in isolation would appear to be normal and only the sequence of events would indicate an anomaly ○ Username-Password, Username-Password, Username-Password,.... ○ Login to corporate network in midnight, Access a DB rarely used, Download lot of data, Transfer to USB,...... ● A straightforward supervised learning is not feasible here because of credit assignment problem
  • 6. Introduction to Reinforcement Learning
    ● In Reinforcement Learning, an autonomous agent interacts with an environment and takes an action $a_t$ in each state $s_t$
    ● In return, the environment supplies a reward $r_t$ for the action the agent performed, which acts as a supervision signal, and also a new state $s_{t+1}$
    [Figure: agent-environment loop. The agent sends action $a_t$ in state $s_t$; the environment returns $r_t$ and $s_{t+1}$]
  • 7. Introduction to Reinforcement Learning
    ● Reinforcement Learning can be formally defined as a Markov Decision Process
    ● A Markov Decision Process (MDP) is defined by the 5-tuple $\{s_t, a_t, P(s_{t+1} \mid s_t, a_t), \gamma, r_t\}$
      ○ $s_t$ - State at time $t$
      ○ $a_t$ - Action taken in state $s_t$
      ○ $P(s_{t+1} \mid s_t, a_t)$ - State transition probabilities
      ○ $\gamma$ - Discount factor
      ○ $r_t$ - Reward at time $t$
    ● The objective of an MDP is to find an optimum policy that achieves the maximum cumulative reward over a long period of time
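One common way to make "maximum cumulative reward over a long period of time" precise is the discounted return. The symbols $G_t$ and $\pi$ are introduced here for illustration and do not appear on the slide; the form below assumes the standard discounting convention.

```latex
G_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k}, \qquad 0 \le \gamma < 1,
\qquad
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}\left[ G_t \mid \pi \right]
```

With $\gamma < 1$ the sum stays finite for bounded rewards, and smaller values of $\gamma$ make the agent favour near-term rewards.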
  • 8. Q-Learning and Markov Decision Process
    ● Q-value function $Q(s, a)$ - an estimate of the maximum total long-term reward obtained by starting from state $s$ and performing action $a$
    ● Bellman equation:
      $$Q(s, a) = r(s) + \gamma \sum_{s'} P(s' \mid s, a) \max_{a'} Q(s', a')$$
      The Q-value for a state-action pair is the current reward plus the discounted expected value of the best Q-value in its successor states
    ● This is the central theoretical concept used in almost all formulations of reinforcement learning
    ● It can be proved that, starting from random initial conditions, repeated iteration of the Bellman equation makes $Q(s, a)$ converge to an optimum quality function $Q^*(s, a)$
    ● The optimum policy is given by $\pi^*(s) = \arg\max_a Q^*(s, a)$
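As a minimal sketch of iterating the Bellman equation, the following runs synchronous sweeps on a tiny tabular MDP. The two-state MDP, the function names and the number of sweeps are hypothetical choices made for illustration; they are not from the talk.

```python
def q_value_iteration(states, actions, P, r, gamma=0.9, sweeps=200):
    """Iterate the Bellman equation for Q(s, a) on a small tabular MDP.

    P[s][a] is a dict {next_state: probability}; r[s] is the reward in state s.
    """
    Q = {s: {a: 0.0 for a in actions} for s in states}
    for _ in range(sweeps):
        # Synchronous sweep: the right-hand side only reads the previous Q.
        Q = {
            s: {
                a: r[s] + gamma * sum(
                    p * max(Q[s2].values()) for s2, p in P[s][a].items()
                )
                for a in actions
            }
            for s in states
        }
    return Q


def greedy_policy(Q):
    """pi*(s) = argmax_a Q*(s, a)."""
    return {s: max(Q[s], key=Q[s].get) for s in Q}


# Hypothetical two-state MDP: "stay" keeps the state, "move" switches it.
states = ["low", "high"]
actions = ["stay", "move"]
P = {
    "low": {"stay": {"low": 1.0}, "move": {"high": 1.0}},
    "high": {"stay": {"high": 1.0}, "move": {"low": 1.0}},
}
r = {"low": 0.0, "high": 1.0}

Q_star = q_value_iteration(states, actions, P, r)
policy = greedy_policy(Q_star)
```

In this toy problem the greedy policy moves out of the unrewarded state and stays in the rewarded one, which is what one would expect from the reward definition.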
  • 9. Q-Learning and Markov Decision Process
    ● The state transition probabilities $P(s_{t+1} \mid s_t, a_t)$ are usually not known for a given problem
    ● The Bellman equation can be cast in an incremental (difference) form in which the transition probabilities are not needed
    ● Only the state actually observed from the environment is used
    ● Temporal Difference Learning Algorithm: when an agent makes a transition from state $s$ to state $s'$ by performing an action $a$, the Q-value is updated as follows:
      $$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r(s) + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$
      where $\alpha \ll 1$ is the learning rate
    ● The update moves the Q-values towards the local equilibrium at which the Bellman equation holds
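The temporal-difference update can be sketched for a tabular Q as below. The table layout (a dict of dicts), the state and action names, and the default values of `alpha` and `gamma` are assumptions made for this example.

```python
def td_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Apply one temporal-difference update to the table Q in place."""
    # Target uses only the observed next state, no transition probabilities.
    target = reward + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]


# Hypothetical table with two states and two actions.
Q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 1.0, "right": 2.0},
}
updated = td_update(Q, "s0", "right", reward=1.0, s_next="s1")
```

Here the target is 1.0 + 0.9 × 2.0 = 2.8, so the entry for ("s0", "right") moves one tenth of the way from 0.0 towards 2.8.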
  • 10. Function Approximation using Neural Networks
    ● Iterating the Bellman equation is a deterministic algorithm
    ● For problems where the state and action spaces are small, one can use a table to represent $Q(s, a)$
    ● In many practical applications the state and action spaces are continuous
    ● One then needs an efficient function approximation method for representing $Q(s, a)$
    ● Two standard approaches for this are:
      ○ Tile Coding: partition the continuous space into an overlapping set of tiles
        ➢ Success depends upon the number and width of the tiles
        ➢ It is a linear function approximation
      ○ Neural Networks: nonlinear function approximation, a more powerful representation
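To illustrate why tile coding is a linear approximation, here is a minimal one-dimensional sketch. The number of tiles, the number of tilings, the offset scheme and the function names are assumptions chosen for the example, and inputs are assumed to lie in [low, high].

```python
def active_tiles(x, low, high, n_tiles=4, n_tilings=3):
    """Return one active tile index per tiling for a scalar x in [low, high]."""
    width = (high - low) / n_tiles
    indices = []
    for k in range(n_tilings):
        offset = k * width / n_tilings          # each tiling is shifted slightly
        tile = int((x - low + offset) / width)  # tile hit within this tiling
        tile = min(tile, n_tiles)               # shifted tilings use one extra tile
        indices.append(k * (n_tiles + 1) + tile)
    return indices


def linear_value(weights, tiles):
    """Linear approximation: the value is the sum of the active tiles' weights."""
    return sum(weights[i] for i in tiles)


# One weight per tile per tiling; all set to 0.5 for the demonstration.
weights = [0.5] * (3 * (4 + 1))
tiles = active_tiles(0.2, low=0.0, high=1.0)
value = linear_value(weights, tiles)
```

Nearby inputs share most of their active tiles, which is how the overlap gives generalization while the value stays linear in the weights.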
  • 11. Function Approximation using Neural Networks
    ● One can use Neural Networks to approximate $Q(s, a)$ as follows:
      ○ Inputs: state $s$ represented by the D-dimensional vector $\{s_1, s_2, \ldots, s_D\}$
      ○ Outputs: Q-values for each of the N actions $\{Q_1, Q_2, \ldots, Q_N\}$
    [Figure: network diagram with inputs $s_1, s_2, s_3, \ldots, s_D$, hidden layers, and outputs $Q_1, Q_2, \ldots, Q_N$]
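A forward pass of such a network can be sketched in plain Python. The single hidden layer, the ReLU activation, the layer sizes and the weight initialization are all assumptions for illustration; the talk does not specify them at this point.

```python
import random


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, relu=False):
    """Compute W x + b, optionally followed by a ReLU."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def q_values(state, hidden, output):
    """Forward pass: D-dimensional state in, one Q-value per action out."""
    return dense(dense(state, hidden, relu=True), output)


rng = random.Random(0)
D, H, N = 4, 8, 2                      # hypothetical sizes
hidden = init_layer(D, H, rng)
output = init_layer(H, N, rng)
q = q_values([0.5, -1.0, 0.0, 2.0], hidden, output)
```

Producing all N Q-values in one pass means the greedy action is found with a single forward evaluation per state.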
  • 12. Function Approximation using Neural Networks
    ● The loss function for training the NN is the squared difference between the Q-value predicted by the DNN and the target Q-value given by the Bellman equation:
      $$L = \tfrac{1}{2} \left[ \left( r + \gamma \max_{a'} Q(s', a') \right) - Q(s, a) \right]^2$$
    ● The NN is trained using backpropagation as follows:
      1. Initialize the NN
      2. Start an episode of exploration from a random state $s$
      3. Do a forward pass of state $s$ through the DNN and get the Q-values for all actions
      4. Perform an ε-greedy exploration to choose an action $a$ for the current state $s$
      5. Get the next state $s'$ and reward $r$ from the environment
      6. Pass $s'$ through the DNN as well and compute $\max_{a'} Q(s', a')$
      7. Set the target Q-value for the output node corresponding to action $a$ to $r + \gamma \max_{a'} Q(s', a')$
      8. For all other nodes, keep the target Q-value the same as the DNN prediction obtained in step 3
      9. Update the weights using backpropagation
      10. Set $s \leftarrow s'$ and repeat steps 3-9 until a termination condition is reached
      11. Repeat the episodes (steps 2-10) until the network is trained
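Steps 4, 7 and 8 and the loss can be sketched independently of any particular network library. The function names and the example numbers are hypothetical; the Q-value lists stand in for the DNN outputs for $s$ and $s'$.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def td_targets(q_s, q_next, action, reward, gamma=0.9):
    """Target vector: Bellman target for the chosen action, predictions elsewhere."""
    targets = list(q_s)
    targets[action] = reward + gamma * max(q_next)
    return targets


def squared_loss(q_s, targets):
    """L = 1/2 * sum of squared differences; only the chosen action contributes."""
    return 0.5 * sum((t - q) ** 2 for q, t in zip(q_s, targets))


q_s = [0.2, 0.5]        # network output for state s
q_next = [1.0, 0.0]     # network output for next state s'
targets = td_targets(q_s, q_next, action=0, reward=1.0)
loss = squared_loss(q_s, targets)
```

Because the targets for the non-chosen actions equal the predictions, their error terms are zero and the gradient flows only through the output node of the action actually taken.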
  • 13. Function Approximation using Neural Networks
    ● High-level TD NN learning iteration flow
    [Figure labels: DNN Model; Iteration over episodes; Iteration over exploration]
  • 14. Network Intrusion Detection
    ● Can we use Reinforcement Learning for Network Intrusion Detection?
    ● Related research works:
      ○ James Cannady used a CMAC Neural Network and formulated Network Intrusion Detection as an online learning problem [1]
      ○ Xin Xu studied host-based intrusion detection as a multi-stage cyber attack problem and applied reinforcement learning [2]
      ○ Arturo Servin studied DDoS attacks as a traffic anomaly problem and used reinforcement learning for detection [3]
      ○ Kleanthis Malialis used distributed multi-agent reinforcement learning for network intrusion response [4]
    ● None of these have used a DNN for function approximation
  • 15. Network Intrusion Detection
    ● The NSL-KDD dataset [5] is a standard dataset for scientific research
    ● The dataset contains 4 categories of attacks in a local area network:
      ○ DOS - Denial of Service attacks
      ○ R2L - Remote to Local, where a remote hacker tries to get local user privileges
      ○ U2R - User to Root, where a hacker operates as a normal user and exploits vulnerabilities
      ○ Probing - the hacker scans the machine to determine vulnerabilities
    ● The dataset contains 125,973 connections for training and 22,543 for testing
    ● The training set has 53.5% normal connections and 46.5% abnormal connections
    ● There are 41 features (32 continuous, 3 nominal and 6 binary)
    ● E.g. type of protocol (TCP, UDP), port number, packet size, rate of transmission
  • 16. Network Intrusion Detection
    [Figure. Source of image: https://nycdatascience.com/blog/student-works/network-intrusion-detection-2/]
  • 17. Network Intrusion Detection
    ● However, the NSL-KDD dataset cannot be used for sequential anomaly detection
      ○ There are no timestamps; the dataset is not time-series data
      ○ There is no way to tell whether different connections come from the same user/hacker
      ○ One could still use it for the standard anomaly detection problem with reinforcement learning
  • 19. Network Intrusion Detection
    ● Reinforcement Learning formulation with the NSL-KDD dataset
      ○ The states are characterized by the 41 features in the dataset
      ○ For every state the agent takes one of two actions:
        ■ Send an alert
        ■ Do not send an alert
      ○ The rewards generated by the environment:
        ■ +1 if the state is normal and the action is "do not send an alert"
        ■ +1 if the state is malicious and the action is "send an alert"
        ■ -1 if the state is malicious and the action is "do not send an alert"
        ■ -1 if the state is normal and the action is "send an alert"
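The reward scheme above reduces to a single comparison. In this minimal sketch the action encoding (`ALERT = 1`, `NO_ALERT = 0`) and the function name are assumptions made for the example.

```python
ALERT, NO_ALERT = 1, 0  # hypothetical encoding of the two actions


def reward(is_malicious, action):
    """+1 when the action matches the true label of the connection, -1 otherwise."""
    correct = (action == ALERT) == bool(is_malicious)
    return 1 if correct else -1
```

The scheme is symmetric: a missed attack and a false alarm are penalized equally, which is a modelling choice that could be changed by using different magnitudes.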
  • 20. Implementation using TensorFlow
    ● Creation of the Environment
      ○ The goal of the environment is to simulate the reward scheme described for the NSL-KDD dataset and to supply a new state at every step
      ○ This can be done using the Gym toolkit from OpenAI: https://github.com/openai/gym/tree/master/gym/envs
    Package layout:
        gym-network_intrusion/
          README.md
          setup.py
          gym_network_intrusion/
            __init__.py
            envs/
              __init__.py
              network_intrusion_env.py
    Environment registration:
        from gym.envs.registration import register

        register(
            id='NetworkIntrusion-v0',
            entry_point='gym_network_intrusion.envs:NetworkIntrusionEnv',
        )
  • 21. Implementation using TensorFlow
    ● Creation of the Environment
        import gym
        from gym import error, spaces, utils
        from gym.utils import seeding

        class NetworkIntrusionEnv(gym.Env):
            def __init__(self):
                ...

            # Older Gym releases override _step/_reset; newer ones expect step/reset.
            def _step(self, action):
                ...
                return new_state, reward, episode_over, details

            def _reset(self):
                ...
                return initial_state

            def _get_reward(self, action):
                ...
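To show how the elided methods might fit together, here is a plain-Python stand-in that mirrors the same interface without depending on Gym. The class name, the record format and the idea of serving shuffled records as successive states are assumptions for illustration, not the talk's implementation.

```python
import random


class ToyIntrusionEnv:
    """Plain-Python stand-in for the Gym environment (no gym dependency)."""

    ALERT, NO_ALERT = 1, 0

    def __init__(self, records, seed=0):
        # records: list of (feature_vector, is_malicious) pairs
        self.records = list(records)
        self.rng = random.Random(seed)
        self.order = []
        self.index = 0

    def reset(self):
        """Shuffle the records and return the first state of a new episode."""
        self.order = list(range(len(self.records)))
        self.rng.shuffle(self.order)
        self.index = 0
        return self.records[self.order[0]][0]

    def step(self, action):
        """Score the action on the current record, then move to the next one."""
        _, is_malicious = self.records[self.order[self.index]]
        reward = 1 if (action == self.ALERT) == is_malicious else -1
        self.index += 1
        episode_over = self.index >= len(self.order)
        new_state = None if episode_over else self.records[self.order[self.index]][0]
        return new_state, reward, episode_over, {}


# Two made-up records and a hand-written threshold policy.
env = ToyIntrusionEnv([([0.1, 0.2], False), ([0.9, 0.8], True)])
state, done, total = env.reset(), False, 0
while not done:
    action = env.ALERT if state[0] > 0.5 else env.NO_ALERT
    state, r, done, _ = env.step(action)
    total += r
```

Since the connections carry no ordering information, the next state here does not depend on the action; only the reward does.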
  • 22. Implementation using TensorFlow
    ● Two architectures:
      ○ Deep NN architecture:
        ■ Discretize continuous variables and use a one-hot representation
      ○ Deep and Wide NN architecture:
        ■ Useful for combining continuous and discrete variables into one NN model
        ■ Also combines the power of memorization and generalization
        ■ https://www.tensorflow.org/tutorials/wide_and_deep
  • 23. Implementation using TensorFlow
    ● Implementation of a simple NN using TensorFlow
      ○ Discretize continuous variables and use a one-hot representation
      ○ Used binning (#bins = 5) to convert continuous variables to categorical ones
      ○ The resulting one-hot encoded input has 226 dimensions
      ○ 3-layer feed-forward neural network (226 X 10 X 1)
    ● Code available at https://github.com/harik68/RL4AD
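The discretization step can be sketched as follows. Equal-width bins, the value ranges and the helper names are assumptions; the slide states only that 5 bins were used, not how the bin edges were chosen.

```python
def bin_index(value, low, high, n_bins=5):
    """Equal-width bin index in [0, n_bins - 1] for a continuous value."""
    if high <= low:
        return 0
    position = (value - low) / (high - low)
    return min(max(int(position * n_bins), 0), n_bins - 1)


def one_hot(index, size):
    """Vector of zeros with a single 1 at the given index."""
    return [1 if i == index else 0 for i in range(size)]


def encode(continuous, ranges, nominal, vocabularies, n_bins=5):
    """Concatenate one-hot blocks for binned continuous and nominal features."""
    vector = []
    for value, (low, high) in zip(continuous, ranges):
        vector += one_hot(bin_index(value, low, high, n_bins), n_bins)
    for value, vocab in zip(nominal, vocabularies):
        vector += one_hot(vocab.index(value), len(vocab))
    return vector


# Two made-up continuous features and one nominal feature.
x = encode(
    continuous=[0.0, 50.0],
    ranges=[(0.0, 1.0), (0.0, 100.0)],
    nominal=["udp"],
    vocabularies=[["tcp", "udp", "icmp"]],
)
```

Each continuous feature contributes 5 input dimensions and each nominal feature contributes one dimension per category, so the input width grows with the vocabulary sizes.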
  • 24. Implementation using TensorFlow
    ● Model performance (work in progress!)
    [Figure: TPR and FPR for the baseline and for DNN-RL Model V0.1. Source of image for baseline: https://nycdatascience.com/blog/student-works/network-intrusion-detection-2/]
  • 25. Next Steps
    ● Experiment with different discretization schemes, or even tile coding
    ● Experiment with different NN architectures (Deep and Wide)
  • 26. References
    1. J. Cannady, "Next Generation Intrusion Detection: Autonomous Reinforcement Learning of Network Attacks", 23rd National Information Systems Security Conference (2000)
    2. Xin Xu, "Sequential anomaly detection based on temporal-difference learning: Principles, models and case studies", Applied Soft Computing 10 (2010) 859–867
    3. A. Servin, "Towards Traffic Anomaly Detection via Reinforcement Learning and Data Flow", [PDF] york.ac.uk
    4. K. Malialis and D. Kudenko, "Distributed response to network intrusions using multiagent reinforcement learning", Engineering Applications of Artificial Intelligence, Volume 41, Issue C, May 2015, pages 270-284
    5. NSL-KDD dataset, Canadian Institute for Cybersecurity, University of New Brunswick (http://www.unb.ca/cic/datasets/nsl.html)
    6. Stuart J. Russell and Peter Norvig, "Artificial Intelligence: A Modern Approach", Prentice Hall (2009)
  • 27. THANK YOU!
    We are hiring Data Scientists, Machine Learning Engineers and Mobile Developers. Apply at career@zighra.com