1. temporal defenses for robust
recommendations
neal lathia, s. hailes, l. capra
PSDML @ ECML/PKDD, Sept 24 2010
email: n.lathia@cs.ucl.ac.uk
twitter: @neal_lathia
http://www.cs.ucl.ac.uk/staff/n.lathia
2. what are recommender systems?
● web portals that (try to) connect you with the content (movies, music, books, ...) that interests you
● many, many examples (netflix, last.fm, lovefilm, amazon)
3. how do they work?
● collaborative filtering: reasoning on the user-item rating matrix; many techniques available (kNN, SVD)
● ranking based on predicted interest
     i1  i2  i3  i4  i5
u1   1*  5*  5*  ?   1*
u2   3*  2*  2*
u3   4*  3*  3*
u4   4*  2*  3*  2*
u5   5*  1*  1*
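A user-based kNN prediction like the one above can be sketched in a few lines. This is a minimal illustration, not the talk's implementation; the `cosine` and `predict` helpers, the toy ratings dict, and `k=2` are all assumptions.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two users over their co-rated items."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = sqrt(sum(u[i] ** 2 for i in common)) * sqrt(sum(v[i] ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, user, item, k=2):
    """Predict a rating as the similarity-weighted average of the
    k most similar neighbours who rated the item."""
    neighbours = [(cosine(ratings[user], ratings[v]), ratings[v][item])
                  for v in ratings if v != user and item in ratings[v]]
    top = sorted(neighbours, reverse=True)[:k]
    sim_sum = sum(s for s, _ in top)
    return sum(s * r for s, r in top) / sim_sum if sim_sum else None
```

with a toy matrix, `predict(ratings, "a", "i3")` averages the neighbours' ratings of i3, weighted by how similarly they rated the items user "a" has rated.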
4. wisdom of the (anonymous) crowds
● “based on the premise that people looking for
information should be able to make use of what others
have already found and evaluated”
5. wisdom of the (anonymous) crowds
+ you don't have to know who rated what to receive recommendations
– who are they? are they rating honestly? are they human?
6. ...a sybil attack...
(a.k.a. shilling attack, profile injection attack)
...when an attacker tries to subvert the system by
creating a large number of sybils—pseudonymous
identities—in order to gain a disproportionate amount of
influence...
10. attacks?
random attacks: inject noise
targeted attacks: structured attack
11. structured attacks: how?
target: item that attacker wants promoted/demoted
selected: similar items, to deceive the algorithm
filler: other items, to deceive humans
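The three parts of a structured attack profile can be sketched as below. This is an illustrative construction only, the function name, parameters, and rating choices (target pushed to the maximum, selected items rated near their means, filler rated at random) are assumptions, not the talk's definition.

```python
import random

def make_sybil_profile(target_item, selected_items, filler_pool, item_means,
                       n_filler=10, promote=True):
    """Build one sybil rating profile for a structured (shilling) attack.
    target: the item to promote/demote; selected: items rated near their
    observed means to deceive the algorithm; filler: random items rated
    randomly so the profile looks human."""
    profile = {target_item: 5 if promote else 1}
    for item in selected_items:
        profile[item] = round(item_means[item])     # mimic typical taste
    for item in random.sample(filler_pool, n_filler):
        profile[item] = random.randint(1, 5)        # random-looking filler
    return profile
```

an attacker would inject many such profiles under distinct pseudonymous identities.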
14. problems with static classification
(user-item rating matrix; some users are honest, others are sybils)
● when to run the classifier?
● when is the system under attack?
● when are sybils damaging recommendations?
15. proposal: temporal defenses
1. force sybils to draw out their attack
2. learn normal temporal behaviour
3. monitor & detect a wide range of attacks
~ and then ~
4. force sybils to attack more intelligently
19. 1. force sybils to draw out their attack
how? distrust newcomers
sybils are forced to appear more than once
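One simple way to distrust newcomers is to discount their ratings until their accounts have aged; a sketch of such a weighting follows (the function name, the linear ramp, and the 30-day probation window are all assumptions for illustration):

```python
def newcomer_weight(days_active, probation_days=30):
    """Weight a user's ratings by account age: 0 for brand-new accounts,
    ramping linearly to 1 after the probation window. Sybils created in
    bulk at attack time therefore contribute little until they persist."""
    return min(1.0, max(0.0, days_active / probation_days))
```

a recommender would multiply each rating's influence by this weight, forcing sybils to spread their attack over time rather than strike in a single burst.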
20. 2. sybil group dynamics
single sybil = not an effective attack
sybils need to collude: how?
21. 2. examine sybil group dynamics
how many sybils are there?
how many ratings per sybil?
22. 2. examine sybil group dynamics
how many sybils are there?
(few, many) (many, many)
(few, few) (many, few)
how many ratings per sybil?
23. how does this affect data? (attack impact)
(figure: attack impact across the sybil-count × ratings-per-sybil regimes)
24. how to detect these attacks? (monitor!)
monitor at three granularities: user-level, item-level, and system-level,
each suited to a different (sybil-count, ratings-per-sybil) regime
25. overview of methodology
● monitor: learn how data changes over time
  ● what data to look at?
● flag: anomalous changes due to attack
  ● when to flag?
● this work: simple anomaly detection; flag when the time series is more than a variance-adjusted threshold above an exponentially weighted moving average
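The flagging rule on this slide can be sketched as follows. This is a minimal interpretation, not the paper's exact detector: the smoothing factor `alpha`, threshold multiplier `k`, and the warm-up period are assumed parameters.

```python
class EwmaDetector:
    """Flag a point when it deviates from the exponentially weighted
    moving average by more than k exponentially weighted std-devs."""

    def __init__(self, alpha=0.3, k=3.0, warmup=5):
        self.alpha = alpha    # smoothing factor (assumption)
        self.k = k            # variance-adjusted threshold multiplier (assumption)
        self.warmup = warmup  # samples to observe before flagging (assumption)
        self.mean = None
        self.var = 0.0
        self.n = 0

    def update(self, x):
        """Return True if x is anomalous, then fold x into the averages."""
        self.n += 1
        if self.mean is None:
            self.mean = x
            return False
        std = self.var ** 0.5
        flagged = self.n > self.warmup and abs(x - self.mean) > self.k * std
        diff = x - self.mean
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
        return flagged
```

fed a stable series, the detector stays quiet; a sudden jump (e.g. a burst of sybil ratings shifting a monitored statistic) is flagged.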
28. how to evaluate our simple technique?
● a) simulation
  ● simulate stream of “average user ratings”
  ● play with mean/variance of time series
  ● measure precision/recall
● b) real data + injected attacks
  ● measure attack impact
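The simulation side of this evaluation can be sketched end-to-end: generate a stream of "average user rating" values, inject a mean shift as the attack, flag deviations from a running average, and score precision/recall. Everything here, the Gaussian model, the shift size, and all parameters, is an illustrative assumption, not the paper's setup.

```python
import random

def simulate(n=500, attack_start=400, mu=3.5, sigma=0.1, shift=1.0,
             alpha=0.1, k=3.0, seed=7):
    """Simulate a mean-rating time series, inject a mean shift (the
    'attack') from attack_start onward, flag points more than k std-devs
    from an exponentially weighted mean, and return (precision, recall)."""
    rng = random.Random(seed)
    mean, var = mu, sigma ** 2          # warm-started running statistics
    tp = fp = fn = 0
    for t in range(n):
        attacked = t >= attack_start
        x = rng.gauss(mu + (shift if attacked else 0.0), sigma)
        flagged = abs(x - mean) > k * var ** 0.5
        if flagged and attacked:
            tp += 1
        elif flagged:
            fp += 1
        elif attacked:
            fn += 1
        diff = x - mean                  # exponentially weighted updates
        mean += alpha * diff
        var = (1 - alpha) * (var + alpha * diff * diff)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

varying `mu`, `sigma`, and `shift` corresponds to "playing with the mean/variance of the time series" on the slide.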
36. c) item-level: slightly different context
1. the item is rated by many users
   define “many”? using how other items were rated
2. the item is rated with extreme ratings
   define “extreme”? what is the average item mean?
3. (from 1 + 2) the item's mean rating shifts
   nuke or promote?
flag: only if all three conditions are broken. why?
1 alone: a popular item. 2 alone: a few extreme ratings. 3 alone: a cold-start item.
1 + 2 without 3: the attack doesn't change anything.
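The conjunction of the three conditions can be sketched as a single predicate; the function name, arguments, and the idea of pre-learned cutoffs are assumptions standing in for the thresholds the paper learns from how other items are rated.

```python
def item_flagged(n_raters, frac_extreme, mean_shift,
                 popular_cutoff, extreme_cutoff, shift_cutoff):
    """Flag an item only if ALL three anomaly conditions hold:
      1. unusually many raters in the window,
      2. an unusually large fraction of extreme ratings,
      3. the item's mean rating shifts (nuke or promote).
    Cutoffs would be learned from the behaviour of other items."""
    return (n_raters > popular_cutoff and
            frac_extreme > extreme_cutoff and
            abs(mean_shift) > shift_cutoff)
```

requiring all three conditions avoids flagging benign cases: popular items (1 alone), a few extreme ratings (2 alone), or cold-start items (3 alone).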
40. contributions
1. force sybils to draw out their attack
2. learn normal temporal behaviour
3. monitor & detect a wide range of attacks
~ and then ~
4. force sybils to attack more intelligently
41. temporal defenses for robust
recommendations
n. lathia, s. hailes, l. capra
PSDML @ ECML/PKDD, Sept 24 2010
n.lathia@cs.ucl.ac.uk
@neal_lathia
http://www.cs.ucl.ac.uk/staff/n.lathia