Behind the Scenes of Web Attacks
Davide Canali, Maurizio Abbà
Software and System Security Group
EURECOM, France

http://s3.eurecom.fr/
Motivations
● Studying the internals of web attacks
  ─ What attackers do while and after they exploit a vulnerability on a website
  ─ Understand why attacks are carried out (fun, profit, damaging others, etc.)
● Previous studies
  ─ how attacks against web sites are carried out
  ─ how criminals find their victims on the Internet
  ─ Lack of studies on the behavior of attackers (what they do during and after a typical attack)
    » Previous works used static, non-functional honeypots (not exploitable)

How
● 2500 vulnerable applications deployed on 500 websites on 100 domains
  ─ 5 common CMSs (blog, forum, e-commerce web app, generic portal, SQL manager), 1 static website and 17 PHP web shells

How - detail
● Each deployed website acts as a proxy
  ─ Redirects traffic to the real web applications installed on VMs in our premises

Honeypot Websites
● Installed apps and their vulnerabilities:
  ─ Blog (Wordpress)
    » RFI
  ─ Forum (SMF)
    » multiple (HTML injection, XSS, …)
  ─ E-commerce application (osCommerce)
    » Remote File Upload
  ─ Generic portal CMS (Joomla)
    » multiple (admin pass reset, LFI, …)
  ─ Database management CMS (phpMyAdmin)
    » code injection
  ─ 17 common PHP web shells + static website (defacements)

Containment
● Avoid external exploitation and privilege escalations
  ─ Only 1 service (apache) exposed to the Internet
    » run as unprivileged user (in a Linux Container)
  ─ Up to date software and security patches
● Avoid using the honeypot as a stepping stone for attacks
  ─ Blocked all outgoing traffic
● Avoid hosting illegal content (mitigated)
  ─ Preventing the modification of directories, html and php files (chmod)
  ─ Regular restore of each VM to its original snapshot
● Avoid promoting illegal goods or services
  ─ Code showing content of user posts and comments commented out for each CMS
    » users and search engines are shown blank messages

Timeline
December 23, 2011
Start of the experiments
April 13, 2012
December 2012
End of the experiments
Most of the stats presented are from this period
We find a sponsor for keeping the infrastructure online
February 24, 2013
June 2013
Back online
New infrastructure:
● Complete redesign
● Clustering, file analysis improvements
● Web interface
Paper published at NDSS 2013:
Davide Canali, Davide Balzarotti: “Behind The Scenes of Online Attacks: an Analysis of Exploitation Behaviors on the Web”

Data collection
● 100 days of operation (2012)
● Centralized data collection for simple and effective management
● Collected data (daily):
  ─ Created/modified/uploaded files
  ─ Web server logs
  ─ Database snapshot
  ─ (Blocked) Outgoing Traffic

Collected data
● ~10 GB of raw HTTP requests
● On average:
  ─ 1-10K uploaded files every day
  ─ 100-200K HTTP requests/day
● First suspicious activities:
  ─ automated: 2h 10' after deployment
  ─ manual: after 4h 30'
(Chart: requests volume)

Requests by country (excluding known crawlers)
● Color intensity is logarithmic!
● IPs from the USA, Russia and Ukraine account for 65% of the total requests

Attack analysis
The four different phases
1. Discovery: how attackers find their targets
  ─ Referer analysis, dorks used to reach our websites, first suspicious activities
    » 69.8% of the attacks start with a scout bot visiting the pages, often disguising its User-Agent
2. Reconnaissance: how pages were visited
  ─ Automated systems and crawling patterns identification, User-Agent analysis
    » In 84% of the cases, the attack is launched by a 2nd automated system, not disguising its User-Agent (exploitation bot)
3. Exploitation: attack against the vulnerable web app
  ─ Exploits detection and analysis, exploitation sessions, uploaded files categorization, and attack time/location normalization
  ─ Analysis of forum activities: registrations, posts and URLs, geolocation, message categories
    » 46% of the successful exploits upload a web shell
4. Post-Exploitation: second stage of the attack, usually carried out manually (optional)
  ─ Session identification, analysis of shell commands
    » 3.5 hours after a successful exploit, the typical attacker reaches the uploaded shell and performs a second attack stage for an average duration of 5' 37”

Attack analysis
phase #1: discovery
● Discovery: Referer shows where visitors are coming from
● Set in 50% of the cases
● Attackers find our honeypots mostly from search engine queries
  ─ Google,
  ─ Yandex
  ─ Bing
  ─ Yahoo
  ─ ...
● Some visits from web mail services (spam or phishing victims) and social networks

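The Referer analysis above can be sketched roughly as follows. This is a minimal illustration, not the tooling used in the study: the `SEARCH_ENGINES` table and the `classify_referer` helper are hypothetical names, and matching on a host substring is a deliberately crude assumption (for instance, it would also label a `mail.google.com` Referer as a search engine).

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical table: host marker -> (engine name, query-string parameter).
SEARCH_ENGINES = {
    "google.": ("Google", "q"),
    "yandex.": ("Yandex", "text"),
    "bing.com": ("Bing", "q"),
    "search.yahoo.": ("Yahoo", "p"),
}

def classify_referer(referer):
    """Return (source, query) for a Referer header value.

    source is an engine name, "other", or None when the header is unset.
    query is the decoded search query (possibly a dork), or None.
    """
    if not referer:
        return None, None
    parsed = urlparse(referer)
    host = (parsed.hostname or "").lower()
    for marker, (engine, param) in SEARCH_ENGINES.items():
        if marker in host:
            values = parse_qs(parsed.query).get(param)
            return engine, values[0] if values else None
    return "other", None
```

Counting the first element of the returned pair over all logged requests would give a per-engine breakdown like the one listed on the slide; the second element exposes the search query that led the visitor to the honeypot.
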
Attack analysis
phase #2: reconnaissance
● Reconnaissance: how were pages visited?
● 84% of the malicious traffic was from automated systems
  ─ No images or style-sheets requested
  ─ Low inter-arrival time
  ─ Multiple subdomains visited within a short time frame
● 6.8% of the requests mimicked the User-Agent string of known search engines

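The three signals listed above (no static resources requested, low inter-arrival time, many subdomains in a short time) could be combined along these lines. This is only a sketch under stated assumptions: the `Request` record, the `looks_automated` helper, the one-second and three-host thresholds, and the "two out of three" rule are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass

@dataclass
class Request:
    timestamp: float  # seconds since an arbitrary epoch
    host: str         # requested (sub)domain
    path: str         # requested URL path

# Assumed list of suffixes that identify images and style sheets.
STATIC_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".css", ".ico")

def looks_automated(requests, max_gap=1.0, min_hosts=3):
    """Heuristically flag the requests of one client as automated."""
    if len(requests) < 2:
        return False
    ordered = sorted(requests, key=lambda r: r.timestamp)
    gaps = [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
    # Signal 1: low average inter-arrival time.
    low_inter_arrival = sum(gaps) / len(gaps) <= max_gap
    # Signal 2: no images or style sheets were requested.
    no_static = not any(r.path.lower().endswith(STATIC_SUFFIXES) for r in ordered)
    # Signal 3: several distinct (sub)domains were visited.
    many_hosts = len({r.host for r in ordered}) >= min_hosts
    # Assumption: two of the three signals are enough to flag the client.
    return sum([low_inter_arrival, no_static, many_hosts]) >= 2
```
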
Attack analysis
phase #3: exploitation
● 444 distinct exploitation sessions
  ─ Session = a set of requests that can be linked to the same origin, arriving within 5' from each other
  ─ 75% of the sessions used at least once 'libwww/perl' as User-Agent string → scout bots and automatic attacks
● Almost one exploitation out of two uploaded a web shell, to continue the attack at a later stage (post-exploitation)
(Chart: uploaded file categories: web shells, phishing files, file downloading scripts, information gathering, other/irrelevant)

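The session definition above (requests from the same origin, each arriving within 5 minutes of the previous one) can be sketched as follows. The `split_sessions` helper is a hypothetical name, and treating the origin as a single opaque key such as an IP address is an assumption; the slides do not say how requests were linked to an origin.

```python
from collections import defaultdict

SESSION_GAP = 5 * 60  # seconds: the 5-minute window from the definition

def split_sessions(events, gap=SESSION_GAP):
    """Group (origin, timestamp) pairs into sessions.

    Returns a list of (origin, [timestamps]) tuples; a new session starts
    whenever two consecutive requests of an origin are more than `gap`
    seconds apart.
    """
    by_origin = defaultdict(list)
    for origin, ts in events:
        by_origin[origin].append(ts)

    sessions = []
    for origin, stamps in by_origin.items():
        stamps.sort()
        current = [stamps[0]]
        for ts in stamps[1:]:
            if ts - current[-1] <= gap:
                current.append(ts)
            else:
                sessions.append((origin, current))
                current = [ts]
        sessions.append((origin, current))
    return sessions
```

For example, three requests from one address at 0 s, 200 s and 600 s form two sessions, because the last request arrives 400 s after the previous one.
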
Attack analysis
phase #3: Forum activity
● Daily averages:
  ─ 604 posts
  ─ 1907 registrations
  ─ 232 online users
● 6687 different IP addresses
  ─ Mostly from US and Eastern Europe
  ─ One third of the IPs acting on the forum registered at least one account, but never posted any message
    → any business related to selling forum accounts?
● ~1% of the links posted to the forum led to malicious content†
† According to Google SafeBrowsing and Wepawet

Attack analysis
phase #3: Forum activity
● Simple message categorization makes it possible to identify spam campaigns
  ─ Trendy topics: drugs, SEO and electronics, health care

Attack analysis
phases #3-4
● Clear hourly trends for post-exploitation (manual) sessions

Attack analysis
phase #4: post-exploitation
● Almost 8500 interactive sessions collected
  ─ Known and unknown web shells
  ─ Average session duration: 5' 37”
    » 9 sessions lasting more than one hour
  ─ Parsed commands from the logs
    » 61% of the sessions upload a file to the system
    » 50% of the sessions (try to) modify existing files
      • Defacement in 13% of the cases

Attacker goals
● The analysis of collected files makes it possible to understand the attackers' goals
    » File normalization and similarity-based clustering
    » Manual labeling of clusters

File analysis
1) cleanup
● Normalization (stripping)
  ─ Depends on file type (HTML != source code != text)
  ─ Remove comments, extra white spaces, email addresses, …
● Dynamic code evaluation
  ─ Evalhook php extension†
  ─ For php files only
  ─ Can deobfuscate most of the files
    » Does not work for IonCube/Zend optimized code (rare)
† by Stefan Esser, http://php-security.org/

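The normalization step above might look roughly like this. It is a sketch under stated assumptions, not the cleanup code used in the study: the `normalize` helper, its three `kind` values and the regular expressions are hypothetical, and only whole-line `//` and `#` comments are stripped so that URLs such as `http://...` inside code are left alone.

```python
import re

# Assumed patterns; a real cleanup step would need per-language handling.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
BLOCK_COMMENT_RE = re.compile(r"/\*.*?\*/", re.DOTALL)
LINE_COMMENT_RE = re.compile(r"^[ \t]*(//|#).*$", re.MULTILINE)
HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

def normalize(text, kind):
    """Strip comments, email addresses and extra whitespace.

    kind is 'source', 'html' or 'text', since the slide notes that the
    cleanup depends on the file type.
    """
    if kind == "source":
        text = BLOCK_COMMENT_RE.sub("", text)
        text = LINE_COMMENT_RE.sub("", text)
    elif kind == "html":
        text = HTML_COMMENT_RE.sub("", text)
    text = EMAIL_RE.sub("", text)
    # Collapse every run of whitespace into a single space.
    return re.sub(r"\s+", " ", text).strip()
```

Two copies of the same script that differ only in a banner comment or an embedded contact address normalize to the same string, which is what makes the later similarity comparison meaningful.
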
File analysis
2) similarity clustering
● Group files that are similar to each other
  ─ Identify code reuse or development (evolution)
  ─ How? Several approaches...
● Plagiarism detection algorithms
  ─ Precise but too slow
    » Not suitable for large datasets
● ssdeep, sdhash
  ─ Piecewise hashing tools (fuzzy hashing)
  ─ From the 'forensic world'
  ─ Fast and suitable for any kind of file

ssdeep and sdhash
● ssdeep
  ─ Minimum file size: 4096 bytes
  ─ Fixed size hashes
● sdhash
  ─ Minimum file size: 4096 bytes
  ─ More precise than ssdeep, but
  ─ Variable length hashes
● Both tools produce a similarity score in [0,100]
● We use both

Clustering example
● Similarity clustering on web shells (ours are labeled)

Clustering new data (2013)
● Can't manually label all data
● Old data can be used as a starting point
● Start with the labeled dataset (2012)
  ─ If file is similar to an already categorized group: add to cluster
  ─ Else:
    » Create new cluster
    » Allow the analyst to manually define cluster type (e.g.: web shell, phishing kit, …)
● Would be nice to provide a tool to help the analyst...

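The incremental procedure above (add a file to an existing cluster if it is similar enough, otherwise open a new cluster for the analyst to label) can be sketched like this. To keep the example self-contained, the score comes from Python's `difflib.SequenceMatcher`, which is a stand-in and not ssdeep or sdhash; the `assign` helper, the threshold of 60 and the `unlabeled-N` naming are likewise assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Stand-in similarity score in [0, 100]; NOT ssdeep or sdhash."""
    return round(100 * SequenceMatcher(None, a, b).ratio())

def assign(sample, clusters, threshold=60, score=similarity):
    """Place `sample` into `clusters` (dict: label -> list of members).

    Returns the label of the cluster the sample ended up in. A new
    'unlabeled-N' cluster is created when no existing cluster reaches
    the (assumed) threshold, leaving the final label to the analyst.
    """
    best_label, best_score = None, -1
    for label, members in clusters.items():
        s = max(score(sample, m) for m in members)
        if s > best_score:
            best_label, best_score = label, s
    if best_label is not None and best_score >= threshold:
        clusters[best_label].append(sample)
        return best_label
    new_label = f"unlabeled-{len(clusters)}"
    clusters[new_label] = [sample]
    return new_label
```

Because the scoring function is a parameter, a wrapper around a fuzzy-hashing tool that also returns a score in [0,100] could be passed as `score` without changing the rest of the logic.
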
DEMO

Selected attack samples
Drive-by download
● 28/2/2012: intu.html uploaded to one of the honeypots
● Loads a remote document launching two exploits
  ─ Seen by Wepawet on the same day:
    http://wepawet.cs.ucsb.edu/view.php?type=js&hash=45f9c9216818812939ab78071e9c9f54&t=1330442417

Selected attack samples
Privilege escalation
● 9/2/2012: Hungarian IP address uploads mempodipper.c
  ─ Known exploit for CVE-2012-0056
  ─ Very recent (published two weeks before the attack)
● Attacker first tried to compile the code
  ─ Through a web shell
  ─ No gcc on our honeypots...
● Then uploaded a pre-compiled ELF binary
  ─ The kernel of our VMs was not vulnerable :)

Selected attack samples
Defacement
● 6/3/2012: German IP modifies a page on the static website using one of the web shells

Selected attack samples
Phishing
● 27/3/2012: 4776 requests hitting our honeypots with Referer set to the webmail servers of sfr.fr
  ─ Only an image was requested (?!)
  ─ No such image on the honeypots, but...
    » A snapshot from 24/3/2012 contained such image:

Selected attack samples
Spamming and message flooding
● 21/2/2012: Nigerian IP uploads a1.php
  ─ Customizable mailer

Conclusions
● The study confirmed some known trends
  ─ Strong presence of Eastern European countries in spamming activities
  ─ Scam and phishing campaigns often run from African countries
  ─ Most common spam topic: pharmaceutical ads
● Unexpected results
  ─ Most of the attacks involve some manual activity
  ─ Many IRC botnets still around
  ─ Despite their low sophistication, these represent a large fraction of the attacks to which vulnerable websites are exposed every day

Thank you

?
Special thanks to Marco Pappalardo and Roberto Jordaney
(master students helping with the log analysis)

Backup slides →
Forum
Defacement
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 

Behind The Scenes Of Web Attacks

  • 1. Behind the Scenes of Web Attacks Davide Canali, Maurizio Abbà Software and System Security Group EURECOM, France http://s3.eurecom.fr/
  • 2. Motivations ● Studying the internals of web attacks ─ What attackers do while and after they exploit a vulnerability on a website ─ Understand why attacks are carried out (fun, profit, damaging others, etc.) ● Previous studies ─ How attacks against web sites are carried out ─ How criminals find their victims on the Internet ─ Lack of studies on the behavior of attackers (what they do during and after a typical attack) » Previous works used static, non-functional honeypots (not exploitable)
  • 3. How ● 2500 vulnerable applications deployed on 500 websites on 100 domains ─ 5 common CMSs (blog, forum, e-commerce web app, generic portal, SQL manager), 1 static website and 17 PHP web shells
  • 4. How - detail ● Each deployed website acts as a proxy ─ Redirects traffic to the real web applications installed on VMs in our premises
  • 5. Honeypot Websites ● Installed apps and their vulnerabilities: ─ Blog (WordPress) » RFI ─ Forum (SMF) » multiple (HTML injection, XSS, …) ─ E-commerce application (osCommerce) » Remote File Upload ─ Generic portal CMS (Joomla) » multiple (admin pass reset, LFI, …) ─ Database management CMS (phpMyAdmin) » code injection ─ 17 common PHP web shells + static website (defacements)
  • 6. Containment ● Avoid external exploitation and privilege escalations ─ Only 1 service (apache) exposed to the Internet » run as unprivileged user (in a Linux Container) ─ Up to date software and security patches ● Avoid using the honeypot as a stepping stone for attacks ─ Blocked all outgoing traffic ● Avoid hosting illegal content (mitigated) ─ Preventing the modification of directories, html and php files (chmod) ─ Regular restore of each VM to its original snapshot ● Avoid promoting illegal goods or services ─ Code showing content of user posts and comments commented out for each CMS » users and search engines are shown blank messages
  • 7. Timeline December 23, 2011 Start of the experiments April 13, 2012 December 2012 End of the experiments Most of the stats presented are from this period We find a sponsor for keeping the infrastructure online February 24, 2013 June 2013 Back online New infrastructure: ● Complete redesign ● Clustering, file analysis improvements ● Web interface Paper published at NDSS 2013: Davide Canali, Davide Balzarotti: “Behind The Scenes of Online Attacks: an Analysis of Exploitation Behaviors on the Web”
  • 8. Data collection ● 100 days of operation (2012) ● Centralized data collection for simple and effective management ● Collected data (daily): ─ Created/modified/uploaded files ─ Web server logs ─ Database snapshot ─ (Blocked) Outgoing Traffic
  • 9. Collected data ● ~10 GB of raw HTTP requests ● On average: ─ 1-10K uploaded files every day ─ 100-200K HTTP requests/day ● First suspicious activities: ─ automated: 2h 10' after deployment ─ manual: after 4h 30' ● [Chart: requests volume]
  • 10. Requests by country (excluding known crawlers) ● Color intensity is logarithmic! ● IPs from the USA, Russia and Ukraine account for 65% of the total requests
  • 14. Attack analysis: the four different phases 1. Discovery: how attackers find their targets ─ Referer analysis, dorks used to reach our websites, first suspicious activities ─ 69.8% of the attacks start with a scout bot visiting the pages, often disguising its User-Agent 2. Reconnaissance: how pages were visited ─ Automated systems and crawling patterns identification, User-Agent analysis ─ In 84% of the cases, the attack is launched by a 2nd automated system, not disguising its User-Agent (exploitation bot) 3. Exploitation: attack against the vulnerable web app ─ Exploits detection and analysis, exploitation sessions, uploaded files categorization, and attack time/location normalization ─ Analysis of forum activities: registrations, posts and URLs, geolocation, message categories ─ 46% of the successful exploits upload a web shell 4. Post-Exploitation: second stage of the attack, usually carried out manually (optional) ─ Session identification, analysis of shell commands ─ 3.5 hours after a successful exploit, the typical attacker reaches the uploaded shell and performs a second attack stage for an average duration of 5' 37"
  • 15. Attack analysis phase #1: discovery ● Discovery: Referer shows where visitors are coming from ● Set in 50% of the cases ● Attackers find our honeypots mostly from search engine queries ─ Google ─ Yandex ─ Bing ─ Yahoo ─ ... ● Some visits from web mail services (spam or phishing victims) and social networks
  • 16. Attack analysis phase #2: reconnaissance ● Reconnaissance: how were pages visited? ● 84% of the malicious traffic was from automated systems ─ No images or style-sheets requested ─ Low inter-arrival time ─ Multiple subdomains visited within a short time frame ● 6.8% of the requests mimicked the User-Agent string of known search engines
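The three signs of automated traffic listed on slide 16 can be turned into a simple per-visitor check. The sketch below is only an illustration, not the authors' detector: the `Request` record, the suffix list, the thresholds (`max_mean_gap`, `min_hosts`) and the choice to require all three signs at once are assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean

# Assumed list of static-asset suffixes; the slides only say "images or style-sheets".
STATIC_SUFFIXES = (".css", ".png", ".jpg", ".jpeg", ".gif", ".ico")


@dataclass
class Request:
    timestamp: float  # seconds since some reference point
    host: str         # (sub)domain the request was sent to
    path: str         # requested path, possibly with a query string


def looks_automated(requests, max_mean_gap=1.0, min_hosts=3):
    """Return True if one visitor's requests show all three signs from slide 16.

    The thresholds are hypothetical; the talk does not state the values used.
    """
    if len(requests) < 2:
        return False
    ordered = sorted(requests, key=lambda r: r.timestamp)
    # Sign 1: no images or style sheets requested.
    fetched_assets = any(
        r.path.lower().split("?")[0].endswith(STATIC_SUFFIXES) for r in ordered
    )
    # Sign 2: low inter-arrival time between consecutive requests.
    gaps = [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
    # Sign 3: multiple subdomains visited (the whole list is treated as one short window).
    hosts = {r.host for r in ordered}
    return (not fetched_assets) and mean(gaps) <= max_mean_gap and len(hosts) >= min_hosts
```

A real classifier would probably weigh these signs instead of requiring all of them, and would also compare the User-Agent against known search-engine strings, as the last bullet of the slide suggests.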
  • 17. Attack analysis phase #3: exploitation ● 444 distinct exploitation sessions ─ Session = a set of requests that can be linked to the same origin, arriving within 5' from each other ─ 75% of the sessions used at least once 'libwww/perl' as User-Agent string → scout bots and automatic attacks ● Almost one exploitation out of two uploaded a web shell, to continue the attack at a later stage (post-exploitation) ● [Chart: uploaded file categories: web shells, phishing files, file downloading scripts, information gathering, other/irrelevant]
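The session definition on slide 17 (requests from the same origin, arriving within 5 minutes of each other) can be sketched as follows. This is a minimal illustration under two assumptions not stated in the slides: the "origin" is reduced to a single key such as the source IP, and a gap of exactly 5 minutes still counts as the same session.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 5 * 60  # the 5' window from the slide


def split_sessions(requests, gap=SESSION_GAP_SECONDS):
    """Group (origin, timestamp) pairs into sessions.

    `origin` is an assumed key (for example the source IP address);
    timestamps are in seconds. Returns a list of (origin, [timestamps]).
    """
    by_origin = defaultdict(list)
    for origin, ts in requests:
        by_origin[origin].append(ts)

    sessions = []
    for origin, stamps in by_origin.items():
        stamps.sort()
        current = [stamps[0]]
        for ts in stamps[1:]:
            if ts - current[-1] <= gap:
                # Close enough to the previous request: same session.
                current.append(ts)
            else:
                # Gap too large: close the session and start a new one.
                sessions.append((origin, current))
                current = [ts]
        sessions.append((origin, current))
    return sessions
```

For example, requests from one address at 0 s, 200 s and 900 s yield two sessions, because the 700 s gap before the last request exceeds the window.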
  • 18. Attack analysis phase #3: Forum activity ● Daily averages: ─ 604 posts ─ 1907 registrations ─ 232 online users ● 6687 different IP addresses ─ Mostly from US and Eastern Europe ─ One third of the IPs acting on the forum registered at least one account, but never posted any message → any business related to selling forum accounts? ● ~1% of the links posted to the forum led to malicious content (according to Google SafeBrowsing and Wepawet)
  • 19. Attack analysis phase #3: Forum activity ● Simple message categorization makes it possible to identify spam campaigns ─ Trendy topics: drugs, SEO and electronics, health care
  • 20. Attack analysis phases #3-4 ● Clear hourly trends for post-exploitation (manual) sessions
  • 21. Attack analysis phase #4: post-exploitation ● Almost 8500 interactive sessions collected ─ Known and unknown web shells ─ Average session duration: 5' 37" » 9 sessions lasting more than one hour ─ Parsed commands from the logs » 61% of the sessions upload a file to the system » 50% of the sessions (try to) modify existing files • Defacement in 13% of the cases
  • 22. Attacker goals ● The analysis of collected files makes it possible to understand the attackers' goals » File normalization and similarity-based clustering » Manual labeling of clusters
  • 23. File analysis 1) cleanup ● Normalization (stripping) ─ Depends on file type (HTML != source code != text) ─ Remove comments, extra white spaces, email addresses, … ● Dynamic code evaluation ─ Evalhook PHP extension (by Stefan Esser, http://php-security.org/) ─ For PHP files only ─ Deobfuscates most of the files » Does not work for IonCube/Zend optimized code (rare)
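The normalization step on slide 23 (type-dependent stripping of comments, extra white space and email addresses) might look roughly like the sketch below. The regular expressions, the `kind` parameter and its three values are assumptions for illustration; the slides do not describe the actual rules. Only whole-line `//` and `#` comments are removed, so that URLs such as `http://...` are not clipped.

```python
import re

# Illustrative patterns only; the real rules used in the study are not given.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)    # /* ... */
LINE_COMMENT = re.compile(r"(?m)^\s*(//|#).*$")        # whole-line // or # comments
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)    # <!-- ... -->
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
WHITESPACE = re.compile(r"\s+")


def normalize(text, kind):
    """Strip noise from a file before similarity comparison.

    `kind` is a hypothetical file-type tag: 'source', 'html' or 'text'.
    """
    if kind == "source":
        text = BLOCK_COMMENT.sub(" ", text)
        text = LINE_COMMENT.sub(" ", text)
    elif kind == "html":
        text = HTML_COMMENT.sub(" ", text)
    # Email addresses are removed for every file type.
    text = EMAIL.sub("", text)
    # Collapse runs of white space so layout differences do not matter.
    return WHITESPACE.sub(" ", text).strip()
```

Removing attacker-specific details such as email addresses means that two copies of the same kit, customized by different people, are more likely to end up with near-identical normalized text.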
  • 24. File analysis 2) similarity clustering ● Group files that are similar to each other ─ Identify code reuse or development (evolution) ─ How? Several approaches... ● Plagiarism detection algorithms ─ Precise but too slow » Not suitable for large datasets ● ssdeep, sdhash ─ Piecewise hashing tools (fuzzy hashing) ─ From the 'forensic world' ─ Fast and suitable for any kind of file
  • 25. ssdeep and sdhash ● ssdeep ─ Minimum file size: 4096 bytes ─ Fixed size hashes ● sdhash ─ More precise than ssdeep, but ─ Minimum file size: 4096 bytes ─ Variable length hashes ● Both tools produce a similarity score in [0,100] ● We use both
  • 26. Clustering example ● Similarity clustering on web shells (ours are labeled)
  • 27. Clustering new data (2013) ● Can't manually label all data ● Old data can be used as a starting point ● Start with the labeled dataset (2012) ─ If file is similar to an already categorized group: add to cluster ─ Else: » Create new cluster » Allow the analyst to manually define cluster type (e.g.: web shell, phishing kit, …) ● Would be nice to provide a tool to help the analyst...
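The incremental procedure on slide 27 (add a file to an existing labeled cluster if it is similar enough, otherwise open a new cluster for the analyst to label) can be sketched as below. Note the substitution: the talk uses ssdeep and sdhash, while this example uses Python's standard `difflib.SequenceMatcher` purely as a stand-in that also yields a score in [0, 100]. The threshold of 70 and the `unlabeled-N` naming are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity score in [0, 100] (not ssdeep or sdhash)."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def assign(file_text, clusters, threshold=70):
    """Place a file into `clusters` (dict: label -> list of member texts).

    Returns the label of the cluster the file ended up in. The threshold
    is an assumed value; the slides do not state the one actually used.
    """
    best_label, best_score = None, -1
    for label, members in clusters.items():
        # Compare against every member and keep the closest match.
        score = max((similarity(file_text, m) for m in members), default=0)
        if score > best_score:
            best_label, best_score = label, score

    if best_label is not None and best_score >= threshold:
        # Similar to an already categorized group: add to that cluster.
        clusters[best_label].append(file_text)
        return best_label

    # Otherwise create a new cluster; the analyst would rename it later.
    new_label = f"unlabeled-{len(clusters)}"
    clusters[new_label] = [file_text]
    return new_label
```

Starting from the 2012 labeled clusters, each 2013 file would be passed through `assign`, so manual work is needed only for the clusters that come back as `unlabeled-N`.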
  • 30. Selected attack samples: Drive-by download ● 28/2/2012: intu.html uploaded to one of the honeypots ● Loads a remote document launching two exploits ─ Seen by Wepawet on the same day: http://wepawet.cs.ucsb.edu/view.php?type=js&hash=45f9c9216818812939ab78071e9c9f54&t=1330442417
  • 31. Selected attack samples: Privilege escalation ● 9/2/2012: Hungarian IP address uploads mempodipper.c ─ Known exploit for CVE-2012-0056 ─ Very recent (published two weeks before the attack) ● Attacker first tried to compile the code ─ Through a web shell ─ No gcc on our honeypots... ● Then uploaded a pre-compiled ELF binary ─ The kernel of our VMs was not vulnerable :)
  • 32. Selected attack samples: Defacement ● 6/3/2012: German IP modifies a page on the static website using one of the web shells
  • 35. Selected attack samples: Phishing ● 27/3/2012: 4776 requests hitting our honeypots with Referer set to the webmail servers of sfr.fr ─ Only an image was requested (?!) » No such image on the honeypots, but... ─ A snapshot from 24/3/2012 contained such an image
  • 36. Selected attack samples: Spamming and message flooding ● 21/2/2012: Nigerian IP uploads a1.php ─ Customizable mailer
  • 37. Conclusions ● The study confirmed some known trends ─ Strong presence of Eastern European countries in spamming activities ─ Scam and phishing campaigns often run from African countries ─ Most common spam topic: pharmaceutical ads ● Unexpected results ─ Most of the attacks involve some manual activity ─ Many IRC botnets still around ─ Despite their low sophistication, these represent a large fraction of the attacks to which vulnerable websites are exposed every day
  • 38. Thank you ● Special thanks to Marco Pappalardo and Roberto Jordaney (master students helping with the log analysis)