



Diverse Firewall Design
Alex X. Liu, Member, IEEE, Mohamed G. Gouda, Member, IEEE

The preliminary version of this paper was published in the proceedings of the IEEE International Conference on Dependable Systems and Networks (DSN-04), pages 595-604, Florence, Italy, June 2004. It won the William C. Carter Award.
Alex X. Liu is with the Department of Computer Science and Engineering at Michigan State University, East Lansing, Michigan, U.S.A. Email: alexliu@cse.msu.edu
Mohamed G. Gouda is with the Department of Computer Sciences at The University of Texas at Austin, Austin, Texas, U.S.A. Email: gouda@cs.utexas.edu

Abstract: Firewalls are the mainstay of enterprise security and the most widely adopted technology for protecting private networks. An error in a firewall policy either creates security holes that will allow malicious traffic to sneak into a private network or blocks legitimate traffic and disrupts normal business processes, which in turn could lead to irreparable, if not tragic, consequences. It has been observed that most firewall policies on the Internet are poorly designed and have many errors. Therefore, how to design firewall policies correctly is an important issue.

In this paper, we propose the method of diverse firewall design, which consists of three phases: a design phase, a comparison phase, and a resolution phase. In the design phase, the same requirement specification of a firewall policy is given to multiple teams who proceed independently to design different versions of the firewall policy. In the comparison phase, the resulting multiple versions are compared with each other to detect all functional discrepancies between them. In the resolution phase, all discrepancies are resolved and a firewall that is agreed upon by all teams is generated. The major technical challenge in the method of diverse firewall design is how to discover all functional discrepancies between two given firewall policies. We present a series of three efficient algorithms for solving this problem: a construction algorithm, a shaping algorithm, and a comparison algorithm.

The algorithms for discovering all functional discrepancies between two given firewall policies can be used to perform firewall policy change-impact analysis as well. Firewall policies often need to be changed as networks evolve and new threats emerge. Many firewall policy errors are caused by the unintended side-effects of policy changes. Our algorithms can be used directly to compute the impact of firewall policy changes by computing the functional discrepancies between the policy before changes and the policy after changes.

Index Terms: Firewall policy, policy design, design diversity, change-impact analysis, network security.

I. INTRODUCTION

Firewalls are crucial elements in network security, and they have been widely deployed to secure private networks in businesses and institutions. A firewall is a security guard placed at the point of entry between a private network and the outside Internet such that all incoming and outgoing packets have to pass through it. A packet can be viewed as a tuple with a finite number of fields such as source IP address, destination IP address, source port number, destination port number, and protocol type. By examining the values of these fields for each incoming and outgoing packet, a firewall accepts legitimate packets and discards illegitimate ones according to its “policy”, i.e., “configuration”.

A firewall policy consists of a sequence (i.e., an ordered list) of rules where each rule is of the form predicate → decision. The predicate of a rule is a boolean expression over some packet fields such as source IP address, destination IP address, source port number, destination port number, and protocol type. The decision of a rule can be accept, or discard, or a combination of these decisions with other options such as a logging option. The rules in a firewall policy often conflict. To resolve such conflicts, the decision for each packet is the decision of the first (i.e., highest priority) rule that the packet matches.

A. Motivation

Although a firewall policy is a mere sequence of rules, correctly designing one is by no means easy. The rules in a firewall policy are logically entangled because of conflicts among rules and the resulting order sensitivity [26]. Ordering the rules correctly in a firewall is critical, yet difficult. The implication of any rule in a firewall cannot be understood correctly without examining all the rules listed above that rule. Furthermore, a firewall policy may consist of a large number of rules. A firewall on the Internet may consist of hundreds or even a few thousand rules in extreme cases. One can imagine the complexity of the logic underlying so many conflicting rules.

An error in a firewall policy, i.e., a wrong definition of being legitimate or illegitimate for some packets, means that the firewall either accepts some malicious packets, which consequently creates security holes in the firewall, or discards some legitimate packets, which consequently disrupts normal business. Either case could cause irreparable, if not tragic, consequences. Given the importance of firewalls, such errors are not acceptable. Unfortunately, it has been observed that most firewalls on the Internet are poorly designed and have many errors in their policies [26]. Therefore, how to design firewall policies correctly is an important issue.

Since the correctness of a firewall policy is the focus of this paper, we assume a firewall is correct if and only if its policy is correct, and a firewall policy is correct if and only if it satisfies its given requirement specification, which is usually written in a natural language. In the rest of this paper, we use the term “firewall” to mean “firewall policy” or “firewall configuration” unless otherwise specified.

We categorize firewall errors into specification induced errors and design induced errors. Specification induced errors are caused by the inherent ambiguities of informal requirement specifications, especially if the requirement specification is written in a natural language. Design induced errors are caused by the technical incapacity of individual firewall designers. Different designers may have different understandings of the same informal requirement specification, and different designers may exhibit different technical strengths and weaknesses. Note that in this



paper we assume that the given requirement specification of a firewall is informal. Automatically converting a formal firewall specification to a deployable firewall policy has been addressed in [12]. However, the formal specification of a firewall policy is still difficult to specify correctly. The above observations motivate our method of diverse firewall design.

B. Our Solution

Our diverse firewall design method has the following three phases: a design phase, a comparison phase, and a resolution phase.
   1) Design Phase: In this phase, the same requirement specification of a firewall is given to multiple teams who proceed independently to design different versions of the firewall. In industry, firewalls are typically designed and maintained by a group of people rather than just one person. To apply the method of diverse firewall design, we can divide one group into several teams.
   2) Comparison Phase: In this phase, the resulting multiple versions are compared with each other to determine all functional discrepancies among them. The functional discrepancies need to be presented in human readable format in order to be used in the next step.
   3) Resolution Phase: In this phase, first, every discrepancy is discussed and resolved by all teams; second, a firewall that is unanimously agreed upon by all teams is generated.

The major technical challenge in the method of diverse firewall design is how to discover all functional discrepancies between two given firewalls in a human readable format. Our solution consists of a series of three efficient algorithms: a construction algorithm, a shaping algorithm, and a comparison algorithm.

After all functional discrepancies are computed, the teams need to discuss the correct decision for each discrepancy. After all discrepancies are resolved, the technical question that we need to answer is: how do we generate the final firewall that reflects the resolved functional discrepancies? We present two methods for this purpose in Section VI.

C. Other Applications: Firewall Change-Impact Analysis

The algorithms presented in this paper can be used in other applications as well, such as firewall change-impact analysis. Firewall policies are always subject to change due to a variety of reasons. Making policy changes is a major routine task for firewall administrators. For example, new network threats such as worms and viruses may emerge. To protect a private network from new attacks, firewall policies need to be changed accordingly. Modern organizations also continually transform their network infrastructure to maintain their competitive edge by adding new servers, installing new software and services, expanding connectivity, etc. In accordance with network changes, firewall policies need to be changed as well to provide necessary protection.

Unfortunately, making changes is a major source of firewall policy errors. Making correct firewall policy changes is remarkably difficult due to the interleaving nature of firewall rules. For example, when a firewall administrator inserts a new rule into a firewall policy, the meaning of the rules listed under this rule could be incorrectly changed without the administrator noticing. Furthermore, firewall policy changes are made by human administrators, and it is common that human administrators make mistakes. It has been shown that administrator errors are the largest cause of failure for Internet services, and policy errors are the largest category of administrator errors [21].

The algorithms for discovering all functional discrepancies between two given firewalls can be directly used to perform firewall change-impact analysis. The impact of the changes can literally be defined as the functional discrepancies between the firewall before changes and the firewall after changes.

D. Relationship to Prior Art

Some firewall design and analysis methods have been proposed previously [1], [5], [11], [12], [15], [19], [20], [29]. However, none of them has ever explored design diversity. Furthermore, none of them has ever tackled the problem of change-impact analysis for firewall policies. The proposed diverse firewall design method is complementary to the previous work because these methods can assist each individual team to design and analyze their firewall in the design phase before cross comparison.

Note that the scope of this paper is on firewalls, not IDS/IPS systems (Intrusion Detection/Prevention Systems). Although the distinction between IDS/IPS systems and firewalls is blurry sometimes in the commercial world, IDS/IPS systems fundamentally differ from firewalls in that IDS/IPS systems check packet payloads while firewalls do not.

E. Key Contributions

We make four key contributions in this paper.
   1) We propose the method of diverse firewall design. This paper represents the first effort to apply the well-known principle of diverse design to firewalls.
   2) We present a method that can compare two given firewalls and output all functional discrepancies between them in a human readable format. This is the first method created for this purpose.
   3) We present a method to compute firewall change-impacts by computing all functional discrepancies between the firewalls before and after changes. This is the first method for doing firewall change-impact analysis.
   4) We implemented our algorithms in Java, and we evaluated their performance on both real-life and synthetic firewalls of large sizes. The experimental results show that our algorithms only use a few seconds to compare two different firewalls where each firewall has up to 3000 rules.

The rest of this paper is organized as follows. We start with an overview of our diverse firewall design method in Section II. In Sections III, IV, and V, we present a series of three algorithms for discovering all functional discrepancies between two firewalls. In Section VI, we discuss how to generate a firewall that is agreed upon by all teams after all discrepancies are resolved. We discuss some further issues in Section VII. In Section VIII, we present the experimental results that show the effectiveness and efficiency of our diverse firewall design method. Our conclusions are given in Section X.

II. OVERVIEW

In this section, we present an overview of our diverse firewall design method using an illustrative example, which will be used throughout the paper.


Rule #   Interface   Source IP        Destination IP   Destination Port   Protocol   Decision
r1       0           *                192.168.0.1      25                 TCP        accept
r2       0           224.168.0.0/16   *                *                  *          discard
r3       *           *                *                *                  *          accept

TABLE I
FIREWALL DESIGNED BY TEAM A

Rule #   Interface   Source IP        Destination IP   Destination Port   Protocol   Decision
r1′      0           224.168.0.0/16   *                *                  *          discard
r2′      0           *                192.168.0.1      25                 TCP        accept
r3′      0           *                192.168.0.1      *                  *          discard
r4′      *           *                *                *                  *          accept

TABLE II
FIREWALL DESIGNED BY TEAM B

Discrepancy #   Interface   Source IP         Destination IP   Destination Port   Protocol   Team A Decision   Team B Decision
1               0           224.168.0.0/16    192.168.0.1      25                 TCP        accept            discard
2               0           !224.168.0.0/16   192.168.0.1      25                 !TCP       accept            discard
3               0           !224.168.0.0/16   192.168.0.1      !25                *          accept            discard

TABLE III
FUNCTIONAL DISCREPANCIES BETWEEN THE TWO FIREWALLS DESIGNED BY TEAM A AND B
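
The rows of Table III can be checked by hand with a small first-match evaluator. The Python sketch below is only an illustration and is not the comparison method of this paper, which works on firewall decision diagrams. It encodes Tables I and II, assumes the encoding TCP = 0 and UDP = 1, and probes one representative packet for each row of Table III plus one packet on which both teams agree. The names `TEAM_A`, `TEAM_B`, `matches`, `decide`, and `PROBES`, and the probe addresses 10.0.0.1 and 224.168.1.1, are hypothetical choices made for this example.

```python
import ipaddress

ANY = None          # wildcard "*"
TCP, UDP = 0, 1     # protocol encoding assumed for this sketch


def ip(text):
    """Integer formed by the four bytes of a dotted IPv4 address."""
    return int(ipaddress.ip_address(text))


MAIL = ip("192.168.0.1")
MALICIOUS = ipaddress.ip_network("224.168.0.0/16")

# Rule layout: (interface, source network, destination IP,
#               destination port, protocol, decision)
TEAM_A = [
    (0, ANY, MAIL, 25, TCP, "accept"),            # r1
    (0, MALICIOUS, ANY, ANY, ANY, "discard"),     # r2
    (ANY, ANY, ANY, ANY, ANY, "accept"),          # r3
]
TEAM_B = [
    (0, MALICIOUS, ANY, ANY, ANY, "discard"),     # r1'
    (0, ANY, MAIL, 25, TCP, "accept"),            # r2'
    (0, ANY, MAIL, ANY, ANY, "discard"),          # r3'
    (ANY, ANY, ANY, ANY, ANY, "accept"),          # r4'
]


def matches(rule, packet):
    """True if every non-wildcard field of the rule agrees with the packet."""
    r_iface, r_src, r_dst, r_port, r_proto, _ = rule
    iface, src, dst, port, proto = packet
    return ((r_iface is ANY or r_iface == iface)
            and (r_src is ANY or ipaddress.ip_address(src) in r_src)
            and (r_dst is ANY or r_dst == dst)
            and (r_port is ANY or r_port == port)
            and (r_proto is ANY or r_proto == proto))


def decide(firewall, packet):
    """First-match resolution: the decision of the first matching rule."""
    for rule in firewall:
        if matches(rule, packet):
            return rule[-1]
    raise ValueError("firewall is not comprehensive")


OUTSIDE = ip("10.0.0.1")   # hypothetical source outside 224.168.0.0/16
PROBES = {
    "row 1": (0, ip("224.168.1.1"), MAIL, 25, TCP),
    "row 2": (0, OUTSIDE, MAIL, 25, UDP),
    "row 3": (0, OUTSIDE, MAIL, 80, TCP),
    "agree": (0, OUTSIDE, MAIL, 25, TCP),
}

for name, packet in PROBES.items():
    print(name, decide(TEAM_A, packet), decide(TEAM_B, packet))
```

Probing single packets confirms individual rows but cannot show that the table is complete; finding all discrepancies is what the three algorithms of this paper are for.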




In our example, for simplicity, we assume that a firewall maps every packet to one of two decisions: accept or discard. Most firewall software supports more than two decisions such as accept, accept-and-log, discard, and discard-and-log. Our diverse firewall design method can support any number of decisions.

A. Design Multiple Firewalls

Consider the simple network in Figure 1. This network has a gateway router with two interfaces: interface 0, which connects the gateway router to the outside Internet, and interface 1, which connects the gateway router to the inside local network. The firewall for this local network resides in the gateway router.

[Figure: the Internet connects to interface 0 of the gateway router (firewall); interface 1 connects to the local network with the mail server (IP: 192.168.0.1), Host 1, and Host 2.]
Fig. 1. A firewall

Suppose the requirement specification for this firewall is as follows: The mail server with IP address 192.168.0.1 can receive email packets. The packets from an outside malicious domain 224.168.0.0/16 should be blocked. Other packets should be accepted and allowed to proceed.

Suppose we give this specification to two teams: Team A and Team B, which design the firewalls as shown in Table I and II respectively.

B. Compare Multiple Firewalls

Next we briefly show our method for computing the functional discrepancies between two given firewalls. For example, given the two firewalls in Table I and II, our method produces all the functional discrepancies as shown in Table III.

The core data structure used in this paper for comparing multiple firewalls is Firewall Decision Diagrams (FDD). Firewall decision diagrams were introduced in [10] by Gouda and Liu as a notation for specifying firewalls. A Firewall Decision Diagram with a decision set $DS$ and over fields $F_1, \cdots, F_d$ is an acyclic and directed graph that has the following five properties:
   1) There is exactly one node that has no incoming edges. This node is called the root. The nodes that have no outgoing edges are called terminal nodes.
   2) Each node $v$ has a label, denoted $F(v)$, such that

      $$F(v) \in \begin{cases} \{F_1, \cdots, F_d\} & \text{if } v \text{ is a nonterminal node,} \\ DS & \text{if } v \text{ is a terminal node.} \end{cases}$$

   3) Each edge $e: u \rightarrow v$ is labeled with a nonempty set of integers, denoted $I(e)$, where $I(e)$ is a subset of the domain of $u$'s label (i.e., $I(e) \subseteq D(F(u))$).
   4) A directed path from the root to a terminal node is called a decision path. No two nodes on a decision path have the same label.
   5) The set of all outgoing edges of a node $v$, denoted $E(v)$, satisfies the following two conditions:
      a) Consistency: $I(e) \cap I(e') = \emptyset$ for any two distinct edges $e$ and $e'$ in $E(v)$.
      b) Completeness: $\bigcup_{e \in E(v)} I(e) = D(F(v))$. □

A decision path in an FDD $f$ is represented by $(v_1 e_1 \cdots v_k e_k v_{k+1})$ where $v_1$ is the root, $v_{k+1}$ is a terminal node, and each $e_i$ is a directed edge from node $v_i$ to node $v_{i+1}$.
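
The five FDD properties are easy to misread in prose, so a small executable check may help. The following Python sketch is an illustration under stated assumptions, not the data structure used in the authors' Java implementation: nodes are plain dictionaries, the FDD is a tree, and the two fields have toy domains (interface in [0, 1], destination port in [0, 3]). The helper names `terminal`, `nonterminal`, `is_valid`, `decision_paths`, and `evaluate` are hypothetical.

```python
# Toy domains D(F) for two fields (an assumption for this sketch):
# I = interface in [0, 1], N = destination port in [0, 3].
DOMAINS = {"I": set(range(2)), "N": set(range(4))}
DECISIONS = {"a", "d"}  # decision set DS: accept, discard


def terminal(decision):
    """Terminal node: labelled with a decision, no outgoing edges."""
    return {"label": decision, "edges": []}


def nonterminal(field, edges):
    """Nonterminal node: labelled with a field; edges is a list of (I(e), child)."""
    return {"label": field, "edges": edges}


def is_valid(v, seen=frozenset()):
    """Check properties 2 to 5 on the tree rooted at v."""
    if not v["edges"]:
        return v["label"] in DECISIONS               # property 2, terminal case
    field = v["label"]
    if field not in DOMAINS or field in seen:        # properties 2 and 4
        return False
    sets = [s for s, _ in v["edges"]]
    if any(not s for s in sets):                     # property 3: nonempty labels
        return False
    union = set().union(*sets)
    consistent = sum(len(s) for s in sets) == len(union)   # pairwise disjoint
    complete = union == DOMAINS[field]               # also implies I(e) within D(F(u))
    return consistent and complete and all(
        is_valid(child, seen | {field}) for _, child in v["edges"])


def decision_paths(v, prefix=()):
    """Yield (sequence of (field, I(e)) pairs, decision) for each root-to-terminal path."""
    if not v["edges"]:
        yield prefix, v["label"]
        return
    for s, child in v["edges"]:
        yield from decision_paths(child, prefix + ((v["label"], s),))


def evaluate(v, packet):
    """Follow the unique decision path that the packet selects."""
    while v["edges"]:
        v = next(child for s, child in v["edges"] if packet[v["label"]] in s)
    return v["label"]


GOOD = nonterminal("I", [
    ({0}, nonterminal("N", [({1}, terminal("a")), ({0, 2, 3}, terminal("d"))])),
    ({1}, terminal("a")),
])
OVERLAPPING = nonterminal("I", [({0, 1}, terminal("a")), ({1}, terminal("d"))])
INCOMPLETE = nonterminal("N", [({0, 1}, terminal("a"))])

print(is_valid(GOOD), is_valid(OVERLAPPING), is_valid(INCOMPLETE))
print(evaluate(GOOD, {"I": 0, "N": 2}))
```

Consistency guarantees that at most one outgoing edge applies to a packet at each node, and completeness guarantees that at least one does, which is why `evaluate` can simply take the first edge it finds.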



A decision path $(v_1 e_1 \cdots v_k e_k v_{k+1})$ in an FDD defines the following rule:

$$F_1 \in S_1 \wedge \cdots \wedge F_d \in S_d \rightarrow F(v_{k+1})$$

where

$$S_i = \begin{cases} I(e_j) & \text{if there is a node } v_j \text{ in the decision path that is labelled with field } F_i, \\ D(F_i) & \text{if no node in the decision path is labelled with field } F_i. \end{cases}$$

For an FDD $f$, we use $f.rules$ to denote the set of all rules that are defined by all the decision paths of $f$. For any packet $p$, there is one and only one rule in $f.rules$ that $p$ matches because of the consistency and completeness properties of an FDD.

Our method for computing the functional discrepancies between two given firewalls consists of the following three steps: conversion, shaping, and comparison.

a) Step 1: Conversion: In this step, we convert each firewall to an equivalent FDD. Figures 2 and 3 show the two FDDs that are converted from the two firewalls in Table I and II respectively. Note that the example FDDs used in this paper are presented as trees for the ease of understanding. The algorithm for constructing an equivalent firewall decision diagram from a sequence of rules is presented in Section III.

In this example, we suppose that each packet has the following five fields: Interface, Source IP address, Destination IP address, Destination Port, and Protocol Type. For ease of presentation, we assume that each packet has a field called “interface” whose value is the identification of the network interface on which a packet arrives. The shorthand for the five packet fields is listed in the following table. For simplicity, we assume that the protocol type value in a packet is either 0 (TCP) or 1 (UDP).

           shorthand   meaning                   domain
           I           Interface                 $[0, 1]$
           S           Source IP address         $[0, 2^{32})$
           D           Destination IP address    $[0, 2^{32})$
           N           Destination Port          $[0, 2^{16})$
           P           Protocol Type             $[0, 1]$

In our examples, we also use the following shorthand. Note that α denotes the integer formed by the four bytes of the IP address 224.168.0.0. This applies similarly for β and γ.

           shorthand   meaning
           a           accept
           d           discard
           α           224.168.0.0
           β           224.168.255.255
           γ           192.168.0.1

b) Step 2: Shaping: In this step, we transform each firewall decision diagram into another firewall decision diagram without changing its semantics such that the two resulting firewall decision diagrams are semi-isomorphic. Two firewall decision diagrams are semi-isomorphic if and only if they are exactly the same except for the labels of their terminal nodes. Figures 4 and 5 show the two semi-isomorphic firewall decision diagrams converted from the firewall decision diagrams in Figures 2 and 3 respectively. The algorithm for making two firewall decision diagrams semi-isomorphic without changing their semantics is presented in Section IV.

c) Step 3: Comparison: In this step, we compare the two semi-isomorphic firewall decision diagrams in Figures 4 and 5 for functional discrepancies. Table III shows all the functional discrepancies between the two semi-isomorphic firewall decision diagrams in Figures 4 and 5, which are also the functional discrepancies between the two firewalls in Table I and II. The algorithm for discovering all functional discrepancies between two semi-isomorphic firewall decision diagrams is presented in Section V.

III. CONSTRUCTION ALGORITHM

In this section, we discuss how to construct an equivalent firewall decision diagram from a sequence of rules.

A. Firewalls

We first formally define the concepts of fields, packets, and firewalls. A field $F_i$ is a variable whose domain, denoted $D(F_i)$, is a finite interval of nonnegative integers. For example, the domain of the source address in an IP packet is $[0, 2^{32} - 1]$. A packet over the $d$ fields $F_1, \cdots, F_d$ is a $d$-tuple $(p_1, \cdots, p_d)$ where each $p_i$ ($1 \le i \le d$) is an element of $D(F_i)$. We use $\Sigma$ to denote the set of all packets over fields $F_1, \cdots, F_d$. It follows that $\Sigma$ is a finite set and $|\Sigma| = |D(F_1)| \times \cdots \times |D(F_d)|$, where $|\Sigma|$ denotes the number of elements in set $\Sigma$ and $|D(F_i)|$ denotes the number of elements in set $D(F_i)$ for each $i$.

A firewall rule has the form predicate → decision. A predicate defines a set of packets over the fields $F_1$ through $F_d$ specified as $F_1 \in S_1 \wedge \cdots \wedge F_d \in S_d$ where each $S_i$ is a nonempty interval that is a subset of $D(F_i)$. If $S_i = D(F_i)$, we can replace $(F_i \in S_i)$ by $(F_i \in all)$, or remove the conjunct $(F_i \in D(F_i))$ altogether. A packet $(p_1, \cdots, p_d)$ matches a predicate $F_1 \in S_1 \wedge \cdots \wedge F_d \in S_d$ and the corresponding rule if and only if the condition $p_1 \in S_1 \wedge \cdots \wedge p_d \in S_d$ holds. We use α to denote the set of possible values that decision can be. Typical elements of α include accept, discard, accept with logging, and discard with logging. A firewall rule $F_1 \in S_1 \wedge \cdots \wedge F_d \in S_d \rightarrow$ decision is simple if and only if every $S_i$ ($1 \le i \le d$) is an interval of consecutive nonnegative integers.

A firewall $f$ over the $d$ fields $F_1, \cdots, F_d$ is a sequence of firewall rules. The size of $f$, denoted $|f|$, is the number of rules in $f$. A sequence of rules $r_1, \cdots, r_n$ is comprehensive if and only if for any packet $p$, there is at least one rule in the sequence that $p$ matches. A sequence of rules needs to be comprehensive for it to serve as a firewall. To ensure that a firewall is comprehensive, the predicate of the last rule in a firewall is specified as $F_1 \in D(F_1) \wedge \cdots \wedge F_d \in D(F_d)$.

Two rules in a firewall may overlap; that is, a single packet may match both rules. Furthermore, two rules in a firewall may conflict; that is, the two rules not only overlap but also have different decisions. To resolve such conflicts, firewalls typically employ a first-match resolution strategy where the decision for a packet $p$ is the decision of the first (i.e., highest priority) rule that $p$ matches in $f$. The decision that firewall $f$ makes for packet $p$ is denoted $f(p)$.

We can think of a firewall $f$ as defining a many-to-one mapping function from $\Sigma$ to α. Two firewalls $f_1$ and $f_2$ are equivalent, denoted $f_1 \equiv f_2$, if and only if they define the same mapping function from $\Sigma$ to α; that is, for any packet $p \in \Sigma$, we have $f_1(p) = f_2(p)$. For any firewall $f$, we use $\{f\}$ to denote the set of firewalls that are semantically equivalent to $f$.
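
The definitions of matching, comprehensiveness, first-match resolution, and equivalence can be made concrete with a brute-force sketch. The Python code below is an illustration only: it assumes two toy fields with domains [0, 1] and [0, 7], so that Σ has 2 × 8 = 16 packets, and the rule lists and function names are hypothetical. Enumerating Σ is feasible only at this scale; for the five fields of the running example, |Σ| = 2 × 2^32 × 2^32 × 2^16 × 2 = 2^82.

```python
from itertools import product

# Inclusive intervals D(F_1), D(F_2); toy sizes chosen for this sketch.
DOMAINS = [(0, 1), (0, 7)]


def packets():
    """Enumerate Sigma, the set of all packets over the toy fields."""
    return product(*(range(lo, hi + 1) for lo, hi in DOMAINS))


def matches(predicate, packet):
    """p matches F_1 in S_1 and ... and F_d in S_d, with each S_i an interval."""
    return all(lo <= p <= hi for (lo, hi), p in zip(predicate, packet))


def is_comprehensive(rules):
    """Every packet in Sigma matches at least one rule."""
    return all(any(matches(pred, p) for pred, _ in rules) for p in packets())


def decide(rules, packet):
    """First-match resolution: f(p) is the decision of the first matching rule."""
    for pred, decision in rules:
        if matches(pred, packet):
            return decision
    raise ValueError("sequence of rules is not comprehensive")


def discrepancies(f1, f2):
    """All packets p with f1(p) != f2(p), together with both decisions."""
    return [(p, decide(f1, p), decide(f2, p))
            for p in packets() if decide(f1, p) != decide(f2, p)]


def equivalent(f1, f2):
    return not discrepancies(f1, f2)


ALL = ((0, 1), (0, 7))   # F_1 in D(F_1) and F_2 in D(F_2)
f1 = [(((0, 0), (3, 5)), "discard"), (ALL, "accept")]
f2 = [(ALL, "accept"), (((0, 0), (3, 5)), "discard")]          # same rules, swapped
f3 = [(((0, 0), (0, 2)), "accept"), (((0, 0), (3, 5)), "discard"), (ALL, "accept")]

print(is_comprehensive(f1), is_comprehensive(f1[:1]))
print(equivalent(f1, f3), discrepancies(f1, f2))
```

Here `f1` and `f2` contain the same two conflicting rules in different orders and are not equivalent, which illustrates the order sensitivity discussed in Section I, whereas `f3` has a different rule list but defines the same mapping as `f1`.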




[Figures 2 and 3 omitted: two firewall decision diagrams whose nonterminal nodes are labelled I, S, D, N and P, whose edges are labelled with value sets such as 0, 1, [0, α − 1], [α, β], [β + 1, 2^32), γ, [0, γ − 1], [γ + 1, 2^32), 25, [0, 24], [26, 2^16) and all, and whose terminal nodes are labelled a or d.]

Fig. 2. The firewall decision diagram constructed from the firewall designed by Team A in Table I

Fig. 3. The firewall decision diagram constructed from the firewall designed by Team B in Table II



[Figure 4 omitted: a tree-shaped firewall decision diagram with root I (edges 0 and 1), S nodes (edges [0, α − 1], [α, β], [β + 1, 2^32) and all), D nodes (edges γ, [0, γ − 1], [γ + 1, 2^32) and all), N nodes (edges 25, [0, 24], [26, 2^16) and all), P nodes (edges 0, 1 and all), and terminal nodes labelled a or d.]

Fig. 4. The firewall decision diagram transformed from the one in Figure 2



[Figure 5 omitted: a firewall decision diagram with the same nodes and edge labels as the one in Figure 4; only the labels (a or d) of the terminal nodes differ.]

Fig. 5. The firewall decision diagram transformed from the one in Figure 3
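
   Figures 4 and 5 carry identical edge labels at every node. For a single pair of nodes, reaching that state amounts to refining two interval partitions of the same domain into a common one; for example, the labels [1, 50], [51, 100] and [1, 30], [31, 100] in Figure 8 become [1, 30], [31, 50], [51, 100] in Figure 9. The following is a minimal sketch of that label computation only. The function name refine and the representation of labels as inclusive (low, high) pairs are assumptions, and the subgraph copying that accompanies each edge split in the shaping algorithm of Section IV is omitted.

```python
def refine(labels_a, labels_b):
    """Common refinement of two partitions of the same integer range.

    Each partition is a sorted list of inclusive (low, high) intervals
    that together cover the field domain without gaps.
    """
    if (labels_a[0][0], labels_a[-1][1]) != (labels_b[0][0], labels_b[-1][1]):
        raise ValueError("both nodes must cover the same domain")
    result = []
    i = j = 0
    start = labels_a[0][0]  # both current intervals always begin here
    while i < len(labels_a) and j < len(labels_b):
        end_a, end_b = labels_a[i][1], labels_b[j][1]
        end = min(end_a, end_b)   # the shorter interval decides the split point
        result.append((start, end))
        if end_a == end:
            i += 1                # this edge of the first node is used up
        if end_b == end:
            j += 1                # this edge of the second node is used up
        start = end + 1
    return result

# Edge labels of the two nodes labelled F1 in Figure 8.
shared = refine([(1, 50), (51, 100)], [(1, 30), (31, 100)])
print(shared)  # [(1, 30), (31, 50), (51, 100)], the labels shown in Figure 9
```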



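[Figure 6 omitted. Before appending: a single decision path I (edge 0), S (edge all), D (edge γ), N (edge 25), P (edge 0), ending in a terminal node labelled a. After appending: the edge leaving S is split into one edge labelled [0, α − 1] ∪ [β + 1, 2^32), which leads to a copy of the original path, and one edge labelled [α, β]. Below the [α, β] edge, node D has edges γ and [0, γ − 1] ∪ [γ + 1, 2^32); below γ, node N has edges 25 and [0, 24] ∪ [26, 2^16); below 25, node P has edges 0 (to a) and 1 (to d). All remaining paths use edges labelled all and end in d.]

Fig. 6. Appending rule (I ∈ {0}) ∧ (S ∈ [α, β]) ∧ (D ∈ all) ∧ (N ∈ all) ∧ (P ∈ all) → d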
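
   The append step illustrated in Figure 6 is specified by the construction algorithm of Section III-B (Figure 7). The following minimal sketch is written under simplifying assumptions that are not in the paper: edge labels are finite Python sets over small toy domains rather than intervals over [0, 2^32), and the names Node, build_path, append, construct, lookup and first_match are hypothetical. It builds an FDD rule by rule and checks by brute force that the result agrees with first-match evaluation of the rule sequence.

```python
import copy
from itertools import product

class Node:
    """FDD node: nonterminal nodes test a field, terminal nodes hold a decision."""
    def __init__(self, field=None, decision=None):
        self.field = field        # index of the tested field (None if terminal)
        self.decision = decision  # decision label (None if nonterminal)
        self.edges = []           # list of [label, child]; label is a frozenset

def build_path(predicate, decision, m):
    """Build the decision path for fields m, m+1, ... of a single rule."""
    if m == len(predicate):
        return Node(decision=decision)
    node = Node(field=m)
    node.edges.append([frozenset(predicate[m]), build_path(predicate, decision, m + 1)])
    return node

def append(v, predicate, decision):
    """Append one rule to the partial FDD rooted at v."""
    m, s = v.field, frozenset(predicate[v.field])
    existing = list(v.edges)  # the edges e1..ek present before this call
    uncovered = s - frozenset().union(*(label for label, _ in existing))
    if uncovered:             # values that no earlier rule covers at this node
        v.edges.append([uncovered, build_path(predicate, decision, m + 1)])
    if m == len(predicate) - 1:
        return                # children are terminal: earlier rules keep priority
    for edge in existing:
        label, child = edge
        common = label & s
        if not common:
            continue          # case 1: packets on this edge cannot match the rule
        if common != label:   # case 3: split the edge and copy its subgraph
            edge[0] = label - s
            child = copy.deepcopy(child)
            v.edges.append([common, child])
        append(child, predicate, decision)  # case 2, applied to the matching part

def construct(rules):
    root = build_path(rules[0][0], rules[0][1], 0)
    for predicate, decision in rules[1:]:
        append(root, predicate, decision)
    return root

def lookup(v, packet):
    """Follow the unique matching path; None if the (partial) FDD has none."""
    while v.decision is None:
        for label, child in v.edges:
            if packet[v.field] in label:
                v = child
                break
        else:
            return None
    return v.decision

def first_match(rules, packet):
    for predicate, decision in rules:
        if all(value in allowed for value, allowed in zip(packet, predicate)):
            return decision
    return None

# Toy rule sequence over three small fields (values chosen for illustration only).
DOMAINS = [range(2), range(4), range(2)]
RULES = [
    (({0}, {0, 1, 2, 3}, {1}), "a"),
    (({0}, {1, 2}, {0, 1}), "d"),
    (({0, 1}, {0, 1, 2, 3}, {0, 1}), "a"),
]
fdd = construct(RULES)
print(all(lookup(fdd, p) == first_match(RULES, p) for p in product(*DOMAINS)))
```

   Using sets instead of intervals keeps the set difference and intersection in the three cases one-liners; an interval-based version would need the same case analysis on range end points.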



B. Construction of Firewall Decision Diagrams
   Next, we discuss how to construct an equivalent FDD from a sequence of rules r1, ···, rn, where each rule is of the format (F1 ∈ S1) ∧ ··· ∧ (Fd ∈ Sd) → decision. Note that all the d packet fields appear in the predicate of each rule, and they appear in the same order.
   We first construct a partial FDD from the first rule. A partial FDD is a diagram that has all the properties of an FDD except the completeness property. The partial FDD constructed from a single rule contains only the decision path that defines the rule. Suppose that from the first i rules, r1 through ri, we have constructed a partial FDD whose root v is labelled F1, and suppose v has k outgoing edges e1, ···, ek. Let ri+1 be the rule (F1 ∈ S1) ∧ ··· ∧ (Fd ∈ Sd) → decision. Next we consider how to append rule ri+1 to this partial FDD.
   First, we examine whether we need to add another outgoing edge to v. If S1 − (I(e1) ∪ ··· ∪ I(ek)) ≠ ∅, we need to add a new outgoing edge with label S1 − (I(e1) ∪ ··· ∪ I(ek)) to v, because any packet whose F1 field is an element of S1 − (I(e1) ∪ ··· ∪ I(ek)) does not match any of the first i rules, but matches ri+1 provided that the packet satisfies (F2 ∈ S2) ∧ ··· ∧ (Fd ∈ Sd). We then build a decision path from (F2 ∈ S2) ∧ ··· ∧ (Fd ∈ Sd) → decision, and make the new edge of node v point to the first node of this decision path.
   Second, we compare S1 and I(ej) for each j where 1 ≤ j ≤ k. This comparison leads to one of the following three cases:
   1) S1 ∩ I(ej) = ∅: In this case, we skip edge ej because any packet whose value of field F1 is in set I(ej) does not match ri+1.
   2) S1 ∩ I(ej) = I(ej): In this case, a packet whose value of field F1 is in set I(ej) may match one of the first i rules, and it also may match rule ri+1. So we append the rule (F2 ∈ S2) ∧ ··· ∧ (Fd ∈ Sd) → decision to the subgraph rooted at the node that ej points to.
   3) S1 ∩ I(ej) ≠ ∅ and S1 ∩ I(ej) ≠ I(ej): In this case, we split edge ej into two edges: e′ with label I(ej) − S1 and e′′ with label I(ej) ∩ S1. Then we make two copies of the subgraph rooted at the node that ej points to, and let e′ and e′′ point to one copy each. We then deal with e′ by the first case, and e′′ by the second case.
   The pseudocode of the FDD construction algorithm is shown in Figure 7. Here we use e.t to denote the (target) node that the edge e points to.
   As an example, consider the sequence of rules in Table I. Figure 6 shows the partial FDD that we construct from the first rule, and the partial FDD after we append the second rule. The FDD after we append the third rule is shown in Figure 2.

Construction Algorithm
Input : A firewall f of a sequence of rules r1, ···, rn
Output : An FDD f′ such that f and f′ are equivalent
Steps:
1. build a decision path with root v from rule r1;
2. for i := 2 to n do APPEND( v, ri );
End

APPEND( v, (Fm ∈ Sm) ∧ ··· ∧ (Fd ∈ Sd) → dec )
/*F(v) = Fm and E(v) = {e1, ···, ek}*/
1. if ( Sm − ( I(e1) ∪ ··· ∪ I(ek) ) ) ≠ ∅ then
      (a) add an outgoing edge ek+1 with label Sm − (I(e1) ∪ ··· ∪ I(ek)) to v;
      (b) build a decision path from rule (Fm+1 ∈ Sm+1) ∧ ··· ∧ (Fd ∈ Sd) → dec,
          and make ek+1 point to the first node in this path;
2. if m < d then
     for j := 1 to k do
       if I(ej) ⊆ Sm then
          APPEND( ej.t, (Fm+1 ∈ Sm+1) ∧ ··· ∧ (Fd ∈ Sd) → dec );
       else if I(ej) ∩ Sm ≠ ∅ then
          (a) add one outgoing edge e to v, and label e with I(ej) ∩ Sm;
          (b) make a copy of the subgraph rooted at ej.t, and make e point to the root of the copy;
          (c) replace the label of ej by I(ej) − Sm;
          (d) APPEND( e.t, (Fm+1 ∈ Sm+1) ∧ ··· ∧ (Fd ∈ Sd) → dec );

Fig. 7. FDD Construction Algorithm

   Theorem 1: Given a firewall of n simple rules, the maximum number of paths in the FDD constructed using the FDD construction algorithm is (2n − 1)^d, where d is the number of fields in each rule. □
   Proof: Let the n simple rules be r1, r2, ···, rn, where each rule ri is denoted

      ri = (F1 ∈ S1^i) ∧ (F2 ∈ S2^i) ∧ ··· ∧ (Fd ∈ Sd^i) → decision_i

For each field Fj, each set Sj^i has two end points (the minimum and the maximum value of the range). Thus, the n rules contribute at most 2n end points to the domain of Fj, and the total number of intervals separated by these 2n points is at most 2n − 1, which means that the number of outgoing edges of a node labelled Fj is at most 2n − 1. Because the total number of fields is d, the number of paths in the constructed FDD is at most (2n − 1)^d. □


                   IV. SHAPING ALGORITHM

   In this section, we discuss how to transform two ordered, but not semi-isomorphic, FDDs fa and fb into two semi-isomorphic FDDs fa′ and fb′ such that fa is equivalent to fa′, and fb is equivalent to fb′. Informally, a firewall decision diagram is ordered if and only if along every path from the root to a terminal node, the labels of the non-terminal nodes obey the same order; two firewall decision diagrams are semi-isomorphic if and only if they are exactly the same except for the labels of their terminal nodes. The formal definitions of ordered FDDs and semi-isomorphic FDDs are as follows. Note that the FDDs constructed by the construction algorithm in Section III are ordered.
   Definition 4.1 (Ordered FDDs): Let ≺ be the total order over the packet fields F1, ···, Fd where F1 ≺ ··· ≺ Fd holds. An FDD is ordered iff for each decision path (v1 e1 ··· vk ek vk+1), we have F(v1) ≺ ··· ≺ F(vk). □
   Definition 4.2 (Semi-isomorphic FDDs): Two FDDs f and f′ are semi-isomorphic iff there exists a one-to-one mapping σ from the nodes of f onto the nodes of f′, such that the following two conditions hold:
  1) For any node v in f, either both v and σ(v) are nonterminal nodes with the same label, or both of them are terminal nodes;
  2) For each edge e in f, where e is from a node v1 to a node v2, there is an edge e′ from σ(v1) to σ(v2) in f′, and the two edges e and e′ have the same label. □
   The algorithm for transforming two ordered FDDs into two semi-isomorphic FDDs uses the following three basic operations. (Note that none of these operations changes the semantics of the FDDs.)
  1) Node Insertion: If along all the decision paths containing a node v, there is no node that is labelled with a field F, then we can insert a node v′ labelled F above v as follows: make all incoming edges of v point to v′, create one edge from v′ to v, and label this edge with the domain of F.
  2) Edge Splitting: For an edge e from v1 to v2, if I(e) = S1 ∪ S2, where neither S1 nor S2 is empty, then we can split e into two edges as follows: replace e by two edges from v1 to v2, label one edge with S1 and label the other with S2.
  3) Subgraph Replication: If a node v has m (m ≥ 2) incoming edges, we can make m copies of the subgraph rooted at v, and make each incoming edge of v point to the root of one distinct copy.

A. FDD Simplifying
   Before applying the shaping algorithm, presented below, to two ordered FDDs, we need to transform each of them into an equivalent simple FDD. A simple FDD is defined as follows:
   Definition 4.3 (Simple FDDs): An FDD is simple iff each node in the FDD has at most one incoming edge and each edge in the FDD is labelled with a single interval. □
   It is straightforward that the two operations of edge splitting and subgraph replication can be applied repetitively to an FDD in order to make this FDD simple. Note that the graph of a simple FDD is an outgoing directed tree. In other words, each node in a simple FDD, except the root, has only one parent node, and has only one incoming edge (from the parent node).

B. Node Shaping
   Next, we introduce the procedure for transforming two shapable nodes into two semi-isomorphic nodes, which is the basic building block in the shaping algorithm for transforming two ordered FDDs into two semi-isomorphic FDDs. Shapable nodes and semi-isomorphic nodes are defined as follows.
   Definition 4.4 (Shapable Nodes): Let fa and fb be two ordered simple FDDs, va be a node in fa, and vb be a node in fb. Nodes va and vb are shapable iff one of the following two conditions holds:
   1) Both va and vb have no parents, i.e., they are the roots of their respective FDDs;
   2) Both va and vb have parents, their parents have the same label, and their incoming edges have the same label. □
   Definition 4.5 (Semi-isomorphic Nodes): Let fa and fb be two ordered simple FDDs, va be a node in fa and vb be a node in fb. The two nodes va and vb are semi-isomorphic iff one of the following two conditions holds:
   1) Both va and vb are terminal nodes;
   2) Both va and vb are nonterminal nodes with the same label and there exists a one-to-one mapping σ from the children of va to the children of vb such that for each child v of va, v and σ(v) are shapable. □
   For example, the two nodes labelled F1 in Figure 8 are shapable since they have no parents, and the two nodes labelled F1 in Figure 9 are semi-isomorphic nodes.

[Figure 8 omitted: two shapable root nodes labelled F1. The first has outgoing edges [1, 50] and [51, 100], the second has outgoing edges [1, 30] and [31, 100]; every edge points to a node labelled F2.]

Fig. 8. Two shapable nodes in two FDDs

   The algorithm for making two shapable nodes va and vb semi-isomorphic consists of two steps:
   1) Step I: This step is skipped if va and vb have the same label, or both of them are terminal nodes. Otherwise, without loss of generality, assume F(va) ≺ F(vb). It is straightforward to show that in this case along all the decision paths containing node vb, no node is labelled F(va). Therefore,
      we can create a new node vb′ with label F(va), create a new edge with label D(F(va)) from vb′ to vb, and make all incoming edges of vb point to vb′. Now va has the same label as vb′. (Recall that this node insertion operation leaves the semantics of the FDD unchanged.)

[Figure 9 omitted: two semi-isomorphic nodes labelled F1. Each has outgoing edges [1, 30], [31, 50] and [51, 100], every edge points to a node labelled F2, and each pair of corresponding F2 nodes is marked as shapable nodes.]

Fig. 9. Two semi-isomorphic nodes

Procedure Node Shaping( fa, fb, va, vb )
Input : Two ordered simple FDDs fa and fb, and two shapable nodes va in fa and vb in fb
Output: The two nodes va and vb become semi-isomorphic, and the procedure returns a set S of
        node pairs of the form (wa, wb) where wa is a child of va in fa, wb is a child of vb
        in fb, and the two nodes wa and wb are shapable.
Steps:
1. if (both va and vb are terminal) return( ∅ );
   else if ∼(both va and vb are nonterminal and they have the same label)
   then /*Here either both va and vb are nonterminal and they have different labels,
          or one node is terminal and the other is nonterminal. Without loss of
          generality, assume one of the following conditions holds:
          (1) both va and vb are nonterminal and F(va) ≺ F(vb),
          (2) va is nonterminal and vb is terminal.*/
        insert a new node with label F(va) above vb, and call the new node vb;
2. let E(va) be {ea,1, ···, ea,m} where I(ea,1) < ··· < I(ea,m).
   let E(vb) be {eb,1, ···, eb,n} where I(eb,1) < ··· < I(eb,n).
3. i := 1; j := 1;
   while ( ( i < m ) or ( j < n ) ) do{
       /*During this loop, the two intervals I(ea,i) and I(eb,j) always begin with the same integer.*/
       let I(ea,i) = [A, B] and I(eb,j) = [A, C], where A, B, C are three integers;
       if B = C then {i := i + 1; j := j + 1; }
       else if B < C then{
            (a) create an outgoing edge e of vb, and label e with [A, B];
            (b) make a copy of the subgraph rooted at eb,j.t and make e point to the root of the copy;
            (c) I(eb,j) := [B + 1, C];
            (d) i := i + 1;}

   2) Step II: From the previous step, we can assume that va and vb have the same label. In the current step, we use the two operations of edge splitting and subgraph replication to build a one-to-one correspondence from the children of va to the children of vb such that each child of va and its corresponding child of vb are shapable.
      Suppose D(F(va)) = D(F(vb)) = [a, b]. We know that each outgoing edge of va or vb is labelled with a single interval. Suppose va has m outgoing edges {e1, ···, em}, where I(ei) = [ai, bi], a1 = a, bm = b, and every ai+1 = bi + 1. Also suppose vb has n outgoing edges {e′1, ···, e′n}, where I(e′i) = [a′i, b′i], a′1 = a, b′n = b, and every a′i+1 = b′i + 1.
      Comparing edge e1, whose label is [a, b1], and e′1, whose label is [a, b′1], we have the following two cases: (1) b1 = b′1: In this case I(e1) = I(e′1), therefore, node e1.t and node e′1.t are shapable. (Recall that we use e.t to denote the node that edge e points to.) Then we can continue to compare e2 and e′2 since both I(e2) and I(e′2) begin with b1 + 1. (2)
                                                                                       else {/*B > C */
      b1 = b′ : Without loss of generality, we assume b1 < b′ .
               1                                                     1
                                                                                            (a) create an outgoing edge e of va ,
      In this case, we split e′ into two edges e and e′ , where e
                                 1
                                                                                                and label e with [A, C];
      is labelled [a, b1 ] and e′ is labelled [b1 + 1, b′ ]. Then we
                                                         1
                                                        ′                                   (b) make a copy of the subgraph rooted at ea,j .t and
      make two copies of the subgraph rooted at e1 .t and let e
                                                                                                make e point to the root of the copy;
      and e′ point to one copy each. Thus I(e1 ) = I(e) and the
                                                                                            (c) I(ea,i ) := [C + 1, B];
      two nodes, e1 .t and e.t are shapable. Then we can continue
                                                                                            (d) j := j + 1;}
      to compare the two edges e2 and e′ since both I(e2 ) and
                                                                                   }
      I(e′ ) begin with b1 + 1.
                                                                                4. /*Now va and vb become semi-isomorphic.*/
      The above process continues until we reach the last outgo-
                                                                                   let E(va ) = {ea,1 , · · · , ea,k } where
      ing edge of va and the last outgoing edge of vb . Note that
                                                                                       I(ea,1 ) < · · · < I(ea,k ) and k ≥ 1;
      each time that we compare an outgoing edge of va and an
                                                                                   let E(vb ) = {eb,1 , · · · , eb,k } where
      outgoing edge of vb , the two intervals labelled on the two
                                                                                       I(eb,1 ) < · · · < I(eb,k ) and k ≥ 1;
      edges begin with the same value. Therefore, the last two
                                                                                   S := ∅;
      edges that we compare must have the same label because
                                                                                   for i = 1 to k do
      they both end with b. In other words, this edge splitting
                                                                                       add the pair of shapable nodes ( ea,i .t, eb,i .t ) to S ;
      and subgraph replication process will terminate. When it
                                                                                   return( S );
      terminates, va and vb become semi-isomorphic.
                                                                                End
  Figure 10 shows the pseudocode for making two shapable
                                                                                Fig. 10.   Node Shaping Algorithm
nodes in two ordered simple FDDs semi-isomorphic. We use
I(e) < I(e′ ) to indicate that every integer in I(e) is less than
every integer in I(e′ ).
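
   The edge splitting and subgraph replication of Step II can be
illustrated with a minimal Python sketch. It is not the authors'
implementation: the Node class, the shape_children function, and the
representation of an edge as an ((lo, hi), child) pair are assumptions
made only for illustration. The sketch further assumes that the two
nodes already have the same label (Step I) and that their outgoing
edges partition the same domain [a, b] in increasing order.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical FDD node: a field name, or a decision for a terminal."""
    label: str
    edges: list = field(default_factory=list)  # [((lo, hi), child), ...]


def shape_children(va, vb):
    """Align the outgoing edges of va and vb so both carry the same intervals.

    Returns the list of (child of va, child of vb) pairs that share an
    interval after splitting, i.e. the pairs to be shaped next.
    """
    ea, eb = list(va.edges), list(vb.edges)
    new_a, new_b, pairs = [], [], []
    i = j = 0
    while i < len(ea) and j < len(eb):
        (lo, hi_a), child_a = ea[i]
        (lo_b, hi_b), child_b = eb[j]
        assert lo == lo_b, "both current intervals must begin with the same integer"
        hi = min(hi_a, hi_b)
        # The edge that reaches further is split; its lower piece points
        # to a copy of the subgraph, the remainder keeps the original.
        ca = child_a if hi_a == hi else copy.deepcopy(child_a)
        cb = child_b if hi_b == hi else copy.deepcopy(child_b)
        new_a.append(((lo, hi), ca))
        new_b.append(((lo, hi), cb))
        pairs.append((ca, cb))
        if hi_a == hi:
            i += 1
        else:
            ea[i] = ((hi + 1, hi_a), child_a)
        if hi_b == hi:
            j += 1
        else:
            eb[j] = ((hi + 1, hi_b), child_b)
    # Both edge lists end at b, so they are exhausted together.
    assert i == len(ea) and j == len(eb)
    va.edges, vb.edges = new_a, new_b
    return pairs
```

For instance, if va has edges labelled [1, 4] and [5, 10] while vb has
edges labelled [1, 6] and [7, 10], both nodes end up with three edges
labelled [1, 4], [5, 6] and [7, 10].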



   If we apply the above node shaping procedure to the two
shapable nodes labelled F1 in Figure 8, we make them
semi-isomorphic as shown in Figure 9.

C. FDD Shaping

   To make two ordered FDDs fa and fb semi-isomorphic, we first
make fa and fb simple, and then make fa and fb semi-isomorphic
as follows. Suppose we have a queue Q, which is initially empty.
At first we put the pair of shapable nodes consisting of the root
of fa and the root of fb into Q. As long as Q is not empty,
we remove the head of Q, feed the two shapable nodes to the
above Node Shaping procedure, then put all the pairs of shapable
nodes returned by the Node Shaping procedure into Q. When
the algorithm finishes, fa and fb become semi-isomorphic. The
pseudocode for this shaping algorithm is shown in Figure 11.

Shaping Algorithm
Input : Two ordered FDDs fa and fb
Output : fa and fb become semi-isomorphic.
Steps:
1. make the two FDDs fa and fb simple;
2. Q := ∅;
3. add the shapable pair (root of fa , root of fb ) to Q;
4. while Q ≠ ∅ do{
       remove the head pair (va , vb ) from Q;
       S := Node Shaping( fa , fb , va , vb );
       add every shapable pair from S into Q;
   }
End

Fig. 11.   Shaping Algorithm

   As an example, if we apply the above shaping algorithm to the
two FDDs in Figures 2 and 3, we obtain two semi-isomorphic
FDDs as shown in Figures 4 and 5.

                   V. COMPARISON ALGORITHM

   In this section, we consider how to compare two
semi-isomorphic FDDs. Given two semi-isomorphic FDDs fa
and fb with a one-to-one mapping σ, each decision path
(v_1 e_1 · · · v_k e_k v_{k+1}) in fa has a corresponding decision path
(σ(v_1)σ(e_1) · · · σ(v_k)σ(e_k)σ(v_{k+1})) in fb. Similarly, each rule
(F(v_1) ∈ I(e_1)) ∧ · · · ∧ (F(v_k) ∈ I(e_k)) → F(v_{k+1}) in
fa.rules has a corresponding rule
(F(σ(v_1)) ∈ I(σ(e_1))) ∧ · · · ∧ (F(σ(v_k)) ∈ I(σ(e_k))) → F(σ(v_{k+1}))
in fb.rules. Note that F(v_i) = F(σ(v_i)) and I(e_i) = I(σ(e_i)) for
each i where 1 ≤ i ≤ k. Therefore, for each rule
(F(v_1) ∈ I(e_1)) ∧ · · · ∧ (F(v_k) ∈ I(e_k)) → F(v_{k+1}) in fa.rules,
the corresponding rule in fb.rules is
(F(v_1) ∈ I(e_1)) ∧ · · · ∧ (F(v_k) ∈ I(e_k)) → F(σ(v_{k+1})). Each
of these two rules is called the companion of the other.
   This companionship implies a one-to-one mapping from the
rules defined by the decision paths in fa to the rules defined by
the decision paths in fb. Note that for each rule and its companion,
either they are identical, or they have the same predicate but
different decisions. Therefore, fa.rules − fb.rules is the set of
all the rules in fa.rules that have different decisions from their
companions. The same holds for fb.rules − fa.rules. Note
that the set of all the companions of the rules in fa.rules − fb.rules
is fb.rules − fa.rules; and similarly the set of all the companions
of the rules in fb.rules − fa.rules is fa.rules − fb.rules. Since
these two sets manifest the functional discrepancies between the
two FDDs, the two design teams can investigate them to resolve
the discrepancies.
   Let fa be the FDD in Figure 4, and fb be the FDD in Figure 5.
Here fa is equivalent to the firewall in Table I designed by Team
A, and fb is equivalent to the firewall in Table II designed by
Team B. By comparing fa and fb, we can discover all functional
discrepancies between the firewalls designed by A and B. The
discrepancies are shown in Table III, based on which the following
three questions need to be investigated:

   1) Should we allow the computers from the malicious domain
      to send email to the mail server? Team A says yes, while
      Team B says no.

        Interface:            0
        Source IP:            224.168.0.0/16
        Destination IP:       192.168.0.1
        Destination Port:     25
        Protocol Type:        TCP
        Team A Decision:      accept
        Team B Decision:      discard

   2) Should we allow non-TCP packets with destination port
      number 25 to be sent from the hosts that are not in the
      malicious domain to the mail server? Team A says yes,
      while Team B says no.

        Interface:            0
        Source IP:            !224.168.0.0/16
        Destination IP:       192.168.0.1
        Destination Port:     25
        Protocol Type:        !TCP
        Team A Decision:      accept
        Team B Decision:      discard

   3) Should we allow the packets with a destination port
      number other than 25 to be sent from the hosts that are
      not in the malicious domain to the mail server? Team A
      says yes, while Team B says no.

        Interface:            0
        Source IP:            !224.168.0.0/16
        Destination IP:       192.168.0.1
        Destination Port:     !25
        Protocol Type:        *
        Team A Decision:      accept
        Team B Decision:      discard

                 VI. DISCREPANCY RESOLUTION

   After all functional discrepancies are computed, the teams need
to agree on the correct decision for each discrepancy. Consider the
discrepancies shown in Table III. Suppose these discrepancies are
resolved as shown in Table IV.

  Discrepancy #   Interface     Source IP      Destination IP   Destination Port   Protocol   Resolved Decision
        1             0       224.168.0.0/16    192.168.0.1           25             TCP           discard
        2             0      !224.168.0.0/16    192.168.0.1           25            !TCP           accept
        3             0      !224.168.0.0/16    192.168.0.1          !25              *            discard

                              TABLE IV
                 RESOLVED FUNCTIONAL DISCREPANCIES

   The question that we want to answer in this section is: how do
we generate the final firewall that reflects the resolved functional
discrepancies? Of course, if one team made all the correct
decisions according to the discrepancy resolution, we can simply
deploy the firewall designed by that team. Next, we assume that
no team makes all the correct decisions. In this paper, we propose
two methods for this purpose. Then we discuss which method
we should choose in practice.

A. Method 1: Generate Rules from Corrected FDD

   This method has two steps. First, correct one of the two
semi-isomorphic FDDs using the discrepancy resolution. Second,
generate rules from the resulting FDD using the algorithms
presented in [12].
   Step 1: FDD Correction. We can pick either of the two
semi-isomorphic FDDs generated by the FDD shaping algorithm and
apply corrections on the labels of the terminal nodes. Note that
after we apply the corrections to both semi-isomorphic FDDs, they
become exactly the same. Note that we cannot directly use the
corrected FDD as the configuration of a firewall because most
existing firewall devices take a sequence of rules as their
configuration.
   Step 2: Firewall Generation. Given the corrected FDD, we can
apply the algorithms in [12] for generating a compact firewall
from an FDD. Table V shows the firewall generated from the
corrected FDD. Interested readers can refer to [12] for more
technical details.

  Rule #   Interface     Source IP      Destination IP   Destination Port   Protocol   Decision
    1          0       224.168.0.0/16        *                  *              *       discard
    2          0             *          192.168.0.1            25              *       accept
    3          0             *          192.168.0.1             *              *       discard
    4          *             *               *                  *              *       accept

                              TABLE V
              FIREWALL GENERATED FROM THE CORRECTED FDD

B. Method 2: Combine Corrections with Original Firewalls

   The second method is to create a new firewall using the rules
in the discrepancy resolution and one of the original firewalls.
This method consists of the following two steps.
   Step 1: Firewall Composition. In this step, we first pick an
original firewall, and then we take all the rules in the discrepancy
resolution for which the original firewall made incorrect decisions
and add them to the beginning of the firewall.
   Step 2: Redundancy Removal. In this step, we apply the firewall
compaction algorithm in [19] to remove redundant rules from the
resulting sequence of rules. A rule is redundant if and only if
removing the rule does not change the semantics of the firewall.
   For example, we can pick the firewall in Table I designed by
Team A, and on top of that, we can add the first rule and the third
rule from the discrepancy resolution in Table IV. Note that Team
A only made incorrect decisions for the packets that match the
first rule and the third rule in Table IV. By adding these two rules
to the beginning of the original three rules designed by Team A,
all packets are mapped to the correct decisions. After the above
two steps, the resulting firewall is shown in Table VI. Similarly,
we can pick the firewall in Table II designed by Team B, and
then add the second rule from the discrepancy resolution in Table
IV to the beginning of the firewall. After the above two steps, the
resulting firewall is shown in Table VII.

  Rule #   Interface     Source IP      Destination IP   Destination Port   Protocol   Decision
    1          0       224.168.0.0/16    192.168.0.1           25             TCP      discard
    2          0      !224.168.0.0/16    192.168.0.1          !25              *       discard
    3          0             *           192.168.0.1           25             TCP      accept
    4          0       224.168.0.0/16         *                 *              *       discard
    5          *             *                *                 *              *       accept

                              TABLE VI
  FIREWALL GENERATED BY COMBINING THE RULES IN TABLE IV AND THE RULES IN TABLE I

  Rule #   Interface     Source IP      Destination IP   Destination Port   Protocol   Decision
    1          0      !224.168.0.0/16    192.168.0.1           25            !TCP      accept
    2          0       224.168.0.0/16         *                 *              *       discard
    3          0             *           192.168.0.1           25             TCP      accept
    4          0             *           192.168.0.1            *              *       discard
    5          *             *                *                 *              *       accept

                              TABLE VII
  FIREWALL GENERATED BY COMBINING THE RULES IN TABLE IV AND THE RULES IN TABLE II

                         VII. DISCUSSION

A. Prefix and Intervals

   Real-life firewalls usually check five packet fields: source IP
address, destination IP address, source port number, destination
port number, and protocol type. Of these five fields, the first two
fields are usually represented using prefix formats, and the last
three fields are usually integer intervals. Note that prefix formats
and interval formats are interconvertible. For example, IP prefix
192.168.0.0/16 can be converted to the interval from 192.168.0.0
to 192.168.255.255, where an IP address can be regarded as a
32-bit integer. As another example, the interval [2, 8] can be
converted to 3 prefixes: 001∗, 01∗, 1000.
   To use the algorithms presented in this paper, we first convert
the source and destination IP addresses from prefix formats to
integer intervals. Note that every prefix can be converted to
only one integer interval. Second, we run the three algorithms
described in this paper. Note that the functional discrepancies
directly produced by our algorithms are in interval formats. Third,
for each functional discrepancy computed, we convert the source
and destination IP addresses from intervals to prefixes. Thus, the
formats of outputs are similar to those of original firewall rules,
which are easy to understand for firewall administrators. (A w-bit
integer interval can be converted to at most 2w − 2 prefixes [14].)

B. Design in FDDs

   In our discussion so far, we have assumed that both teams
design their firewalls using a sequence of rules. In fact, a team
can use the structured firewall design method in [12] to design
the firewall using an FDD. Such cases are easy to handle using
the FDD construction algorithm in this paper and the firewall
generation algorithm in [12]. For example, if only one team
designs the firewall using a non-ordered FDD, we can use the
firewall generation algorithm in [12] to generate a sequence of
rules from the FDD first, and then apply the algorithms in this
paper. As another example, if two teams design two ordered
firewall decision diagrams that are in a different order, we can
first generate an equivalent sequence of rules from one diagram,
and then we can construct an equivalent ordered firewall decision
diagram from the sequence of rules using the order of packet
fields from the other firewall decision diagram.

C. More Than Two Teams

   In terms of firewall comparison, what we have discussed so far
is how to compare two firewalls. If we have N firewalls designed
by N teams, where N > 2, there are two ways to compare them:
cross comparison and direct comparison. Cross comparison means
to compare each of the N(N − 1)/2 pairs, where each pair consists
of two of the N firewalls. Direct comparison means to extend the
shaping algorithm and the comparison algorithm to handle N
firewalls. This extension is considered fairly straightforward.

D. Complexity Analysis

   Let n be the number of rules in a firewall, and d be the
total number of distinct packet fields that are examined by a
firewall. Based on Theorem 1, the time and space complexity
of the FDD construction algorithm is O(n^d). Similarly, the time
and space complexity of the FDD shaping algorithm and the FDD
comparison algorithm is O((n + m)^d), where n and m are the total
number of rules in the two given firewalls respectively. Despite
such worst case complexities, our algorithms are practical for
two reasons. First, d is typically small. Most real-life firewalls
only examine four packet fields: source IP address, destination IP
address, destination port number, and protocol type. Second, the
worst case of our algorithms is extremely unlikely to happen in
practice. The experimental results in the next section confirm the
above observations.

E. Why not BDDs?

   Our solution uses FDDs as the basic data structure for
computing the functional discrepancies between two given firewalls.
One question that we need to answer is: why not use Binary
Decision Diagrams (BDDs) [6]? A BDD is a rooted, directed,
acyclic graph that represents a Boolean function. In a BDD, each
non-terminal node is labeled by a Boolean variable and it has
only two outgoing edges labeled 0 and 1 respectively. Each edge
represents an assignment of 0 or 1. A BDD only has two terminal
nodes labeled 0 and 1 respectively.
   The answer is that the functional discrepancies computed by
BDDs are not human readable. First, the BDD itself, i.e., the one
that represents the functional discrepancies between two firewalls,
is not human readable because every node in a BDD represents
only a bit of a packet, not a field of a packet. Second, generating
human readable discrepancies, which are similar to rules, from
a BDD results in an exorbitant number of rules, on the order of
millions. We have implemented BDD-based solutions
using the CUDD package [23]. Unfortunately, comparing two small
firewalls results in millions of rules. While compressing millions
of rules may not be impossible, it is by no means trivial. In
contrast, using the data structure of FDDs, we can easily generate
human readable functional discrepancies in a rule-like format.

                   VIII. EXPERIMENTAL RESULTS

   In this section, we present the results of the experiments that
we conducted to evaluate both the effectiveness and efficiency of
our diverse firewall design method.

A. Effectiveness

   To evaluate the effectiveness of the diverse firewall design
method, we conducted a real experiment as follows. First, we
obtained a real-life firewall used in a university. This firewall
was maintained by a senior firewall administrator as a sequence
of rules. This firewall, unfortunately, did not have a requirement
specification. However, the rules in this firewall were well
documented in that each rule had some detailed comments about
why the rule was added. Taking the comments of the rules
as the requirement specification, we let a computer science
undergraduate student design a firewall using firewall decision
diagrams. Before the design started, we gave the student some
training on designing firewalls using firewall decision diagrams.
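
   Returning to the conversion between prefixes and intervals
described in Section VII-A, the following Python sketch shows one way
the two directions could be implemented. It is an illustration under
stated assumptions, not the procedure of [14]: the function names, the
representation of a prefix as a bit string optionally ending in "*",
and the greedy splitting into aligned power-of-two blocks are all
choices made here for the example.

```python
import ipaddress


def ip_prefix_to_interval(cidr):
    """Convert an IPv4 prefix such as '192.168.0.0/16' to an integer interval."""
    net = ipaddress.ip_network(cidr)
    return int(net.network_address), int(net.broadcast_address)


def prefix_to_interval(prefix, w):
    """Convert a w-bit prefix such as '01*' to the interval it covers."""
    bits = prefix.rstrip("*")
    free = w - len(bits)                      # number of unconstrained bits
    lo = (int(bits, 2) if bits else 0) << free
    return lo, lo + (1 << free) - 1


def interval_to_prefixes(lo, hi, w):
    """Cover [lo, hi] with w-bit prefixes, greedily from the left.

    At each step take the largest power-of-two block that starts at lo,
    is aligned at lo, and does not extend beyond hi.
    """
    prefixes = []
    while lo <= hi:
        size = (lo & -lo) if lo else (1 << w)  # largest block aligned at lo
        while size > hi - lo + 1:              # shrink until it fits
            size >>= 1
        free = size.bit_length() - 1
        fixed = format(lo >> free, "b").zfill(w - free) if free < w else ""
        prefixes.append(fixed + ("*" if free else ""))
        lo += size
    return prefixes
```

With w = 4, interval_to_prefixes(2, 8, 4) yields the three prefixes
001*, 01* and 1000 given as the example in Section VII-A, and
ip_prefix_to_interval("192.168.0.0/16") yields the integers that
correspond to 192.168.0.0 and 192.168.255.255.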
 
IRJET- Software Architecture and Software Design
IRJET- Software Architecture and Software DesignIRJET- Software Architecture and Software Design
IRJET- Software Architecture and Software Design
 
Defensive coding practices is one of the most critical proactive s
Defensive coding practices is one of the most critical proactive sDefensive coding practices is one of the most critical proactive s
Defensive coding practices is one of the most critical proactive s
 
Software engineering Questions and Answers
Software engineering Questions and AnswersSoftware engineering Questions and Answers
Software engineering Questions and Answers
 
Auto Finding and Resolving Distributed Firewall Policy
Auto Finding and Resolving Distributed Firewall PolicyAuto Finding and Resolving Distributed Firewall Policy
Auto Finding and Resolving Distributed Firewall Policy
 
The Role of the Architect in ERP and PDM System Deployment
The Role of the Architect in ERP and PDM System DeploymentThe Role of the Architect in ERP and PDM System Deployment
The Role of the Architect in ERP and PDM System Deployment
 
Improved Strategy for Distributed Processing and Network Application Developm...
Improved Strategy for Distributed Processing and Network Application Developm...Improved Strategy for Distributed Processing and Network Application Developm...
Improved Strategy for Distributed Processing and Network Application Developm...
 
Improved Strategy for Distributed Processing and Network Application Development
Improved Strategy for Distributed Processing and Network Application DevelopmentImproved Strategy for Distributed Processing and Network Application Development
Improved Strategy for Distributed Processing and Network Application Development
 
Elements of Innovation Management in Computer Software and Services
Elements of Innovation Management in Computer Software and ServicesElements of Innovation Management in Computer Software and Services
Elements of Innovation Management in Computer Software and Services
 
ANALYSIS OF SECURITY ASPECTS FOR DYNAMIC RESOURCE MANAGEMENT IN DISTRIBUTED S...
ANALYSIS OF SECURITY ASPECTS FOR DYNAMIC RESOURCE MANAGEMENT IN DISTRIBUTED S...ANALYSIS OF SECURITY ASPECTS FOR DYNAMIC RESOURCE MANAGEMENT IN DISTRIBUTED S...
ANALYSIS OF SECURITY ASPECTS FOR DYNAMIC RESOURCE MANAGEMENT IN DISTRIBUTED S...
 
ANALYSIS OF SECURITY ASPECTS FOR DYNAMIC RESOURCE MANAGEMENT IN DISTRIBUTED S...
ANALYSIS OF SECURITY ASPECTS FOR DYNAMIC RESOURCE MANAGEMENT IN DISTRIBUTED S...ANALYSIS OF SECURITY ASPECTS FOR DYNAMIC RESOURCE MANAGEMENT IN DISTRIBUTED S...
ANALYSIS OF SECURITY ASPECTS FOR DYNAMIC RESOURCE MANAGEMENT IN DISTRIBUTED S...
 
Application and Website Security -- Developer Edition: Introducing Security I...
Application and Website Security -- Developer Edition:Introducing Security I...Application and Website Security -- Developer Edition:Introducing Security I...
Application and Website Security -- Developer Edition: Introducing Security I...
 
[EMC] Source Code Protection
[EMC] Source Code Protection[EMC] Source Code Protection
[EMC] Source Code Protection
 
dist_systems.pdf
dist_systems.pdfdist_systems.pdf
dist_systems.pdf
 
scp
scpscp
scp
 

10.1.1.92.7063

  • 1. Diverse Firewall Design
Alex X. Liu, Member, IEEE, Mohamed G. Gouda, Member, IEEE

The preliminary version of this paper was published in the proceedings of the IEEE International Conference on Dependable Systems and Networks (DSN-04), pages 595-604, Florence, Italy, June 2004. It won the William C. Carter Award.
Alex X. Liu is with the Department of Computer Science and Engineering at Michigan State University, East Lansing, Michigan, U.S.A. Email: alexliu@cse.msu.edu
Mohamed G. Gouda is with the Department of Computer Sciences at The University of Texas at Austin, Austin, Texas, U.S.A. Email: gouda@cs.utexas.edu

Abstract: Firewalls are the mainstay of enterprise security and the most widely adopted technology for protecting private networks. An error in a firewall policy either creates security holes that will allow malicious traffic to sneak into a private network or blocks legitimate traffic and disrupts normal business processes, which in turn could lead to irreparable, if not tragic, consequences. It has been observed that most firewall policies on the Internet are poorly designed and have many errors. Therefore, how to design firewall policies correctly is an important issue.

In this paper, we propose the method of diverse firewall design, which consists of three phases: a design phase, a comparison phase, and a resolution phase. In the design phase, the same requirement specification of a firewall policy is given to multiple teams who proceed independently to design different versions of the firewall policy. In the comparison phase, the resulting multiple versions are compared with each other to detect all functional discrepancies between them. In the resolution phase, all discrepancies are resolved and a firewall that is agreed upon by all teams is generated. The major technical challenge in the method of diverse firewall design is how to discover all functional discrepancies between two given firewall policies. We present a series of three efficient algorithms for solving this problem: a construction algorithm, a shaping algorithm, and a comparison algorithm.

The algorithms for discovering all functional discrepancies between two given firewall policies can be used to perform firewall policy change-impact analysis as well. Firewall policies often need to be changed as networks evolve and new threats emerge. Many firewall policy errors are caused by the unintended side-effects of policy changes. Our algorithms can be used directly to compute the impact of firewall policy changes by computing the functional discrepancies between the policy before changes and the policy after changes.

Index Terms: Firewall policy, policy design, design diversity, change-impact analysis, network security.

I. INTRODUCTION

Firewalls are crucial elements in network security, and they have been widely deployed to secure private networks in businesses and institutions. A firewall is a security guard placed at the point of entry between a private network and the outside Internet such that all incoming and outgoing packets have to pass through it. A packet can be viewed as a tuple with a finite number of fields such as source IP address, destination IP address, source port number, destination port number, and protocol type. By examining the values of these fields for each incoming and outgoing packet, a firewall accepts legitimate packets and discards illegitimate ones according to its "policy", i.e., "configuration".

A firewall policy consists of a sequence (i.e., an ordered list) of rules where each rule is of the form predicate → decision. The predicate of a rule is a boolean expression over some packet fields such as source IP address, destination IP address, source port number, destination port number, and protocol type. The decision of a rule can be accept, or discard, or a combination of these decisions with other options such as a logging option. The rules in a firewall policy often conflict. To resolve such conflicts, the decision for each packet is the decision of the first (i.e., highest priority) rule that the packet matches.

A. Motivation

Although a firewall policy is a mere sequence of rules, correctly designing one is by no means easy. The rules in a firewall policy are logically entangled because of conflicts among rules and the resulting order sensitivity [26]. Ordering the rules correctly in a firewall is critical, yet difficult. The implication of any rule in a firewall cannot be understood correctly without examining all the rules listed above that rule. Furthermore, a firewall policy may consist of a large number of rules. A firewall on the Internet may consist of hundreds or even a few thousand rules in extreme cases. One can imagine the complexity of the logic underlying so many conflicting rules.

An error in a firewall policy, i.e., a wrong definition of being legitimate or illegitimate for some packets, means that the firewall either accepts some malicious packets, which consequently creates security holes in the firewall, or discards some legitimate packets, which consequently disrupts normal business. Either case could cause irreparable, if not tragic, consequences. Given the importance of firewalls, such errors are not acceptable. Unfortunately, it has been observed that most firewalls on the Internet are poorly designed and have many errors in their policies [26]. Therefore, how to design firewall policies correctly is an important issue.

Since the correctness of a firewall policy is the focus of this paper, we assume a firewall is correct if and only if its policy is correct, and a firewall policy is correct if and only if it satisfies its given requirement specification, which is usually written in a natural language. In the rest of this paper, we use the term "firewall" to mean "firewall policy" or "firewall configuration" unless otherwise specified.

We categorize firewall errors into specification induced errors and design induced errors. Specification induced errors are caused by the inherent ambiguities of informal requirement specifications, especially if the requirement specification is written in a natural language. Design induced errors are caused by the technical incapacity of individual firewall designers. Different designers may have different understandings of the same informal requirement specification, and different designers may exhibit different technical strengths and weaknesses. Note that in this
  • 2. paper we assume that the given requirement specification of a firewall is informal. Automatically converting a formal firewall specification to a deployable firewall policy has been addressed in [12]. However, the formal specification of a firewall policy is still difficult to specify correctly. These observations motivate our method of diverse firewall design.

B. Our Solution

Our diverse firewall design method has the following three phases: a design phase, a comparison phase, and a resolution phase.
1) Design Phase: In this phase, the same requirement specification of a firewall is given to multiple teams who proceed independently to design different versions of the firewall. In industry, firewalls are typically designed and maintained by a group of people rather than just one person. To apply the method of diverse firewall design, we can divide one group into several teams.
2) Comparison Phase: In this phase, the resulting multiple versions are compared with each other to determine all functional discrepancies among them. The functional discrepancies need to be presented in human readable format in order to be used in the next step.
3) Resolution Phase: In this phase, first, every discrepancy is discussed and resolved by all teams; second, a firewall that is unanimously agreed upon by all teams is generated.

The major technical challenge in the method of diverse firewall design is how to discover all functional discrepancies between two given firewalls in a human readable format. Our solution to this problem consists of a series of three efficient algorithms: a construction algorithm, a shaping algorithm, and a comparison algorithm.

After all functional discrepancies are computed, the teams need to discuss the correct decision for each discrepancy. After all discrepancies are resolved, the technical question that we need to answer is: how do we generate the final firewall that reflects the resolved functional discrepancies? We present two methods for this purpose in Section VI.

C. Other Applications: Firewall Change-Impact Analysis

The algorithms presented in this paper can be used in other applications as well, such as firewall change-impact analysis. Firewall policies are always subject to change due to a variety of reasons. Making policy changes is a major routine task for firewall administrators. For example, new network threats such as worms and viruses may emerge. To protect a private network from new attacks, firewall policies need to be changed accordingly. Modern organizations also continually transform their network infrastructure to maintain their competitive edge by adding new servers, installing new software and services, expanding connectivity, etc. In accordance with network changes, firewall policies need to be changed as well to provide necessary protection.

Unfortunately, making changes is a major source of firewall policy errors. Making correct firewall policy changes is remarkably difficult due to the interleaving nature of firewall rules. For example, when a firewall administrator inserts a new rule to a firewall policy, the meaning of the rules listed under this rule could be incorrectly changed without the administrator noticing. Furthermore, firewall policy changes are made by human administrators, and it is common that human administrators make mistakes. It has been shown that administrator errors are the largest cause of failure for Internet services, and policy errors are the largest category of administrator errors [21].

The algorithms for discovering all functional discrepancies between two given firewalls can be directly used to perform firewall change-impact analysis. The impact of the changes can literally be defined as the functional discrepancies between the firewall before changes and the firewall after changes.

D. Relationship to Prior Art

Some firewall design and analysis methods have been proposed previously [1], [5], [11], [12], [15], [19], [20], [29]. However, none of them has ever explored design diversity. Furthermore, none of them has ever tackled the problem of change-impact analysis for firewall policies. The proposed diverse firewall design method is complementary to the previous work because these methods can assist each individual team to design and analyze their firewall in the design phase before cross comparison.

Note that the scope of this paper is on firewalls, not IDS/IPS systems (Intrusion Detection/Prevention Systems). Although the distinction between IDS/IPS systems and firewalls is sometimes blurry in the commercial world, IDS/IPS systems fundamentally differ from firewalls in that IDS/IPS systems check packet payloads while firewalls do not.

E. Key Contributions

We make four key contributions in this paper.
1) We propose the method of diverse firewall design. This paper represents the first effort to apply the well-known principle of diverse design to firewalls.
2) We present a method that can compare two given firewalls and output all functional discrepancies between them in a human readable format. This is the first method created for this purpose.
3) We present a method to compute firewall change-impacts by computing all functional discrepancies between the firewalls before and after changes. This is the first method for doing firewall change-impact analysis.
4) We implemented our algorithms in Java, and we evaluated their performance on both real-life and synthetic firewalls of large sizes. The experimental results show that our algorithms only use a few seconds to compare two different firewalls where each firewall has up to 3000 rules.

The rest of this paper is organized as follows. We start with an overview of our diverse firewall design method in Section II. In Sections III, IV, and V, we present a series of three algorithms for discovering all functional discrepancies between two firewalls. In Section VI, we discuss how to generate a firewall that is agreed upon by all teams after all discrepancies are resolved. We discuss some further issues in Section VII. In Section VIII, we present the experimental results that show the effectiveness and efficiency of our diverse firewall design method. Our conclusions are given in Section X.

II. OVERVIEW

In this section, we present an overview of our diverse firewall design method using an illustrative example, which will be used throughout the paper.
  • 3. In our example, for simplicity, we assume that a firewall maps every packet to one of two decisions: accept or discard. Most firewall software supports more than two decisions such as accept, accept-and-log, discard, and discard-and-log. Our diverse firewall design method can support any number of decisions.

A. Design Multiple Firewalls

Consider the simple network in Figure 1. This network has a gateway router with two interfaces: interface 0, which connects the gateway router to the outside Internet, and interface 1, which connects the gateway router to the inside local network. The firewall for this local network resides in the gateway router.

[Fig. 1. A firewall: a gateway router (firewall) connects the Internet on interface 0 to a local network on interface 1 that contains the mail server (IP: 192.168.0.1), Host 1, and Host 2.]

Suppose the requirement specification for this firewall is as follows: The mail server with IP address 192.168.0.1 can receive email packets. The packets from an outside malicious domain 224.168.0.0/16 should be blocked. Other packets should be accepted and allowed to proceed.

Suppose we give this specification to two teams: Team A and Team B, which design the firewalls as shown in Table I and II respectively.

TABLE I: FIREWALL DESIGNED BY TEAM A
Rule #   Interface   Source IP        Destination IP   Destination Port   Protocol   Decision
r1       0           *                192.168.0.1      25                 TCP        accept
r2       0           224.168.0.0/16   *                *                  *          discard
r3       *           *                *                *                  *          accept

TABLE II: FIREWALL DESIGNED BY TEAM B
Rule #   Interface   Source IP        Destination IP   Destination Port   Protocol   Decision
r1′      0           224.168.0.0/16   *                *                  *          discard
r2′      0           *                192.168.0.1      25                 TCP        accept
r3′      0           *                192.168.0.1      *                  *          discard
r4′      *           *                *                *                  *          accept

B. Compare Multiple Firewalls

Next we briefly show our method for computing the functional discrepancies between two given firewalls. For example, given the two firewalls in Table I and II, our method produces all the functional discrepancies as shown in Table III.

TABLE III: FUNCTIONAL DISCREPANCIES BETWEEN THE TWO FIREWALLS DESIGNED BY TEAM A AND B
Discrepancy #   Interface   Source IP         Destination IP   Destination Port   Protocol   Team A Decision   Team B Decision
1               0           224.168.0.0/16    192.168.0.1      25                 TCP        accept            discard
2               0           !224.168.0.0/16   192.168.0.1      25                 !TCP       accept            discard
3               0           !224.168.0.0/16   192.168.0.1      !25                *          accept            discard

The core data structure used in this paper for comparing multiple firewalls is Firewall Decision Diagrams (FDD). Firewall decision diagrams were introduced in [10] by Gouda and Liu as a notation for specifying firewalls. A Firewall Decision Diagram with a decision set $DS$ and over fields $F_1, \cdots, F_d$ is an acyclic and directed graph that has the following five properties:
1) There is exactly one node that has no incoming edges. This node is called the root. The nodes that have no outgoing edges are called terminal nodes.
2) Each node $v$ has a label, denoted $F(v)$, such that
$$F(v) \in \begin{cases} \{F_1, \cdots, F_d\} & \text{if } v \text{ is a nonterminal node,} \\ DS & \text{if } v \text{ is a terminal node.} \end{cases}$$
3) Each edge $e: u \rightarrow v$ is labeled with a nonempty set of integers, denoted $I(e)$, where $I(e)$ is a subset of the domain of $u$'s label (i.e., $I(e) \subseteq D(F(u))$).
4) A directed path from the root to a terminal node is called a decision path. No two nodes on a decision path have the same label.
5) The set of all outgoing edges of a node $v$, denoted $E(v)$, satisfies the following two conditions:
a) Consistency: $I(e) \cap I(e') = \emptyset$ for any two distinct edges $e$ and $e'$ in $E(v)$.
b) Completeness: $\bigcup_{e \in E(v)} I(e) = D(F(v))$.

A decision path in an FDD $f$ is represented by $(v_1 e_1 \cdots v_k e_k v_{k+1})$ where $v_1$ is the root, $v_{k+1}$ is a terminal node, and each $e_i$ is a directed edge from node $v_i$ to node $v_{i+1}$.
  • 4. A decision path $(v_1 e_1 \cdots v_k e_k v_{k+1})$ in an FDD defines the following rule:
$$F_1 \in S_1 \wedge \cdots \wedge F_d \in S_d \rightarrow F(v_{k+1})$$
where
$$S_i = \begin{cases} I(e_j) & \text{if there is a node } v_j \text{ in the decision path that is labelled with field } F_i, \\ D(F_i) & \text{if no node in the decision path is labelled with field } F_i. \end{cases}$$

For an FDD $f$, we use $f.rules$ to denote the set of all rules that are defined by all the decision paths of $f$. For any packet $p$, there is one and only one rule in $f.rules$ that $p$ matches because of the consistency and completeness properties of an FDD.

Our method for computing the functional discrepancies between two given firewalls consists of the following three steps: conversion, shaping, and comparison.

a) Step 1: Conversion: In this step, we convert each firewall to an equivalent FDD. Figures 2 and 3 show the two FDDs that are converted from the two firewalls in Table I and II respectively. Note that the example FDDs used in this paper are presented as trees for ease of understanding. The algorithm for constructing an equivalent firewall decision diagram from a sequence of rules is presented in Section III.

In this example, we suppose that each packet has the following five fields: Interface, Source IP address, Destination IP address, Destination Port, and Protocol Type. For ease of presentation, we assume that each packet has a field called "interface" whose value is the identification of the network interface on which a packet arrives. The shorthand for the five packet fields is listed in the following table. For simplicity, we assume that the protocol type value in a packet is either 0 (TCP) or 1 (UDP).

shorthand   meaning                  domain
I           Interface                $[0, 1]$
S           Source IP address        $[0, 2^{32})$
D           Destination IP address   $[0, 2^{32})$
N           Destination Port         $[0, 2^{16})$
P           Protocol Type            $[0, 1]$

In our examples, we also use the following shorthand. Note that $\alpha$ denotes the integer formed by the four bytes of the IP address 224.168.0.0. This applies similarly for $\beta$ and $\gamma$.

shorthand   meaning
a           accept
d           discard
$\alpha$    224.168.0.0
$\beta$     224.168.255.255
$\gamma$    192.168.0.1

b) Step 2: Shaping: In this step, we transform each firewall decision diagram into another firewall decision diagram without changing its semantics such that the two resulting firewall decision diagrams are semi-isomorphic. Two firewall decision diagrams are semi-isomorphic if and only if they are exactly the same except for the labels of their terminal nodes. Figures 4 and 5 show the two semi-isomorphic firewall decision diagrams converted from the firewall decision diagrams in Figures 2 and 3 respectively. The algorithm for making two firewall decision diagrams semi-isomorphic without changing their semantics is presented in Section IV.

c) Step 3: Comparison: In this step, we compare the two semi-isomorphic firewall decision diagrams in Figures 4 and 5 for functional discrepancies. Table III shows all the functional discrepancies between the two semi-isomorphic firewall decision diagrams in Figures 4 and 5, which are also the functional discrepancies between the two firewalls in Table I and II. The algorithm for discovering all functional discrepancies between two semi-isomorphic firewall decision diagrams is presented in Section V.

III. CONSTRUCTION ALGORITHM

In this section, we discuss how to construct an equivalent firewall decision diagram from a sequence of rules.

A. Firewalls

We first formally define the concepts of fields, packets, and firewalls. A field $F_i$ is a variable whose domain, denoted $D(F_i)$, is a finite interval of nonnegative integers. For example, the domain of the source address in an IP packet is $[0, 2^{32} - 1]$. A packet over the $d$ fields $F_1, \cdots, F_d$ is a $d$-tuple $(p_1, \cdots, p_d)$ where each $p_i$ ($1 \le i \le d$) is an element of $D(F_i)$. We use $\Sigma$ to denote the set of all packets over fields $F_1, \cdots, F_d$. It follows that $\Sigma$ is a finite set and $|\Sigma| = |D(F_1)| \times \cdots \times |D(F_d)|$, where $|\Sigma|$ denotes the number of elements in set $\Sigma$ and $|D(F_i)|$ denotes the number of elements in set $D(F_i)$ for each $i$.

A firewall rule has the form predicate → decision. A predicate defines a set of packets over the fields $F_1$ through $F_d$ specified as $F_1 \in S_1 \wedge \cdots \wedge F_d \in S_d$ where each $S_i$ is a nonempty interval that is a subset of $D(F_i)$. If $S_i = D(F_i)$, we can replace $(F_i \in S_i)$ by $(F_i \in all)$, or remove the conjunct $(F_i \in D(F_i))$ altogether. A packet $(p_1, \cdots, p_d)$ matches a predicate $F_1 \in S_1 \wedge \cdots \wedge F_d \in S_d$ and the corresponding rule if and only if the condition $p_1 \in S_1 \wedge \cdots \wedge p_d \in S_d$ holds. We use $\alpha$ to denote the set of possible values that decision can be. Typical elements of $\alpha$ include accept, discard, accept with logging, and discard with logging. A firewall rule $F_1 \in S_1 \wedge \cdots \wedge F_d \in S_d \rightarrow decision$ is simple if and only if every $S_i$ ($1 \le i \le d$) is an interval of consecutive nonnegative integers.

A firewall $f$ over the $d$ fields $F_1, \cdots, F_d$ is a sequence of firewall rules. The size of $f$, denoted $|f|$, is the number of rules in $f$. A sequence of rules $r_1, \cdots, r_n$ is comprehensive if and only if for any packet $p$, there is at least one rule in the sequence that $p$ matches. A sequence of rules needs to be comprehensive for it to serve as a firewall. To ensure that a firewall is comprehensive, the predicate of the last rule in a firewall is specified as $F_1 \in D(F_1) \wedge \cdots \wedge F_d \in D(F_d)$.

Two rules in a firewall may overlap; that is, a single packet may match both rules. Furthermore, two rules in a firewall may conflict; that is, the two rules not only overlap but also have different decisions. To resolve such conflicts, firewalls typically employ a first-match resolution strategy where the decision for a packet $p$ is the decision of the first (i.e., highest priority) rule that $p$ matches in $f$. The decision that firewall $f$ makes for packet $p$ is denoted $f(p)$.

We can think of a firewall $f$ as defining a many-to-one mapping function from $\Sigma$ to $\alpha$. Two firewalls $f_1$ and $f_2$ are equivalent, denoted $f_1 \equiv f_2$, if and only if they define the same mapping function from $\Sigma$ to $\alpha$; that is, for any packet $p \in \Sigma$, we have $f_1(p) = f_2(p)$. For any firewall $f$, we use $\{f\}$ to denote the set of firewalls that are semantically equivalent to $f$.
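To make the first-match semantics concrete, here is a small Python sketch that encodes the two example firewalls of Tables I and II as sequences of interval predicates and evaluates $f(p)$ for individual packets. The encoding is an assumption made for illustration: packets are tuples (I, S, D, N, P), every predicate component is a closed integer interval, and the names `rule`, `matches`, `first_match`, `TEAM_A`, and `TEAM_B` are hypothetical. It spot-checks single packets only; it is not the paper's comparison algorithm.

```python
import ipaddress

# Integer forms of the addresses used in the running example.
ALPHA = int(ipaddress.IPv4Address("224.168.0.0"))
BETA = int(ipaddress.IPv4Address("224.168.255.255"))
GAMMA = int(ipaddress.IPv4Address("192.168.0.1"))

# Field domains as closed intervals, in the order I, S, D, N, P.
DOMAIN = [(0, 1), (0, 2**32 - 1), (0, 2**32 - 1), (0, 2**16 - 1), (0, 1)]
ALL = None  # placeholder meaning "the whole domain of this field"
TCP = 0

def rule(i, s, d, n, p, decision):
    """Build (predicate, decision); ALL is expanded to the field's domain."""
    fields = [i, s, d, n, p]
    pred = [DOMAIN[k] if f is ALL else f for k, f in enumerate(fields)]
    return pred, decision

TEAM_A = [
    rule((0, 0), ALL, (GAMMA, GAMMA), (25, 25), (TCP, TCP), "accept"),
    rule((0, 0), (ALPHA, BETA), ALL, ALL, ALL, "discard"),
    rule(ALL, ALL, ALL, ALL, ALL, "accept"),
]
TEAM_B = [
    rule((0, 0), (ALPHA, BETA), ALL, ALL, ALL, "discard"),
    rule((0, 0), ALL, (GAMMA, GAMMA), (25, 25), (TCP, TCP), "accept"),
    rule((0, 0), ALL, (GAMMA, GAMMA), ALL, ALL, "discard"),
    rule(ALL, ALL, ALL, ALL, ALL, "accept"),
]

def matches(packet, pred):
    """A packet matches iff every field value lies in the rule's interval."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(packet, pred))

def first_match(firewall, packet):
    """Return the decision of the first rule that the packet matches."""
    for pred, decision in firewall:
        if matches(packet, pred):
            return decision
    raise ValueError("rule sequence is not comprehensive")

# Mail traffic from the malicious domain: the two teams disagree on it.
mail_from_malicious = (0, ALPHA + 5, GAMMA, 25, TCP)
print(first_match(TEAM_A, mail_from_malicious),
      first_match(TEAM_B, mail_from_malicious))
```

Because both rule sequences end with a rule whose predicate covers every domain, `first_match` always returns a decision; the `ValueError` branch is reachable only for a sequence that is not comprehensive.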
  • 5. [Figures 2 to 5 are tree diagrams whose nodes test the fields I, S, D, N, and P in that order and whose terminal nodes are labelled a (accept) or d (discard); only the captions are reproduced here.]
Fig. 2. The firewall decision diagram constructed from the firewall designed by Team A in Table I
Fig. 3. The firewall decision diagram constructed from the firewall designed by Team B in Table II
Fig. 4. The firewall decision diagram transformed from the one in Figure 2
Fig. 5. The firewall decision diagram transformed from the one in Figure 3
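Diagrams such as those in Figures 2 and 3 are obtained by appending the rules of a firewall one at a time to a partial FDD, splitting an edge and copying its subtree whenever a rule only partly overlaps an existing edge label (the construction algorithm of Section III-B). The Python sketch below follows that idea under simplifying assumptions: field domains are small enough to enumerate, so edge labels are plain sets rather than intervals, and all names (`Node`, `build_path`, `append`, `construct`, `decide`, `first_match`) are hypothetical. It is an illustrative sketch, not the authors' Java implementation.

```python
import copy

class Node:
    def __init__(self, field=None, decision=None):
        self.field = field        # index of the field tested here; None if terminal
        self.decision = decision  # decision label of a terminal node
        self.edges = []           # outgoing edges as [label_set, child] pairs

def build_path(rule, m):
    """Build the single decision path for fields m..d-1 of a rule."""
    preds, decision = rule
    if m == len(preds):
        return Node(decision=decision)
    node = Node(field=m)
    node.edges.append([set(preds[m]), build_path(rule, m + 1)])
    return node

def append(node, rule, m):
    """Append a rule, restricted to fields m..d-1, to the partial FDD at node."""
    preds, _ = rule
    if m == len(preds):
        return  # terminal node: an earlier rule already decides these packets
    s = set(preds[m])
    existing = list(node.edges)  # snapshot, since new edges are added below
    covered = set()
    for label, _ in existing:
        covered |= label
    if s - covered:
        # Values matched by no earlier rule get a fresh decision path.
        node.edges.append([s - covered, build_path(rule, m + 1)])
    for edge in existing:
        label, child = edge
        common = label & s
        if not common:
            continue                      # no overlap: skip this edge
        if common != label:
            edge[0] = label - s           # partial overlap: split the edge
            child = copy.deepcopy(child)  # and copy the subtree for the new edge
            node.edges.append([common, child])
        append(child, rule, m + 1)        # push the rule into the overlapping part

def construct(rules):
    """Build an FDD from a comprehensive sequence of rules."""
    root = build_path(rules[0], 0)
    for r in rules[1:]:
        append(root, r, 0)
    return root

def decide(node, packet):
    """Walk the FDD along the edge whose label contains each field value."""
    while node.decision is None:
        node = next(c for label, c in node.edges if packet[node.field] in label)
    return node.decision

def first_match(rules, packet):
    """Reference semantics: decision of the first matching rule."""
    for preds, decision in rules:
        if all(v in s for v, s in zip(packet, preds)):
            return decision

# Toy three-field firewall shaped like Team A's (domains of sizes 2, 6, 4).
DOMS = [range(2), range(6), range(4)]
RULES = [
    ([{0}, set(DOMS[1]), {2}], "accept"),
    ([{0}, {3, 4}, set(DOMS[2])], "discard"),
    ([set(DOMS[0]), set(DOMS[1]), set(DOMS[2])], "accept"),
]
FDD = construct(RULES)
```

Since the toy domains are enumerable, the equivalence of `FDD` and `RULES` can be checked by brute force over all 2 × 6 × 4 packets, comparing `decide` with `first_match`.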
  • 6. B. Construction of Firewall Decision Diagrams

Next, we discuss how to construct an equivalent FDD from a sequence of rules $r_1, \cdots, r_n$, where each rule is of the format $(F_1 \in S_1) \wedge \cdots \wedge (F_d \in S_d) \rightarrow decision$. Note that all the $d$ packet fields appear in the predicate of each rule, and they appear in the same order.

We first construct a partial FDD from the first rule. A partial FDD is a diagram that has all the properties of an FDD except the completeness property. The partial FDD constructed from a single rule contains only the decision path that defines the rule. Suppose from the first $i$ rules, $r_1$ through $r_i$, we have constructed a partial FDD, whose root $v$ is labelled $F_1$, and suppose $v$ has $k$ outgoing edges $e_1, \cdots, e_k$. Let $r_{i+1}$ be the rule $(F_1 \in S_1) \wedge \cdots \wedge (F_d \in S_d) \rightarrow decision$. Next we consider how to append rule $r_{i+1}$ to this partial FDD.

First, we examine whether we need to add another outgoing edge to $v$. If $S_1 - (I(e_1) \cup \cdots \cup I(e_k)) \neq \emptyset$, we need to add a new outgoing edge with label $S_1 - (I(e_1) \cup \cdots \cup I(e_k))$ to $v$ because any packet whose $F_1$ field is an element of $S_1 - (I(e_1) \cup \cdots \cup I(e_k))$ does not match any of the first $i$ rules, but matches $r_{i+1}$ provided that the packet satisfies $(F_2 \in S_2) \wedge \cdots \wedge (F_d \in S_d)$. Then we build a decision path from $(F_2 \in S_2) \wedge \cdots \wedge (F_d \in S_d) \rightarrow decision$, and make the new edge of the node $v$ point to the first node of this decision path.

Second, we compare $S_1$ and $I(e_j)$ for each $j$ where $1 \le j \le k$. This comparison leads to one of the following three cases:
1) $S_1 \cap I(e_j) = \emptyset$: In this case, we skip edge $e_j$ because any packet whose value of field $F_1$ is in set $I(e_j)$ does not match $r_{i+1}$.
2) $S_1 \cap I(e_j) = I(e_j)$: In this case, a packet whose value of field $F_1$ is in set $I(e_j)$ may match one of the first $i$ rules, and it also may match rule $r_{i+1}$. So we append the rule $(F_2 \in S_2) \wedge \cdots \wedge (F_d \in S_d) \rightarrow decision$ to the subgraph rooted at the node that $e_j$ points to.
3) $S_1 \cap I(e_j) \neq \emptyset$ and $S_1 \cap I(e_j) \neq I(e_j)$: In this case, we split edge $e_j$ into two edges: $e'$ with label $I(e_j) - S_1$ and $e''$ with label $I(e_j) \cap S_1$. Then we make two copies of the subgraph rooted at the node that $e_j$ points to, and let $e'$ and $e''$ point to one copy each. We then deal with $e'$ by the first case, and $e''$ by the second case.

The pseudocode of the FDD construction algorithm is shown in Figure 7. Here we use $e.t$ to denote the (target) node that the edge $e$ points to.

As an example, consider the sequence of rules in Table I. Figure 6 shows the partial FDD that we construct from the first rule, and the partial FDD after we append the second rule. The FDD after we append the third rule is shown in Figure 2.

[Fig. 6. Appending rule $(I \in \{0\}) \wedge (S \in [\alpha, \beta]) \wedge (D \in all) \wedge (N \in all) \wedge (P \in all) \rightarrow d$. Two panels, "Before Appending" and "After Appending", show the partial FDD built from rule r1 of Table I and the partial FDD after rule r2 is appended.]

Construction Algorithm
Input : A firewall f of a sequence of rules r_1, ..., r_n
Output : An FDD f′ such that f and f′ are equivalent
Steps:
1. build a decision path with root v from rule r_1;
2. for i := 2 to n do APPEND(v, r_i);
End

APPEND(v, (F_m ∈ S_m) ∧ ··· ∧ (F_d ∈ S_d) → dec)
/* F(v) = F_m and E(v) = {e_1, ..., e_k} */
1. if (S_m − (I(e_1) ∪ ··· ∪ I(e_k))) ≠ ∅ then
     (a) add an outgoing edge e_{k+1} with label S_m − (I(e_1) ∪ ··· ∪ I(e_k)) to v;
     (b) build a decision path from rule (F_{m+1} ∈ S_{m+1}) ∧ ··· ∧ (F_d ∈ S_d) → dec,
         and make e_{k+1} point to the first node in this path;
2. if m < d then
     for j := 1 to k do
       if I(e_j) ⊆ S_m then
         APPEND(e_j.t, (F_{m+1} ∈ S_{m+1}) ∧ ··· ∧ (F_d ∈ S_d) → dec);
       else if I(e_j) ∩ S_m ≠ ∅ then
         (a) add one outgoing edge e to v, and label e with I(e_j) ∩ S_m;
         (b) make a copy of the subgraph rooted at e_j.t, and make e point to the root of the copy;
         (c) replace the label of e_j by I(e_j) − S_m;
         (d) APPEND(e.t, (F_{m+1} ∈ S_{m+1}) ∧ ··· ∧ (F_d ∈ S_d) → dec);

Fig. 7. FDD Construction Algorithm

  • 7. Theorem 1: Given a firewall of $n$ simple rules, the maximum number of paths in the FDD constructed using the FDD construction algorithm is $(2n - 1)^d$, where $d$ is the number of the fields in each rule.

Proof: Let the $n$ simple rules be $r_1, r_2, \cdots, r_n$, where each rule $r_i$ is denoted
$$r_i = F_1 \in S_1^i \wedge F_2 \in S_2^i \wedge \cdots \wedge F_d \in S_d^i \rightarrow decision_i$$
For each field $F_j$, each interval $S_j^i$ ($1 \le i \le n$) has two end points (the minimum and maximum value of the range). Thus, there are at most $2n$ points in the range of $F_j$, and the total number of intervals separated by the $2n$ points is at most $2n - 1$, which means the number of outgoing edges of a node labeled $F_j$ is at most $2n - 1$. Because the total number of fields is $d$, the number of paths in the constructed FDD is at most $(2n - 1)^d$.

IV. SHAPING ALGORITHM

In this section, we discuss how to transform two ordered, but not semi-isomorphic FDDs $f_a$ and $f_b$ into two semi-isomorphic FDDs $f_a'$ and $f_b'$ such that $f_a$ is equivalent to $f_a'$, and $f_b$ is equivalent to $f_b'$. Informally, a firewall decision diagram is ordered

A. FDD Simplifying

Before applying the shaping algorithm, presented below, to two ordered FDDs, we need to transform each of them into an equivalent simple FDD. A simple FDD is defined as follows:

Definition 4.3 (Simple FDDs): An FDD is simple iff each node in the FDD has at most one incoming edge and each edge in the FDD is labelled with a single interval.

It is straightforward that the two operations of edge splitting and subgraph replication can be applied repetitively to an FDD in order to make this FDD simple. Note that the graph of a simple FDD is an outgoing directed tree. In other words, each node in a simple FDD, except the root, has only one parent node, and has only one incoming edge (from the parent node).

B. Node Shaping

Next, we introduce the procedure for transforming two shapable nodes into two semi-isomorphic nodes, which is the basic building block in the shaping algorithm for transforming two ordered FDDs into two semi-isomorphic FDDs. Shapable nodes and semi-isomorphic nodes are defined as follows.
if and only if along every path from the root to a terminal node, Definition 4.4 (Shapable Nodes): Let fa and fb be two or- the labels of the non-terminal nodes obey the same order; two dered simple FDDs, va be a node in fa , and vb be a node in firewall decision diagrams are semi-isomorphic if and only if they fb . Nodes va and vb are shapable iff one of the following two are exactly the same except for the labels of their terminal nodes. conditions holds: The formal definitions of ordered FDDs and semi-isomorphic 1) Both va and vb have no parents, i.e., they are the roots of FDDs are as follows. Note that the FDDs constructed by the their respective FDDs; construction algorithm in Section III are ordered. 2) Both va and vb have parents, their parents have the same Definition 4.1 (Ordered FDDs): Let ≺ be the total order over label, and their incoming edges have the same label. 2 the packet fields F1 , · · · , Fd where F1 ≺ · · · ≺ Fd holds. An Definition 4.5 (Semi-isomorphic Nodes): Let fa and fb be two FDD is ordered iff for each decision path (v1 e1 · · · vk ek vk+1 ), ordered simple FDDs, va be a node in fa and vb be a node in we have F (v1 ) ≺ · · · ≺ F (vk ). 2 fb . The two nodes va and vb are semi-isomorphic iff one of the Definition 4.2 (Semi-isomorphic FDDs): Two FDDs f and f ′ following two conditions holds: are semi-isomorphic iff there exists a one-to-one mapping σ from the nodes of f onto the nodes of f ′ , such that the following two 1) Both va and vb are terminal nodes; conditions hold: 2) Both va and vb are nonterminal nodes with the same label and there exists a one-to-one mapping σ from the children 1) For any node v in f , either both v and σ(v) are nonterminal of va to the children of vb such that for each child v of va , nodes with the same label, or both of them are terminal v and σ(v) are shapable. 
2 nodes; For example, the two nodes labelled F1 in Figure 8 are shapable 2) For each edge e in f , where e is from a node v1 to a node since they have no parents, and the two nodes labelled F1 in v2 , there is an edge e′ from σ(v1 ) to σ(v2 ) in f ′ , and the Figure 9 are semi-isomorphic nodes. two edges e and e′ have the same label. 2 The algorithm for transforming two ordered FDDs into two shapable nodes semi-isomorphic FDDs uses the following three basic operations. (Note that none of these operations change the semantics of the F1 F1 FDDs.) 1) Node Insertion: If along all the decision paths containing [1, 50] [51, 100] [1, 30] [31, 100] a node v , there is no node that is labelled with a field F , then we can insert a node v ′ labelled F above v as follows: F2 F2 F2 F2 make all incoming edges of v point to v ′ , create one edge from v ′ to v , and label this edge with the domain of F . Fig. 8. Two shapable nodes in two FDDs 2) Edge Splitting: For an edge e from v1 to v2 , if I(e) = S1 ∪ S2 , where neither S1 nor S2 is empty, then we can split e into two edges as follows: replace e by two edges The algorithm for making two shapable nodes va and vb semi- from v1 to v2 , label one edge with S1 and label the other isomorphic consists of two steps: with S2 . 1) Step I: This step is skipped if va and vb have the same label, 3) Subgraph Replication: If a node v has m (m ≥ 2) incoming or both of them are terminal nodes. Otherwise, without loss edges, we can make m copies of the subgraph rooted at v , of generality, assume F (va ) ≺ F (vb ). It is straightforward and make each incoming edge of v point to the root of one to show that in this case along all the decision paths distinct copy. containing node vb , no node is labelled F (va ). Therefore,
  • 8. 8 semi-isomorphic nodes F1 F1 Procedure Node Shaping( fa , fb , va , vb ) Input : Two ordered simple FDDs fa and fb , and [1, 30] [31, 50] [51, 100] [1, 30] [31, 50] [51, 100] two shapable nodes va in fa and vb in fb Output: The two nodes va and vb become semi-isomorphic, F2 F2 F2 F2 F2 F2 and the procedure returns a set S of node pairs of the form (wa , wb ) where wa is a child of va in fa , wb is a child of vb in fb , and the two nodes wa and shapable nodes shapable nodes shapable nodes wb are shapable. Fig. 9. Two semi-isomorphic nodes Steps: 1. if (both va and vb are terminal) return( ∅ ); else if ∼(both va and vb are nonterminal and they have the same label) ′ then /*Here either both va and vb are nonterminal and we can create a new node vb with label F (va ), create a ′ they have different labels, or one node is terminal new edge with label D(F (va )) from vb to vb , and make all ′ and the other is nonterminal. Without loss of incoming edges of vb point to vb . Now va has the same ′ generality, assume one of the following conditions holds: label as vb . (Recall that this node insertion operation leaves (1) both va and vb are nonterminal and F (va ) ≺ F (vb ), the semantics of the FDD unchanged.) (2) va is nonterminal and vb is terminal.*/ 2) Step II: From the previous step, we can assume that va insert a new node with label F (va ) above vb , and vb have the same label. In the current step, we use the and call the new node vb ; two operations of edge splitting and subgraph replication 2. let E(va ) be {ea,1 , · · · , ea,m } where I(ea,1 ) < · · · < I(ea,m ). to build a one-to-one correspondence from the children of let E(vb ) be {eb,1 , · · · , eb,n } where I(eb,1 ) < · · · < I(eb,n ). va to the children of vb such that each child of va and its 3. i := 1; j := 1; corresponding child of vb are shapable. while ( ( i < m ) or ( j < n ) ) do{ Suppose D(F (va )) = D(F (vb )) = [a, b]. 
We know that /*During this loop, the two intervals I(ea,i ) and each outgoing edge of va or vb is labelled with a single I(eb,j ) always begin with the same integer.*/ interval. Suppose va has m outgoing edges {e1 , · · · , em }, let I(ea,i ) = [A, B] and I(eb,j ) = [A, C], where where I(ei ) = [ai , bi ], a1 = a, bm = b, and every ai+1 = A, B , C are three integers; bi + 1. Also suppose vb has n outgoing edges {e′ , · · · , e′ }, 1 n if B = C then {i := i + 1; j := j + 1; } where I(e′ ) = [a′ , b′ ], a′ = a, bn = b, and every a′ i i i 1 ′ i+1 = else if B < C then{ b ′ + 1. i (a) create an outgoing edge e of vb , Comparing edge e1 , whose label is [a, b1 ], and e′ , whose 1 and label e with [A, B]; label is [a, b′ ], we have the following two cases: (1) b1 = b′ : 1 1 (b) make a copy of the subgraph rooted at eb,j .t and In this case I(e1 ) = I(e′ ), therefore, node e1 .t and node 1 make e point to the root of the copy; e′ .t are shapable. (Recall that we use e.t to denote the node 1 (c) I(eb,j ) := [B + 1, C]; that edge e points to.) Then we can continue to compare e2 (d) i := i + 1;} and e′ since both I(e2 ) and I(e′ ) begin with b1 + 1. (2) 2 2 else {/*B > C */ b1 = b′ : Without loss of generality, we assume b1 < b′ . 1 1 (a) create an outgoing edge e of va , In this case, we split e′ into two edges e and e′ , where e 1 and label e with [A, C]; is labelled [a, b1 ] and e′ is labelled [b1 + 1, b′ ]. Then we 1 ′ (b) make a copy of the subgraph rooted at ea,j .t and make two copies of the subgraph rooted at e1 .t and let e make e point to the root of the copy; and e′ point to one copy each. Thus I(e1 ) = I(e) and the (c) I(ea,i ) := [C + 1, B]; two nodes, e1 .t and e.t are shapable. Then we can continue (d) j := j + 1;} to compare the two edges e2 and e′ since both I(e2 ) and } I(e′ ) begin with b1 + 1. 4. 
/*Now va and vb become semi-isomorphic.*/ The above process continues until we reach the last outgo- let E(va ) = {ea,1 , · · · , ea,k } where ing edge of va and the last outgoing edge of vb . Note that I(ea,1 ) < · · · < I(ea,k ) and k ≥ 1; each time that we compare an outgoing edge of va and an let E(vb ) = {eb,1 , · · · , eb,k } where outgoing edge of vb , the two intervals labelled on the two I(eb,1 ) < · · · < I(eb,k ) and k ≥ 1; edges begin with the same value. Therefore, the last two S := ∅; edges that we compare must have the same label because for i = 1 to k do they both end with b. In other words, this edge splitting add the pair of shapable nodes ( ea,i .t, eb,i .t ) to S ; and subgraph replication process will terminate. When it return( S ); terminates, va and vb become semi-isomorphic. End Figure 10 shows the pseudocode for making two shapable Fig. 10. Node Shaping Algorithm nodes in two ordered simple FDDs semi-isomorphic. We use I(e) < I(e′ ) to indicate that every integer in I(e) is less than every integer in I(e′ ).
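The edge-alignment loop of Step II can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `align_partitions` and the representation of edge labels as `(lo, hi)` tuples are assumptions made for illustration, and the sketch tracks only the edge labels, not the subgraph copies that the procedure also creates. It relies on the property stated above that the two current intervals always begin with the same integer.

```python
def align_partitions(edges_a, edges_b):
    """Return the common refinement of two contiguous interval partitions.

    Each argument is a sorted list of (lo, hi) integer intervals that
    together cover the same domain without gaps or overlaps (the labels of
    the outgoing edges of two shapable nodes with the same field label).
    The result is a list of (interval, index_in_a, index_in_b) triples; the
    indices say which original edge each refined edge was split from, i.e.
    which subgraph its child would be a copy of.
    """
    if not edges_a or not edges_b:
        raise ValueError("both partitions must be non-empty")
    if edges_a[0][0] != edges_b[0][0] or edges_a[-1][1] != edges_b[-1][1]:
        raise ValueError("partitions must cover the same domain")

    refined = []
    i = j = 0
    start = edges_a[0][0]            # both current intervals begin here
    while i < len(edges_a) and j < len(edges_b):
        end_a, end_b = edges_a[i][1], edges_b[j][1]
        end = min(end_a, end_b)      # split the longer edge where the shorter one ends
        refined.append(((start, end), i, j))
        if end_a == end:             # edge of v_a fully consumed
            i += 1
        if end_b == end:             # edge of v_b fully consumed
            j += 1
        start = end + 1
    return refined


# Edge labels of the two F1 nodes in Figure 8.
pairs = align_partitions([(1, 50), (51, 100)], [(1, 30), (31, 100)])
labels = [interval for interval, _, _ in pairs]
print(labels)
```

On the edge labels of Figure 8, the refined labels are [1, 30], [31, 50], and [51, 100], which are the edge labels shown in Figure 9. Each step emits one refined edge and advances at least one index, so the loop runs at most m + n − 1 times for nodes with m and n outgoing edges.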
If we apply the above node shaping procedure to the two shapable nodes labelled F_1 in Figure 8, we make them semi-isomorphic as shown in Figure 9.

C. FDD Shaping

To make two ordered FDDs f_a and f_b semi-isomorphic, we first make f_a and f_b simple, and then make f_a and f_b semi-isomorphic as follows. Suppose we have a queue Q, which is initially empty. First we put the pair of shapable nodes consisting of the root of f_a and the root of f_b into Q. As long as Q is not empty, we remove the head of Q, feed the two shapable nodes to the above Node Shaping procedure, then put all the pairs of shapable nodes returned by the Node Shaping procedure into Q. When the algorithm finishes, f_a and f_b become semi-isomorphic. The pseudocode for this shaping algorithm is shown in Figure 11.

Shaping Algorithm
Input : Two ordered FDDs f_a and f_b
Output : f_a and f_b become semi-isomorphic.
Steps:
1. make the two FDDs f_a and f_b simple;
2. Q := ∅;
3. add the shapable pair (root of f_a, root of f_b) to Q;
4. while Q ≠ ∅ do {
       remove the head pair (v_a, v_b) from Q;
       S := Node_Shaping( f_a, f_b, v_a, v_b );
       add every shapable pair from S into Q;
   }
End

Fig. 11. Shaping Algorithm

As an example, if we apply the above shaping algorithm to the two FDDs in Figures 2 and 3, we obtain two semi-isomorphic FDDs as shown in Figures 4 and 5.

V. COMPARISON ALGORITHM

In this section, we consider how to compare two semi-isomorphic FDDs. Given two semi-isomorphic FDDs f_a and f_b with a one-to-one mapping σ, each decision path (v_1 e_1 ... v_k e_k v_{k+1}) in f_a has a corresponding decision path (σ(v_1) σ(e_1) ... σ(v_k) σ(e_k) σ(v_{k+1})) in f_b. Similarly, each rule (F(v_1) ∈ I(e_1)) ∧ ... ∧ (F(v_k) ∈ I(e_k)) → F(v_{k+1}) in f_a.rules has a corresponding rule (F(σ(v_1)) ∈ I(σ(e_1))) ∧ ... ∧ (F(σ(v_k)) ∈ I(σ(e_k))) → F(σ(v_{k+1})) in f_b.rules. Note that F(v_i) = F(σ(v_i)) and I(e_i) = I(σ(e_i)) for each i where 1 ≤ i ≤ k. Therefore, for each rule (F(v_1) ∈ I(e_1)) ∧ ... ∧ (F(v_k) ∈ I(e_k)) → F(v_{k+1}) in f_a.rules, the corresponding rule in f_b.rules is (F(v_1) ∈ I(e_1)) ∧ ... ∧ (F(v_k) ∈ I(e_k)) → F(σ(v_{k+1})). Each of these two rules is called the companion of the other.

This companionship implies a one-to-one mapping from the rules defined by the decision paths in f_a to the rules defined by the decision paths in f_b. Note that for each rule and its companion, either they are identical, or they have the same predicate but different decisions. Therefore, f_a.rules − f_b.rules is the set of all the rules in f_a.rules that have different decisions from their companions. The same holds for f_b.rules − f_a.rules. Note that the set of all the companions of the rules in f_a.rules − f_b.rules is f_b.rules − f_a.rules; and similarly the set of all the companions of the rules in f_b.rules − f_a.rules is f_a.rules − f_b.rules. Since these two sets manifest the functional discrepancies between the two FDDs, the two design teams can investigate them to resolve the discrepancies.

Let f_a be the FDD in Figure 4, and f_b be the FDD in Figure 5. Here f_a is equivalent to the firewall in Table I designed by Team A, and f_b is equivalent to the firewall in Table II designed by Team B. By comparing f_a and f_b, we can discover all functional discrepancies between the firewalls designed by A and B. The discrepancies are shown in Table III, based on which the following three questions need to be investigated:

1) Should we allow the computers from the malicious domain to send email to the mail server? Team A says yes, while Team B says no.

       Interface:          0
       Source IP:          224.168.0.0/16
       Destination IP:     192.168.0.1
       Destination Port:   25
       Protocol Type:      TCP
       Team A Decision:    accept
       Team B Decision:    discard

2) Should we allow non-TCP packets with destination port number 25 to be sent from the hosts that are not in the malicious domain to the mail server? Team A says yes, while Team B says no.

       Interface:          0
       Source IP:          !224.168.0.0/16
       Destination IP:     192.168.0.1
       Destination Port:   25
       Protocol Type:      !TCP
       Team A Decision:    accept
       Team B Decision:    discard

3) Should we allow the packets with a destination port number other than 25 to be sent from the hosts that are not in the malicious domain to the mail server? Team A says yes, while Team B says no.

       Interface:          0
       Source IP:          !224.168.0.0/16
       Destination IP:     192.168.0.1
       Destination Port:   !25
       Protocol Type:      *
       Team A Decision:    accept
       Team B Decision:    discard

VI. DISCREPANCY RESOLUTION

After all functional discrepancies are computed, the teams need to agree on the correct decision for each discrepancy. Consider the discrepancies shown in Table III. Suppose these discrepancies are resolved as shown in Table IV.

The question that we want to answer in this section is: how do we generate the final firewall that reflects the resolved functional discrepancies? Of course, if one team made all the correct decisions according to the discrepancy resolution, we can simply deploy the firewall designed by that team. Next, we assume that no team makes all the correct decisions. In this paper, we propose two methods for this purpose. Then we discuss which method we should choose in practice.

TABLE IV
RESOLVED FUNCTIONAL DISCREPANCIES

Discrepancy #   Interface   Source IP          Destination IP   Destination Port   Protocol   Resolved Decision
1               0           224.168.0.0/16     192.168.0.1      25                 TCP        discard
2               0           !224.168.0.0/16    192.168.0.1      25                 !TCP       accept
3               0           !224.168.0.0/16    192.168.0.1      !25                *          discard

A. Method 1: Generate Rules from Corrected FDD

This method has two steps. First, correct one of the two semi-isomorphic FDDs using the discrepancy resolution. Second, generate rules from the resulting FDD using the algorithms presented in [12].

Step 1: FDD Correction. We can pick either of the two semi-isomorphic FDDs generated by the FDD shaping algorithm and apply corrections to the labels of the terminal nodes. Note that after we apply the fixes to the two semi-isomorphic FDDs, they become exactly the same. Note also that we cannot directly use the corrected FDD as the configuration of a firewall because most existing firewall devices take a sequence of rules as their configuration.

Step 2: Firewall Generation. Given the corrected FDD, we can apply the algorithms in [12] for generating a compact firewall from an FDD. Table V shows the firewall generated from the corrected FDD. Interested readers can refer to [12] for more technical details.

TABLE V
FIREWALL GENERATED FROM THE CORRECTED FDD

Rule #   Interface   Source IP         Destination IP   Destination Port   Protocol   Decision
1        0           224.168.0.0/16    *                *                  *          discard
2        0           *                 192.168.0.1      25                 *          accept
3        0           *                 192.168.0.1      *                  *          discard
4        *           *                 *                *                  *          accept

B. Method 2: Combine Corrections with Original Firewalls

The second method is to create a new firewall using the rules in the discrepancy resolution and one of the original firewalls. This method consists of the following two steps.

Step 1: Firewall Composition. In this step, we first pick an original firewall, and then we take all the rules in the discrepancy resolution for which the original firewall made incorrect decisions and add them to the beginning of the firewall.

Step 2: Redundancy Removal. In this step, we apply the firewall compaction algorithm in [19] to remove redundant rules from the resulting sequence of rules. A rule is redundant if and only if removing the rule does not change the semantics of the firewall.

For example, we can pick the firewall in Table I designed by Team A, and on top of that, we can add the first rule and the third rule from the discrepancy resolution in Table IV. Note that Team A only made incorrect decisions for the packets that match the first rule and the third rule in Table IV. By adding these two rules to the beginning of the original three rules designed by Team A, all packets are mapped to the correct decisions. After the above two steps, the resulting firewall is shown in Table VI. Similarly, we can pick the firewall in Table II designed by Team B, and then add the second rule from the discrepancy resolution in Table IV to the beginning of the firewall. After the above two steps, the resulting firewall is shown in Table VII.

TABLE VI
FIREWALL GENERATED BY COMBINING THE RULES IN TABLE IV AND THE RULES IN TABLE I

Rule #   Interface   Source IP          Destination IP   Destination Port   Protocol   Decision
1        0           224.168.0.0/16     192.168.0.1      25                 TCP        discard
2        0           !224.168.0.0/16    192.168.0.1      !25                *          discard
3        0           *                  192.168.0.1      25                 TCP        accept
4        0           224.168.0.0/16     *                *                  *          discard
5        *           *                  *                *                  *          accept

TABLE VII
FIREWALL GENERATED BY COMBINING THE RULES IN TABLE IV AND THE RULES IN TABLE II

Rule #   Interface   Source IP          Destination IP   Destination Port   Protocol   Decision
1        0           !224.168.0.0/16    192.168.0.1      25                 !TCP       accept
2        0           224.168.0.0/16     *                *                  *          discard
3        0           *                  192.168.0.1      25                 TCP        accept
4        0           *                  192.168.0.1      *                  *          discard
5        *           *                  *                *                  *          accept

VII. DISCUSSION

A. Prefix and Intervals

Real-life firewalls usually check five packet fields: source IP address, destination IP address, source port number, destination port number, and protocol type. Of these five fields, the first two are usually represented using prefix formats, and the last three are usually integer intervals. Note that prefix formats and interval formats are interconvertible. For example, IP prefix 192.168.0.0/16 can be converted to the interval from 192.168.0.0 to 192.168.255.255, where an IP address can be regarded as a 32-bit integer. As another example, the interval [2, 8] can be converted to 3 prefixes: 001∗, 01∗, 1000.

To use the algorithms presented in this paper, we first convert the source and destination IP addresses from prefix formats to integer intervals. Note that every prefix can be converted to only one integer interval. Second, we run the three algorithms described in this paper. Note that the functional discrepancies directly produced by our algorithms are in interval formats. Third, for each functional discrepancy computed, we convert the source and destination IP addresses from intervals to prefixes. Thus, the formats of the outputs are similar to those of the original firewall rules, which are easy for firewall administrators to understand. (A w-bit integer interval can be converted to at most 2w − 2 prefixes [14].)

B. Design in FDDs

In our discussion so far, we have assumed that both teams design their firewalls using a sequence of rules. In fact, a team can use the structured firewall design method in [12] to design the firewall using an FDD. Such cases are easy to handle using the FDD construction algorithm in this paper and the firewall generation algorithm in [12]. For example, if only one team designs the firewall using a non-ordered FDD, we can use the firewall generation algorithm in [12] to generate a sequence of rules from the FDD first, and then apply the algorithms in this paper. As another example, if two teams design two ordered firewall decision diagrams that are in different orders, we can first generate an equivalent sequence of rules from one diagram, and then we can construct an equivalent ordered firewall decision diagram from the sequence of rules using the order of packet fields from the other firewall decision diagram.

C. More Than Two Teams

In terms of firewall comparison, what we have discussed so far is how to compare two firewalls. If we have N firewalls designed by N teams, where N > 2, there are two ways to compare them: cross comparison and direct comparison. Cross comparison means comparing each of the N(N − 1)/2 pairs, where each pair consists of two of the N firewalls. Direct comparison means extending the shaping algorithm and the comparison algorithm to handle N firewalls. This extension is fairly straightforward.

D. Complexity Analysis

Let n be the number of rules in a firewall, and d be the total number of distinct packet fields that are examined by a firewall. Based on Theorem 1, the time and space complexity of the FDD construction algorithm is O(n^d). Similarly, the time and space complexity of the FDD shaping algorithm and the FDD comparison algorithm is O((n + m)^d), where n and m are the total numbers of rules in the two given firewalls respectively. Despite such worst-case complexities, our algorithms are practical for two reasons. First, d is typically small. Most real-life firewalls only examine four packet fields: source IP address, destination IP address, destination port number, and protocol type. Second, the worst case of our algorithms is extremely unlikely to happen in practice. The experimental results in the next section confirm the above observations.

E. Why not BDDs?

Our solution uses FDDs as the basic data structure for computing the functional discrepancies between two given firewalls. One question that we need to answer is: why not use Binary Decision Diagrams (BDDs) [6]? A BDD is a rooted, directed, acyclic graph that represents a Boolean function. In a BDD, each nonterminal node is labelled by a Boolean variable and has only two outgoing edges, labelled 0 and 1 respectively. Each edge represents an assignment of 0 or 1. A BDD has only two terminal nodes, labelled 0 and 1 respectively.

The answer is that the functional discrepancies computed by BDDs are not human readable. First, the BDD itself, i.e., the one that represents the functional discrepancies between two firewalls, is not human readable because every node in a BDD represents only a bit of a packet, not a field of a packet. Second, generating human readable discrepancies, which are similar to rules, from a BDD results in an exorbitant number of rules, on the order of millions. We have implemented BDD-based solutions using the CUDD package [23]. Unfortunately, comparing two small firewalls results in millions of rules. While compressing millions of rules may not be impossible, it is by no means trivial. In contrast, using the data structure of FDDs, we can easily generate human readable functional discrepancies in a rule-like format.

VIII. EXPERIMENTAL RESULTS

In this section, we present the results of the experiments that we conducted to evaluate both the effectiveness and efficiency of our diverse firewall design method.

A. Effectiveness

To evaluate the effectiveness of the diverse firewall design method, we conducted a real experiment as follows. First, we obtained a real-life firewall used in a university. This firewall was maintained by a senior firewall administrator as a sequence of rules. This firewall, unfortunately, did not have a requirement specification. However, the rules in this firewall were well documented in that each rule had some detailed comments about why the rule was added. Taking the comments of the rules as the requirement specification, we let a computer science undergraduate student design a firewall using firewall decision diagrams. Before the design started, we gave the student some training on designing firewalls using firewall decision diagrams.