Experiment Time: 2017/05/20
PROPOSAL FOR SYSTEM ANALYSIS AND DESIGN
COMPUTER NETWORK TRAFFIC ANALYSIS
Project description
An unknown number of attacks on government computer networks occur every day.
Some of these attacks are successful and/or undetected and can have disastrous
consequences. One of the aims of this project is to detect and ultimately prevent
these attacks. In today’s digital age, we are surrounded by massive amounts of data.
In many cases, we do not know the best way to store, manage, integrate, obtain
information from, or visualize it. Such is the case for data regarding packet flows
over a network. Research involving the analysis of this type of data is in its early
stages. Interesting problems such as behavioral authentication of server flows and
intrusion detection are beginning to be solved using this type of data. We are
particularly interested in analyzing network data for the purposes of anomaly
detection (attacks, masquerades, and network interruptions), user profiling,
workload management, and application verification.
Our tasks include:
1. processing the data consisting of packets into a useful format
2. extracting information from the data flows
3. developing traffic flow models for the purposes mentioned above
4. visualizing the data
5. recognizing data patterns for the purposes mentioned above.
The client for this project is Prof. Yi Ding of the University of
Electronic Science and Technology of China.
Computer Network Traffic Analysis Requirements:
Proper network planning can save time and expense, and can ensure a timely
deployment of Microsoft Speech Server (MSS).
Monitor network bandwidth and traffic patterns at an interface-specific level.
Drill down into interface-level details to discover traffic patterns and device
performance.
Get real-time insight into your network bandwidth with one-minute granularity
reports.
Network forensics and security analysis: detect a broad spectrum of external and
internal security threats using continuous stream mining engine technology.
Track network anomalies that surpass your network firewall.
Network planning involves: knowing the number of telephone lines and the types of
associated services and equipment that are needed to support telephony (voice-only)
applications; anticipating increased TCP/IP network traffic; and subsequently
determining the optimal network architecture needed for the system.
TCP/IP Network: A physical TCP/IP network is required for MSS. All MSS
computers, Web servers and load balancers communicate using this network. Install
at least one network adapter in each computer running MSS. The use of a firewall
between MSS computers is not supported. To determine network planning
requirements
Load Balancers – This section applies to Enterprise Edition only. Load balancing is
required whenever two or more computers are used for running Speech Engine
Services (SES), Telephony Application Services (TAS), or Web server software in a
server farm or cluster configuration. Either hardware or software load balancing can
be used.
For a TAS server farm, a Private Branch Exchange (PBX) unit is needed to provide
load balancing and call routing functionality.
Telephony Boards – Each computer that runs Telephony Application Services (TAS)
for supporting telephony (voice-only) applications requires telephony interface
manager software and possibly a hardware telephony board that accepts telephone
line connections.
Data Sets: Testing and evaluation are an important part of network traffic analysis.
To compare the effectiveness of different research works, it is recommended to use
a standard data set. Several standard data sets have been used in recent years.
We list a few important data sets that researchers use for network traffic
analysis.
DARPA data set: KDD Cup data has been the most widely used for evaluating
network traffic analysis with respect to intrusion detection. This data set was
prepared by Stolfo et al.
NSL-KDD data set: NSL-KDD is publicly available to researchers and is an
improved version of the original KDD Cup data set.
CAIDA data sets: This data set contains DoS attacks
Waikato data set: It contains internet storage
Supervised and Unsupervised method.
Global and Local methods
Top-down and bottom-up: Top-down (splitting) discretization methods begin with
a single interval covering all values, then divide it into smaller intervals at
each iteration.
Direct and Incremental method.
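The top-down (splitting) idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name top_down_discretize, the max_intervals parameter, and the rule "split the most populated interval at its midpoint" are all hypothetical choices, since the text above does not specify a splitting criterion.

```python
def top_down_discretize(values, max_intervals=4):
    """Top-down discretization sketch.

    Starts with one interval covering all values and, at each iteration,
    splits the interval that holds the most values at its midpoint.
    The splitting rule is an assumption made for illustration.
    """
    intervals = [(min(values), max(values))]

    def population(interval):
        low, high = interval
        # Boundary values are counted in both neighbours; fine for a sketch.
        return sum(1 for v in values if low <= v <= high)

    while len(intervals) < max_intervals:
        low, high = max(intervals, key=population)
        if low == high:
            break  # nothing left to split
        mid = (low + high) / 2
        intervals.remove((low, high))
        intervals.extend([(low, mid), (mid, high)])
        intervals.sort()
    return intervals


if __name__ == "__main__":
    packet_sizes = [1, 2, 3, 4, 100]
    print(top_down_discretize(packet_sizes, max_intervals=3))
```

A bottom-up (merging) method would work in the opposite direction, starting from many small intervals and merging neighbours.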
Feature Selection methods: Feature selection (FS) is a preprocessing method
applied before data mining techniques. Feature selection is used to improve
the performance of data mining techniques by removing redundant or
irrelevant attributes.
We have identified several techniques that are frequently used for preprocessing
network traffic data, including principal component analysis, information
entropy, rough set theory, and feature selection.
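As a small illustration of how information entropy can flag irrelevant attributes, the sketch below computes the Shannon entropy of each attribute and keeps only those that vary. The record layout (a list of dicts) and the names shannon_entropy, informative_features, and min_entropy are assumptions made for this example, not part of the project.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of the values in one attribute column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def informative_features(records, min_entropy=0.0):
    """Return the attribute names whose entropy exceeds min_entropy.

    An attribute with the same value in every record has zero entropy
    and carries no information for classification.
    """
    names = list(records[0].keys())
    return [
        name
        for name in names
        if shannon_entropy([r[name] for r in records]) > min_entropy
    ]


if __name__ == "__main__":
    # Hypothetical packet records; 'ttl' is constant and gets dropped.
    records = [
        {"protocol": "tcp", "ttl": 64},
        {"protocol": "udp", "ttl": 64},
    ]
    print(informative_features(records))
```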
Data mining: Data mining plays an important role in analyzing network traffic.
Clustering technique: Clustering is the process of partitioning data into groups
according to certain characteristics of data
Hybrid models: Hybrid models combine two or more approaches for the
analysis of network traffic. Hybrid models have achieved good results in the
analysis of network traffic.
Time-series graph mining: used for detecting anomalous packets in network traffic.
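To make the clustering technique mentioned above concrete, here is a minimal one-dimensional k-means sketch that partitions packet sizes into groups. The choice of k-means, the function name kmeans_1d, and the sample values are assumptions for illustration; the text does not name a specific clustering algorithm.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Minimal 1-D k-means: assign each value to the nearest centroid,
    then move each centroid to the mean of its assigned values."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    # 'clusters' holds the assignment made in the final iteration.
    return centroids, clusters


if __name__ == "__main__":
    # Hypothetical packet sizes in bytes: small control packets and large data packets.
    sizes = [60, 64, 70, 1400, 1500, 1460]
    centroids, clusters = kmeans_1d(sizes, centroids=[0, 1000])
    print(centroids)
    print(clusters)
```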
Evaluation metrics:
Many different metrics are used to evaluate data mining techniques. The
detection rate, false positive rate, accuracy, and time cost are used to measure
the performance of a classifier on different data sets. A number of metrics
exist to express predictive accuracy. The metrics below are computed from the
confusion matrix. Each metric is defined as follows.
a) True negatives (TN)
Total number of normal packets correctly classified as normal.
b) True positives (TP)
Total number of malicious packets correctly classified as malicious.
c) False negatives (FN)
Total number of malicious packets incorrectly classified as normal packets.
d) False positives (FP)
Total number of normal packets incorrectly classified as malicious packets.
e) Detection Rate (DR)
It is the ratio of the number of attacks detected to the total number of
attacks: DR = TP / (TP + FN).
f) Precision Rate (PR)
It is the ratio of TP to TP plus FP: PR = TP / (TP + FP).
g) Recall Rate (RR)
It is the ratio of TP to TP plus FN: RR = TP / (TP + FN).
h) Overall Rate (OR)
It is the ratio of TP plus TN to the total number of packets:
OR = (TP + TN) / (TP + FP + FN + TN).
i) Sensitivity
It is the ratio of TP to TP plus FN: Sensitivity = TP / (TP + FN). It is the
same quantity as the recall rate.
j) Specificity
It is the ratio of TN to TN plus FP: Specificity = TN / (TN + FP).
k) Accuracy
It is the ratio of TP plus TN to the total number of packets:
Accuracy = (TP + TN) / (TP + TN + FP + FN).
l) Percentage of Successful Prediction (PSP)
It is the ratio of the number of instances successfully classified to the
total number of actual instances.
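The definitions above translate directly into code. The sketch below counts the four confusion-matrix cells from labelled packets and derives the ratio metrics from them. The label strings "malicious" and "normal", the function names, and the convention of returning 0.0 for an empty denominator are assumptions made for this example.

```python
def confusion_counts(actual, predicted, positive="malicious"):
    """Count TP, TN, FP, FN, treating `positive` as the attack label."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a != positive and p != positive:
            tn += 1
        elif a != positive and p == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def ratio(numerator, denominator):
    """Safe division; returns 0.0 when the denominator is zero (a convention)."""
    return numerator / denominator if denominator else 0.0


def evaluation_metrics(tp, tn, fp, fn):
    """Compute the ratio metrics defined from the confusion matrix."""
    return {
        "detection_rate": ratio(tp, tp + fn),  # same as recall / sensitivity
        "precision": ratio(tp, tp + fp),
        "specificity": ratio(tn, tn + fp),
        "false_positive_rate": ratio(fp, fp + tn),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


if __name__ == "__main__":
    actual = ["malicious", "malicious", "normal", "normal", "normal"]
    predicted = ["malicious", "normal", "normal", "malicious", "normal"]
    print(evaluation_metrics(*confusion_counts(actual, predicted)))
```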
Traffic Flows:
The nature of internet traffic is easier to understand through the concept of
a flow. A flow is a packet, or a sequence of packets, belonging to a certain
network session between two hosts, delimited by the settings of the flow
generation or analysis tool. A flow may also be defined as a series of packets
that share the same source IP, destination IP, source port, destination port,
and protocol.
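The second definition, packets sharing the same five header fields, can be sketched as a simple grouping step. The Packet record and its field names are hypothetical and stand in for whatever the packet capture tool provides; timeouts and other tool-specific delimiters are not modelled here.

```python
from collections import defaultdict, namedtuple

# Hypothetical packet record; field names are assumptions for this example.
Packet = namedtuple("Packet", "src_ip dst_ip src_port dst_port protocol size")


def group_into_flows(packets):
    """Group packets by the 5-tuple (source IP, destination IP,
    source port, destination port, protocol)."""
    flows = defaultdict(list)
    for pkt in packets:
        key = (pkt.src_ip, pkt.dst_ip, pkt.src_port, pkt.dst_port, pkt.protocol)
        flows[key].append(pkt)
    return dict(flows)


if __name__ == "__main__":
    packets = [
        Packet("10.0.0.1", "10.0.0.2", 5000, 80, "TCP", 60),
        Packet("10.0.0.1", "10.0.0.2", 5000, 80, "TCP", 1500),
        Packet("10.0.0.3", "10.0.0.2", 5001, 53, "UDP", 80),
    ]
    for key, pkts in group_into_flows(packets).items():
        print(key, len(pkts), "packets,", sum(p.size for p in pkts), "bytes")
```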
E-R Diagram:
[Figure: flowchart of a packet's path through the network. Sending side:
application generates traffic, sends packet to socket, sends packets to
transport layer, sends packet to network layer. At a device: packet arrives at
device, followed by the decision "Packet for host?" (Yes/No). Remaining boxes:
sends packet to network layer, sends packet to transport layer, looks up route
to destination, forward packet, drops packet. Layer label: TRANSPORT LAYER (IP).]
Experiment Results:
App-centric monitoring and shaping of app traffic:
Recognize and classify non-standard applications that hog your network
bandwidth using NetFlow Analyzer.
Reconfigure policies with traffic shaping techniques via ACL or class-based
policy to gain control over bandwidth-hungry applications.
NetFlow Analyzer leverages Cisco NBAR to give you deep visibility into layer
7 traffic and recognize applications that use dynamic port numbers or hide
behind well-known ports.
Capacity Planning and Billing:
Make informed decisions on your bandwidth using capacity planning reports.
Measure your bandwidth growth over a period of time with long-term reporting.
Accurate trends over extended historic periods.
Generate on-demand billing for accounting and departmental chargebacks.
Monitor Voice, Video and Data effectively:
Analyze IP service levels for network-based applications and services using
the NetFlow Analyzer IP SLA monitor.
Ensure a high level of data and voice communication quality using Cisco IP SLA
technology.
Keep tabs on key performance metrics of voice and data traffic.
Some common things that we need:
A computer mouse
A touch screen/normal screen
A program on your Mac or Windows machine that includes a translation, icons of
disk drives, and folders.
Pull-down menus
Principles of Human-Computer Interface Design:
Recognize Diversity - To recognize diversity, the designer must take into
account the types of users frequenting the system, ranging from novice users and
knowledgeable but intermittent users to expert frequent users. Each type of user expects
the screen layout to accommodate their needs: novices need extensive help, while experts
want to get where they want to go as quickly as possible. Accommodating both styles on
the same page can be quite challenging. You can address the differences in users by
including both menu or icon choices and commands (e.g. Command or Control P for Print
as well as an icon or menu entry), or by providing an option for both full descriptive
menus and single-letter commands.
Eight Golden Rules of Interface Design:
1. Strive for consistency
consistent sequences of actions should be required in similar situations
identical terminology should be used in prompts, menus, and help screens
consistent color, layout, capitalization, fonts, and so on should be employed
throughout
2. Enable frequent users to use shortcuts
to increase the pace of interaction use abbreviations, special keys, hidden
commands, and macros
3. Offer informative feedback
for every user action, the system should respond in some way (in web
design, this can be accomplished by DHTML - for example, a button will
make a clicking sound or change color when clicked to show the user
something has happened)
4. Design dialogs to yield closure
Sequences of actions should be organized into groups with a beginning,
middle, and end. The informative feedback at the completion of a group of
actions shows the user their activity has completed successfully
5. Offer error prevention and simple error handling
design the form so that users cannot make a serious error; for example,
prefer menu selection to form fill-in and do not allow alphabetic characters
in numeric entry fields
if users make an error, the system should detect the error
and offer simple, constructive, and specific instructions for recovery
segment long forms and send sections separately so that the user is not
penalized by having to fill the form in again - but make sure you inform
the user that multiple sections are coming up
6. Permit easy reversal of actions
7. Support internal locus of control
Experienced users want to be in charge. Surprising system actions, tedious
sequences of data entries, inability or difficulty in obtaining necessary
information, and inability to produce the action desired all build anxiety
and dissatisfaction
8. Reduce short-term memory load
A human can store only 7 (plus or minus 2) pieces of information in their
short-term memory. You can reduce short-term memory load by designing
screens where options are clearly visible, or using pull-down menus and
icons
Prevent Errors - The third principle is to prevent errors whenever possible. Steps
can be taken to design so that errors are less likely to occur, using methods such as
organizing screens and menus functionally, designing screens to be distinctive, and
making it difficult for users to commit irreversible actions. Expect users to make
errors, try to anticipate where they will go wrong, and design with those actions in
mind.
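The error-prevention advice (do not allow alphabetic characters in numeric entry fields, and offer simple, specific instructions for recovery) can be illustrated with a small validation routine. This is a sketch only: the function name validate_numeric_field, its parameters, and the message wording are all hypothetical.

```python
def validate_numeric_field(text, minimum=None, maximum=None):
    """Validate a numeric entry field.

    Returns (value, None) on success, or (None, message) where the
    message tells the user specifically how to recover.
    """
    cleaned = text.strip()
    if not cleaned:
        return None, "This field is empty. Please enter a whole number, for example 42."
    # isascii() rules out non-ASCII digit characters that int() may reject.
    if not (cleaned.isascii() and cleaned.isdigit()):
        return None, f"'{cleaned}' is not a whole number. Please use the digits 0-9 only."
    value = int(cleaned)
    if minimum is not None and value < minimum:
        return None, f"{value} is too small. Please enter a number of at least {minimum}."
    if maximum is not None and value > maximum:
        return None, f"{value} is too large. Please enter a number of at most {maximum}."
    return value, None


if __name__ == "__main__":
    for entry in ["80", "8o", "", "70000"]:
        print(repr(entry), "->", validate_numeric_field(entry, minimum=1, maximum=65535))
```

Where the interface allows it, a menu or a constrained input control avoids the error altogether; validation of this kind is the fallback for free-text fields.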
Norman's Research
One researcher who has contributed extensively to the field of human-computer interface
design is Donald Norman. This psychologist has taken insights from the field of industrial
product design and applied them to the design of user interfaces. According to Norman,
design should:
Use both knowledge in the world and knowledge in the head. Knowledge in the
world is overt - we don't have to overload our short-term memory by having to remember
too many things (icons, buttons and menus provide us with knowledge in the world - we
don't have to remember the command for printing, it's there in front of us). On the other
hand, while knowledge in the head may be harder to retrieve and involves learning, it is
more efficient for tasks which are used over and over again.
"Make it easy to determine what actions are possible at any moment (make use of
constraints)".
For example:
well-designed things can only be put together certain ways (the trapezoidal
SCSI cable is an example of good design - I can only plug it in one way)
menus only display the actions which can be carried out at that time (other
options are dimmed).
"Make things visible, including the conceptual model of the system, the alternative actions
and the results of actions". You can also provide an overview map of your site so that your
user can design their own mental map of how things work.
"Make it easy to evaluate the current state of the system". You can do that by providing
feedback in the form of messages or flashing buttons.
"Follow natural mappings between intentions and the required actions, between actions
and the resulting effect; and between the information that is visible and the interpretation
of the system state".
For example:
It should be obvious what the function of a button or menu is - use
conventions already established for the web, don't try to design something
which changes what people are familiar with.
The underlined phrase on a web page is a well-known clue that a link is
present. From past experience, users understand that clicking on an
underlined phrase should take them somewhere else.
In other words, make sure that (1) the user can figure out what to do, and (2) the user
can tell what is going on.
Summary
How can we relate the recommendations from human-computer interface design research
directly to web design?
1. Recognize Diversity
make your main navigation area fast loading for repeat users
provide a detailed explanation of your topics, symbols, and navigation
options for new users
provide a text index for quick access to all pages of the site
ensure your pages are readable in many formats, to accommodate users
who are blind or deaf, users with old versions of browsers, lynx users,
users on slow modems or those with graphics turned off
2. Strive for consistency in:
menus
help screens
color
layout
capitalization
fonts
sequences of actions
3. Offer informative feedback - rollover buttons, sounds when clicked
4. Build in error prevention in online forms
5. Give users control as much as possible
6. Reduce short-term memory load by providing menus, buttons or icons. If you use
icons, make sure you have a section which explains what they mean. Make things
obvious by using constraints - grayed-out items in menus for options not available on
that page
7. Make use of web conventions such as underlined links, color change in links for
visited pages, common terminology
8. Provide a conceptual model of your site using a site map or an index