3. Knowledge Engineering
It is the process of building intelligent
systems.
1. Problem Assessment
2. Data & knowledge acquisition
3. Prototype
4. Complete System
5. Evaluation & Revision
6. Integration & Maintenance
4. Knowledge Engineering
1. Problem Assessment
● Determine the problem's characteristics
● Identify the main participants in the project
● Specify the project's objectives
● Determine the resources needed for building the
system
Types of problems: Diagnosis, selection, prediction,
classification, clustering, optimization, control
E.g. Diagnosis -> domain knowledge, explanation facilities
5. Knowledge Engineering
2. Data and Knowledge Acquisition
● Collect and analyse data and knowledge
● Make key concepts of the system design more
explicit
Intelligent System
6. Knowledge Engineering
3. Development of a Prototype System
● Choose a tool for building an intelligent system
● Transform data and represent knowledge
● Design and implement a prototype
● Test the prototype
7. Knowledge Engineering
4. Development of a complete system
● Prepare a detailed design for a full-scale system
● Collect additional data and knowledge
● Develop the user interface
● Implement the complete System
8. Knowledge Engineering
5. Evaluation and revision of the system
● Evaluate the system against the performance
criteria
● Revise the system as necessary
9. Knowledge Engineering
6. Integration and Maintenance of the System
● Make arrangements for technology transfer
● Establish an effective maintenance program
10. Case Study 1: Diagnostic Expert
System
I want to develop an intelligent system that
can help me to fix malfunctions of my Mac
computer.
Will an expert system work for this problem?
11. Case Study 1: Diagnostic Expert
System
Phone Call Rule (Firebaugh, 1988) - "Any
problem that can be solved by your in-house
expert in a 10-30 minute phone call can be
developed in an expert system."
● Troubleshooting manuals
● Series of visual inspections
● Rule structure with domain knowledge
12. Case Study 1: Diagnostic Expert
System
Taken from Negnevitsky, Ch. 9, pp. 309-310
13. Case Study 2: Classification
Expert System
Nick Remish
14. Case Study 2: Classification E.S.
● Problem: Identify different classes of
sailboat (typical classification problem)
○ Handled well by both expert systems and NNs
● Collect information
○ In this case, the sail plan can help identify the class
of boat.
● Issues with the expert system approach
○ What if the information is incomplete or inexact?
(Rough weather obscuring sails)
■ Manage incrementally acquired evidence with
certainty factors.
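The incremental handling of evidence mentioned above can be sketched with the certainty-factor combination rule used in MYCIN-style expert systems. The function name and the example values are illustrative assumptions, not taken from the case study.

```python
def combine_cf(cf1: float, cf2: float) -> float:
    """Combine two certainty factors (each in [-1, 1]) for the same hypothesis."""
    if cf1 >= 0 and cf2 >= 0:
        # Both pieces of evidence support the hypothesis.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Both pieces of evidence count against the hypothesis.
        return cf1 + cf2 * (1 + cf1)
    # Conflicting evidence: one factor is negative, the other is not.
    denominator = 1 - min(abs(cf1), abs(cf2))
    if denominator == 0:
        raise ValueError("cannot combine total belief with total disbelief")
    return (cf1 + cf2) / denominator


# Hypothetical example: two sightings of the sail plan through rough weather.
print(round(combine_cf(0.6, 0.5), 2))  # 0.8
```

Each new observation raises (or lowers) the accumulated certainty without ever pushing it outside the range [-1, 1].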
15. 9.3 Will a fuzzy expert system work
for my problem?
● A Fuzzy Solution?
○ Useful when you cannot define a set of exact rules.
○ Great for inherently imprecise properties and
modeling human decision making.
■ Sometimes parameters are imprecise (a doctor
dealing with a patient)
○ Mainly used in engineering, but has applications in
any sector that relies on human experience or is
too complex or uncertain (e.g. finance)
17. Decision-Support Fuzzy System
● Problem: assessing mortgage applications
○ Use a Decision-Support Fuzzy System
● Steps:
○ Represent the concept in fuzzy terms
○ Implement the concept in a prototype
○ Test and optimize
18. Decision-Support Fuzzy System
● Represent the concept in fuzzy terms:
Triangular and trapezoidal fuzzy membership
functions are used to represent knowledge.
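As a minimal sketch of the two membership-function shapes named above, assuming simple piecewise-linear definitions: the function names and parameter letters (a, b, c, d) are illustrative choices, and the actual fuzzy sets of the mortgage system are not reproduced here.

```python
def triangular(x, a, b, c):
    """Membership rises from a to the peak at b, then falls to c (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Membership rises from a to b, stays at 1 up to c, falls to d (a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


print(triangular(2.5, 0, 5, 10))    # 0.5
print(trapezoidal(3, 0, 2, 4, 6))   # 1.0
```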
20. Decision-Support Fuzzy System
● Evaluate and analyse performance:
● Despite having 100+ rules, decision-support
fuzzy systems can be developed, tested and
implemented relatively quickly.
21. 9.4 Will a neural network
work for my problem?
Sean Ruck
22. Neural Network Overview
● Very powerful, general purpose tools
● Successfully applied to prediction,
classification, and clustering problems
● Quite popular due to the versatility of
neural networks
23. Case Study 4: Character Recognition
Neural Networks
● Suppose you want to copy a document onto
your computer without retyping the whole
thing.
○ How?
● Optical Character Recognition
○ The ability of a computer to translate character
images into a text file using software
○ Capture the character images by scanning the
document
■ Converts the scanned document into a bit map
24. Choosing The Neural Network
Architecture
● Architecture and size of the neural network
depend on the complexity of the problem
○ Handwritten character recognition is far more
complex than printed character recognition
● A 3-layer network will suffice for printed
digit recognition
25. Determining an optimal number of
hidden neurons
● More neurons can lead to a more accurate
network, but it takes longer to train
● Too many neurons may actually prevent the
network from generalising or working for
anything other than training examples
○ Overfitting
● How to prevent overfitting
○ Choose the smallest number of neurons that gives
good results and generalisation
26. cont'd.
● We should train the network with various
numbers of hidden neurons
○ Performance is rated by the sum of squared errors
■ Training runs that reach a sufficiently low sum
of squared errors give the numbers of hidden
neurons worth considering
27. Test Examples?
● The test set should be entirely independent
of the training examples
○ Only use the training runs that passed the previous
test
● Test examples should also contain "noise"
○ Distortion of the input
● The training runs that still give a reasonable
recognition error on noisy examples have
enough hidden neurons to use
○ Use the lowest such number for practical purposes
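The selection step above can be sketched as follows. This assumes the test errors for each candidate hidden-layer size have already been measured; the numbers in the example are made up for illustration and the function name is hypothetical.

```python
def pick_hidden_neurons(test_errors, max_error):
    """Return the smallest hidden-layer size whose test error is acceptable.

    test_errors maps a number of hidden neurons to the recognition error
    measured on independent, noisy test examples.
    """
    acceptable = [n for n, err in test_errors.items() if err <= max_error]
    return min(acceptable) if acceptable else None


# Made-up recognition errors (percent) for four candidate network sizes.
errors = {2: 18.0, 5: 4.0, 10: 3.5, 20: 3.8}
print(pick_hidden_neurons(errors, max_error=5.0))  # 5
```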
28. Improving Performance
● A neural network is only as good as the
examples used to train it
● Improve the network by training it with
noisy examples
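One simple way to produce noisy examples from a character bitmap is to flip pixels at random. This is only a sketch under that assumption; the slides do not say which kind of noise was used, and the function name and the tiny bitmap are illustrative.

```python
import random


def add_noise(bitmap, flip_probability, rng):
    """Return a copy of a 0/1 bitmap with each pixel flipped with the given probability."""
    return [1 - p if rng.random() < flip_probability else p for p in bitmap]


rng = random.Random(42)                      # seeded for repeatable examples
clean = [0, 1, 1, 0, 1, 0, 0, 1, 1]          # made-up 3x3 bitmap, row by row
noisy = add_noise(clean, flip_probability=0.1, rng=rng)
print(noisy)
```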
29. Case Study 5: Prediction Neural
Networks
● Neural networks are useful in prediction
situations such as predicting the market
value of a house
● Using a neural network creates a black box
around how the results were reached
○ The result is more important than the how anyway
● For prediction, training examples are
critically important
○ We need a wide array of examples to cover all
possible inputs
30. Determining The Size Of A Training
Set
● Can be estimated with "Widrow's rule of
Thumb": N = nw/e
○ Where N is the number of training examples, nw is
the number of synaptic weights in the network, and
e is the network error permitted
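A minimal sketch of the rule above, assuming the permitted error e is given as a fraction (0.1 for 10 percent) and that the estimate is rounded up to a whole number of examples. The function name is illustrative.

```python
import math


def training_set_size(num_weights, permitted_error):
    """Estimate N = nw / e, rounded up to a whole number of examples."""
    if not 0 < permitted_error <= 1:
        raise ValueError("permitted_error must be a fraction in (0, 1]")
    return math.ceil(num_weights / permitted_error)


# A network with 200 weights and a permitted error of 25 percent.
print(training_set_size(200, 0.25))  # 800
```

With a permitted error of 10 percent the rule asks for roughly ten times as many training examples as there are weights.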
31. Dealing With The Data
● Neural networks work best with inputs in
the 0 to 1 range, but in cases such as with
determining the value of a house, our
inputs are not all in that range
○ Number of bedrooms, square footage, etc.
● So we need to "massage" the data to this
range
○ massaged value = (actual value - minimum value) /
(maximum value - minimum value)
○ Good for up to a dozen possible values
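The massaging formula above can be written directly as a function. The square-footage range in the example is a made-up assumption.

```python
def massage(actual, minimum, maximum):
    """Scale a value into the 0 to 1 range using its known minimum and maximum."""
    return (actual - minimum) / (maximum - minimum)


# Hypothetical range: houses between 500 and 3500 square feet.
print(massage(2000, 500, 3500))  # 0.5
```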
32. cont'd
● We can also utilize 1 of N coding
○ Each possible value is taken as its own input each
with a value of 0 or 1
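A short sketch of 1 of N coding: each possible value gets its own 0/1 input. The category list in the example is hypothetical.

```python
def one_of_n(value, categories):
    """Encode a value as a list with a single 1 at the position of its category."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


# Hypothetical categorical input for the house-price example.
heating = ["gas", "electric", "oil"]
print(one_of_n("electric", heating))  # [0, 1, 0]
```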
33. Dealing With The Results
● To validate the results we test the network
with never before seen examples, as before
● Our network is working with values between
0 and 1. We need to convert back to actual
values
○ We can reverse the "massaging" we did before
● To test the importance of certain inputs we
can test the network's sensitivity to them:
"Sensitivity Analysis"
○ Set each input one at a time to its minimum and
maximum values and measure the results
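The two steps above, converting outputs back to actual values and probing each input at its extremes, can be sketched as follows. The `model` here is a stand-in linear function, not a trained network, and all names and numbers are illustrative assumptions. Inputs are taken to be already massaged, so their minimum and maximum are 0 and 1.

```python
def unmassage(scaled, minimum, maximum):
    """Convert a 0 to 1 network value back to the actual range."""
    return scaled * (maximum - minimum) + minimum


def sensitivity(model, baseline):
    """Change in output when each input moves from its minimum (0) to its
    maximum (1) while the other inputs stay at their baseline values."""
    result = {}
    for name in baseline:
        low = {**baseline, name: 0.0}
        high = {**baseline, name: 1.0}
        result[name] = model(high) - model(low)
    return result


# Stand-in for a trained network: a made-up weighted sum of two inputs.
def model(inputs):
    return 0.7 * inputs["area"] + 0.1 * inputs["bedrooms"]


baseline = {"area": 0.5, "bedrooms": 0.5}
print(sensitivity(model, baseline))
print(unmassage(0.5, 50_000, 250_000))  # 150000.0
```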
34. Case Study 6: Classification Neural
Networks With Competitive Learning
● Using a neural network we can discover
significant features of input patterns and
separate the data into different classes
● Using competitive learning a single layer
neural network can perform clustering
○ Combining similar data into groups or clusters
○ Uses 1 input neuron for each input and 1
competitive neuron for each cluster
35. When Is The Learning Process
Completed?
● In a competitive neural network, there is no
obvious way to know if the network is done
learning
○ We do not know the desired output, so we cannot
use the sum of squared errors
● Use Euclidean Distance criterion instead
○ When there has been no noticeable change in the
weights of the competitive neurons, the network
can be considered to have converged
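A minimal sketch of this stopping criterion, assuming winner-takes-all updates and a learning rate that decays after every epoch (the decay and all default values are assumptions added so that the example settles). Training stops once no competitive neuron's weight vector has moved by more than a small Euclidean distance during an epoch.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_competitive(data, weights, rate=0.5, decay=0.9, tol=1e-3, max_epochs=1000):
    """Return (weights, epochs_used) for a single-layer competitive network."""
    weights = [list(w) for w in weights]
    for epoch in range(1, max_epochs + 1):
        before = [list(w) for w in weights]
        for x in data:
            # The winner is the neuron whose weights are closest to the input.
            winner = min(range(len(weights)), key=lambda j: euclidean(x, weights[j]))
            # Move only the winner's weights towards the input.
            weights[winner] = [w + rate * (xi - w) for w, xi in zip(weights[winner], x)]
        rate *= decay
        # Converged when no neuron's weights changed noticeably this epoch.
        if max(euclidean(b, w) for b, w in zip(before, weights)) < tol:
            return weights, epoch
    return weights, max_epochs


# Made-up data with two obvious clusters and two competitive neurons.
data = [(0.0, 0.0), (0.0, 0.2), (1.0, 1.0), (1.0, 0.8)]
final_weights, epochs = train_competitive(data, [(0.1, 0.1), (0.9, 0.9)])
print(final_weights, epochs)
```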
36. How Can We Associate Neurons to
Specific Classes or Clusters?
● Competitive neural networks enable us to
identify clusters in input data, but do
nothing to label the clusters
○ We can connect a competitive neuron with a
cluster/class by analyzing its weights
● We can identify exactly which cluster is
which by feeding the network test data
corresponding to one particular cluster
○ The output neuron that wins most often is
labeled as that class
38. Genetic Algorithm Review
● Most applicable to optimization problems
○ Process of finding a better solution to a problem
■ There is more than one solution, and the
solutions are not of equal quality
● Generates a population of competing
candidate solutions
● Causes candidates to evolve through
process of natural selection
○ Poor solutions die out while better solutions survive
and reproduce
● Repeating the process over many generations
breeds a near-optimal solution
39. Case Study 7: The Traveling
Salesman Problem
● I want to develop an intelligent system that can produce an
optimal itinerary. I am going to travel by car and I want to
visit all major cities in Western and Central Europe and
then return home. Will a genetic algorithm work for this
problem?
○ Known as the traveling salesman problem (TSP)
○ Given a finite number of cities, and the cost of travel (or the distance)
between each pair of cities, we need to find the cheapest way (or shortest
route) for visiting each city exactly once and returning to the starting point.
○ The TSP arises naturally in numerous transportation and logistics
applications.
■ Arranging routes, scheduling drilling of holes in a circuit board (time
efficient - shortest distance)
● Although we cannot be completely sure that the selected
route is the best one, after several runs we can be confident that
the route obtained is a good one.
40. How does a genetic algorithm solve
the TSP?
Representation
Each chromosome is a sequence of integers
whose order represents the order in which
the cities will be visited.
41. Genetic Operators in the TSP
● Genetic operators used to create new routes
● Crossover Operator
○ The classical form cannot be applied directly, because a simple exchange
of parts between parents would produce routes with duplicates and omissions.
Classical crossover with a single crossover point therefore does not work.
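The problem with classical crossover can be seen in a small sketch. The two parent routes below are made up for illustration, with cities numbered 1 to 6.

```python
def single_point_crossover(parent1, parent2, point):
    """Classical crossover: swap the tails of two chromosomes at one point."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def is_valid_route(route, num_cities):
    """A valid route visits every city from 1 to num_cities exactly once."""
    return sorted(route) == list(range(1, num_cities + 1))


p1 = [1, 2, 3, 4, 5, 6]
p2 = [6, 5, 4, 3, 2, 1]
c1, c2 = single_point_crossover(p1, p2, 3)
print(c1, is_valid_route(c1, 6))  # [1, 2, 3, 3, 2, 1] False
print(c2, is_valid_route(c2, 6))  # [6, 5, 4, 4, 5, 6] False
```

Both children visit some cities twice and miss others, so a crossover operator for the TSP has to repair or avoid such duplicates.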
43. Genetic Operators in the TSP
Continued
● Mutation Operator
○ Reciprocal Exchange
■ Simply swaps two randomly selected cities in
the chromosome
○ Inversion
■ Selects two random points along the
chromosome string and reverses order of cities
between these points
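Both mutation operators above can be sketched in a few lines. The function names and the example route are illustrative, and a seeded random generator is assumed so that runs are repeatable.

```python
import random


def reciprocal_exchange(route, rng):
    """Swap two randomly selected cities in the chromosome."""
    i, j = rng.sample(range(len(route)), 2)
    mutated = list(route)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def inversion(route, rng):
    """Reverse the order of the cities between two random cut points."""
    route = list(route)
    i, j = sorted(rng.sample(range(len(route) + 1), 2))
    return route[:i] + route[i:j][::-1] + route[j:]


rng = random.Random(1)
route = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(reciprocal_exchange(route, rng))
print(inversion(route, rng))
```

Unlike classical crossover, both operators only reorder cities, so the result is always a valid route.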
44. Fitness Function in the TSP
● Evaluate total length of the route
○ Fitness of each individual chromosome is
determined as the reciprocal of the route length
● The shorter the route, the fitter the chromosome
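The fitness function above can be sketched as the reciprocal of the closed-tour length. Straight-line distances between city coordinates are an assumption here, and the four cities on a unit square are made up for illustration.

```python
import math


def route_length(route, coords):
    """Total length of the tour, including the return to the starting city."""
    return sum(
        math.dist(coords[route[k]], coords[route[(k + 1) % len(route)]])
        for k in range(len(route))
    )


def fitness(route, coords):
    """Reciprocal of the route length: the shorter the route, the fitter."""
    return 1.0 / route_length(route, coords)


# Four made-up cities on the corners of a unit square.
coords = {1: (0, 0), 2: (0, 1), 3: (1, 1), 4: (1, 0)}
print(fitness([1, 2, 3, 4], coords))  # 0.25
print(fitness([1, 3, 2, 4], coords) < fitness([1, 2, 3, 4], coords))  # True
```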
45. 9.6 Will a hybrid
intelligent system work
for my problem?
46. Hybrid Intelligent Systems
● Solving complex real-world problems
requires an application of complex
intelligent systems that combine the
advantages of expert systems, fuzzy logic,
neural networks, and evolutionary
computation.
● Such systems can integrate human-like
expertise in a specific domain with abilities
to learn and adapt to a rapidly changing
environment.
47. Case Study 8: Neuro-fuzzy decision-
support systems
● I want to develop an intelligent system for diagnosing myocardial
perfusion from cardiac images. I have a set of cardiac images as
well as the clinical notes and physician's interpretation. Will a
hybrid system work for this problem?
○ Analysis of two SPECT images must be done
■ One stress image taken 10-15 minutes after injection with radioactive tracer
■ One rest image taken 2-5 hours after the injection
○ Brighter patches on the image correspond to well-perfused areas while darker patches
may indicate the presence of ischemia.
○ Visual inspection is highly subjective - an intelligent system can help a cardiologist
with the diagnosis.
○ One binary feature assigns an overall diagnosis--normal or abnormal
● The neuro-fuzzy system in this example has a heterogeneous
structure - the neural network and fuzzy system will work as
independent components but cooperate in solving the problem.
48. Back-Propagation Neural Network to Classify
the SPECT Images into Normal and Abnormal
● Each image is divided into 22 regions, and there
are two images (stress and rest), so we need
44 input neurons.
● Since SPECT images are to be classified as
either normal or abnormal, we should use
two output neurons.
● Good generalization in this study can be
obtained with 5 to 7 neurons in the hidden
layer.
49. Testing the Neural Network
● Testing the network, we find the network's
performance is rather poor
○ 25% normal are misclassified as abnormal
○ Over 35% abnormal are misclassified as normal
○ Indicates that the training set may lack some
important examples
● This can still be improved
50. Neural Network Output
● Two outputs
○ First - possibility that the SPECT image belongs to class
normal
○ Second - possibility that the SPECT image belongs to class
abnormal
● Examples:
○ Normal output high, abnormal output low: first
(normal) output is 0.92 and second (abnormal) is 0.16 - image
classified as normal - risk of a heart attack is low
○ Normal output low, abnormal output high: first
(normal) output is 0.17 and second (abnormal) is 0.51 - image
classified as abnormal - risk of a heart attack is high
○ Both outputs are close: first (normal) output is 0.51 and
second (abnormal) is 0.49 - we cannot confidently classify the
image.
51. Adding Fuzzy Logic for Decision-
Making in Medical Diagnosis
● Fuzzy logic provides us with a means of modeling how the
cardiologist assesses the risk of a heart attack.
● Need to determine input and output variables, define fuzzy sets,
and construct fuzzy rules.
○ Two inputs (NN output 1 and NN output 2) and one output (the risk
of a heart attack).
■ Inputs vary in [0, 1]; the output varies between 0 and 100 percent.
○ Fuzzy sets shown in Negnevitsky page 342 and 343 - Figure 9.33,
Figure 9.34, and Figure 9.35
○ Fuzzy rules in Negnevitsky page 343 - Figure 9.36
■ Examples:
1. If (NN_output1 is Low) and (NN_output2 is Low) then (Risk is Moderate)
2. If (NN_output1 is Low) and (NN_output2 is Medium) then (Risk is High)
3. If (NN_output1 is Low) and (NN_output2 is High) then (Risk is Very_High)
4. If (NN_output1 is Medium) and (NN_output2 is Low) then (Risk is Low)
52. More Certainty
● Risk between 30 and 50 percent cannot be
classified as either normal or abnormal -
uncertain.
● Apply the following heuristics known by
experienced cardiologists to all
corresponding regions (22 in each image)
1. If perfusion inside region i at stress is higher than perfusion
inside the same region at rest, then the risk of a heart
attack should be decreased.
2. If perfusion inside region i at stress is not higher than perfusion
inside the same region at rest, then the risk of a heart
attack should be increased.
53. These Heuristics Implemented in the
Diagnostic System
Step 1 Present the neuro-fuzzy system with the cardiac case.
Step 2 If the system's output is less than 30, classify the presented case as normal and then stop. If the output
is greater than 50, classify the case as abnormal and stop. Otherwise go to step 3.
Step 3 For region 1, subtract perfusion at rest from perfusion at stress. If the result is positive, decrease the
current risk by multiplying its value by 0.99. Otherwise, increase the risk by multiplying its value by
1.01. Repeat this procedure for all 22 regions then go to Step 4.
Step 4 If the new risk value is less than 30, classify the case as normal; if the risk is greater than 50, classify
the case as abnormal; otherwise, classify the case as uncertain.
● When we now apply the test set to the neuro-fuzzy system, we find that the
accuracy of diagnosis has dramatically improved - the overall diagnostic error does
not exceed 5 percent, while only 3 percent of abnormal cases are misclassified as
normal.
● Although we have not improved the system's performance on normal cases (over
30 percent of normal cases are misclassified as abnormal), and up to 20 percent of
the total number of cases are classified as uncertain, the neuro-fuzzy system can
actually achieve even better results in classifying SPECT images than a cardiologist
can.
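Steps 2 to 4 above can be sketched as a single function. It assumes the neuro-fuzzy system has already produced a risk value in percent (Step 1) and that perfusion is given as one number per region for the stress and rest images. The function name and the example values are illustrative.

```python
def diagnose(risk, stress, rest):
    """Return (label, risk) following Steps 2 to 4.

    risk   -- output of the neuro-fuzzy system, in percent
    stress -- perfusion per region in the stress image
    rest   -- perfusion per region in the rest image
    """
    # Step 2: clear-cut cases are classified immediately.
    if risk < 30:
        return "normal", risk
    if risk > 50:
        return "abnormal", risk
    # Step 3: adjust the risk region by region.
    for s, r in zip(stress, rest):
        risk *= 0.99 if s - r > 0 else 1.01
    # Step 4: classify using the adjusted risk.
    if risk < 30:
        return "normal", risk
    if risk > 50:
        return "abnormal", risk
    return "uncertain", risk


# Made-up case: borderline risk, perfusion higher at stress in all 22 regions.
print(diagnose(31.0, [2.0] * 22, [1.0] * 22)[0])  # normal
```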
54. Homogeneous Structure of Neuro-
Fuzzy Systems
● A typical example of a neuro-fuzzy system
with a homogeneous structure is an
Adaptive Neuro-Fuzzy Inference System
(ANFIS).
○ It cannot be divided into two independent distinct
parts.
○ An ANFIS is a multilayer neural network that
performs fuzzy inferencing.
● Case Study 9: Time series prediction
○ Page 346 of Negnevitsky
56. Data Mining
● Definition:
○ The extraction of knowledge from data
○ The exploration and analysis of large quantities of
data to discover patterns.
● Ultimate goal is to discover knowledge
● Amount of data roughly doubles every year
● Important to have fast algorithms to
process data
57. Data Warehouses
● Definition:
○ Large databases that store historical data.
○ Contain millions and in some cases billions of data
records.
● The data stored is time dependent and
integrated
● Used to help support decision making
● Query tools are used to discover
relationships in the data.
58. Query Tools vs. Data Mining
● Query tools are assumption-based
○ User must ask the right questions to get a result
○ User must make assumptions
○ Can select a specific variable that affects the
outcome
● Data Mining tools determine the most
significant factors
○ No assumptions are necessary
○ Discovers patterns automatically
● The representation of data in data
warehouses helps facilitate the data mining
process
59. Data Mining Practice
● Data Mining is a new and evolving field
● Very popular in the banking, finance,
marketing, and telecommunications
industries
● Data Mining uses:
○ Determine trends in markets
○ Detect frauds
○ Target people most likely to buy a product/use a
service
60. Data Mining Tools
● People used to use query tools and
statistics to solve data mining problems
○ These techniques are not very efficient for large
amounts of data
○ Can only correlate a few variables at a time
● Now, tools are based on intelligent
technologies:
○ Neural networks, neuro-fuzzy systems, and decision
trees
● Decision trees are currently the most
popular tool used for data mining
61. Decision Tree
● A map of the reasoning process
● These trees work reliably only with clean data;
noisy or incomplete data is handled poorly
● Uses tree structure to describe the data set
● Very effective in solving classification
problems
● Popular because they help you visualize the
problem
● Nodes are split by predictors
○ In the book example, homeownership was used to
split the tree
63. Gini Coefficient
● A measure of how well the predictor
separates the classes contained in the
parent node
● Introduced by Corrado Gini, an Italian
economist
● He used it to measure the inequality in
Italy's income distribution
64. Calculating the Gini Coefficient
● Top curve represents the real economy
● Bottom line represents equal distribution of wealth
● Coefficient:
○ (shaded area) / (area below bottom line)
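The area ratio above can be computed numerically. This sketch assumes the common convention of sorting incomes in ascending order, which draws the real-economy curve below the line of equal distribution rather than above it; the ratio of the enclosed area to the area under the line is the same either way. The function name and the income lists are illustrative.

```python
def gini_coefficient(incomes):
    """Area between the equal-distribution line and the real curve,
    divided by the area below the equal-distribution line."""
    values = sorted(incomes)
    total = sum(values)
    n = len(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one income and a positive total")
    cumulative = 0.0
    previous_share = 0.0
    area_under_curve = 0.0
    for v in values:
        cumulative += v
        share = cumulative / total
        # Trapezoid of width 1/n between consecutive cumulative shares.
        area_under_curve += (previous_share + share) / (2 * n)
        previous_share = share
    area_under_line = 0.5
    return (area_under_line - area_under_curve) / area_under_line


print(gini_coefficient([5, 5, 5, 5]))  # 0.0 (perfectly equal)
print(gini_coefficient([0, 0, 0, 1]))  # 0.75 (one person holds everything)
```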
67. Summary - Knowledge engineering
● What is knowledge engineering?
○ Problem Assessment
○ Data & knowledge acquisition
○ Prototype
○ Complete System
○ Evaluation & Revision
○ Integration & Maintenance
68. Summary - Assess the Problem
● Assess the Problem
○ problem type
■ diagnosis, selection, prediction, classification,
clustering, optimization, control
○ availability of data
■ precise data? complete set of inputs?
○ form and content of the solution
■ final result only? reasoning behind the answer?
○ availability of expertise
■ extra information provided? difficulty in presenting the
problem-solving strategy?
69. Summary - Data & Knowledge acquisition
● Questions about the data
○ Range? Continuous? Discrete? Precise? Noise
Tolerance? Numerical? Symbolic?
● Data Mining
○ analyze data, find patterns & rules, extract
knowledge from large quantities of data
○ decision tree
■ easy to follow
■ visualization of solution
■ makes clear sets of rules
70. Summary - Prototype
● shows understanding of
○ the problem
○ problem-solving strategy
○ tool selected
● Test
○ Throw it away if needed
○ Forcing the wrong tool wastes more time in the
later development process
○ Prototype is there for discovering any
inappropriate/wrong decisions made
71. Summary -
Complete System, Evaluation, Revision, Integration & Maintenance
● Complete System Development
○ plan, schedule, budget
● Evaluation
○ no clear right/wrong
○ user satisfaction = measurement
● Revision
○ Modify as limitations & weaknesses are discovered
● Maintenance
○ Knowledge evolves over time
○ keep modifying and updating to maintain efficiency
and accuracy