2 lectures 16 17-informed search algorithms ch 4.3

ICS-381
Principles of Artificial Intelligence

Week 6

Informed Search Algorithms
Chapter 4

Dr.Tarek Helmy El-Basuny

Dr. Tarek Helmy, ICS-KFUPM

Last Time

We discussed:
Greedy Best-First Search Example (8-Puzzle)
A* Searching algorithm
Proving the optimality of the A* algorithm
We are going to present
Improving the performance of A*
Iterative-Deepening A* (IDA*)
Recursive Best-First Search (RBFS)
Simplified Memory Bounded A* (SMA*)


Analyzing the Heuristic Function h(n)

If h(n) = h*(n) for all n,
Only nodes on optimal solution path are expanded
No unnecessary work is performed
If h(n) = 0 for all n,
This heuristic is admissible, A* performs exactly as Uniform-Cost
Search (UCS)
The closer h is to h*,
The fewer extra nodes that will be expanded

If h1(n) <= h2(n) <= h*(n) for all non-goal nodes n, then h2 dominates h1
h2 is a better heuristic than h1
A1* (A* using h1) expands at least as many nodes as A2* (A* using h2)
A2* is said to be better informed than A1*
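As a small illustration of dominance, the sketch below compares two standard 8-puzzle heuristics on a single state: the number of misplaced tiles (h1) and the Manhattan distance (h2). The function names, the tuple encoding of states (0 for the blank), and the chosen goal layout are assumptions made for this example.

```python
# States are 9-tuples read row by row; 0 marks the blank (an assumed encoding).
GOAL = (1, 2, 3, 8, 0, 4, 7, 6, 5)

def misplaced_tiles(state, goal=GOAL):
    """h1: number of non-blank tiles that are not in their goal position."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan_distance(state, goal=GOAL):
    """h2: sum over tiles of the row distance plus column distance to the goal cell."""
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue
        goal_index = goal.index(tile)
        total += abs(index // 3 - goal_index // 3) + abs(index % 3 - goal_index % 3)
    return total

start = (2, 8, 3, 1, 6, 4, 7, 0, 5)
h1 = misplaced_tiles(start)
h2 = manhattan_distance(start)
print(h1, h2)  # h1 never exceeds h2, so h2 dominates h1
```

Every misplaced tile is at least one move from its goal cell, which is why h1(n) <= h2(n) holds for every state, not only this one.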


A* Search

Major drawback of A*
A* keeps all generated nodes in memory and usually runs out of space long before it runs
out of time.
It cannot venture down a single path unless it is almost continuously having success
(i.e., h is decreasing). Any failure to decrease h will almost immediately cause the
search to switch to another path.

[Figure: 8-puzzle search tree from the Initial State to the Goal; each node is labeled with a number, and the goal is labeled 0]
Three important factors influence the efficiency of algorithm A*:
The cost (or length) of the path found
The number of nodes expanded in finding the path
The computational effort required to compute h(n)
We can overcome the space problem of A* without sacrificing optimality or completeness, at a
small cost in execution time.

Improving A*: Memory-bounded Heuristic Search

Iterative-Deepening A* (IDA*)
Uses the f-cost (f = g + h), rather than the depth, as the cutoff for each iteration.
The new cutoff value is the smallest f-cost of any node that exceeded the
cutoff on the previous iteration.
Space complexity O(bd)
Recursive Best-First Search (RBFS)
It replaces the f-value of each node along the path with the best f-
value of its children.
Space complexity O(bd)
Simplified Memory Bounded A* (SMA*)
Works like A* until memory is full
Then SMA* drops the node in the fringe with the largest f value and
“backs up” this value to its parent.
When all children of a node n have been dropped, the smallest
backed up value replaces f(n)

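IDA* and Iterative Deepening

IDA* is similar to iterative-deepening depth-first search:
Iterative-deepening depth-first search expands by depth layers (d=1, d=2, d=3, d=4)
IDA* expands by f-contours (f1, f2, f3, f4)
[Figure: depth layers d=1 to d=4 next to f-contours f1 to f4]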


Iterative deepening A*

Perform depth-first search LIMITED to some f-bound.
If goal found: ok.
Else: increase the f-bound and restart.
[Figure: depth-first search in each f-contour f1, f2, f3, f4]

How to establish the f-bounds?
- initially: f(S)
generate all successors
record the minimal f(succ) > f(S)
Continue with minimal f(succ) instead of f(S)
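The f-bound procedure above can be sketched in code as follows. This is a minimal illustration, not the lecture's own implementation: the function name `ida_star`, the callback signatures, and the small example graph with its heuristic table are all assumptions.

```python
import math

def ida_star(start, is_goal, successors, h):
    """Return (path, cost) of a cheapest path, or (None, inf) if none exists.

    successors(state) must return (next_state, step_cost) pairs.
    """
    bound = h(start)              # initial f-bound: f(S) = 0 + h(S)
    path = [start]

    def search(g, bound):
        node = path[-1]
        f = g + h(node)
        if f > bound:
            return f, False       # report the f-value that exceeded the bound
        if is_goal(node):
            return g, True
        smallest = math.inf       # minimal f(succ) found beyond the bound
        for nxt, cost in successors(node):
            if nxt in path:       # repeated states are checked on the current path only
                continue
            path.append(nxt)
            value, found = search(g + cost, bound)
            if found:
                return value, True
            smallest = min(smallest, value)
            path.pop()
        return smallest, False

    while True:
        value, found = search(0, bound)
        if found:
            return list(path), value
        if value == math.inf:
            return None, math.inf
        bound = value             # continue with the minimal f that exceeded the old bound

# Hypothetical example graph and admissible heuristic.
GRAPH = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 2)], "G": []}
H = {"S": 4, "A": 3, "B": 2, "G": 0}
ida_path, ida_cost = ida_star("S", lambda s: s == "G", lambda s: GRAPH[s], H.get)
print(ida_path, ida_cost)
```

On this graph the first iteration uses the bound 4, fails, and records 5 as the smallest f-value beyond the bound; the second iteration with bound 5 reaches the goal.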


Iterative Deepening A* Search Algorithm

OPEN: to keep track of the current fringe of the search.
CLOSED: to record states already visited.

1. Start: set the threshold to h(s), where s is the start node.
2. Put s in OPEN and compute f(s).
3. If OPEN is empty, set the threshold to the smallest f-value that exceeded the old threshold and go back to step 2.
4. Remove the node of OPEN whose f-value is smallest and put it in CLOSED (call it n).
5. If n is a goal, stop with success.
6. Expand n and calculate f(suc) for each successor. Put each successor with f(suc) <= threshold in OPEN, with a pointer back to n. Go to step 3.

8-Puzzle Example: Iterative Deepening A*

Use f(n) = g(n) + h(n) with admissible and consistent h
The cutoff used is the f-cost (g+h) rather than the depth
We again perform depth-first search, but expand only nodes whose f-cost is less than or equal
to the cutoff; the next cutoff is the smallest f-cost among the nodes that exceeded the current one.

f(n) = g(n) + h(n), with h(n) = number of misplaced tiles

[Figure sequence: step-by-step IDA* trace on the 8-puzzle. The first iteration uses Cutoff=4 and the second uses Cutoff=5; nodes are labeled with their f-values and the Goal is marked.]

With A*, 16 nodes have been expanded
With IDA*, only 9 nodes are expanded


Example 2: Iterative Deepening A*

f-limited, f-bound = 100 f-new = 120
S
f=100

A B C
f=120 f=130 f=120

D G E F
f=140 f=125 f=140 f=125



f-limited, f-bound = 120 f-new = 125
S
f=100

A B C
f=120 f=130 f=120

D G E F
f=140 f=125 f=140 f=125



f-limited, f-bound = 125
S
f=100

A B C
f=120 f=130 f=120

D G E F
f=140 f=125 f=140 f=125

SUCCESS

Recursive Best-First Search

Keeps track of the f-value of the best-alternative path available.
If the current f-value exceeds this alternative f-value, then backtrack to the
alternative path.
Upon backtracking, change the f-value of the node to the best f-value of its children.
It takes 2 arguments:
a node
an upper bound
Upper bound = min(upper bound on its parent, current value of its lowest-cost
sibling).
It explores the sub-tree below the node as long as it contains child nodes whose
costs do not exceed the upper bound.
If the current node exceeds this limit, the recursion unwinds back to the
alternative path.
As the recursion unwinds, RBFS replaces the f-value of each node along the
path with the best f-value of its children.
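The behaviour described above can be sketched as a short recursive function. This is an illustrative version under assumptions: the name `rbfs`, the callback signatures, and the tiny example graph are hypothetical, and cycle checking is limited to the current path.

```python
import math

def rbfs(start, is_goal, successors, h):
    """Return (path, f_value); path is None when no solution exists."""

    def recurse(state, g, f_state, f_limit, path):
        if is_goal(state):
            return path, f_state
        children = []  # entries are [f, state, g]
        for nxt, cost in successors(state):
            if nxt in path:           # skip states already on the current path
                continue
            g_next = g + cost
            # A child inherits the parent's backed-up f-value if that is larger.
            children.append([max(g_next + h(nxt), f_state), nxt, g_next])
        if not children:
            return None, math.inf
        while True:
            children.sort(key=lambda child: child[0])
            best = children[0]
            if best[0] > f_limit or best[0] == math.inf:
                return None, best[0]  # unwind and report the best f-value below this node
            alternative = children[1][0] if len(children) > 1 else math.inf
            # Upper bound for the child: min(parent's bound, best sibling's f-value).
            result, best[0] = recurse(best[1], best[2], best[0],
                                      min(f_limit, alternative), path + [best[1]])
            if result is not None:
                return result, best[0]

    return recurse(start, 0, h(start), math.inf, [start])

# Hypothetical example: A looks best at first, but its subtree turns out expensive.
GRAPH = {"S": [("A", 1), ("B", 1)], "A": [("G", 5)], "B": [("G", 2)], "G": []}
H = {"S": 2, "A": 1, "B": 2, "G": 0}
rbfs_path, rbfs_f = rbfs("S", lambda s: s == "G", lambda s: GRAPH[s], H.get)
print(rbfs_path, rbfs_f)
```

In this run the search first descends into A with upper bound 3, finds that A's only child has f = 6, unwinds while backing up 6 as A's new f-value, and then follows B to the goal.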


RBFS: Example

The path until Rimnicu Vilcea is already expanded
Above each node: the f-limit for every recursive call.
Below each node: f(n)
The path is followed until Pitesti, which has an f-value worse than the f-limit.


RBFS: Example

Unwind recursion and store best f-value for current best leaf Pitesti
result, f [best] ← RBFS(problem, best, min(f_limit, alternative))
best is now Fagaras. Call RBFS for new best
best value is now 450


RBFS: Example

Unwind recursion and store best f-value for current best leaf Fagaras
result, f [best] ← RBFS(problem, best, min(f_limit, alternative))
best is now Rimnicu Vilcea (again). Call RBFS for new best
Subtree is again expanded.
Best alternative subtree is now through Timisoara.
Solution is found because the best alternative (447) is worse than the current path (417).


Simple Recursive Best-First Search (SRBFS)

Implementation
If one of successors of node n has the smallest f-value over all OPEN
nodes, it is expanded in turn, and so on.
When another OPEN node n’ (not a successor of n) has the lowest f-value, the search
backtracks to the lowest common ancestor, node k
kn: successor of node k on the path to n
RBFS removes the sub-tree rooted at kn from OPEN
kn becomes an OPEN node with its backed-up value as its f-value
Search continues below that OPEN node with the lowest f-value.


SRBFS -The Algorithm

SRBFS (node: N, bound: B)
  IF f(N) > B, RETURN f(N)
  IF N is a goal, EXIT algorithm
  IF N has no children, RETURN infinity
  FOR each child Ni of N, F[i] := f(Ni)
  sort Ni and F[i] in increasing order of F[i]
  IF only one child, F[2] := infinity
  WHILE (F[1] ≤ B and F[1] < infinity)
    F[1] := SRBFS(N1, MIN(B, F[2]))
    insert N1 and F[1] in sorted order
  RETURN F[1]


RBFS and IDA* Comparison

RBFS is somewhat more efficient than IDA*
But it still suffers from excessive node regeneration.
Like A*, RBFS is optimal if h(n) is admissible.
Space complexity is O(bd).
IDA* keeps only a single number (the current f-cost limit)
Time complexity is difficult to characterize
It depends on the accuracy of h(n) and on how often the best path changes.
Both IDA* and RBFS are subject to a potentially exponential increase
in complexity when searching graphs, because they
cannot check for repeated states other than those on the current
path.
Thus, they may explore the same state many times.


ICS-381
Principles of Artificial Intelligence

Week 6.2

Informed Search Algorithms
Chapter 4

Dr.Tarek Helmy El-Basuny


Major 1 Reminder

Time: 5 – 7:00 PM,
Date: Tomorrow (Sat., Nov. 3),
Place: Building 24-121
Materials: Everything we presented up to the last class


Simplified Memory-Bounded A* (SMA*)

Idea
Expand the best leaf (just like A* ) until memory is full
When memory is full, drop the worst leaf node (the one with the
highest f-value) to accommodate new node.
If all leaf nodes have the same f-value, SMA* deletes the oldest
worst leaf and expands the newest best leaf.
Avoids re-computation of already explored area
Keeps information about the best path of a “forgotten” subtree in
its ancestor.
Complete if there is enough memory for the shortest solution path
Often better than A* and IDA*
Trade-off between time and space requirements


Simplified Memory-bounded A*

Optimizes A* to work within reduced memory.
Key idea:
If memory is full and we need to generate an extra node (C):
Remove the highest f-value leaf from QUEUE (A).
Remember the f-value of the best ‘forgotten’ child in each parent node (15 in S).
[Figure: memory of 3 nodes only. S has f = 13 and remembered value (15); its children are A (f = 15) and B (f = 13); C (f = 18) is the extra node.]

Generate Successor 1 by 1

When expanding a node (S), only add its successors 1 at a time to QUEUE.
We use left-to-right order: first add A, later B.
Avoids memory overflow and allows monitoring of whether we need to delete another node.
[Figure: S (f = 13) with children A and B]


Adjust f-values

If all children M of a node N have been explored and for all M:
f(S...M) > f(S...N)
Then reset:
f(S...N) = min { f(S...M) | M child of N }
This gives a better estimate for f(S): a path through N needs to go through 1 of its children!
[Figure: S with children A (f = 15) and B (f = 24); the f-value of S is reset from 13 to 15]


Too long path: give up

If extending a node would produce a path longer than memory: give up on this path (C).
Set the f-value of the node (C) to ∞ (to remember that we can’t find a path here)
[Figure: memory of 3 nodes only. The path S (13), B (13), C (18) fills memory, so C is reset to ∞ and D is not generated.]


SMA*: an example

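Search graph (each node labeled g + h = f):
Node   Parent   Step cost   g + h = f
S      -        -           0+12=12
A      S        10          10+5=15
B      S        8           8+5=13
C      A        10          20+5=25
G1     A        10          20+0=20
D      B        8           16+2=18
G2     B        16          24+0=24
E      C        10          30+5=35
G3     C        10          30+0=30
G4     D        8           24+0=24
F      D        8           24+5=29
[Figure: first steps of the SMA* trace on this graph, showing the nodes held in memory and their backed-up f-values]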

Example: continued
[Figure: the same search graph, followed by the remaining steps of the SMA* trace with backed-up f-values shown in parentheses]


SMA*: Properties:

Complete: if the available memory can store the shortest solution path.

Optimal: if the available memory can store the best path.

Otherwise: returns the best path that fits in memory.

Memory: uses whatever memory is available.

Speed: if there is enough memory to store the entire tree, same as A*

Thrashing

SMA* switches back and forth continually between a set of candidate
solution paths, only a small subset of which can fit in memory. More
computation is spent on switching than on searching.


Performance Measure

Performance
How well the search algorithm focuses on the goal rather than walking off in irrelevant
directions.
P = L/T
L: depth of goal
T: total number of nodes expanded
Effective Branching Factor (B)
B + B^2 + B^3 + ... + B^L = T
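The equation for B has no closed form in general, so it is usually solved numerically. The sketch below uses bisection; the function names and the tolerance are assumptions chosen for this illustration.

```python
def penetrance(depth, total_nodes):
    """P = L / T."""
    return depth / total_nodes

def effective_branching_factor(total_nodes, depth, tolerance=1e-9):
    """Solve B + B^2 + ... + B^L = T for B by bisection."""
    def series(b):
        return sum(b ** k for k in range(1, depth + 1))

    low, high = 0.0, float(total_nodes)   # series(0) = 0 and series(T) >= T
    while high - low > tolerance:
        mid = (low + high) / 2
        if series(mid) < total_nodes:
            low = mid
        else:
            high = mid
    return (low + high) / 2

# Example: a goal at depth 3 found after expanding 14 nodes.
ebf = effective_branching_factor(total_nodes=14, depth=3)
print(round(ebf, 4), round(penetrance(3, 14), 4))
```

With T = 14 and L = 3 the solution is B = 2, since 2 + 4 + 8 = 14.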


Partial Searching

The searches covered so far are characterized as partial searches. Why?
Partial Searching:
It means the search looks through a set of nodes for the shortest path to the goal state
using a heuristic function.
The heuristic function is an estimate, based on domain-specific
information, of how close we are to a goal.
Nodes: state descriptions, partial solutions
Edges: action that changes state for some cost
Solution: sequence of actions that change from the start to the goal state
BFS, IDS, UCS, Greedy, A*, etc.
OK for small search spaces that are often "toy world" problems
Not OK for hard problems requiring exponential time to find the optimal
solution, e.g. the Traveling Salesperson Problem (TSP)


Local Search and Optimization

Previous searches:
keep paths in memory, and remember alternatives so search can
backtrack. Solution is a path to a goal.
Path may be irrelevant, if the final configuration only is needed (8-
queens, IC design, network optimization, …)
Local Search:
Use a single current state and move only to neighbors.
Use little space
Can find reasonable solutions in large or infinite (continuous) state
spaces for which the other algorithms are not suitable
Optimization:
Local search is often suitable for optimization problems.
Search for best state by optimizing an objective function.


Another Approach: Complete Searching

Typically used for problems where finding a goal state is more important than
finding the shortest path to it. Examples: airline scheduling, VLSI layout.
A problem is in the class NP (nondeterministic polynomial time) if a proposed solution can be
checked in polynomial time; no polynomial-time algorithm is known for the hardest problems in NP.
How can hard NP problems be handled in a reasonable (i.e. polynomial) time? By
using either:
Approximate model: find an exact solution to a simpler version of the problem
Approximate solution: find a non-optimal solution of the original hard problem
Next we'll explore means to search through complete solution space for a solution
that is near optimal.
Complete Searching
Look through solution space for better solution
Nodes: complete solution
Edge: operator changes to another solution
Can stop at any time and have a solution
Technique suited for:
Optimization problems
Hard problems, e.g. Traveling Salesman Problem (TSP):


Traveling Salesperson Problem (TSP)

A salesman wants to visit a list of cities
Stopping in each city only once
Returning to the first city
Traveling the shortest distance
Nodes are cities
Arcs are labeled with distances between cities

5 Cities TSP (distances between cities):
    A  B  C  D  E
A   0  5  8  9  7
B   5  0  6  5  5
C   8  6  0  2  3
D   9  5  2  0  4
E   7  5  3  4  0


Traveling Salesperson Problem (TSP)

A solution is a permutation of cities, called a tour
e.g. A-B-C-D-E-A
Length 24
How many solutions exist?
(n-1)!/2 where n = # of cities
n = 5 results in 12 tours
n = 10 results in 181440 tours
n = 20 results in ~6*10^16 tours
[Figure: the 5 City TSP graph and its distance matrix, as on the previous slide]
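The tour length and the tour counts above can be checked with a few lines of code. The distance matrix is the one from the slides; the names `DIST`, `tour_length`, and `count_tours`, and the string encoding of tours, are assumptions for this sketch.

```python
from math import factorial

CITIES = "ABCDE"
DIST = [
    [0, 5, 8, 9, 7],
    [5, 0, 6, 5, 5],
    [8, 6, 0, 2, 3],
    [9, 5, 2, 0, 4],
    [7, 5, 3, 4, 0],
]

def tour_length(tour):
    """Length of a closed tour such as "ABCDE" (the return to the first city is implied)."""
    legs = zip(tour, tour[1:] + tour[0])
    return sum(DIST[CITIES.index(a)][CITIES.index(b)] for a, b in legs)

def count_tours(n):
    """Number of distinct tours of n cities: (n-1)!/2."""
    return factorial(n - 1) // 2

print(tour_length("ABCDE"))
print(count_tours(5), count_tours(10), count_tours(20))
```

The division by 2 reflects that a tour and its reversal have the same length, and fixing the start city removes the factor n.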


Another Approach: Complete Searching

An operator is needed to transform one solution to another.
What operators work for TSP?
Two-swap (common)
Take two cities and swap their location in the tour
e.g. A-B-C-D-E swap(A,D) yields D-B-C-A-E
Two-interchange
Reverse the path between two cities e.g. A-B-C-D-E interchange (A,D)
yields D-C-B-A-E. Both work since graph is fully connected
Solutions that can be reached with one application of an operator are in the current
solution's neighborhood .
Local searches only consider solutions in the neighborhood.
In general, the neighborhood should be much smaller than the size of the search
space.
Local search methods include:
Hill-Climbing
Beam Search
Simulated Annealing
Genetic Algorithms, and others
An evaluation function f(n) is used to map each solution to a number corresponding
to the quality of that solution.
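The two TSP operators and the neighborhood they induce can be sketched as follows. Tours are encoded as strings of city letters; that encoding and the function names are assumptions for this illustration.

```python
from itertools import combinations

def two_swap(tour, a, b):
    """Swap the positions of cities a and b in the tour."""
    i, j = tour.index(a), tour.index(b)
    cities = list(tour)
    cities[i], cities[j] = cities[j], cities[i]
    return "".join(cities)

def two_interchange(tour, a, b):
    """Reverse the part of the tour between cities a and b (inclusive)."""
    i, j = sorted((tour.index(a), tour.index(b)))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def neighborhood(tour, operator):
    """All tours reachable from `tour` with one application of `operator`."""
    return {operator(tour, a, b) for a, b in combinations(tour, 2)}

print(two_swap("ABCDE", "A", "D"))
print(two_interchange("ABCDE", "A", "D"))
print(len(neighborhood("ABCDE", two_swap)))
```

For 5 cities each operator gives a neighborhood of 10 tours, one per pair of cities, which is small compared with the full set of permutations.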

Hill-Climbing Search

“Continuously moves in the direction of increasing value”
It terminates when a peak is reached.
Looks one step ahead to determine if any successor is better than the current
state; if there is, move to the best successor.
If there exists a successor s of the current state n such that
h(s) < h(n), and
h(s) <= h(t) for all the successors t of n,
then move from n to s. Otherwise, halt at n. (Here h is an estimated cost, so lower is better.)
Similar to Greedy search in that it uses h, but does not allow backtracking or
jumping to an alternative path since it doesn’t “remember” where it has been.
Not complete since the search will terminate at "local minima," "plateaus," and
"ridges."


Hill Climbing: Example-1

Perform depth-first, BUT:
instead of left-to-right selection, FIRST select the child with the best heuristic value

1. Pick a random point in the search space
2. Consider all the neighbors of the current state
3. Choose the neighbor with the best quality and move to that state
4. Repeat steps 2 and 3 until all the neighboring states are of lower quality
5. Return the current state as the solution state

[Figure: search tree rooted at S with heuristic values A 10.4, D 8.9, E 6.9, B 6.7, F 3.0, and the goal G]


Hill Climbing Search: Implementation

1. QUEUE <-- path only containing the root;
2. WHILE QUEUE is not empty AND goal is not reached
     DO remove the first path from the QUEUE;
        create new paths (to all children);
        reject the new paths with loops;
        sort new paths (HEURISTIC);
        add the new paths to front of QUEUE;
3. IF goal reached
     THEN success;
     ELSE failure;

function HILL-CLIMBING(problem) returns a state that is a local maximum
  input: problem, a problem
  local variables: current, a node
                   neighbor, a node
  current ← MAKE-NODE(INITIAL-STATE[problem])
  loop do
    neighbor ← a highest-valued successor of current
    if VALUE[neighbor] ≤ VALUE[current] then return STATE[current]
    current ← neighbor
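The HILL-CLIMBING pseudocode translates almost line for line into code. In the sketch below the one-dimensional `LANDSCAPE` list and the `neighbors` helper are hypothetical; they are chosen only to show that the result depends on the start state.

```python
def hill_climbing(start, successors, value):
    """Move to the highest-valued successor until no successor is better."""
    current = start
    while True:
        candidates = list(successors(current))
        if not candidates:
            return current
        neighbor = max(candidates, key=value)
        if value(neighbor) <= value(current):
            return current            # a peak (possibly only a local maximum)
        current = neighbor

# Hypothetical landscape: states are indices, values are the list entries.
LANDSCAPE = [1, 3, 2, 5, 9, 4]

def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]

from_left = hill_climbing(0, neighbors, LANDSCAPE.__getitem__)
from_middle = hill_climbing(2, neighbors, LANDSCAPE.__getitem__)
print(from_left, from_middle)
```

Starting at index 0 the search stops at index 1 (value 3), a local maximum, while starting at index 2 it reaches index 4 (value 9), the global maximum.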


Hill Climbing: Example-2

[Figure: tree of candidate solutions, from the start state (candidate solution) down through neighbor solutions. These numbers measure value, not path length. One leaf is the optimal goal (this might not be reached); another is a suboptimal goal.]


8 Queens Example: Hill Climbing

[Figure: three 8-queens boards. The first two are captioned "Move the queen in this column" and the third "Done!". Squares in the chosen column are labeled with conflict counts.]
Conflicts: NNNYNNNY, then NNNNNYNY, then NNNNNNNN
The numbers give the number of conflicts. Choose a move with
the lowest number of conflicts. Randomly break ties. Hill descending.


Example-3: Hill Climbing

Start state (h=4):      Goal state (h=0):
2 8 3                   1 2 3
1 6 4                   8 _ 4
7 _ 5                   7 6 5
[Figure: hill-climbing path from the start to the goal through states with h = 3, 3, 2, and 1; rejected successors are labeled with their higher values]
f(n) = (number of tiles out of place)

Example-3: Hill Climbing

We can use heuristics to guide “hill climbing” search.
In this example, the Manhattan Distance heuristic helps us quickly find a solution to the 8-puzzle.
[Figure: 8-puzzle search tree with each state labeled by h(n); the followed path ends at the goal 1 2 3 / 4 5 6 / 7 8 _ with h(n) = 0]

Drawbacks of Hill Climbing

Problems:
Local Maxima: peaks that aren’t the highest point in the space: No progress
Plateaus: the space has a broad flat region that gives the search algorithm no direction
(random selection to solve)
Ridges: flat like a plateau, but with drop-offs to the sides; Search may oscillate from side
to side, making little progress
Remedy:
Introduce randomness

Global Maximum


Problems of Hill Climbing

In this example, hill climbing does not work!
All the nodes on the fringe are taking a step “backwards” (local minima)
[Figure: an 8-puzzle state and its successors, each labeled by h(n)]


Local Beam Search

The idea is to keep only those states that are relatively good and
forget the rest.
Local beam search is somewhat similar to Hill Climbing:
Start from k initial states.
Expand all k states and keep the k best successors.
This can avoid some local optima, but not always. It is most useful when the search
space is big and the local optima aren't too common.
Local Beam Search Algorithm
Keep track of k states instead of one
Initially: k random states
Next: determine all successors of the k states
Extend all paths one step
Reject all paths with loops
Sort all paths in queue by estimated distance to goal
If any of the successors is a goal → finished
Else select the k best from the successors and repeat.
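The algorithm above can be sketched in a few lines. This is a simplified illustration under assumptions: it tracks states rather than whole paths (so loop rejection is omitted), states must be hashable, and the integer toy problem and all names are hypothetical.

```python
def local_beam_search(initial_states, successors, h, is_goal, k, max_steps=100):
    """Keep the k states with the lowest estimated distance h at every step."""
    beam = sorted(initial_states, key=h)[:k]
    for _ in range(max_steps):
        for state in beam:
            if is_goal(state):
                return state
        # Pool the successors of all k states, then keep the k best of the pool.
        candidates = {s for state in beam for s in successors(state)}
        if not candidates:
            return None
        beam = sorted(candidates, key=h)[:k]
    return None

# Hypothetical toy problem: reach the integer 42 using the moves -1, +1, +5.
TARGET = 42
found = local_beam_search(
    initial_states=[0, 100],
    successors=lambda n: [n - 1, n + 1, n + 5],
    h=lambda n: abs(TARGET - n),
    is_goal=lambda n: n == TARGET,
    k=2,
)
print(found)
```

Because the k best states are chosen from the pooled successors, the beam quickly abandons the start state 100 and concentrates on the more promising region.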


Local Beam Search

Concentrates on
promising paths


Local Beam Search

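Assume a pre-fixed WIDTH, e.g. 2
Perform breadth-first, BUT:
Only keep the WIDTH best new nodes depending on heuristic at each new level.
Depth 1) S is expanded to A (10.4) and D (8.9); both are kept.
Depth 2) A is expanded to B (6.7) and D (8.9); D is expanded to A (10.4) and E (6.9). B and E are kept; D and A are ignored (X).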

Local Beam Search: Example 2

[Figure: the beam search tree at Depth 3 and Depth 4 with WIDTH 2. New nodes at Depth 3: C (4.0), E (6.9), B (6.7), F (3.0). New nodes at Depth 4 include A (10.4) and the goal G (0.0). Ignored nodes are marked X.]
Ignore leafs that are not goal nodes

Genetic Algorithms: Basic Terminology

1. Chromosome: a candidate solution to a problem, encoded as a string of bits.
2. Genes: a chromosome can be divided into functional blocks of DNA, genes, which
encode traits, such as eye color. The different settings for a trait (blue, green, brown,
etc.) are called alleles. Each gene is located at a particular position, called a locus,
on the chromosome. Genes are single bits or short blocks of adjacent bits.
3. Genome: the complete collection of chromosomes is called the organism’s
genome.
4. Population: a set of chromosomes (collection of solutions).
5. Genotype: a set of genes contained in a genome.
6. Crossover (or recombination): occurs when two chromosomes bump into one
another exchanging chunks of genetic information, resulting in an offspring.
7. Mutation: offspring is subject to mutation, in which elementary bits of DNA are
changed from parent to offspring. In GAs, crossover and mutation are the two most
widely used operators.
8. Fitness/Evaluation Function: a measure of the quality of a chromosome, which determines the probability that it will live to reproduce.


Definitions

Population: a collection of chromosomes, e.g. several bit strings such as 0100101010011011

Chromosome: 0100101010011011

Gene: 1


Crossover

Merging 2 chromosomes to create new chromosomes

0100101010011011 0100101010011011


Mutation

Random changing of a gene in a chromosome

0100101010011011
1

Mutation can help in escaping from local maxima in our state
space.


Genetic Algorithms: Crossover

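Reproduction (Crossover)
[Figure: two expression trees, Parent state A and Parent state B, built from the operators +, log, sin, cos, pow, tan, /, - and the leaves x, y, 5; the Child of A and B combines a subtree from each parent]
Bit-string example:
Parent state A: 10011101
Parent state B: 11101000
Child of A and B: 10011000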


Genetic Algorithms: Mutation

[Figure: expression-tree mutation. Parent state A contains a cos node; in the Child of A that node is mutated to tan.]
Bit-string example:
Parent state A: 10011101
Child of A: 10011111
Tour example:
Parent state A: A C E F D B A
Child of A: A C E D F B A

GA Operators

A genetic algorithm maintains a population of candidate solutions for the problem at hand, and makes it evolve by iteratively applying a set of stochastic operators:
Selection: replicates the most successful solutions found in a population at a rate proportional to their relative quality.
Crossover: decomposes two distinct solutions and then randomly mixes their parts to form novel solutions.
Mutation: randomly converts some of the bits in a chromosome. For example, if mutation occurs at the second bit in chromosome 11000001, the result is 10000001.

Parent 1: 1 0 1 0 1 1 1
Parent 2: 1 1 0 0 0 1 1
Child 1:  1 0 1 0 0 1 1
Child 2:  1 1 0 0 1 1 0  (Mutation)

A simple Genetic Algorithm

The outline of a simple genetic algorithm is the following:
1. Start with the randomly generated population of “n” j-bit chromosomes.
2. Evaluate the fitness of each chromosome.
3. Repeat the following steps until n offspring have been created:
a. Select a pair of parent chromosomes from the current population
based on their fitness.
b. With the probability pc, called the crossover rate, crossover the pair at
a randomly chosen point to form two offspring.
c. If no crossover occurs, the two offspring are exact copies of their
respective parents.
d. Mutate the two offspring at each locus with probability pm, called the
mutation rate, and place the resulting chromosomes in the new
population.
e. If n is odd, one member of the new population is discarded at
random.
4. Replace the current population with the new population.
5. Go to step 2.
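Steps 2 to 4 of the outline can be sketched as a single function that produces one new generation. This is an illustrative version under assumptions: chromosomes are Python strings of "0" and "1", selection uses `random.choices` with fitness weights, and the function name and the fallback to uniform selection when all fitnesses are zero are choices made here.

```python
import random

def one_generation(population, fitness, pc=0.7, pm=0.001, rng=random):
    """Create a new population of the same size from the current one."""
    n = len(population)
    scores = [fitness(c) for c in population]
    weights = scores if sum(scores) > 0 else None   # uniform if all fitnesses are 0
    new_population = []
    while len(new_population) < n:
        # Step 3a: fitness-proportionate selection of a pair of parents.
        p1, p2 = rng.choices(population, weights=weights, k=2)
        # Steps 3b and 3c: single-point crossover with probability pc, else copies.
        if rng.random() < pc:
            point = rng.randrange(1, len(p1))
            c1, c2 = p1[:point] + p2[point:], p2[:point] + p1[point:]
        else:
            c1, c2 = p1, p2
        # Step 3d: flip each bit independently with probability pm.
        for child in (c1, c2):
            new_population.append("".join(
                bit if rng.random() >= pm else ("1" if bit == "0" else "0")
                for bit in child))
    # Step 3e: if n is odd, one member is discarded at random.
    if len(new_population) > n:
        new_population.pop(rng.randrange(len(new_population)))
    return new_population

rng = random.Random(0)
population = ["".join(rng.choice("01") for _ in range(8)) for _ in range(4)]
for _ in range(50):
    population = one_generation(population, lambda c: c.count("1"), rng=rng)
print(population)
```

The run is stochastic, so the final population depends on the seed; only the population size and chromosome length are guaranteed to be preserved.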


Example 1: Genetic Algorithm

Assume the following:
Length of each chromosome = 8,
Fitness function f(x) = the number of ones in the bit string,
Population size n = 4,
Crossover rate pc = 0.7,
Mutation rate pm = 0.001
The initial, randomly generated, population is the following:
Chromosome label Chromosome string Fitness
A 00000110 2
B 11101110 6
C 00100000 1
D 00110100 3


Example 1: Step 3a

We will use fitness-proportionate selection, where the expected number of times an individual is
selected for reproduction is equal to its fitness divided by the average of the fitnesses in the
population, which is (2 + 6 + 1 + 3) / 4 = 3
For chromosome A, this number is 2 / 3 = 0.667
For chromosome B, this number is 6 / 3 = 2
For chromosome C, this number is 1 / 3 = 0.333
For chromosome D, this number is 3 / 3 = 1
(0.667 + 2 + 0.333 + 1 = 4)
Step 3b: Apply the crossover operator on the selected parents:
Given that B and D are selected as parents, assume they cross over after the first locus
with probability pc to form two offspring:
E = 10110100 and F = 01101110.
Assume that B and C do not cross over, thus forming two offspring which are exact
copies of B and C.
Step 3c: Apply the mutation operator on the offspring:
Each offspring is subject to mutation at each locus with probability pm.
Suppose E is mutated at the sixth locus to form E’ = 10110000, and offspring B is
mutated at the first locus to form B’ = 01101110.


Example 1: Steps 3b and 3c

The new population now becomes:
Chromosome label Chromosome string Fitness
E’ 10110000 3
F 01101110 5
C 00100000 1
B’ 01101110 5
Note that the best string, B, with fitness 6 was lost, but the average fitness
of the population increased to (3 + 5 + 1 + 5) / 4= 3.5.
Iterating this process will eventually result in a string with all ones.
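The numbers in this example can be reproduced directly. In the sketch below the helper names `crossover`, `flip`, and `fitness` are assumptions; the chromosomes, the crossover point, and the mutated loci are the ones stated in the example, with loci counted from 1.

```python
def crossover(p1, p2, point):
    """Single-point crossover after the given locus."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def flip(chromosome, locus):
    """Flip the bit at a 1-based locus."""
    i = locus - 1
    flipped = "1" if chromosome[i] == "0" else "0"
    return chromosome[:i] + flipped + chromosome[i + 1:]

def fitness(chromosome):
    """Number of ones in the bit string."""
    return chromosome.count("1")

B, C, D = "11101110", "00100000", "00110100"
E, F = crossover(B, D, 1)        # B and D cross over after the first locus
E_prime = flip(E, 6)             # E mutated at the sixth locus
B_prime = flip(B, 1)             # the copy of B mutated at the first locus
new_population = [E_prime, F, C, B_prime]
average = sum(fitness(c) for c in new_population) / len(new_population)
print(E, F, E_prime, B_prime, average)
```

The average fitness rises from 3 to 3.5 even though the single best string, B, is lost.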


Example 2: Genetic Algorithms

Fitness function: number of non-attacking pairs of queens (min = 0, max = 8 × 7/2 =
28)
Selection probability is proportional to fitness:
24/(24+23+20+11) = 31%
23/(24+23+20+11) = 29% etc


Discussion of Genetic Algorithms

• Genetic Algorithms require many parameters... (population size, fraction of
the population generated by crossover; mutation rate, etc... )

• How do we set these?

• Genetic Algorithms are really just a kind of hill-climbing search, but seem
to have fewer problems with local maxima…

• Genetic Algorithms are very easy to parallelize...

• Applications: Protein Folding, Circuit Design, Job-Scheduling Problem,
Timetabling, designing wings for aircraft, ….


Simulated Annealing

Motivated by the physical annealing process
Annealing: harden metals and glass by heating them to a high temperature and
then gradually cooling them
At the start, make lots of moves and then gradually slow down
More formally…
Instead of picking the best move (as in Hill Climbing),
Generate a random new neighbor from current state,
If it’s better take it,
If it’s worse then take it with a probability that depends on the temperature
and on the delta between the new and old states,
The probability gets smaller as time passes (the temperature drops) and as the “badness” of the
move grows,
Compared to hill climbing the main difference is that SA allows downwards steps;
(moves to higher cost successors).
Simulated annealing also differs from hill climbing in that a move is selected at
random, and the algorithm then decides whether to accept it.


Simulated Annealing

function SIMULATED-ANNEALING(problem) returns a solution state
  current = start state
  for time = 1 to forever do
    T = schedule[time]
    if T = 0 return current
    next = a randomly chosen successor of current
    ∆ = value(next) – value(current)
    if ∆ > 0 then current = next
    else current = next with probability e^(∆/T)
  end for

This version usually (but not always) makes uphill moves.
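A direct translation of the pseudocode is sketched below. The landscape, the geometric cooling schedule, and all names are assumptions chosen for illustration; the run is stochastic, so the returned state depends on the random seed.

```python
import math
import random

def simulated_annealing(start, successors, value, schedule, rng=random):
    """Accept uphill moves always and downhill moves with probability e^(delta/T)."""
    current = start
    time = 1
    while True:
        temperature = schedule(time)
        if temperature <= 0:
            return current
        nxt = rng.choice(successors(current))
        delta = value(nxt) - value(current)
        if delta > 0 or rng.random() < math.exp(delta / temperature):
            current = nxt
        time += 1

# Hypothetical landscape: states are indices, values are the list entries.
LANDSCAPE = [1, 3, 2, 5, 9, 4]

def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]

def schedule(time):
    """Assumed geometric cooling that stops after 200 steps."""
    return 0 if time > 200 else 2.0 * 0.97 ** time

sa_result = simulated_annealing(0, neighbors, LANDSCAPE.__getitem__, schedule,
                                rng=random.Random(0))
print(sa_result, LANDSCAPE[sa_result])
```

Early on, when T is large, downhill moves are accepted often enough to leave the local maximum at index 1; as T shrinks the search behaves more and more like hill climbing.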

Intuition behind simulated annealing

The probability of making a downhill move decreases with time (length of the
exploration path from a start state).
The choice of probability distribution for allowing downhill moves is derived from
the physical process of annealing metals (cooling molten metal to solid minimal-
energy state).
During the annealing process in metals, there is a probability p that a transition to a
higher energy state (which is sub-optimal) occurs. Higher energy implies lower
value. The probability of going to a higher energy state is e^(∆/T) where
∆ = (energy of current state) – (energy of next state), which is negative for such a transition
T is the temperature of the metal.
p is higher when T is higher, and movement to higher energy states becomes less
likely as the temperature cools down.
For both real and simulated annealing, the rate at which a system is cooled is called
the annealing schedule, or just the “schedule.”


The effect of varying ∆ for fixed T=10

∆      e^(∆/10)
-43    0.01
-13    0.27

A negative value for ∆ implies a downhill step. The lower the value of ∆, the bigger the step would be downhill, and the lower the probability of taking this downhill move.
0 1.00


The effect of varying T for fixed ∆ = -13

T        e^(-13/T)
1        0.000002
50       0.77
10^10    0.9999…

The greater the value of T, the smaller the relative importance of ∆ and the higher the probability of choosing a downhill move.
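The entries of both tables follow from the acceptance rule e^(∆/T) and can be recomputed with a short script. The function name `acceptance_probability` is an assumption for this sketch.

```python
import math

def acceptance_probability(delta, temperature):
    """Probability of accepting a move with value change delta at temperature T."""
    if delta > 0:
        return 1.0                      # uphill moves are always accepted
    return math.exp(delta / temperature)

# Varying delta for fixed T = 10.
by_delta = {delta: acceptance_probability(delta, 10) for delta in (-43, -13, 0)}
# Varying T for fixed delta = -13.
by_temperature = {t: acceptance_probability(-13, t) for t in (1, 50, 1e10)}

for delta, p in by_delta.items():
    print(delta, round(p, 2))
for t, p in by_temperature.items():
    print(t, p)
```

The script shows both trends at once: for fixed T the probability falls as ∆ becomes more negative, and for fixed ∆ it rises towards 1 as T grows.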


Logistic (sigmoid) function for Simulated
Annealing

Alternatively we could select any move (uphill or downhill)
with probability 1/(1+e^(–∆/T)). This is the logistic function,
also called the sigmoid function. If we use this function, the
algorithm is called a stochastic hill climber.
The logistic function:
[Figure: p plotted against ∆ from -20 to 20. For T near 0 the curve is a sharp step from 0 to 1.0 around ∆ = 0; for T very high the curve stays close to 0.5.]

Example

Uphill
∆ = 13, T = 1: e^(–∆/T) ≅ 0, so Probability = 1/(1+e^(–∆/T)) ≅ 1
∆ = 13, T = 10^10: e^(–∆/T) ≅ 1, so Probability = 1/(1+e^(–∆/T)) ≅ 0.5

Downhill
∆ = -13, T = 1: e^(–∆/T) = e^13 is very large, so Probability = 1/(1+e^(–∆/T)) ≅ 0
∆ = -13, T = 10^10: e^(–∆/T) ≅ 1, so Probability = 1/(1+e^(–∆/T)) ≅ 0.5
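The four cases of this example can be verified numerically with the logistic rule. The function name `logistic_probability` is an assumption; note that `math.exp` raises an overflow error for very large exponents, which this small example does not reach.

```python
import math

def logistic_probability(delta, temperature):
    """Probability 1 / (1 + e^(-delta/T)) of selecting a move."""
    return 1.0 / (1.0 + math.exp(-delta / temperature))

uphill_cold = logistic_probability(13, 1)
uphill_hot = logistic_probability(13, 1e10)
downhill_cold = logistic_probability(-13, 1)
downhill_hot = logistic_probability(-13, 1e10)
print(uphill_cold, uphill_hot, downhill_cold, downhill_hot)
```

At low temperature the rule is nearly deterministic (uphill accepted, downhill rejected), while at very high temperature every move is taken with probability close to 0.5.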


Summary

Best-first search = general search, where the minimum-cost nodes (according to some
measure) are expanded first.
Greedy search = best-first with the estimated cost to reach the goal as a heuristic
measure.
- Generally faster than uninformed search
- Not optimal, Not complete.
A* search = best-first with measure = path cost so far + estimated path cost to goal.
- Combines advantages of uniform-cost and greedy searches
- Complete, optimal and optimally efficient
- Space complexity still exponential
Time complexity of heuristic algorithms depends on the quality of the heuristic function.
Good heuristics can sometimes be constructed by examining the problem definition or
by generalizing from experience with the problem class.
Iterative improvement algorithms keep only a single state in memory.
Can get stuck in local extrema; simulated annealing provides a way to escape local
extrema, and is complete and optimal given a slow enough cooling schedule.

Summary

What if we follow some small number of paths?
• Hill Climbing
1. Moves in the direction of increasing value.
2. Terminates when it reaches a “peak”.
3. Does not look beyond immediate neighbors
• Local beam search:
1. Start with k randomly generated nodes.
2. Generate k successors of each.
3. Keep the best k of them.
• Simulated Annealing
1. Generate a random new neighbor from current state.
2. If it’s better take it.
3. If it’s worse then take it with a probability that depends on the temperature and on how much worse it is.
• Genetic algorithms:
1. Start with k randomly generated nodes called the population.
2. Generate successors by combining pairs of nodes.
3. Allow mutations.

Announcement

Term Projects progress

Time: During the Class

Date: (Mon. & Wed., Nov. 17, 24),

Each group will present/demonstrate the progress of

their project within 10 minutes.


Projects Proposal Presentation Schedule-071

Team Members Time Project Title
224236 Noor Al-Sheala 8:00, Sat.
214539 Ali Al-Ribh
Pebble Picking Game
235213 Husain Al-Saleh
222628 Sawd Al-Sadoun 8:10, Sat.
235091 Abdulaziz Al-Jraifni
Speech Recognition
235865 Ibrahim Al-Hawwas
224792 Maitham Al-Zayer
223132 Mahmoud Al-Saba 8:20: Sat. 4*4 game
222974 Reda Al-Ahmed
234383 Husain Al-Jaafar 8:30: Sat.
226574 Yousf Al-Alag
Smart Registrar
224510 Abdullah Al-Safwan
224328 Khalid Al-Sufyany 8:40, Sat. XXXXXXXXXXXXXXXXXXXXXXXX
215789 Yousef Al-Numan
234681 Nader Al-Marhoun 8:00, Mon.
234165 Ali Aoun
Fingerprint recognition
226246 Abdullah Al-Mubearik Intelligent Adviser
8:10, Mon.
230417 Nawaf Al-Surahi

234681 Saeed Al-Marhoon XXXXXXXXXXXXXXXXXXXXXXXXX
8:20, Mon.
215027 Mohamed Habib
XXXXXXXXXXXXXXXXXXXXXXXXX
207091 Mohamed Al-Shakhs 8:30, Mon.


The End!!

Thank you

Any Questions?

