Parallel search
1. Class Assignment
CLASS ASSIGNMENT-01
Parallel Searching Algorithms
Submitted By:
Md.Mahedi Mahfuj -BIT 0207
Submitted To:
Sumon Ahmed
Lecturer,IIT
Date: 21-03-2012
Institute of Information Technology
University of Dhaka
INTRODUCTION:
Parallel Search, also known as Multithreaded Search or SMP Search, is a way to increase
search speed by using additional processors. This topic has been gaining popularity
recently as multiprocessor computers become widely available.
A parallel algorithm is an algorithm that can be executed a piece at a time on many
different processing devices, with the pieces combined at the end to give the correct
result.
The cost or complexity of serial algorithms is estimated in terms of the space (memory)
and time (processor cycles) that they take. Parallel algorithms need to optimize one more
resource, the communication between different processors. Parallel processors communicate
in one of two ways: shared memory or message passing.
This document gives a brief summary of four types of SMP algorithms, which are classified by
their scalability (trend in search speed as the number of processors becomes large) and
their speedup (change in time to complete a search). Typically, programmers use scaling
to mean change in nodes per second (NPS) rates, and speedup to mean change in time to
depth. The algorithms are described below in brief:
ALPHA – BETA SEARCH:
The Alpha-Beta algorithm (Alpha-Beta Pruning, Alpha-Beta Heuristic) is a significant
enhancement to the minimax search algorithm that eliminates the need to search large
portions of the game tree by applying a branch-and-bound technique. Remarkably, it does
this without any risk of overlooking a better move. If one has already found a fairly
good move and is searching for alternatives, a single refutation of an alternative is
enough to reject it. There is no need to look for even stronger refutations.
The algorithm maintains two values, alpha and beta. They represent the minimum
score that the maximizing player is assured of and the maximum score that the minimizing
player is assured of, respectively.
IMPLEMENTATION:
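int alphaBetaMin( int alpha, int beta, int depthleft );  // forward declaration

// Maximizing player: returns a score clamped to the window [alpha, beta].
int alphaBetaMax( int alpha, int beta, int depthleft )
{
    int score;
    if ( depthleft == 0 ) return evaluate();
    for ( all moves )   // pseudocode: make each move, search it, then unmake it
    {
        score = alphaBetaMin( alpha, beta, depthleft - 1 );
        if ( score >= beta )
            return beta;    // fail hard beta-cutoff
        if ( score > alpha )
            alpha = score;  // alpha acts like max in MiniMax
    }
    return alpha;
}

// Minimizing player: the mirror image of alphaBetaMax.
int alphaBetaMin( int alpha, int beta, int depthleft )
{
    int score;
    // negated, assuming evaluate() scores the position for the side to move
    if ( depthleft == 0 ) return -evaluate();
    for ( all moves )   // pseudocode: make each move, search it, then unmake it
    {
        score = alphaBetaMax( alpha, beta, depthleft - 1 );
        if ( score <= alpha )
            return alpha;   // fail hard alpha-cutoff
        if ( score < beta )
            beta = score;   // beta acts like min in MiniMax
    }
    return beta;
}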
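The pseudocode above leaves move generation and evaluate() abstract. As a minimal sketch,
the same fail-hard logic can be run on a small fixed tree. Everything specific here is an
assumption made for illustration: the complete binary tree of depth 3, the hypothetical
leaf_score values, the node numbering, and the names alpha_beta_max, alpha_beta_min and
search_root. Leaf scores are taken from the maximizing player's point of view, so the
minimizing side does not negate them.

```c
#include <limits.h>

#define BRANCH 2   /* assumed: every position has exactly two moves */
#define DEPTH  3   /* assumed: fixed search depth */

/* Hypothetical leaf scores, from the maximizing player's point of view. */
static const int leaf_score[8] = { 3, 5, 6, 9, 1, 2, 0, -1 };

int leaves_visited = 0;   /* counts how many leaves were actually evaluated */

int alpha_beta_min(int node, int alpha, int beta, int depthleft);

/* Maximizing player; the children of node k are k*BRANCH .. k*BRANCH+BRANCH-1. */
int alpha_beta_max(int node, int alpha, int beta, int depthleft)
{
    if (depthleft == 0) { leaves_visited++; return leaf_score[node]; }
    for (int move = 0; move < BRANCH; move++) {
        int score = alpha_beta_min(node * BRANCH + move, alpha, beta, depthleft - 1);
        if (score >= beta) return beta;     /* fail hard beta-cutoff */
        if (score > alpha) alpha = score;
    }
    return alpha;
}

/* Minimizing player. */
int alpha_beta_min(int node, int alpha, int beta, int depthleft)
{
    if (depthleft == 0) { leaves_visited++; return leaf_score[node]; }
    for (int move = 0; move < BRANCH; move++) {
        int score = alpha_beta_max(node * BRANCH + move, alpha, beta, depthleft - 1);
        if (score <= alpha) return alpha;   /* fail hard alpha-cutoff */
        if (score < beta) beta = score;
    }
    return beta;
}

/* Searches the whole tree from the root with a fully open window. */
int search_root(void)
{
    leaves_visited = 0;
    return alpha_beta_max(0, INT_MIN, INT_MAX, DEPTH);
}
```

On this tree search_root() returns the minimax value 5 while evaluating only 5 of the 8
leaves; the other three are cut off by the alpha and beta bounds.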
JAMBOREE SEARCH:
Jamboree Search was introduced by Bradley Kuszmaul in his 1994 thesis, Synchronized
MIMD Computing. This algorithm is a parallelized version of the Scout search
algorithm. The idea is that every child other than the first is tested in parallel, and
any child whose test fails is then valued sequentially.
Jamboree was used in the massively parallel chess programs StarTech and Socrates. It
sequentializes full-window searches for values because, while its authors are willing to
take a chance that an empty-window search will be squandered work, they are not willing
to take the chance that a full-window search (which does not prune very much) will be
squandered work.
IMPLEMENTATION:
int jamboree(CNode n, int α, int β)
{
    if (n is leaf) return static_eval(n);
    c[ ] = the children of n;
    b = -jamboree(c[0], -β, -α);         // value the first child with the full window
    if (b >= β) return b;
    if (b > α) α = b;
    In Parallel: for (i=1; i < |c[ ]|; i++)
    {
        s = -jamboree(c[i], -α - 1, -α); // test: empty-window search
        if (s > b) b = s;
        if (s >= β) abort_and_return s;
        if (s > α)
        {
            // the test failed: value this child sequentially with the full window
            s = -jamboree(c[i], -β, -α);
            if (s >= β) abort_and_return s;
            if (s > α) α = s;
            if (s > b) b = s;
        }
    }
    return b;
}
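The test-then-value logic of the pseudocode above can be checked on a small tree. The
following is only a sequential sketch: the "In Parallel" loop is run one child at a time,
so abort_and_return becomes a plain return and no work is actually done concurrently. The
tree shape (3 moves per position, depth 2), the hypothetical leaf_score values, and the
names jamboree_seq and negamax are assumptions made for illustration. Leaf scores are
taken from the point of view of the side to move at the leaf.

```c
#define BRANCH 3         /* assumed: three moves in every position */
#define DEPTH  2         /* assumed: fixed search depth */
#define INF    1000000   /* assumed bound, larger than any leaf score */

/* Hypothetical leaf scores for the side to move at the leaf. */
static const int leaf_score[9] = { 3, 12, 8,  2, 4, 6,  14, 5, 7 };

/* Plain negamax without pruning, used only as a reference value. */
int negamax(int node, int depthleft)
{
    if (depthleft == 0) return leaf_score[node];
    int best = -INF;
    for (int i = 0; i < BRANCH; i++) {
        int s = -negamax(node * BRANCH + i, depthleft - 1);
        if (s > best) best = s;
    }
    return best;
}

/* Sequential stand-in for jamboree(): same tests and re-searches, no threads. */
int jamboree_seq(int node, int alpha, int beta, int depthleft)
{
    if (depthleft == 0) return leaf_score[node];

    /* The first child is always valued with the full window. */
    int b = -jamboree_seq(node * BRANCH, -beta, -alpha, depthleft - 1);
    if (b >= beta) return b;
    if (b > alpha) alpha = b;

    /* In Jamboree this loop would run in parallel across processors. */
    for (int i = 1; i < BRANCH; i++) {
        int child = node * BRANCH + i;
        int s = -jamboree_seq(child, -alpha - 1, -alpha, depthleft - 1); /* test */
        if (s > b) b = s;
        if (s >= beta) return s;              /* stands in for abort_and_return */
        if (s > alpha) {
            /* The test failed, so value this child with the full window. */
            s = -jamboree_seq(child, -beta, -alpha, depthleft - 1);
            if (s >= beta) return s;
            if (s > alpha) alpha = s;
            if (s > b) b = s;
        }
    }
    return b;
}
```

Calling jamboree_seq(0, -INF, INF, DEPTH) on this tree gives 5, the same value as
negamax(0, DEPTH). A real parallel version would also need a way to start the tests as
concurrent tasks and to abort siblings that are still running, which this sketch omits.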
DEPTH – FIRST SEARCH:
We start the graph traversal at an arbitrary vertex and go down a particular branch until
we reach a dead end. Then we back up and go as deep as possible along another branch. In
this way we visit all of the vertices and edges.
The search is similar to exploring a maze of hallways (edges) and rooms (vertices) with a
string and paint. We fix the string in the starting room and mark the room with the
paint as visited. We then go down an incident hallway into the next room. We mark that
room and go on to the next room, always marking the rooms as visited with the paint. When
we get to a dead end or a room we have already visited, we follow the string back to a room
that has a hallway we have not gone through yet.
This graph traversal is very similar to a tree traversal, either postorder or preorder; in
fact, if the graph is a tree then the traversal is the same. The algorithm is naturally
recursive, just like the tree traversal. The algorithm is outlined here:
IMPLEMENTATION:
Algorithm DFS (graph G, Vertex v)
// Recursive algorithm
    label v as explored
    for all edges e in G.incidentEdges(v) do
        if edge e is unexplored then
            w = G.opposite(v, e)
            if vertex w is unexplored then
                label e as discovery edge
                recursively call DFS(G, w)
            else
                label e as back edge
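The pseudocode above can be sketched in C on a small undirected graph. The adjacency
matrix, the five-vertex example graph (edges 0-1, 0-2, 1-2, 1-3, 3-4) and the names dfs,
explored, edge_label and visit_order are assumptions made for illustration, not part of
the original algorithm description.

```c
#include <stdbool.h>

#define N 5   /* assumed: number of vertices in the example graph */

enum label { UNEXPLORED = 0, DISCOVERY, BACK };

/* Hypothetical undirected graph with edges 0-1, 0-2, 1-2, 1-3, 3-4. */
static const bool adjacent[N][N] = {
    { 0, 1, 1, 0, 0 },
    { 1, 0, 1, 1, 0 },
    { 1, 1, 0, 0, 0 },
    { 0, 1, 0, 0, 1 },
    { 0, 0, 0, 1, 0 },
};

bool explored[N];              /* vertex labels, all false at start */
enum label edge_label[N][N];   /* edge labels, all UNEXPLORED at start */
int visit_order[N];            /* vertices in the order they were reached */
int visited_count;

void dfs(int v)
{
    explored[v] = true;                      /* mark the room with paint */
    visit_order[visited_count++] = v;
    for (int w = 0; w < N; w++) {
        /* Skip missing edges and edges that already carry a label. */
        if (!adjacent[v][w] || edge_label[v][w] != UNEXPLORED)
            continue;
        if (!explored[w]) {
            edge_label[v][w] = edge_label[w][v] = DISCOVERY;
            dfs(w);                          /* go deeper before trying siblings */
        } else {
            edge_label[v][w] = edge_label[w][v] = BACK;
        }
    }
}
```

Starting with dfs(0), the vertices are reached in the order 0, 1, 2, 3, 4. The edge
between 2 and 0 is labelled a back edge, because vertex 0 has already been explored when
it is examined from vertex 2; the other four edges are discovery edges.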
PVS SEARCH:
The best-known early attempt at searching game trees in parallel was the Principal
Variation Splitting (PVS) algorithm. It was both simple to understand and easy to
implement.
When starting an N-ply search, one processor generates the moves at the root position,
makes the first move (leading to what is often referred to as the left-most descendant
position), then generates the moves at ply=2, makes the first move again, and continues
this until reaching ply=N.
At this point, the processor pool searches all of the moves at this ply (N) in parallel, and the
best value is backed up to ply N-1. Now that the lower bound for ply N-1 is known, the rest
of the moves at N-1 are searched in parallel, and the best value is again backed up to N-2. This
continues until the first root move has been searched and its value is known. The
remaining root moves are then searched in parallel until none are left. The next iteration is
then started and the process repeats for depth N+1.
Performance analysis of the PVS algorithm produced the speedups shown in Table 1.
+-------------+-----+-----+-----+-----+-----+
|# processors | 1 | 2 | 4 | 8 | 16 |
+-------------+-----+-----+-----+-----+-----+
|speedup | 1.0 | 1.8 | 3.0 | 4.1 | 4.6 |
+-------------+-----+-----+-----+-----+-----+
Table 1 PVS performance results
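One way to read Table 1 is to divide each speedup by its processor count, which gives the
parallel efficiency (the fraction of the ideal linear speedup that is achieved). The small
sketch below does this for the figures in Table 1; the efficiency measure and the names
efficiency and print_efficiency_table are additions for illustration and do not come from
the original performance analysis.

```c
#include <stdio.h>

#define ROWS 5

/* Figures copied from Table 1. */
static const int    processors[ROWS] = { 1, 2, 4, 8, 16 };
static const double speedup[ROWS]    = { 1.0, 1.8, 3.0, 4.1, 4.6 };

/* Efficiency of row i: achieved speedup divided by the number of processors. */
double efficiency(int i)
{
    return speedup[i] / processors[i];
}

/* Prints one line per column of Table 1. */
void print_efficiency_table(void)
{
    for (int i = 0; i < ROWS; i++)
        printf("%2d processors: speedup %.1f, efficiency %.2f\n",
               processors[i], speedup[i], efficiency(i));
}
```

With these figures the efficiency falls from 0.9 on 2 processors to 0.75 on 4 and to
below 0.3 on 16, which shows how quickly extra processors stop paying off under PVS.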
DRAWBACKS:
Firstly,
All of the processors work together at a single node, searching descendant positions in
parallel. If the number of possible moves is small, or the number of processors is large,
some processors have nothing to do.
Secondly,
The branches from a given position do not produce trees of equal size. Some branches
grow into complicated positions with lots of checks and search extensions that make the
tree very large, while other branches grow into simple positions that are searched quickly.
This leads to a load balancing problem where one processor begins searching a very large
tree and the others finish the easy moves and have to wait for the remaining processor to
slowly traverse the tree. With a reasonable number of processors, the speedup can look
very bad if most of the time many of the processors are waiting on one last node to be
completed before they can back up to ply N-1 and start to work there.
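Both drawbacks can be illustrated with a toy model of a single split point. In the sketch
below each move is handed, in order, to whichever processor becomes idle first, and nobody
can back up until the slowest processor has finished. The model itself, the hypothetical
work figures (subtree sizes in arbitrary units) and the names split_makespan and
split_speedup are assumptions for illustration; they are not measurements from any real
program.

```c
#define MAX_PROCS 64   /* assumed upper limit for this sketch */

/* Hypothetical subtree sizes: one complicated move and three easy ones. */
static const int uneven_work[4] = { 100, 10, 10, 10 };
/* Hypothetical position with only three legal moves of equal size. */
static const int few_moves[3] = { 10, 10, 10 };

/* Time until every move at the split point has been searched. */
int split_makespan(const int *work, int moves, int procs)
{
    int busy_until[MAX_PROCS] = { 0 };
    int makespan = 0;
    if (procs < 1) procs = 1;
    if (procs > MAX_PROCS) procs = MAX_PROCS;
    for (int m = 0; m < moves; m++) {
        int idle = 0;                         /* processor that is free first */
        for (int p = 1; p < procs; p++)
            if (busy_until[p] < busy_until[idle]) idle = p;
        busy_until[idle] += work[m];
        if (busy_until[idle] > makespan) makespan = busy_until[idle];
    }
    return makespan;
}

/* Speedup over one processor searching the same moves one after another. */
double split_speedup(const int *work, int moves, int procs)
{
    int total = 0;
    for (int m = 0; m < moves; m++) total += work[m];
    return (double)total / split_makespan(work, moves, procs);
}
```

In this model split_speedup(uneven_work, 4, 4) is only 1.3, because three processors
finish their easy moves and then wait for the one searching the large subtree. Likewise
split_speedup(few_moves, 3, 16) cannot exceed 3, since 13 of the 16 processors have
nothing to do.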