data structure
- 2. What this course is about ?
• Data structures: conceptual and concrete ways to organize
data for efficient storage and efficient manipulation
• Use of these data structures in the design of efficient
algorithms
15-Oct-2011
©RohitBirlaDataStructure
RevisionTutorial
2
- 3. Why do we need them ?
• Computers take on more and more complex tasks
• Imagine: index of 8 billion pages ! (Google)
• Software implementation and maintenance is difficult.
• Clean conceptual framework allows for more efficient and
more correct code
- 4. Why do we need them
• Requirements for a good software:
• Clean Design
• Easy maintenance
• Reliable (no core dumps)
• Easy to use
• Fast algorithms
Efficient data structures
Efficient algorithms
- 5. Example
• A collection of 3,000 texts with avg. of 20 lines each, with avg.
10 words / line
• 600,000 words
• Find all occurrences of the word “happy”
• Suppose it takes 1 sec. to check a word for correct matching
• What to do?
- 6. Example (cont’d)
• What to do?
Sol. 1 Sequential matching: 1 sec. x 600,000 words = 166 hours
Sol. 2 Binary searching:
- order the words
- search only half at a time
Ex. Search 25 in 5 8 12 15 15 17 23 25 27
25 ? 15 15 17 23 25 27
25 ? 23 23 25 27
25 ? 25
How many steps?
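The halving idea above can be sketched in C++ as follows. This is a minimal illustration, not the course's reference code: the function name `binarySearch` and the out-parameter `steps` (which counts how many elements were probed) are assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of `target` in the sorted vector `a`, or -1 if absent.
// `steps` (hypothetical out-parameter) counts how many elements were probed.
int binarySearch(const std::vector<int>& a, int target, int& steps) {
    steps = 0;
    std::size_t lo = 0, hi = a.size();        // search the half-open range [lo, hi)
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        ++steps;
        if (a[mid] == target) return static_cast<int>(mid);
        if (a[mid] < target) lo = mid + 1;    // discard the left half
        else                 hi = mid;        // discard the right half
    }
    return -1;
}
```

Each probe discards about half of the remaining range, so the number of probes grows with the logarithm of the input size rather than with the size itself.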
- 7. Some example data structures
• log2(600,000) ≈ 19.2, so about 20 sec. vs. 166 hours!
Set Stack Tree
Data structure = representation and operations associated with
a data type
- 8. Data Structure Philosophy
Each data structure has costs and benefits.
Rarely is one data structure better than another in all situations.
A data structure requires:
• space for each data item it stores,
• time to perform each basic operation,
• programming effort.
- 9. Data Structure Philosophy (cont)
Each problem has constraints on available space and time.
Only after a careful analysis of problem characteristics can we
know the best data structure for the task.
Bank example:
• Start account: a few minutes
• Transactions: a few seconds
• Close account: overnight
- 10. What will you learn?
• What are some of the common data structures
• What are some ways to implement them
• How to analyze their efficiency
• How to use them to solve practical problems
- 11. What you need
• Programming experience with C / C++
• Some Java experience may help as well (but not required)
• Textbook
• Data Structures and Algorithm Analysis in C++
• Mark Allen Weiss
• A Unix account to write, compile and run your C/C++
programs
- 12. Topics
Analysis Tools / ADT
Arrays
Stacks and Queues
Vectors, lists and sequences
Trees
Heaps / Priority Queues
Binary Search Trees –
Search Trees
Hashing / Dictionaries
Sorting
Graphs and graph algorithms
- 13. Problem Solving: Main Steps
1. Problem definition
2. Algorithm design / Algorithm specification
3. Algorithm analysis
4. Implementation
5. Testing
6. [Maintenance]
- 14. 1. Problem Definition
• What is the task to be accomplished?
• Calculate the average of the grades for a given student
• Understand the speeches given by politicians and translate them
into Chinese
• What are the time / space / speed / performance
requirements ?
- 15. Problems
• Problem: a task to be performed.
• Best thought of as inputs and matching outputs.
• Problem definition should include constraints on the resources
that may be consumed by any acceptable solution.
- 16. Problems (cont)
• Problems ⇔ mathematical functions
• A function is a matching between inputs (the domain) and outputs
(the range).
• An input to a function may be a single number, or a collection of
information.
• The values making up an input are called the parameters of the
function.
• A particular input must always result in the same output every
time the function is computed.
- 17. 2. Algorithm Design /
Specifications
• Algorithm: Finite set of instructions that, if followed,
accomplishes a particular task.
• Describe: in natural language / pseudo-code / diagrams /
etc.
• Criteria to follow:
• Input: Zero or more quantities (externally produced)
• Output: One or more quantities
• Definiteness: Clarity, precision of each instruction
• Finiteness: The algorithm has to stop after a finite (may be very
large) number of steps
• Effectiveness: Each instruction has to be basic enough and feasible
• Understand speech
• Translate to Chinese
- 18. Algorithms and Programs
Algorithm: a method or a process followed to solve a problem.
• A recipe.
An algorithm takes the input to a problem (function) and
transforms it to the output.
• A mapping of input to output.
A problem can have many algorithms.
- 19. Algorithm Properties
An algorithm possesses the following properties:
• It must be correct.
• It must be composed of a series of concrete steps.
• There can be no ambiguity as to which step will be
performed next.
• It must be composed of a finite number of steps.
• It must terminate.
A computer program is an instance, or concrete representation,
for an algorithm in some programming language.
- 20. 4, 5, 6: Implementation, Testing,
Maintenance
• Implementation
• Decide on the programming language to use
• C, C++, Lisp, Java, Perl, Prolog, assembly, etc.
• Write clean, well-documented code
• Test, test, test
• Integrate feedback from users, fix bugs, ensure
compatibility across different versions → Maintenance
- 21. 3. Algorithm Analysis
• Space complexity
• How much space is required
• Time complexity
• How much time does it take to run the algorithm
• Often, we deal with estimates!
- 22. Space Complexity
• Space complexity = The amount of memory required by
an algorithm to run to completion
• [Core dumps: the most frequently encountered cause is “memory
leaks”, where the amount of memory required is larger than the memory
available on a given system]
• Some algorithms may be more efficient if the data is
completely loaded into memory
• Need to look also at system limitations
• E.g. Classify 2GB of text in various categories [politics, tourism,
sport, natural disasters, etc.] – can I afford to load the entire
collection?
- 23. Space Complexity (cont’d)
1. Fixed part: The size required to store certain
data/variables, that is independent of the size of the
problem:
- e.g. name of the data collection
- same size for classifying 2GB or 1MB of texts
2. Variable part: Space needed by variables, whose size is
dependent on the size of the problem:
- e.g. actual text
- load 2GB of text VS. load 1MB of text
- 24. Space Complexity (cont’d)
• S(P) = c + S(instance characteristics)
• c = constant
• Example:
float sum(float* a, int n)
{
    float s = 0;
    for (int i = 0; i < n; i++) {
        s += a[i];   /* accumulate the running total */
    }
    return s;
}
Space? One word for n, one for a [passed by reference, so the array is not copied!], one for i, one for s
→ constant space!
- 25. Time Complexity
• Often more important than space complexity
• space available (for computer programs!) tends to be larger and
larger
• time is still a problem for all of us
• 3-4GHz processors on the market
• still …
• researchers estimate that the computation of various
transformations for 1 single DNA chain for one single protein on a 1
THz computer would take about 1 year to run to completion
• Algorithm running time is an important issue
- 26. Running Time
• Problem: prefix averages
• Given an array X
• Compute the array A such that A[i] is the average of elements X[0]
… X[i], for i = 0..n-1
• Sol 1
• At each step i, compute the element A[i] by traversing the array X
from X[0] to X[i], summing its elements and then taking the average
• Sol 2
• At each step i, update a running sum of the elements of the array X
• Compute the element A[i] as sum/(i+1)
Big question: Which solution to choose?
- 27. Running time
[Figure: running time (1 ms to 5 ms) plotted for inputs A to G, marking the worst case, the best case, and the average case in between]
Suppose the program includes an if-then statement that may
execute or not: variable running time
Typically algorithms are measured by their worst case
- 28. Experimental Approach
• Write a program that implements the algorithm
• Run the program with data sets of varying size.
• Determine the actual running time using a system call to
measure time (e.g. system (date) );
• Problems?
- 29. Experimental Approach
• It is necessary to implement and test the algorithm in order to
determine its running time.
• Experiments can be done only on a limited set of inputs, and
may not be indicative of the running time for other inputs.
• The same hardware and software should be used in order to
compare two algorithms. – condition very hard to achieve!
- 30. Use a Theoretical Approach
• Based on high-level description of the algorithms, rather than
language dependent implementations
• Makes possible an evaluation of the algorithms that is
independent of the hardware and software environments
Generality
- 31. Algorithm Description
• How to describe algorithms independent of a programming
language
• Pseudo-Code = a description of an algorithm that is
• more structured than usual prose but
• less formal than a programming language
• (Or diagrams)
• Example: find the maximum element of an array.
Algorithm arrayMax(A, n):
Input: An array A storing n integers.
Output: The maximum element in A.
currentMax ← A[0]
for i ← 1 to n - 1 do
    if currentMax < A[i] then currentMax ← A[i]
return currentMax
- 32. Pseudo Code
• Expressions: use standard mathematical symbols
• use ← for assignment ( ? in C/C++)
• use = for the equality relationship ( ? in C/C++)
• Method Declarations: -Algorithm name(param1, param2)
• Programming Constructs:
• decision structures: if ... then ... [else ..]
• while-loops while ... do
• repeat-loops: repeat ... until ...
• for-loop: for ... do
• array indexing: A[i]
• Methods
• calls: object.method(args)
• returns: return value
• Use comments
• Instructions have to be basic enough and feasible!
- 33. Low Level Algorithm Analysis
• Based on primitive operations (low-level computations
independent from the programming language)
• E.g.:
• Make an addition = 1 operation
• Calling a method or returning from a method = 1 operation
• Index in an array = 1 operation
• Comparison = 1 operation etc.
• Method: Inspect the pseudo-code and count the
number of primitive operations executed by the
algorithm
- 34. Example
Algorithm arrayMax(A, n):
Input: An array A storing n integers.
Output: The maximum element in A.
currentMax ← A[0]
for i ← 1 to n - 1 do
    if currentMax < A[i] then
        currentMax ← A[i]
return currentMax
How many operations ?
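One way to explore this question is to instrument the algorithm. The sketch below is an illustrative C++ translation of arrayMax; the `MaxResult` struct and its `comparisons` counter are assumptions added for this example and only count the `currentMax < A[i]` tests, not every primitive operation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: the maximum plus the number of
// `currentMax < A[i]` comparisons that were executed.
struct MaxResult {
    int max;
    int comparisons;
};

// Direct translation of arrayMax(A, n); assumes A is not empty.
MaxResult arrayMax(const std::vector<int>& A) {
    MaxResult r{A[0], 0};
    for (std::size_t i = 1; i < A.size(); ++i) {
        ++r.comparisons;                  // executed on every iteration
        if (r.max < A[i]) r.max = A[i];   // executed only when a new maximum appears
    }
    return r;
}
```

The comparison runs n - 1 times for any input, while the number of assignments depends on the data, so the total operation count is bounded by a linear function of n.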
- 35. Asymptotic Notation
• Need to abstract further
• Give an “idea” of how the algorithm performs
• n steps vs. n+5 steps
• n steps vs. n^2 steps
- 36. Problem
• Fibonacci numbers
• F[0] = 0
• F[1] = 1
• F[i] = F[i-1] + F[i-2] for i ≥ 2
• Pseudo-code
• Number of operations
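A possible answer to this exercise, sketched in C++ under a few assumptions: the iterative formulation is one choice among several, the out-parameter `additions` is a hypothetical counter added for illustration, and the function is only valid for 0 ≤ n ≤ 93 (larger values overflow 64 bits).

```cpp
#include <cstdint>

// Iterative Fibonacci following F[i] = F[i-1] + F[i-2].
// `additions` (hypothetical out-parameter) counts the + operations performed.
// Assumes 0 <= n <= 93 so the result fits in 64 bits.
std::uint64_t fib(int n, int& additions) {
    additions = 0;
    if (n < 2) return static_cast<std::uint64_t>(n);   // F[0] = 0, F[1] = 1
    std::uint64_t prev = 0, curr = 1;
    for (int i = 2; i <= n; ++i) {
        std::uint64_t next = prev + curr;
        ++additions;
        prev = curr;
        curr = next;
    }
    return curr;
}
```

This version performs n - 1 additions for n ≥ 2, so the number of operations grows linearly with n.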
- 37. Last Time
• Steps in problem solving
• Algorithm analysis
• Space complexity
• Time complexity
• Pseudo-code
- 38. Algorithm Analysis
• Last time:
• Experimental approach – problems
• Low level analysis – count operations
• Abstract even further
• Characterize an algorithm as a function of the “problem size”
• E.g.
• Input data = array → problem size is N (length of array)
• Input data = matrix → problem size is N x M
- 39. Asymptotic Notation
• Goal: to simplify analysis by getting rid of
unneeded information (like “rounding”
1,000,001≈1,000,000)
• We want to say in a formal way 3n^2 ≈ n^2
• The “Big-Oh” Notation:
• given functions f(n) and g(n), we say that f(n) is
O(g(n)) if and only if there are positive constants c
and n0 such that f(n) ≤ c g(n) for all n ≥ n0
- 40. Graphic Illustration
• f(n) = 2n+6
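• According to the definition:
• Need to find a function
g(n) and constants c and n0 such
that f(n) ≤ c g(n) for n ≥ n0
• g(n) = n, c = 4 and n0 = 3 (2n + 6 ≤ 4n whenever n ≥ 3)
• f(n) is O(n)
• The order of f(n) is n
[Figure: plot of g(n) = n, f(n) = 2n + 6 and c g(n) = 4n against n; 4n lies above 2n + 6 from n = 3 onward]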
- 41. More examples
• What about f(n) = 4n^2 ? Is it O(n)?
• We would need a c such that 4n^2 < cn for any n > n0, i.e. n < c/4 for all
large n, which is impossible, so 4n^2 is not O(n)
• 50n^3 + 20n + 4 is O(n^3)
• It would also be correct to say it is O(n^3 + n)
• Not useful, as n^3 exceeds n by far for large values
• It would also be correct to say it is O(n^5)
• OK, but g(n) should be as close as possible to f(n)
• 3log(n) + log(log(n)) = O( ? )
• Simple Rule: Drop lower order
terms and constant factors
- 42. Properties of Big-Oh
• If f(n) is O(g(n)) then a f(n) is O(g(n)) for any constant a > 0.
• If f(n) is O(g(n)) and h(n) is O(g’(n)) then f(n)+h(n) is O(g(n)+g’(n))
• If f(n) is O(g(n)) and h(n) is O(g’(n)) then f(n)h(n) is O(g(n)g’(n))
• If f(n) is O(g(n)) and g(n) is O(h(n)) then f(n) is O(h(n))
• If f(n) is a polynomial of degree d, then f(n) is O(n^d)
• n^x is O(a^n), for any fixed x > 0 and a > 1
• An algorithm of order n to a certain power is better than an algorithm of order a (> 1)
to the power of n
• log(n^x) is O(log n), for x > 0 – how?
• log^x n is O(n^y) for x > 0 and y > 0
• An algorithm of order log n (to a certain power) is better than an algorithm of order n
raised to a power y.
- 43. Asymptotic analysis -
terminology
• Special classes of algorithms:
logarithmic: O(log n)
linear: O(n)
quadratic: O(n^2)
polynomial: O(n^k), k ≥ 1
exponential: O(a^n), a > 1
• Polynomial vs. exponential ?
• Logarithmic vs. polynomial ?
15-Oct-2011
©RohitBirlaDataStructure
RevisionTutorial
43
- 44. Some Numbers
log n   n    n log n   n^2    n^3     2^n
0       1    0         1      1       2
1       2    2         4      8       4
2       4    8         16     64      16
3       8    24        64     512     256
4       16   64        256    4096    65536
5       32   160       1024   32768   4294967296
- 45. Common plots of O( )
[Plot: growth curves, from fastest-growing to slowest: O(2^n), O(n^3), O(n^2), O(n log n), O(n), O(√n), O(log n), O(1)]
- 46. “Relatives” of Big-Oh
• “Relatives” of the Big-Oh
• Ω(f(n)): Big Omega – asymptotic lower bound
• Θ(f(n)): Big Theta – asymptotic tight bound
• Big-Omega – think of it as the inverse of Big-Oh
• g(n) is Ω(f(n)) if f(n) is O(g(n))
• Big-Theta – combine both Big-Oh and Big-Omega
• f(n) is Θ(g(n)) if f(n) is O(g(n)) and f(n) is Ω(g(n))
• Note the difference:
• 3n+3 is O(n) and is Θ(n)
• 3n+3 is O(n^2) but is not Θ(n^2)
- 47. More “relatives”
• Little-oh – f(n) is o(g(n)) if for any c > 0 there is n0 such that f(n)
< c g(n) for n > n0.
• Little-omega
• Little-theta
• 2n+3 is o(n^2)
• 2n + 3 is o(n) ?
- 48. Best, Worst, Average Cases
Not all inputs of a given size take the same time to run.
Sequential search for K in an array of n integers:
• Begin at first element in array and look at each element in turn
until K is found
Best case:
Worst case:
Average case:
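The three cases can be explored with a small C++ sketch. The function name `comparisonsToFind` is an assumption for this example; it returns how many elements are examined rather than the position of K.

```cpp
#include <cstddef>
#include <vector>

// Sequential search: returns the number of elements examined until K is
// found, or a.size() if K is not present.
std::size_t comparisonsToFind(const std::vector<int>& a, int K) {
    std::size_t examined = 0;
    for (int x : a) {
        ++examined;
        if (x == K) break;   // stop at the first match
    }
    return examined;
}
```

With n elements, the best case examines 1 element (K is first), the worst case examines n (K is last or absent), and if K is equally likely to be at any position the average is (n + 1)/2.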
- 49. Example
Remember the algorithm for computing prefix averages
- compute an array A starting with an array X
- every element A[i] is the average of all elements X[j] with j ≤ i
Remember some pseudo-code … Solution 1
Algorithm prefixAverages1(X):
Input: An n-element array X of numbers.
Output: An n-element array A of numbers such that A[i] is the average
of elements X[0], ... , X[i].
Let A be an array of n numbers.
for i ← 0 to n - 1 do
    a ← 0
    for j ← 0 to i do
        a ← a + X[j]
    A[i] ← a/(i + 1)
return array A
Analyze this
- 50. Example (cont’d)
Algorithm prefixAverages2(X):
Input: An n-element array X of numbers.
Output: An n-element array A of numbers such that A[i] is
the average of elements X[0], ... , X[i].
Let A be an array of n numbers.
s ← 0
for i ← 0 to n - 1 do
    s ← s + X[i]
    A[i] ← s/(i + 1)
return array A
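For comparison, both prefix-average algorithms can be sketched in C++ as below. This is an illustrative translation of the pseudo-code; the choice of `std::vector<double>` for X and A is an assumption of the example.

```cpp
#include <cstddef>
#include <vector>

// Solution 1: recompute the sum X[0..i] from scratch for every i -> O(n^2).
std::vector<double> prefixAverages1(const std::vector<double>& X) {
    std::vector<double> A(X.size());
    for (std::size_t i = 0; i < X.size(); ++i) {
        double a = 0;
        for (std::size_t j = 0; j <= i; ++j) a += X[j];
        A[i] = a / (i + 1);
    }
    return A;
}

// Solution 2: keep a running sum s across iterations -> O(n).
std::vector<double> prefixAverages2(const std::vector<double>& X) {
    std::vector<double> A(X.size());
    double s = 0;
    for (std::size_t i = 0; i < X.size(); ++i) {
        s += X[i];
        A[i] = s / (i + 1);
    }
    return A;
}
```

Both functions return the same array; the inner loop of the first performs 1 + 2 + … + n additions in total, while the second performs only n.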
- 51. Back to the original question
• Which solution would you choose?
• O(n^2) vs. O(n)
• Some math …
• properties of logarithms:
log_b(xy) = log_b x + log_b y
log_b(x/y) = log_b x - log_b y
log_b(x^a) = a log_b x
log_b a = log_x a / log_x b
• properties of exponentials:
a^(b+c) = a^b a^c
a^(bc) = (a^b)^c
a^b / a^c = a^(b-c)
b = a^(log_a b)
b^c = a^(c log_a b)
- 52. Important Series
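• Sum of squares: $\sum_{i=1}^{N} i^2 = \frac{N(N+1)(2N+1)}{6} \approx \frac{N^3}{3}$ for large N
• Sum of exponents: $\sum_{i=1}^{N} i^k \approx \frac{N^{k+1}}{|k+1|}$ for large N and k ≠ -1
• Geometric series: $\sum_{i=0}^{N} A^i = \frac{A^{N+1} - 1}{A - 1}$
• Special case when A = 2
• $2^0 + 2^1 + 2^2 + \dots + 2^N = 2^{N+1} - 1$
• Arithmetic series: $S(N) = 1 + 2 + \dots + N = \sum_{i=1}^{N} i = \frac{N(N+1)}{2}$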
- 53. Analyzing recursive algorithms
function foo (param A, param B) {
    statement 1;
    statement 2;
    if (termination condition) {
        return;
    }
    foo(A’, B’);
}
- 54. Solving recursive equations by repeated
substitution
T(n) = T(n/2) + c                  substitute for T(n/2)
     = T(n/4) + c + c              substitute for T(n/4)
     = T(n/8) + c + c + c
     = T(n/2^3) + 3c               in more compact form
     = …
     = T(n/2^k) + kc               “inductive leap”
T(n) = T(n/2^(log n)) + c log n    “choose k = log n”
     = T(n/n) + c log n
     = T(1) + c log n = b + c log n = θ(log n)
- 55. Solving recursive equations by telescoping
T(n) = T(n/2) + c initial equation
T(n/2) = T(n/4) + c so this holds
T(n/4) = T(n/8) + c and this …
T(n/8) = T(n/16) + c and this …
…
T(4) = T(2) + c eventually …
T(2) = T(1) + c and this …
T(n) = T(1) + c log n sum equations, canceling the
terms appearing on both sides
T(n) = θ(log n)
- 56. Problem
• Running time for finding a number in a sorted array
[binary search]
• Pseudo-code
• Running time analysis
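One way to approach this problem is a recursive formulation, which maps directly onto the recurrence T(n) = T(n/2) + c. The sketch below is illustrative only; the function name `binarySearchRec` and the `calls` counter are assumptions introduced for the example.

```cpp
#include <vector>

// Recursive binary search on the sorted range a[lo..hi] (inclusive bounds).
// Returns the index of `key`, or -1 if it is absent.
// `calls` (hypothetical counter) is incremented once per invocation.
int binarySearchRec(const std::vector<int>& a, int key, int lo, int hi, int& calls) {
    ++calls;
    if (lo > hi) return -1;                  // empty range: constant work
    int mid = lo + (hi - lo) / 2;
    if (a[mid] == key) return mid;           // constant work c at this level
    if (a[mid] < key)                        // recurse on one half only: T(n/2)
        return binarySearchRec(a, key, mid + 1, hi, calls);
    return binarySearchRec(a, key, lo, mid - 1, calls);
}
```

Each call does a constant amount of work and then recurses on at most half of the range, so the number of calls is θ(log n) in the worst case, matching the solution of the recurrence.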
- 57. Space/Time Tradeoff Principle
One can often reduce time if one is willing to sacrifice space, or
vice versa.
• Encoding or packing information
Boolean flags
• Table lookup
Factorials
Disk-based Space/Time Tradeoff Principle: The smaller you
make the disk storage requirements, the faster your
program will run.
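The table-lookup item (factorials) can be illustrated with a short C++ sketch. The names `kMaxN`, `factorialTable` and `factorial` are assumptions for this example; the limit of 20 is used because 20! is the largest factorial that fits in an unsigned 64-bit integer.

```cpp
#include <array>
#include <cstdint>

// 20! is the largest factorial that fits in an unsigned 64-bit integer.
constexpr int kMaxN = 20;

// Builds the table once on first use: 21 * 8 bytes of space buys O(1) lookups.
inline const std::array<std::uint64_t, kMaxN + 1>& factorialTable() {
    static const std::array<std::uint64_t, kMaxN + 1> table = [] {
        std::array<std::uint64_t, kMaxN + 1> t{};
        t[0] = 1;
        for (int i = 1; i <= kMaxN; ++i)
            t[i] = t[i - 1] * static_cast<std::uint64_t>(i);
        return t;
    }();
    return table;
}

// Assumes 0 <= n <= kMaxN.
std::uint64_t factorial(int n) { return factorialTable()[n]; }
```

The design trades a small, fixed amount of memory for the time that would otherwise be spent recomputing the product on every call.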
- 58. ADT
• ADT = Abstract Data Types
• A logical view of the data objects together with specifications
of the operations required to create and manipulate them.
• Describe an algorithm – pseudo-code
• Describe a data structure – ADT
- 59. What is a data type?
• A set of objects, each called an instance of the data type.
Some objects are sufficiently important to be provided
with a special name.
• A set of operations. Operations can be realized via
operators, functions, procedures, methods, and special
syntax (depending on the implementing language)
• Each object must have some representation (not
necessarily known to the user of the data type)
• Each operation must have some implementation (also not
necessarily known to the user of the data type)
- 60. What is a representation?
• A specific encoding of an instance
• This encoding MUST be known to implementors of the data
type but NEED NOT be known to users of the data type
• Terminology: "we implement data types using data structures“
- 61. Two varieties of data types
• Opaque data types in which the representation is not known to
the user.
• Transparent data types in which the representation is profitably
known to the user, i.e. the encoding is directly accessible
and/or modifiable by the user.
• Which one do you think is better?
• What are the means provided by C++ for creating opaque data
types?
- 62. Why are opaque data types
better?
• Representation can be changed without affecting user
• Forces the program designer to consider the operations more
carefully
• Encapsulates the operations
• Allows less restrictive designs which are easier to extend and
modify
• Design always done with the expectation that the data type will
be placed in a library of types available to all.
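In C++, one common way to obtain an opaque data type is a class whose representation is private. The sketch below is a hypothetical example (the class `IntStack` and its member names are not from the slides) showing that users depend only on the operations.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical opaque type: users see only the operations.
class IntStack {
public:
    void push(int x) { items_.push_back(x); }
    int pop() {                          // assumes the stack is not empty
        int top = items_.back();
        items_.pop_back();
        return top;
    }
    bool isEmpty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

private:
    // Representation: hidden from users, so it could be replaced by a
    // linked list without changing any code that uses IntStack.
    std::vector<int> items_;
};
```

Because `items_` is private, code that uses `IntStack` cannot read or modify the encoding directly, which is what allows the representation to change without affecting users.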
- 63. How to design a data type
Step 1: Specification
• Make a list of the operations (just their names) you think you
will need. Review and refine the list.
• Decide on any constants which may be required.
• Describe the parameters of the operations in detail.
• Describe the semantics of the operations (what they do) as
precisely as possible.
- 64. How to design a data type
Step 2: Application
• Develop a real or imaginary application to test the
specification.
• Missing or incomplete operations are found as a side-effect of
trying to use the specification.
- 65. How to design a data type
Step 3: Implementation
• Decide on a suitable representation.
• Implement the operations.
• Test, debug, and revise.
- 66. Example - ADT Integer
Name of ADT: Integer
Operation | Description | C/C++
Create | Defines an identifier with an undefined value | int id1;
Assign | Assigns the value of one integer identifier or value to another integer identifier | id1 = id2;
isEqual | Returns true if the values associated with two integer identifiers are the same | id1 == id2;
- 67. Example – ADT Integer
Operation | Description | C/C++
LessThan | Returns true if the value of the first integer identifier is less than the value of the second integer identifier | id1 < id2
Negative | Returns the negative of the integer value | -id1
Sum | Returns the sum of two integer values | id1 + id2
Operation Signatures
Create: identifier → Integer
Assign: Integer → Identifier
IsEqual: (Integer, Integer) → Boolean
LessThan: (Integer, Integer) → Boolean
Negative: Integer → Integer
Sum: (Integer, Integer) → Integer
- 68. More examples
• We’ll see more examples throughout the course
• Stack
• Queue
• Tree
• And more
- 69. Arrays
Array: a set of pairs (index and value)
data structure
For each index, there is a value associated with
that index.
representation (possible)
implemented by using consecutive memory.
- 70. The Array ADT
Objects: A set of pairs <index, value> where for each value of index
there is a value from the set item. Index is a finite ordered set of one or
more dimensions, for example, {0, …, n-1} for one dimension,
{(0,0),(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)} for two dimensions,
etc.
Methods:
for all A ∈ Array, i ∈ index, x ∈ item, j, size ∈ integer
Array Create(j, list) ::= return an array of j dimensions where list is a
j-tuple whose kth element is the size of the
kth dimension. Items are undefined.
Item Retrieve(A, i) ::= if (i ∈ index) return the item associated with
index value i in array A
else return error
Array Store(A, i, x) ::= if (i ∈ index)
return an array that is identical to array
A except the new pair <i, x> has been
inserted else return error
- 71. Arrays in C
int list[5], *plist[5];
list[5]: five integers
list[0], list[1], list[2], list[3], list[4]
*plist[5]: five pointers to integers
plist[0], plist[1], plist[2], plist[3], plist[4]
Implementation of a 1-D array:
Variable   Memory address
list[0]    base address
list[1]    base address + sizeof(int)
list[2]    base address + 2*sizeof(int)
list[3]    base address + 3*sizeof(int)
list[4]    base address + 4*sizeof(int)
- 72. Arrays in C (cont’d)
Compare int *list1 and int list2[5] in C.
Same: both list1 and list2 can be used as pointers to int (list2 decays to a pointer to its first element).
Difference: list2 reserves five locations.
Notations:
list2 - a pointer to list2[0]
(list2 + i) - a pointer to list2[i] (&list2[i])
*(list2 + i) - list2[i]
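- 73. Example
#include <stdio.h>
int one[] = {0, 1, 2, 3, 4}; /* Goal: print out address and value */
void print1(int *ptr, int rows)
{
    int i;
    printf("Address Contents\n");
    for (i = 0; i < rows; i++)
        /* %p with a void * cast is the portable way to print an address */
        printf("%8p%5d\n", (void *)(ptr + i), *(ptr + i));
    printf("\n");
}
Sample output (assuming 2-byte ints and a base address of 1228, shown in decimal):
Address Contents
1228 0
1230 1
1232 2
1234 3
1236 4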
- 74. Other Data Structures
Based on Arrays
• Arrays:
• Basic data structure
• May store any type of elements
Polynomials: defined by a list of coefficients and
exponents
$p(x) = a_1 x^{e_1} + \dots + a_n x^{e_n}$
- degree of polynomial = the largest exponent in a
polynomial
Polynomials A(X) = 3X^20 + 2X^5 + 4, B(X) = X^4 + 10X^3 + 3X^2 + 1
- 75. Polynomial ADT
Objects: a set of ordered pairs of <e_i, a_i>
where a_i in Coefficients and
e_i in Exponents, e_i are integers >= 0
Methods:
for all poly, poly1, poly2 ∈ Polynomial, coef ∈ Coefficients, expon ∈
Exponents
Polynomial Zero( ) ::= return the polynomial p(x) = 0
Boolean IsZero(poly) ::= if (poly) return FALSE
else return TRUE
Coefficient Coef(poly, expon) ::= if (expon ∈ poly) return its
coefficient else return Zero
Exponent Lead_Exp(poly) ::= return the largest exponent in
poly
Polynomial Attach(poly, coef, expon) ::= if (expon ∈ poly) return error
else return the polynomial poly
with the term <coef, expon>
inserted
- 76. Polynomial ADT (cont’d)
Polynomial Remove(poly, expon) ::= if (expon ∈ poly) return the
polynomial poly with the term
whose exponent is expon deleted
else return error
Polynomial SingleMult(poly, coef, expon) ::= return the polynomial
poly • coef • x^expon
Polynomial Add(poly1, poly2) ::= return the polynomial
poly1 +poly2
Polynomial Mult(poly1, poly2) ::= return the polynomial
poly1 • poly2
- 77. Polynomial Addition (1)
#define MAX_DEGREE 101
typedef struct {
int degree;
float coef[MAX_DEGREE];
} polynomial;
Addition(polynomial * a, polynomial * b, polynomial* c)
{
…
}
advantage: easy implementation
disadvantage: wastes space when the polynomial is sparse
Running time?
- 78. Polynomial Addition (2)
• Use one global array to store all polynomials
A(X) = 2X^1000 + 1
B(X) = X^4 + 10X^3 + 3X^2 + 1
index:  0      1    2    3    4    5    6
coef:   2      1    1    10   3    1
exp:    1000   0    4    3    2    0
starta = 0, finisha = 1, startb = 2, finishb = 5, avail = 6
- 79. Polynomial Addition (2) (cont’d)
#define MAX_DEGREE 101
typedef struct {
int exp;
float coef;
} polynomial_term;
polynomial_term terms[3*MAX_DEGREE];
Addition(int starta, int enda, int startb, int endb, int startc, int endc)
{
…
}
advantage: less space
disadvantage: longer code
Running time?
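To make the running-time question concrete, here is a hedged C++ sketch of adding two polynomials stored as term lists sorted by decreasing exponent. It is not the elided Addition function from the slide: it uses `std::vector` instead of the global `terms` array, and the names `Term`, `Poly` and `addPoly` are assumptions for this example.

```cpp
#include <cstddef>
#include <vector>

struct Term {
    int exp;
    float coef;
};

// Hypothetical representation: nonzero terms sorted by decreasing exponent.
using Poly = std::vector<Term>;

// Merges the two term lists, adding coefficients of equal exponents.
Poly addPoly(const Poly& a, const Poly& b) {
    Poly c;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].exp > b[j].exp) {
            c.push_back(a[i++]);
        } else if (a[i].exp < b[j].exp) {
            c.push_back(b[j++]);
        } else {
            float sum = a[i].coef + b[j].coef;
            if (sum != 0.0f) c.push_back({a[i].exp, sum});  // drop cancelled terms
            ++i;
            ++j;
        }
    }
    while (i < a.size()) c.push_back(a[i++]);   // copy the leftovers of a
    while (j < b.size()) c.push_back(b[j++]);   // copy the leftovers of b
    return c;
}
```

Every loop iteration consumes at least one term, so the running time is O(m + n), where m and n are the numbers of nonzero terms, independent of the degree.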
- 81. Sparse Matrix ADT
Objects: a set of triples, <row, column, value>, where row
and column are integers and form a unique combination, and
value comes from the set item.
Methods:
for all a, b ∈ Sparse_Matrix, x ∈ item, i, j, max_col,
max_row ∈ index
Sparse_Matrix Create(max_row, max_col) ::=
return a Sparse_matrix that can hold up to
max_items = max_row × max_col and
whose maximum row size is max_row and
whose maximum column size is max_col.
- 82. Sparse Matrix ADT (cont’d)
Sparse_Matrix Transpose(a) ::=
return the matrix produced by interchanging
the row and column value of every triple.
Sparse_Matrix Add(a, b) ::=
if the dimensions of a and b are the same
return the matrix produced by adding
corresponding items, namely those with
identical row and column values.
else return error
Sparse_Matrix Multiply(a, b) ::=
if number of columns in a equals number of rows in b
return the matrix d produced by multiplying
a by b according to the formula: d[i][j] =
Σ_k (a[i][k] • b[k][j]) where d[i][j] is the (i,j)th
element
else return error.
- 83. Sparse Matrix Representation
(1) Represented by a two-dimensional array.
Sparse matrix wastes space.
(2) Each element is characterized by <row, col, value>.
Sparse_matrix Create(max_row, max_col) ::=
#define MAX_TERMS 101 /* maximum number of terms +1*/
typedef struct {
    int col;
    int row;
    int value;
} term;
term A[MAX_TERMS];
The terms in A should be ordered
based on <row, col>
- 84. Sparse Matrix Operations
• Transpose of a sparse matrix.
• What is the transpose of a matrix?
        row  col  value            row  col  value
a[0]     6    6     8      b[0]     6    6     8
 [1]     0    0    15       [1]     0    0    15
 [2]     0    3    22       [2]     0    4    91
 [3]     0    5   -15       [3]     1    1    11
 [4]     1    1    11       [4]     2    1     3
 [5]     1    2     3       [5]     2    5    28
 [6]     2    3    -6       [6]     3    0    22
 [7]     4    0    91       [7]     3    2    -6
 [8]     5    2    28       [8]     5    0   -15
b is the transpose of a
- 85. Transpose a Sparse Matrix
(1) for each row i
take element <i, j, value> and store it
in element <j, i, value> of the transpose.
difficulty: where to put <j, i, value>?
(0, 0, 15) ====> (0, 0, 15)
(0, 3, 22) ====> (3, 0, 22)
(0, 5, -15) ====> (5, 0, -15)
(1, 1, 11) ====> (1, 1, 11)
Elements have to be moved down very often.
(2) For all elements in column j,
place element <i, j, value> in element <j, i, value>
- 86. Transpose of a Sparse Matrix (cont’d)
void transpose(term a[], term b[])
/* b is set to the transpose of a */
{
    int n, i, j, currentb;
    n = a[0].value;      /* total number of elements */
    b[0].row = a[0].col; /* rows in b = columns in a */
    b[0].col = a[0].row; /* columns in b = rows in a */
    b[0].value = n;
    if (n > 0) { /* non zero matrix */
        currentb = 1;
        for (i = 0; i < a[0].col; i++)
            /* transpose by columns in a */
            for (j = 1; j <= n; j++)
                /* find elements from the current column */
                if (a[j].col == i) {
                    /* element is in current column, add it to b */
                    b[currentb].row = a[j].col;
                    b[currentb].col = a[j].row;
                    b[currentb].value = a[j].value;
                    currentb++;
                }
    }
}
- 87. Linked Lists
• Avoid the drawbacks of fixed size arrays with
• Growable arrays
• Linked lists
- 88. Growable arrays
• Avoid the problem of fixed-size arrays
• Increase the size of the array when needed (i.e. when capacity
is exceeded)
• Two strategies:
• tight strategy (add a constant): f(N) = N + c
• growth strategy (double up): f(N) = 2N
- 89. Tight Strategy
• Add a number k (k = constant) of elements every time the
capacity is exceeded
[Figure: an array of numbered cells that grows by k cells each time it fills up]
Running time?
Total cost of copying: C0 + (C0+k) + … + (C0+Sk), where S = (N – C0) / k
= C0 * (S+1) + k * S*(S+1) / 2 → O(N^2)
©RohitBirlaDataStructure
RevisionTutorial
89
15-Oct-2011
- 90. Tight Strategy
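/* A, size, capacity and k are assumed to be members of the array class */
void insertLast(element o) {
  if (size == capacity) { /* array is full: grow by k */
    capacity += k;
    element* B = new element[capacity];
    for (int i = 0; i < size; i++) {
      B[i] = A[i];
    }
    delete[] A; /* release the old array */
    A = B;
  }
  A[size] = o; /* size is also the index of the next free slot */
  size++;
}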
- 91. Growth Strategy
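• Double the size of the array every time it is needed (i.e., when capacity
is exceeded)
[Figure: array cells numbered 1, 2, 3, ... filling up as the capacity doubles]
Running time?
After i doublings the capacity is $C_0 \cdot 2^i$, so $i = \log_2(N / C_0)$ doublings reach N.
The successive capacities add up to
$C_0 + 2C_0 + 4C_0 + \dots + 2^i C_0 = C_0\,(2^{i+1} - 1) = 2N - C_0$
so the total cost is $O(N)$.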
How does the previous
code change?
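One possible answer to the question above, sketched in C (the slide's own snippet uses C++ `new`). This is a minimal illustration, not the course's implementation: the type `IntVec` and the functions `vec_init`, `vec_insert_last`, and `vec_free` are names assumed for the example. The only real change from the tight strategy is how the new capacity is computed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of ints; all names are illustrative. */
typedef struct {
    int *data;
    size_t size;      /* number of elements stored */
    size_t capacity;  /* number of slots allocated */
} IntVec;

/* Initialise with a starting capacity c0 >= 1. Returns 0 on success. */
int vec_init(IntVec *v, size_t c0) {
    assert(c0 >= 1);
    v->data = malloc(c0 * sizeof *v->data);
    v->size = 0;
    v->capacity = (v->data != NULL) ? c0 : 0;
    return (v->data != NULL) ? 0 : -1;
}

/* Append x, doubling the capacity when the array is full. */
int vec_insert_last(IntVec *v, int x) {
    if (v->size == v->capacity) {
        /* growth strategy: f(N) = 2N (tight strategy would be N + k) */
        size_t new_cap = v->capacity ? v->capacity * 2 : 1;
        int *grown = realloc(v->data, new_cap * sizeof *grown);
        if (grown == NULL) return -1;   /* the old block is still valid */
        v->data = grown;
        v->capacity = new_cap;
    }
    v->data[v->size++] = x;
    return 0;
}

void vec_free(IntVec *v) {
    free(v->data);
    v->data = NULL;
    v->size = v->capacity = 0;
}
```

Starting from a capacity of 2, ten insertions trigger three regrowths (to 4, 8, and 16 slots), whereas a tight strategy with k = 2 would regrow four times and keeps falling further behind as N grows.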
- 92. Linked Lists
• Avoid the drawbacks of fixed size arrays with
• Growable arrays
• Linked lists
- 93. Using Dynamically Allocated Memory (review)
int i, *pi;
float f, *pf;
pi = (int *) malloc(sizeof(int)); /* request memory */
pf = (float *) malloc(sizeof(float));
*pi = 1024;
*pf = 3.14;
printf("an integer = %d, a float = %f\n", *pi, *pf);
free(pi); /* return memory */
free(pf);
- 94. Linked Lists
bat -> cat -> sat -> vat -> NULL
- 95. Insertion
[Figure: the list bat -> cat -> sat -> vat -> NULL with a new node mat being linked in]
Compare this with the insertion in arrays!
- 96. Deletion
[Figure: the list bat, cat, sat, vat, NULL with node mat unlinked; label: dangling reference]
- 97. List ADT
• ADT with position-based methods
• generic methods size(), isEmpty()
• query methods isFirst(p), isLast(p)
• accessor methods first(), last()
before(p), after(p)
• update methods swapElements(p,q),
replaceElement(p,e)
insertFirst(e), insertLast(e)
insertBefore(p,e), insertAfter(p,e)
removeAfter(p)
- 99. Create one Node
[Figure: ptr holds the address of the first node, whose data field contains b a t \0 and whose link field is NULL]
e->name is shorthand for (*e).name
strcpy(ptr->data, "bat");
ptr->link = NULL;
- 103. Deletion
Delete the first node (trail = NULL):
(a) before deletion: 10 -> 50 -> 20 -> NULL   (b) after deletion: 50 -> 20 -> NULL
Delete a node other than the first node:
(a) before deletion: 10 -> 50 -> 20 -> NULL   (b) after deletion: 10 -> 20 -> NULL
- 106. Other List Operations
• swapElements
• insertFirst
• insertLast
• deleteBefore
• deleteLast
- 107. Running Time Analysis
• insertAfter O(?)
• deleteAfter O(?)
• deleteBefore O(?)
• deleteLast O(?)
• insertFirst O(?)
• insertLast O(?)
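To make some of these running times concrete, here is a small C sketch of three of the operations on a singly linked list. It is an illustration under assumptions, not the slides' code: the node type `list_node` and the functions `make_node`, `insert_after`, `delete_after`, and `insert_last` are hypothetical names, and the list keeps no tail pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type modelled on the bat/cat/sat/vat example. */
typedef struct list_node *list_pointer;
struct list_node {
    char data[4];
    list_pointer link;
};

/* Allocate a node holding a word of up to three letters. */
list_pointer make_node(const char *word) {
    list_pointer n = malloc(sizeof *n);
    assert(n != NULL);
    strncpy(n->data, word, sizeof n->data - 1);
    n->data[sizeof n->data - 1] = '\0';
    n->link = NULL;
    return n;
}

/* insertAfter: O(1), only two links change. */
void insert_after(list_pointer p, list_pointer n) {
    n->link = p->link;
    p->link = n;
}

/* deleteAfter: O(1); unlink and free the node that follows p. */
void delete_after(list_pointer p) {
    list_pointer victim = p->link;
    if (victim == NULL) return;
    p->link = victim->link;
    free(victim);
}

/* insertLast without a tail pointer: O(n) walk to the end. */
void insert_last(list_pointer head, list_pointer n) {
    while (head->link != NULL) head = head->link;
    head->link = n;
}
```

Operations that need the predecessor of a node (deleteBefore, deleteLast) also require a walk from the head in a singly linked list, which is what motivates the doubly linked variant.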
- 108. Applications of Linked Lists
• Stacks and Queues Implemented with Linked Lists
• Polynomials Implemented with Linked Lists
• Remember the array based implementation?
• Hint: two strategies, one efficient in terms of space, one in terms of
running time
- 109. Operations on Linked Lists
• Running time?
• insert, remove
• traverse, swap
• How to reverse the elements of a list?
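One common answer to the reversal question is to walk the list once and turn each link around. The sketch below assumes a hypothetical node type `rnode` with an `int` payload; it is an illustration, not code from the slides.

```c
#include <stddef.h>

/* Hypothetical singly linked node; names are illustrative. */
struct rnode {
    int value;
    struct rnode *next;
};

/* Reverse the list in place and return the new head.
   One pass over the list: O(n) time, O(1) extra space. */
struct rnode *reverse_list(struct rnode *head) {
    struct rnode *prev = NULL;
    while (head != NULL) {
        struct rnode *next = head->next;  /* remember the rest of the list */
        head->next = prev;                /* point this node backwards */
        prev = head;
        head = next;
    }
    return prev;
}
```

The old head ends up as the last node, with its link set to NULL.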
- 110. Polynomials: Representation
typedef struct poly_node *poly_pointer;
typedef struct poly_node {
int coef;
int expon;
poly_pointer next;
} poly_node;
poly_pointer a, b, c;
$A(x) = a_{m-1} x^{e_{m-1}} + a_{m-2} x^{e_{m-2}} + \dots + a_0 x^{e_0}$
Node layout: coef | expon | link
- 111. Example
$a = 3x^{14} + 2x^{8} + 1$
a -> (3, 14) -> (2, 8) -> (1, 0) -> null
$b = 8x^{14} - 3x^{10} + 10x^{6}$
b -> (8, 14) -> (-3, 10) -> (10, 6) -> null
- 112. Adding Polynomials
a -> (3, 14) -> (2, 8) -> (1, 0)
b -> (8, 14) -> (-3, 10) -> (10, 6)
a->expon == b->expon: d -> (11, 14)
a->expon < b->expon: d -> (11, 14) -> (-3, 10)
- 113. Adding Polynomials (cont’d)
a->expon > b->expon: d -> (11, 14) -> (-3, 10) -> (2, 8)
- 114. Adding Polynomials (cont’d)
poly_pointer padd(poly_pointer a, poly_pointer b)
{
/* return a polynomial which is the sum of a and b */
poly_pointer front, rear, temp;
int sum;
rear = (poly_pointer) malloc(sizeof(poly_node));
if (IS_FULL(rear)) {
fprintf(stderr, "The memory is full\n");
exit(1);
}
front = rear; /* temporary dummy head node, freed before returning */
while (a && b) {
switch (COMPARE(a->expon, b->expon)) {
- 115. case -1: /* a->expon < b->expon */
attach(b->coef, b->expon, &rear);
b= b->next;
break;
case 0: /* a->expon == b->expon */
sum = a->coef + b->coef;
if (sum) attach(sum,a->expon,&rear);
a = a->next; b = b->next;
break;
case 1: /* a->expon > b->expon */
attach(a->coef, a->expon, &rear);
a = a->next;
}
}
for (; a; a = a->next)
attach(a->coef, a->expon, &rear);
for (; b; b=b->next)
attach(b->coef, b->expon, &rear);
rear->next = NULL;
temp = front; front = front->next; free(temp);
return front;
}
- 116. Analysis
(1) coefficient additions
0 <= number of additions <= min(m, n),
where m (n) denotes the number of terms in A (B).
(2) exponent comparisons
extreme case:
$e_{m-1} > f_{m-1} > e_{m-2} > f_{m-2} > \dots > e_0 > f_0$
m + n - 1 comparisons
(3) creation of new nodes
extreme case:
m + n new nodes
Summary: O(m + n)
- 117. Attach a Term
void attach(int coefficient, int exponent,
poly_pointer *ptr)
{
/* create a new node and attach it to the node pointed to
by ptr. ptr is updated to point to this new node. */
poly_pointer temp;
temp = (poly_pointer) malloc(sizeof(poly_node));
if (IS_FULL(temp)) {
fprintf(stderr, "The memory is full\n");
exit(1);
}
temp->coef = coefficient;
temp->expon = exponent;
(*ptr)->next = temp;
*ptr = temp;
}
- 118. Other types of lists:
• Circular lists
• Doubly linked lists
- 119. Circularly linked lists
circular list vs. chain
[Figure: the terms (3, 14), (2, 8), (1, 0) stored as a circular list referenced by ptr, together with the avail list of free nodes and a temp pointer]
- 120. Operations in a circular list
[Figure: circular list X1 -> X2 -> X3 with pointer a to the first node]
What happens when we insert a node at the front of a circular
linked list?
Problem: we must move down the whole list to find the last node, whose link has to change.
A possible solution:
Keep a pointer to the last node instead.
[Figure: circular list X1 -> X2 -> X3 with pointer a to the last node]
- 121. Insertion
void insertFront (pnode* ptr, pnode node)
{
/* insert node at the front of the circular list whose last
node is *ptr; the head of the list is (*ptr)->next */
if (IS_EMPTY(*ptr)) {
*ptr = node;
node->next = node; /* circular link */
}
else {
node->next = (*ptr)->next; /* (1) */
(*ptr)->next = node; /* (2) */
}
}
[Figure: X1 -> X2 -> X3 with ptr at the last node; link (1) runs from the new node to X1, link (2) from the last node to the new node]
- 122. int length(pnode ptr)
{
pnode temp;
int count = 0;
if (ptr) {
temp = ptr;
do {
count++;
temp = temp->next;
} while (temp!=ptr);
}
return count;
}
List length
- 123. Doubly Linked List
• Keep a pointer to the next and the previous element in the list
typedef struct node *pnode;
typedef struct node {
char data[4];
pnode next;
pnode prev;
} node;
- 124. Doubly Linked List
• Keep header and trailer nodes (sentinels) with no content
• header.prev = null; header.next = first element
• trailer.next = null; trailer.prev = last element
• Update pointers for every operation performed on the list
• How to remove an element from the tail of the list?
- 125. Doubly Linked List –
removeLast()
• Running time?
• How does this
compare to singly
linked lists?
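A minimal C sketch of removeLast() on a doubly linked list with header and trailer sentinels is given below. The types `dnode` and `dlist` and the function names are assumptions made for the example, not the slides' definitions. Because trailer.prev reaches the last node directly, no traversal is needed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical doubly linked list with header/trailer sentinels. */
typedef struct dnode {
    int value;
    struct dnode *prev, *next;
} dnode;

typedef struct {
    dnode header;   /* header.next is the first element */
    dnode trailer;  /* trailer.prev is the last element */
} dlist;

void dlist_init(dlist *l) {
    l->header.prev = NULL;
    l->header.next = &l->trailer;
    l->trailer.prev = &l->header;
    l->trailer.next = NULL;
}

/* insertLast: link a new node just before the trailer, O(1). */
void dlist_insert_last(dlist *l, int value) {
    dnode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->value = value;
    n->prev = l->trailer.prev;
    n->next = &l->trailer;
    l->trailer.prev->next = n;
    l->trailer.prev = n;
}

/* removeLast: O(1). Stores the removed value in *out and returns 0,
   or returns -1 if the list is empty. */
int dlist_remove_last(dlist *l, int *out) {
    dnode *last = l->trailer.prev;
    if (last == &l->header) return -1;
    *out = last->value;
    last->prev->next = &l->trailer;
    l->trailer.prev = last->prev;
    free(last);
    return 0;
}
```

In a singly linked list the same operation costs O(n), since the node before the last one can only be found by walking from the head.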
- 126. Doubly Linked List
• insertFirst
• swapElements
- 129. Linked Representation
Circular linked list
[Figure: a 4 x 4 sparse matrix stored as linked nodes; entries as (row, col, value): (0, 2, 11), (1, 0, 12), (1, 1, 5), (2, 1, -4), (3, 3, -15)]
- 131. Queue
• Stores a set of elements in a particular order
• Queue principle: FIRST IN FIRST OUT
• = FIFO
• It means: the first element inserted is the first one to be
removed
• Example
• The first one in line is the first one to be served
- 132. Queue Applications
• Real life examples
• Waiting in line
• Waiting on hold for tech support
• Applications related to Computer Science
• Threads
• Job scheduling (e.g. Round-Robin algorithm for CPU allocation)
- 134. Applications: Job Scheduling
front  rear  Q[0]  Q[1]  Q[2]  Q[3]  Comments
 -1     -1                           queue is empty
 -1      0   J1                      Job 1 is added
 -1      1   J1    J2                Job 2 is added
 -1      2   J1    J2    J3          Job 3 is added
  0      2         J2    J3          Job 1 is deleted
  1      2               J3          Job 2 is deleted
- 137. Array-based Queue Implementation
• As with the array-based stack implementation, the array is of
fixed size
• A queue of maximum N elements
• Slightly more complicated
• Need to keep track of both front and rear
[Figures: Implementation 1 and Implementation 2]
- 141. Implementation 2: Wrapped Configuration
Can be seen as a circular queue
[Figure: six slots [0] to [5] arranged in a circle. Empty queue: front = 0, rear = 0. After J1, J2, J3 are added: front = 0, rear = 3]
- 142. Full queue
[Figure: two full circular queues with six slots [0] to [5]. Left: front = 0, rear = 5, holding J1 to J5. Right: front = 4, rear = 3, holding J5 to J9]
Leave one empty space when queue is full
Why?
How to test when queue is empty?
How to test when queue is full?
- 145. List-based Queue Implementation: Enqueue
void enqueue(pnode *front, pnode *rear, element item)
{ /* add an element to the rear of the queue; front and rear are
passed by address so that the caller sees the updates */
pnode temp =
(pnode) malloc(sizeof(*temp));
if (IS_FULL(temp)) {
fprintf(stderr, "The memory is full\n");
exit(1);
}
temp->item = item;
temp->next = NULL;
if (*front) (*rear)->next = temp;
else *front = temp;
*rear = temp; }
- 146. Dequeue
element dequeue(pnode *front) {
/* delete an element from the front of the queue */
pnode temp = *front;
element item;
if (IS_EMPTY(*front)) {
fprintf(stderr, "The queue is empty\n");
exit(1);
}
item = temp->item;
*front = temp->next;
free(temp);
return item;
}
- 147. Algorithm Analysis
• enqueue O(?)
• dequeue O(?)
• size O(?)
• isEmpty O(?)
• isFull O(?)
• What if I want the first element to be always at Q[0] ?
- 148. Stacks
• Stack: what is it?
• ADT
• Applications
• Implementation(s)
- 149. What is a stack?
• Stores a set of elements in a particular order
• Stack principle: LAST IN FIRST OUT
• = LIFO
• It means: the last element inserted is the first one to be removed
• Example
• Which is the first element to pick up?
- 150. Last In First Out
[Figure: six snapshots of a stack, listed top first, with top marking the uppermost element: A; B A; C B A; D C B A; E D C B A; D C B A]
- 151. Stack Applications
• Real life
• Pile of books
• Plate trays
• More applications related to computer science
• Program execution stack (read more from your text)
• Evaluating expressions
- 154. Array-based Stack Implementation
• Allocate an array of some size (pre-defined)
• Maximum N elements in stack
• Bottom stack element stored at element 0
• top is the index of the last element pushed
• Increment top when one element is pushed, decrement after
pop
- 160. Algorithm Analysis
• push O(?)
• pop O(?)
• isEmpty O(?)
• isFull O(?)
• What if top is stored at the beginning of the array?
- 161. A Legend
The Towers of Hanoi
• In the great temple of Brahma in Benares, on a brass
plate under the dome that marks the center of the world,
there are 64 disks of pure gold that the priests carry one
at a time between three diamond needles according to
Brahma's immutable law: No disk may be placed on a
smaller disk. In the beginning of the world all 64 disks
formed the Tower of Brahma on one needle. Now,
however, the process of transfer of the tower from one
needle to another is in mid course. When the last disk is
finally in place, once again forming the Tower of Brahma
but on a different needle, then will come the end of the
world and all will turn to dust.
- 162. The Towers of Hanoi
A Stack-based Application
• GIVEN: three poles
• a set of discs on the first pole, discs of different sizes, the smallest
discs at the top
• GOAL: move all the discs from the left pole to the right one.
• CONDITIONS: only one disc may be moved at a time.
• A disc can be placed either on an empty pole or on top of a larger
disc.
- 171. Towers of Hanoi – Recursive Solution
void hanoi (int discs,
Stack fromPole,
Stack toPole,
Stack aux) {
Disc d;
if( discs >= 1) {
/* move the top discs-1 discs out of the way, onto aux */
hanoi(discs-1, fromPole, aux, toPole);
/* move the largest remaining disc to its destination */
d = fromPole.pop();
toPole.push(d);
/* move the discs-1 discs from aux on top of it */
hanoi(discs-1, aux, toPole, fromPole);
}
}
- 172. Is the End of the World
Approaching?
• Problem complexity $2^n$ (moving n discs takes $2^n - 1$ moves)
• 64 gold discs
• Given 1 move a second
600,000,000,000 years until the end of the world
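The figure above can be checked with a short C sketch that follows the recurrence of the recursive solution. The function name `hanoi_moves` is an assumption for the example; the recurrence itself comes from the two recursive calls plus one disc move.

```c
#include <assert.h>
#include <stdint.h>

/* Count the moves made by the recursive solution for n discs:
   T(0) = 0, T(n) = 2 T(n-1) + 1, which solves to 2^n - 1. */
uint64_t hanoi_moves(unsigned n) {
    assert(n <= 64);   /* 2^64 - 1 is the largest count that fits */
    if (n == 0) return 0;
    return 2 * hanoi_moves(n - 1) + 1;
}
```

At one move per second, hanoi_moves(64) seconds divided by the 31,536,000 seconds in a 365-day year comes to roughly 585 billion years, which agrees with the rounded figure on the slide.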
- 173. Applications
• Infix to Postfix conversion
[Evaluation of Expressions]
- 175. Token      Operator                         Precedence  Associativity
( )          function call                    17          left-to-right
[ ]          array element                    17          left-to-right
-> .         struct or union member           17          left-to-right
-- ++        decrement, increment (postfix)   16          left-to-right
-- ++        decrement, increment (prefix)    15          right-to-left
!            logical not                      15          right-to-left
~            one’s complement                 15          right-to-left
- +          unary minus or plus              15          right-to-left
& *          address or indirection           15          right-to-left
sizeof       size (in bytes)                  15          right-to-left
(type)       type cast                        14          right-to-left
* / %        multiplicative                   13          left-to-right
- 176. Token      Operator                         Precedence  Associativity
+ -          binary add or subtract           12          left-to-right
<< >>        shift                            11          left-to-right
> >= < <=    relational                       10          left-to-right
== !=        equality                         9           left-to-right
&            bitwise and                      8           left-to-right
^            bitwise exclusive or             7           left-to-right
|            bitwise or                       6           left-to-right
&&           logical and                      5           left-to-right
||           logical or                       4           left-to-right
- 177. Token      Operator                         Precedence  Associativity
?:           conditional                      3           right-to-left
= += -= /= *= %= <<= >>= &= ^= |=
             assignment                       2           right-to-left
,            comma                            1           left-to-right
- 179. Stack contents while evaluating the postfix expression 6 2 / 3 - 4 2 * +
Token   [0]          [1]    [2]   Top
6       6                         0
2       6            2            1
/       6/2                       0
3       6/2          3            1
-       6/2-3                     0
4       6/2-3        4            1
2       6/2-3        4      2     2
*       6/2-3        4*2          1
+       6/2-3+4*2                 0
- 180. #define MAX_STACK_SIZE 100 /* maximum stack size */
#define MAX_EXPR_SIZE 100 /* max size of expression */
typedef enum {lparen, rparen, plus, minus, times, divide,
mod, eos, operand} precedence;
int stack[MAX_STACK_SIZE]; /* global stack */
char expr[MAX_EXPR_SIZE]; /* input string */
Assumptions:
operators: +, -, *, /, %
operands: single digit integer
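Under these assumptions, evaluating a postfix expression with a stack can be sketched as follows. This is a self-contained illustration rather than the slides' routine: the function name `eval_postfix` and the local stack are assumptions, and the input is taken to be well formed.

```c
#include <assert.h>
#include <ctype.h>

#define MAX_STACK_SIZE 100

/* Evaluate a postfix string of single-digit operands and the
   operators + - * / %, for example "62/3-42*+". */
int eval_postfix(const char *expr) {
    int stack[MAX_STACK_SIZE];
    int top = -1;
    for (const char *p = expr; *p != '\0'; p++) {
        if (*p == ' ') continue;                 /* allow spaces between tokens */
        if (isdigit((unsigned char)*p)) {
            assert(top < MAX_STACK_SIZE - 1);
            stack[++top] = *p - '0';             /* operand: push its value */
            continue;
        }
        assert(top >= 1);                        /* an operator needs two operands */
        int op2 = stack[top--];                  /* right operand is on top */
        int op1 = stack[top--];
        switch (*p) {
        case '+': stack[++top] = op1 + op2; break;
        case '-': stack[++top] = op1 - op2; break;
        case '*': stack[++top] = op1 * op2; break;
        case '/': stack[++top] = op1 / op2; break;
        case '%': stack[++top] = op1 % op2; break;
        default:  assert(!"unknown token");
        }
    }
    assert(top == 0);                            /* exactly one value remains */
    return stack[top];
}
```

For the expression 6 2 / 3 - 4 2 * + the stack never holds more than three values, and the result is 8.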
Infix to Postfix
- 185. Infix to Postfix Conversion
(Intuitive Algorithm)
(1) Fully parenthesize the expression.
a / b - c + d * e - a * c -->
((((a / b) - c) + (d * e)) - (a * c))
(2) Move each operator so that it replaces its corresponding right
parenthesis.
((((a / b) - c) + (d * e)) - (a * c))
(3) Delete all parentheses.
ab/c-de*+ac*-
This method needs two passes over the expression.
- 186. Infix to postfix: a + b * c, where * > +
Token   [0]   [1]   [2]   Top   Output
a                         -1    a
+       +                 0     a
b       +                 0     ab
*       +     *           1     ab
c       +     *           1     abc
eos                       -1    abc*+
The order of the operands is the same in infix and postfix.
- 187. Infix to postfix: a *1 (b + c) *2 d
Token   [0]   [1]   [2]   Top   Output
a                         -1    a
*1      *1                0     a
(       *1    (           1     a
b       *1    (           1     ab
+       *1    (     +     2     ab
c       *1    (     +     2     abc
)       *1                0     abc+        (match ")")
*2      *2                0     abc+*1      (*1 = *2)
d       *2                0     abc+*1d
eos     *2                0     abc+*1d*2
- 188. Rules
(1) Operators are taken out of the stack as long as their
in-stack precedence is higher than or equal to the
incoming precedence of the new operator.
(2) ( has low in-stack precedence, and high incoming
precedence.
       (    )    +    -    *    /    %    eos
isp    0    19   12   12   13   13   13   0
icp    20   19   12   12   13   13   13   0
- 189. precedence stack[MAX_STACK_SIZE];
/* isp and icp arrays -- index is value of precedence
lparen, rparen, plus, minus, times, divide, mod, eos */
static int isp[] = {0, 19, 12, 12, 13, 13, 13, 0};
static int icp[] = {20, 19, 12, 12, 13, 13, 13, 0};
isp: in-stack precedence
icp: incoming precedence
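The two tables are enough to drive a complete conversion. The following C sketch is self-contained and only an illustration: it reuses the enum order and the isp/icp values shown above, but the helper `classify`, the `symbol` table, and the function `infix_to_postfix` are hypothetical names, and the input is assumed to be well formed, with single-character operands and no spaces.

```c
#include <assert.h>
#include <string.h>

typedef enum {lparen, rparen, plus, minus, times, divide, mod, eos, operand} precedence;

/* in-stack and incoming precedence, indexed by the enum above */
static const int isp[] = {0, 19, 12, 12, 13, 13, 13, 0};
static const int icp[] = {20, 19, 12, 12, 13, 13, 13, 0};
static const char symbol[] = "()+-*/%";   /* printable form of each token */

static precedence classify(char c) {
    switch (c) {
    case '(': return lparen;
    case ')': return rparen;
    case '+': return plus;
    case '-': return minus;
    case '*': return times;
    case '/': return divide;
    case '%': return mod;
    case '\0': return eos;
    default:  return operand;
    }
}

/* Convert infix to postfix. out must hold strlen(infix) + 1 characters. */
void infix_to_postfix(const char *infix, char *out) {
    precedence stack[100];
    int top = 0;
    size_t n = 0;
    assert(strlen(infix) < 100);
    stack[0] = eos;                          /* sentinel with the lowest isp */
    for (const char *p = infix; *p != '\0'; p++) {
        precedence token = classify(*p);
        if (token == operand) {
            out[n++] = *p;                   /* operands go straight to the output */
        } else if (token == rparen) {
            while (stack[top] != lparen) {   /* unstack until the matching ( */
                assert(top > 0);             /* fails on unbalanced input */
                out[n++] = symbol[stack[top--]];
            }
            top--;                           /* discard the ( */
        } else {
            while (isp[stack[top]] >= icp[token])
                out[n++] = symbol[stack[top--]];
            stack[++top] = token;
        }
    }
    while (stack[top] != eos)                /* flush the remaining operators */
        out[n++] = symbol[stack[top--]];
    out[n] = '\0';
}
```

Giving ( an incoming precedence of 20 means it is always pushed, while its in-stack precedence of 0 means no operator ever pops it; only a matching ) removes it.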
- 191. Infix to Postfix (cont’d)
/* unstack tokens until left parenthesis */
while (stack[top] != lparen)
print_token(pop(&top));
pop(&top); /* discard the left parenthesis */
}
else {
/* remove and print symbols whose isp is greater
than or equal to the current token’s icp */
while (isp[stack[top]] >= icp[token])
print_token(pop(&top));
push(&top, token);
}
}
while ((token = pop(&top)) != eos)
print_token(token);
printf("\n");
}
- 192. The British Constitution
[Figure: tree rooted at Crown, with nodes Church of England, Cabinet, House of Commons, House of Lords, Supreme Court, Ministers, County Council, Metropolitan police, County Borough Council, Rural District Council]
- 193. More Trees Examples
• Unix / Windows file structure
- 194. Definition of Tree
A tree is a finite set of one or more nodes
such that:
There is a specially designated node called
the root.
The remaining nodes are partitioned into n>=0
disjoint sets T1, ..., Tn, where each of these sets is
a tree.
We call T1, ..., Tn the subtrees of the root.
- 195. Level and Depth
[Figure: tree with 13 nodes. Level 1: A. Level 2: B, C, D. Level 3: E, F (children of B), G (child of C), H, I, J (children of D). Level 4: K, L (children of E), M (child of H). Each node is annotated with its degree and its level.]
Terms illustrated: node (13), degree of a node, leaf (terminal), nonterminal, parent, children, sibling, degree of a tree (3), ancestor, level of a node, height of a tree (4)
- 196. Terminology
The degree of a node is the number of subtrees
of the node
The degree of A is 3; the degree of C is 1.
The node with degree 0 is a leaf or terminal
node.
A node that has subtrees is the parent of the
roots of the subtrees.
The roots of these subtrees are the children of
the node.
Children of the same parent are siblings.
The ancestors of a node are all the nodes
along the path from the root to the node.
- 197. Tree Properties
A
B C
D
G
E F
IH
Property Value
Number of nodes
Height
Root Node
Leaves
Interior nodes
Number of levels
Ancestors of H
Descendants of B
Siblings of E
Right subtree
- 198. Representation of Trees
List Representation
( A ( B ( E ( K, L ), F ), C ( G ), D ( H ( M ), I, J ) ) )
The root comes first, followed by a list of sub-trees
data link 1 link 2 ... link n
How many link fields are
needed in such a representation?
- 199. A Tree Node
• Every tree node:
• object – useful information
• children – pointers to its children nodes
[Figure: a node with pointers to its child nodes]
- 200. Left Child - Right Sibling
A
B C D
E F G H I J
K L M
data
left child right sibling
- 201. Tree ADT
• Objects: any type of objects can be stored in a tree
• Methods:
• accessor methods
• root() – return the root of the tree
• parent(p) – return the parent of a node
• children(p) – returns the children of a node
• query methods
• size() – returns the number of nodes in the tree
• isEmpty() - returns true if the tree is empty
• elements() – returns all elements
• isRoot(p), isInternal(p), isExternal(p)
- 202. Tree Implementation
typedef struct tnode {
int key;
struct tnode* lchild;
struct tnode* sibling;
} *ptnode;
- Create a tree with three nodes (one root & two children)
- Insert a new node (in tree with root R, as a new child at level L)
- Delete a node (in tree with root R, the first child at level L)
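The first exercise above (a root with two children) can be sketched with the left-child/right-sibling node as follows. The struct is the one from the slide; the helpers `make_tnode`, `add_child`, and `count_children` are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tnode {
    int key;
    struct tnode *lchild;   /* leftmost child */
    struct tnode *sibling;  /* next sibling to the right */
} *ptnode;

ptnode make_tnode(int key) {
    ptnode n = malloc(sizeof *n);
    assert(n != NULL);
    n->key = key;
    n->lchild = NULL;
    n->sibling = NULL;
    return n;
}

/* Append child as the last (rightmost) child of parent. */
void add_child(ptnode parent, ptnode child) {
    if (parent->lchild == NULL) {
        parent->lchild = child;
        return;
    }
    ptnode last = parent->lchild;
    while (last->sibling != NULL) last = last->sibling;
    last->sibling = child;
}

/* Count the children of a node by walking its sibling chain. */
int count_children(ptnode p) {
    int n = 0;
    for (ptnode c = p->lchild; c != NULL; c = c->sibling) n++;
    return n;
}
```

A three-node tree is then `root = make_tnode(1)` followed by two `add_child(root, ...)` calls; the second child hangs off the first child's `sibling` field, not off the root.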
- 203. Tree Traversal
• Two main methods:
• Preorder
• Postorder
• Recursive definition
• PREorder:
• visit the root
• traverse in preorder the children (subtrees)
• POSTorder
• traverse in postorder the children (subtrees)
• visit the root
- 205. Postorder
• postorder traversal
Algorithm postOrder(v)
for each child w of v do
recursively perform postOrder(w)
“visit” node v
• du (disk usage) command in Unix
- 206. Preorder Implementation
void preorder(ptnode t) {
ptnode ptr;
display(t->key);
for(ptr = t->lchild; NULL != ptr; ptr = ptr->sibling) {
preorder(ptr);
}
}
- 207. Postorder Implementation
void postorder(ptnode t) {
ptnode ptr;
for(ptr = t->lchild; NULL != ptr; ptr = ptr->sibling) {
postorder(ptr);
}
display(t->key);
}
- 208. Binary Trees
A special class of trees: max degree for each node
is 2
Recursive definition: A binary tree is a finite set
of nodes that is either empty or consists of a root
and two disjoint binary trees called the left
subtree and the right subtree.
Any tree can be transformed into a binary tree
by the left child-right sibling representation.
- 210. ADT Binary Tree
objects: a finite set of nodes either empty or
consisting of a root node, left BinaryTree,
and right BinaryTree.
method:
for all bt, bt1, bt2 ∈ BinTree, item ∈ element
BinTree create()::= creates an empty binary tree
Boolean isEmpty(bt)::= if (bt==empty binary
tree) return TRUE else return FALSE
- 211. BinTree makeBT(bt1, item, bt2)::= return a binary tree
whose left subtree is bt1, whose right subtree is bt2,
and whose root node contains the data item
BinTree leftChild(bt)::= if (isEmpty(bt)) return error
else return the left subtree of bt
element data(bt)::= if (isEmpty(bt)) return error
else return the data in the root node of bt
BinTree rightChild(bt)::= if (isEmpty(bt)) return error
else return the right subtree of bt
- 212. Samples of Trees
[Figure: sample binary trees, including a skewed binary tree (nodes A to E, one per level, levels 1 to 5) and a complete binary tree (nodes A to I)]
- 213. Maximum Number of Nodes in BT
The maximum number of nodes on level i of a
binary tree is $2^{i-1}$, i>=1.
The maximum number of nodes in a binary tree
of depth k is $2^k - 1$, k>=1.
Prove by induction:
$\sum_{i=1}^{k} 2^{i-1} = 2^k - 1$
- 214. Full BT vs. Complete BT
A full binary tree of depth k is a binary tree of
depth k having $2^k - 1$ nodes, k>=0.
A binary tree with n nodes and depth k is
complete iff its nodes correspond to the nodes numbered from 1 to n in the full
binary tree of depth k.
[Figure: a complete binary tree with nodes A to I, and the full binary tree of depth 4 with nodes A to O]
- 215. Binary Tree Representations
If a complete binary tree with n nodes (depth =
$\lfloor \log_2 n \rfloor + 1$) is represented sequentially, then for
any node with index i, 1<=i<=n, we have:
parent(i) is at $\lfloor i/2 \rfloor$ if i!=1. If i=1, i is at the root and
has no parent.
leftChild(i) is at 2i if 2i<=n. If 2i>n, then i has no
left child.
rightChild(i) is at 2i+1 if 2i+1<=n. If 2i+1>n,
then i has no right child.
- 217. Space Overhead (1)
From the Full Binary Tree Theorem:
• Half of the pointers are null.
If leaves store only data, then overhead depends on whether
the tree is full.
Ex: All nodes the same, with two pointers to
children (p = pointer size, d = data size, n = number of nodes):
• Total space required is (2p + d)n
• Overhead: 2pn
• If p = d, this means 2p/(2p + d) = 2/3 overhead.
- 218. Space Overhead (2)
Eliminate pointers from the leaf nodes:
$\frac{\frac{n}{2}(2p)}{\frac{n}{2}(2p) + dn} = \frac{p}{p + d}$
This is 1/2 if p = d.
If data is stored only at the leaves, the overhead is 2p/(2p + d), which is 2/3 when p = d.
Note that some method is needed to distinguish leaves from
internal nodes.
- 219. Array Implementation (1)
Position 0 1 2 3 4 5 6 7 8 9 10 11
Parent -- 0 0 1 1 2 2 3 3 4 4 5
Left Child 1 3 5 7 9 11 -- -- -- -- -- --
Right Child 2 4 6 8 10 -- -- -- -- -- -- --
Left Sibling -- -- 1 -- 3 -- 5 -- 7 -- 9 --
Right Sibling -- 2 -- 4 -- 6 -- 8 -- 10 -- --
- 220. Array Implementation (1)
Parent (r) =
Leftchild(r) =
Rightchild(r) =
Leftsibling(r) =
Rightsibling(r) =
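One way to fill in the formulas above, for a complete binary tree of n nodes stored at positions 0 to n-1 as in the preceding table, is sketched below in C. The function names and the `NONE` marker are assumptions made for the example; note that the 0-based formulas differ by one from the 1-based ones (parent at i/2, children at 2i and 2i+1).

```c
#define NONE (-1)   /* stands for the "--" entries in the table */

/* Index arithmetic for a complete binary tree of n nodes stored in
   positions 0 .. n-1 of an array. */
int parent(int r)               { return r > 0 ? (r - 1) / 2 : NONE; }
int left_child(int r, int n)    { return 2 * r + 1 < n ? 2 * r + 1 : NONE; }
int right_child(int r, int n)   { return 2 * r + 2 < n ? 2 * r + 2 : NONE; }

/* Right children sit at even positions, left children at odd ones. */
int left_sibling(int r)         { return (r > 0 && r % 2 == 0) ? r - 1 : NONE; }
int right_sibling(int r, int n) { return (r % 2 == 1 && r + 1 < n) ? r + 1 : NONE; }
```

With n = 12 these reproduce the table: position 5 has left child 11 and no right child, and position 11 has parent 5 and no siblings.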
- 221. Linked Representation
typedef struct tnode *ptnode;
typedef struct tnode {
int data;
ptnode left, right;
} tnode;
Node layout: left | data | right
- 222. Binary Tree Traversals
Let L, V, and R stand for moving left, visiting
the node, and moving right.
There are six possible combinations of traversal:
LVR, LRV, VLR, VRL, RVL, RLV
Adopting the convention that we traverse left before
right, only 3 traversals remain:
LVR, LRV, VLR
inorder, postorder, preorder
- 223. Arithmetic Expression Using BT
[Figure: expression tree for A / B * C * D + E, with + at the root]
inorder traversal: A / B * C * D + E (infix expression)
preorder traversal: + * * / A B C D E (prefix expression)
postorder traversal: A B / C * D * E + (postfix expression)
level order traversal: + * E * D / C A B
- 224. Inorder Traversal (recursive version)
void inorder(ptnode ptr)
/* inorder tree traversal */
{
if (ptr) {
inorder(ptr->left);
printf("%d", ptr->data);
inorder(ptr->right);
}
}
A / B * C * D + E
- 225. Preorder Traversal(recursive version)
void preorder(ptnode ptr)
/* preorder tree traversal */
{
if (ptr) {
printf("%d", ptr->data);
preorder(ptr->left);
preorder(ptr->right);
}
}
+ * * / A B C D E
- 226. Postorder Traversal(recursive version)
void postorder(ptnode ptr)
/* postorder tree traversal */
{
if (ptr) {
postorder(ptr->left);
postorder(ptr->right);
printf("%d", ptr->data);
}
}
A B / C * D * E +
- 227. Level Order Traversal
(using queue)
void levelOrder(ptnode ptr)
/* level order tree traversal */
{
int front = 0, rear = 0;
ptnode queue[MAX_QUEUE_SIZE];
if (!ptr) return; /* empty tree */
enqueue(front, &rear, ptr);
for (;;) {
ptr = dequeue(&front, rear);
- 228. if (ptr) {
printf("%d", ptr->data);
if (ptr->left)
enqueue(front, &rear,
ptr->left);
if (ptr->right)
enqueue(front, &rear,
ptr->right);
}
else break;
}
}
+ * E * D / C A B
- 229. Euler Tour Traversal
• generic traversal of a binary tree
• the preorder, inorder, and postorder traversals are special
cases of the Euler tour traversal
• “walk around” the
tree and visit each
node three times:
• on the left
• from below
• on the right
- 230. Euler Tour Traversal (cont’d)
eulerTour(node v) {
perform action for visiting node on the left;
if v is internal then
eulerTour(v->left);
perform action for visiting node from below;
if v is internal then
eulerTour(v->right);
perform action for visiting node on the right;
}
- 231. Euler Tour Traversal (cont’d)
• preorder traversal = Euler Tour with a “visit” only on the left
• inorder = ?
• postorder = ?
• Other applications: compute number of descendants for
each node v:
• counter = 0
• increment counter each time node is visited on the left
• #descendants = counter when node is visited on the right –
counter when node is visited on the left +
1
• Running time for Euler Tour?
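The descendant-counting application above can be sketched in C as follows. The node type `enode` and the function `euler_count` are hypothetical; the count stored for each node includes the node itself, matching the "+ 1" in the formula.

```c
#include <stddef.h>

/* Hypothetical binary tree node that records its subtree size. */
struct enode {
    struct enode *left, *right;
    int descendants;   /* filled in by the tour; counts the node itself */
};

/* Euler tour that increments *counter at every "left" visit and takes
   the difference between the "right" and "left" visits. */
void euler_count(struct enode *v, int *counter) {
    if (v == NULL) return;
    int on_left = ++*counter;                 /* visit on the left */
    euler_count(v->left, counter);
    /* visit from below: nothing to do for this application */
    euler_count(v->right, counter);
    v->descendants = *counter - on_left + 1;  /* visit on the right */
}
```

Each node is visited a constant number of times, so the whole tour runs in O(n) for a tree of n nodes.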
- 232. Application: Evaluation of Expressions
[Figure: expression tree for A / B * C * D + E, with + at the root]
inorder traversal: A / B * C * D + E (infix expression)
preorder traversal: + * * / A B C D E (prefix expression)
postorder traversal: A B / C * D * E + (postfix expression)
level order traversal: + * E * D / C A B
- 236. Application:
Propositional Calculus Expression
• A variable is an expression.
• If x and y are expressions, then ¬x, x ∧ y,
x ∨ y are expressions.
• Parentheses can be used to alter the normal order of evaluation
(¬ > ∧ > ∨).
• Example: x1 ∨ (x2 ∧ ¬x3)
- 238. Node Structure
Node layout: left | data | value | right
typedef enum {not, and, or, true, false} logical;
typedef struct tnode *ptnode;
typedef struct tnode {
logical data;
short int value;
ptnode right, left;
} tnode;
- 239. Postorder Eval
void post_order_eval(ptnode node)
{
/* modified post order traversal to evaluate a propositional calculus tree */
if (node) {
post_order_eval(node->left);
post_order_eval(node->right);
switch(node->data) {
case not: node->value =
!node->right->value;
break;
- 240. Postorder Eval (cont’d)
case and: node->value =
node->right->value &&
node->left->value;
break;
case or: node->value =
node->right->value ||
node->left->value;
break;
case true: node->value = TRUE;
break;
case false: node->value = FALSE;
}
}
}
- 241. A Taxonomy of Trees
• General Trees – any number of children / node
• Binary Trees – max 2 children / node
• Heaps – parent < (>) children
• Binary Search Trees
- 242. Binary Trees
• Binary search tree
• Every element has a unique key.
• The keys in a nonempty left subtree (right subtree) are smaller
(larger) than the key in the root of subtree.
• The left and right subtrees are also binary search trees.
- 243. Binary Search Trees
• Binary Search Trees (BST) are a type of Binary
Trees with a special organization of data.
• This data organization leads to O(log n)
complexity for searches, insertions and deletions
in certain types of the BST (balanced trees).
• O(h) in general
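- 244. [Figure: sorted array 34 41 56 63 72 89 95 at indices 0 to 6. After a comparison with the middle item 63, the search continues in 34 41 56 (indices 0 to 2) or in 72 89 95 (indices 4 to 6), and then in a single item: 34 or 56, 72 or 95]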
Binary Search algorithm of an array of sorted items
reduces the search space by one half after each comparison
Binary Search Algorithm
- 245. Organization Rule for BST
        63
      /    \
    41      89
   /  \    /  \
 34   56  72   95
• the values in all nodes in the left subtree of a node are less than
the node value
• the values in all nodes in the right subtree of a node are greater
than the node value
- 246. Binary Tree
typedef struct tnode *ptnode;
typedef struct tnode {
    short int key;
    ptnode right, left;
} tnode;
- 247. Searching in the BST
method search(key)
• implements the binary search based on comparison of the items
in the tree
• the items in the BST must be comparable (e.g., integers, strings)
The search starts at the root. It probes down, comparing the
value in each node with the target, until it finds the first item equal
to the target. It returns this item, or null if there is none.
BST Operations: Search
- 248. Search in BST - Pseudocode
if the tree is empty
    return NULL
else if the item in the node equals the target
    return the node value
else if the item in the node is greater than the target
    return the result of searching the left subtree
else if the item in the node is smaller than the target
    return the result of searching the right subtree
- 249. Search in a BST: C code
ptnode search(ptnode root, int key)
{
    /* return a pointer to the node that contains key.
       If there is no such node, return NULL */
    if (!root) return NULL;
    if (key == root->key) return root;
    if (key < root->key)
        return search(root->left, key);
    return search(root->right, key);
}
- 250. BST Operations: Insertion
method insert(key)
• places a new item near the frontier of the BST while retaining its organization of data:
• starting at the root, it probes down the tree until it finds a node whose left or right
pointer is empty and is a logical place for the new value
• uses a binary search to locate the insertion point
• is based on comparisons of the new item and values of nodes in the BST
• Elements in nodes must be comparable!
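- 251. Case 1: The Tree is Empty
Set the root to a new node containing the item
Case 2: The Tree is Not Empty
Call a recursive helper method to insert the item
[Figure: inserting 10 into the BST with root 7, children 5 and 9, and leaves 4, 6, 8. Since 10 > 7 and 10 > 9, the new node 10 becomes the right child of 9.]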
- 252. Insertion in BST - Pseudocode
if tree is empty
    create a root node with the new key
else
    compare key with the top node
    if key = node key
        replace the node with the new value
    else if key > node key
        compare key with the right subtree:
            if subtree is empty create a leaf node
            else add key in right subtree
    else (key < node key)
        compare key with the left subtree:
            if the subtree is empty create a leaf node
            else add key to the left subtree
- 253. Insertion into a BST: C code
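#include <stdio.h>
#include <stdlib.h>

void insert(ptnode *node, int key)
{
    /* Insert key into the BST rooted at *node; a key that is
       already present is left untouched. */
    ptnode ptr, parent = NULL, cur = *node;
    while (cur) {                        /* find the parent of the new node */
        if (key == cur->key) return;     /* key already in the tree */
        parent = cur;
        cur = (key < cur->key) ? cur->left : cur->right;
    }
    ptr = (ptnode) malloc(sizeof(*ptr));
    if (!ptr) {
        fprintf(stderr, "The memory is full\n");
        exit(1);
    }
    ptr->key = key;
    ptr->left = ptr->right = NULL;
    if (parent) {
        if (key < parent->key) parent->left = ptr;
        else parent->right = ptr;
    }
    else *node = ptr;                    /* the tree was empty */
}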
- 254. BST Shapes
The order in which the data is supplied determines where each item is
placed in the BST, which in turn determines the shape of the BST.
Create BSTs from the same set of data presented each time
in a different order:
a) 17 4 14 19 15 7 9 3 16 10
b) 9 10 17 4 3 7 14 16 15 19
c) 19 17 16 15 14 10 9 7 4 3 (can you guess this shape?)
- 255. BST Operations: Removal
• removes a specified item from the BST and adjusts the tree
• uses a binary search to locate the target item:
starting at the root, it probes down the tree until it finds the target or reaches a leaf
node (target not in the tree)
• removal of a node must not leave a ‘gap’ in the tree
- 256. Removal in BST - Pseudocode
method remove (key)
I   if the tree is empty return false
II  Attempt to locate the node containing the target using the
    binary search algorithm
    if the target is not found return false
    else the target is found, so remove its node:
        Case 1: if the node has 2 empty subtrees
            - replace the link in the parent with null
        Case 2: if the node has a left and a right subtree
            - replace the node's value with the max value in the
              left subtree
            - delete the max node in the left subtree
- 257. Removal in BST - Pseudocode (cont’d)
        Case 3: if the node has no left child
            - link the parent of the node
              to the right (non-empty) subtree
        Case 4: if the node has no right child
            - link the parent of the target
              to the left (non-empty) subtree
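- 258. Removal in BST: Example
Case 1: removing a node with 2 EMPTY SUBTREES
Removing 4: replace the link in the parent with null
[Figure: before the removal, the BST has root 7, children 5 and 9, and leaves 4, 6, 8, 10, with the cursor and its parent marked. After the removal, the leaves are 6, 8, 10.]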
- 259. Removal in BST: Example
Case 2: removing a node with 2 SUBTREES
Removing 7:
- replace the node's value with the max value in the left subtree
- delete the max node in the left subtree
[Figure: before the removal, the root 7 (cursor) has children 5 and 9 and leaves 4, 6, 8, 10. After the removal, the root holds 6, with children 5 and 9 and leaves 4, 8, 10.]
What other element can be used as replacement?
- 260. Removal in BST: Example
Case 3: removing a node with 1 EMPTY SUBTREE
the node has no left child:
link the parent of the node to the right (non-empty) subtree
[Figure: the BST with nodes 9, 7, 5, 6, 8, 10 before and after the removal, with the cursor and its parent marked.]
- 261. Removal in BST: Example
Case 4: removing a node with 1 EMPTY SUBTREE
Removing 5
the node has no right child:
link the parent of the node to the left (non-empty) subtree
[Figure: the BST with nodes 9, 7, 5, 4, 8, 10 before and after the removal, with the cursor and its parent marked.]
- 262. Analysis of BST Operations
The complexity of the operations get, insert and
remove in a BST is O(h), where h is the height of the tree.
This is O(log n) when the tree is balanced. The updating
operations can cause the tree to become unbalanced.
The tree can degenerate to a linear shape, and the
operations then become O(n).
- 263. Best Case
BST tree = new BST();
tree.insert ("E");
tree.insert ("C");
tree.insert ("D");
tree.insert ("A");
tree.insert ("H");
tree.insert ("F");
tree.insert ("K");
Output:
>>>> Items in advantageous order:
K
H
F
E
D
C
A
- 264. Worst Case
BST tree = new BST();
for (int i = 1; i <= 8; i++)
    tree.insert (i);
Output:
>>>> Items in worst order:
8
7
6
5
4
3
2
1
- 265. Random Case
tree = new BST ();
for (int i = 1; i <= 8; i++)
    tree.insert(random());
Output:
>>>> Items in random order:
X
U
P
O
H
F
B
- 266. Applications for BST
• Sorting with binary search trees
• Input: unsorted array
• Output: sorted array
• Algorithm ?
• Running time ?
- 267. Better Search Trees
Prevent the degeneration of the BST:
• A BST can be set up to maintain balance during updating
operations (insertions and removals)
• Types of search trees that keep the operations efficient:
• splay trees
• AVL trees
• 2-4 Trees
• Red-Black trees
• B-trees
- 268. Trees: A Review (again? )
• General trees
• one parent, N children
• Binary tree
• ISA General tree
• + max 2 children
• Binary search tree
• ISA Binary tree
• + left subtree < parent < right subtree
• AVL tree
• ISA Binary search tree
• + | height left subtree – height right subtree | ≤ 1
- 269. Trees: A Review (cont’d)
• Multi-way search tree
• ISA General tree
• + Each node has K keys and K+1 children
• + All keys in child K < key K < all keys in child K+1
• 2-4 Tree
• ISA Multi-way search tree
• + All nodes have at most 3 keys / 4 children
• + All leaves are at the same level
• B-Tree
• ISA Multi-way search tree
• + All nodes have at least T keys, at most 2T(+1) keys
• + All leaves are at the same level
- 270. Tree Applications
• Data Compression
• Huffman tree
• Automatic Learning
• Decision trees
- 271. Huffman code
• Very often used for text compression
• Do you know how gzip or winzip works?
• Compression methods
• ASCII code uses codes of equal length for all letters: how
many codes?
• Today’s alternative to ASCII?
• Idea behind Huffman code: use shorter length codes for
letters that are more frequent
- 272. Huffman Code
• Build a list of letters and frequencies
“have a great day today”
• Build a Huffman Tree bottom up, by grouping letters with
smaller occurrence frequencies
- 273. Huffman Codes
• Write the Huffman codes for the strings
• “abracadabra”
• “Veni Vidi Vici”
- 274. Huffman Code
• Running time?
• Suppose N letters in input string, with L unique letters
• What is the most important factor for obtaining highest
compression?
• Compare: [assume a text with a total of 1000 characters]
• I. Three different characters, each occurring the same number of
times
• II. 20 different characters, 19 of them occurring only once, and
the 20th occurring the rest of the time
- 275. Huffman Coding Trees
ASCII codes: 8 bits per character.
• Fixed-length coding.
Can take advantage of relative frequency of
letters to save space.
• Variable-length coding
Build the tree with minimum external path weight.
Letter:     Z   K   F   C   U   D   L   E
Frequency:  2   7   24  32  37  42  42  120
- 278. Assigning Codes
Letter Freq Code Bits
C 32
D 42
E 120
F 24
K 7
L 42
U 37
Z 2
- 279. Coding and Decoding
A set of codes is said to meet the prefix property if no code in
the set is the prefix of another.
Code for DEED:
Decode 1011001110111101:
Expected cost per letter:
- 280. One More Application
• Heuristic Search
• Decision Trees
• Given a set of examples, with an associated decision (e.g.
good/bad, +/-, pass/fail, caseI/caseII/caseIII, etc.)
• Attempt to take (automatically) a decision when a new
example is presented
• Predict the behavior in new cases!
- 281. Data Records
Name A B C D E F G Class
1. Jeffrey B. 1 0 1 0 1 0 1 -
2. Paul S. 0 1 1 0 0 0 1 -
3. Daniel C. 0 0 1 0 0 0 0 -
4. Gregory P. 1 0 1 0 1 0 0 -
5. Michael N. 0 0 1 1 0 0 0 -
6. Corinne N. 1 1 1 0 1 0 1 +
7. Mariyam M. 0 1 0 1 0 0 1 +
8. Stephany D. 1 1 1 1 1 1 1 +
9. Mary D. 1 1 1 1 1 1 1 +
10. Jamie F. 1 1 1 0 0 1 1 +
- 282. Fields in the Record
A: First name ends in a vowel?
B: Neat handwriting?
C: Middle name listed?
D: Senior?
E: Got extra-extra credit?
F: Google brings up home page?
G: Google brings up reference?
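- 283. Build a Classification Tree
Internal nodes: features
Leaves: classification
[Figure: a classification tree with root F and internal nodes A, D and A, with branches labeled 0 and 1. Its leaves hold the record groups 8,9; 2,3,7; 1,4,5,6; and 10.]
Error: 30%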
- 284. Different Search Problem
Given a set of data records with their classifications, pick a
decision tree: search problem!
Challenges:
• Scoring function?
• Large space of trees.
What’s a good tree?
• Low error on given set of records
• Small
- 285. “Perfect” Decision Tree
[Figure: a decision tree over the features C (middle name?), E (EEC?), B (Neat?) and F (Google?), with branches and leaves labeled 0 and 1.]
Training set Error: 0%
(can always do this?)
- 286. Search For a Classification
• Classify new records
New1. Mike M. 1 0 1 1 0 0 1 ?
New2. Jerry K. 0 1 0 1 0 0 0 ?
- 287. Heaps
• A heap is a binary tree T that stores key-element pairs at its
internal nodes
• It satisfies two properties:
• MinHeap: key(parent) ≤ key(child)
• [OR MaxHeap: key(parent) ≥ key(child)]
• all levels are full, except the last one, which is left-filled
[Figure: example min-heap, listed level by level]
4
5 6
15 9 7 20
16 25 14 12 11 8
- 288. What are Heaps Useful for?
• To implement priority queues
• Priority queue = a queue where all elements have a “priority”
associated with them
• Remove in a priority queue removes the element with the
smallest priority
• insert
• removeMin
- 289. Heap or Not a Heap?
- 290. Heap Properties
• A heap T storing n keys has height h = ⌈log(n + 1)⌉, which
is O(log n)
[Figure: example min-heap with n = 13 keys, listed level by level]
4
5 6
15 9 7 20
16 25 14 12 11 8
- 291. ADT for Min Heap
objects: n > 0 elements organized in a binary tree so that the value in each
node is no larger than those in its children
methods:
Heap Create(max_size) ::= create an empty heap that can
                          hold a maximum of max_size elements
Boolean HeapFull(heap, n) ::= if (n == max_size) return TRUE
                              else return FALSE
Heap Insert(heap, item, n) ::= if (!HeapFull(heap, n)) insert
                               item into heap and return the resulting heap
                               else return error
Boolean HeapEmpty(heap, n) ::= if (n > 0) return FALSE
                               else return TRUE
Element Delete(heap, n) ::= if (!HeapEmpty(heap, n)) return one
                            instance of the smallest element in the heap
                            and remove it from the heap
                            else return error
- 293. Heap Insertion
• Add key in next available position
- 296. Heap Insertion
• Terminate upheap when
• the root is reached, or
• the key of the child is greater than or equal to the key of its parent