3. This presentation §1
Understand what data structures are
How they are represented internally
How “fast” each one is and why that is
4. Data structures §1
Classes that offer the means to store and retrieve data,
possibly in a particular order
Implementation is (often) optimised for certain use cases
array is PHP’s oldest and most frequently used data
structure
PHP 5.3 adds support for several others
5. Current SPL data structures §1
SplDoublyLinkedList
SplStack
SplQueue
SplHeap
SplMaxHeap
SplMinHeap
SplPriorityQueue
SplFixedArray
SplObjectStorage
6. Why care? §1
Using the right data structure in the right place could
improve performance
Already implemented and tested: saves work
Can add a type hint in a function definition
Adds semantics to your code
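The type-hint point can be illustrated with a minimal sketch. The function name drainQueue and the job strings are hypothetical, chosen only for this example; the SplQueue hint makes the expected structure explicit and lets PHP reject anything else.

```php
<?php
// Hypothetical helper: the SplQueue type hint documents (and enforces)
// that callers must pass a queue, not an arbitrary array.
function drainQueue(SplQueue $jobs)
{
    $done = array();
    while (!$jobs->isEmpty()) {
        $done[] = $jobs->dequeue(); // oldest job first (FIFO)
    }
    return $done;
}

$jobs = new SplQueue();
$jobs->enqueue('resize image');
$jobs->enqueue('send mail');
print_r(drainQueue($jobs));
```

Passing a plain array to drainQueue() fails immediately, which is the "adds semantics" benefit in practice.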
7. Algorithmic complexity §1
We want to be able to talk about the performance of the
data structure implementation
Running speed (time complexity)
Space consumption (space complexity)
We describe complexity in terms of input size, which is
machine and programming language independent
9. Example §1
for ($i = 0; $i < $n; $i++) {
    for ($j = 0; $j < $n; $j++) {
        echo 'tick';
    }
}
For some n, how many times is “tick” printed? I.e. what is the
time complexity of this algorithm?
n^2 times
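To check the n^2 answer empirically, the nested loops above can be made to count instead of print. This is only a sketch; the function name countTicks is an assumption made for illustration.

```php
<?php
// Same loop structure as the slide, but counting ticks instead of echoing them.
function countTicks($n)
{
    $ticks = 0;
    for ($i = 0; $i < $n; $i++) {
        for ($j = 0; $j < $n; $j++) {
            $ticks++;
        }
    }
    return $ticks;
}

foreach (array(1, 10, 100) as $n) {
    echo $n, ' => ', countTicks($n), PHP_EOL; // 1 => 1, 10 => 100, 100 => 10000
}
```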
10. Talking about complexity §1
Pick a function to act as a bound on the algorithm’s
complexity
Upper bound (here: the worst case)
Denoted O (big-Oh)
“My algorithm will not be slower than this function”
Lower bound (here: the best case)
Denoted Ω (big-Omega)
“My algorithm will be at least as slow as this function”
If they are the same, we write Θ (big-Theta)
In the example: both bounds are n^2, so the algorithm is in Θ(n^2)
13. Example 2 §1
for ($i = 0; $i < $n; $i++) {
    if ($myBool) {
        for ($j = 0; $j < $n; $j++) {
            echo 'tick';
        }
    }
}
What is the time complexity of this algorithm?
O(n^2)
Ω(n) (if $myBool is false)
No Θ! (the upper and lower bounds differ)
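The gap between the two bounds shows up when the steps are counted for both values of $myBool. A rough sketch, assuming one "step" for each evaluation of the if and one for each tick; countSteps is a hypothetical name.

```php
<?php
// Counts one step per outer iteration (the if check) plus one per tick.
function countSteps($n, $myBool)
{
    $steps = 0;
    for ($i = 0; $i < $n; $i++) {
        $steps++; // evaluating the condition happens n times regardless
        if ($myBool) {
            for ($j = 0; $j < $n; $j++) {
                $steps++; // one 'tick'
            }
        }
    }
    return $steps;
}

echo countSteps(10, false), PHP_EOL; // 10  (grows like n)
echo countSteps(10, true), PHP_EOL;  // 110 (grows like n^2)
```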
14. We can be a bit sloppy §1
for ($i = 0; $i < $n; $i++) {
    if ($myBool) {
        for ($j = 0; $j < $n; $j++) {
            echo 'tick';
        }
    }
}
We describe algorithmic behaviour as input size grows to
infinity
Constant factors and smaller terms don’t matter too much
E.g. 3n^2 + 4n + 1 is in O(n^2)
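Why the smaller terms can be ignored: dividing 3n^2 + 4n + 1 by n^2 gives a ratio that settles near the constant 3 as n grows. The sketch below just evaluates that ratio; exampleCost is a hypothetical name for the polynomial.

```php
<?php
// The polynomial from the slide: 3n^2 + 4n + 1.
function exampleCost($n)
{
    return 3 * $n * $n + 4 * $n + 1;
}

// The ratio to n^2 approaches the constant factor 3, so n^2 dominates.
foreach (array(10, 1000, 100000) as $n) {
    printf("n = %6d  cost / n^2 = %.5f\n", $n, exampleCost($n) / ($n * $n));
}
```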
15. Other functions §1
for ($i = 0; $i < $n; $i++) {
    for ($j = 0; $j < $n; $j++) {
        echo 'tick';
    }
}
for ($i = 0; $i < $n; $i++) {
    echo 'tock';
}
This algorithm is still in Θ(n^2): it takes n^2 + n steps, and
the smaller term n does not change the bound.
16. Bounds §1
Figure: Order relations (taken from Cormen et al. 2009)
17. Complexity Comparison §1
Figure: log-scale plot of growth curves labelled logarithmic,
linear, quadratic, exponential, factorial and superexponential
Constant: 1, logarithmic: lg n, linear: n, quadratic: n^2,
exponential: 2^n, factorial: n!, super-exponential: n^n
18. In numbers §1
Approximate growth for n = 50:
1      1
lg n   5.64
n      50
n^2    2500
n^3    125000
2^n    1125899906842624
n!     3.04 × 10^64
n^n    8.88 × 10^84
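The figures above can be reproduced with a few lines of PHP. This is a sketch assuming a 64-bit build, so that 2^50 still fits in an integer; the helper name factorial is chosen for this example, and the two largest values are computed as floats.

```php
<?php
// Hypothetical helper: n! as a float, because 50! overflows a 64-bit integer.
function factorial($n)
{
    $result = 1.0;
    for ($i = 2; $i <= $n; $i++) {
        $result *= $i;
    }
    return $result;
}

$n = 50;
$rows = array(
    'lg n' => log($n, 2),
    'n'    => $n,
    'n^2'  => pow($n, 2),
    'n^3'  => pow($n, 3),
    '2^n'  => pow(2, $n),
    'n!'   => factorial($n),
    'n^n'  => pow($n, $n), // overflows to float automatically
);

foreach ($rows as $label => $value) {
    printf("%-5s %.2e\n", $label, $value);
}
```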
19. Some more notes on complexity §1
Constant time is written 1, but this covers any constant c
Polynomial time contains all functions in n^c for some
constant c
Everything in this presentation will be in polynomial time
21. Credit where credit is due §2
The first three pictures in this section are from Wikipedia
24. SplDoublyLinkedList §2
Figure: doubly linked list holding the values 12, 99 and 37
Superclass of SplStack and SplQueue
SplDoublyLinkedList is not truly a doubly linked list; it
behaves like a hashtable
Usual doubly linked list time complexity
Append/prepend to available node in Θ(1)
Lookup by scanning in O(n)
Access to beginning/end in Θ(1)
SplDoublyLinkedList time complexity
Insert/delete by index in Θ(1)
Lookup by index in Θ(1)
Access to beginning/end in Θ(1)
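The operations listed above map onto the class roughly as follows. This is a usage sketch only: it shows which calls exist, not how fast they are, and the values mirror the 12, 99, 37 list from the figure.

```php
<?php
$list = new SplDoublyLinkedList();
$list->push(99);     // append at the end
$list->push(37);
$list->unshift(12);  // prepend at the beginning

echo $list->bottom(), PHP_EOL; // access to the beginning: 12
echo $list->top(), PHP_EOL;    // access to the end: 37
echo $list[1], PHP_EOL;        // lookup by index: 99

$list->offsetUnset(1);         // delete by index, leaving 12 and 37
echo count($list), PHP_EOL;    // 2
```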
25. SplStack §2
Subclass of SplDoublyLinkedList; adds no new operations
Last-in, first-out (LIFO)
Pop/push value from/on the top of the stack in Θ(1)
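A short LIFO illustration, using placeholder values:

```php
<?php
$stack = new SplStack();
$stack->push('a');
$stack->push('b');
$stack->push('c');

echo $stack->top(), PHP_EOL; // peek at the top without removing it: c
echo $stack->pop(), PHP_EOL; // c (last in, first out)
echo $stack->pop(), PHP_EOL; // b
echo count($stack), PHP_EOL; // 1
```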
Figure: push and pop both operate on the top of the stack
26. SplQueue §2
Subclass of SplDoublyLinkedList; adds enqueue/dequeue
operations
First-in, first-out (FIFO)
Read/dequeue element from front in Θ(1)
Enqueue element to the end in Θ(1)
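The FIFO behaviour in a few lines, again with placeholder values:

```php
<?php
$queue = new SplQueue();
$queue->enqueue('first');
$queue->enqueue('second');
$queue->enqueue('third');

$served = $queue->dequeue();    // removes the element at the front
echo $served, PHP_EOL;          // first (first in, first out)
echo $queue->bottom(), PHP_EOL; // read the new front without removing it: second
echo count($queue), PHP_EOL;    // 2
```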
Figure: elements are enqueued at the end and dequeued from the front
27. Short excursion: trees §2
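Figure: binary tree with root 100; its children are 19 and 36;
below those are 17, 3, 25 and 1; node 17 has children 2 and 7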
Consists of nodes (vertices) and directed edges
Each node always has in-degree 1
Except the root: always in-degree 0
Previous property implies there are no cycles
Binary tree: each node has at most two child-nodes
28. SplHeap, SplMaxHeap and SplMinHeap §2
Figure: max-heap with root 100; its children are 19 and 36;
below those are 17, 3, 25 and 1; node 17 has children 2 and 7
A heap is a tree with the heap property: for all A and B, if
B is a child node of A, then
val(A) ≥ val(B) for a max-heap: SplMaxHeap
val(A) ≤ val(B) for a min-heap: SplMinHeap
Where val(A) denotes the value of node A
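The max-heap property can be checked mechanically. The sketch below assumes the tree is written level by level into a plain array, so the children of index i sit at 2i + 1 and 2i + 2; that layout is an illustration convention here, not a statement about how SplHeap stores its nodes, and isMaxHeap is a hypothetical helper.

```php
<?php
// Returns true if every child is less than or equal to its parent.
function isMaxHeap(array $values)
{
    $count = count($values);
    for ($child = 1; $child < $count; $child++) {
        $parent = (int) (($child - 1) / 2); // index of the parent node
        if ($values[$parent] < $values[$child]) {
            return false; // val(A) >= val(B) is violated
        }
    }
    return true;
}

// The tree from the figure, level by level.
var_dump(isMaxHeap(array(100, 19, 36, 17, 3, 25, 1, 2, 7))); // bool(true)
var_dump(isMaxHeap(array(1, 2, 3)));                         // bool(false)
```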
29. Heaps contd. §2
SplHeap is an abstract superclass
Implemented as binary tree
Access to root element in Θ(1)
Insertion/deletion in O(lg n)
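A small usage sketch: repeatedly extracting the root of an SplMinHeap yields the values in ascending order. The function name sortWithMinHeap is an assumption made for this example.

```php
<?php
// Hypothetical helper: sorts by inserting everything, then extracting the root.
function sortWithMinHeap(array $values)
{
    $heap = new SplMinHeap();
    foreach ($values as $value) {
        $heap->insert($value);      // heap reorders itself on insert
    }

    $sorted = array();
    while (!$heap->isEmpty()) {
        $sorted[] = $heap->extract(); // removes the current minimum (the root)
    }
    return $sorted;
}

print_r(sortWithMinHeap(array(17, 3, 25, 1)));
```

Swapping in SplMaxHeap gives the descending order instead.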
30. SplPriorityQueue §2
Variant of SplMaxHeap: for all A and B, if B is a child
node of A, then prio(A) ≥ prio(B)
Where prio(A) denotes the priority of node A
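In use, the element with the highest priority comes out first, whatever the insertion order. The task names and priority numbers below are made up for illustration.

```php
<?php
$queue = new SplPriorityQueue();
$queue->insert('write slides', 2);        // insert(value, priority)
$queue->insert('fix production bug', 10);
$queue->insert('refill coffee', 5);

$order = array();
while (!$queue->isEmpty()) {
    $order[] = $queue->extract(); // highest remaining priority first
}

print_r($order); // fix production bug, refill coffee, write slides
```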
31. SplFixedArray §2
Fixed-size array with numerical indices only
Efficient OO array implementation
No hashing required for keys
Can make assumptions about array size
Lookup, insertion, deletion in Θ(1) time
Resize in Θ(n)
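A brief sketch of the fixed-size behaviour, with arbitrary sample values: indices outside the declared size are rejected until the array is explicitly resized.

```php
<?php
$array = new SplFixedArray(3); // exactly three slots, indices 0..2
$array[0] = 12;
$array[1] = 99;
$array[2] = 37;

try {
    $array[3] = 1; // outside the fixed size
} catch (RuntimeException $e) {
    echo 'rejected: ', $e->getMessage(), PHP_EOL;
}

$array->setSize(5);               // explicit resize
$array[4] = 7;
echo $array->getSize(), PHP_EOL;  // 5
```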
32. SplObjectStorage §2
Storage container for objects
Insertion, deletion in Θ(1)
Verification of presence in Θ(1)
Missing: set operations
Union, intersection, difference, etc.
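A usage sketch for the basic operations, followed by one possible user-space workaround for a union. The helper storageUnion is hypothetical; it relies on addAll(), which merges another storage in place instead of returning a new set.

```php
<?php
$a = new stdClass();
$b = new stdClass();

$storage = new SplObjectStorage();
$storage->attach($a);                // insertion
var_dump($storage->contains($a));    // bool(true)
var_dump($storage->contains($b));    // bool(false)

// Hypothetical helper: a union that leaves both inputs untouched.
function storageUnion(SplObjectStorage $left, SplObjectStorage $right)
{
    $union = clone $left;   // copy first, because addAll() mutates
    $union->addAll($right);
    return $union;
}

$other = new SplObjectStorage();
$other->attach($b);
$both = storageUnion($storage, $other);
echo count($both), PHP_EOL;          // 2

$storage->detach($a);                // deletion
echo count($storage), PHP_EOL;       // 0
```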
34. Missing in PHP §3
Set data structure
Map/hashtable data structure
Does SplDoublyLinkedList satisfy this use case?
If yes: split it in two separate structures and make
SplDoublyLinkedList a true doubly linked list
Immutable data structures
Allows us to more easily emulate “pure” functions
Fewer bugs in your code due to lack of mutable state
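What an immutable structure could look like in user-space PHP, as a rough sketch: every "modifying" call returns a new object and leaves the original alone. The class ImmutableStack and its method names are invented for this example and are not part of SPL.

```php
<?php
// Hypothetical immutable stack: push() and pop() return new instances.
final class ImmutableStack
{
    private $items;

    public function __construct(array $items = array())
    {
        $this->items = $items;
    }

    public function push($value)
    {
        $items = $this->items; // arrays are copied by value
        $items[] = $value;
        return new self($items);
    }

    public function pop()
    {
        if (count($this->items) === 0) {
            throw new UnderflowException('Stack is empty');
        }
        return new self(array_slice($this->items, 0, -1));
    }

    public function toArray()
    {
        return $this->items;
    }
}

$empty = new ImmutableStack();
$one = $empty->push(1);
$two = $one->push(2);

print_r($one->toArray()); // still just 1: pushing onto $one did not change it
print_r($two->toArray()); // 1, 2
```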
35. Closing remarks §3
Use the SPL data structures!
Choose them with care
Reason about your code’s complexity