Chapter 4
Algorithmic Complexity/Efficiency
To think about the complexity of computation, we need a model of reality. As with
everything else in the real world, we cannot handle the full complexity, so we make
some simplifications that enable us to reason about the world.
A common model in computer science is the RAM (random access machine)
model. It is the model that we will use.
It shares many features with other models, so the explanation here is not unique
to the RAM model. Different models mainly differ in which operations they treat as
"primitive" and which they do not; another difference is the cost they assign to
operations, which can vary from model to model.
Common to most of these models is an assumption about what you can do with
numbers, especially numbers no larger than the input size. In general, the space it
takes to store a number, and the time it takes to operate on it, is not constant: the
number of bits you need to store and manipulate depends on the size of the
number. The usual assumption is that numbers bounded by the input size fit in a
single memory word, so we can store them and do arithmetic on them in constant
time.
Many list operations will also be primitive in the RAM model. Not because the RAM
model knows anything about Python lists—it doesn’t—but because we can
express Python lists in terms of the RAM model (with some assumptions about
how lists are represented).
The RAM model has a concept of memory as contiguous "memory words", and a
Python list can be thought of as a contiguous sequence of memory words. (Things get a little
bit more complex if lists store something other than numbers, but we don’t care
about that right now). Lists also explicitly store their length, so we can get that
without having to run through the list and count.
In the RAM model we can get what is at any memory location as a primitive
operation and we can store a value at any memory location as a primitive
operation. To get the value at a given index in a list, we take the memory location where the list's elements start, add the index to it, and read what is stored at that address.
If we have this idea of lists as contiguous memory locations, we can see that
concatenation of lists is not a single primitive operation. To make the list x + y, we
need to create a new list to store the concatenated list and then we need to copy
all the elements from both x and y into it.
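To make that concrete, here is a small sketch in Python. It is not how Python actually implements lists; it simply models memory as a flat list of words (an assumption for illustration only) and counts the reads and writes that concatenation needs.

# A minimal sketch (not how CPython stores lists): model memory as a flat
# array of words and count the primitive operations a concatenation needs.

def concatenate(memory, x_start, x_len, y_start, y_len, free):
    """Copy the elements of lists x and y into a new list starting at `free`.

    Returns the new list's start address and the number of primitive
    read/write operations used.
    """
    ops = 0
    for i in range(x_len):                # copy x's elements
        memory[free + i] = memory[x_start + i]
        ops += 2                          # one read, one write per element
    for i in range(y_len):                # copy y's elements after them
        memory[free + x_len + i] = memory[y_start + i]
        ops += 2
    return free, ops

memory = [0] * 32
memory[0:3] = [1, 2, 3]                   # pretend x = [1, 2, 3] lives at address 0
memory[3:5] = [4, 5]                      # and y = [4, 5] lives at address 3
start, ops = concatenate(memory, 0, 3, 3, 2, free=5)
print(memory[start:start + 5], ops)       # [1, 2, 3, 4, 5] using about 10 operations

The work grows with the combined length of the two lists, which is exactly why concatenation is not a primitive operation.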
So, with lists, we can get their length and the value at any index in one or a few
operations. It is less obvious, but we can also append to lists in a few (constant
number) primitive operations—I’ll sketch how shortly, but otherwise just trust me
on this.
Concatenating two lists, or extending one with another, are not primitive
operations; neither is deleting an element in the middle of a list.
You can see that the primitive list operations map to one or perhaps a handful of
primitive operations in a model that just works with memory words, simply by
mapping a list to a sequence of words.
The append operation is—as I said—a bit more complex, but it works because we
have usually allocated a bit more memory than we need for a list, so we have
empty words following the list items, and we can put the appended value there.
This doesn’t always work, because sometimes we run out of this extra memory,
and then we need to do more. We can set it up such that this happens sufficiently
infrequently that appending takes a few primitive operations on average.
Thinking of the lists as contiguous blocks of memory also makes it clear why
concatenating and extending lists are not primitive, but require a number of
operations proportional to the lengths of the lists.
If you delete an element inside a list, you need to shift all the items that follow it
one position down, so that is also an operation that requires a number of primitive
operations proportional to the number of items moved. (You can delete the last element with a
few operations because you do not need to copy any items in that case).
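Here is a sketch of why the cost is proportional to the number of items that follow the deleted position, using an ordinary Python list to stand in for the block of memory words:

# A sketch, assuming the "contiguous words" picture above: deleting index i
# means shifting every later element one position down, so the cost is
# proportional to how many elements follow i.

def delete_at(items, i):
    """Delete items[i] by shifting the following elements left by one."""
    for j in range(i, len(items) - 1):
        items[j] = items[j + 1]       # one read and one write per element after i
    items.pop()                       # drop the now-duplicated last slot (cheap)

xs = [10, 20, 30, 40, 50]
delete_at(xs, 1)
print(xs)                             # [10, 30, 40, 50]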
Assumptions:
• All primitive operations take the same time.
• The cost of complex operations is the sum of their primitive operations.
When we figure out how much time it takes to solve a particular problem, we
simply count the number of primitive operations the task takes. We do not
distinguish between the types of operations—that would be too hard, trust me, and
wouldn’t necessarily map well to actual hardware.
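As a small illustration of this counting, the sketch below tallies the operations needed to sum a list. Exactly which steps you count as one primitive operation is a matter of convention, so the constants here are an assumption; the point is that the total grows in proportion to the length of the list.

def sum_with_count(xs):
    total = 0                 # one operation: write a value into total
    ops = 1
    for i in range(len(xs)):  # the loop body runs len(xs) times
        total += xs[i]        # roughly three operations: read xs[i], add, write total
        ops += 3
    return total, ops

print(sum_with_count([1, 2, 3, 4]))   # (10, 13), i.e. about 3n + 1 operations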
In all honesty, I am lying when I tell you that there even are such things as complex
operations. There are operations in Python that look like they are at the same level
as getting the value at index i in list x, x[i], but that are actually more complicated. I
call such things "complex operations", but the only reason I have to distinguish
between primitive and complex operations is that a lot is hidden from you when you
ask Python to do things such as concatenating two lists (or two strings) or slicing
out parts of a list. At the most primitive level, the computer doesn’t have complex
operations. If you had to implement Python based only on the primitive operations
you have there, you would appreciate just how much work these "complex"
operations hide.
For some operations it isn’t necessarily clear exactly how many primitive
operations we need.
Can we assign to and read from variables in constant time? If we equate variable
names with memory locations, then yes, but otherwise it might be more complex.
When we do an operation such as "x = x + 5" do we count that as "read the value
of x" then "add 5 to it" and finally "put the result in the memory location referred to
as x"? That would be three operations. But hardware might support adding a
constant to a location as a single operation—quite frequently—so "x += 5" might
be faster; only one primitive operation.
Similarly, the number of operations it takes to access or update items at a given
index into a list can vary depending on how we imagine they are done. If the
variable x indicates the start address of the elements in the list (ignoring where we
store the length of the list), then we can get index i by adding i to x: x[0] is memory
address x, x[1] is memory address x + 1, …, x[i] is memory address x + i. Getting
that value could be
1. get x
2. add i
3. read what is at the address x + i
That would be three operations. Most hardware can combine some of them,
though. There are instructions that can take a location and an offset and get the
value in that word as a single instruction. That would be just one primitive operation.
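Here is a tiny sketch of that address arithmetic, modelling memory as a flat Python list of words, with x standing for the address where the list's elements begin (both are assumptions made purely for illustration):

memory = [0] * 16
x = 4                              # assume the list's elements start at address 4
memory[x:x + 3] = [7, 8, 9]        # the list [7, 8, 9] stored in consecutive words

def read_index(memory, x, i):
    address = x + i                # get x, add i
    return memory[address]         # read what is at address x + i

print(read_index(memory, x, 2))    # 9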
When we have operations that involve moving or looking at more than one memory
word we have a complex operation. These operations typically take time
proportional to how many elements you look at or move around.
Extending a list is also a complex operation. We do not (necessarily) need to copy
the list we modify, but we do need to copy all the elements from the second list.
When we construct a list from a sequence of values, we have another complex
operation. We need to create the space for the list—this can take time proportional
to the length of the list or constant time, depending on how memory is managed—
and then we need to move all the elements into the list—costing whatever time
that takes.
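A sketch of such a construction under the same flat-memory picture as before (the free address passed in is an assumption for illustration; real memory management is more involved):

def build_list(memory, free, values):
    """Store `values` in consecutive words starting at address `free`.

    The cost is one write per element, so it grows with len(values).
    """
    for i, v in enumerate(values):
        memory[free + i] = v
    return free, len(values)              # the new list's start address and length

memory = [0] * 16
start, length = build_list(memory, free=2, values=[5, 6, 7])
print(memory[start:start + length])       # [5, 6, 7]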
Appending to a list is actually also a complex operation. We will just treat it as a
primitive one because it can be implemented such that on average it takes a fixed
number of primitive operations to implement. It is actually a bit better than just
saying "on average": it always takes a linear number of operations to append n
elements. Such a sequence of append-operations will consist of some cheap and
some expensive operations, but amortised over the n appends we end up with on
the order of n operations.
How this actually works we have to leave for later, but the essence is that a list
allocates a bit more memory than it needs and can put new items in the spare
space. Whenever it runs out of this extra memory, it allocates a block twice as large
as the one it just outgrew, copies the elements over, and continues from there. It
turns out that this strategy lets us pretend that appending to a list
always takes a fixed number of primitive operations. We just call it one operation.
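Here is a sketch of that doubling strategy. CPython's actual growth factor and bookkeeping differ, but the idea is the same: keep some spare capacity, and when it runs out, allocate a block twice as large and copy everything across once.

class DynamicArray:
    def __init__(self):
        self.capacity = 1
        self.length = 0
        self.words = [None] * self.capacity   # the allocated block of words

    def append(self, value):
        if self.length == self.capacity:      # out of spare room: the rare, expensive case
            self.capacity *= 2
            new_words = [None] * self.capacity
            for i in range(self.length):      # copy all existing elements once
                new_words[i] = self.words[i]
            self.words = new_words
        self.words[self.length] = value       # the common, cheap case: one write
        self.length += 1

xs = DynamicArray()
for v in range(10):
    xs.append(v)
print(xs.length, xs.capacity)                 # 10 16: n appends cost O(n) in total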
When we discuss the complexity of an algorithm, we usually discard the cost of
getting the input or passing on the output. We assume that the input is given to us
in a form that we can immediately use, and we assume that the way we leave the
output matches what the next computation needs.
We usually measure the cost of running an algorithm as a function of the input size.
This, by convention, we call n.
It is usually not a problem to see what the size of the input is. If you get a list, it is
the length of the list. If it is a string, it is the length of the string. If it is a graph—like
the connected component algorithm from the previous chapter—then it is the
number of nodes and the number of edges (cities and roads in that example).
One case where it might be a bit strange is when numbers are involved. It takes log
n bits (log base 2) to represent the number n. So if we have a list of n numbers, all
smaller than n, is the input size then n × log n? Or if the input is just a number, do
we have n = 1 or the log of that number?
This is an issue, but it hardly ever matters. Unless you work with numbers much
larger than the input size (so large that they no longer fit in a single machine word),
you can treat each number as one word and ignore the logarithmic factor.
To work out the complexity of an algorithm (or, with a positive spin on it, the
efficiency) we count how many operations it takes on input of size n.
Best case? Average case? Worst case?
Sometimes, the running time is not just a function of the size of the input but also
what the actual input is. Taking into account all possible input to give a measure of
algorithmic efficiency is impractical, so we instead consider best-, average-, and
worst-case running times.
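Linear search makes a good example: the input size is the same in every call below, but the cost depends on where, or whether, the target appears.

def linear_search(xs, target):
    for i, x in enumerate(xs):
        if x == target:
            return i
    return -1

xs = list(range(1000))
linear_search(xs, 0)      # best case: found at the first position, about 1 comparison
linear_search(xs, 999)    # worst case: found at the last position, n comparisons
linear_search(xs, -1)     # also worst case: not found at all, n comparisons
# On average (for a target that is present and equally likely to be anywhere),
# we look at about n/2 elements, which is still proportional to n.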
Counting the actual number of operations is tedious and pointless—it doesn’t
directly translate into running time anyway. We therefore only care about the
"order" of the complexity.
The "Big-Oh of f" class of functions, for some specific function f, are those that f
can dominate after a while if we get to multiply it with a constant.
If g is in O(f) it doesn’t mean that g(n) is smaller than f(n). It is possible that g(n) is
always larger than f(n). But it does mean that we can multiply f with a constant c
such that cf(n) >= g(n) (eventually). The "eventually" means that after some n it is
always the case. It doesn’t mean that cf(n) is always at least g(n); for some finite
number of points at the beginning of the n axis, g(n) can be the larger of the two.
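Spelled out formally (this is the standard definition; it just pins down the informal description above):

\[
  g \in O(f) \iff \exists c > 0\ \exists n_0\ \forall n \ge n_0 : g(n) \le c \cdot f(n).
\]

For example, g(n) = 3n + 10 is in O(n): take c = 4 and n₀ = 10, and then 4n ≥ 3n + 10 for every n ≥ 10.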
You get big-Omega by changing which function should dominate which.
If g is in O(f) then f is in Ω(g). (If both, then g is in Θ(f) and f is in Θ(g)).
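The numbered list of rules that the next paragraphs refer to does not appear in the text here, so what follows is only a reconstruction of what it presumably contained: the standard arithmetic rules for big-O, ordered to match the discussion below.

\begin{enumerate}
  \item If $f_1 \in O(g_1)$ and $f_2 \in O(g_2)$, then $f_1 + f_2 \in O(g_1 + g_2) = O(\max(g_1, g_2))$.
  \item An algorithm with one phase that runs in $O(f)$ and another that runs in $O(g)$ runs in $O(f + g)$.
  \item If $g \in O(f)$, then $O(f + g) = O(f)$.
  \item If $f_1 \in O(g_1)$ and $f_2 \in O(g_2)$, then $f_1 \cdot f_2 \in O(g_1 \cdot g_2)$.
\end{enumerate}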
If you do the arithmetic (for function addition, i.e. (f₁ + f₂)(x) = f₁(x) + f₂(x) and (f · g)
(x) = f(x) × g(x)) it is not hard to show these properties.
The second and third are just special cases of the first, but we use these two more
often than the others.
The second rule tells us that if we have different phases in an algorithm, then we
can add the complexity of those to get the complexity of the algorithm.
The third rule tells us that we really only care about the slowest step of an algorithm
— it dominates all the other steps.
The multiplication rules are useful for reasoning about loops. If we do something
that takes constant time at most f(n) times, we have an O(f) running time. Similarly,
if we, f(n) times, do something that takes g(n) time, then we have O(fg). It doesn’t
even have to be exactly f(n) and g(n) times; it suffices that it is O(f) and O(g).
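As an example of this loop reasoning, the sketch below counts equal pairs in a list: the outer loop runs at most n times, each iteration does at most n constant-time steps, so by the multiplication rule the whole function runs in O(n · n) = O(n²).

def count_duplicate_pairs(xs):
    n = len(xs)
    count = 0
    for i in range(n):             # O(n) iterations
        for j in range(i + 1, n):  # O(n) work per outer iteration
            if xs[i] == xs[j]:
                count += 1         # constant work in the innermost step
    return count

print(count_duplicate_pairs([1, 2, 1, 3, 2, 1]))   # 4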
Some complexity classes pop up surprisingly often:
1. Constant time — O(1)
2. Logarithmic time — O(log n) — e.g. binary search (see the sketch after this list)
3. Linear time — O(n) — e.g. linear search
4. Log-linear time — O(n log n) — e.g. several divide-and-conquer sorting algorithms
5. Quadratic time — O(n²) — e.g. simple sorting algorithms
6. Cubic time — O(n³) — e.g. straightforward matrix multiplication
7. Exponential time — O(2ⁿ) (although it doesn’t have to be base two) — e.g. a lot of
optimisation algorithms. For anything but tiny n this is not usable in practice.
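As an example from the list, here is a sketch of binary search on a sorted Python list: every step halves the range that can still contain the target, so at most about log₂ n steps are needed, which is where the O(log n) bound comes from.

def binary_search(xs, target):
    low, high = 0, len(xs) - 1
    while low <= high:
        mid = (low + high) // 2
        if xs[mid] == target:
            return mid
        elif xs[mid] < target:
            low = mid + 1         # the target can only be in the upper half
        else:
            high = mid - 1        # the target can only be in the lower half
    return -1                     # not found

xs = list(range(0, 100, 2))       # a sorted list of 50 numbers
print(binary_search(xs, 42))      # 21
print(binary_search(xs, 43))      # -1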
That’s it!
Now it is time to do the exercises to test that you now understand algorithmic
complexity.