4. Sorting
Introduction
Sorting is the rearranging of a given set of objects in a
specific order
The purpose is often to simplify a search on the set later
Sorting is done for example in telephone books, data
warehouses, libraries, databases, etc.
The structure of the data dramatically influences the
sorting algorithm and its performance
23/10/2018 Sorting Lecture 3 4
5. Sorting
There is a great diversity of sorting algorithms
To choose a suitable algorithm it is necessary to understand
the significance of performance
Sorting algorithms are classified in two categories:
sorting of arrays
sorting of files
They are also called internal and external sorting because
arrays are stored in the internal store (main memory) of a computer
(internal sorting)
files are stored on external devices such as disks
(external sorting)
6. Sorting
Definition
Given n items a_0, a_1, …, a_{n-1}, a sorting algorithm
consists in permuting these items into an array a_{k_0}, a_{k_1}, …, a_{k_{n-1}} so that
for a given ordering function f:
f(a_{k_0}) ≤ f(a_{k_1}) ≤ … ≤ f(a_{k_{n-1}})
The value of the ordering function is called the key of the item.
A sorting method is called stable (see Stable Sorting) if the
relative order of items with equal keys remains unchanged by the
sorting process.
7. Sorting Algorithm
The items to be sorted need not be numerical, but they must be
distinguishable and comparable (e.g. by color, by size)
Programming languages have a built-in ordering for numerical values
For other items, however, it must be possible for a sorting algorithm to
decide if an element of the set is smaller than another element
For example you can make a numbered list for colors
For characters you can use the ASCII table to decide how the
sorting is done
All the algorithms discussed here work with numerical
representations for which a smaller, bigger and equal relation is
defined
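The "numbered list for colors" idea can be sketched as follows. This is only an illustration: the class name ColorOrder, the list RANK and the chosen colors are assumptions and do not come from the lecture.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ColorOrder {
    // Hypothetical numbered list of colors: the position in this list is the key.
    // A color that is not in the list gets the key -1 and is sorted to the front.
    static final List<String> RANK = List.of("red", "orange", "yellow", "green", "blue");

    // Decides "smaller than" for two colors by comparing their positions in RANK.
    static final Comparator<String> BY_RANK = Comparator.comparingInt(RANK::indexOf);

    // Returns a sorted copy; the input array is left unchanged.
    static String[] sortColors(String[] colors) {
        String[] copy = colors.clone();
        Arrays.sort(copy, BY_RANK);
        return copy;
    }

    public static void main(String[] args) {
        String[] input = {"blue", "red", "green", "yellow"};
        System.out.println(Arrays.toString(sortColors(input)));
    }
}
```

Once such a numerical key exists, any of the algorithms in this lecture can be applied to the colors.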
8. Sorting Algorithm
The steps in every sorting algorithm can be reduced to
Selecting and inserting
Interchanging
Spreading and collecting
Distributing
There are three different characteristics of sorting
algorithms
Indirect Sorting
Distribution Sorting
Stable Sorting
9. Indirect Sorting
If you have an array with large elements, sorting might be
expensive because whole elements have to be moved
Therefore you sort a list of references to the
original array instead of sorting the array itself
The original array data are compared but only the elements in
the reference list are swapped
Afterwards the original array is still untouched
The array of references can then be used to access or rearrange the data in
sorted order
This is called Indirect Sorting
10. Indirect Sorting
Definition
A: an array
n: the number of elements in the array
P: another array/list, initialised with P[i] = i for all i = 1, 2, …, n
The task is to modify P so that
A[P[1]] ≤ A[P[2]] ≤ … ≤ A[P[n]]
Therefore instead of changing the array A we change the list
P
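A minimal sketch of this definition in Java, assuming 0-based indices (so P[i] = i for i = 0 … n-1) and using the library sort only to reorder the references. The class and method names are illustrative assumptions.

```java
import java.util.Arrays;

final class IndirectSort {
    // Returns a permutation p with a[p[0]] <= a[p[1]] <= ... <= a[p[n-1]].
    // The array a itself is never modified.
    static Integer[] sortedIndices(int[] a) {
        Integer[] p = new Integer[a.length];
        for (int i = 0; i < p.length; i++) {
            p[i] = i; // initially P[i] = i
        }
        // The data in a are compared, but only the references in p are moved.
        Arrays.sort(p, (x, y) -> Integer.compare(a[x], a[y]));
        return p;
    }

    public static void main(String[] args) {
        int[] regNo = {102, 99, 376, 22, 86};
        Integer[] p = sortedIndices(regNo);
        System.out.println(Arrays.toString(p));     // prints [3, 4, 1, 0, 2]
        System.out.println(Arrays.toString(regNo)); // prints [102, 99, 376, 22, 86]
    }
}
```

With 1-based Ids as used on the slides, the result [3, 4, 1, 0, 2] corresponds to the Id order 4, 5, 2, 1, 3.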
11. Indirect Sorting
Example
Sorted by Reg. No.
Use the Id column to identify every row
The original table is not changed
Original table:
Id   Reg. No.   Last Name   First Name
1    102        Smith       John
2    99         Black       Rose
3    376        Miller      Fred
4    22         Baker       Joseph
5    86         White       Agnes

Reference list before sorting:
Id   Reg. No.
1    102
2    99
3    376
4    22
5    86

Reference list after sorting by Reg. No.:
Id   Reg. No.
4    22
5    86
2    99
1    102
3    376
12. Distribution Sorting
A sorting algorithm is called a Distribution Sorting
Algorithm if the data is distributed from its input to
multiple temporary structures
These structures are used to collect the elements and place
them on the output
Examples
Merge Sort
Radix Sort
[Figure: Input data → Temporary Structure → Output data]
13. Stable Sorting
Sometimes you wish to sort data by more than one
criterion
For example in a list of addresses you sort first by the
surnames and then by the first names
A sorting algorithm is called stable when one sort does not
destroy the result of the previous sort
That means the elements of the original array with the
same value (key) appear in the output array in the same
order as they did in the original array
Therefore a stable sort preserves the order of equal keys
14. Stable Sorting
Example
List of pairs to order: {(3, A),(1, C),(2, B),(3, D),(1, B),(2, A),(3, C)}
Two possibilities to sort them by the first key:
{(1, B), (1, C), (2, A), (2, B), (3, A), (3, C), (3, D)}
order changed/not stable
{(1, C), (1, B), (2, B), (2, A), (3, A), (3, D), (3, C)}
order maintained/stable
Sorting the names Victoria, Brenda, Angela by length:
Brenda, Angela, Victoria would be stable
Angela, Brenda, Victoria would not be stable
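The name example can be reproduced with a short sketch. It relies on the fact that the Java library sort for lists of objects is documented to be stable. The class and method names are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class StableSortDemo {
    // Sorts by length only. Because List.sort is stable, names of equal
    // length keep the relative order they had in the input.
    static List<String> sortByLength(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = List.of("Victoria", "Brenda", "Angela");
        System.out.println(sortByLength(names)); // prints [Brenda, Angela, Victoria]
    }
}
```

Brenda and Angela have the same key (length 6), so a stable sort must keep Brenda in front of Angela.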
15. Sorting Algorithm
We are going to look at some important sorting algorithms
There are far more!
Most sorting algorithms are comparison sorts, meaning
that they compare the keys
The other basic operation is the swap
16. Selection Sort
A Selection Sort compares the data to decide if they have
to be swapped or not
It starts with the first element
It compares this element with each of the following elements
If the value of a following element is smaller, the
two elements are swapped
If no smaller value can be found the algorithm continues
with the next element
Therefore almost every value is compared with all other
values (time costly)
For very small inputs (e.g. only two elements) its low overhead
makes it as fast as any other algorithm
20. Selection Sort
Example
53 2 12 8 64 16 15
2 53 12 8 64 16 15
2 8 12 53 64 16 15
2 8 12 53 64 16 15
2 8 12 15 64 16 53
2 8 12 15 16 64 53
2 8 12 15 16 53 64
A variant of the Selection Sort tries to find the minimum of the remaining
data and interchanges the two data elements
21. Insertion Sort
Two groups of data: sorted and unsorted
Every repetition of the Insertion Sort removes one element
from the unsorted part of the input data
This element is put in the correct position in the already
sorted part of the data
The repetition continues until no unsorted element remains
Recommended for data that is nearly sorted, otherwise very
time costly
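A minimal sketch of the idea, assuming an int array that is sorted in place: the part list[0..i-1] is the sorted group, list[i..] the unsorted group. The class name is an illustrative assumption.

```java
import java.util.Arrays;

final class InsertionSort {
    static void sort(int[] list) {
        for (int i = 1; i < list.length; i++) {
            int s = list[i];            // next element taken from the unsorted part
            int j = i - 1;
            // Shift every sorted element that is greater than s one place to the right.
            while (j >= 0 && list[j] > s) {
                list[j + 1] = list[j];
                j--;
            }
            list[j + 1] = s;            // insert s at its correct position
        }
    }

    public static void main(String[] args) {
        int[] data = {53, 2, 12, 8, 64, 16, 15};
        sort(data);
        System.out.println(Arrays.toString(data)); // prints [2, 8, 12, 15, 16, 53, 64]
    }
}
```

On nearly sorted data the inner while loop ends almost immediately, which is why the algorithm is recommended for that case.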
22. Insertion Sort
Let s be the element to be inserted:
[Figure: before insertion: sorted part (≤ s | > s), followed by s and the unsorted data; after insertion: sorted part (≤ s | s | > s), followed by the unsorted data]
26. Quick Sort
The Quick Sort algorithm uses a so-called pivot element
Ideally the pivot element is selected in such a way that about
half of the values in the input data are smaller and half are
bigger
The data are separated accordingly into a low part and a high
part
The method is repeated recursively with each part
Elements equal to the pivot can be put in either part
A part with no element or one element is by definition
sorted
27. Quick Sort
[Figure: the data are split into three regions (< pivot | pivot | > pivot); each outer region is split again in the same way around its own pivot (pivot′ and pivot″)]
• The pivot element can be chosen randomly, or for example as the
element at the middle index or as the median of the first, middle
and last element
• One of the fastest sorting algorithms in practice
• But a simple Quick Sort that always takes the first or last element as
pivot performs very badly on already sorted data
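One possible way to write this down is sketched below. It assumes the pivot is the element at the middle index and partitions in place with two indices moving towards each other. This is one common variant, not necessarily the one intended in the lecture, and the class name is an assumption.

```java
import java.util.Arrays;

final class QuickSort {
    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    // Sorts a[lo..hi] (both bounds inclusive).
    private static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) {
            return;                     // zero or one element: already sorted
        }
        int pivot = a[lo + (hi - lo) / 2]; // element at the middle index
        int i = lo;
        int j = hi;
        while (i <= j) {
            while (a[i] < pivot) i++;   // skip elements that belong to the low part
            while (a[j] > pivot) j--;   // skip elements that belong to the high part
            if (i <= j) {
                int tmp = a[i];
                a[i] = a[j];
                a[j] = tmp;
                i++;
                j--;
            }
        }
        sort(a, lo, j);                 // low part
        sort(a, i, hi);                 // high part
    }

    public static void main(String[] args) {
        int[] data = {53, 2, 12, 8, 64, 16, 15};
        sort(data);
        System.out.println(Arrays.toString(data)); // prints [2, 8, 12, 15, 16, 53, 64]
    }
}
```

With the middle element as pivot, an already sorted array is split into two nearly equal parts, so this sketch avoids the bad behaviour mentioned above for that particular input.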
31. Bucket Sort or Bin Sort
Bucket Sort partitions an array into a number of buckets
Each of these buckets is sorted individually
This can be done by using another sorting algorithm or by applying
bucket sort again
A bucket sort is a distribution sort
These are the steps to be performed:
Set up an array of empty buckets
Put every item of the original array in its bucket
Sort each of the buckets that is not empty
Put all the now sorted elements back into the original array
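The four steps can be sketched as follows. The sketch assumes non-negative int values between 0 and a known maximum max, uses as many buckets as there are elements, and sorts each bucket with the library sort. These choices and all names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class BucketSort {
    // Sorts a in place; every value must lie in the range 0..max.
    static void sort(int[] a, int max) {
        int n = a.length;
        if (n == 0) {
            return;
        }
        // Step 1: set up an array of empty buckets.
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }
        // Step 2: put every item into its bucket (bucket index grows with the value).
        for (int value : a) {
            int index = (int) ((long) value * n / (max + 1L));
            buckets.get(index).add(value);
        }
        // Steps 3 and 4: sort each bucket and copy it back to the original array.
        int k = 0;
        for (List<Integer> bucket : buckets) {
            Collections.sort(bucket);
            for (int value : bucket) {
                a[k++] = value;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {34, 12, 42, 32, 44, 41, 34, 11, 32, 23};
        sort(data, 44);
        System.out.println(Arrays.toString(data));
    }
}
```

Because the bucket index never decreases when the value increases, concatenating the sorted buckets yields a sorted array.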
35. Bucket Sort or Bin Sort
Variants of Bucket Sort:
Generic bucket sort: operates on a list of n numeric inputs
between 0 and a maximum value Max; divides the value range into n
buckets of size Max/n
Postman’s sort: operates on hierarchically structured elements;
used by letter-sorting machines; mail is sorted first into
national/international, then by state, province, district, city,
streets/routes, etc.; the keys are not sorted against each other.
Shuffle sort: operates by removing the first 1/8 of the n elements,
sorting them recursively and putting them in an array. It creates n/8
buckets to which the remaining 7/8 of the elements are distributed.
Each bucket is sorted and the buckets are concatenated into a sorted array
36. Radix Sort
A very old sorting algorithm (invented in 1887 by Herman
Hollerith) is the Radix Sort
The Radix Sort was used to sort punched cards
In general it sorts integers but it is not limited to them
The algorithm distributes items to buckets according to
the item’s value, beginning with the least significant digit
After each round the items are recollected from the buckets in bucket order
The process is repeated with the next more significant digit
This is called Least Significant Digit (LSD) Radix Sort
A variant is the Most Significant Digit (MSD) Radix Sort, which starts
with the most significant digit
37. Radix Sort
Example
Input keys: 34, 12, 42, 32, 44, 41, 34, 11, 32, 23
4 buckets, because there are 4 different digits 1, 2, 3, 4
Sorting by the least significant digit:
1. Bucket: 41 11
2. Bucket: 12 42 32 32
3. Bucket: 23
4. Bucket: 34 44 34
Recollecting: 41 11 12 42 32 32 23 34 44 34
38. Radix Sort
The recollected data are now sorted by the next more
significant digit (here the most significant digit):
41 11 12 42 32 32 23 34 44 34
1. Bucket: 11 12
2. Bucket: 23
3. Bucket: 32 32 34 34
4. Bucket: 41 42 44
Recollecting: 11 12 23 32 32 34 34 41 42 44
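The two rounds of the example can be sketched in code. The sketch assumes non-negative int keys in base 10 and, unlike the example, always uses ten buckets (one per digit 0 to 9). The class name is an illustrative assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LsdRadixSort {
    // Sorts non-negative integers in place, one decimal digit per round.
    static void sort(int[] a) {
        int max = 0;
        for (int v : a) {
            max = Math.max(max, v);
        }
        List<List<Integer>> buckets = new ArrayList<>();
        for (int d = 0; d < 10; d++) {
            buckets.add(new ArrayList<>());
        }
        // divisor = 1 selects the least significant digit, 10 the next one, and so on.
        for (long divisor = 1; max / divisor > 0; divisor *= 10) {
            // Distribute: items keep their current order inside each bucket.
            for (int v : a) {
                buckets.get((int) (v / divisor % 10)).add(v);
            }
            // Recollect the buckets in order 0..9.
            int k = 0;
            for (List<Integer> bucket : buckets) {
                for (int v : bucket) {
                    a[k++] = v;
                }
                bucket.clear();
            }
        }
    }

    public static void main(String[] args) {
        int[] keys = {34, 12, 42, 32, 44, 41, 34, 11, 32, 23};
        sort(keys);
        System.out.println(Arrays.toString(keys)); // prints [11, 12, 23, 32, 32, 34, 34, 41, 42, 44]
    }
}
```

The method only works because each round keeps the order produced by the previous round, i.e. the distribution step is stable.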
42. Merge Sort
Merge Sort is a comparison-based sorting algorithm
Most implementations produce a stable sort
The algorithm works as follows:
The unsorted input list is divided into two sub-lists of about
half the size of the original data
Each sub-list is sorted recursively by applying merge sort again
Afterwards the two sub-lists are merged into one sorted list
The basic ideas behind the Merge Sort:
a smaller list takes less time to sort than a bigger one
fewer steps are necessary to construct a sorted list from two
sorted lists than from an unsorted list
43. Merge Sort
Example
(37 26 42 1 7 70 12 56)
Dividing into 2 parts:
(37 26 42 1)( 7 70 12 56)
Dividing each part again into 2 parts:
(37 26)(42 1)( 7 70)(12 56)
Sorting of each part:
(26 37)( 1 42)( 7 70)(12 56)
Sorting & merging back to previous size (two parts):
( 1 26 37 42)( 7 12 56 70)
Sorting & merging back to the original size (now sorted):
( 1 7 12 26 37 42 56 70)
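The divide and merge steps of the example can be sketched as a top-down merge sort that returns a new sorted array. The class and method names are illustrative assumptions.

```java
import java.util.Arrays;

final class MergeSort {
    // Returns a sorted copy of a; the input array is not modified.
    static int[] sort(int[] a) {
        if (a.length <= 1) {
            return a.clone();           // zero or one element: already sorted
        }
        int mid = a.length / 2;
        int[] left = sort(Arrays.copyOfRange(a, 0, mid));
        int[] right = sort(Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);
    }

    // Merges two sorted arrays into one sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            // "<=" takes equal keys from the left half first, which keeps the sort stable.
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] data = {37, 26, 42, 1, 7, 70, 12, 56};
        System.out.println(Arrays.toString(sort(data))); // prints [1, 7, 12, 26, 37, 42, 56, 70]
    }
}
```

Merging needs only one pass over both halves, which reflects the second basic idea: building a sorted list from two sorted lists is cheap.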
46. Bubble Sort
The Bubble Sort is a simple sorting algorithm
It works by repeatedly stepping through a list of data
It compares each pair of adjacent elements and swaps them if they
are in the wrong order
This is repeated until no swap is needed any more
The name comes from the way smaller elements bubble to
the top of the list
47. Bubble Sort
The performance depends strongly on the position of the
elements
If small elements are stored at the end of the list the
sort is extremely slow, because they move only one position per pass
If large elements are at the beginning this causes no
problem, because they move quickly to the end
These elements are therefore called turtles (small elements at the
end of the list) and rabbits (large elements at the
beginning of the list)
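The effect of turtles and rabbits can be made visible with a small sketch that counts the passes over the list. It assumes a left-to-right bubble sort on an int array that stops after the first pass without a swap. The class name and the return value are assumptions made for illustration.

```java
final class BubbleSort {
    // Sorts list in place and returns the number of passes that were needed
    // (including the final pass in which no swap occurred).
    static int sort(int[] list) {
        int passes = 0;
        boolean swapped;
        do {
            swapped = false;
            for (int i = 0; i + 1 < list.length; i++) {
                if (list[i] > list[i + 1]) {   // adjacent pair in wrong order
                    int tmp = list[i];
                    list[i] = list[i + 1];
                    list[i + 1] = tmp;
                    swapped = true;
                }
            }
            passes++;
        } while (swapped);
        return passes;
    }

    public static void main(String[] args) {
        int[] turtle = {2, 3, 4, 5, 1};   // small element at the end
        int[] rabbit = {5, 1, 2, 3, 4};   // large element at the beginning
        System.out.println(sort(turtle)); // prints 5
        System.out.println(sort(rabbit)); // prints 2
    }
}
```

The turtle 1 moves only one position to the left per pass, while the rabbit 5 reaches the end of the list within a single pass.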
53. Bubble Sort
Variants
Variant of Bubble Sort   Description
Odd-even sort            Parallel version of bubble sort for
                         message passing systems
Cocktail sort            Bidirectional version: passes alternate between
                         left-to-right and right-to-left
Right-to-left            Instead of starting from the left side
                         you start from the right side
54. Comparing the algorithms
To evaluate the different sorting algorithms you can
consider the following factors
Run-Time: the cost in time of the performed
operations (comparing, swapping, distributing, etc.)
Memory: the additional memory needed for the sorting
Can either be
None or constant
Linear
Exponential (worst case)
The goal is of course to use as little memory and run-time as possible
In most cases you look at the run-time behaviour
55. Comparing the algorithms
The following table compares all comparison sorts (Insertion,
Binary tree sort (later), Selection, Bubble, Merge, Quick)
n is the number of records
Best describes the best possible performance if the data are
favourably distributed
Average describes the average run-time performance
Worst describes the behaviour of the algorithm if the data are
badly distributed
Method is the basic operation the algorithm uses
For Average and Worst it is assumed that all comparisons, swaps
and other necessary operations can be performed in constant time
(O(1))
57. Comparing the algorithms
This table compares sorting algorithms which are not
comparison sorts
n is the number of items, k the size of the key/value, d the
digit size used by the implementation
Name             Average       Worst
Bucket Sort      O(n+k)        O(n^2*k)
LSD Radix Sort   O(n*(k/d))    O(n*(k/d))
58. Developing an algorithm for the
selection sort
Procedure
Compare the start value with all following values to find a
minimum
If a smaller element exists, swap the two elements
Given an array of elements: list[]
n: the number of elements in the list
Index:    0        1        2        3        4        5        …  n-3        n-2        n-1
Element:  list[0]  list[1]  list[2]  list[3]  list[4]  list[5]  …  list[n-3]  list[n-2]  list[n-1]
59. Developing an algorithm for the
selection sort
We visit every element starting with the first element
for loop
for (int i = 0; i < n-1; i++)
Index i goes from 0 to n-2 because once these positions are filled the
element at the last index n-1 is automatically the maximum
Now we compare list[i] with all following elements
Meaning we have to compare it with all elements whose
index is greater than i and smaller than n
second for loop
for (int j = i + 1; j < n; j++)
60. Developing an algorithm for the
selection sort
To find the minimum we compare list[min] with list[j]
If list[min] > list[j] a new minimum is found
Set min: int min = i; before the second for loop
In the second for loop:
if (list[min] > list[j])
min = j;
Finally swap the elements:
int tmp = list[i];
list[i] = list[min];
list[min] = tmp;
61. Developing an algorithm for the
selection sort
Complete code
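// Sorts the first n elements of list in ascending order (selection sort).
public static void sort(int[] list, int n) {
    for (int i = 0; i < n - 1; i++) {
        // Find the index of the minimum in list[i..n-1]
        int min = i;
        for (int j = i + 1; j < n; j++) {
            if (list[min] > list[j]) {
                min = j;
            }
        }
        // Swap the minimum into position i
        int tmp = list[i];
        list[i] = list[min];
        list[min] = tmp;
    }
}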