Searching and Sorting Techniques in Data Structure
1. DATA STRUCTURES
Lectures: 4 hrs/week Theory: 50 Marks
Class - SECSE Online: 50 Marks
By
Mr. B. J Gorad,
BE(CSE), M.Tech (CST), GATE2011,2016, PhD(CSE)*
Assistant Professor, Computer Science and Engineering,
Sharad Institute of Technology College of Engineering,
Ichalkaranji, Maharashtra
Mr. B J Gorad, CSE, Sharad Institute of
Technology COE, Ichalkaranji, Maharshtra
2. Course Outcomes
At the end of the course, students will:
CO1 - Be familiar with C programming and basic data structures.
CO2 - Understand various searching and sorting algorithms and be able to
choose the appropriate data structure and algorithm design method for a
specified application.
CO3 - Understand the abstract properties of data structures such as stacks and
queues, and be able to choose the appropriate data structure for a
specified application.
CO4 - Understand the abstract properties of data structures such as lists,
and be able to choose the appropriate data structure for a specified application.
CO5 - Understand and apply fundamental algorithmic problems including trees, B+
trees, and tree traversals.
CO6 - Understand and apply fundamental algorithmic problems including graphs,
graph traversals, and shortest paths.
4. What is an Algorithm and Algorithm Design?
• An algorithm is a step-by-step procedure for solving a specific
mathematical or computer-related problem.
• Algorithm design is a specific method for creating a
mathematical process that solves a problem.
5. Sorted Array
• A sorted array is an array whose elements are
arranged in numerical, alphabetical, or some
other order and are placed at equally spaced
addresses in computer memory.
Index: 1   2   3   4
Value: 0.2 0.3 1   1.5
6. Unsorted Array
• An unsorted array is an array whose elements are
not arranged in numerical, alphabetical, or any
other order; they are still placed at equally
spaced addresses in computer memory.
Index: 1   2   3   4
Value: 0.2 0.3 1.5 1
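The difference between the two kinds of array can be checked mechanically. The following is a minimal sketch in C, assuming "sorted" means non-decreasing numerical order; the function name is_sorted is illustrative and does not come from the slides.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a[0..n-1] is in non-decreasing order.
   The name is_sorted is a hypothetical helper, not part of the course material. */
bool is_sorted(const double a[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (a[i - 1] > a[i]) {
            return false;   /* found an adjacent pair that is out of order */
        }
    }
    return true;            /* empty and one-element arrays count as sorted */
}
```

Called on the values 0.2, 0.3, 1, 1.5 it returns true; called on 0.2, 0.3, 1.5, 1 it returns false, because 1.5 is followed by 1.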
7. What is searching?
• In computer science, searching is the process of
finding an item with specified properties from a
collection of items.
• The items may be stored as records in a database,
simple data elements in arrays, text in files, nodes in
trees, vertices and edges in graphs, or elements of
some other search space.
• In everyday usage, a search is the process of looking for
something or someone.
• Example: a quest to find a missing person.
8. Why do we need searching?
✓Searching is one of the core computer science
algorithms.
✓We know that today’s computers store a lot of
information.
✓To retrieve this information quickly we need
very efficient searching algorithms.
Types of Searching
• Linear search
• Binary search
9. Linear Search
• The linear search is a sequential search, which uses a loop
to step through an array, starting with the first element.
• It compares each element with the value being searched
for, and stops when either the value is found or the end of
the array is encountered.
• If the value being searched is not in the array, the
algorithm will unsuccessfully search to the end of the
array.
10. Linear Search
• Since the array elements are stored in linear order,
searching them in that same order is easy to implement.
• The search may be successful or unsuccessful. That is, if
the required element is found then the search is
successful; otherwise it is unsuccessful.
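11. Unordered Linear / Sequential Search
/* Returns the index of the first element equal to data,
   or -1 if data is not present in A[0..n-1]. */
int unorderedlinearsearch (int A[], int n, int data)
{
    for (int i = 0; i < n; i++)   /* visit every element in order */
    {
        if (A[i] == data)
            return i;             /* found: stop at the first match */
    }
    return -1;                    /* reached the end without a match */
}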
12. Advantages of Linear Search
• If the first number in the directory is the number
you were searching for, then lucky you!
• Since you have found it on the very first page, it
no longer matters how many pages there are in the
directory.
• In this best case the search does not depend on the
number of elements in the directory, so it takes
constant time.
• Linear search is simple: it is very easy to
understand and implement.
• It does not require the data in the array to be stored
in any particular order.
13. Disadvantages of Linear Search
• It may happen that the number you are searching for is the last
number in the directory, or is not in the directory at all.
• In that case you have to search the whole directory.
• Now the number of elements matters: if there are 500
pages, you have to search 500; if there are 1000, you have to
search 1000.
• Your search time is proportional to the number of elements in the
directory.
• It has poor efficiency on big files because it takes many
comparisons to find a particular record.
• The running time of the algorithm scales linearly with the size of
the input.
• On large sorted inputs, linear search is slower than algorithms
such as binary search.
14. Analysis of Linear Search
How long will our search take?
In the best case, the target value is in the first
element of the array.
So the search takes some tiny, and constant,
amount of time.
In the worst case, the target value is in the last
element of the array or is not in the array at all.
So the search takes an amount of time
proportional to the length of the array.
15. Analysis of Linear Search
In the average case, the target value is somewhere in the array.
In fact, since the target value can be anywhere in the array, any
element of the array is equally likely.
So on average, the target value will be in the middle of the
array.
So the search takes an amount of time proportional to half the
length of the array, which is still O(n).
The worst case complexity is O(n), so linear search is sometimes
known as an O(n) search.
The time taken to search keeps increasing
as the number of elements increases.
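The best, worst, and average cases can be observed directly by counting comparisons. The sketch below is a hypothetical variant of the linear search routine; the name linear_search_count and the comparisons output parameter are illustrative additions, not part of the slides.

```c
/* Linear search that also reports how many elements were compared.
   Returns the index of data in a[0..n-1], or -1 if it is absent.
   The counter is an illustrative addition for the analysis only. */
int linear_search_count(const int a[], int n, int data, int *comparisons)
{
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;          /* one comparison per element visited */
        if (a[i] == data) {
            return i;              /* best case: first element, 1 comparison */
        }
    }
    return -1;                     /* worst case: all n elements compared */
}
```

For an array of n distinct values where every position is equally likely to hold the target, the average count over all successful searches is (1 + 2 + ... + n) / n = (n + 1) / 2, which matches the "half the length of the array" estimate.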
16. Binary Search
The general term for a smart search through sorted data is a binary
search.
1. The initial search region is the whole array.
2. If the search region is empty, the target is not in the array; stop.
3. Look at the data value in the middle of the search region.
4. If you’ve found your target, stop.
5. If your target is less than the middle data value, the new search
region is the lower half of the data.
6. If your target is greater than the middle data value, the new
search region is the upper half of the data.
7. Continue from Step 2.
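21. Binary Search Routine
// Value returned when searchValue is not in the array (assumed to be -1 here).
public static final int NOT_FOUND = -1;

// Returns the index of searchValue in the sorted array number, or NOT_FOUND.
public int binarySearch (int[] number, int searchValue)
{
    int low = 0, high = number.length - 1;
    int mid = low + (high - low) / 2;   // avoids overflow of low + high
    while (low <= high && number[mid] != searchValue) {
        if (number[mid] < searchValue) {
            low = mid + 1;              // target can only be in the upper half
        }
        else {                          // number[mid] > searchValue
            high = mid - 1;             // target can only be in the lower half
        }
        mid = low + (high - low) / 2;   // integer division will truncate
    }
    if (low > high) {
        mid = NOT_FOUND;                // the search region became empty
    }
    return mid;
}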
22. Binary Search Performance
• Successful Search
– Best Case – 1 comparison
– Worst Case – about log2 N comparisons (at most floor(log2 N) + 1)
• Unsuccessful Search
– Best Case = Worst Case – about log2 N comparisons
• Since the portion of the array to search is cut in
half after every comparison, we compute how many
times the array can be divided into halves.
• After K comparisons, there will be N/2^K elements left in
the search region. We solve for K when N/2^K = 1, deriving
K = log2 N.
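The logarithmic bound can be checked by counting how many middle elements a search examines. This is a minimal sketch in C, assuming a sorted array of int; the name binary_search_count and the probes output parameter are illustrative and not taken from the slides.

```c
/* Binary search on a sorted array that counts how many middle
   elements are examined. Returns the index of target, or -1 if
   it is absent. The probes counter is an illustrative addition. */
int binary_search_count(const int a[], int n, int target, int *probes)
{
    int low = 0, high = n - 1;
    *probes = 0;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* middle of the current region */
        (*probes)++;
        if (a[mid] == target) {
            return mid;
        }
        if (a[mid] < target) {
            low = mid + 1;                  /* keep the upper half */
        } else {
            high = mid - 1;                 /* keep the lower half */
        }
    }
    return -1;                              /* region is empty: not found */
}
```

With 1000 sorted elements no search examines more than 10 middle elements, since floor(log2 1000) + 1 = 10, whereas a linear search may need 1000 comparisons.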
23. Comparing N and log2 N Performance
Array Size   Linear – N   Binary – log2 N
10           10           4
50           50           6
100          100          7
500          500          9
1000         1000         10
2000         2000         11
3000         3000         12
4000         4000         12
5000         5000         13
6000         6000         13
7000         7000         13
8000         8000         13
9000         9000         14
10000        10000        14
24. Important Differences:
Input data needs to be sorted for binary search but
not for linear search.
Linear search accesses the data sequentially, whereas
binary search needs random access to the data.
The time complexity of linear search is O(n); binary
search has time complexity O(log n).
Linear search performs equality comparisons, whereas
binary search performs ordering comparisons.
25. Bubble Sort Algorithm
➢ Bubble sort is a simple sorting algorithm.
➢ It is a comparison-based algorithm in which
each pair of adjacent elements is compared and the elements
are swapped if they are not in order.
➢ This algorithm is not suitable for large data sets, as its average
and worst case complexity are O(n^2), where n is the number
of items.
How Bubble Sort Works?
➢ We take an unsorted array for our example. Bubble sort takes
O(n^2) time, so we're keeping it short and precise.
➢ Bubble sort starts with the very first two elements, comparing
them to check which one is greater.
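The adjacent compare-and-swap passes described above can be sketched in C as follows. This is one common formulation, not necessarily the one used in the lecture; the early exit when a pass makes no swap is an optional addition, and the name bubble_sort is illustrative.

```c
#include <stdbool.h>

/* Sorts a[0..n-1] into ascending order by repeatedly swapping
   adjacent elements that are out of order. */
void bubble_sort(int a[], int n)
{
    for (int pass = 0; pass < n - 1; pass++) {
        bool swapped = false;
        /* after each pass the largest remaining value has "bubbled"
           to the end, so the inner loop can stop one element earlier */
        for (int i = 0; i < n - 1 - pass; i++) {
            if (a[i] > a[i + 1]) {
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped) {
            break;   /* optional: no swaps means the array is already sorted */
        }
    }
}
```

The two nested loops are where the O(n^2) comparison count comes from: up to n - 1 passes, each comparing up to n - 1 adjacent pairs.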
34. Insertion Sort
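A minimal sketch of insertion sort in C, assuming the standard formulation: the sorted prefix grows by one element at a time, and each new element is shifted left into its place. The function name insertion_sort is illustrative.

```c
/* Sorts a[0..n-1] into ascending order. At the start of each outer
   iteration, a[0..i-1] is already sorted. */
void insertion_sort(int a[], int n)
{
    for (int i = 1; i < n; i++) {
        int key = a[i];            /* next element to insert */
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];       /* shift larger elements one step right */
            j--;
        }
        a[j + 1] = key;            /* drop the element into the gap */
    }
}
```

On an already sorted array the inner loop never runs, so the sort takes about n comparisons; on a reversed array every element is shifted all the way, giving O(n^2).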
38. Selection Sort
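A minimal sketch of selection sort in C, assuming the standard formulation: on each pass the smallest remaining element is selected and swapped to the front of the unsorted part. The function name selection_sort is illustrative.

```c
/* Sorts a[0..n-1] into ascending order by repeatedly selecting
   the minimum of the unsorted part a[i..n-1]. */
void selection_sort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int min = i;                       /* index of smallest value seen so far */
        for (int j = i + 1; j < n; j++) {
            if (a[j] < a[min]) {
                min = j;
            }
        }
        if (min != i) {                    /* move the minimum into position i */
            int tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }
}
```

Selection sort always performs about n^2 / 2 comparisons regardless of the input order, but it makes at most n - 1 swaps.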
42. Merge Sort Algorithm
• Merge sort is a sorting technique based on the
divide and conquer technique. With average
case and worst-case time complexity both being
O(n log n), it is one of the most respected
algorithms.
• Merge sort first divides the array into equal
halves and then combines them in a sorted
manner.
43. How merge sort works
• To understand merge sort, we take an
unsorted array of 8 items:
{14, 33, 27, 10, 35, 19, 42, 44}
• We know that merge sort first divides the
whole array repeatedly into equal halves
until atomic values are reached. We see
here that an array of 8 items is divided into
two arrays of size 4.
44. • This does not change the sequence of
appearance of items in the original. Now we
divide these two arrays into halves.
• We further divide these arrays until we reach
atomic values, which cannot be divided any further.
45. • Now, we combine them in exactly the same
manner in which they were broken down.
• We first compare the elements of each pair of lists and
then combine them into another list in sorted
order. We see that 14 and 33 are in sorted
positions. We compare 27 and 10 and in the
target list of 2 values we put 10 first, followed
by 27. We change the order of 19 and 35. 42 and
44 are placed sequentially.
46. • In the next iteration of the combining phase, we
compare lists of two data values and merge
them into lists of four data values, placing all
in sorted order.
• After the final merge, the list is fully sorted:
{10, 14, 19, 27, 33, 35, 42, 44}
47. Algorithm
• Merge sort keeps on dividing the list into equal
halves until it can no longer be divided. By
definition, a list with only one element is
sorted. Then merge sort combines the smaller
sorted lists, keeping the new list sorted too.
– Step 1 − if there is only one element in the list, it is
already sorted; return.
– Step 2 − divide the list recursively into two halves
until it can no longer be divided.
– Step 3 − merge the smaller lists into a new list in
sorted order.
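The three steps can be sketched in C as a recursive routine. This is a minimal top-down version under stated assumptions: it sorts int values, uses one temporary buffer of the same size as the input, and the names merge_sort, sort_range and merge are illustrative rather than taken from the slides.

```c
#include <stdlib.h>
#include <string.h>

/* Merges the sorted runs a[lo..mid-1] and a[mid..hi-1] via tmp. */
static void merge(int a[], int tmp[], int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        /* <= keeps equal elements in their original order (stable) */
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    }
    while (i < mid) tmp[k++] = a[i++];   /* leftovers from the left run */
    while (j < hi)  tmp[k++] = a[j++];   /* leftovers from the right run */
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo) * sizeof a[0]);
}

/* Sorts the half-open range a[lo..hi-1]. */
static void sort_range(int a[], int tmp[], int lo, int hi)
{
    if (hi - lo < 2) {
        return;                          /* zero or one element: already sorted */
    }
    int mid = lo + (hi - lo) / 2;        /* divide into two halves */
    sort_range(a, tmp, lo, mid);
    sort_range(a, tmp, mid, hi);
    merge(a, tmp, lo, mid, hi);          /* combine in sorted order */
}

/* Sorts a[0..n-1]. Returns 0 on success, -1 if the buffer cannot be allocated. */
int merge_sort(int a[], int n)
{
    if (n < 2) {
        return 0;
    }
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL) {
        return -1;
    }
    sort_range(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```

The extra buffer is the usual price of merge sort: the O(n log n) running time comes with O(n) additional memory in this formulation.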
48. Data Structure - Shell Sort
• Shell sort is a highly efficient sorting algorithm
based on the insertion sort algorithm. It
avoids the large shifts that insertion sort needs
when a small value is far to the right and
has to move far to the left.
• This algorithm uses insertion sort on widely
spaced elements first to sort them and then sorts
the less widely spaced elements. This spacing is
termed the interval. The interval is calculated
based on Knuth's formula as −
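49. • h = h * 3 + 1
where − h is the interval, with initial value 1
This algorithm is quite efficient for medium
sized data sets. Its worst case complexity depends
on the interval sequence: O(n^2) for simple sequences
such as repeated halving, and O(n^(3/2)) for Knuth's
sequence, where n is the number of items.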
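Knuth's formula generates the intervals 1, 4, 13, 40, 121, and so on. The sketch below picks the starting interval for an array of n items; the stopping rule h < n / 3 is an assumption (a commonly used cutoff, not stated in the slides), and the name knuth_initial_gap is illustrative.

```c
/* Returns the starting interval for shell sort on n items, taken from
   the sequence h = h * 3 + 1 (1, 4, 13, 40, ...).
   Assumption: stop growing h once it reaches n / 3. */
int knuth_initial_gap(int n)
{
    int h = 1;
    while (h < n / 3) {
        h = h * 3 + 1;
    }
    return h;
}
```

For n = 8 this gives h = 4. Later passes shrink the interval by inverting the formula, h = (h - 1) / 3, until h reaches 1.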
50. How shell sort works
• We take the example below to get an idea of how
shell sort works.
• We take the same eight values used in our
previous examples: {35, 33, 42, 10, 14, 19, 27, 44}
• For our example and ease of understanding we
take an interval of 4.
• We make a virtual sublist of all values located at
an interval of 4 positions. Here these values are
{35, 14}, {33, 19}, {42, 27} and {10, 44}.
51. We compare values in each sub-list and swap them (if necessary) in the
original array. After this step, the array looks like this −
{14, 19, 27, 10, 35, 33, 42, 44}
52. Then we take an interval of 2, and this gap generates two sublists - {14, 27, 35,
42}, {19, 10, 33, 44}
We compare and swap the values, if required, in the original array. After this
step, the array looks like this −
{14, 10, 27, 19, 35, 33, 42, 44}
And finally, we sort the rest of the array using an interval of 1, where shell
sort is an ordinary insertion sort. The result is the sorted array −
{10, 14, 19, 27, 33, 35, 42, 44}
54. Algorithm
• We shall now see the algorithm for shell sort.
• Step 1 − Initialize the value of h.
• Step 2 − Divide the list into smaller sub-lists of
elements that are h positions apart.
• Step 3 − Sort these sub-lists using insertion
sort.
• Step 4 − Reduce h and repeat until the complete list is
sorted (the last pass uses h = 1).
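The steps above can be sketched in C as a gapped insertion sort. This is a minimal sketch under stated assumptions: the intervals follow Knuth's formula h = h * 3 + 1, the starting cutoff h < n / 3 is a common convention rather than something the slides specify, and the name shell_sort is illustrative.

```c
/* Sorts a[0..n-1] into ascending order using shell sort with
   Knuth's interval sequence (..., 40, 13, 4, 1). */
void shell_sort(int a[], int n)
{
    /* Step 1: initialize h (assumed cutoff: stop once h >= n / 3) */
    int h = 1;
    while (h < n / 3) {
        h = h * 3 + 1;
    }

    /* Step 4: repeat with smaller intervals; (1 - 1) / 3 == 0 ends the loop */
    for (; h >= 1; h = (h - 1) / 3) {
        /* Steps 2 and 3: insertion sort on elements h positions apart */
        for (int i = h; i < n; i++) {
            int key = a[i];
            int j = i;
            while (j >= h && a[j - h] > key) {
                a[j] = a[j - h];   /* shift within the sub-list */
                j -= h;
            }
            a[j] = key;
        }
    }
}
```

With this sequence an 8-item array is sorted using the intervals 4 and then 1; the interval of 2 used in the worked example is an extra intermediate pass chosen for ease of understanding.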
55. Radix Sort
• Radix sort is a generalization of bucket sort.
• To sort decimal numbers, the radix (base) used
is 10, so we need 10 buckets.
• Buckets are numbered 0, 1, 2, 3, …, 9.
• Sorting is done in passes.
• The number of passes required for sorting is the
number of digits in the largest number in the
list.
56. Ex.
Range        Passes
0 to 99      2 passes
0 to 999     3 passes
0 to 9999    4 passes
• In the first pass, numbers are sorted on the least
significant digit: each number is placed in the bucket
that matches that digit.
• In the second pass, numbers are sorted on the second least
significant digit, and the process continues.
• At the end of every pass, the numbers in the buckets are
merged, in bucket order, to produce a common list.
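The passes can be sketched in C for non-negative integers in base 10. As an assumption for compactness, the ten buckets are not stored as separate lists: each pass counts how many numbers fall into each bucket and then writes them to an output array in bucket order, which gives the same result as filling and merging the buckets. The name radix_sort is illustrative.

```c
#include <stdlib.h>
#include <string.h>

/* Sorts non-negative integers a[0..n-1] into ascending order,
   one decimal digit per pass, least significant digit first.
   Returns 0 on success, -1 if the work array cannot be allocated. */
int radix_sort(int a[], int n)
{
    if (n < 2) {
        return 0;
    }
    int *out = malloc((size_t)n * sizeof *out);
    if (out == NULL) {
        return -1;
    }

    int max = a[0];                       /* largest number decides the pass count */
    for (int i = 1; i < n; i++) {
        if (a[i] > max) max = a[i];
    }

    /* long long keeps exp *= 10 from overflowing for large int values */
    for (long long exp = 1; max / exp > 0; exp *= 10) {
        int count[10] = {0};              /* size of each bucket 0..9 */
        for (int i = 0; i < n; i++) {
            count[(a[i] / exp) % 10]++;
        }
        for (int d = 1; d < 10; d++) {
            count[d] += count[d - 1];     /* end position of each bucket */
        }
        for (int i = n - 1; i >= 0; i--) {
            int d = (int)((a[i] / exp) % 10);
            out[--count[d]] = a[i];       /* back to front keeps the pass stable */
        }
        memcpy(a, out, (size_t)n * sizeof *out);   /* merged list for next pass */
    }
    free(out);
    return 0;
}
```

Each pass must be stable (numbers with the same digit keep their previous relative order); otherwise the ordering established by earlier passes would be lost.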
59. • Radix sort is very simple, and a computer can do it fast. When it is
programmed properly, radix sort is in fact one of the fastest
sorting algorithms for numbers or strings of letters.
• Average case and worst case complexity - O(d·n) for n items of at most d
digits, which is O(n) when the number of digits is fixed.
Disadvantages
• Still, there are some tradeoffs for radix sort that can make it less
preferable than other sorts.
• The speed of radix sort largely depends on the inner basic
operations, and if the operations are not efficient enough, radix
sort can be slower than some other algorithms such as quick sort
and merge sort.
• In the example above, the numbers were all of equal length, but
many times, this is not the case. If the numbers are not of the same
length, then a test is needed to check for additional digits that need
sorting. This can be one of the slowest parts of radix sort, and it is
one of the hardest to make efficient.
• Radix sort can also take up more space than other sorting
algorithms, since in addition to the array that will be sorted, you
need to have a sublist for each of the possible digits or letters.