Example 2: Analysis of Binary Search
The analysis of a binary search is not the same as that of linear search because the loop of a binary search does not follow the pattern of going from the start of the array all the way to the end. It starts in the middle of an array and jump around. Not every element will be considered during the search process so this will be a bit different.
A binary search can only be done to lists if two criteria are true:
- the list must be sorted
- access to any element given its position must have a constant run time.
- Python
- C++
def binary_search(my_list, key):
rc = -1
low = 0
high = len(my_list) - 1
while rc == -1 and low <= high:
mid = (low + high) // 2
if key < my_list[mid]:
high = mid - 1
elif key > my_list[mid]:
low = mid + 1
else:
rc = mid
return rc
In the above code, there are two variables low and high. These represent the first and last index of the elements where key may be located. In other words, the only possible places key may exist is somewhere between my_list[low] and my_list[high] inclusive. In each iteration of the while loop we check the middle element between them. If key is smaller than the middle element, we move the high marker so that it is to one element before the middle one. If key is bigger, we move the low marker one element after the middle one. In other words we eliminate half of the remaining possibilities of key's location after each iteration of the while loop. We stop when we either find it or we have eliminated all possible locations for key.
Analysis of an unsuccessful search
Similar to a linear search we should analyze the run time for what happens in its worst case. In this case, it is also performing an unsuccessful search
Let n represent the size of the array
Let T(n) represent the number of operations needed to find key within the array
In a binary search the following operators are done exactly one time:
rc = -1 # 1
low = 0 # 1
high = len(my_list) - 1 # 1 operator here + 1 function call + 1 operation for assignment
return rc # 1
The following are done for each iteration within the loop:
rc == -1 and low <= high # 3 ops per iteration of loop
mid = (low + high) // 2 # 3 ops per iteration of loop
Within the loop there is also an if/else if/else statement. there are 3 paths through this statement. The path that would would lead the the highest operation count would be to follow the else-if. The reason is that we would end up doing both checks. and there are more operations in the else-if than the else statement. Thus, we should use that pathway
if key < my_list[mid]: # <---check this (1 op)
high = mid - 1 # <---skip, check fails
elif key > my_list[mid]: # <---check (1 op)
low = mid + 1 # <---do this (2 ops)
else: # <---skip
rc = mid # <---skip
Thus what we can say is that each time the loop runs, it will perform 10 operations at worst
template <class TYPE>
int binarySearch(const vector<TYPE>& arr, const TYPE& key){
int rc=-1;
int low=0;
int high=arr.size()-1;
int mid;
while(low<=high && rc==-1){
mid = low + (high - low) / 2; // instead of saying: mid = (low + high) / 2, to avoid ovreflow (remember CPR101?)
if(key < arr[mid])
high=mid-1;
else if(key > arr[mid] )
low=mid+1;
else
rc=mid;
}/*while*/
return rc;
}
In the above code, there are two variables low and high. These represent the first and last index of the elements where key may be located. In other words, the only possible places key may exist is somewhere between arr[low] and arr[high] inclusive. In each iteration of the while loop we check the middle element between them. If key is smaller than the middle element, we move the high marker so that it is to one element before the middle one. If key is bigger, we move the low marker one element after the middle one. In other words we eliminate half of the remaining possibilities of key's location after each iteration of the while loop. We stop when we either find it or we have eliminated all possible locations for key.
Analysis of an unsuccessful search
Similar to a linear search we should analyze the run time for what happens in its worst case. In this case, it is also performing an unsuccessful search
Let n represent the size of the array
Let T(n) represent the number of operations needed to find key within the array
In a binary search the following operators are done exactly one time:
int rc=-1;
int low=0;
int high=arr.size()-1;
int mid;
return rc;
The following are done for each iteration within the loop:
low<=high && rc==-1 //<-- 3 ops
mid = low + (high - low) / 2; //<-- 4 ops
Within the loop there is also an if/else if/else statement. there are 3 paths through this statement. The path that would would lead the the highest operation count would be to follow the else-if. The reason is that we would end up doing both checks. and there are more operations in the else-if than the else statement. Thus, we should use that pathway
if(arr[mid] > key) //<---check this (1 op)
high=mid-1; //<---skip, check fails
else if(arr[mid] < key) //<---check (1 op)
low=mid+1; //<---do this (2 ops)
else //<---skip
rc=mid; //<---skip
Thus what we can say is that each time the loop runs, it will perform 10 operations at worst
Number of Loop Iterations
The next question then is to figure out how many times this loop will run. We are of course assuming that the search will fail. Thus, the only way for this loop to stop is for low<=high to become untrue.
When we started off we have n elements to search. After one iteration through the loop we have approximately half of the original number of elements. After one more iteration we eliminate another half of the remaining elements.
the last iteration would involve only 1 element because low == high == mid, after that we would end up with low > high
Lets look at the pattern of the relationship between number elements within the search set and the iteration number.
Loop iteration | Number of elements between low and high inclusive (approx) | Denominator |
---|---|---|
1 | ||
2 | ||
3 | ||
4 | ||
5 |