Merge Sort

The merge sort works on the idea of merging two already sorted lists.

If there existed two already sorted list, merging the two together into a single sorted list can be accomplished in O(n) time where n is the combined length of the two lists.

Thus, if we are able to take our original list, split it into two pieces and then somehow get them into a sorted state, we can then use the merge algorithm to put these two pieces back together.

The algorithm of exactly how we can do this is described below.

Algorithm:

Merge algorithm

Understanding the merge algorithm is an important step of understanding how merge sort works.

The algorithm to merge already sorted list works like this

create an empty merged list
have a way to "point" at the first element of each of the two list
compare the values being pointed at and pick the smaller of the two
copy the smaller to the end of the merged list, and advance the "pointer" of the list that the value came from
continue until one of the list is completely copied then copy over remainder of the rest of the list

Merge Sort Algorithm

The merge algorithm is an O(n) algorithm as we pick n items between two lists. However, it depends on having two lists that are already sorted...So we must first get a list into that state.

Start with the entire list
Split the list into two halfs
mergesort each half - this will result in two sorted lists
merge them together

Now... the above is clearly a recursive algorithm. The base case for this are lists of size 0 or size 1. If you have an list that is 0 elements it doesn't exist so we don't have to sort it. If a list is 1 element then that list is sorted.

Animation

http://cathyatseneca.github.io/DSAnim/web/merge.html

Implementation

The implementation of merge sort is split into 3 pieces, each piece is detailed below.

mergeSort()

This is the mergeSort() function that the program calls. The recursive merge sort has a complicated prototype, we hide the complexity by writing a function that will make the initial call. We also need a second list to store the merged list. The sorting algorithm creates a temporary list once and passes that into a recursive function to do all the work. The idea here is that we don't want to allocate memory on each merge... instead we can just use one list that stays around for the entire process.

Python
C++

def merge_sort(mylist):
    empty_list = [0] * len(mylist)  # create an empty list for merging.  doing it once
                                    # is more efficient than repeatedly creating it when merging

    recursive_merge_sort(mylist, 0, len(mylist) - 1, empty_list)  # call recursive mergesort

void mergeSort(int arr[],int size){
  int* tmp=new int[size];
  mergeSort(arr,tmp,0,size-1);
  delete [] tmp;
}

recursive merge sort

The recursive mergesort splits the list into two pieces, mergesorts each piece then merge them together. While we speak about splitting up the list, the reality is that the split is more conceptual than actual. We will use indexes to indicate where one list begins and one list ends. Thus this function is not only given the list but also the indexes to indicate where it begins and ends. The function is also passed an empty list that we pass to merge function. This is done for efficiency.

Python
C++

def recursive_merge_sort(mylist, first_index, last_index, empty_list):
    # recursive merge sort.  the base case occurs when first_index >= last_index
    # that situation will only occur if the portion of the list is size 0 or 1.
    # in either case, do nothing and exit function.

    if first_index < last_index:
        mid_index = (first_index + last_index) // 2
        recursive_merge_sort(mylist, first_index, mid_index, empty_list)
        recursive_merge_sort(mylist, mid_index + 1, last_index, empty_list)
        merge(mylist, first_index, mid_index + 1, last_index, empty_list)

void mergeSort(int arr[],int tmp[],int start,int end){
  //if the array is more than one element big
  if(start<end){
    int mid=(start+end)/2;
    mergeSort(arr,tmp,start,mid);
    mergeSort(arr,tmp,mid+1,end);
    merge(arr,tmp,start,mid+1,end);
  }
}

The merge function

The merge function takes two sorted list and merge them together. However, because we are merging together two sorted pieces of a list, rather than being given individual lists, the function prototype must take in indexes to indicating starting/ending points of these lists within the original list

Python
C++

# this function will merge two sorted pieces of an array into a single sorted segment.
# We will refer to the two smaller sorted pieces of lists as a and b.
# These are not true lists but rather pieces of mylist
# list a starts at a_first_index and ends at b_first_index - 1 (inclusive)
# list b starts at b_first_index and ends at b_last_index (inclusive)
# this function will merge them together using empty_list as temporary storage
# once it is merged together, it is copied back into mylist, creating a single
# sorted segment that starts at a_first_index to b_last_index (inclusive)

def merge(mylist, a_first_index, b_first_index, b_last_index, empty_list):
    a_ptr = a_first_index  # used to track value from a
    b_ptr = b_first_index
    empty_list_index = a_ptr

    while (a_ptr < b_first_index) and (b_ptr <= b_last_index):
        if mylist[a_ptr] <= mylist[b_ptr]:
            empty_list[empty_list_index] = mylist[a_ptr]
            empty_list_index += 1
            a_ptr += 1
        else:
            empty_list[empty_list_index] = mylist[b_ptr]
            empty_list_index += 1
            b_ptr += 1

    while a_ptr < b_first_index:
        empty_list[empty_list_index] = mylist[a_ptr]
        empty_list_index += 1
        a_ptr += 1

    while b_ptr <= b_last_index:
        empty_list[empty_list_index] = mylist[b_ptr]
        empty_list_index += 1
        b_ptr += 1

    for i in range(a_first_index, b_last_index + 1):
        mylist[i] = empty_list[i]

/*function merges the two halfs of a sorted array together.
  The arrays are defined from arr[startA]to arr[startB-1] and arr[startB]
to arr[endB]*/

void merge(int arr[],int tmp[],int startA,int startB,int endB){
  int aptr=startA;
  int bptr=startB;
  int idx=startA;
  while(aptr < startB && bptr < endB+1){
    if(arr[aptr] < arr[bptr]){
       tmp[idx]=arr[aptr];
       idx++;
       aptr++;
    }
    else{
       tmp[idx++]=arr[bptr];
       bptr++;
    }
  }
  while(aptr<startB){
    tmp[idx]=arr[aptr];
    idx++;
    aptr++;
  }

  while(bptr < endB+1){
    tmp[idx++]=arr[bptr];
    bptr++;
  }

  //copy back into arr
  for(int i=startA;i<=endB;i++){
    arr[i]=tmp[i];
  }
}

Algorithm:​

Merge algorithm​

Merge Sort Algorithm​

Animation​

Implementation​

mergeSort()​

recursive merge sort​

The merge function​