Pre-prefix
I write these posts in English only for personal practice, and I hope this will not bother you too much. After all, you are one of us: the smartest and most creative programmers.
Prefix
In this post, we are going to discuss five different versions of MergeSort, a classic sorting method. As we all know, merge sort is stable, uses O(n) extra space and O(n log n) time, and accesses the elements sequentially, which also makes it easy to apply to linked lists.
Before we get started on MergeSort itself, we first have to take a careful look at the two merge helper functions that will be used frequently in later parts.
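Because merging only ever walks its inputs from front to back, the same idea carries over to linked lists. Here is a minimal sketch under assumptions: the struct node type and the names mergeLists and listMergeSort are hypothetical and are not part of the files in this post.

```c
#include <stddef.h>

/* Hypothetical singly linked node, used only for this illustration. */
struct node {
    int key;
    struct node *next;
};

/* Merge two sorted lists. On ties the node from `a` goes first,
 * which keeps equal keys in their original order. */
static struct node *mergeLists(struct node *a, struct node *b)
{
    struct node head;
    struct node *tail = &head;
    head.next = NULL;
    while (a != NULL && b != NULL) {
        if (b->key < a->key) { tail->next = b; b = b->next; }
        else                 { tail->next = a; a = a->next; }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;   /* append whatever is left */
    return head.next;
}

/* Top-down merge sort on a list: split in the middle with slow/fast
 * pointers, sort both halves recursively, then merge them. */
struct node *listMergeSort(struct node *list)
{
    if (list == NULL || list->next == NULL) return list;
    struct node *slow = list;
    struct node *fast = list->next;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }
    struct node *right = slow->next;
    slow->next = NULL;                  /* cut the list into two halves */
    struct node *left = listMergeSort(list);
    right = listMergeSort(right);
    return mergeLists(left, right);
}
```

No auxiliary array is needed here, since the nodes are relinked instead of copied.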
//using a fixed array to assist merging;
void merge(int *a, int l, int m, int r)
{
    int i, j, k;
    //copy the left half in order; this loop ends with i = l;
    for(i = m + 1; i > l; i--)
        aux[i - 1] = a[i - 1];
    //copy the right half reversed to eliminate checking overhead; this loop ends with j = r;
    for(j = m; j < r; j++)
        aux[r + m - j] = a[j + 1];
    //the largest elements of both halves meet in the middle and act as sentinels for each other;
    //note: equal keys left over in the right half come out reversed, so this merge is not stable;
    for(k = l; k <= r; k++)
        a[k] = aux[i] > aux[j] ? aux[j--] : aux[i++];
}

//explicitly using another array to assist merging: merges aux[l..m] and aux[m+1..r] into a[l..r];
void mergeAB(int *a, int *aux, int l, int m, int r)
{
    int i = l;
    int j = m + 1;
    int k;
    for(k = l; k <= r; k++)
    {
        //once one half is exhausted, take the rest from the other;
        if(i > m) { a[k] = aux[j++]; continue; }
        if(j > r) { a[k] = aux[i++]; continue; }
        //taking from the left half on ties ensures the stability of the sort;
        a[k] = aux[i] > aux[j] ? aux[j++] : aux[i++];
    }
}
Version 1.0 is the basic MergeSort: a recursive function does the sorting, and the merge reverses one half of the array to eliminate bounds-checking overhead in its loop.
void sort(int *a, int l, int r)
{
    if(l >= r) return;
    int m = (l + r) / 2;
    sort(a, l, m);
    sort(a, m + 1, r);
    merge(a, l, m, r);
}
Version 1.1 handles the small blocks in advance to improve performance, as we previously did in SwiftSort. Merging takes no advantage of any order already present in the input, so for small blocks a basic sort such as insertion sort can be more efficient than MergeSort; the timing results in the summary let us check this.
void sort(int *a, int l, int r)
{
    if(r - l < 10) { insertionSort(a, l, r); return; }
    int m = (l + r) / 2;
    sort(a, l, m);
    sort(a, m + 1, r);
    merge(a, l, m, r);
}
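The insertionSort called here is listed with the other helpers in utils.c. To give a rough idea of what a small-block sort has to do, here is a minimal bounds-checked sketch; the name insertionSortRange is hypothetical, and this is a plainer variant, not the listing from utils.c.

```c
/* Plain insertion sort on a[l..r] (both ends inclusive).
 * It is stable and does little work on short or nearly sorted ranges. */
void insertionSortRange(int *a, int l, int r)
{
    for (int i = l + 1; i <= r; i++) {
        int tmp = a[i];
        int j = i - 1;
        /* shift larger elements one slot to the right */
        while (j >= l && a[j] > tmp) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = tmp;
    }
}
```

Only the elements inside [l, r] are touched, so it can be called on any block of a larger array.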
Version 1.2 took me comparatively longer to finish. It eliminates the repeated, time-consuming array copy in every merge call by swapping the roles of the two arrays at each level of recursion; of course, it also uses the small-block trick to further improve speed.
//correctness relies on the initial state: a and aux hold the same elements on entry;
void mergeSort(int *a, int *aux, int l, int r)
{
    if(r - l < 20) { insertionSort(a, l, r); return; }
    int m = (l + r) / 2;
    //swap the roles of the two arrays so that both halves end up sorted in aux;
    mergeSort(aux, a, l, m);
    mergeSort(aux, a, m + 1, r);
    //merge from aux back into a; this needs the five-argument helper mergeAB;
    mergeAB(a, aux, l, m, r);
}

void sort(int *a, int l, int r)
{
    for(int i = l; i <= r; i++)
        aux[i] = a[i];
    mergeSort(a, aux, l, r);
}
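The role swap is easy to get wrong, so it may help to state the invariant explicitly: on entry both arrays hold the same values in [l, r], and on return the first array is sorted in that range. Here is a self-contained sketch of the idea without the small-block cutoff; the names mergeFrom, pingPongSort and sortInts are hypothetical, and the heap-allocated scratch buffer is my assumption, not the global aux of the post.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs src[l..m] and src[m+1..r] into dst[l..r]. */
static void mergeFrom(int *dst, const int *src, int l, int m, int r)
{
    int i = l, j = m + 1;
    for (int k = l; k <= r; k++) {
        if (i > m)                dst[k] = src[j++];
        else if (j > r)           dst[k] = src[i++];
        else if (src[j] < src[i]) dst[k] = src[j++];
        else                      dst[k] = src[i++];   /* ties: left first */
    }
}

/* Invariant: on entry a[l..r] and b[l..r] hold the same values;
 * on return a[l..r] is sorted and b[l..r] is scratch. */
static void pingPongSort(int *a, int *b, int l, int r)
{
    if (l >= r) return;
    int m = l + (r - l) / 2;
    pingPongSort(b, a, l, m);       /* leaves b[l..m] sorted */
    pingPongSort(b, a, m + 1, r);   /* leaves b[m+1..r] sorted */
    mergeFrom(a, b, l, m, r);       /* no copy back is needed */
}

/* Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int sortInts(int *a, int n)
{
    if (n < 2) return 0;
    int *b = malloc((size_t)n * sizeof *b);
    if (b == NULL) return -1;
    memcpy(b, a, (size_t)n * sizeof *b);   /* the only full copy */
    pingPongSort(a, b, 0, n - 1);
    free(b);
    return 0;
}
```

The single copy before the first call is what establishes the invariant; every deeper level simply inherits it.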
Version 1.3 is the first version that merges from the bottom to the top, the so-called bottom-up method. Though its speed is not ideal, it is simply another way to approach the problem.
void sort(int *a, int l, int r)
{
    //m is the width of the sorted runs; m <= r - l so the final merge is not skipped when r - l is a power of two;
    for(int m = 1; m <= r - l; m += m)
        for(int i = l; i <= r - m; i += 2 * m) //the range is rather critical!
            merge(a, i, m + i - 1, min(i + 2 * m - 1, r));
}
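The loop bounds of the bottom-up version are easiest to check by listing the ranges that get merged. Here is a small sketch that records them instead of merging; bottomUpRanges is a hypothetical helper, and it assumes the width loop runs while m <= r - l so that the last pass is included.

```c
/* Hypothetical helper: record every (lo, mid, hi) triple that a bottom-up
 * merge sort over [l, r] would pass to merge. At most `cap` triples are
 * stored in `out`; the return value is the total number of merges. */
int bottomUpRanges(int l, int r, int out[][3], int cap)
{
    int n = 0;
    for (int m = 1; m <= r - l; m += m) {          /* m: width of sorted runs */
        for (int i = l; i <= r - m; i += 2 * m) {  /* i: start of a run pair */
            int hi = i + 2 * m - 1;
            if (hi > r) hi = r;                    /* the last run may be short */
            if (n < cap) {
                out[n][0] = i;
                out[n][1] = i + m - 1;
                out[n][2] = hi;
            }
            n++;
        }
    }
    return n;
}
```

For five elements (l = 0, r = 4) this yields (0,0,1), (2,2,3), (0,1,3) and (0,3,4). The last triple comes from m = 4 = r - l, so a strict bound m < r - l would leave the fifth element unmerged.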
Version 1.4 is the last version, an update of version 1.3 that adopts the small-block trick.
void sort(int *a, int l, int r)
{
    int unit = 20;
    if(r - l < unit) { insertionSort(a, l, r); return; }
    //sort every block of unit elements first; i <= r so a trailing one-element block is not missed;
    for(int i = l; i <= r; i += unit)
        insertionSort(a, i, min(i + unit - 1, r));
    //m <= r - l so the final merge is not skipped when r - l is unit times a power of two;
    for(int m = unit; m <= r - l; m += m)
        for(int i = l; i <= r - m; i += 2 * m) //the range is rather critical!
            merge(a, i, m + i - 1, min(i + 2 * m - 1, r));
}
Summary:
Using the timing script, I collected a timing result for each version mentioned above. Here is an outline of my PC configuration:
description: Desktop Computer
product: ThinkCentre M8300T (To be filled by O.E.M.)
vendor: LENOVO
version: Lenovo Product
serial: NA17134507
width: 64 bits
Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz

       total  used  free  shared  buff/cache  available
Mem:   7.5G   1.2G  3.4G  511M    2.9G        5.5G
Swap:  3.8G   0B    3.8G

L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
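And here are the results of all versions. The testing set is an array of 2,000,000 random integers ranging from 0 (inclusive) to 10,000,000 (exclusive). Note that the script prints each 'time ./mergeN' label after the timings it belongs to: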
./time.sh > tmp
Sorting failed!
: Success
Sorted successfully!
: Success

real 0m0.803s
user 0m0.792s
sys 0m0.011s
time ./merge0
***************
Sorting failed!
: Success
Sorted successfully!
: Success

real 0m0.708s
user 0m0.678s
sys 0m0.030s
time ./merge1
***************
Sorting failed!
: Success
Sorted successfully!
: Success

real 0m0.676s
user 0m0.662s
sys 0m0.014s
time ./merge2
***************
Sorting failed!
: Success
Sorted successfully!
: Success

real 0m0.725s
user 0m0.706s
sys 0m0.019s
time ./merge3
***************
Sorting failed!
: Success
Sorted successfully!
: Success

real 0m0.707s
user 0m0.693s
sys 0m0.014s
time ./merge4
**************
Several other functions assist in the testing process; the following are the files that contain them and how they are arranged.
utils.h
#ifndef UTILS_H
#define UTILS_H

#include<stdio.h>
#include<time.h>
#include<stdlib.h>

#define SIZE 2000000
#define MAX 10000000
//the outer parentheses keep the macros safe inside larger expressions;
#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

//shared scratch array used by merge; since it is defined in the header,
//GCC 10 or later needs the -fcommon flag when several files include this header;
int aux[SIZE];

void randomIntArray(int* array, int size, int low, int high);
void printArray(int *nums, int size);
void checkAscending(int *nums, int size);
void insertionSort(int* nums, int l, int r);
void merge(int *a, int l, int m, int r);
void mergeAB(int *a, int *aux, int l, int m, int r);

#endif
The corresponding implementation file utils.c
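A side note on function-like macros such as min and max: without an outer pair of parentheses, the conditional expression can swallow whatever follows the macro call. Here is a small sketch with hypothetical names (MIN_BAD, MIN_OK, badPlusOne, okPlusOne) that shows the difference.

```c
/* No outer parentheses: the expansion can merge with surrounding operators. */
#define MIN_BAD(a, b) (a) < (b) ? (a) : (b)
/* Fully parenthesized: the expansion always behaves like a single value. */
#define MIN_OK(a, b)  ((a) < (b) ? (a) : (b))

/* Expands to (x) < (y) ? (x) : ((y) + 1), so the + 1 only applies to y. */
int badPlusOne(int x, int y) { return MIN_BAD(x, y) + 1; }

/* Expands to ((x) < (y) ? (x) : (y)) + 1, as intended. */
int okPlusOne(int x, int y)  { return MIN_OK(x, y) + 1; }
```

With x = 1 and y = 5, badPlusOne returns 1 while okPlusOne returns 2. When the macro is used alone as a function argument, as in the sort listings of this post, both forms give the same result.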
#include"utils.h"

void randomIntArray(int* array, int size, int low, int high)
{
    srand(time(NULL));
    for(int i = 0; i < size; i++)
    {
        array[i] = rand()%(high-low) + low;
    }
}

void printArray(int *nums, int size)
{
    for(int i = 0; i < size; i++)
        printf("%d, ", nums[i]);
}

//perror writes to stderr and appends the errno text, hence the ": Success" lines in the timing log;
void checkAscending(int *nums, int size)
{
    for(int i = 0; i < size - 1; i++)
        if(nums[i] > nums[i + 1])
        {
            perror("Sorting failed!\n");
            printf("Sorting failed!\n");
            return;
        }
    perror("Sorted successfully!\n");
    printf("Sorted successfully!\n");
}

//helper for insertionSort; it was called but never defined in the original listing;
static void swap(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

void insertionSort(int* nums, int l, int r)
{
    //first move the minimum to the first position so that it acts as a sentinel;
    //none of the remaining elements is less than the first, so the inner loop needs no bounds check;
    int minIndex = l;
    for(int i = l; i <= r; i++)
    {
        if(nums[i] < nums[minIndex])
            minIndex = i;
    }
    swap(nums + l, nums + minIndex);
    for(int i = l + 1; i <= r; i++)
    {
        int j = i - 1;
        int tmp = nums[i];
        while(nums[j] > tmp)
        {
            nums[j + 1] = nums[j];
            j--;
        }
        nums[j + 1] = tmp;
    }
}

//using a fixed array to assist merging;
void merge(int *a, int l, int m, int r)
{
    int i, j, k;
    //copy the left half in order; this loop ends with i = l;
    for(i = m + 1; i > l; i--)
        aux[i - 1] = a[i - 1];
    //copy the right half reversed to eliminate checking overhead; this loop ends with j = r;
    for(j = m; j < r; j++)
        aux[r + m - j] = a[j + 1];
    //note: equal keys left over in the right half come out reversed, so this merge is not stable;
    for(k = l; k <= r; k++)
        a[k] = aux[i] > aux[j] ? aux[j--] : aux[i++];
}

//explicitly using another array to assist merging: merges aux[l..m] and aux[m+1..r] into a[l..r];
void mergeAB(int *a, int *aux, int l, int m, int r)
{
    int i = l;
    int j = m + 1;
    int k;
    for(k = l; k <= r; k++)
    {
        //once one half is exhausted, take the rest from the other;
        if(i > m) { a[k] = aux[j++]; continue; }
        if(j > r) { a[k] = aux[i++]; continue; }
        //taking from the left half on ties ensures the stability of the sort;
        a[k] = aux[i] > aux[j] ? aux[j++] : aux[i++];
    }
}
The complete file of version 1.0, merge0.c
#include"utils.h"

void sort(int *a, int l, int r)
{
    if(l >= r) return;
    int m = (l + r) / 2;
    sort(a, l, m);
    sort(a, m + 1, r);
    merge(a, l, m, r);
}

int main(void)
{
    //static storage: 2,000,000 ints (about 8 MB) on the stack would sit close to the common 8 MB default stack limit on Linux;
    static int numbers[SIZE];
    randomIntArray(numbers, SIZE, 0, MAX);
    printArray(numbers, SIZE);
    checkAscending(numbers, SIZE);
    printf("After sorting:\n***********************\n");
    sort(numbers, 0, SIZE - 1);
    printArray(numbers, SIZE);
    checkAscending(numbers, SIZE);
    return 0;
}
The shell script that times them, time.sh
#!/bin/bash
time ./merge0
echo 'time ./merge0' >&2
echo '***************' >&2
time ./merge1
echo 'time ./merge1' >&2
echo '***************' >&2
time ./merge2
echo 'time ./merge2' >&2
echo '***************' >&2
time ./merge3
echo 'time ./merge3' >&2
echo '***************' >&2
time ./merge4
echo 'time ./merge4' >&2
echo '**************' >&2
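The shell's time measures each whole process, which includes generating the array and printing it twice. If only the sorting itself is of interest, the call can be timed from inside the program. Here is a rough sketch using clock() from the C library; timeSort and librarySort are hypothetical names, and librarySort is just a stand-in built on qsort so that the example is self-contained.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: CPU seconds spent inside one call of sortFn(a, l, r). */
double timeSort(void (*sortFn)(int *, int, int), int *a, int l, int r)
{
    clock_t start = clock();
    sortFn(a, l, r);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Three-way comparison for qsort that avoids overflow from subtraction. */
static int compareInts(const void *x, const void *y)
{
    int a = *(const int *)x;
    int b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Stand-in with the same signature as the sort functions in this post. */
void librarySort(int *a, int l, int r)
{
    qsort(a + l, (size_t)(r - l + 1), sizeof *a, compareInts);
}
```

Any of the sort functions above has the same signature and could be passed in place of librarySort, for example timeSort(sort, numbers, 0, SIZE - 1).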
P.S. If you want to repeat the whole testing process, just replace the sorting method in merge0.c, rename the file accordingly, and then run the command './time.sh > tmp' in your own bash. Have a nice day!
P.S. P.S. If there is anything wrong with the contents, the sentences or even the style, please do not hesitate to tell me in a reply. Many thanks in advance!