-
I have programmed a Python code to compare the performance of different data sorting algorithms.
In each sorting method, I count the number of times a comparison is made between data and the number of times a copy or movement of data is made.
Sometimes the data will be easy to compare (in the case for example of comparing integers as in this program), but there will be other times when you want to minimize the number of these operations because the data is complex to compare.
The same thing happens with data movement or copying. If the movement or swap between two data is performed with the pointers pointing to those data, the operation will be fast. If, on the other hand, it is necessary to physically move the data and the data is large, we will want to minimize the number of data moved.
For this reason, these data appear as the main performance indicators of each of the sorting algorithms.
The execution time in seconds also appears at the end of the report, but this time must be interpreted relatively, because the algorithms are not optimized. They are running in Python and perform other operations such as counting the number of comparisons and data movements.
This is the program:
"""
Comparison between sorting algorithms
Copyright (c) 2022 Picuino
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
"""
import copy
import math
import random
import time
num_compare = 0
num_copy = 0
def main():
algorithms = [
{"name":"Bubble Sort", "function":bubble_sort},
{"name":"Direct Selection", "function":direct_selection},
{"name":"Direct Insertion", "function":direct_insertion},
{"name":"Binary Insertion", "function":binary_insertion},
{"name":"ShellSort", "function":shellsort},
{"name":"MergeSort", "function":mergesort},
{"name":"HeapSort", "function":heapsort},
{"name":"QuickSort", "function":quicksort},
]
algorithms_not_used = [
]
# Initialize report file
report_head = "Algorithm,Array size,Comparisons,Units,Copies,Units,Execution time"
with open("Report.csv", "wt") as report_file:
report_file.write(report_head + '\n')
print(report_head)
# Sort arrays of different sizes
array_sizes = [10, 100, 1000, 10000, 100000, 1000000]
for array_size in array_sizes:
array = generate_random_array(array_size)
for algorithm in algorithms:
if array_size > 10000 and algorithm["function"] in (bubble_sort, direct_insertion, binary_insertion, direct_selection):
continue
report = sort(algorithm, array)
print(report)
report_file.write(report + '\n')
def sort(algorithm, array):
global num_compare, num_copy
array_copy = copy.copy(array)
num_compare = num_copy = 0
run_time = time.time()
algorithm["function"](array_copy)
run_time = time.time() - run_time
if is_sorted(array_copy):
report = generate_report(algorithm, len(array), num_compare, num_copy, run_time)
else:
report = "Array not sorted."
return report
def generate_random_array(array_size):
array = []
for i in range(array_size):
array.append(random.getrandbits(32))
return array
def is_sorted(array):
for i in range(1, len(array)):
if array[i-1] > array[i]:
return False
return True
def generate_report(algorithm, array_size, num_compare, num_copy, run_time):
report = []
report.append("%-16s" % algorithm["name"])
report.append("%d" % array_size)
# Number of comparisons
if algorithm['function'] in (shellsort, heapsort, quicksort, mergesort):
report.append("%1.4f,N*Ln(N)" % (num_compare/(array_size * math.log(array_size))))
else:
report.append("%1.4f,N*N" % (num_compare/float(array_size * array_size)))
# Number of copies
if algorithm['function'] in (shellsort, heapsort, quicksort, mergesort):
report.append("%1.4f,N*Ln(N)" % (num_copy/(array_size * math.log(array_size))))
else:
report.append("%1.4f,N*N" % (num_copy/float(array_size * array_size)))
# Execution time
report.append("%1.4f" % run_time)
return ','.join(report)
def bubble_sort(array):
global num_compare, num_copy
for i in range(len(array), 1, -1):
moves = 0
for j in range(1, i):
num_compare += 1
if array[j - 1] > array[j]:
num_copy += 3
temp = array[j]
array[j] = array[j - 1]
array[j -1] = temp
moves = 1
if moves == 0:
break
def binary_insertion(array):
global num_compare, num_copy
for pos in range(1, len(array)):
num_copy += 1
temp = array[pos]
inf = 0
sup = pos
while inf < sup:
med = int((inf + sup) / 2)
num_compare += 1
if temp > array[med]:
inf = med + 1
else:
sup = med
while pos > inf:
num_copy += 1
array[pos] = array[pos - 1]
pos -= 1
num_copy += 1
array[pos] = temp
return num_compare, num_copy
def direct_selection(array):
global num_compare, num_copy
t_inf = 0
t_sup = len(array) - 1
while t_inf < t_sup:
p_min = t_inf
j = t_inf + 1
while j <= t_sup:
num_compare += 1
if array[j] < array[p_min]:
p_min = j
j += 1
num_copy += 3
temp = array[t_inf]
array[t_inf] = array[p_min]
array[p_min] = temp
t_inf += 1
def direct_insertion(array):
global num_compare, num_copy
i = 1
while i < len(array):
num_copy += 1
temp = array[i]
j = i
while j >= 1:
num_compare += 1
num_copy += 1
if array[j - 1] > temp:
array[j] = array[j - 1]
j -= 1
else:
break
array[j] = temp
i += 1
def shellsort(array):
global num_compare, num_copy
i = len(array)
step = 1
while i >= 4:
i //= 2
step *= 2
while step >= 1:
i = step
while i < len(array):
num_copy += 1
temp = array[i]
j = i
while j >= step:
num_compare += 1
num_copy += 1
if array[j-step] > temp:
array[j] = array[j-step]
j -= step
else:
break
array[j] = temp
i += 1
step //= 2
def heapsort(array):
global num_compare, num_copy
# Make heap structure
pointer = int((len(array) - 1) / 2)
while pointer >= 0:
heapsort_screen(array, pointer, len(array) - 1)
pointer -= 1
# Sort array
pointer = len(array) - 1
while pointer > 0:
num_copy += 3
temp = array[pointer]
array[pointer] = array[0]
array[0] = temp
pointer -= 1
heapsort_screen(array, 0, pointer)
def heapsort_screen(array, array_min, array_max):
global num_compare, num_copy
down = array_min * 2 + 1
if (down > array_max):
return
num_copy += 1
temp = array[array_min]
while down <= array_max:
if down < array_max:
num_compare += 1
if array[down + 1] > array[down]:
down += 1
num_compare += 1
if temp > array[down]:
break
num_copy += 1
array[array_min] = array[down]
array_min = down
down = down * 2 + 1
num_copy += 1
array[array_min] = temp
def mergesort(array):
mergesort_2(array, 0, len(array)-1)
def mergesort_2(array, inicio, fin):
global num_compare, num_copy
if fin <= inicio+1:
num_compare += 1
if array[fin]<array[inicio]:
num_copy += 3
temp = array[fin]
array[fin] = array[inicio]
array[inicio] = temp
return
mid = int((inicio + fin)/2)
mergesort_2(array, inicio, mid)
mergesort_2(array, mid+1, fin)
merge_temp(array, inicio, mid+1, fin)
def merge_temp(array, init1, init2, end2):
global num_compare, num_copy
memo_temp = 10
temp = list(range(memo_temp))
end1 = init2-1
i = 0
while init1 <= end1 and init2 <= end2:
num_compare += 1
num_copy += 1
if array[init2] < array[init1]:
temp[i] = array[init2]
init2 += 1
else:
temp[i] = array[init1]
init1 += 1
i += 1
if i >= memo_temp:
merge_insert(array, temp, i, init1, end1, init2, end2)
init1 = init1 + (init2 - 1 - end1)
end1 = init2 - 1
i = 0
merge_insert(array, temp, i, init1, end1, init2, end2)
def merge_insert(array, temp, leng, init1, end1, init2, end2):
global num_compare, num_copy
init_copy = init1 + (init2-end1-1) - leng
j = init2-1
while end1 >= init1:
num_copy += 1
array[j] = array[end1]
j -= 1
end1 -= 1
init1 = init_copy + leng
j = 0
while init_copy < init1:
num_copy += 1
array[init_copy] = temp[j]
init_copy += 1
j += 1
def quicksort(array):
global num_compare, num_copy
quicksort_2(array, 0, len(array)-1)
return num_compare, num_copy
def quicksort_2(array, min_pos, max_pos):
global num_compare, num_copy
if min_pos >= max_pos:
return
imin_pos = min_pos
imax_pos = max_pos
mid = array[int((min_pos + max_pos)/2)]
while imin_pos <= imax_pos:
while array[imin_pos] < mid:
num_compare += 1
imin_pos += 1
while array[imax_pos] > mid:
imax_pos -= 1
num_compare += 1
num_compare += 3
if imin_pos == imax_pos:
imin_pos += 1
imax_pos -= 1
break
if imin_pos < imax_pos:
num_copy +=3
temp = array[imin_pos]
array[imin_pos] = array[imax_pos]
array[imax_pos] = temp
imin_pos += 1
imax_pos -= 1
if imax_pos - min_pos < max_pos - imin_pos:
quicksort_2(array, min_pos, imax_pos)
quicksort_2(array, imin_pos, max_pos)
else:
quicksort_2(array, imin_pos, max_pos)
quicksort_2(array, min_pos, imax_pos)
# Run program
if __name__ == "__main__":
main()
And this is the report (in csv) with my computer:
Algorithm,Array size,Comparisons,Units,Copies,Units,Execution time
Bubble Sort ,10,0.4400,N*N,0.5700,N*N,0.0000
Direct Insertion,10,0.2700,N*N,0.3600,N*N,0.0000
Binary Insertion,10,0.2200,N*N,0.3700,N*N,0.0000
Direct Selection,10,0.4500,N*N,0.2700,N*N,0.0000
ShellSort ,10,1.3029,N*Ln(N),2.3018,N*Ln(N),0.0000
HeapSort ,10,1.6069,N*Ln(N),3.0835,N*Ln(N),0.0000
QuickSort ,10,2.6926,N*Ln(N),1.1726,N*Ln(N),0.0000
MergeSort ,10,1.1726,N*Ln(N),2.0846,N*Ln(N),0.0000
Bubble Sort ,100,0.4905,N*N,0.6393,N*N,0.0000
Direct Insertion,100,0.2228,N*N,0.2327,N*N,0.0000
Binary Insertion,100,0.0528,N*N,0.2329,N*N,0.0000
Direct Selection,100,0.4950,N*N,0.0297,N*N,0.0000
ShellSort ,100,2.0564,N*Ln(N),3.2225,N*Ln(N),0.0000
HeapSort ,100,2.2583,N*Ln(N),2.3430,N*Ln(N),0.0000
QuickSort ,100,2.3018,N*Ln(N),0.9316,N*Ln(N),0.0000
MergeSort ,100,1.2269,N*Ln(N),3.1139,N*Ln(N),0.0000
Bubble Sort ,1000,0.4993,N*N,0.7307,N*N,0.1003
Direct Insertion,1000,0.2446,N*N,0.2456,N*N,0.0529
Binary Insertion,1000,0.0086,N*N,0.2456,N*N,0.0625
Direct Selection,1000,0.4995,N*N,0.0030,N*N,0.0845
ShellSort ,1000,3.4492,N*Ln(N),4.6781,N*Ln(N),0.0219
HeapSort ,1000,2.4396,N*Ln(N),2.0350,N*Ln(N),0.0157
QuickSort ,1000,2.3769,N*Ln(N),0.9715,N*Ln(N),0.0000
MergeSort ,1000,1.2628,N*Ln(N),9.3996,N*Ln(N),0.0000
Bubble Sort ,10000,0.4998,N*N,0.7534,N*N,10.6998
Direct Insertion,10000,0.2512,N*N,0.2513,N*N,6.6190
Binary Insertion,10000,0.0012,N*N,0.2513,N*N,3.4726
Direct Selection,10000,0.5000,N*N,0.0003,N*N,6.1176
ShellSort ,10000,6.6957,N*Ln(N),8.0182,N*Ln(N),0.1693
HeapSort ,10000,2.5553,N*Ln(N),1.8904,N*Ln(N),0.0469
QuickSort ,10000,2.3768,N*Ln(N),1.0155,N*Ln(N),0.0313
MergeSort ,10000,1.3280,N*Ln(N),56.7526,N*Ln(N),0.7800
ShellSort ,100000,15.2216,N*Ln(N),16.5545,N*Ln(N),4.6824
HeapSort ,100000,2.6231,N*Ln(N),1.8023,N*Ln(N),0.6550
QuickSort ,100000,2.3659,N*Ln(N),1.0095,N*Ln(N),0.3543
MergeSort ,100000,1.3615,N*Ln(N),437.2988,N*Ln(N),72.9004
ShellSort ,1000000,36.4804,N*Ln(N),37.8177,N*Ln(N),143.0511
HeapSort ,1000000,2.6633,N*Ln(N),1.7407,N*Ln(N),8.9103
QuickSort ,1000000,2.3073,N*Ln(N),1.0095,N*Ln(N),4.2588
Edit:
In the results, Units are given in N*N or N*log(N) where N is the number of array elements (array size).
For example if the comparisons are 0.456 N*N for an array of 100 elements, it means that there have been 4560 comparisons in total.
The calculation is done by dividing 4560 by 100*100.
-
I'm going to add this 'new' algoritm:
https://www.eevblog.com/forum/programming/interesting-new-sort-algorithm/msg4314406/#msg4314406 (https://www.eevblog.com/forum/programming/interesting-new-sort-algorithm/msg4314406/#msg4314406)
"""
Comparison between sorting algorithms
Copyright (c) 2022 Picuino
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
"""
import copy
import math
import random
import time
num_compare = 0
num_copy = 0
def main():
algorithms = [
{"name":"Bubble Sort", "function":bubble_sort},
{"name":"Direct Selection", "function":direct_selection},
{"name":"Direct Insertion", "function":direct_insertion},
{"name":"Binary Insertion", "function":binary_insertion},
{"name":"New sort", "function":new_sort},
]
algorithms_not_used = [
{"name":"ShellSort", "function":shellsort},
{"name":"MergeSort", "function":mergesort},
{"name":"HeapSort", "function":heapsort},
{"name":"QuickSort", "function":quicksort},
]
# Initialize report file
report_head = "Algorithm,Array size,Comparisons,Units,Copies,Units,Execution time"
with open("Report.csv", "wt") as report_file:
report_file.write(report_head + '\n')
print(report_head)
# Sort arrays of different sizes
array_sizes = [10, 100, 1000, 10000, 100000, 1000000]
for array_size in array_sizes:
array = generate_random_array(array_size)
for algorithm in algorithms:
if array_size > 10000 and algorithm["function"] in (bubble_sort, direct_insertion, binary_insertion, direct_selection, new_sort):
continue
report = sort(algorithm, array)
print(report)
report_file.write(report + '\n')
def sort(algorithm, array):
global num_compare, num_copy
array_copy = copy.copy(array)
num_compare = num_copy = 0
run_time = time.time()
algorithm["function"](array_copy)
run_time = time.time() - run_time
if is_sorted(array_copy):
report = generate_report(algorithm, len(array), num_compare, num_copy, run_time)
else:
report = "Array not sorted."
return report
def generate_random_array(array_size):
array = []
for i in range(array_size):
array.append(random.getrandbits(32))
return array
def is_sorted(array):
for i in range(1, len(array)):
if array[i-1] > array[i]:
return False
return True
def generate_report(algorithm, array_size, num_compare, num_copy, run_time):
report = []
report.append("%-16s" % algorithm["name"])
report.append("%d" % array_size)
# Number of comparisons
if algorithm['function'] in (shellsort, heapsort, quicksort, mergesort):
report.append("%1.4f,N*Ln(N)" % (num_compare/(array_size * math.log(array_size))))
else:
report.append("%1.4f,N*N" % (num_compare/float(array_size * array_size)))
# Number of copies
if algorithm['function'] in (shellsort, heapsort, quicksort, mergesort):
report.append("%1.4f,N*Ln(N)" % (num_copy/(array_size * math.log(array_size))))
else:
report.append("%1.4f,N*N" % (num_copy/float(array_size * array_size)))
# Execution time
report.append("%1.4f" % run_time)
return ','.join(report)
def bubble_sort(array):
global num_compare, num_copy
for i in range(len(array), 1, -1):
moves = 0
for j in range(1, i):
num_compare += 1
if array[j - 1] > array[j]:
num_copy += 3
temp = array[j]
array[j] = array[j - 1]
array[j -1] = temp
moves = 1
if moves == 0:
break
def binary_insertion(array):
global num_compare, num_copy
for pos in range(1, len(array)):
num_copy += 1
temp = array[pos]
inf = 0
sup = pos
while inf < sup:
med = int((inf + sup) / 2)
num_compare += 1
if temp > array[med]:
inf = med + 1
else:
sup = med
while pos > inf:
num_copy += 1
array[pos] = array[pos - 1]
pos -= 1
num_copy += 1
array[pos] = temp
return num_compare, num_copy
def direct_selection(array):
global num_compare, num_copy
t_inf = 0
t_sup = len(array) - 1
while t_inf < t_sup:
p_min = t_inf
j = t_inf + 1
while j <= t_sup:
num_compare += 1
if array[j] < array[p_min]:
p_min = j
j += 1
num_copy += 3
temp = array[t_inf]
array[t_inf] = array[p_min]
array[p_min] = temp
t_inf += 1
def direct_insertion(array):
global num_compare, num_copy
i = 1
while i < len(array):
num_copy += 1
temp = array[i]
j = i
while j >= 1:
num_compare += 1
num_copy += 1
if array[j - 1] > temp:
array[j] = array[j - 1]
j -= 1
else:
break
array[j] = temp
i += 1
def new_sort(array):
global num_compare, num_copy
for i in range(len(array)):
for j in range(len(array)):
num_compare += 1
if array[i] < array[j]:
num_copy += 3
temp = array[i]
array[i] = array[j]
array[j] = temp
def shellsort(array):
global num_compare, num_copy
i = len(array)
step = 1
while i >= 4:
i //= 2
step *= 2
while step >= 1:
i = step
while i < len(array):
num_copy += 1
temp = array[i]
j = i
while j >= step:
num_compare += 1
num_copy += 1
if array[j-step] > temp:
array[j] = array[j-step]
j -= step
else:
break
array[j] = temp
i += 1
step //= 2
def heapsort(array):
global num_compare, num_copy
# Make heap structure
pointer = int((len(array) - 1) / 2)
while pointer >= 0:
heapsort_screen(array, pointer, len(array) - 1)
pointer -= 1
# Sort array
pointer = len(array) - 1
while pointer > 0:
num_copy += 3
temp = array[pointer]
array[pointer] = array[0]
array[0] = temp
pointer -= 1
heapsort_screen(array, 0, pointer)
def heapsort_screen(array, array_min, array_max):
global num_compare, num_copy
down = array_min * 2 + 1
if (down > array_max):
return
num_copy += 1
temp = array[array_min]
while down <= array_max:
if down < array_max:
num_compare += 1
if array[down + 1] > array[down]:
down += 1
num_compare += 1
if temp > array[down]:
break
num_copy += 1
array[array_min] = array[down]
array_min = down
down = down * 2 + 1
num_copy += 1
array[array_min] = temp
def mergesort(array):
mergesort_2(array, 0, len(array)-1)
def mergesort_2(array, inicio, fin):
global num_compare, num_copy
if fin <= inicio+1:
num_compare += 1
if array[fin]<array[inicio]:
num_copy += 3
temp = array[fin]
array[fin] = array[inicio]
array[inicio] = temp
return
mid = int((inicio + fin)/2)
mergesort_2(array, inicio, mid)
mergesort_2(array, mid+1, fin)
merge_temp(array, inicio, mid+1, fin)
def merge_temp(array, init1, init2, end2):
global num_compare, num_copy
memo_temp = 10
temp = list(range(memo_temp))
end1 = init2-1
i = 0
while init1 <= end1 and init2 <= end2:
num_compare += 1
num_copy += 1
if array[init2] < array[init1]:
temp[i] = array[init2]
init2 += 1
else:
temp[i] = array[init1]
init1 += 1
i += 1
if i >= memo_temp:
merge_insert(array, temp, i, init1, end1, init2, end2)
init1 = init1 + (init2 - 1 - end1)
end1 = init2 - 1
i = 0
merge_insert(array, temp, i, init1, end1, init2, end2)
def merge_insert(array, temp, leng, init1, end1, init2, end2):
global num_compare, num_copy
init_copy = init1 + (init2-end1-1) - leng
j = init2-1
while end1 >= init1:
num_copy += 1
array[j] = array[end1]
j -= 1
end1 -= 1
init1 = init_copy + leng
j = 0
while init_copy < init1:
num_copy += 1
array[init_copy] = temp[j]
init_copy += 1
j += 1
def quicksort(array):
global num_compare, num_copy
quicksort_2(array, 0, len(array)-1)
return num_compare, num_copy
def quicksort_2(array, min_pos, max_pos):
global num_compare, num_copy
if min_pos >= max_pos:
return
imin_pos = min_pos
imax_pos = max_pos
mid = array[int((min_pos + max_pos)/2)]
while imin_pos <= imax_pos:
while array[imin_pos] < mid:
num_compare += 1
imin_pos += 1
while array[imax_pos] > mid:
imax_pos -= 1
num_compare += 1
num_compare += 3
if imin_pos == imax_pos:
imin_pos += 1
imax_pos -= 1
break
if imin_pos < imax_pos:
num_copy +=3
temp = array[imin_pos]
array[imin_pos] = array[imax_pos]
array[imax_pos] = temp
imin_pos += 1
imax_pos -= 1
if imax_pos - min_pos < max_pos - imin_pos:
quicksort_2(array, min_pos, imax_pos)
quicksort_2(array, imin_pos, max_pos)
else:
quicksort_2(array, imin_pos, max_pos)
quicksort_2(array, min_pos, imax_pos)
# Run program
if __name__ == "__main__":
main()
Results:
Bubble Sort ,10,0.4400,N*N,0.8400,N*N,0.0000
Direct Selection,10,0.4500,N*N,0.2700,N*N,0.0000
Direct Insertion,10,0.3500,N*N,0.4400,N*N,0.0000
Binary Insertion,10,0.2300,N*N,0.4600,N*N,0.0000
New sort ,10,1.0000,N*N,0.9000,N*N,0.0000
Bubble Sort ,100,0.4947,N*N,0.7074,N*N,0.0000
Direct Selection,100,0.4950,N*N,0.0297,N*N,0.0000
Direct Insertion,100,0.2453,N*N,0.2552,N*N,0.0000
Binary Insertion,100,0.0533,N*N,0.2556,N*N,0.0000
New sort ,100,1.0000,N*N,0.7080,N*N,0.0000
Bubble Sort ,1000,0.4989,N*N,0.7663,N*N,0.1003
Direct Selection,1000,0.4995,N*N,0.0030,N*N,0.0534
Direct Insertion,1000,0.2564,N*N,0.2574,N*N,0.0625
Binary Insertion,1000,0.0086,N*N,0.2574,N*N,0.0690
New sort ,1000,1.0000,N*N,0.7602,N*N,0.1537
Bubble Sort ,10000,0.4998,N*N,0.7441,N*N,11.3787
Direct Selection,10000,0.5000,N*N,0.0003,N*N,6.5603
Direct Insertion,10000,0.2481,N*N,0.2482,N*N,7.2902
Binary Insertion,10000,0.0012,N*N,0.2482,N*N,3.6161
New sort ,10000,1.0000,N*N,0.7073,N*N,13.3711
Edit: Error corrected in new_sort()
-
This 'new sort' algorithm is worst than bubble sort and not more 'intuitive'
-
Can somebody help me improve the merge algorithm?
It seems to run slower than it should.
-
They are running in Python
Sorting has always been a very sophisticated problem.
Interpreted languages might heavily optimize what you are trying to achieve there (or the opposite), as they might not be optimizing for sort speed but for memory utilization or something else.
Have you considered comparing it to pythons built in sort function?
-
I do not intend to make practical sorting algorithms. I would have to do that in C by sorting, for example, strings.
The sort method built in Python is written in C and optimized, it cannot be compared to the others.
As a curiosity, Python uses the TimSort algorithm which is a mix derived from merge sort and insertion sort.
https://en.wikipedia.org/wiki/Timsort
What I intend with this program are two things:
1 - Show how the different algorithms are programmed.
2 - Compare with each other the different algorithms with the number of comparisons, the number of data movements and the relative execution time of each of them.
-
Program updated with the TimSort algorithm. Lists are used in this implementation. This needs to be changed so that TimSort can work with an array in place.
"""
Comparison between sorting algorithms
Copyright (c) 2022 Picuino
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
"""
import copy
import math
import random
import time
num_compare = 0
num_copy = 0
def main():
algorithms = [
{"name":"Bubble Sort", "function":bubble_sort},
{"name":"Direct Selection", "function":direct_selection},
{"name":"Direct Insertion", "function":direct_insertion},
{"name":"Binary Insertion", "function":binary_insertion},
{"name":"ShellSort", "function":shellsort},
{"name":"MergeSort", "function":mergesort},
{"name":"TimSort", "function":timsort},
{"name":"HeapSort", "function":heapsort},
{"name":"QuickSort", "function":quicksort},
]
# Initialize report file
report_head = "Algorithm,Array size,Comparisons,Units,Copies,Units,Execution time"
with open("Report.csv", "wt") as report_file:
report_file.write(report_head + '\n')
print(report_head)
# Sort arrays of different sizes
array_sizes = [10, 100, 1000, 10000, 100000, 1000000]
fast_algorithms = (timsort, heapsort, quicksort)
for array_size in array_sizes:
array = generate_random_array(array_size)
for algorithm in algorithms:
if array_size > 10000 and not algorithm["function"] in fast_algorithms:
continue
report = sort(algorithm, array)
print(report)
report_file.write(report + '\n')
def sort(algorithm, array):
global num_compare, num_copy
array_copy = copy.copy(array)
num_compare = num_copy = 0
run_time = time.time()
algorithm["function"](array_copy)
run_time = time.time() - run_time
if is_sorted(array_copy):
report = generate_report(algorithm, len(array), num_compare, num_copy, run_time)
else:
report = "Array not sorted."
return report
def generate_random_array(array_size):
array = []
for i in range(array_size):
array.append(random.getrandbits(32))
return array
def is_sorted(array):
for i in range(1, len(array)):
if array[i-1] > array[i]:
return False
return True
def generate_report(algorithm, array_size, num_compare, num_copy, run_time):
report = []
report.append("%-16s" % algorithm["name"])
report.append("%d" % array_size)
# Number of comparisons
if algorithm['function'] in (shellsort, heapsort, quicksort, timsort, mergesort):
report.append("%1.4f,N*Ln(N)" % (num_compare/(array_size * math.log(array_size))))
else:
report.append("%1.4f,N*N" % (num_compare/float(array_size * array_size)))
# Number of copies
if algorithm['function'] in (shellsort, heapsort, quicksort, timsort, mergesort):
report.append("%1.4f,N*Ln(N)" % (num_copy/(array_size * math.log(array_size))))
else:
report.append("%1.4f,N*N" % (num_copy/float(array_size * array_size)))
# Execution time
report.append("%1.4f" % run_time)
return ','.join(report)
def bubble_sort(array):
global num_compare, num_copy
for i in range(len(array), 1, -1):
moves = 0
for j in range(1, i):
num_compare += 1
if array[j - 1] > array[j]:
num_copy += 3
temp = array[j]
array[j] = array[j - 1]
array[j -1] = temp
moves = 1
if moves == 0:
break
def binary_insertion(array):
global num_compare, num_copy
for pos in range(1, len(array)):
num_copy += 1
temp = array[pos]
inf = 0
sup = pos
while inf < sup:
med = int((inf + sup) / 2)
num_compare += 1
if temp > array[med]:
inf = med + 1
else:
sup = med
while pos > inf:
num_copy += 1
array[pos] = array[pos - 1]
pos -= 1
num_copy += 1
array[pos] = temp
def direct_selection(array):
global num_compare, num_copy
t_inf = 0
t_sup = len(array) - 1
while t_inf < t_sup:
p_min = t_inf
j = t_inf + 1
while j <= t_sup:
num_compare += 1
if array[j] < array[p_min]:
p_min = j
j += 1
num_copy += 3
temp = array[t_inf]
array[t_inf] = array[p_min]
array[p_min] = temp
t_inf += 1
def direct_insertion(array):
global num_compare, num_copy
i = 1
while i < len(array):
num_copy += 1
temp = array[i]
j = i
while j >= 1:
num_compare += 1
num_copy += 1
if array[j - 1] > temp:
array[j] = array[j - 1]
j -= 1
else:
break
array[j] = temp
i += 1
def shellsort(array):
global num_compare, num_copy
i = len(array)
step = 1
while i >= 4:
i //= 2
step *= 2
while step >= 1:
i = step
while i < len(array):
num_copy += 1
temp = array[i]
j = i
while j >= step:
num_compare += 1
num_copy += 1
if array[j-step] > temp:
array[j] = array[j-step]
j -= step
else:
break
array[j] = temp
i += 1
step //= 2
def mergesort(array):
mergesort_2(array, 0, len(array)-1)
def mergesort_2(array, inicio, fin):
global num_compare, num_copy
if fin <= inicio+1:
num_compare += 1
if array[fin]<array[inicio]:
num_copy += 3
temp = array[fin]
array[fin] = array[inicio]
array[inicio] = temp
return
mid = int((inicio + fin)/2)
mergesort_2(array, inicio, mid)
mergesort_2(array, mid+1, fin)
merge_temp(array, inicio, mid+1, fin)
def merge_temp(array, init1, init2, end2):
global num_compare, num_copy
memo_temp = 10
temp = list(range(memo_temp))
end1 = init2-1
i = 0
while init1 <= end1 and init2 <= end2:
num_compare += 1
num_copy += 1
if array[init2] < array[init1]:
temp[i] = array[init2]
init2 += 1
else:
temp[i] = array[init1]
init1 += 1
i += 1
if i >= memo_temp:
merge_insert(array, temp, i, init1, end1, init2, end2)
init1 = init1 + (init2 - 1 - end1)
end1 = init2 - 1
i = 0
merge_insert(array, temp, i, init1, end1, init2, end2)
def merge_insert(array, temp, leng, init1, end1, init2, end2):
global num_compare, num_copy
init_copy = init1 + (init2-end1-1) - leng
j = init2-1
while end1 >= init1:
num_copy += 1
array[j] = array[end1]
j -= 1
end1 -= 1
init1 = init_copy + leng
j = 0
while init_copy < init1:
num_copy += 1
array[init_copy] = temp[j]
init_copy += 1
j += 1
def heapsort(array):
global num_compare, num_copy
# Make heap structure
pointer = int((len(array) - 1) / 2)
while pointer >= 0:
heapsort_screen(array, pointer, len(array) - 1)
pointer -= 1
# Sort array
pointer = len(array) - 1
while pointer > 0:
num_copy += 3
temp = array[pointer]
array[pointer] = array[0]
array[0] = temp
pointer -= 1
heapsort_screen(array, 0, pointer)
def heapsort_screen(array, array_min, array_max):
global num_compare, num_copy
down = array_min * 2 + 1
if (down > array_max):
return
num_copy += 1
temp = array[array_min]
while down <= array_max:
if down < array_max:
num_compare += 1
if array[down + 1] > array[down]:
down += 1
num_compare += 1
if temp > array[down]:
break
num_copy += 1
array[array_min] = array[down]
array_min = down
down = down * 2 + 1
num_copy += 1
array[array_min] = temp
def timsort(array):
n = len(array)
minrun = timsort_find_minrun(n)
for start in range(0, n, minrun):
end = min(start + minrun - 1, n - 1)
timsort_insertion(array, start, end)
size = minrun
while size < n:
for left in range(0, n, 2 * size):
mid = min(n - 1, left + size - 1)
right = min((left + 2 * size - 1), (n - 1))
timsort_merge(array, left, mid, right)
size = 2 * size
def timsort_find_minrun(n):
TIMSORT_MINIMUM = 32
r = 0
while n >= TIMSORT_MINIMUM:
r |= n & 1
n >>= 1
return n + r
def timsort_insertion(array, left, right):
global num_compare, num_copy
for i in range(left+1, right+1):
num_copy += 1
element = array[i]
j = i - 1
num_compare += 1
while element < array[j] and j >= left :
num_copy += 1
array[j+1] = array[j]
j -= 1
num_compare += 1
num_copy += 1
array[j+1] = element
return array
def timsort_merge(array, l, m, r):
global num_compare, num_copy
array_length1 = m - l + 1
array_length2 = r - m
left = []
right = []
for i in range(0, array_length1):
num_copy += 1
left.append(array[l + i])
for i in range(0, array_length2):
num_copy += 1
right.append(array[m + 1 + i])
i=0
j=0
k=l
while j < array_length2 and i < array_length1:
num_compare += 1
if left[i] <= right[j]:
num_copy += 1
array[k] = left[i]
i += 1
else:
num_copy += 1
array[k] = right[j]
j += 1
k += 1
while i < array_length1:
num_copy += 1
array[k] = left[i]
k += 1
i += 1
while j < array_length2:
num_copy += 1
array[k] = right[j]
k += 1
j += 1
def quicksort(array):
global num_compare, num_copy
quicksort_2(array, 0, len(array)-1)
def quicksort_2(array, min_pos, max_pos):
global num_compare, num_copy
if min_pos >= max_pos:
return
imin_pos = min_pos
imax_pos = max_pos
mid = array[int((min_pos + max_pos)/2)]
while imin_pos <= imax_pos:
while array[imin_pos] < mid:
num_compare += 1
imin_pos += 1
while array[imax_pos] > mid:
imax_pos -= 1
num_compare += 1
num_compare += 3
if imin_pos == imax_pos:
imin_pos += 1
imax_pos -= 1
break
if imin_pos < imax_pos:
num_copy +=3
temp = array[imin_pos]
array[imin_pos] = array[imax_pos]
array[imax_pos] = temp
imin_pos += 1
imax_pos -= 1
if imax_pos - min_pos < max_pos - imin_pos:
quicksort_2(array, min_pos, imax_pos)
quicksort_2(array, imin_pos, max_pos)
else:
quicksort_2(array, imin_pos, max_pos)
quicksort_2(array, min_pos, imax_pos)
# Run program
if __name__ == "__main__":
main()
Results:
Algorithm,Array size,Comparisons,Units,Copies,Units,Execution time
Bubble Sort ,10,0.4400,N*N,0.8100,N*N,0.0000
Direct Selection,10,0.4500,N*N,0.2700,N*N,0.0000
Direct Insertion,10,0.3600,N*N,0.4500,N*N,0.0000
Binary Insertion,10,0.2300,N*N,0.4500,N*N,0.0000
ShellSort ,10,1.3463,N*Ln(N),2.3452,N*Ln(N),0.0000
MergeSort ,10,1.1292,N*Ln(N),1.9978,N*Ln(N),0.0000
TimSort ,10,1.5635,N*Ln(N),1.9543,N*Ln(N),0.0000
HeapSort ,10,1.6503,N*Ln(N),3.0835,N*Ln(N),0.0000
QuickSort ,10,2.3018,N*Ln(N),0.9120,N*Ln(N),0.0000
Bubble Sort ,100,0.4947,N*N,0.8259,N*N,0.0000
Direct Selection,100,0.4950,N*N,0.0297,N*N,0.0156
Direct Insertion,100,0.2849,N*N,0.2948,N*N,0.0000
Binary Insertion,100,0.0533,N*N,0.2951,N*N,0.0000
ShellSort ,100,2.2909,N*Ln(N),3.4570,N*Ln(N),0.0000
MergeSort ,100,1.2551,N*Ln(N),3.3658,N*Ln(N),0.0000
TimSort ,100,1.9587,N*Ln(N),2.6123,N*Ln(N),0.0000
HeapSort ,100,2.2301,N*Ln(N),2.3083,N*Ln(N),0.0000
QuickSort ,100,2.1954,N*Ln(N),1.0228,N*Ln(N),0.0000
Bubble Sort ,1000,0.4994,N*N,0.7659,N*N,0.1380
Direct Selection,1000,0.4995,N*N,0.0030,N*N,0.0781
Direct Insertion,1000,0.2563,N*N,0.2573,N*N,0.0690
Binary Insertion,1000,0.0086,N*N,0.2573,N*N,0.0312
ShellSort ,1000,3.4677,N*Ln(N),4.6966,N*Ln(N),0.0157
MergeSort ,1000,1.2632,N*Ln(N),9.5585,N*Ln(N),0.0000
TimSort ,1000,1.9841,N*Ln(N),2.8565,N*Ln(N),0.0000
HeapSort ,1000,2.4354,N*Ln(N),2.0328,N*Ln(N),0.0000
QuickSort ,1000,2.3320,N*Ln(N),1.0145,N*Ln(N),0.0156
Bubble Sort ,10000,0.4999,N*N,0.7448,N*N,11.0654
Direct Selection,10000,0.5000,N*N,0.0003,N*N,6.4206
Direct Insertion,10000,0.2484,N*N,0.2485,N*N,6.8213
Binary Insertion,10000,0.0012,N*N,0.2485,N*N,3.7271
ShellSort ,10000,7.4491,N*Ln(N),8.7716,N*Ln(N),0.2006
MergeSort ,10000,1.3276,N*Ln(N),56.4443,N*Ln(N),0.8181
TimSort ,10000,1.5810,N*Ln(N),2.6737,N*Ln(N),0.0467
HeapSort ,10000,2.5559,N*Ln(N),1.8915,N*Ln(N),0.0535
QuickSort ,10000,2.2253,N*Ln(N),1.0183,N*Ln(N),0.0156
TimSort ,100000,1.6376,N*Ln(N),2.7715,N*Ln(N),0.7021
HeapSort ,100000,2.6224,N*Ln(N),1.8021,N*Ln(N),0.7022
QuickSort ,100000,2.3369,N*Ln(N),1.0141,N*Ln(N),0.3701
TimSort ,1000000,1.6947,N*Ln(N),2.8551,N*Ln(N),9.6082
HeapSort ,1000000,2.6631,N*Ln(N),1.7406,N*Ln(N),9.9156
QuickSort ,1000000,2.2399,N*Ln(N),1.0164,N*Ln(N),4.4138
-
I'm not a fan of Python (as you probably already figured) and not aware in details about how it performs using varying data structures, but at first sight, I would say that the merge_insert() function is probably a bottleneck here.
-
can you translate and add this (originally VB) into your setup? change data type if necessary. its a variant of shell sort, verify it separately first as it can be unstable on some array indexing type. i suggest to test on random (unsorted), some/half part already sorted and nearly sorted data... to simulate near all real life situations. ymmv. (ps: i want to see how it fair up with python's timsort :P)
Public Sub GB_ShellSort(A() As Integer)
Dim temp As Integer
Dim increm As Long
Dim numOfEntries As Long
Dim i As Long
Dim j As Long
numOfEntries = UBound(A) + 1 '(base 0 array)
increm = numOfEntries * (5 / 11)
Do Until increm < 1
For i = increm To numOfEntries - 1 '(base 0 array)
temp = A(i)
For j = i - increm To 0 Step -increm
If temp >= A(j) Then Exit For
A(j + increm) = A(j)
Next j
A(j + increm) = temp
Next i
increm = increm * (5 / 11)
Loop
End Sub
-
This algorithm is exactly to my algorithm sellsort with one change:
the factor (5/11) in my case is (1 / 2)
this ratio (5/11) does not work for some reason on some array sizes.
Comparing both:
Algorithm,Array size,Randomness,Comparisons,Units,Copies,Units,Execution time
ShellSort , 100, 1.0, 1.9826, N*ln(N), 3.1921, N*ln(N), 0.0000
ShellSort 2 , 100, 1.0, 1.6395, N*ln(N), 2.6883, N*ln(N), 0.0000
ShellSort , 100, 0.2, 1.5461, N*ln(N), 2.6622, N*ln(N), 0.0000
ShellSort 2 , 100, 0.2, 1.1965, N*ln(N), 2.1237, N*ln(N), 0.0000
ShellSort , 1000, 1.0, 2.2608, N*ln(N), 3.4893, N*ln(N), 0.0000
ShellSort 2 , 1000, 1.0, 1.8845, N*ln(N), 2.9947, N*ln(N), 0.0157
ShellSort , 1000, 0.2, 1.9155, N*ln(N), 3.0896, N*ln(N), 0.0000
ShellSort 2 , 1000, 0.2, 1.6555, N*ln(N), 2.7094, N*ln(N), 0.0000
ShellSort , 10000, 1.0, 2.8872, N*ln(N), 4.2443, N*ln(N), 0.0781
ShellSort 2 , 10000, 1.0, 2.1232, N*ln(N), 3.2822, N*ln(N), 0.0623
ShellSort , 10000, 0.2, 2.4435, N*ln(N), 3.7594, N*ln(N), 0.1381
ShellSort 2 , 10000, 0.2, 1.9361, N*ln(N), 3.0524, N*ln(N), 0.0625
ShellSort , 100000, 1.0, 3.7974, N*ln(N), 5.1441, N*ln(N), 1.3666
ShellSort 2 , 100000, 1.0, 2.2413, N*ln(N), 3.4288, N*ln(N), 0.8182
ShellSort , 100000, 0.2, 3.3230, N*ln(N), 4.6364, N*ln(N), 1.2349
ShellSort 2 , 100000, 0.2, 2.1061, N*ln(N), 3.2598, N*ln(N), 0.7868
ShellSort , 1000000, 1.0, 4.8703, N*ln(N), 6.2096, N*ln(N), 21.1200
ShellSort 2 , Array not sorted.
ShellSort , 1000000, 0.2, 4.0135, N*ln(N), 5.3252, N*ln(N), 20.1193
ShellSort 2 , Array not sorted.
Translated algorithm:
def shellsort_2(array):
global num_compare, num_copy
array_size = len(array)
increment = array_size * 5 // 11
while increment >= 1:
i = increment
while i < array_size:
num_copy += 1
temp = array[i]
j = i
while j >= increment:
num_compare += 1
if array[j - increment] > temp:
num_copy += 1
array[j] = array[j - increment]
j -= increment
else:
break
num_copy += 1
array[j] = temp
i += 1
increment = increment * 5 // 11
Your algorithm works correctly if I change the ratio back to 1/2
-
Your algorithm works correctly if I change the ratio back to 1/2
yes for ratio 1/2 is standard shellsort, but the 5/11 i derived from value 2.2 (gonnet baeza) as described in https://en.wikipedia.org/wiki/Shellsort maybe you can tweak the ratio to get correct rounding off error. i can see some improvement compared to classic shellsort when both can sort correctly.
-
I have added a parameter (randomness) of variable randomness for the arrays used, so that the algorithms can be tested with arrays with different sorting/randomness level.
Program:
"""
Comparison between sorting algorithms
Copyright (c) 2022 Picuino
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
"""
import copy
import math
import random
import time
num_compare = 0
num_copy = 0
def main():
algorithms = [
{ "name":"Bubble Sort",
"function":bubble_sort,
"units": "N*N",
"maxsize": 10000,
},
{ "name":"Direct Selection",
"function":direct_selection,
"units": "N*N",
"maxsize": 10000,
},
{ "name":"Direct Insertion",
"function":direct_insertion,
"units": "N*N",
"maxsize": 10000,
},
{ "name":"Binary Insertion",
"function":binary_insertion,
"units": "N*N",
"maxsize": 10000,
},
{ "name":"MergeSort",
"function":mergesort,
"units": "N*ln(N)",
"maxsize": 100000,
},
{ "name":"ShellSort",
"function":shellsort,
"units": "N*ln(N)",
"maxsize": 100000,
},
{ "name":"TimSort",
"function":timsort,
"units": "N*ln(N)",
"maxsize": 1000000,
},
{ "name":"HeapSort",
"function":heapsort,
"units": "N*ln(N)",
"maxsize": 1000000,
},
{ "name":"QuickSort",
"function":quicksort,
"units": "N*ln(N)",
"maxsize": 1000000,
},
]
algorithms_not_used = [
]
# Initialize report file
report_head = "Algorithm,Array size,Randomness,Comparisons,Units,Copies,Units,Execution time"
with open("Report.csv", "wt") as report_file:
report_file.write(report_head + '\n')
print(report_head)
# Sort arrays of different sizes and randomness
array_types = [
( 100, 1), ( 100, 0.1),
( 1000, 1), ( 1000, 0.1),
( 10000, 1), ( 10000, 0.1),
( 100000, 1), ( 100000, 0.1),
(1000000, 1), (1000000, 0.1),
]
for array_type in array_types:
array = generate_random_array(*array_type)
for algorithm in algorithms:
if array_type[0] > algorithm['maxsize']:
continue
report = sort(algorithm, array, array_type)
print(report)
report_file.write(report + '\n')
def sort(algorithm, array, array_type):
global num_compare, num_copy
array_copy = copy.copy(array)
num_compare = num_copy = 0
run_time = time.time()
algorithm["function"](array_copy)
run_time = time.time() - run_time
if is_sorted(array_copy):
report = generate_report(algorithm, array_type, num_compare, num_copy, run_time)
else:
report = "%-16s, Array not sorted." % algorithm["name"]
return report
def generate_random_array(array_size, randomness):
array = []
for i in range(array_size):
array.append(random.getrandbits(32))
if randomness >= 1:
return array
array.sort()
if randomness <= 0:
return array
for i in range(int(array_size * randomness * 0.5)):
a = random.randint(0, array_size - 1)
b = random.randint(0, array_size - 1)
array[a], array[b] = array[b], array[a]
return array
def is_sorted(array):
for i in range(1, len(array)):
if array[i-1] > array[i]:
return False
return True
def generate_report(algorithm, array_type, num_compare, num_copy, run_time):
report = []
array_size, randomness = array_type[0], array_type[1]
# Algorithm name
report.append("%-16s" % algorithm["name"])
# Array size and randomness
report.append("%8d, %1.1f" % array_type)
# Number of comparisons
if algorithm['units'] == "N*ln(N)":
report.append("%8.4f, N*ln(N)" % (num_compare/(array_size * math.log(array_size))))
else:
report.append("%8.4f, N*N " % (num_compare/float(array_size * array_size)))
# Number of copies
if algorithm['units'] == "N*ln(N)":
report.append("%8.4f, N*ln(N)" % (num_copy/(array_size * math.log(array_size))))
else:
report.append("%8.4f, N*N " % (num_copy/float(array_size * array_size)))
# Execution time
report.append("%8.4f" % run_time)
return ', '.join(report)
def bubble_sort(array):
global num_compare, num_copy
for i in range(len(array), 1, -1):
moves = 0
for j in range(1, i):
num_compare += 1
if array[j - 1] > array[j]:
num_copy += 3
temp = array[j]
array[j] = array[j - 1]
array[j -1] = temp
moves = 1
if moves == 0:
break
def binary_insertion(array):
global num_compare, num_copy
for pos in range(1, len(array)):
num_copy += 1
temp = array[pos]
inf = 0
sup = pos
while inf < sup:
med = int((inf + sup) / 2)
num_compare += 1
if temp > array[med]:
inf = med + 1
else:
sup = med
while pos > inf:
num_copy += 1
array[pos] = array[pos - 1]
pos -= 1
num_copy += 1
array[pos] = temp
def direct_selection(array):
global num_compare, num_copy
t_inf = 0
t_sup = len(array) - 1
while t_inf < t_sup:
p_min = t_inf
j = t_inf + 1
while j <= t_sup:
num_compare += 1
if array[j] < array[p_min]:
p_min = j
j += 1
num_copy += 3
temp = array[t_inf]
array[t_inf] = array[p_min]
array[p_min] = temp
t_inf += 1
def direct_insertion(array):
global num_compare, num_copy
i = 1
while i < len(array):
num_copy += 1
temp = array[i]
j = i
while j >= 1:
num_compare += 1
num_copy += 1
if array[j - 1] > temp:
array[j] = array[j - 1]
j -= 1
else:
break
array[j] = temp
i += 1
def shellsort(array):
global num_compare, num_copy
array_size = len(array)
step = array_size // 2
while step >= 1:
i = step
while i < len(array):
num_copy += 1
temp = array[i]
j = i
while j >= step:
num_compare += 1
if array[j-step] > temp:
num_copy += 1
array[j] = array[j-step]
j -= step
else:
break
num_copy += 1
array[j] = temp
i += 1
step //= 2
def mergesort(array):
mergesort_2(array, 0, len(array)-1)
def mergesort_2(array, inicio, fin):
global num_compare, num_copy
if fin <= inicio+1:
num_compare += 1
if array[fin]<array[inicio]:
num_copy += 3
temp = array[fin]
array[fin] = array[inicio]
array[inicio] = temp
return
mid = int((inicio + fin)/2)
mergesort_2(array, inicio, mid)
mergesort_2(array, mid+1, fin)
merge_temp(array, inicio, mid+1, fin)
def merge_temp(array, init1, init2, end2):
global num_compare, num_copy
memo_temp = 10
temp = list(range(memo_temp))
end1 = init2-1
i = 0
while init1 <= end1 and init2 <= end2:
num_compare += 1
num_copy += 1
if array[init2] < array[init1]:
temp[i] = array[init2]
init2 += 1
else:
temp[i] = array[init1]
init1 += 1
i += 1
if i >= memo_temp:
merge_insert(array, temp, i, init1, end1, init2, end2)
init1 = init1 + (init2 - 1 - end1)
end1 = init2 - 1
i = 0
merge_insert(array, temp, i, init1, end1, init2, end2)
def merge_insert(array, temp, leng, init1, end1, init2, end2):
global num_compare, num_copy
init_copy = init1 + (init2-end1-1) - leng
j = init2-1
while end1 >= init1:
num_copy += 1
array[j] = array[end1]
j -= 1
end1 -= 1
init1 = init_copy + leng
j = 0
while init_copy < init1:
num_copy += 1
array[init_copy] = temp[j]
init_copy += 1
j += 1
def heapsort(array):
global num_compare, num_copy
# Make heap structure
pointer = int((len(array) - 1) / 2)
while pointer >= 0:
heapsort_screen(array, pointer, len(array) - 1)
pointer -= 1
# Sort array
pointer = len(array) - 1
while pointer > 0:
num_copy += 3
temp = array[pointer]
array[pointer] = array[0]
array[0] = temp
pointer -= 1
heapsort_screen(array, 0, pointer)
def heapsort_screen(array, array_min, array_max):
global num_compare, num_copy
down = array_min * 2 + 1
if (down > array_max):
return
num_copy += 1
temp = array[array_min]
while down <= array_max:
if down < array_max:
num_compare += 1
if array[down + 1] > array[down]:
down += 1
num_compare += 1
if temp > array[down]:
break
num_copy += 1
array[array_min] = array[down]
array_min = down
down = down * 2 + 1
num_copy += 1
array[array_min] = temp
def timsort(array):
n = len(array)
minrun = timsort_find_minrun(n)
for start in range(0, n, minrun):
end = min(start + minrun - 1, n - 1)
timsort_insertion(array, start, end)
size = minrun
while size < n:
for left in range(0, n, 2 * size):
mid = min(n - 1, left + size - 1)
right = min((left + 2 * size - 1), (n - 1))
timsort_merge(array, left, mid, right)
size = 2 * size
def timsort_find_minrun(n):
TIMSORT_MINIMUM = 32
r = 0
while n >= TIMSORT_MINIMUM:
r |= n & 1
n >>= 1
return n + r
def timsort_insertion(array, left, right):
global num_compare, num_copy
for i in range(left+1, right+1):
num_copy += 1
element = array[i]
j = i - 1
num_compare += 1
while element < array[j] and j >= left :
num_copy += 1
array[j+1] = array[j]
j -= 1
num_compare += 1
num_copy += 1
array[j+1] = element
return array
def timsort_merge(array, l, m, r):
global num_compare, num_copy
array_length1 = m - l + 1
array_length2 = r - m
left = []
right = []
for i in range(0, array_length1):
num_copy += 1
left.append(array[l + i])
for i in range(0, array_length2):
num_copy += 1
right.append(array[m + 1 + i])
i=0
j=0
k=l
while j < array_length2 and i < array_length1:
num_compare += 1
if left[i] <= right[j]:
num_copy += 1
array[k] = left[i]
i += 1
else:
num_copy += 1
array[k] = right[j]
j += 1
k += 1
while i < array_length1:
num_copy += 1
array[k] = left[i]
k += 1
i += 1
while j < array_length2:
num_copy += 1
array[k] = right[j]
k += 1
j += 1
def quicksort(array):
global num_compare, num_copy
quicksort_2(array, 0, len(array)-1)
def quicksort_2(array, min_pos, max_pos):
global num_compare, num_copy
if min_pos >= max_pos:
return
imin_pos = min_pos
imax_pos = max_pos
mid = array[int((min_pos + max_pos)/2)]
while imin_pos <= imax_pos:
while array[imin_pos] < mid:
num_compare += 1
imin_pos += 1
while array[imax_pos] > mid:
imax_pos -= 1
num_compare += 1
num_compare += 3
if imin_pos == imax_pos:
imin_pos += 1
imax_pos -= 1
break
if imin_pos < imax_pos:
num_copy +=3
temp = array[imin_pos]
array[imin_pos] = array[imax_pos]
array[imax_pos] = temp
imin_pos += 1
imax_pos -= 1
if imax_pos - min_pos < max_pos - imin_pos:
quicksort_2(array, min_pos, imax_pos)
quicksort_2(array, imin_pos, max_pos)
else:
quicksort_2(array, imin_pos, max_pos)
quicksort_2(array, min_pos, imax_pos)
# Run program
if __name__ == "__main__":
main()
Results:
Algorithm,Array size,Randomness,Comparisons,Units,Copies,Units,Execution time
Bubble Sort , 100, 1.0, 0.4935, N*N , 0.8349, N*N , 0.0000
Direct Selection, 100, 1.0, 0.4950, N*N , 0.0297, N*N , 0.0000
Direct Insertion, 100, 1.0, 0.2880, N*N , 0.2979, N*N , 0.0000
Binary Insertion, 100, 1.0, 0.0532, N*N , 0.2981, N*N , 0.0000
MergeSort , 100, 1.0, 1.2529, N*ln(N), 3.3397, N*ln(N), 0.0000
ShellSort , 100, 1.0, 1.8696, N*ln(N), 3.1117, N*ln(N), 0.0000
TimSort , 100, 1.0, 2.0933, N*ln(N), 2.7491, N*ln(N), 0.0000
HeapSort , 100, 1.0, 2.2171, N*ln(N), 2.3213, N*ln(N), 0.0000
QuickSort , 100, 1.0, 2.3039, N*ln(N), 0.9967, N*ln(N), 0.0000
Bubble Sort , 100, 0.1, 0.4422, N*N , 0.0558, N*N , 0.0000
Direct Selection, 100, 0.1, 0.4950, N*N , 0.0297, N*N , 0.0000
Direct Insertion, 100, 0.1, 0.0285, N*N , 0.0384, N*N , 0.0000
Binary Insertion, 100, 0.1, 0.0514, N*N , 0.0384, N*N , 0.0000
MergeSort , 100, 0.1, 1.0293, N*ln(N), 2.1671, N*ln(N), 0.0000
ShellSort , 100, 0.1, 1.4071, N*ln(N), 2.5059, N*ln(N), 0.0000
TimSort , 100, 0.1, 0.6818, N*ln(N), 1.3811, N*ln(N), 0.0000
HeapSort , 100, 0.1, 2.3452, N*ln(N), 2.4538, N*ln(N), 0.0000
QuickSort , 100, 0.1, 1.4614, N*ln(N), 0.0261, N*ln(N), 0.0000
Bubble Sort , 1000, 1.0, 0.4990, N*N , 0.7374, N*N , 0.1003
Direct Selection, 1000, 1.0, 0.4995, N*N , 0.0030, N*N , 0.0532
Direct Insertion, 1000, 1.0, 0.2468, N*N , 0.2478, N*N , 0.0625
Binary Insertion, 1000, 1.0, 0.0086, N*N , 0.2478, N*N , 0.0468
MergeSort , 1000, 1.0, 1.2652, N*ln(N), 9.4297, N*ln(N), 0.0222
ShellSort , 1000, 1.0, 2.2320, N*ln(N), 3.4645, N*ln(N), 0.0000
TimSort , 1000, 1.0, 1.9630, N*ln(N), 2.8368, N*ln(N), 0.0156
HeapSort , 1000, 1.0, 2.4373, N*ln(N), 2.0380, N*ln(N), 0.0000
QuickSort , 1000, 1.0, 2.3049, N*ln(N), 1.0102, N*ln(N), 0.0000
Bubble Sort , 1000, 0.1, 0.4927, N*N , 0.0909, N*N , 0.0691
Direct Selection, 1000, 0.1, 0.4995, N*N , 0.0030, N*N , 0.0690
Direct Insertion, 1000, 0.1, 0.0313, N*N , 0.0323, N*N , 0.0000
Binary Insertion, 1000, 0.1, 0.0085, N*N , 0.0323, N*N , 0.0065
MergeSort , 1000, 0.1, 1.0881, N*ln(N), 5.8510, N*ln(N), 0.0000
ShellSort , 1000, 0.1, 1.8065, N*ln(N), 2.9764, N*ln(N), 0.0000
TimSort , 1000, 0.1, 1.0241, N*ln(N), 1.9425, N*ln(N), 0.0000
HeapSort , 1000, 0.1, 2.5312, N*ln(N), 2.1214, N*ln(N), 0.0000
QuickSort , 1000, 0.1, 1.4520, N*ln(N), 0.0634, N*ln(N), 0.0000
Bubble Sort , 10000, 1.0, 0.4999, N*N , 0.7520, N*N , 10.6330
Direct Selection, 10000, 1.0, 0.5000, N*N , 0.0003, N*N , 6.2194
Direct Insertion, 10000, 1.0, 0.2508, N*N , 0.2509, N*N , 6.7740
Binary Insertion, 10000, 1.0, 0.0012, N*N , 0.2509, N*N , 3.5269
MergeSort , 10000, 1.0, 1.3278, N*ln(N), 56.7051, N*ln(N), 0.7867
ShellSort , 10000, 1.0, 2.8618, N*ln(N), 4.2204, N*ln(N), 0.0938
TimSort , 10000, 1.0, 1.5854, N*ln(N), 2.6782, N*ln(N), 0.0469
HeapSort , 10000, 1.0, 2.5569, N*ln(N), 1.8920, N*ln(N), 0.0535
QuickSort , 10000, 1.0, 2.4283, N*ln(N), 1.0001, N*ln(N), 0.0220
Bubble Sort , 10000, 0.1, 0.4977, N*N , 0.0883, N*N , 6.3979
Direct Selection, 10000, 0.1, 0.5000, N*N , 0.0003, N*N , 6.1972
Direct Insertion, 10000, 0.1, 0.0295, N*N , 0.0296, N*N , 0.8181
Binary Insertion, 10000, 0.1, 0.0012, N*N , 0.0296, N*N , 0.4389
MergeSort , 10000, 0.1, 1.2287, N*ln(N), 32.4017, N*ln(N), 0.4701
ShellSort , 10000, 0.1, 2.2982, N*ln(N), 3.6079, N*ln(N), 0.0688
TimSort , 10000, 0.1, 1.1231, N*ln(N), 2.2584, N*ln(N), 0.0467
HeapSort , 10000, 0.1, 2.6448, N*ln(N), 1.9635, N*ln(N), 0.0469
QuickSort , 10000, 0.1, 1.4713, N*ln(N), 0.0759, N*ln(N), 0.0311
MergeSort , 100000, 1.0, 1.3615, N*ln(N), 436.6518, N*ln(N), 75.9208
ShellSort , 100000, 1.0, 3.7946, N*ln(N), 5.1413, N*ln(N), 1.3263
TimSort , 100000, 1.0, 1.6377, N*ln(N), 2.7716, N*ln(N), 0.6866
HeapSort , 100000, 1.0, 2.6228, N*ln(N), 1.8023, N*ln(N), 0.7023
QuickSort , 100000, 1.0, 2.2740, N*ln(N), 1.0101, N*ln(N), 0.3321
MergeSort , 100000, 0.1, 1.2854, N*ln(N), 245.8826, N*ln(N), 45.8585
ShellSort , 100000, 0.1, 2.8985, N*ln(N), 4.2070, N*ln(N), 1.0720
TimSort , 100000, 0.1, 1.1866, N*ln(N), 2.3467, N*ln(N), 0.6020
HeapSort , 100000, 0.1, 2.6961, N*ln(N), 1.8612, N*ln(N), 0.7178
QuickSort , 100000, 0.1, 1.4983, N*ln(N), 0.0805, N*ln(N), 0.2319
TimSort , 1000000, 1.0, 1.6938, N*ln(N), 2.8542, N*ln(N), 9.4139
HeapSort , 1000000, 1.0, 2.6632, N*ln(N), 1.7407, N*ln(N), 9.2069
QuickSort , 1000000, 1.0, 2.3135, N*ln(N), 1.0077, N*ln(N), 4.5677
TimSort , 1000000, 0.1, 1.2283, N*ln(N), 2.4098, N*ln(N), 9.0971
HeapSort , 1000000, 0.1, 2.7237, N*ln(N), 1.7892, N*ln(N), 8.5798
QuickSort , 1000000, 0.1, 1.5192, N*ln(N), 0.0844, N*ln(N), 3.1254
-
Your algorithm works correctly if I change the ratio back to 1/2
yes for ratio 1/2 is standard shellsort, but the 5/11 i derived from value 2.2 (gonnet baeza) as described in https://en.wikipedia.org/wiki/Shellsort maybe you can tweak the ratio to get correct rounding off error. i can see some improvement compared to classic shellsort when both can sort correctly.
Yes, there is a improvement in shellsort 2 in the number of comparisons, number of copies and execution time. But i don't know how to eliminate the error. :-//
Algorithm,Array size,Randomness,Comparisons,Units,Copies,Units,Execution time
ShellSort , 100, 1.0, 1.9826, N*ln(N), 3.1921, N*ln(N), 0.0000
ShellSort 2 , 100, 1.0, 1.6395, N*ln(N), 2.6883, N*ln(N), 0.0000
ShellSort , 100, 0.2, 1.5461, N*ln(N), 2.6622, N*ln(N), 0.0000
ShellSort 2 , 100, 0.2, 1.1965, N*ln(N), 2.1237, N*ln(N), 0.0000
ShellSort , 1000, 1.0, 2.2608, N*ln(N), 3.4893, N*ln(N), 0.0000
ShellSort 2 , 1000, 1.0, 1.8845, N*ln(N), 2.9947, N*ln(N), 0.0157
ShellSort , 1000, 0.2, 1.9155, N*ln(N), 3.0896, N*ln(N), 0.0000
ShellSort 2 , 1000, 0.2, 1.6555, N*ln(N), 2.7094, N*ln(N), 0.0000
ShellSort , 10000, 1.0, 2.8872, N*ln(N), 4.2443, N*ln(N), 0.0781
ShellSort 2 , 10000, 1.0, 2.1232, N*ln(N), 3.2822, N*ln(N), 0.0623
ShellSort , 10000, 0.2, 2.4435, N*ln(N), 3.7594, N*ln(N), 0.1381
ShellSort 2 , 10000, 0.2, 1.9361, N*ln(N), 3.0524, N*ln(N), 0.0625
ShellSort , 100000, 1.0, 3.7974, N*ln(N), 5.1441, N*ln(N), 1.3666
ShellSort 2 , 100000, 1.0, 2.2413, N*ln(N), 3.4288, N*ln(N), 0.8182
ShellSort , 100000, 0.2, 3.3230, N*ln(N), 4.6364, N*ln(N), 1.2349
ShellSort 2 , 100000, 0.2, 2.1061, N*ln(N), 3.2598, N*ln(N), 0.7868
ShellSort , 1000000, 1.0, 4.8703, N*ln(N), 6.2096, N*ln(N), 21.1200
ShellSort 2 , Array not sorted.
ShellSort , 1000000, 0.2, 4.0135, N*ln(N), 5.3252, N*ln(N), 20.1193
ShellSort 2 , Array not sorted.
-
maybe you can change from increment = array_size * 5 // 11 to increment = array_size * (5 // 11) or increment = (array_size * 5) // 11... tokuda's ratio is (4 / 9) could be even worse ;D ymmv. (ps: or just maybe there is a flaw in the implementation of the "gap" in original code :-//) or maybe from the website (attached) is
increment = (array_size * 5 - 1) // 11
...
increment = ((increment * 5) // 11) - 1
-
This 'new sort' algorithm is worst than bubble sort and not more 'intuitive'
It's not all that much worse. Always twice as many comparisons, true, but fewer swaps.
The number of swaps is intriguingly close to sqrt(0.5) aka sqrt(2)/2.
-
I reckon if Tim is allowed his own algorithm then so am I.
{ "name":"BruceSort",
"function":brucesort,
"units": "N*ln(N)",
"maxsize": 1000000,
},
:
:
:
def brucesort(array):
global num_compare, num_copy
if len(array) > 8: brucesort_2(array, 0, len(array)-1)
direct_insertion(array)
def brucesort_2(array, min_pos, max_pos):
global num_compare, num_copy
a, b, c = array[min_pos], array[int((min_pos + max_pos)/2)], array[max_pos]
if (b<a) != (b<c):
pivot = b
elif (a<b) != (a<c):
pivot = a
else:
pivot = c
i, j = min_pos, max_pos
while True:
orig_i, orig_j = i, j
while pivot > array[i]: i += 1
while pivot < array[j]: j -= 1
num_compare += (i-orig_i) + (orig_j-j) + 2
if i >= j: break
num_copy += 3
array[i], array[j] = array[j], array[i]
i, j = i+1, j-1
if i == j: i, j = i+1, j-1
if j - min_pos > 8:
brucesort_2(array, min_pos, j)
if max_pos - i > 8:
brucesort_2(array, i, max_pos)
Results
On M1 Mac:
QuickSort , 1000000, 1.0, 2.3490, N*ln(N), 1.0075, N*ln(N), 4.1489
BruceSort , 1000000, 1.0, 1.6734, N*ln(N), 1.1433, N*ln(N), 3.3282
QuickSort , 1000000, 0.1, 1.4773, N*ln(N), 0.0848, N*ln(N), 2.9112
BruceSort , 1000000, 0.1, 1.3723, N*ln(N), 0.2303, N*ln(N), 2.0927
On ThreadRipper 2990wx:
QuickSort , 1000000, 1.0, 2.3291, N*ln(N), 1.0047, N*ln(N), 4.4930
BruceSort , 1000000, 1.0, 1.7684, N*ln(N), 1.1295, N*ln(N), 3.6325
QuickSort , 1000000, 0.1, 1.4799, N*ln(N), 0.0845, N*ln(N), 2.9236
BruceSort , 1000000, 0.1, 1.3792, N*ln(N), 0.2300, N*ln(N), 1.9506
-
maybe you can change from increment = array_size * 5 // 11 to increment = array_size * (5 // 11) or increment = (array_size * 5) // 11... tokuda's ratio is (4 / 9) could be even worse ;D ymmv. (ps: or just maybe there is a flaw in the implementation of the "gap" in original code :-//) or maybe from the website (attached) is
increment = (array_size * 5 - 1) // 11
...
increment = ((increment * 5) // 11) - 1
Finally!! this is what works:
increment = int(increment * (5/11))
With better results than:
increment = int(increment * (1/2))
-
I reckon if Tim is allowed his own algorithm then so am I.
{ "name":"BruceSort",
"function":brucesort,
"units": "N*ln(N)",
"maxsize": 1000000,
},
:
:
:
def brucesort(array):
global num_compare, num_copy
if len(array) > 8: brucesort_2(array, 0, len(array)-1)
direct_insertion(array)
def brucesort_2(array, min_pos, max_pos):
global num_compare, num_copy
a, b, c = array[min_pos], array[int((min_pos + max_pos)/2)], array[max_pos]
if (b<a) != (b<c):
pivot = b
elif (a<b) != (a<c):
pivot = a
else:
pivot = c
i, j = min_pos, max_pos
while True:
orig_i, orig_j = i, j
while pivot > array[i]: i += 1
while pivot < array[j]: j -= 1
num_compare += (i-orig_i) + (orig_j-j) + 2
if i >= j: break
num_copy += 3
array[i], array[j] = array[j], array[i]
i, j = i+1, j-1
if i == j: i, j = i+1, j-1
if j - min_pos > 8:
brucesort_2(array, min_pos, j)
if max_pos - i > 8:
brucesort_2(array, i, max_pos)
Very impressive!! Better than QuickSort!!
It appears to be an enhancement to QuickSort.
-
Finally!! this is what works:
increment = int(increment * (5/11))
With better results than:
increment = int(increment * (1/2))
Only for 100 000 arrays. In 1000 000 arrays the algorithm fails again. :(
-
def brucesort(array):
global num_compare, num_copy
if len(array) > 8: brucesort_2(array, 0, len(array)-1)
direct_insertion(array)
Very impressive!! Better than QuickSort!!
It appears to be an enhancement to QuickSort.
Yesss. Improve quicksort by doing a slow insertion sort after it!
-
You should check out
fluxsort (stable adaptive branchless , hybrid mergesort/quicksort) : https://github.com/scandum/fluxsort
quadsort (stable bottom-up adaptive branchless merge sort - name comes from sorting 4 variables at once) : https://github.com/scandum/quadsort
crumsort (UNstable adaptive hybrid mergesort/quicksort, claims better than fluxsort at 1M+ records) : https://github.com/scandum/crumsort
May want to look into sorting using SIMD / AVX and other tricks
ex. quicksort using AVX : https://github.com/WojciechMula/simd-sort
also see the PDF referenced in the link above.
-
Algorithm,Array size,Comparisons,Units,Copies,Units,Execution time
Bubble Sort ,10,0.4400,N*N,0.5700,N*N,0.0000
Direct Insertion,10,0.2700,N*N,0.3600,N*N,0.0000
Binary Insertion,10,0.2200,N*N,0.3700,N*N,0.0000
Direct Selection,10,0.4500,N*N,0.2700,N*N,0.0000
ShellSort ,10,1.3029,N*Ln(N),2.3018,N*Ln(N),0.0000
HeapSort ,10,1.6069,N*Ln(N),3.0835,N*Ln(N),0.0000
Why are so many of the tests showing 0.0 execution time?
-
it says size=10 so i assume 10 values can be sorted in nanoseconds on 4+ ghz processors.
-
Algorithm,Array size,Comparisons,Units,Copies,Units,Execution time
Bubble Sort ,10,0.4400,N*N,0.5700,N*N,0.0000
Direct Insertion,10,0.2700,N*N,0.3600,N*N,0.0000
Binary Insertion,10,0.2200,N*N,0.3700,N*N,0.0000
Direct Selection,10,0.4500,N*N,0.2700,N*N,0.0000
ShellSort ,10,1.3029,N*Ln(N),2.3018,N*Ln(N),0.0000
HeapSort ,10,1.6069,N*Ln(N),3.0835,N*Ln(N),0.0000
Why are so many of the tests showing 0.0 execution time?
Because sorting 10 items takes less than 0.00005 seconds even with the worst algorithm, even written in Python?
-
Duh! I entirely missed that there were examples that varied the number of items!
-
For small set of numbers to sort, pc will happily sort with the the new but worst algorithm you put in at 0s.. real life is millions or billions of unsorted numbers. Thats why we came up with some hacks or tricks to deal with since even the best sort will take a while.. such as precomputation and save it somewhere for later use. when inserting value to already sorted values, it only take O(log(n)) instead of dreadful O(n.log (n) ).it looks cute, but not when we are talking billions of numbers.
-
It's worth pointing out one more thing ... how much extra memory each algorithm uses.
I could very easily "sort" 1m numbers by making multiple double linked lists (ex element.previous - element.value - element.next ) and basically inserting each value as I loop through the million entries into one such list or by other tricks that use a lot of memory.
For example, let's say the sort key is a 32 bit number ... I could have as many 256 element "buckets" as needed, by using the top 24 bits of the sort key as an id for the bucket. In each bucket then, you can have up to keys and the positions in the array where these keys show up
When you're done parsing the whole set of numbers, you can just sort the buckets by their id and merge everything in one output array.
-
If you have 2-16GB ram at disposal, you can sort almost any 4bytes length integer with O (n) speed. but thats not the popular way..