| {{Infobox Algorithm
| |
| |class=[[Sorting algorithm]]
| |
| |image=
| |
| |caption=A visual representation of timsort.
| |
| |data=[[Array data structure|Array]]
| |
| |time=<math>O(n\log n)</math><ref>{{cite web
| |
| | title = <nowiki>[Python-Dev]</nowiki> Sorting
| |
| | url = http://mail.python.org/pipermail/python-dev/2002-July/026837.html
| |
| | last = Peters
| |
| | first = Tim
| |
| | work = Python Developers Mailinglist
| |
| | accessdate = 24 Feb 2011
| |
| | quote = [Timsort] also has good aspects: It's stable (items that compare equal retain their relative order, so, e.g., if you sort first on zip code, and a second time on name, people with the same name still appear in order of increasing zip code; this is important in apps that, e.g., refine the results of queries based on user input). ... It has no bad cases (O(N log N) is worst case; N-1 compares is best).
| |
| }}</ref>
| |
| |best-time=<math>O(n)</math>
| |
| |average-time=<math>O(n\log n)</math>
| |
| |space=<math>O(n)</math>
| |
| |optimal=Yes
| |
| }}
| |
| '''Timsort''' is a [[hybrid algorithm|hybrid]] [[sorting algorithm]], derived from [[merge sort]] and [[insertion sort]], designed to perform well on many kinds of real-world data. It was invented by Tim Peters in 2002 for use in the [[Python (programming language)|Python programming language]]. The [[algorithm]] finds contiguous subsequences of the data that are already ordered, called runs, and uses that knowledge to sort the remainder more efficiently. Each newly identified run is merged with previously found runs until certain criteria are fulfilled.
| |
| Timsort has been Python's standard sorting algorithm since version 2.3. It is used to sort arrays of non-primitive type in [[Java 7|Java SE 7]],<ref>{{cite web
| |
| | title = Commit 6804124: Replace "modified mergesort" in java.util.Arrays.sort with timsort
| |
| | url = http://hg.openjdk.java.net/jdk7/tl/jdk/rev/bfd7abda8f79
| |
| | last = jjb
| |
| | work = Java Development Kit 7 Hg repo
| |
| | accessdate = 24 Feb 2011
| |
| }}</ref> on the [[Android (operating system)|Android platform]],<ref>{{cite web
| |
| | title = Class: java.util.TimSort<T>
| |
| | url = http://www.kiwidoc.com/java/l/x/android/android/5/p/java.util/c/TimSort
| |
| | work = Android JDK 1.5 Documentation
| |
| | accessdate = 24 Feb 2011
| |
| }}{{Dead link|date=June 2013}}</ref> and in [[GNU Octave]].<ref>{{cite web
| |
| | title = liboctave/util/oct-sort.cc
| |
| | url = http://hg.savannah.gnu.org/hgweb/octave/file/0486a29d780f/liboctave/util/oct-sort.cc
| |
| | work = Mercurial repository of Octave source code
| |
| | accessdate = 18 Feb 2013
| |
| | quote = Code stolen in large part from Python's, listobject.c, which itself had no license header. However, thanks to Tim Peters for the parts of the code I ripped-off.
| |
| | at = Lines 23-25 of the initial comment block.
| |
| }}</ref>
| |
|
| |
|
| == Operation ==
| |
| Timsort was designed to take advantage of partial orderings that already exist in most real-world data. It operates by finding runs, contiguous subsequences of at least two elements, in the data. A run is either non-descending (each element is equal to or greater than its predecessor) or strictly descending (each element is lower than its predecessor). A descending run must be strictly descending because it is later reversed in place by swapping elements from both ends, converging in the middle; if the run contained equal elements, the reversal would change their relative order and the sort would no longer be [[Sorting_algorithm#Stability|stable]]. After obtaining such a run in the given array, Timsort processes it, and then searches for the next run.
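The run detection described above can be sketched as follows. This is a minimal illustration rather than the code of any particular implementation; the function name <code>count_run</code> and its signature are assumptions made for this example.

```python
def count_run(a, lo, hi):
    """Return the exclusive end index of the run that starts at a[lo] in a[lo:hi].

    A strictly descending run is reversed in place, so the run handed
    back to the caller is always non-descending.
    """
    if hi - lo < 2:
        return hi
    end = lo + 1
    if a[end] < a[lo]:
        # Strictly descending: each element is lower than its predecessor.
        while end + 1 < hi and a[end + 1] < a[end]:
            end += 1
        end += 1
        # Safe for stability: a strictly descending run has no equal elements.
        a[lo:end] = a[lo:end][::-1]
    else:
        # Non-descending: each element is equal to or greater than its predecessor.
        while end + 1 < hi and not (a[end + 1] < a[end]):
            end += 1
        end += 1
    return end
```

Only the <code>&lt;</code> operator is used, so equal neighbours extend a non-descending run but terminate a descending one.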
| |
|
| |
|
| === Minrun ===
| [[File:Selection of minrun by timsort.png|280px|thumb|Timsort algorithm searches for such ordered sequences, minruns, to perform its sort]]
| |
| A natural run is a sub-array that is already ordered. Natural runs in real-world data may be of varied lengths. Timsort chooses a sorting technique depending on the length of the run. For example, if the run length is smaller than a certain value, insertion sort is used. Thus Timsort is an adaptive sort.<ref name=python_timsort>{{cite web|last=timsort|first=python|title=python_timsort|url=http://hg.python.org/cpython/file/tip/Objects/listsort.txt}}</ref>
| |
| | |
| The size of the run is checked against the minimum run size. The minimum run size (minrun) depends on the size of the [[array data type|array]]. For an array of fewer than 64 elements, minrun is the size of the array, reducing Timsort to an insertion sort. For larger arrays, minrun is chosen from the range 32 to 64 inclusive, such that the size of the array, divided by minrun, is equal to, or slightly smaller than, a power of two. The final algorithm takes the six most significant bits of the size of the array, adds one if any of the remaining bits are set, and uses that result as the minrun. This algorithm works for all arrays, including those smaller than 64.<ref name=python_timsort />
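The bit manipulation described above can be written in a few lines. The sketch below follows the description in the text; the function name <code>compute_minrun</code> is chosen for this example.

```python
def compute_minrun(n):
    """Take the six most significant bits of n and add one if any
    of the remaining (shifted-out) bits are set."""
    r = 0  # becomes 1 as soon as a set bit is shifted out
    while n >= 64:
        r |= n & 1
        n >>= 1
    return n + r
```

For an array of 1000 elements this yields a minrun of 63, and 1000 / 63 is slightly less than 16, a power of two. For arrays of fewer than 64 elements the loop never runs and the array size itself is returned.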
| |
| | |
| === Insertion sort ===
| |
| When an array is random, natural runs most likely contain fewer than minrun elements. In this case, an appropriate number of succeeding elements is selected, and an insertion sort increases the size of the run to minrun size. Thus, most runs in a random array are, or become, minrun in length. This results in efficient, balanced merges. It also results in a reasonable number of function calls in the implementation of the sort.<ref name=drmaciver>{{cite web|last=timsort|first=understanding|title=understanding timsort|url=http://www.drmaciver.com/2010/01/understanding-timsort-1adaptive-mergesort/}}</ref>
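Extending a short natural run to minrun length might look like the sketch below. It assumes a binary insertion sort, in which the insertion point is found with Python's standard <code>bisect</code> module; the function name <code>extend_run</code> and its parameters are hypothetical.

```python
import bisect


def extend_run(a, lo, start, hi):
    """Sort a[lo:hi] in place, given that a[lo:start] is already sorted.

    lo is the start of the natural run, start is where it ended, and
    hi is lo + minrun (or the end of the array, if that comes first).
    """
    for i in range(start, hi):
        pivot = a[i]
        # bisect_right places the pivot after any equal elements,
        # which keeps the insertion stable.
        pos = bisect.bisect_right(a, pivot, lo, i)
        a[pos + 1:i + 1] = a[pos:i]  # shift the tail one slot to the right
        a[pos] = pivot
```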
| |
| | |
| === Merge memory ===
| |
| [[File:Representation of stack for merge memory in Timsort.svg|280px|thumb|The runs are pushed onto a [[Stack (data structure)|stack]]. If the rule X > Y + Z is violated, runs are merged (here X and Y) and the result is pushed back onto the stack. Merging continues until the runs satisfy a) X > Y + Z and b) Y > Z]]
| Once run lengths are optimized, the runs are merged. When a run is found, the algorithm pushes its base address and length onto a stack. A function then determines whether the run should be merged with previous runs. Timsort never merges non-adjacent runs, because doing so could move an element past equal elements in the run lying between them, which would break stability.
| |
| | |
| Thus, merging is always done on consecutive runs. For this, the three topmost runs on the stack are considered. If X, Y and Z denote the lengths of the three topmost runs, with Z on top, the algorithm merges runs until the following two rules are satisfied:
| |
| | |
| <ol type=i><li>X > Y + Z
| |
| <li>Y > Z<ref name=python_timsort /></ol>
| |
| | |
| For example, if the first of the two rules is violated by the current stack, that is, if X ≤ Y + Z, then Y is merged with the smaller of X and Z. Merging continues until both rules are satisfied. Then the algorithm searches for the next run.<ref name=drmaciver />
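The two rules can be illustrated on a stack that holds only run lengths. The sketch below is a simplification: the names <code>merge_collapse</code> and <code>merge_at</code> are assumptions for this example, and the element merge itself is replaced by adding the two lengths. It inspects only the three topmost runs, as the text describes.

```python
def merge_at(runs, i):
    """Merge run i with run i + 1 (only the lengths are tracked here)."""
    runs[i] += runs[i + 1]
    del runs[i + 1]


def merge_collapse(runs):
    """Merge runs until X > Y + Z and Y > Z hold for the top of the stack.

    runs is a list of run lengths with the top of the stack last,
    so X, Y, Z correspond to runs[-3], runs[-2], runs[-1].
    """
    while len(runs) > 1:
        y = len(runs) - 2  # index of Y
        if y > 0 and runs[y - 1] <= runs[y] + runs[y + 1]:
            # Rule (i) is violated: merge Y with the smaller of X and Z.
            if runs[y - 1] < runs[y + 1]:
                y -= 1
            merge_at(runs, y)
        elif runs[y] <= runs[y + 1]:
            # Rule (ii) is violated: merge Y with Z.
            merge_at(runs, y)
        else:
            break
```

For instance, a stack of lengths 10, 40, 20 violates the first rule; because X (10) is smaller than Z (20), X and Y are merged, leaving 50, 20.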
| |
| | |
| The rules above aim to keep run lengths as close to each other as possible, so that merges are balanced. Only a small number of runs are remembered, as the stack has a fixed size. Merging a run soon after it is found also exploits the fact that it is likely to still be in [[CPU cache|cache memory]]. The rules are thus a compromise between delaying merges to balance run lengths and merging early while the data is still in cache.
| |
| | |
| === Merging procedure ===
| |
| [[File:Merging procedure for timsort.svg|280px|thumb|The algorithm allocates temporary memory equal in size to the smaller run. It then copies the elements of the smaller run (say X) into the temporary memory and merges them with the other run, writing the elements in final order into the combined space of X and Y]]
| |
| | |
| Merging adjacent runs is done with the help of temporary memory. The temporary memory is of the size of the lesser of the two runs. The algorithm copies the smaller of the two runs into this temporary memory and then uses the original memory (of the smaller run) and the memory of the other run to store sorted output.
| |
| | |
| A simple merge algorithm runs left to right or right to left depending on which run is smaller, on the temporary memory and original memory of the larger run. The final sorted run is stored in the original memory of the two initial runs. Timsort searches for appropriate positions for the starting element of one array in the other using an adaptation of [[binary search]].
| |
| | |
| Say, for example, two runs A and B are to be merged, with A preceding B in the array. Note that A and B are already sorted individually. A binary search first finds the position in A where the first element of B belongs (a'). The elements of A before that position are already in their final place and can be ignored during the merge. Similarly, the algorithm finds the position in B where the last element of A belongs (b'). The elements of B after that position can also be ignored. This preliminary searching is not efficient for highly random data, but is efficient in other situations and is hence included.
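The merging procedure, including the preliminary trimming, can be sketched as below. This is an illustrative simplification without galloping; the function names are assumptions for this example, and the searches use Python's standard <code>bisect</code> module instead of Timsort's own search routine.

```python
import bisect


def merge_runs(a, lo, mid, hi):
    """Stably merge the adjacent sorted runs a[lo:mid] and a[mid:hi] in place."""
    # Elements of the left run that are <= the first element of the
    # right run are already in their final place.
    lo = bisect.bisect_right(a, a[mid], lo, mid)
    if lo == mid:
        return  # the two runs are already in order
    # Elements of the right run that are >= the last element of the
    # left run are already in their final place.
    hi = bisect.bisect_left(a, a[mid - 1], mid, hi)
    if mid - lo <= hi - mid:
        _merge_lo(a, lo, mid, hi)
    else:
        _merge_hi(a, lo, mid, hi)


def _merge_lo(a, lo, mid, hi):
    """Left run is the smaller one: copy it out and merge left to right."""
    tmp = a[lo:mid]
    i, j, k = 0, mid, lo
    while i < len(tmp) and j < hi:
        if a[j] < tmp[i]:  # take from the right only if strictly smaller
            a[k] = a[j]
            j += 1
        else:
            a[k] = tmp[i]
            i += 1
        k += 1
    a[k:k + len(tmp) - i] = tmp[i:]  # leftover left-run elements


def _merge_hi(a, lo, mid, hi):
    """Right run is the smaller one: copy it out and merge right to left."""
    tmp = a[mid:hi]
    i, j, k = mid - 1, len(tmp) - 1, hi - 1
    while i >= lo and j >= 0:
        if tmp[j] < a[i]:  # left element is strictly greater
            a[k] = a[i]
            i -= 1
        else:
            a[k] = tmp[j]
            j -= 1
        k -= 1
    a[lo:lo + j + 1] = tmp[:j + 1]  # leftover right-run elements
```

In both directions ties are resolved in favour of keeping left-run elements before right-run elements, which preserves stability.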
| |
| | |
| === Galloping mode ===
| |
| [[File:One-one merging timsort.svg|280px|thumb|Elements (pointed to by blue arrow) are compared and the smaller element is moved to its final position (pointed to by red arrow).]]
| |
| | |
| Most of the merge occurs in what is called 'one pair at a time' mode, where respective elements of both runs are compared. When the algorithm merges left-to-right, the smaller of the two is moved to a merge area. A count is kept of the number of consecutive times the selected element has come from the same run. When this count reaches a certain threshold, min-gallop (initialized to MIN_GALLOP), the merge switches to 'galloping mode'. In this mode the previously mentioned adaptation of binary search is used to identify where the first element of one run must be placed in the other run (and vice versa). All elements of the other run that occur before this location can be moved to the merge area as a group. The functions ''merge-lo'' and ''merge-hi'' increment the value of min-gallop if galloping is not efficient, and decrement it if it is. Galloping mode is exited when both searches find fewer than MIN_GALLOP elements to move.<ref name=python_timsort />
| |
| | |
| In galloping mode, the algorithm searches for the first element of one array in the other. This is done by comparing that first element (initial element) with the zeroth element of the other array, then the first, the third and so on, that is (2<sup>k</sup> - 1)th element, so as to get a range of elements between which the initial element will lie. This shortens the range for binary searching, thus increasing efficiency. Galloping proves to be more efficient except in cases with especially long runs, but random data usually has shorter runs. Also, in cases where galloping is found to be less efficient as compared to [[Binary search algorithm|binary search]], galloping mode is exited.
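The search described above, probing offsets 0, 1, 3, 7, ... and then finishing with a binary search over the bracketed range, can be sketched as follows. The function name <code>gallop</code> is an assumption for this example, and the final step uses Python's standard <code>bisect</code> module.

```python
import bisect


def gallop(key, a, lo, hi):
    """Return the leftmost index in sorted a[lo:hi] at which key can be inserted."""
    if lo >= hi or not (a[lo] < key):
        return lo
    # Invariant: a[lo + last] < key.
    last, ofs = 0, 1
    while lo + ofs < hi and a[lo + ofs] < key:
        last, ofs = ofs, 2 * ofs + 1  # offsets 1, 3, 7, 15, ...
    # key belongs somewhere after lo + last and no later than lo + ofs.
    upper = min(lo + ofs, hi)
    return bisect.bisect_left(a, key, lo + last + 1, upper)
```

The result is the same as that of a plain binary search over the whole range; the difference lies in the number of comparisons when the position is close to the start.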
| |
| | |
| [[File:Copy galloping mode timsort(2).svg|280px|thumb|All red elements are smaller than blue (here, 21). Thus they can be moved in a chunk to the final array.]]
| |
| | |
| Galloping is not always efficient. One reason is the overhead of function calls: function calls are expensive, and when frequent they affect program efficiency. In some cases galloping mode requires more comparisons than a simple [[linear search]] (one-at-a-time search). While for the first few cases both modes may require the same number of comparisons, over time galloping mode requires 33% more comparisons than linear search to arrive at the same results. Moreover, all comparisons in galloping mode are done by [[function call]]s.
| |
| | |
| Galloping is beneficial only when the initial element of one run is not one of the first seven elements of the other run. This implies a MIN_GALLOP of 7. To avoid the drawbacks of galloping mode, the merging functions adjust the value of min-gallop. If the selected element comes from the same run that has been returning elements, min-gallop is reduced by one. Otherwise, the value is incremented by one, thus discouraging a return to galloping mode. For random data, the value of min-gallop thereby becomes so large that galloping mode never recurs.
| |
| | |
| In the case where merge-hi is used (that is, merging is done right-to-left), galloping starts from the right end of the data, that is, the last element. Galloping from the beginning also gives the required results, but makes more comparisons. Thus, the galloping algorithm uses a variable that gives the index at which galloping should begin. Timsort can enter galloping mode at any index and continue checking at the next index, which is offset by 1, 3, 7, ..., (2<sup>k</sup> − 1) from the current index. In the case of merge-hi, the offsets to the index are −1, −3, −7, ...<ref name=python_timsort />
| |
| | |
| == Performance ==
| |
| | |
| According to [[information theory]], no [[comparison sort]] can perform better than <math>\Theta(n \log n)</math> comparisons in the worst case. On real-world data, Timsort often requires far fewer than <math>\Theta(n \log n)</math> comparisons, because it takes advantage of the fact that sublists of the data may already be ordered.<ref>{{Cite book
| |
| | last = Martelli
| |
| | first = Alex
| |
| | title = Python in a Nutshell (In a Nutshell (O'Reilly))
| |
| | publisher = O'Reilly Media, Inc.
| |
| | year = 2006
| |
| | page = 57
| |
| | isbn = 0-596-10046-9}}</ref>
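The effect of existing order on the number of comparisons can be observed with a small experiment. The sketch below assumes an interpreter whose built-in <code>sorted</code> is a Timsort-style adaptive sort, as in CPython; the helper name <code>count_comparisons</code> is hypothetical, and the exact counts depend on the interpreter version.

```python
import functools
import random


def count_comparisons(data):
    """Sort a copy of data and return how many comparisons were made."""
    calls = 0

    def compare(x, y):
        nonlocal calls
        calls += 1
        return (x > y) - (x < y)

    sorted(data, key=functools.cmp_to_key(compare))
    return calls


n = 1000
ordered = list(range(n))
shuffled = ordered[:]
random.Random(0).shuffle(shuffled)  # fixed seed for repeatability

print("already sorted:", count_comparisons(ordered))
print("shuffled:      ", count_comparisons(shuffled))
```

On an adaptive sort the already sorted input needs far fewer comparisons than the shuffled input of the same length.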
| |
| | |
| The following table compares the time complexity of timsort with other comparison sorts.
| |
| | |
| {| class="wikitable" style="text-align: center;"
| |
| !
| |
| ! Timsort
| |
| ! [[Merge sort]]
| |
| ! [[Quicksort]]
| |
| ! [[Insertion sort]]
| |
| ! [[Selection sort]]
| |
| ! [[Smoothsort]]
| |
| |-
| |
| ! Best case
| |
| | <math>\Theta(n)</math>
| |
| | <math>\Theta(n \log n)</math>
| |
| | <math>\Theta(n \log n)</math>
| |
| | <math>\Theta(n)</math>
| |
| | <math>\Theta(n^2)</math>
| |
| | <math>\Theta(n)</math>
| |
| |-
| |
| ! Average case
| |
| | <math>\Theta(n \log n)</math>
| |
| | <math>\Theta(n \log n)</math>
| |
| | <math>\Theta(n \log n)</math>
| |
| | <math>\Theta(n^2)</math>
| |
| | <math>\Theta(n^2)</math>
| |
| | <math>\Theta(n \log n)</math>
| |
| |-
| |
| ! Worst case
| |
| | <math>\Theta(n \log n)</math>
| |
| | <math>\Theta(n \log n)</math>
| |
| | <math>\Theta(n^2)</math>
| |
| | <math>\Theta(n^2)</math>
| |
| | <math>\Theta(n^2)</math>
| |
| | <math>\Theta(n \log n)</math>
| |
| |}
| |
| | |
| The following table provides a comparison of the space complexities of the various sorting techniques. Note that for merge sort, the ''worst case'' space complexity is usually <math>O(n)</math>.
| |
| | |
| {| class="wikitable" style="text-align: center;"
| |
| !
| |
| ! Timsort
| |
| ! [[Merge sort]]
| |
| ! [[Quicksort]]
| |
| ! [[Insertion sort]]
| |
| ! [[Selection sort]]
| |
| ! [[Smoothsort]]
| |
| |-
| |
| ! Space complexity
| |
| | <math>O(n)</math>
| |
| | <math>O(n)</math>
| |
| | <math>O(\log n)</math>
| |
| | <math>O(1)</math>
| |
| | <math>O(1)</math>
| |
| | <math>O(\log n)</math>
| |
| |}
| |
| Note, however, that the space complexity of both Timsort and merge sort can be reduced to <math>O(\log n)</math> at the cost of speed (see [[in-place merge sort]]).
| |
| | |
| == References ==
| |
| {{Reflist|2}}
| |
| | |
| ==External links==
| |
| * [http://bugs.python.org/file4451/timsort.txt timsort.txt] - original explanation by Tim Peters.
| |
| * [http://corte.si/posts/code/timsort/index.html Visualising Timsort] - the source for the image on this page.
| |
| * [http://hg.python.org/cpython/file/default/Objects/listobject.c Python's listobject.c] - the [[C (programming language)|C]] implementation of timsort for [[CPython]].
| |
| * [http://cr.openjdk.java.net/~martin/webrevs/openjdk7/timsort/raw_files/new/src/share/classes/java/util/TimSort.java OpenJDK's TimSort.java] - the Java implementation of timsort.
| |
| * [http://hg.savannah.gnu.org/hgweb/octave/file/0486a29d780f/liboctave/util/oct-sort.cc GNU Octave's oct-sort.cc] - the [[C++]] implementation of timsort for [[GNU Octave]].
| |
| * [http://stromberg.dnsalias.org/~strombrg/sort-comparison/ Sort Comparison] - a Pure Python and Cython implementation of Timsort, among other sorts.
| |
| | |
| {{Sorting}}
| |
| | |
| [[Category:Sorting algorithms]]
| |
| [[Category:Comparison sorts]]
| |