| [[Image:Suffix tree ABAB BABA.svg|thumb|300px|right|Suffix tree for the strings <code>ABAB</code> and <code>BABA</code>. [[Suffix_tree#Description|Suffix links]] not shown.]]
| In [[computer science]], a '''generalized suffix tree''' is a [[suffix tree]] for a set of [[String (computer science)|strings]]. Given the set of strings <math>D=\{S_1,S_2,\dots,S_d\}</math> of total length <math>n</math>, it is a [[Patricia tree]] containing all <math>n</math> [[suffix (computer science)|suffixes]] of the strings. It is mostly used in [[bioinformatics]].{{ref|BRCR}}
| |
| | |
| == Functionality ==
| |
| A generalized suffix tree can be built in <math>\Theta(n)</math> time and space, and can be used to find all <math>z</math> occurrences of a string <math>P</math> of length <math>m</math> in <math>O(m + z)</math> time, which is [[asymptotically optimal]] (assuming the size of the alphabet is constant; see {{ref|Gus97}}, page 119).
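The query described above can be illustrated with a deliberately naive sketch. It stores every suffix in an uncompressed trie, so it takes quadratic time and space rather than <math>\Theta(n)</math>, and it is not Ukkonen's or McCreight's algorithm. The names <code>Node</code>, <code>build_naive_gst</code> and <code>find_all</code> are hypothetical, chosen only for this illustration, and the pattern is assumed to be non-empty.

```python
from typing import Dict, List, Tuple


class Node:
    """One node of an uncompressed suffix trie (illustrative only)."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        # (string number, start position) of every suffix passing through here
        self.occurrences: List[Tuple[int, int]] = []


def build_naive_gst(strings: List[str]) -> Node:
    """Insert every suffix of every string; O(n^2) time and space."""
    root = Node()
    for number, s in enumerate(strings):
        for start in range(len(s)):
            node = root
            for ch in s[start:]:
                node = node.children.setdefault(ch, Node())
                node.occurrences.append((number, start))
    return root


def find_all(root: Node, pattern: str) -> List[Tuple[int, int]]:
    """Walk down along the pattern (m steps), then report the z stored hits."""
    node = root
    for ch in pattern:
        if ch not in node.children:
            return []
        node = node.children[ch]
    return sorted(node.occurrences)


tree = build_naive_gst(["ABAB", "BABA"])
print(find_all(tree, "AB"))
```

Each hit is a pair of string number and starting position, the same labeling the leaves of a generalized suffix tree carry.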
| |
| | |
| When constructing such a tree, each string should be padded with a unique out-of-alphabet marker symbol (or string) to ensure that no suffix is a prefix of another, guaranteeing that each suffix is represented by a unique leaf node.
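The effect of the padding can be checked with a small sketch that counts leaves in an uncompressed trie of all suffixes. The single-character terminators <code>$</code> and <code>#</code> and the helper names <code>build_trie</code> and <code>count_leaves</code> are assumptions made for this illustration only. Without terminators, a suffix that is a prefix of another suffix ends at an internal node and gets no leaf of its own.

```python
from typing import Dict, List


def build_trie(strings: List[str]) -> Dict:
    """Insert every suffix of every string into a nested-dict trie."""
    root: Dict = {}
    for s in strings:
        for start in range(len(s)):
            node = root
            for ch in s[start:]:
                node = node.setdefault(ch, {})
    return root


def count_leaves(node: Dict) -> int:
    """A node without children is a leaf."""
    if not node:
        return 1
    return sum(count_leaves(child) for child in node.values())


# Unpadded: most suffixes are prefixes of other suffixes and share a path.
print(count_leaves(build_trie(["ABAB", "BABA"])))
# Padded with unique terminators: one leaf per suffix.
print(count_leaves(build_trie(["ABAB$", "BABA#"])))
```

With the padded input the total length is 10 and every one of the 10 suffixes ends in its own leaf, whereas the unpadded input yields only 2 leaves for 8 suffixes.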
| |
| | |
| Algorithms for constructing a generalized suffix tree include [[Ukkonen's algorithm]] (1995) and [[McCreight's algorithm]] (1976).
| |
| | |
| == Example ==
| |
| A suffix tree for the strings <code>ABAB</code> and <code>BABA</code> is shown in the figure above. The strings are padded with the unique terminator strings <code>$0</code> and <code>$1</code>. Each leaf node is labeled with a string number and a starting position. Note that a left-to-right traversal of the leaf nodes visits the suffixes in sorted order. The terminators may be strings or unique single symbols. Edges on <code>$</code> from the root are omitted in this example.
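The correspondence between leaf order and sorted suffix order can be reproduced without building the tree, by sorting the padded suffixes directly. The sketch below is only an illustration: the function name <code>sorted_suffix_labels</code> is hypothetical, and it assumes that the terminator characters compare lower than the letters (as they do in ASCII), which may differ from the ordering used in a particular drawing. Suffixes consisting only of a terminator are skipped, as in the example.

```python
from typing import List, Tuple


def sorted_suffix_labels(strings: List[str]) -> List[Tuple[str, Tuple[int, int]]]:
    """Return (padded suffix, (string number, start)) pairs in sorted order."""
    labelled = []
    for number, s in enumerate(strings):
        terminator = f"${number}"  # "$0", "$1", ... as in the example
        for start in range(len(s)):
            labelled.append((s[start:] + terminator, (number, start)))
    return sorted(labelled)


for suffix, label in sorted_suffix_labels(["ABAB", "BABA"]):
    print(label, suffix)
```

Reading the printed labels top to bottom gives the order in which a left-to-right traversal would meet the leaves under this ordering assumption.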
| |
| | |
| == Alternatives ==
| |
| An alternative to building a generalized suffix tree is to concatenate the strings and build a regular suffix tree or [[suffix array]] for the resulting string. When hits are evaluated after a search, each global position is mapped to a document and a local position within it, for example by a binary search over the starting or ending positions of the documents.
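A minimal sketch of this alternative follows, under several assumptions: the documents are joined with a NUL separator that occurs in no document or pattern, the suffix array is built by naive sorting rather than a linear-time construction, and the names <code>build_index</code> and <code>search</code> are hypothetical. Global hit positions are mapped back to documents with a binary search (<code>bisect_right</code>) over the document start offsets.

```python
from bisect import bisect_right
from typing import List, Tuple


def build_index(docs: List[str], sep: str = "\x00") -> Tuple[str, List[int], List[int]]:
    """Concatenate the documents and build a suffix array by naive sorting."""
    starts, pos = [], 0
    for d in docs:
        starts.append(pos)          # global offset where this document begins
        pos += len(d) + len(sep)
    text = "".join(d + sep for d in docs)
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    return text, starts, suffix_array


def search(text: str, starts: List[int], suffix_array: List[int],
           pattern: str) -> List[Tuple[int, int]]:
    """Return (document number, local position) for every occurrence."""
    m = len(pattern)

    def prefix(rank: int) -> str:
        i = suffix_array[rank]
        return text[i:i + m]

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid
    hits = []
    for g in suffix_array[first:lo]:
        doc = bisect_right(starts, g) - 1   # map global position to document
        hits.append((doc, g - starts[doc]))
    return sorted(hits)


text, starts, sa = build_index(["ABAB", "BABA"])
print(search(text, starts, sa, "BA"))
```

Because the separator never occurs in a pattern, no reported hit can span two documents.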
| |
| | |
| ==References==
| |
| | |
| * {{note|Hui92}} {{cite conference
| |
| | author=Lucas Chi Kwong Hui
| |
| | title=Color Set Size Problem with Applications to String Matching
| |
| | booktitle=Combinatorial Pattern Matching, Lecture Notes in Computer Science, 644.
| |
| | year=1992
| |
| | pages=230–243
| |
| | url=http://www.springerlink.com/content/y565487707522555/}}
| |
| * {{note|BRCR}} {{cite conference
| |
| | author=Paul Bieganski, John Riedl, John Carlis, and Ernest F. Retzel
| |
| | title=Generalized Suffix Trees for Biological Sequence Data
| |
| | booktitle=Biotechnology Computing, Proceedings of the Twenty-Seventh Hawaii International Conference on.
| |
| | year=1994
| |
| | pages=35–44
| |
| | url=http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=323593}}
| |
| * {{note|Gus97}} {{cite book
| |
| | last = Gusfield
| |
| | first = Dan
| |
| | origyear = 1997
| |
| | year = 1999
| |
| | title = Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology
| |
| | publisher = Cambridge University Press
| |
| | location = USA
| |
| | isbn = 0-521-58519-8
| |
| }}
| |
| | |
| [[Category:Trees (data structures)]]
| |
| [[Category:Substring indices]]
| |
| [[Category:String data structures]]
| |