Verifiable random function: Difference between revisions

en>Pomme.de.lait
m Added link for Vadhan.
en>Monkbot
 
{{Selfref|For word wrap handling on Wikipedia, see [[Wikipedia:Line-break handling]].}}
 
In text display, '''line wrap''' is the feature of continuing on a new line when a line is full, such that each line fits in the viewable window, allowing text to be read from top to bottom without any horizontal [[scrolling]]. Word wrap [[wikt:obviate|obviates]] the hard-coding of [[newline]] delimiters inside [[paragraph]]s and allows the dynamic reflowing of text with new automatic line-breaking decisions on the fly (for example, whenever a [[window (computing)|window]] is resized). '''Word wrap''' is the additional feature of most [[text editor]]s, [[word processor]]s, and [[web browser]]s, of breaking lines between and not within words, except when a single word is longer than a line.
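
As a rough illustration of reflowing, the following Python sketch uses the standard library's <code>textwrap</code> module to wrap the same text at two widths. The sample strings and widths are arbitrary choices made for this example, and widths are counted in characters, which assumes a monospaced display.

```python
import textwrap

text = "aaa bb cc ddddd"

# Word wrap: lines are broken between words, never inside them.
narrow = textwrap.wrap(text, width=6)
wide = textwrap.wrap(text, width=9)

# A single word longer than the line is split across lines
# (textwrap's default behaviour, break_long_words=True).
long_word = textwrap.wrap("abcdefghij", width=4)

print(narrow)
print(wide)
print(long_word)
```

Changing only the <code>width</code> argument changes where the breaks fall, which is the reflowing behaviour described above.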
 
==Soft and hard returns==
 
A '''soft return''' or '''soft wrap''' is the break resulting from line wrap or word wrap (whether automatic or manual), whereas a '''hard return''' or '''hard wrap''' is an intentional break, creating a new [[paragraph]].  With a hard return, paragraph-break formatting may be applied (either [[indenting]] or vertical whitespace). Soft wrapping allows line lengths to adjust automatically with adjustments to the width of the user's window or margin settings, and is a standard feature of most modern [[text editor]]s, [[word processor]]s, and [[email client]]s.  Manual soft breaks are unnecessary when word wrap is done automatically, so hitting the "Enter" key usually produces a hard return.
 
Alternatively, "soft return" can mean an intentional, stored line break that is not a paragraph break. For example, it is common to print postal addresses in a multiple-line format, but the several lines are understood to be a single paragraph.  Line breaks are needed to divide the words of the address into lines of the appropriate length.
 
In the contemporary [[graphical user interface|graphical]] word processors [[Microsoft Word]] and [[OpenOffice.org]], users are expected to type a carriage return ([[enter key]]) between paragraphs. Formatting settings, such as first-line indentation or spacing between paragraphs, take effect where the carriage return marks the break. A non-[[paragraph]] line break, which is a soft return, is inserted using [[shift key|shift]]-enter or via the menus, and is provided for cases when the text should start on a new line but none of the other side effects of starting a new paragraph are desired.
 
In text-oriented markup languages, a soft return is typically offered as a markup tag. For example, in [[HTML]] there is a &lt;br&gt; tag that has the same purpose as the soft return in word processors described above; the &lt;p&gt; tag is used to contain paragraphs.
 
===Unicode===
The [[Unicode]] character set provides a line separator character and a paragraph separator character to represent the semantics of the soft return and the hard return, respectively.
 
:U+2028  LINE SEPARATOR
:        * may be used to represent this semantic unambiguously
:U+2029  PARAGRAPH SEPARATOR
:        * may be used to represent this semantic unambiguously
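
A minimal Python sketch of how these two characters could be used to store a document is given below. The function name <code>split_paragraphs</code> and the sample address are hypothetical and only serve as an illustration.

```python
LINE_SEPARATOR = "\u2028"       # soft return: new line, same paragraph
PARAGRAPH_SEPARATOR = "\u2029"  # hard return: new paragraph


def split_paragraphs(text):
    """Split text into paragraphs, each returned as a list of lines."""
    return [p.split(LINE_SEPARATOR) for p in text.split(PARAGRAPH_SEPARATOR)]


# A postal address is one paragraph made of several lines.
address = LINE_SEPARATOR.join(["Jane Doe", "1 Main Street", "Springfield"])
document = "First paragraph." + PARAGRAPH_SEPARATOR + address

print(split_paragraphs(document))
```

Python's built-in <code>str.splitlines</code> also treats both characters as line boundaries, but it does not distinguish between them, which is why the sketch splits on each character explicitly.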
 
==Word boundaries, hyphenation, and hard spaces==
<!-- this section could probably use some example illustrations of text before and after wrapping.  Maybe pull from /RTF Pocket Guide/ -->
The soft returns are usually placed after the ends of complete words, or after the punctuation that follows complete words. However, word wrap may also occur following a [[hyphen]] inside of a word.  This is sometimes not desired, and can be blocked by using a '''[[non-breaking hyphen]]''', or '''[[hard hyphen]]''', instead of a regular hyphen.
 
A word without hyphens can be made wrappable by having '''[[soft hyphen]]s''' in it.  When the word isn't wrapped (i.e., isn't broken across lines), the soft hyphen isn't visible.  But if the word is wrapped across lines, this is done at the soft hyphen, at which point it is shown as a visible hyphen at the end of the line where the word is broken.  (In the rare case of a word that is meant to be wrappable by breaking it across lines but ''without'' making a hyphen ever appear, a '''[[zero-width space]]''' is put at the permitted breaking point(s) in the word.)<!-- example? a URL maybe? those are so long that they often need breaking, but must never have a hyphen introduced into them. -->
 
Sometimes word wrap is undesirable between adjacent words.  In such cases, word wrap can usually be blocked by using a '''hard space''' or '''[[non-breaking space]]''' between the words, instead of regular spaces.
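
The rules in this section can be sketched in Python as a function that lists the positions where a break is permitted. This is a deliberately simplified model, not a full line-breaking implementation: the function name <code>break_opportunities</code> is hypothetical, and only the five characters discussed above are considered.

```python
NBSP = "\u00a0"         # NO-BREAK SPACE (hard space): never a break point
NB_HYPHEN = "\u2011"    # NON-BREAKING HYPHEN (hard hyphen): never a break point
SOFT_HYPHEN = "\u00ad"  # SOFT HYPHEN: invisible unless the word is broken here
ZWSP = "\u200b"         # ZERO WIDTH SPACE: break point that never shows a hyphen

# Characters after which a line break is allowed in this simplified model.
BREAK_AFTER = {" ", "-", SOFT_HYPHEN, ZWSP}


def break_opportunities(text):
    """Return each index i where a break may fall between text[i-1] and text[i]."""
    return [i + 1 for i, ch in enumerate(text[:-1]) if ch in BREAK_AFTER]


print(break_opportunities("well-known fact"))
print(break_opportunities("well" + NB_HYPHEN + "known" + NBSP + "fact"))
```

In the second call, the hard hyphen and the hard space leave no permitted break, so the whole phrase stays on one line.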
 
==Word wrapping in text containing Chinese, Japanese, and Korean==
In [[Chinese language|Chinese]], [[Japanese language|Japanese]], and [[Korean language|Korean]], each [[Han character]] is normally considered a word,{{Citation needed|date=May 2011}} and therefore word wrapping can usually occur before and after any Han character. By extension, Japanese [[Kana|kana]], the characters of the Japanese syllabaries, are treated the same way as Han characters ([[kanji]]), meaning that words can be, and tend to be, broken without any hyphen or other indication that this has happened.
 
Under certain circumstances, however, word wrapping is not desired. For instance,
* word wrapping might not be desired within personal names, and
* word wrapping might not be desired within compound words (when the text is flush left, and then only in some styles).
 
Most existing word processors and [[typesetting]] software cannot handle either of the above scenarios.
 
[[CJK]] punctuation may or may not follow rules similar to the above-mentioned special circumstances; this is determined by the [[Line breaking rules in East Asian language|line breaking rules in CJK]].
 
A special case of line breaking rules in CJK, however, always applies: line wrap must never occur inside the CJK dash and ellipsis. Even though each of these punctuation marks must be represented by two characters due to a limitation of all existing [[character encoding]]s, each is intrinsically a single punctuation mark that is two [[em (typography)|em]]s wide, not two one-em-wide punctuation marks.
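
The following Python sketch illustrates only the two points made in this section: a break is allowed between any two characters, except inside a doubled dash or doubled ellipsis. It assumes that the dash is written as two U+2014 characters and the ellipsis as two U+2026 characters, and it ignores all other CJK line breaking rules. The function name <code>cjk_break_points</code> is hypothetical.

```python
# Punctuation that is written twice but must be treated as one unbreakable mark.
# Assumption for this sketch: EM DASH (U+2014) and HORIZONTAL ELLIPSIS (U+2026).
DOUBLED_MARKS = {"\u2014", "\u2026"}


def cjk_break_points(text):
    """Return each index i where a break may fall between text[i-1] and text[i]."""
    points = []
    for i in range(1, len(text)):
        inside_doubled_mark = text[i - 1] == text[i] and text[i] in DOUBLED_MARKS
        if not inside_doubled_mark:
            points.append(i)
    return points


# Two Han characters, a doubled dash, then one more Han character.
sample = "\u4e2d\u6587\u2014\u2014\u5b57"
print(cjk_break_points(sample))
```

Every position is a candidate break except index 3, which lies between the two halves of the dash.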
 
==Algorithm==
Word wrapping is an optimization problem. Depending on what needs to be optimized for, different algorithms are used.
 
=== Minimum number of lines ===
A simple way to do word wrapping is to use a [[greedy algorithm]] that puts as many words on a line as possible, then moves on to the next line and does the same until there are no more words left to place. This method is used by many modern word processors, such as [[OpenOffice.org Writer]] and [[Microsoft Word]]. This algorithm always uses the minimum possible number of lines but may lead to lines of widely varying lengths. The following pseudocode implements this algorithm:
 
SpaceLeft := LineWidth
for each Word in Text
    if Word is the first word in Text
        SpaceLeft := LineWidth - Width(Word)
    else if (SpaceWidth + Width(Word)) > SpaceLeft
        insert line break before Word in Text
        SpaceLeft := LineWidth - Width(Word)
    else
        SpaceLeft := SpaceLeft - (SpaceWidth + Width(Word))
 
Here <code>LineWidth</code> is the width of a line, <code>SpaceLeft</code> is the remaining width of space on the line to fill, <code>SpaceWidth</code> is the width of a single space character, <code>Text</code> is the input text to iterate over, and <code>Word</code> is a word in this text.
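
The greedy approach can also be sketched as runnable Python. This version assumes monospaced text, so that the width of a word is its number of characters and a space has width 1, and it assumes that no single word is longer than the line. The function name <code>greedy_wrap</code> is chosen for this example only.

```python
def greedy_wrap(text, line_width):
    """Greedy word wrap: put as many words as possible on each line."""
    lines = []
    current = []        # words on the line being built
    current_width = 0   # width of those words plus the spaces between them
    for word in text.split():
        if not current:
            # The first word on a line needs no preceding space.
            current = [word]
            current_width = len(word)
        elif current_width + 1 + len(word) <= line_width:
            # The word fits after one separating space.
            current.append(word)
            current_width += 1 + len(word)
        else:
            # The word does not fit: close this line and start a new one.
            lines.append(" ".join(current))
            current = [word]
            current_width = len(word)
    if current:
        lines.append(" ".join(current))
    return lines


print(greedy_wrap("aaa bb cc ddddd", 6))
```

A space is charged only between two words on the same line, so a word that exactly fills the remaining width is still accepted.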
 
=== Minimum raggedness ===
 
A different algorithm, used in [[TeX]], minimizes the sum of the squares of the lengths of the spaces at the end of lines to produce a more aesthetically pleasing result. The following example compares this method with the greedy algorithm, which does not always minimize squared space.
 
For the input text
 
aaa bb cc ddddd
 
with line width 6, the greedy algorithm would produce:
 
------    Line width: 6
aaa bb    Remaining space: 0
cc        Remaining space: 4
ddddd     Remaining space: 1
 
The sum of squared space left over by this method is <math>0^2 + 4^2 + 1^2 = 17</math>. However, the optimal solution achieves the smaller sum <math>3^2 + 1^2 + 1^2 = 11</math>:
 
------    Line width: 6
aaa       Remaining space: 3
bb cc     Remaining space: 1
ddddd     Remaining space: 1
 
The difference here is that the first line is broken before <code>bb</code> instead of after it, yielding a better right margin and a lower cost of 11.
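
The two costs in this example can be checked with a few lines of Python. The helper name <code>raggedness</code> is hypothetical, and widths are counted in characters.

```python
def raggedness(lines, line_width):
    """Sum of the squares of the space left at the end of each line."""
    return sum((line_width - len(line)) ** 2 for line in lines)


greedy_layout = ["aaa bb", "cc", "ddddd"]
optimal_layout = ["aaa", "bb cc", "ddddd"]

print(raggedness(greedy_layout, 6))   # 0^2 + 4^2 + 1^2
print(raggedness(optimal_layout, 6))  # 3^2 + 1^2 + 1^2
```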
 
By using a [[dynamic programming]] algorithm to choose the positions at which to break the line, instead of choosing breaks greedily, the solution with minimum raggedness may be found in time <math>\mathcal{O}(n^2)</math>, where <math>n</math> is the number of words in the input text. Typically, the cost function for this technique should be modified so that it does not count the space left on the final line of a paragraph; this modification allows a paragraph to end in the middle of a line without penalty. It is also possible to apply the same dynamic programming technique to minimize more complex cost functions that combine other factors such as the number of lines or costs for hyphenating long words.<ref name="knuth-plass">{{citation
| last1 = Knuth | first1 = Donald E. | author1-link = Donald Knuth
| last2 = Plass | first2 = Michael F.
| doi = 10.1002/spe.4380111102
| issue = 11
| journal = Software: Practice and Experience
| pages = 1119–1184
| title = Breaking paragraphs into lines
| volume = 11
| year = 1981}}.</ref> Faster but more complicated [[linear time]] algorithms based on the [[SMAWK algorithm]] are also known for the minimum raggedness problem, and for some other cost functions that have similar properties.<ref>{{citation
| last = Wilber | first = Robert
| doi = 10.1016/0196-6774(88)90032-6
| mr = 955150
| issue = 3
| journal = Journal of Algorithms
| pages = 418–425
| title = The concave least-weight subsequence problem revisited
| volume = 9
| year = 1988}}.</ref><ref>{{citation
| last1 = Galil | first1 = Zvi | author1-link = Zvi Galil
| last2 = Park | first2 = Kunsoo
| doi = 10.1016/0020-0190(90)90215-J
| mr = 1045521
| issue = 6
| journal = Information Processing Letters
| pages = 309–311
| title = A linear-time algorithm for concave one-dimensional dynamic programming
| volume = 33
| year = 1990}}.</ref>
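
A minimal Python sketch of the quadratic-time dynamic programming approach follows. It uses only the simple sum-of-squares cost described in this section, not the full cost model of Knuth and Plass, and it assumes monospaced text. The function name <code>min_raggedness_wrap</code> and the <code>ignore_last_line</code> parameter are chosen for this example.

```python
def min_raggedness_wrap(text, line_width, ignore_last_line=False):
    """Choose line breaks that minimize the sum of squared trailing space."""
    words = text.split()
    n = len(words)
    infinity = float("inf")
    # best[i] is the minimum cost of laying out words[i:]; best[n] is 0.
    best = [infinity] * n + [0]
    # next_break[i] is the index of the first word on the line after words[i:j].
    next_break = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        length = -1  # cancels the space counted before the first word
        for j in range(i, n):
            length += 1 + len(words[j])
            if length > line_width:
                break  # words[i..j] no longer fit on one line
            slack = line_width - length
            is_last_line = j == n - 1
            cost = 0 if (is_last_line and ignore_last_line) else slack ** 2
            if cost + best[j + 1] < best[i]:
                best[i] = cost + best[j + 1]
                next_break[i] = j + 1
    if n and best[0] == infinity:
        raise ValueError("some word is wider than the line")
    # Follow the recorded break positions to build the lines.
    lines = []
    i = 0
    while i < n:
        j = next_break[i]
        lines.append(" ".join(words[i:j]))
        i = j
    return lines


print(min_raggedness_wrap("aaa bb cc ddddd", 6))
```

Setting <code>ignore_last_line=True</code> applies the modification mentioned above, in which the space left on the final line of a paragraph is not counted.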
 
===History===
A primitive line-breaking feature was used in 1955 in a "page printer control unit" developed by [[Western Union]]. This system used relays rather than programmable digital computers, and therefore needed a simple algorithm that could be implemented without [[data buffer]]s. In the Western Union system, each line was broken at the first space character to appear after the 58th character, or at the 70th character if no space character was found.<ref>{{citation|url=http://massis.lcs.mit.edu/archives/technical/western-union-tech-review/10-1/p040.htm|journal=Western Union Technical Review|volume=10|issue=1|date=January 1956|first=Robert W.|last=Harris|title=Keyboard standardization|pages=37–42}}.</ref>
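
The rule described above can be modelled in a few lines of Python. This is only a sketch of one reading of that description, not a reconstruction of the relay-based hardware: it assumes that the space at which a line is broken is dropped, and that a forced break falls immediately after the 70th character. The function name and parameter names are hypothetical.

```python
def western_union_break(text, soft_limit=58, hard_limit=70):
    """Break at the first space after soft_limit characters, else at hard_limit."""
    lines = []
    current = []
    for ch in text:
        if ch == " " and len(current) >= soft_limit:
            # First space after the 58th character: end the line here.
            lines.append("".join(current))
            current = []
            continue
        current.append(ch)
        if len(current) == hard_limit:
            # No usable space was found: force a break at the 70th character.
            lines.append("".join(current))
            current = []
    if current:
        lines.append("".join(current))
    return lines


print(western_union_break("x" * 58 + " " + "yyy"))
```

Only the current character and a running count are needed, which is consistent with the statement that the scheme could work without data buffers.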
 
The greedy algorithm for line-breaking predates the dynamic programming method outlined by [[Donald Knuth]] in an unpublished 1977 memo describing his [[TeX]] typesetting system<ref>{{citation|first=Donald|last=Knuth|authorlink=Donald Knuth|url=http://www.saildart.org/TEXDR.AFT%5B1,DEK%5D|title=TEXDR.AFT|year=1977|accessdate=2013-04-07}}. Reprinted in {{citation|first=Donald|last=Knuth|authorlink=Donald Knuth|title=Digital Typography|location=Stanford, California|publisher=Center for the Study of Language and Information|year=1999|series=CSLI Lecture Notes|volume=78|isbn=1-57586-010-4}}.</ref> and later published in more detail by {{harvtxt|Knuth|Plass|1981}}.
 
== See also ==
 
* [[Word divider]]
* [[Non-breaking space]]
* [[Zero-width space]]
 
==References==
{{reflist}}
 
== External links ==
=== Knuth's algorithm ===
* [http://defoe.sourceforge.net/folio/knuth-plass.html "Knuth & Plass line-breaking Revisited"]
* [http://oedipus.sourceforge.net/texlib/ "tex_wrap": "Implements TeX's algorithm for breaking paragraphs into lines."] Reference: "Breaking Paragraphs into Lines", D.E. Knuth and M.F. Plass, chapter 3 of ''Digital Typography'', CSLI Lecture Notes #78.
* [https://metacpan.org/module/Text::Reflow Text::Reflow - Perl module for reflowing text files using Knuth's paragraphing algorithm.] "The reflow algorithm tries to keep the lines the same length but also tries to break at punctuation, and avoid breaking within a proper name or after certain connectives ("a", "the", etc.). The result is a file with a more "ragged" right margin than is produced by fmt or Text::Wrap but it is easier to read since fewer phrases are broken across line breaks."
* [http://www.nabble.com/Initial-soft-hyphen-support-t2970713.html adjusting the Knuth algorithm] to recognize the [[Hyphen#Hyphens_in_computing|"soft hyphen"]].
* [http://wiki.apache.org/xmlgraphics-fop/KnuthsModel Knuth's breaking algorithm.] "The detailed description of the model and the algorithm can be found on the paper "Breaking Paragraphs into Lines" by Donald E. Knuth, published in the book "Digital Typography" (Stanford, California: Center for the Study of Language and Information, 1999), (CSLI Lecture Notes, no. 78.)" ; part of [http://wiki.apache.org/xmlgraphics-fop/GoogleSummerOfCode2006/FloatsImplementationProgress Google Summer Of Code 2006]
* [http://citeseer.ist.psu.edu/23630.html "Bridging the Algorithm Gap: A Linear-time Functional Program for Paragraph Formatting"] by Oege de Moor, Jeremy Gibbons, 1999
 
=== Other word-wrap links ===
* [http://www.codecomments.com/message230162.html the reverse problem -- picking columns just wide enough to fit (wrapped) text]  ([http://archive.is/Swmvx Archived version])
* [http://api.kde.org/4.x-api/kdelibs-apidocs/kdeui/html/classKWordWrap.html KWordWrap Class Reference] used in the KDE GUI
* [http://www.leverkruid.eu/GKPLinebreaking/elements.html "Knuth linebreaking elements for Formatting Objects"] by Simon Pepping 2006. Extends the Knuth model to handle a few enhancements.
* [http://wiki.apache.org/xmlgraphics-fop/PageLayout/ "Page breaking strategies"] Extends the Knuth model to handle a few enhancements.
* [http://www.techwr-l.com/archives/0504/techwhirl-0504-00203.html "a Knuth-Plass-like linebreaking algorithm] ... The *really* interesting thing is how Adobe's algorithm differs from the Knuth-Plass algorithm. It must differ, since Adobe has managed to patent its algorithm (6,510,441)."[http://www.techwr-l.com/archives/0504/techwhirl-0504-00206.html ]
* [http://blogs.msdn.com/murrays/archive/2006/11/15/lineservices.aspx "Murray Sargent: Math in Office"]
 
[[Category:Text editor features]]
[[Category:Typography]]
[[Category:Dynamic programming]]

Latest revision as of 07:38, 25 July 2014
