<html><head><title>Text::KnuthPlass</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" >
</head>
<body class='pod'>
<!--
  generated by Pod::Simple::HTML v3.43,
  using Pod::Simple::PullParser v3.43,
  under Perl v5.026001 at Mon Oct  3 16:10:18 2022 GMT.

 If you want to change this HTML document, you probably shouldn't do that
   by changing it directly.  Instead, see about changing the calling options
   to Pod::Simple::HTML, and/or subclassing Pod::Simple::HTML,
   then reconverting this document from the Pod source.
   When in doubt, email the author of Pod::Simple::HTML for advice.
   See 'perldoc Pod::Simple::HTML' for more info.

-->

<!-- start doc -->
<a name='___top' class='dummyTopAnchor' ></a>

<div class='indexgroup'>
<ul   class='indexList indexList1'>
  <li class='indexItem indexItem1'><a href='#NAME'>NAME</a>
  <li class='indexItem indexItem1'><a href='#SYNOPSIS'>SYNOPSIS</a>
  <li class='indexItem indexItem1'><a href='#METHODS'>METHODS</a>
  <ul   class='indexList indexList2'>
    <li class='indexItem indexItem2'><a href='#%24t_%3D_Text%3A%3AKnuthPlass-%3Enew(%25opts)'>$t = Text::KnuthPlass-&#62;new(%opts)</a>
    <li class='indexItem indexItem2'><a href='#%24t-%3Etypeset(%24paragraph_string%2C_%25opts)'>$t-&#62;typeset($paragraph_string, %opts)</a>
    <li class='indexItem indexItem2'><a href='#%24t-%3Eline_lengths()'>$t-&#62;line_lengths()</a>
    <li class='indexItem indexItem2'><a href='#%24t-%3Ebreak_text_into_nodes(%24paragraph_string%2C_%25opts)'>$t-&#62;break_text_into_nodes($paragraph_string, %opts)</a>
    <ul   class='indexList indexList3'>
      <li class='indexItem indexItem3'><a href='#%27style%27_%3D%3E_%22string_name%22'>&#39;style&#39; =&#62; &#34;string_name&#34;</a>
    </ul>
    <li class='indexItem indexItem2'><a href='#break'>break</a>
    <li class='indexItem indexItem2'><a href='#%40lines_%3D_%24t-%3Ebreakpoints_to_lines(%5C%40breakpoints%2C_%5C%40nodes)'>@lines = $t-&#62;breakpoints_to_lines(\@breakpoints, \@nodes)</a>
    <li class='indexItem indexItem2'><a href='#boxclass()'>boxclass()</a>
    <li class='indexItem indexItem2'><a href='#glueclass()'>glueclass()</a>
    <li class='indexItem indexItem2'><a href='#penaltyclass()'>penaltyclass()</a>
  </ul>
  <li class='indexItem indexItem1'><a href='#AUTHOR'>AUTHOR</a>
  <li class='indexItem indexItem1'><a href='#ACKNOWLEDGEMENTS'>ACKNOWLEDGEMENTS</a>
  <li class='indexItem indexItem1'><a href='#BUGS'>BUGS</a>
  <li class='indexItem indexItem1'><a href='#COPYRIGHT_%26_LICENSE'>COPYRIGHT &#38; LICENSE</a>
</ul>
</div>

<h1><a class='u' href='#___top' title='click to go to top of document'
name="NAME"
>NAME</a></h1>

<p>Text::KnuthPlass - Breaks paragraphs into lines using the TeX (Knuth-Plass) algorithm</p>

<h1><a class='u' href='#___top' title='click to go to top of document'
name="SYNOPSIS"
>SYNOPSIS</a></h1>

<p>To use with plain text,
with a paragraph indentation of 2 characters.
NOTE that you should also set the shrinkability of spaces to 0 in the <code>new()</code> call:</p>

<pre>    use Text::KnuthPlass;
    my $typesetter = Text::KnuthPlass-&#62;new(
        &#39;indent&#39; =&#62; 2, # two characters,
        # set space shrinkability to 0
        &#39;space&#39; =&#62; { &#39;width&#39; =&#62; 3, &#39;stretch&#39; =&#62; 6, &#39;shrink&#39; =&#62; 0 },
        # can let &#39;measure&#39; default to character count
        # default line lengths to 78 characters
    );
    my @lines = $typesetter-&#62;typeset($paragraph);
    ...

    for my $line (@lines) {
        for my $node (@{$line-&#62;{&#39;nodes&#39;}}) {
            if ($node-&#62;isa(&#34;Text::KnuthPlass::Box&#34;)) { 
                # a Box is a word or word fragment (no hyphen on fragment)
                print $node-&#62;value();
            } elsif ($node-&#62;isa(&#34;Text::KnuthPlass::Glue&#34;)) {
                # a Glue is (at least) a single space, but you can look at 
                # the line&#39;s &#39;ratio&#39; to insert additional spaces to 
                # justify the line. we also are glossing over the skipping
                # of any final glue at the end of the line
                print &#34; &#34;;
            }
            # ignoring Penalty (word split point) within line
        }
        if ($line-&#62;{&#39;nodes&#39;}[-1]-&#62;is_penalty()) { print &#34;-&#34;; }
        print &#34;\n&#34;;
    }</pre>
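
<p>The comments above mention using the line&#39;s <code>ratio</code> to insert additional spaces. One possible approach for character output is sketched below. It assumes you have already computed the adjusted (possibly fractional) width of each glue in a line; <code>space_counts()</code> is a hypothetical helper, not part of Text::KnuthPlass. It rounds the running total rather than each width, so rounding errors do not accumulate across the line:</p>

<pre>    use strict;
    use warnings;

    # Hypothetical helper (not part of Text::KnuthPlass): turn fractional
    # glue widths into whole-space counts by rounding the running total
    sub space_counts {
        my @widths = @_;
        my ($exact, $placed) = (0, 0);
        my @counts;
        for my $w (@widths) {
            $exact += $w;
            my $n = int($exact + 0.5) - $placed;  # spaces for this glue
            push @counts, $n;
            $placed += $n;
        }
        return @counts;
    }

    print join(&#39;,&#39;, space_counts(1.5, 1.5, 1.5, 1.5)), &#34;\n&#34;;  # 2,1,2,1</pre>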

<p>To use with PDF::Builder: (also PDF::API2)</p>

<pre>    my $text = $page-&#62;text();
    $text-&#62;font($font, 12);
    $text-&#62;leading(13.5);

    my $t = Text::KnuthPlass-&#62;new(
        &#39;indent&#39; =&#62; 2*$text-&#62;text_width(&#39;M&#39;), # 2 ems
        &#39;measure&#39; =&#62; sub { $text-&#62;text_width(shift) }, 
        &#39;linelengths&#39; =&#62; [235]  # points
    );
    my @lines = $t-&#62;typeset($paragraph);

    my $y = 500;  # PDF decreases y down the page
    for my $line (@lines) {
        my $x = 50;  # left margin
        for my $node (@{$line-&#62;{&#39;nodes&#39;}}) {
            $text-&#62;translate($x,$y);
            if ($node-&#62;isa(&#34;Text::KnuthPlass::Box&#34;)) {
                # a Box is a word or word fragment (no hyphen on fragment)
                $text-&#62;text($node-&#62;value());
                $x += $node-&#62;width();
            } elsif ($node-&#62;isa(&#34;Text::KnuthPlass::Glue&#34;)) {
                # a Glue is a variable-width space
                $x += $node-&#62;width() + $line-&#62;{&#39;ratio&#39;} *
                    ($line-&#62;{&#39;ratio&#39;} &#60; 0 ? $node-&#62;shrink(): $node-&#62;stretch());
                # we also are glossing over the skipping
                # of any final glue at the end of the line
            }
            # ignoring Penalty (word split point) within line
        }
        # explicitly add a hyphen at a line-ending split word
        if ($line-&#62;{&#39;nodes&#39;}[-1]-&#62;is_penalty()) { $text-&#62;text(&#34;-&#34;); }
        $y -= $text-&#62;leading(); # go to next line down
    }</pre>

<h1><a class='u' href='#___top' title='click to go to top of document'
name="METHODS"
>METHODS</a></h1>

<h2><a class='u' href='#___top' title='click to go to top of document'
name="$t_=_Text::KnuthPlass-&#62;new(%opts)"
>$t = Text::KnuthPlass-&#62;new(%opts)</a></h2>

<p>The constructor takes a number of options. The most important ones are:</p>

<dl>
<dt><a name="measure"
>measure</a></dt>

<dd>
<p>A subroutine reference to determine the width of a piece of text. This defaults to <code>length(shift)</code>, which is what you want if you&#39;re typesetting plain monospaced text. You will need to change this to plug into your font metrics if you&#39;re doing something graphical. For PDF::Builder (also PDF::API2), this would be the <code>advancewidth()</code> method (alias <code>text_width()</code>), which returns the width of a string (in the present font and size) in points.</p>

<pre>    &#39;measure&#39; =&#62; sub { length(shift) },  # default, for character output
    &#39;measure&#39; =&#62; sub { $text-&#62;advancewidth(shift) }, # PDF::Builder/API2</pre>

<dt><a name="linelengths"
>linelengths</a></dt>

<dd>
<p>This is an array of line lengths. For instance, <code> [30,40,50] </code> will typeset a triangle-shaped piece of text with three lines. What if the text spills over to more than three lines? In that case, the final value in the array is used for all further lines. So to typeset an ordinary block-shaped column of text, you only need specify an array with one value: the default is <code> [78] </code>. Note that this default would be the character count, rather than points (as needed by PDF::Builder or PDF::API2).</p>

<pre>    &#39;linelengths&#39; =&#62; [$lw, $lw, $lw-6, $lw-6, $lw],</pre>

<p>This would set the first two lines in the paragraph to <code>$lw</code> length, the next two to 6 less (such as for a float inset), and finally back to full length. At each line, the first element is consumed, but the last element is never removed. Any paragraph indentation set will result in a shorter-appearing first line, which actually has blank space at its beginning. Start output of the first line at the same <code>x</code> value as you do the other lines.</p>

<p>Setting <code>linelengths</code> in the <code>new()</code> (constructor) call resets the internal line length list to the new elements, overwriting anything that was already there (such as any remaining line lengths left over from a previous <code>typeset()</code> call). Subsequent <code>typeset()</code> calls will continue to consume the existing line length list, until the last element is reached. You can either reset the list for the next paragraph with the <code>typeset()</code> call, or call the <code>line_lengths()</code> method to get or set the list.</p>
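
<p>The consumption rule described above (each line takes the first element, but the last element is never removed) can be sketched in plain Perl. <code>next_line_length()</code> is a hypothetical helper written for illustration only; it is not the module&#39;s internal code:</p>

<pre>    use strict;
    use warnings;

    # Hypothetical helper: return the length for the next line, consuming
    # the list but never removing its last element
    sub next_line_length {
        my ($lengths) = @_;
        return @$lengths &#62; 1 ? shift @$lengths : $lengths-&#62;[0];
    }

    my @ll   = (30, 40, 50);
    my @used = map { next_line_length(\@ll) } 1 .. 5;
    print &#34;@used\n&#34;;   # 30 40 50 50 50</pre>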

<dt><a name="indent"
>indent</a></dt>

<dd>
<p>This sets the global (default) paragraph indentation, unless overridden on a per-paragraph basis by an <code>indent</code> entry in a <code>typeset()</code> call. The units are the same as for <code>measure</code> and <code>linelengths</code>. A &#34;Box&#34; of value <code>&#39;&#39;</code> and width of <code>indent</code> is inserted before the first node of the paragraph. Your rendering code should know how to handle this by starting at the same <code>x</code> coordinate as other lines, and then moving right (or left) by the indicated amount.</p>

<pre>    &#39;indent&#39; =&#62; 2,  # 2 character indentation
    &#39;indent&#39; =&#62; 2*$text-&#62;text_width(&#39;M&#39;),  # 2 ems indentation
    &#39;indent&#39; =&#62; -3,  # 3 character OUTdent</pre>

<p>If the value is negative, a negative-width space Box is added. The overall line will be longer than other lines, by that amount. Again, your rendering code should handle this in a similar manner as with a positive indentation, but move <i>left</i> by the indicated amount. Be careful to have your starting <code>x</code> value far enough to the right that text will not end up being written off-page.</p>
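
<p>As a rough illustration of how rendering code might treat the indent Box, the sketch below advances <code>x</code> by every node&#39;s width and records where each non-empty Box would be drawn. It uses plain hashes as stand-ins for real node objects, leaves glue at its natural width (no <code>ratio</code> adjustment), and <code>box_positions()</code> is a hypothetical helper, not part of the module:</p>

<pre>    use strict;
    use warnings;

    # Hypothetical helper: walk a line&#39;s nodes from starting coordinate $x
    # and report &#34;value@x&#34; for each Box that carries text
    sub box_positions {
        my ($x, @nodes) = @_;
        my @placed;
        for my $node (@nodes) {
            push @placed, &#34;$node-&#62;{&#39;value&#39;}\@$x&#34;
                if $node-&#62;{&#39;type&#39;} eq &#39;box&#39; &#38;&#38; $node-&#62;{&#39;value&#39;} ne &#39;&#39;;
            $x += $node-&#62;{&#39;width&#39;};   # a negative width moves left
        }
        return @placed;
    }

    my @words = (
        { &#39;type&#39; =&#62; &#39;box&#39;,  &#39;value&#39; =&#62; &#39;Two&#39;,   &#39;width&#39; =&#62; 3 },
        { &#39;type&#39; =&#62; &#39;glue&#39;,                     &#39;width&#39; =&#62; 1 },
        { &#39;type&#39; =&#62; &#39;box&#39;,  &#39;value&#39; =&#62; &#39;words&#39;, &#39;width&#39; =&#62; 5 },
    );
    my $indent  = { &#39;type&#39; =&#62; &#39;box&#39;, &#39;value&#39; =&#62; &#39;&#39;, &#39;width&#39; =&#62;  2 };
    my $outdent = { &#39;type&#39; =&#62; &#39;box&#39;, &#39;value&#39; =&#62; &#39;&#39;, &#39;width&#39; =&#62; -3 };

    print join(&#39; &#39;, box_positions(10, $indent,  @words)), &#34;\n&#34;;  # Two@12 words@16
    print join(&#39; &#39;, box_positions(10, $outdent, @words)), &#34;\n&#34;;  # Two@7 words@11</pre>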

<dt><a name="tolerance"
>tolerance</a></dt>

<dd>
<p>How much leeway we have in leaving wider spaces than the algorithm would prefer. The <code>tolerance</code> is the maximum <code>ratio</code> glue expansion value to <i>tolerate</i> in a possible solution, before discarding this solution as so infeasible as to be a waste of time to pursue further. Most of the time, the <code>tolerance</code> is going to have a value in the 1 to 3 range. One approach is to try with <code>tolerance =&#62; 1</code>, and if no successful layout is found, try again with 2, and then 3 and perhaps even 4.</p>
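
<p>The retry approach just described might be sketched as follows. <code>first_layout()</code> is a hypothetical helper, and the sketch assumes that a failed attempt yields an empty list; the stand-in attempt below avoids depending on any particular paragraph:</p>

<pre>    use strict;
    use warnings;

    # Hypothetical helper: call $attempt-&#62;($tolerance) with increasing
    # tolerances until it returns a non-empty list of lines
    sub first_layout {
        my ($attempt, @tolerances) = @_;
        for my $tol (@tolerances) {
            my @lines = $attempt-&#62;($tol);
            return ($tol, @lines) if @lines;
        }
        return;   # nothing worked, even at the loosest tolerance
    }

    # Stand-in for a real attempt, which might look like
    #   Text::KnuthPlass-&#62;new(&#39;tolerance&#39; =&#62; $tol)-&#62;typeset($paragraph)
    # This fake one only &#34;succeeds&#34; once the tolerance reaches 3.
    my $fake_attempt = sub {
        my $tol = shift;
        return $tol &#62;= 3 ? (&#39;line 1&#39;, &#39;line 2&#39;) : ();
    };

    my ($used, @lines) = first_layout($fake_attempt, 1 .. 4);
    print &#34;tolerance $used gave &#34;, scalar(@lines), &#34; lines\n&#34;;</pre>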

<dt><a name="hyphenator"
>hyphenator</a></dt>

<dd>
<p>An object which hyphenates words. If you have the <code>Text::Hyphen</code> module installed (which is highly recommended), then a <code>Text::Hyphen</code> object is instantiated by default; if not, an object of the class <code>Text::KnuthPlass::DummyHyphenator</code> is instantiated, which simply finds no hyphenation points at all. So to turn hyphenation off, set</p>

<pre>    &#39;hyphenator&#39; =&#62; Text::KnuthPlass::DummyHyphenator-&#62;new()</pre>

<p>To typeset non-English text, pass in a <code>Text::Hyphen</code>-like object which responds to the <code>hyphenate</code> method, returning a list of hyphen positions for that particular language (native <code>Text::Hyphen</code> defaults to American English hyphenation rules). (See <code>Text::Hyphen</code> for the interface.)</p>

<dt><a name="space"
>space</a></dt>

<dd>
<p>Fine tune space (glue) width, stretchability, and shrinkability.</p>

<pre>    &#39;space&#39; =&#62; { &#39;width&#39; =&#62; 3, &#39;stretch&#39; =&#62; 6, &#39;shrink&#39; =&#62; 9 },</pre>

<p>For typesetting constant width text or output to a text file (characters), we suggest setting the <code>shrink</code> value to 0. This prevents the glue spaces from being shrunk to less than one character wide, which could result in either no spaces between words, or overflow into the right margin.</p>

<pre>    &#39;space&#39; =&#62; { &#39;width&#39; =&#62; 3, &#39;stretch&#39; =&#62; 6, &#39;shrink&#39; =&#62; 0 },</pre>

<dt><a name="infinity"
>infinity</a></dt>

<dd>
<p>The default value for <i>infinity</i> is, as is customary in TeX, 10000. While this is a far cry from the real infinity, so long as it is substantially larger than any other demerit or penalty, it should take precedence in calculations. Both positive and negative <code>infinity</code> are used in the code for various purposes, including a <code>+inf</code> penalty for something absolutely forbidden, and <code>-inf</code> for something absolutely required (such as a line break at the end of a paragraph).</p>

<pre>    &#39;infinity&#39; =&#62; 10000,</pre>

<dt><a name="hyphenpenalty"
>hyphenpenalty</a></dt>

<dd>
<p>Sets the penalty for an end-of-line hyphen (50 in the example below). You may want to try a somewhat higher value, such as 100 or more, if you see too much hyphenation on output. Remember that excessively short lines are prone to splitting words and being hyphenated, no matter what the penalty is.</p>

<pre>    &#39;hyphenpenalty&#39; =&#62; 50,</pre>

<p>There does not appear to be anything in the code to find and prevent multiple contiguous (adjacent) hyphenated lines, nor to prevent the penultimate (next-to-last) line from being hyphenated, nor to prevent the hyphenation of a line where you anticipate the paragraph to be split between columns. Something may be done in the future about these three special cases, which are considered to not be good typesetting.</p>

<dt><a name="demerits"
>demerits</a></dt>

<dd>
<p>Various demerits used in calculating penalties, including <i>fitness</i>, which is used when line tightness (<code>ratio</code>) changes by more than one class between two lines.</p>

<pre>    &#39;demerits&#39; =&#62; { &#39;line&#39; =&#62; 10, &#39;flagged&#39; =&#62; 100, &#39;fitness&#39; =&#62; 3000 },</pre>
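
<p>To make the <i>fitness</i> idea more concrete, here is a small sketch that sorts a line&#39;s <code>ratio</code> into a tightness class and charges the fitness demerit when two adjacent lines differ by more than one class. The thresholds are the ones from the classic Knuth-Plass description and are an assumption here; the module&#39;s own source may differ in detail, and both subroutine names are hypothetical:</p>

<pre>    use strict;
    use warnings;

    # Assumed thresholds from the classic Knuth-Plass description
    sub fitness_class {
        my ($ratio) = @_;
        return 0 if $ratio &#60; -0.5;   # tight
        return 1 if $ratio &#60;= 0.5;   # normal
        return 2 if $ratio &#60;= 1;     # loose
        return 3;                    # very loose
    }

    # Extra demerits apply when adjacent lines differ by more than one class
    sub fitness_demerits {
        my ($prev_ratio, $ratio, $fitness) = @_;
        my $jump = abs(fitness_class($prev_ratio) - fitness_class($ratio));
        return $jump &#62; 1 ? $fitness : 0;
    }

    print fitness_demerits(-0.8, 1.2, 3000), &#34;\n&#34;;   # 3000 (tight, then very loose)
    print fitness_demerits( 0.2, 0.9, 3000), &#34;\n&#34;;   # 0    (normal, then loose)</pre>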
</dd>
</dl>

<p>There may be other options for fine-tuning the output. If you know your way around TeX, dig into the source to find out what they are. At some point, this package will support additional tuning by allowing the setting of more parameters which are currently hard-coded. Please let us know if you found any more parameters that would be useful to allow additional tuning!</p>

<h2><a class='u' href='#___top' title='click to go to top of document'
name="$t-&#62;typeset($paragraph_string,_%opts)"
>$t-&#62;typeset($paragraph_string, %opts)</a></h2>

<p>This is the main interface to the algorithm, made up of the constituent parts below. It takes a paragraph of text and returns a list of lines (array of hashes) if suitable breakpoints could be found.</p>

<p>The typesetter currently allows several options:</p>

<dl>
<dt><a name="indent"
>indent</a></dt>

<dd>
<p>Override the global paragraph indentation value <b>just for this paragraph.</b> This can be useful for instances such as <i>not</i> indenting the first paragraph in a section.</p>

<pre>    &#39;indent&#39; =&#62; 0,  # default set in new() is 2ems</pre>

<dt><a name="linelengths"
>linelengths</a></dt>

<dd>
<p>The array of line lengths may be set here, in <code>typeset</code>. As with <code>new()</code>, it will override whatever existing line lengths array is left over from earlier operations.</p>
</dd>
</dl>

<p>In the future, other global settings made in <code>new()</code> may also become overridable on a per-paragraph basis in <code>typeset()</code>.</p>

<p>The returned list has the following structure:</p>

<pre>    (
        { &#39;nodes&#39; =&#62; \@nodes, &#39;ratio&#39; =&#62; $ratio },
        { &#39;nodes&#39; =&#62; \@nodes, &#39;ratio&#39; =&#62; $ratio },
        ...
    )</pre>

<p>The node list in each element will be a list of objects. Each object will be either <code>Text::KnuthPlass::Box</code>, <code>Text::KnuthPlass::Glue</code> or <code>Text::KnuthPlass::Penalty</code>. See below for more on these.</p>

<p>The <code>ratio</code> is the amount of stretch or shrink which should be applied to each glue element in this line. The corrected width of each glue node should be:</p>

<pre>    $node-&#62;width() + $line-&#62;{&#39;ratio&#39;} *
        ($line-&#62;{&#39;ratio&#39;} &#60; 0 ? $node-&#62;shrink() : $node-&#62;stretch());</pre>
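
<p>A worked example may help. The sketch below applies the same formula to plain numbers instead of node objects; <code>glue_width()</code> is a hypothetical helper and the width, stretch, and shrink values are made up for illustration:</p>

<pre>    use strict;
    use warnings;

    # Hypothetical helper applying the formula above to plain numbers
    sub glue_width {
        my ($width, $stretch, $shrink, $ratio) = @_;
        return $width + $ratio * ($ratio &#60; 0 ? $shrink : $stretch);
    }

    # A glue of natural width 4, stretch 2, shrink 1 (made-up values)
    print glue_width(4, 2, 1,  0.5), &#34;\n&#34;;   # 5   (loose line: 4 + 0.5*2)
    print glue_width(4, 2, 1, -0.5), &#34;\n&#34;;   # 3.5 (tight line: 4 - 0.5*1)
    print glue_width(4, 2, 1,  0),   &#34;\n&#34;;   # 4   (natural width)</pre>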

<p>Each box, glue or penalty node has a <code>width</code> attribute. Boxes have <code>value</code>s, which are the text which went into them (including a wide null blank for paragraph indentation, a special case); glue has <code>stretch</code> and <code>shrink</code> to determine how much it should vary in width. That should be all you need for basic typesetting; for more, see the source, and see the original Knuth-Plass paper in &#34;Digital Typography&#34;.</p>

<p>Why <i>typeset</i> rather than something like <i>linesplit</i>? Per <a href="#ACKNOWLEDGEMENTS" class="podlinkpod"
>&#34;ACKNOWLEDGEMENTS&#34;</a>, this code is ported from the JavaScript library <b>typeset</b>.</p>

<p>This method is a thin wrapper around the three methods below.</p>

<h2><a class='u' href='#___top' title='click to go to top of document'
name="$t-&#62;line_lengths()"
>$t-&#62;line_lengths()</a></h2>

<dl>
<dt><a name="@list_=_$t-&#62;line_lengths()_#_Get"
>@list = $t-&#62;line_lengths() # Get</a></dt>

<dd>
<dt><a name="$t-&#62;line_lengths(@list)_#_Set"
>$t-&#62;line_lengths(@list) # Set</a></dt>

<dd>
<p>Get or set the <code>linelengths</code> list of allowed line lengths. This permits you to do more elaborate operations on this array than simply replacing (resetting) it, as done in the <code>new()</code> and <code>typeset()</code> methods. For example, at the bottom of a page, you might cancel any further inset for a float, by deleting all but the last element of the list.</p>

<pre>    my @temp_LL = $t-&#62;line_lengths();
    # cancel remaining line shortening
    splice(@temp_LL, 0, scalar(@temp_LL)-1);
    $t-&#62;line_lengths(@temp_LL);</pre>

<p>On a &#34;Set&#34; request, you must have at least one length element in the list. If the list is empty, it is assumed to be a &#34;Get&#34; request.</p>
</dd>
</dl>

<h2><a class='u' href='#___top' title='click to go to top of document'
name="$t-&#62;break_text_into_nodes($paragraph_string,_%opts)"
>$t-&#62;break_text_into_nodes($paragraph_string, %opts)</a></h2>

<p>This turns a paragraph into a list of box/glue/penalty nodes. It&#39;s fairly basic, and designed to be overridden. Support for multiple justification styles (centering, ragged right, etc.) is planned for a future release; for now, full justification is the only style that is complete (see the <code>style</code> entries below).</p>

<h3><a class='u' href='#___top' title='click to go to top of document'
name="&#39;style&#39;_=&#62;_&#34;string_name&#34;"
>&#39;style&#39; =&#62; &#34;string_name&#34;</a></h3>

<dl>
<dt><a name="&#34;justify&#34;"
>&#34;justify&#34;</a></dt>

<dd>
<p>Fully justify the text (flush left <i>and</i> right). This is the <b>default</b>, and currently <i>the only choice fully implemented and tested.</i></p>

<dt><a name="&#34;left&#34;"
>&#34;left&#34;</a></dt>

<dd>
<p>Not yet implemented. This will be flush left, ragged right (reversed for RTL scripts).</p>

<dt><a name="&#34;right&#34;"
>&#34;right&#34;</a></dt>

<dd>
<p>Not yet implemented. This will be flush right, ragged left (reversed for RTL scripts).</p>

<dt><a name="&#34;center&#34;"
>&#34;center&#34;</a></dt>

<dd>
<p>Implemented, but not yet fully tested. This is centered text within the indicated line width.</p>
</dd>
</dl>

<p>If you are doing clever typography or using non-Western languages you may find that you will want to break text into nodes yourself, and pass the list of nodes to the methods below, instead of using this method.</p>

<h2><a class='u' href='#___top' title='click to go to top of document'
name="break"
>break</a></h2>

<p>This implements the main body of the algorithm; it turns a list of nodes (produced from the above method) into a list of breakpoint objects.</p>

<h2><a class='u' href='#___top' title='click to go to top of document'
name="@lines_=_$t-&#62;breakpoints_to_lines(\@breakpoints,_\@nodes)"
>@lines = $t-&#62;breakpoints_to_lines(\@breakpoints, \@nodes)</a></h2>

<p>And this takes the breakpoints and the nodes, and assembles them into lines.</p>
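
<p>Since <code>typeset()</code> is described as a thin wrapper around these three methods, calling them by hand might look roughly like the sketch below. This is not the module&#39;s actual code: <code>typeset_by_hand()</code> is a hypothetical helper, and it assumes that <code>break_text_into_nodes()</code> returns a list of nodes, that <code>break()</code> takes a reference to that list, and that an empty breakpoint list means no layout was found. The demo at the end only runs if Text::KnuthPlass is installed:</p>

<pre>    use strict;
    use warnings;

    # Hypothetical helper: chain the three documented steps by hand.
    # Assumption: break() takes a reference to the node list.
    sub typeset_by_hand {
        my ($t, $paragraph, %opts) = @_;
        my @nodes       = $t-&#62;break_text_into_nodes($paragraph, %opts);
        my @breakpoints = $t-&#62;break(\@nodes);
        return unless @breakpoints;   # no feasible layout found
        return $t-&#62;breakpoints_to_lines(\@breakpoints, \@nodes);
    }

    # Only run the demo if the module is available
    if (eval { require Text::KnuthPlass; 1 }) {
        my $t     = Text::KnuthPlass-&#62;new();
        my @lines = typeset_by_hand($t, &#39;A short paragraph of sample text.&#39;);
        printf &#34;%d line(s)\n&#34;, scalar @lines;
    }</pre>

<p>Working at this level is mainly useful if you want to build or modify the node list yourself before the breakpoints are computed.</p>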

<h2><a class='u' href='#___top' title='click to go to top of document'
name="boxclass()"
>boxclass()</a></h2>

<h2><a class='u' href='#___top' title='click to go to top of document'
name="glueclass()"
>glueclass()</a></h2>

<h2><a class='u' href='#___top' title='click to go to top of document'
name="penaltyclass()"
>penaltyclass()</a></h2>

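<p>For subclassers: these name the classes used for box, glue, and penalty nodes, so that a subclass can override them to substitute its own node classes.</p>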

<h1><a class='u' href='#___top' title='click to go to top of document'
name="AUTHOR"
>AUTHOR</a></h1>

<p>Originally written by Simon Cozens, <code>&#60;simon at cpan.org&#62;</code>.</p>

<p>Since 2020, maintained by Phil Perry.</p>

<h1><a class='u' href='#___top' title='click to go to top of document'
name="ACKNOWLEDGEMENTS"
>ACKNOWLEDGEMENTS</a></h1>

<p>This module is a Perl translation (originally by Simon Cozens) of Bram Stein&#39;s &#34;Typeset&#34; JavaScript Knuth-Plass implementation.</p>

<h1><a class='u' href='#___top' title='click to go to top of document'
name="BUGS"
>BUGS</a></h1>

<p>Please report any bugs or feature requests to the <i>issues</i> section of <code>https://github.com/PhilterPaper/Text-KnuthPlass</code>.</p>

<p>Do NOT under ANY circumstances open a PR (Pull Request) to report a bug. It is a waste of both your and our time and effort. Open a regular ticket (issue), and attach a Perl (.pl) program illustrating the problem, if possible. If you believe that you have a program patch, and offer to share it as a PR, we may give the go-ahead. Unsolicited PRs may be closed without further action.</p>

<h1><a class='u' href='#___top' title='click to go to top of document'
name="COPYRIGHT_&#38;_LICENSE"
>COPYRIGHT &#38; LICENSE</a></h1>

<p>Copyright (c) 2011 Simon Cozens.</p>

<p>Copyright (c) 2020-2022 Phil M Perry.</p>

<p>This program is released under the following license: Perl, GPL</p>

<!-- end doc -->

</body></html>