NAME
Pod::POM::View::HTML::Filter - Use filters on sections of your pod documents
SYNOPSIS
In your POD:
Some coloured Perl code:
=begin filter perl
# now in full colour!
$A++;
=end filter
=for filter=perl $A++; # this works too
This should read C<bar bar bar>:
=begin filter foo
bar foo bar
=end filter
In your code:
my $view = Pod::POM::View::HTML::Filter->new;
$view->add(
    foo => {
        code => sub { my $s = shift; $s =~ s/foo/bar/gm; $s },
        # other options are available
    }
);
my $pom = Pod::POM->parse_file( '/my/pod/file' );
$pom->present($view);
The resulting HTML will look like this (modulo the stylesheet):
# now in full colour!
$A++;
$A++; # this works too
This should read bar bar bar:
bar bar bar
DESCRIPTION
This module is a subclass of Pod::POM::View::HTML that supports the filter extension. The extension can be used in =begin / =end blocks and in =for paragraphs.
Please note that since the view maintains internal state, you must present the POM object with an instance of the view, not with the class name. Either use:
my $view = Pod::POM::View::HTML::Filter->new;
$pom->present( $view );
or
$Pod::POM::DEFAULT_VIEW = Pod::POM::View::HTML::Filter->new;
$pom->present;
Even though the module was specifically designed for use with Perl::Tidy, you can write your own filters quite easily (see "Writing your own filters").
FILTERING POD?
The whole idea of this module is to take advantage of all the syntax colouring modules that exist (actually, Perl::Tidy was my first target) to produce colourful code examples in a POD document (after conversion to HTML).
Filters can be used in two different POD constructs:
=begin filter filter
The data in the =begin filter ... =end filter region is passed to the filter and the result is output in place in the document.
The general form of a =begin filter block is as follows:
=begin filter lang optionstring
# some text to process with filter "lang"
=end filter
The optionstring is trimmed of surrounding whitespace and passed as a single string to the filter routine, which must perform its own parsing.
=for filter=filter
=for filters work just like =begin / =end filters, except that a single paragraph is the target.
The general form of a =for filter block is as follows:
=for filter=lang:option1:option2
# some code in language lang
The option string sent to the filter lang would be "option1 option2" (colons are replaced with spaces).
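The colon-to-space mapping described above can be sketched in a few lines of plain Perl. This is only an illustration: split_for_format() is a hypothetical helper, not part of the module's API, and it ignores piped filters.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): split the format name of
# a "=for" paragraph into a filter name and an option string.
sub split_for_format {
    my ($format) = @_;    # e.g. "filter=lang:option1:option2"
    my ($spec) = $format =~ /^filter=(\S+)$/
        or return;        # not a filter paragraph
    my ( $lang, @options ) = split /:/, $spec;

    # colons become spaces, as in a "=begin filter" option string
    return ( $lang, join ' ', @options );
}

my ( $lang, $optionstring ) = split_for_format('filter=lang:option1:option2');
print "$lang | $optionstring\n";    # lang | option1 option2
```

The filter routine therefore sees the same option string whether the text came from a =begin block or from a =for paragraph.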
Options
Some filters may accept options that alter their behaviour. Options are separated by whitespace, and appear after the name of the filter. For example, the following code will be rendered in colour and with line numbers:
=begin filter perl -nnn
$a = 123;
$b = 3;
print $a * $b; # prints 369
print $a x $b; # prints 123123123
=end filter
=for filters can also accept options, but the syntax is less clear. (This is because =for expects the formatname to match \S+.)
The syntax is the following:
=for filter=html:nnn=1
<center><img src="camel.png" />
A camel</center>
In summary, options are separated by space for =begin blocks and by colons for =for paragraphs.
The options and their parameters depend on the filter, but they cannot contain the pipe (|) or colon (:) characters, which are reserved as the filter and option separators.
Pipes
Having a filter to modify a block of text is useful, but what's more useful (and fun) than a filter? Answer: a stack of filters piped together!
Take the imaginary filters foo (which does a simple s/foo/bar/g) and bang (which does an even simpler tr/r/!/). The following block
=begin filter foo|bang
foo bar baz
=end
will become ba! ba! baz.
And naturally,
=for filter=bang|foo
foo bar baz
will return bar ba! baz.
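To check the two results above by hand, the imaginary filters foo and bang can be written as plain subroutines and chained left to right. This is a minimal sketch: run_pipe() is a hypothetical helper that only mimics the piping order and says nothing about how the module implements it.

```perl
use strict;
use warnings;

# The two imaginary filters from the text, as plain subroutines.
my %filter = (
    foo  => sub { my $s = shift; $s =~ s/foo/bar/g; $s },
    bang => sub { my $s = shift; $s =~ tr/r/!/;     $s },
);

# Hypothetical helper: apply the filters named in a pipe, left to right.
sub run_pipe {
    my ( $pipe, $text ) = @_;
    for my $name ( split /\s*\|\s*/, $pipe ) {
        $text = $filter{$name}->($text);
    }
    return $text;
}

print run_pipe( 'foo|bang', 'foo bar baz' ), "\n";    # ba! ba! baz
print run_pipe( 'bang|foo', 'foo bar baz' ), "\n";    # bar ba! baz
```

The order matters: in the second pipe, bang runs before foo has produced any new "r", so the first word keeps its final letter.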
A note on verbatim and text blocks
Note: Verbatim and text paragraphs are mentioned in this section because of an old bug in Pod::POM, which parses the content of begin/end sections as the usual POD paragraph and verbatim blocks. Pod::POM::View::HTML::Filter tries to work around this bug.
As of version 0.06, Pod::POM::View::HTML::Filter gets to the original text contained in the =begin / =end block (it was easier than I thought, actually) and puts that string through all the filters.
If any filter in the stack is defined as verbatim, or if Pod::POM detects any block in the =begin / =end block as verbatim, then the output will be produced between <pre> and </pre> tags. Otherwise, no special tags will be added (this is left to the formatter).
Examples
The power of pipes can be seen in the following example. Take a bit of Perl code to colour:
=begin filter perl
"hot cross buns" =~ /cross/;
print "Matched: <$`> $& <$'>\n"; # Matched: <hot > cross < buns>
print "Left: <$`>\n"; # Left: <hot >
print "Match: <$&>\n"; # Match: <cross>
print "Right: <$'>\n"; # Right: < buns>
=end
This will produce the following HTML code:
<pre> <span class="q">"hot cross buns"</span> =~ <span class="q">/cross/</span><span class="sc">;</span>
<span class="k">print</span> <span class="q">"Matched: <$`> $& <$'>\n"</span><span class="sc">;</span> <span class="c"># Matched: <hot > cross < buns></span>
<span class="k">print</span> <span class="q">"Left: <$`>\n"</span><span class="sc">;</span> <span class="c"># Left: <hot ></span>
<span class="k">print</span> <span class="q">"Match: <$&>\n"</span><span class="sc">;</span> <span class="c"># Match: <cross></span>
<span class="k">print</span> <span class="q">"Right: <$'>\n"</span><span class="sc">;</span> <span class="c"># Right: < buns></span></pre>
Which your browser will render as:
"hot cross buns" =~ /cross/;
print "Matched: <$`> $& <$'>\n"; # Matched: <hot > cross < buns>
print "Left: <$`>\n"; # Left: <hot >
print "Match: <$&>\n"; # Match: <cross>
print "Right: <$'>\n"; # Right: < buns>
Now if you want to colour and number the HTML code produced, it's as simple as tacking the html filter on top of the perl filter:
=begin filter perl | html nnn=1
"hot cross buns" =~ /cross/;
print "Matched: <$`> $& <$'>\n"; # Matched: <hot > cross < buns>
print "Left: <$`>\n"; # Left: <hot >
print "Match: <$&>\n"; # Match: <cross>
print "Right: <$'>\n"; # Right: < buns>
=end
Which produces the rather unreadable piece of HTML:
<pre><span class="h-lno"> 1</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>hot cross buns<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> =~ <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span>/cross/<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span>
<span class="h-lno"> 2</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Matched: <span class="h-ent">&lt;</span>$`<span class="h-ent">&gt;</span> $<span class="h-ent">&amp;</span> <span class="h-ent">&lt;</span>$'<span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Matched: <span class="h-ent">&lt;</span>hot <span class="h-ent">&gt;</span> cross <span class="h-ent">&lt;</span> buns<span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span>
<span class="h-lno"> 3</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Left: <span class="h-ent">&lt;</span>$`<span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Left: <span class="h-ent">&lt;</span>hot <span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span>
<span class="h-lno"> 4</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Match: <span class="h-ent">&lt;</span>$<span class="h-ent">&amp;</span><span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Match: <span class="h-ent">&lt;</span>cross<span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span>
<span class="h-lno"> 5</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Right: <span class="h-ent">&lt;</span>$'<span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Right: <span class="h-ent">&lt;</span> buns<span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span></pre>
But your browser will render it as:
1 <span class="q">"hot cross buns"</span> =~ <span class="q">/cross/</span><span class="sc">;</span>
2 <span class="k">print</span> <span class="q">"Matched: <$`> $& <$'>\n"</span><span class="sc">;</span> <span class="c"># Matched: <hot > cross < buns></span>
3 <span class="k">print</span> <span class="q">"Left: <$`>\n"</span><span class="sc">;</span> <span class="c"># Left: <hot ></span>
4 <span class="k">print</span> <span class="q">"Match: <$&>\n"</span><span class="sc">;</span> <span class="c"># Match: <cross></span>
5 <span class="k">print</span> <span class="q">"Right: <$'>\n"</span><span class="sc">;</span> <span class="c"># Right: < buns></span>
Caveats
There were a few things to keep in mind when mixing verbatim and text paragraphs in a =begin block. These problems no longer exist as of version 0.06.
- Text paragraphs are not processed for POD escapes any more
Because the =begin / =end block is now processed as a single string of text, the following block:
=begin filter html
B<foo>
=end
will not be transformed into <b>foo</b> before being passed to the filters, but will produce the expected:
<pre>B<span class="h-ab"><</span><span class="h-tag">foo</span><span class="h-ab">></span></pre>
This will be rendered by your web browser as:
B<foo>
And the same text in a verbatim block
=begin filter html
    B<foo>
=end
will produce the same results.
<pre>    B<span class="h-ab"><</span><span class="h-tag">foo</span><span class="h-ab">></span></pre>
Which a web browser will render as:
    B<foo>
Which looks quite the same, doesn't it?
- Separate paragraphs aren't filtered separately any more
As seen in "A note on verbatim and text blocks", the filter now processes the begin block as a single string of text. So, if you have a filter that replaces each * character with an auto-incremented number in square brackets, like this:
$view->add(
    notes => {
        code => sub {
            my ( $text, $opt ) = @_;
            my $n = $opt =~ /(\d+)/ ? $1 : 1;
            $text =~ s/\*/'[' . $n++ . ']'/ge;
            $text;
        }
    }
);
And you try to process the following block:
=begin filter notes 2
TIMTOWTDI*, but your library should DWIM* when possible.
You can't always claim that PICNIC*, can you?
=end filter
You'll get the expected result (contrary to previous versions):
<p>TIMTOWTDI[2], but your library should DWIM[3] when possible.
You can't always claim that PICNIC[4], can you?</p>
The filter was really called only once, starting at 2, just as requested.
Future versions of Pod::POM::View::HTML::Filter may support init, begin and end callbacks to run filter initialisation and clean up code.
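The difference between one call on the whole block and one call per paragraph can be seen by running the notes filter outside of the module. The sketch below copies the filter body into a standalone subroutine; the per-paragraph loop is only an illustration of the older behaviour as this section describes it, not code taken from any release.

```perl
use strict;
use warnings;

# The "notes" filter from the text, as a standalone subroutine.
sub notes_filter {
    my ( $text, $opt ) = @_;
    my $n = $opt =~ /(\d+)/ ? $1 : 1;
    $text =~ s/\*/'[' . $n++ . ']'/ge;
    return $text;
}

my @paragraphs = (
    'TIMTOWTDI*, but your library should DWIM* when possible.',
    "You can't always claim that PICNIC*, can you?",
);

# One call on the whole block: numbering continues across paragraphs.
print notes_filter( join( "\n\n", @paragraphs ), '2' ), "\n";

# One call per paragraph: the counter restarts at 2 each time.
print notes_filter( $_, '2' ), "\n" for @paragraphs;
```

With a single call the last note is numbered [4]; with one call per paragraph it is numbered [2] again, which is the surprise that the single-string processing avoids.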
METHODS
Public methods
The following methods are available:
add( lang => { options }, ... )
Add support for one or more languages. Options are passed in a hash reference.
The required code option is a reference to the filter routine. The filter receives the text to process and an option string, and must return the formatted HTML string (coloured according to the language grammar, hopefully).
Available options are:
Name      Type      Content
----      ----      -------
code      CODEREF   filter implementation
verbatim  BOOLEAN   if true, force the full content of the =begin/=end block to be passed verbatim to the filter
requires  ARRAYREF  list of required modules for this filter
Note that add() is both a class and an instance method.
When used as a class method, the new language is immediately available for all future and existing instances.
When used as an instance method, the new language is only available for the instance itself.
delete( $lang )
Remove the given language from the list of class or instance filters. The deleted filter is returned by this method.
delete() is both a class and an instance method, just like add().
filters()
Return the list of languages supported.
know( $lang )
Return true if the view knows how to handle language $lang.
Overloaded methods
The following Pod::POM::View::HTML methods are overridden in Pod::POM::View::HTML::Filter:
new()
The overridden constructor initialises some internal structures. This means that you'll have to use an instance of the class as a view for your Pod::POM object. Therefore you must use new.
$Pod::POM::DEFAULT_VIEW = 'Pod::POM::View::HTML::Filter'; # WRONG
$pom->present( 'Pod::POM::View::HTML::Filter' ); # WRONG
# this is CORRECT
$Pod::POM::DEFAULT_VIEW = Pod::POM::View::HTML::Filter->new;
# this is also CORRECT
my $view = Pod::POM::View::HTML::Filter->new;
$pom->present( $view );
The only option at this time is auto_unindent, which is enabled by default. This option removes leading indentation from all verbatim blocks within the begin blocks, and puts it back after highlighting.
view_begin()
view_for()
These are the methods that support the filter format.
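The idea behind the auto_unindent option mentioned above (strip the common indentation, highlight, then restore it) can be sketched as follows. This is an illustration of the technique under simple assumptions (indentation made of spaces only, a filter that keeps the line structure); with_unindent() is a hypothetical helper and not the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the indentation shared by all non-blank
# lines, run a filter on the result, then put the indentation back.
sub with_unindent {
    my ( $text, $filter ) = @_;

    # shortest leading run of spaces among the non-blank lines
    my ($indent) = sort { length($a) <=> length($b) }
                   map  { /^( *)\S/ ? $1 : () } split /\n/, $text;
    $indent = '' unless defined $indent;

    ( my $stripped = $text ) =~ s/^\Q$indent\E//gm;    # remove it
    my $out = $filter->($stripped);                    # "highlight"
    $out =~ s/^(?=.)/$indent/gm;                       # put it back
    return $out;
}

my $code = "    if (\$x) {\n        print 1;\n    }\n";

# uc() stands in for a real syntax highlighter here
print with_unindent( $code, sub { uc shift } );
```

The filter sees code that starts in the first column, which is what most highlighters expect, while the output keeps the indentation of the original verbatim block.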
FILTERS
Built-in filters
Pod::POM::View::HTML::Filter is shipped with a few built-in filters.
The name for the filter is obtained by removing _filter from the names listed below (except for default):
- default
This filter is called when the required filter is not known by Pod::POM::View::HTML::Filter. It does nothing more than normal POD processing (POD escapes for text paragraphs and <pre> for verbatim paragraphs).
You can use the delete() method to remove a filter and therefore make it behave like default.
- perl_tidy_filter
This filter does Perl syntax highlighting with a lot of help from Perl::Tidy.
It accepts options to Perl::Tidy, such as -nnn to number lines of code. Check Perl::Tidy's documentation for more information about those options.
- perl_ppi_filter
This filter does Perl syntax highlighting using PPI::HTML, which is itself based on the incredible PPI.
It accepts the same options as PPI::HTML, which at this time solely consist of line_numbers to, as one may guess, add line numbers to the output.
- html_filter
This filter does HTML syntax highlighting with the help of Syntax::Highlight::HTML.
The filter supports Syntax::Highlight::HTML options:
=begin filter html nnn=1
<p>The lines of the HTML code will be numbered.</p>
<p>This is line 2.</p>
=end filter
See Syntax::Highlight::HTML for the list of supported options.
- shell_filter
This filter does shell script syntax highlighting with the help of Syntax::Highlight::Shell.
The filter supports Syntax::Highlight::Shell options:
=begin filter shell nnn=1
#!/bin/sh
echo "This is a foo test" | sed -e 's/foo/shell/'
=end filter
See Syntax::Highlight::Shell for the list of supported options.
- kate_filter
This filter supports syntax highlighting for numerous languages with the help of Syntax::Highlight::Engine::Kate.
The filter supports Syntax::Highlight::Engine::Kate languages as options:
=begin filter kate Diff
Index: lib/Pod/POM/View/HTML/Filter.pm
===================================================================
--- lib/Pod/POM/View/HTML/Filter.pm (revision 99)
+++ lib/Pod/POM/View/HTML/Filter.pm (working copy)
@@ -27,6 +27,11 @@
         requires => [qw( Syntax::Highlight::Shell )],
         verbatim => 1,
     },
+    kate => {
+        code     => \&kate_filter,
+        requires => [qw( Syntax::Highlight::Engine::Kate )],
+        verbatim => 1,
+    },
 );

 my $HTML_PROTECT = 0;
=end filter
Check the Syntax::Highlight::Engine::Kate documentation for the full list of supported languages. Please note that some of them aren't well supported yet (by Syntax::Highlight::Engine::Kate), so the output may not be what you expect.
Here is a list of languages we have successfully tested with Syntax::Highlight::Engine::Kate version 0.02: C, Diff, Fortran, JavaScript, LDIF, SQL.
- wiki_filter
This filter converts the wiki format parsed by Text::WikiFormat into HTML.
The supported options are: prefix, extended, implicit_links, absolute_links. The option and value are separated by a = character, as in the example below:
=begin filter wiki extended=1
[link|title]
=end
- wikimedia_filter
This filter converts the wiki format parsed by Text::MediawikiFormat into HTML.
The supported options are: prefix, extended, implicit_links, absolute_links and process_html. The option and value are separated by a = character.
Writing your own filters
Writing a filter is quite easy: a filter is a subroutine that takes two arguments (the text to parse and the option string) and returns the filtered string.
The filter is added to Pod::POM::View::HTML::Filter's internal filter list with the add() method:
$view->add(
    foo => {
        code     => \&foo_filter,
        requires => [],
    }
);
When presenting the following piece of pod,
=begin filter foo bar baz
Some text to filter.
=end filter
the foo_filter() routine will be called with two arguments, like this:
foo_filter( "Some text to filter.", "bar baz" );
If you have a complex set of options, your routine will have to parse the option string by itself.
Please note that in a =for construct, whitespace in the option string must be replaced with colons:
=for filter=foo:bar:baz Some text to filter.
The foo_filter() routine will be called with the same two arguments as before.
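Since the option string arrives as a single scalar, a filter with several options has to split it itself. Here is a minimal sketch of one way to do that, assuming options written as key=value pairs separated by whitespace; the option names uc and prefix are invented for this example, and a real filter would also have to make sure it returns valid HTML.

```perl
use strict;
use warnings;

# Hypothetical filter: upper-cases the text when "uc=1" is given and
# starts every line with the value of "prefix", if any.
sub foo_filter {
    my ( $text, $optstring ) = @_;

    # turn "key=value key2=value2" into a hash; bare words get value 1
    my %opt = map { /^([^=]+)=(.*)$/ ? ( $1 => $2 ) : ( $_ => 1 ) }
              split ' ', defined($optstring) ? $optstring : '';

    $text = uc $text if $opt{uc};
    $text =~ s/^/$opt{prefix}/gm if defined $opt{prefix};
    return $text;
}

print foo_filter( "Some text to filter.", "uc=1 prefix=#" ), "\n";
```

In a =for paragraph the same options would be written as =for filter=foo:uc=1:prefix=#, and the routine would receive the same option string.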
BUILT-IN FILTERS CSS STYLES
Each filter uses its own CSS classes, so that one can define their favourite colours in a custom CSS file.
perl filter
Perl::Tidy's HTML code looks like:
<span class="i">$A</span>++<span class="sc">;</span>
Here are the classes used by Perl::Tidy:
n numeric
p paren
q quote
s structure
c comment
v v-string
cm comma
w bareword
co colon
pu punctuation
i identifier
j label
h here-doc-target
hh here-doc-text
k keyword
sc semicolon
m subroutine
pd pod-text
ppi filter
PPI::HTML uses the following CSS classes:
comment
double
heredoc_content
interpolate
keyword for language keywords (my, use)
line_number
number
operator for language operators
pragma for pragmata (strict, warnings)
single
structure for syntactic symbols
substitute
symbol
word for module, function and method names
words
match
html filter
Syntax::Highlight::HTML makes use of the following classes:
h-decl declaration # declaration <!DOCTYPE ...>
h-pi process # processing instruction <?xml ...?>
h-com comment # comment <!-- ... -->
h-ab angle_bracket # the characters '<' and '>' as tag delimiters
h-tag tag_name # the tag name of an element
h-attr attr_name # the attribute name
h-attv attr_value # the attribute value
h-ent entity # any entities: &eacute; &laquo;
shell filter
Syntax::Highlight::Shell makes use of the following classes:
s-key # shell keywords (like if, for, while, do...)
s-blt # the builtin commands
s-cmd # the external commands
s-arg # the command arguments
s-mta # shell metacharacters (|, >, \, &)
s-quo # the single (') and double (") quotes
s-var # expanded variables: $VARIABLE
s-avr # assigned variables: VARIABLE=value
s-val # shell values (inside quotes)
s-cmt # shell comments
kate filter
Output formatted with Syntax::Highlight::Engine::Kate makes use of the following classes:
k-alert # Alert
k-basen # BaseN
k-bstring # BString
k-char # Char
k-comment # Comment
k-datatype # DataType
k-decval # DecVal
k-error # Error
k-float # Float
k-function # Function
k-istring # IString
k-keyword # Keyword
k-normal # Normal
k-operator # Operator
k-others # Others
k-regionmarker # RegionMarker
k-reserved # Reserved
k-string # String
k-variable # Variable
k-warning # Warning
HISTORY
The goal behind this module was to produce nice looking HTML pages from the articles the French Perl Mongers are writing for the French magazine GNU/Linux Magazine France (http://www.linuxmag-france.org/).
The resulting web pages can be seen at http://articles.mongueurs.net/magazines/.
AUTHOR
Philippe "BooK" Bruhat, <book@cpan.org>
THANKS
Many thanks to Sébastien Aperghis-Tramoni (Maddingue), who helped debug the module and wrote Syntax::Highlight::HTML and Syntax::Highlight::Shell so that I could ship PPVHF with more than one filter. He also pointed me to Syntax::Highlight::Engine::Kate, which led me to clean up PPVHF before adding support for SHEK.
Perl code examples were borrowed from Amelia, aka Programming Perl, 3rd edition.
TODO
There are a few other syntax highlighting modules on CPAN, which I should try to add support for in Pod::POM::View::HTML::Filter:
- Syntax::Highlight::Universal
- Syntax::Highlight::Mason
- Syntax::Highlight::Perl (seems old)
- Syntax::Highlight::Perl::Improved
BUGS
Please report any bugs or feature requests to bug-pod-pom-view-html-filter@rt.cpan.org, or through the web interface at http://rt.cpan.org. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
COPYRIGHT & LICENSE
Copyright 2004 Philippe "BooK" Bruhat, All Rights Reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.