NAME
OODoc::Parser::Markov - Parser for the MARKOV syntax
INHERITANCE
OODoc::Parser::Markov
is a OODoc::Parser
is a OODoc::Object
SYNOPSIS
DESCRIPTION
The Markov parser is named after its author, because the author likes to invite other people to write their own parsers as well: everyone has not only their own coding style, but also their own documentation wishes.
The task of the parser is to split Perl package files into a code part and a documentation tree. The code is written to the directory where the module distribution is built; the documentation tree is later formatted into manual pages.
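The split into code and documentation can be pictured with a small stand-alone sketch. It is not OODoc's own implementation: the helper split_code_and_doc() is a hypothetical name, and the rule that every line starting with a tag opens or continues a doc block until =cut is a simplifying assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of OODoc: separate a source text into
# its code lines and its documentation lines.
sub split_code_and_doc {
    my ($source) = @_;
    my (@code, @doc);
    my $in_doc = 0;

    for my $line (split /^/m, $source) {
        if ($line =~ /^=cut\b/) {           # =cut closes a doc block
            $in_doc = 0;
        }
        elsif ($line =~ /^=[a-zA-Z]/) {     # any other tag opens or continues one
            $in_doc = 1;
            push @doc, $line;
        }
        elsif ($in_doc) { push @doc,  $line }
        else            { push @code, $line }
    }

    return (join('', @code), join('', @doc));
}

my $source = <<'SOURCE';
package Demo;
=chapter NAME
Demo - a tiny example
=cut
sub hello { return "hi" }
1;
SOURCE

my ($code, $doc) = split_code_and_doc($source);
print "--- code ---\n$code--- doc ---\n$doc";
```

The code part is what would end up in the distribution; the doc part is what a real parser would turn into a documentation tree.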
OVERLOADED
METHODS
Constructors
OODoc::Parser::Markov->new(OPTIONS)
Option --Default
additional_rules []
. additional_rules => ARRAY
Reference to an array which contains references to match-action pairs, as accepted by rule(). These rules get preference over the existing rules.
Inheritance knowledge
$obj->extends([OBJECT])
Parsing a file
$obj->currentManual([MANUAL])
Returns the manual object which is currently being filled with data. With a new MANUAL, a new one is set.
$obj->findMatchingRule(LINE)
Check whether this LINE matches one of the rules. This is an ordered evaluation: the first rule which matches is used. Returned is the matched string and the required action. If the line fails to match anything, an empty list is returned.
example:
if(my($match, $action) = $parser->findMatchingRule($line))
{ # do something with it
$action->($parser, $match, $line);
}
$obj->inDoc([BOOLEAN])
When a BOOLEAN is specified, the status changes. It returns the current status of the document reader.
$obj->parse(OPTIONS)
Option --Default
distribution <required>
input <required>
notice ''
output devnull
version <required>
. distribution => STRING
. input => FILENAME
. notice => STRING
Block of text added in front of the output file.
. output => FILENAME
. version => STRING
$obj->rule((STRING|REGEX), (METHOD|CODE))
Register a rule which will be applied to a line in the input file. When a STRING is specified, it must start at the beginning of the line to be selected. You may also specify a regular expression which will match on the line.
The second argument is the action which will be taken when the line is selected. Either the named METHOD or the CODE reference will be called. Their arguments are:
$parser->METHOD($match, $line, $file, $linenumber);
CODE->($parser, $match, $line, $file, $linenumber);
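The registration and lookup described above can be sketched in a few lines. This is a minimal stand-in, not OODoc's own code: the RuleTable package, its apply() method, and the seenChapter action are hypothetical names. Only the string-versus-regex matching, the ordered evaluation, and the argument order of the METHOD and CODE calls are taken from the text above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser's rule table; OODoc's real
# implementation may differ.
package RuleTable;

sub new {
    my ($class) = @_;
    return bless { rules => [] }, $class;
}

# Register a STRING or REGEX together with a METHOD name or CODE reference.
sub rule {
    my ($self, $match, $action) = @_;
    push @{ $self->{rules} }, [ $match, $action ];
    return $self;
}

# Ordered evaluation: the first rule that matches wins.
sub findMatchingRule {
    my ($self, $line) = @_;
    for my $rule (@{ $self->{rules} }) {
        my ($match, $action) = @$rule;
        if (ref $match eq 'Regexp') {
            # Return the text which the regular expression matched.
            return (substr($line, $-[0], $+[0] - $-[0]), $action)
                if $line =~ $match;
        }
        elsif (index($line, $match) == 0) {  # STRING must start the line
            return ($match, $action);
        }
    }
    return ();
}

# Hypothetical dispatcher: call the named METHOD or the CODE reference.
sub apply {
    my ($self, $line, $file, $linenr) = @_;
    my ($match, $action) = $self->findMatchingRule($line) or return;
    return ref $action eq 'CODE'
        ? $action->($self, $match, $line, $file, $linenr)
        : $self->$action($match, $line, $file, $linenr);
}

sub seenChapter {
    my ($self, $match, $line) = @_;
    return "chapter: $line";
}

package main;

my $table = RuleTable->new;
$table->rule('=chapter' => 'seenChapter');
$table->rule(qr/^=(?:error|warning)\b/ => sub {
    my ($parser, $match, $line) = @_;
    return "diagnostic: $match";
});

my @results;
for my $line ('=chapter METHODS', '=warning careful', 'plain text') {
    my $result = $table->apply($line, 'Demo.pm', 1);
    push @results, defined $result ? $result : "no rule: $line";
}
print "$_\n" for @results;
```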
$obj->setBlock(REF-SCALAR)
Set the scalar where the next documentation lines should be collected in.
Formatting text pieces
$obj->cleanup(FORMATTER, MANUAL, STRING)
Producing manuals
$obj->cleanupPod(FORMATTER, MANUAL, STRING)
$obj->cleanupPodL(FORMATTER, MANUAL, LINK)
The L markups for OODoc::Parser::Markov have the same syntax as in standard POD. However, most standard POD translators do not accept links in verbatim blocks, so in that case the links have to be translated into plain text. This method performs that translation.
$obj->cleanupPodM(FORMATTER, MANUAL, LINK)
$obj->decomposeL(MANUAL, LINK)
Decompose the L-tags. These tags are described in perlpod, but they will not refer to items: only headers.
$obj->decomposeM(MANUAL, LINK)
Commonly used functions
$obj->cleanupHtml(FORMATTER, MANUAL, STRING, [IS_HTML])
Some changes will not be made when IS_HTML is true: for instance, a "<" will stay that way and is not translated into "&lt;".
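The effect of the IS_HTML flag can be illustrated with a hypothetical helper. The name escape_unless_html() and the exact set of escaped characters are assumptions made for this sketch; the real cleanupHtml() also takes a FORMATTER and a MANUAL and does more work.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the IS_HTML flag; not OODoc's own code.
sub escape_unless_html {
    my ($string, $is_html) = @_;
    return $string if $is_html;     # already HTML: leave "<" alone
    $string =~ s/&/&amp;/g;         # escape "&" first to avoid double escaping
    $string =~ s/</&lt;/g;
    $string =~ s/>/&gt;/g;
    return $string;
}

print escape_unless_html('if ($a < $b)', 0), "\n";   # plain text is escaped
print escape_unless_html('<b>bold</b>', 1), "\n";    # HTML passes through
```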
$obj->cleanupHtmlL(FORMATTER, MANUAL, LINK)
$obj->cleanupHtmlM(FORMATTER, MANUAL, LINK)
$obj->filenameToPackage(FILENAME)
OODoc::Parser::Markov->filenameToPackage(FILENAME)
$obj->mkdirhier(DIRECTORY)
OODoc::Parser::Markov->mkdirhier(DIRECTORY)
Manual Repository
$obj->addManual(MANUAL)
$obj->mainManual(NAME)
$obj->manual(NAME)
$obj->manuals
$obj->manualsForPackage(NAME)
$obj->packageNames
DETAILS
General Description
The Markov parser has much in common with the standard POD syntax. You can use the same tags as defined by POD, but those tags are "visual style", which means that OODoc cannot treat them in a smart way. The Markov parser adds many logical markups, which produce nicer pages.
Furthermore, the parser will remove the documentation from the source code, because otherwise the package installation would fail: Perl's default installation behavior will extract POD from packages, but the markup is not really POD, which will cause many complaints.
The version of the module is defined by the OODoc object which creates the manual page. Therefore, $VERSION will be added to each package automatically.
Disadvantages
The Markov parser removes all raw documentation from the package files, which means that people sending you patches will base them on the processed source: the line numbers will be wrong. Usually, it is not much of a problem to manually process the patch: you have to check the correctness anyway.
A second disadvantage is that you have to back up your sources separately: the sources differ from what is published on CPAN, so CPAN is not your backup anymore. The example scripts, contained in the distribution, show how to produce these "raw" packages.
Finally, a difference with the standard POD process: the manual page must be preceded by a package keyword.
Structural tags
Heading
=chapter STRING
=section STRING
=subsection STRING
=subsubsection STRING
These text structures are used to group descriptive text and subroutines. You can use any name for a chapter, but the formatter expects certain names to be used: if you use a name which is not expected by the formatter, that documentation will be ignored.
Subroutines
Perl has many kinds of subroutines, which are distinguished in the logical markup. The output may be different per kind.
=i_method NAME PARAMETERS (instance method)
=c_method NAME PARAMETERS (class method)
=ci_method NAME PARAMETERS (class and instance method)
=method NAME PARAMETERS (short for i_method)
=function NAME PARAMETERS
=tie NAME PARAMETERS
=overload STRING
The NAME is the name of the subroutine, and the PARAMETERS an argument indicator.
Then the subroutine description follows. These tags have to follow the general description of the subroutines. You can use
=option NAME PARAMETERS
=default NAME VALUE
=requires NAME PARAMETERS
If you have defined an =option, you have to provide a =default for this option somewhere. Use of =default for an option on a higher level will overrule the one in a subclass.
Include examples
Examples can be added to chapters, sections, subsections, subsubsections, and subroutines. They run until the next markup line, so they can only come at the end of a documentation piece.
=example
=examples
Include diagnostics
A subroutine description can also contain error or warning descriptions. These diagnostics are usually collected into a special chapter of the manual page.
=error this is very wrong
Of course this is not really wrong, but only as an example
how it works.
=warning wrong, but not sincerely
Warning message, which means that the program can create correct output
even though it found something wrong.
Compatibility
For comfort, all POD markups are supported as well:
=head1 Heading Text (same as =chapter)
=head2 Heading Text (same as =section)
=head3 Heading Text (same as =subsection)
=head4 Heading Text (same as =subsubsection)
=over indentlevel
=item stuff
=back
=cut
=pod
=begin format
=end format
=for format text...
Text markup
Next to the structural markup, there is textual markup. This markup is the same as POD defines in the perlpod manual page. For instance, C<some code> can be used to create visual markup as a code fragment.
One kind is added to the standard list: the M-link.
The M-link
The M-link cannot be nested inside other text markup items. It is used to refer to manuals, subroutines, and options. You can use an L-link to manuals as well, however then the POD output filter will modify the manual page while converting it to other manual formats.
Syntax of the M-link:
M < OODoc::Object >
M < OODoc::Object::new() >
M < OODoc::Object::new(verbose) >
M < new() >
M < new(verbose) >
These links refer to a manual page, a subroutine within a manual page, and an option of a subroutine respectively. And then two abbreviations are shown: they refer to subroutines of the same manual page, in which case you may refer to inherited documentation as well.
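How such a link falls apart into a manual, a subroutine, and an option can be sketched with one regular expression. The helper split_m_link() and its second argument (the name of the current manual) are hypothetical; the real decomposeM() takes a MANUAL object and may behave differently, for instance with respect to inherited documentation.

```perl
use strict;
use warnings;

# Hypothetical decomposition of the text inside an M-link.
# Returns (manual, subroutine, option); missing parts are undef.
sub split_m_link {
    my ($link, $current_manual) = @_;
    $link =~ s/^\s+|\s+$//g;               # ignore surrounding blanks

    my ($manual, $sub, $option) = ($current_manual, undef, undef);
    if ($link =~ /^(?:(.*)::)?(\w+)\((.*?)\)$/) {
        $manual = $1 if defined $1;        # abbreviated form: same manual
        $sub    = $2;
        $option = $3 if length $3;
    }
    else {
        $manual = $link;                   # no parentheses: a manual page
    }
    return ($manual, $sub, $option);
}

for my $link ('OODoc::Object', 'OODoc::Object::new()',
              'OODoc::Object::new(verbose)', 'new()', 'new(verbose)') {
    my ($manual, $sub, $option) = split_m_link($link, 'OODoc::Parser::Markov');
    printf "%-28s manual=%s sub=%s option=%s\n",
        $link, $manual, $sub // '-', $option // '-';
}
```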
The L-link
Standard POD defines an L markup tag, which can also be used with the Markov parser.
The following syntaxes are supported:
L < manual >
L < manual/section >
L < manual/"section" >
L < manual/subsection >
L < manual/"subsection" >
L < /section >
L < /"section" >
L < /subsection >
L < /"subsection" >
L < "section" >
L < "subsection" >
L < "subsubsection" >
L < unix-manual >
L < url >
In the above, manual is the name of a manual, section the name of any section (in that manual, by default the current manual), and subsection a subsection (in that manual, by default the current manual).
The unix-manual MUST be formatted with its chapter number, for instance cat(1), otherwise a link will be created. See the following examples in the HTML version of these manual pages:
M < perldoc >               illegal: not in distribution
L < perldoc >               manual perldoc
L < perldoc(1perl) >        manual perldoc(1perl)
M < OODoc::Object >         OODoc::Object
L < OODoc::Object >         OODoc::Object
L < OODoc::Object(3pm) >    manual OODoc::Object(3pm)
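The distinction between the L-link forms can be sketched as a small classifier. This is an illustration under assumptions, not OODoc's decomposeL(): the helper classify_l_link() is a hypothetical name, and the patterns for a URL and for a chapter number are guesses at reasonable rules. An empty manual name stands for the current manual.

```perl
use strict;
use warnings;

# Hypothetical classifier for the text inside an L-link.
# Returns (kind, manual-or-target, section).
sub classify_l_link {
    my ($link) = @_;
    $link =~ s/^\s+|\s+$//g;

    # A scheme such as http:// marks a URL.
    return ('url', $link, '')         if $link =~ m{^[a-z][a-z0-9+.-]*://}i;
    # A trailing chapter number, as in cat(1), marks a unix manual page.
    return ('unix-manual', $link, '') if $link =~ m{^[^/"]+\(\d\w*\)$};

    # Otherwise: manual, manual/section, /section or "section".
    my ($manual, $section) = ('', '');
    if    ($link =~ m{^"(.*)"$})            { $section = $1 }
    elsif ($link =~ m{^([^/]*)/"?(.*?)"?$}) { ($manual, $section) = ($1, $2) }
    else                                    { $manual = $link }
    return ('manual', $manual, $section);
}

for my $link ('perldoc', 'perldoc(1perl)', 'OODoc::Object/"Initiation"',
              '/section', 'http://perl.overmeer.net/oodoc/') {
    my ($kind, $target, $section) = classify_l_link($link);
    print "$link => $kind [$target] [$section]\n";
}
```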
Grouping subroutines
Subroutine descriptions can be grouped in a chapter, section, subsection, or subsubsection. It is very common to have a large number of subroutines, so some structure has to be imposed here.
If you document the same routine in more than one manual page with an inheritance relationship, the documentation location shall not conflict. You do not need to give the same level of detail about the exact location of a subroutine, as long as it is not conflicting. This relative freedom is created to be able to regroup existing documentation without too much effort.
For instance, in the code of OODoc itself (which is of course documented with OODoc), the following happens:
package OODoc::Object;
...
=chapter METHODS
=section Initiation
=c_method new OPTIONS
package OODoc;
use base 'OODoc::Object';
=chapter METHODS
=c_method new OPTIONS
As you can see in the example, in the higher level of inheritance, the new method is not put in the Initiation section explicitly. However, it is located in the METHODS chapter, which is required to correspond to the base class. The generated documentation will show new in the Initiation section in both manual pages.
Caveats
The Markov parser does not require blank lines before or after tags, as POD does. This means that the chance of running into parsing problems has increased: lines within here documents which start with a = will cause confusion. However, in these cases you can usually simply add a backslash in front of the printed =, which will disappear once printed.
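A short illustration of the backslash trick, assuming an interpolating (double-quoted) here document; the variable name and the text are made up for this example. In the source file the line starts with a backslash, so it cannot be taken for a tag, while the printed text starts with a plain =.

```perl
use strict;
use warnings;

# In an interpolating here document, "\=" is printed as a plain "=",
# while the source line itself no longer starts with "=".
my $usage = <<"USAGE";
Settings are written as
\=key value
one pair per line.
USAGE

print $usage;
```

Note that this only works in interpolating here documents: in a single-quoted one (<<'USAGE') the backslash would be printed as well.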
Examples
You may also take a look at the raw code archive for OODoc (the text as is before it was processed for distribution).
example: how subroutines are documented
=chapter FUNCTIONS
=function countCharacters FILE|STRING, OPTIONS
Returns the number of bytes in the FILE or STRING,
or undef if the string is undef or the character
set unknown.
=option charset CHARSET
=default charset 'us-ascii'
Characters in, for instance, utf-8 or unicode encoding
require variable number of bytes per character. The
correct CHARSET is needed for the correct result.
=examples
my $count = countCharacters("monkey");
my $count = countCharacters("monkey",
charset => 'utf-8');
=error unknown character set $charset
The character set you can use is limited by the sets
defined by manual Encode. The characters of the input can
not be separated from each other without this definition.
=cut
# now the coding starts
sub countCharacters($@) {
my ($input, %options) = @_;
...
}
DIAGNOSTICS
Warning: =cut does not terminate any doc in $file line $number
There is no document to end here.
Warning: Debugging remains in $filename line $number
The author's way of debugging is by putting warn/die/carp etc on the first position of a line. Other lines in a method are always indented, which means that these debugging lines are clearly visible. You may simply ignore this warning.
Warning: Manual $manual links to unknown entry "$item" in $manual
Error: The formatter type $class is not known for cleanup
Text blocks have to get the finishing touch in the final formatting phase. The parser has to fix the text block segments to create a formatter dependent output. Only a few formatters are predefined.
Warning: You may have accidentally captured code in doc file $fn line $number
Some keywords on the first position of a line are very common for code. However, code within doc should start with a blank to indicate pre-formatted lines. This warning may be false.
Error: cannot read document from $input: $!
The document file can not be processed because it can not be read. Reading is required to be able to build a documentation tree.
Error: chapter `$name' before package statement in $file line $number
A package file can contain more than one package: more than one name space. The docs are sorted by name space. Therefore, each chapter must be preceded by a package statement in the file to be sure that the correct name space is used.
Error: default for option $name outside subroutine in $file line $number
A default is set, however there is no subroutine in scope (yet). It is plausible that the option does not exist either, but that will be checked later.
Warning: default line incorrect in $file line $number: $line
The shown $line is not in the right format: it should contain at least two words being the option name and the default value.
Error: diagnostic $type outside subroutine in $file line $number
It is unclear to which subroutine this diagnostic message belongs.
Warning: doc did not end in $input
When the whole $input was parsed, the documentation part was still open. Probably you forgot to terminate it with a =cut.
Warning: empty L link in $manual
Error: example outside chapter in $file line $number
An example can belong to a subroutine, chapter, section, and subsection. Apparently, this example was found before the first chapter started in the file.
Error: manual definition requires manual object
A call to addManual() expects a new manual object (an OODoc::Manual), however an incompatible thing was passed. Usually, a call to manualsForPackage() or mainManual() was intended.
Warning: module $name is not on your system, but linked to in $manual
The module can not be found. This may be an error on your part (usually a typo) or you didn't install the module on purpose. This message will also be produced if some defined package is stored in one file together with another module or when compilation errors are encountered.
Warning: no diagnostic message supplied in $file line $number
The start of a diagnostics message was indicated, however not provided on the same line.
Error: no input file to parse specified
The parser needs the name of a file to be read, otherwise it can not work.
Warning: no manual for $package (correct casing?)
The manual for $package cannot be found. If you have a module named this way, this may indicate that the NAME chapter of the manual page in that module differs from the package name. Often, this is a typo in the NAME, probably a difference in upper and lower case.
Warning: option "$name" unknow for $name() in $package, found in $manual
Error: option $name outside subroutine in $file line $number
An option is set, however there is no subroutine in scope (yet).
Warning: option line incorrect in $file line $number: $line
The shown $line is not in the right format: it should contain at least two words being the option name and an abstract description of possible values.
Error: section `$name' outside chapter in $file line $number
Sections must be contained in chapters.
Warning: subroutine $name is not defined by $package, found in $manual
Error: subroutine $name outside chapter in $file line $number
Subroutine descriptions (method, function, tie, ...) can only be used within a restricted set of chapters. You have not started any chapter yet.
Error: subsection `$name' outside section in $file line $number
Subsections are only allowed in a chapter when they are nested within a section.
Error: subsubsection `$name' outside subsection in $file line $number
Subsubsections are only allowed in a chapter when they are nested within a subsection.
Warning: unknown markup in $file line $number: $line
The standard pod and the extensions made by this parser define a long list of markup keys, but yours is not one of these predefined names.
Warning: use problem for module $link in $module; $@
In an attempt to check the correctness of your naming of a module, OODoc will try to compile ("require") the named module. Apparently, the module was found, but something else went wrong. The exact cause is not always easy to find.
SEE ALSO
This module is part of OODoc distribution version 1.03, built on March 14, 2008. Website: http://perl.overmeer.net/oodoc/
LICENSE
Copyrights 2003-2008 by Mark Overmeer. For other contributors see ChangeLog.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See http://www.perl.com/perl/misc/Artistic.html