NAME
Syntax::Kamelon - A versatile and fully programmable textual content parser that is extremely well suited for syntax highlighting and code folding
SYNOPSIS
use Syntax::Kamelon;

# $xmldir is assumed to hold the folder with the syntax XML files,
# and IFILE to be an already opened input filehandle.
my @attributes = Syntax::Kamelon->AvailableAttributes;
my %formtab = ();
for (@attributes) {
    $formtab{$_} = "<font class=\"$_\">";
}
my $textfilter = "[%~ text FILTER html FILTER replace('\\040', ' ') FILTER replace('\\t', ' ') ~%]";
my $hl = Syntax::Kamelon->new(
    xmlfolder => $xmldir,
    noindex => 1,
    formatter => ['Base',
        textfilter => \$textfilter,
        format_table => \%formtab,
        newline => "</br>\n",
        tagend => '</font>',
    ],
);
while (my $in = <IFILE>) {
    $hl->Parse($in);
}
print $hl->Format;
DESCRIPTION
Kamelon is based on the syntax highlighting and code folding algorithms used in the Kate text editor of the KDE desktop. It replaces and supersedes Syntax::Highlight::Engine::Kate.
This is a rewrite and not backwards compatible with Syntax::Highlight::Engine::Kate.
Instead of using plugin modules it loads Kate's syntax highlight definition xml files directly. That makes development and testing a lot easier. It also opens up a new field of applications like creating your own highlight definitions to neatly format your reports. Tons of bugs have been removed. Testing has been redesigned. It runs about four times faster than version 0.10 and is up to spec with the latest Kate highlight definitions.
OPTIONS
Kamelon's constructor is called with a paired list of options as parameters. You can use the following options.
- commands => ref to hash
-
Specify commands to execute upon a specific match. Example:
mycommand => sub { my $match = shift; return '' }
You can specify the command in the rules of your own syntax xml file. Kamelon will give the matched text to your sub as parameter and will parse whatever your sub returns. Make it always return at least an empty string.
- formatter => ['Name', @options],
-
A formatter can be any object that inherits from Syntax::Kamelon::Format::Base and lives in Syntax::Kamelon::Format::Name. By default 'Base' without options is loaded. This is convenient if you only use ParseRaw.
See also Syntax::Kamelon::Format::Base, Syntax::Kamelon::Format::ANSI, Syntax::Kamelon::Format::HTML4.
- indexfile => filename
-
Specifies the filename where Kamelon stores information about available syntax definitions.
By default it points to 'indexrc' in the xmlfolder. If the file does not exist Kamelon will load all xml files in the xmlfolder and attempt to create the indexfile.
Once the indexfile has been created it becomes static. If you later add a syntax definition XML file to the xmlfolder, it will not be recognized. Delete the indexfile and reload Kamelon to fix that.
See also Syntax::Kamelon::Indexer
- logcall => ref to sub
-
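By default Kamelon writes all errors to STDERR. Use this option to supply your own sub that handles them instead. See also the LogCallGet and LogCallSet methods.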
- noindex => boolean
-
By default 0. If you set this option Kamelon ignores any existing indexfile and builds the index itself on every start, without saving it. This gives you the liberty of adding and removing syntax highlight definition files.
This option comes with a considerable startup penalty.
See also Syntax::Kamelon::Indexer
- syntax => string
-
Specify the syntax definition you want to use. If you do not specify this option Kamelon will start in blank mode. In that mode the Highlight and HighlightAndFormat methods let text pass through without any highlighting being done.
- verbose => boolean
-
By default 0. If you set it Kamelon will happily complain about all integrity errors it finds in syntax xml files. Otherwise it will only complain and crunch about the ones it cannot overcome.
- xmlfolder => folder
-
This is the place where Kamelon looks for syntax highlight definition XML files. By default it searches @INC for 'Syntax/Kamelon/XML'. Here you find the XML files used in the Kate text editor. They are specially crafted for this module.
See also Syntax::Kamelon::Indexer
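As a rough sketch of how the options above combine, the snippet below collects them in a paired list and only constructs the object when the module and folder are present. The folder name './my-xml', the syntax name 'Perl' and the command name 'mycommand' are assumptions for illustration. This document does not state which arguments the logcall sub receives, so the sub here simply collects whatever it is given.

```perl
use strict;
use warnings;

my @log;    # collected messages instead of STDERR output

# Hypothetical values: adjust the folder and syntax name to your setup.
my %options = (
    xmlfolder => './my-xml',    # assumed folder with your own XML definitions
    noindex   => 1,             # build the index on every start, never save it
    verbose   => 1,             # report every integrity error in the XML files
    syntax    => 'Perl',        # assumed to be among AvailableSyntaxes
    logcall   => sub { push @log, @_ },    # argument list is an assumption
    commands  => {
        # Receives the matched text. Whatever it returns is parsed,
        # so always return at least an empty string.
        mycommand => sub { my $match = shift; return '' },
    },
);

# Only construct the highlighter when the module and folder are available.
if (-d $options{xmlfolder} and eval { require Syntax::Kamelon; 1 }) {
    my $hl = Syntax::Kamelon->new(%options);
    print join(', ', $hl->AvailableSyntaxes), "\n";
}
```

Keeping the options in a hash makes it easy to switch noindex off again once your set of definition files is stable, which avoids the startup penalty mentioned above.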
PUBLIC METHODS
- AvailableAttributes
-
Returns a list of all available attribute tags. Can also be called before initializing Kamelon.
- AvailableSyntaxes
-
Returns a list of all available syntax definitions.
- ClearLexers
-
Empties the pool of loaded lexers. Every called lexer will be loaded from scratch.
- Column
-
Returns the column position in the line that is currently highlighted.
- FirstNonSpace($string);
-
Returns true if the current line has not contained a non-space character so far and the first character in $string is a non-space character.
- Format
-
Calls the Format method of the currently loaded formatter and returns the result.
- Formatter
-
Returns a reference to the formatter object.
- GetIndexer
-
Returns the Indexer object.
- GetLexer($syntax);
-
Returns the lexer data structure from the pool of loaded lexers. If it is not found it will create and return it.
- IsDeliminator($char);
-
Returns true if $char is a delimiter.
- LastcharBoundary
-
Returns true if the last character that was parsed was a word boundary.
- LastcharDeliminator
-
Returns true if the last character that was parsed was a delimiter.
- LastChar
-
Returns the last character that was parsed.
- LineNumber($number);
-
Sets and returns the line number of the next line that is to be parsed.
- LineStart
-
Returns true if the parser is at the beginning of a line.
- LogCallGet;
-
Returns a reference to the anonymous sub that handles warnings. See also the logcall option.
- LogCallSet($anonsub);
-
Sets the anonymous sub that handles warnings. See also the logcall option.
- LogWarning($message);
-
Send a message to the warning mechanism of Kamelon.
- Parse($text);
-
Parses $text and returns a formatted text.
- ParseRaw($text);
-
Parses $text and returns a paired list of text fragments and the format information from the formatter's FormatTable.
- Reset
-
Clears all buffers and resets Kamelon to beginning state.
- StateCompare($state);
-
Returns true if the current stack is equal to a previously saved $state. $state contains a reference to a list.
- StateGet
-
Returns a copy of the stack in an array.
- StateSet(@state);
-
Set the state to a previously saved state.
- SuggestSyntax($filename);
-
Tries to come up with a suitable lexer for $filename. It matches the extension of the file against the extension database held by the Indexer. Returns undef if nothing is found.
- Syntax($syntax);
-
Switches to the lexer in $syntax and performs a reset.
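The following is a minimal sketch, not a definitive recipe, of how several of the methods above might be used together. It assumes that ParseRaw returns a flat list in which each text fragment is followed by its format information, as the description suggests. The helper render_pairs and the file name 'example.pl' are hypothetical, and the Kamelon part only runs when the module is installed.

```perl
use strict;
use warnings;

# Turn a ParseRaw-style paired list into "[tag]text" chunks.
# Assumed pair order: text fragment first, format information second.
sub render_pairs {
    my @raw = @_;
    my $out = '';
    while (@raw) {
        my ($text, $tag) = splice(@raw, 0, 2);
        $out .= (defined $tag and length $tag) ? "[$tag]$text" : $text;
    }
    return $out;
}

if (eval { require Syntax::Kamelon; 1 }) {
    my $hl = Syntax::Kamelon->new(noindex => 1);   # default xmlfolder and formatter
    my $file = 'example.pl';                       # hypothetical file name
    my $syntax = $hl->SuggestSyntax($file);        # undef if nothing is found
    if (defined $syntax) {
        $hl->Syntax($syntax);                      # switch lexer and reset
        my @before = $hl->StateGet;                # copy of the stack
        print render_pairs($hl->ParseRaw("my \$x = 1;\n")), "\n";
        # StateCompare expects a reference to a list.
        print "state unchanged\n" if $hl->StateCompare(\@before);
    }
}
```

Saving the state before a line and comparing it afterwards is one way an editor could decide whether the lines that follow need to be parsed again.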
ACKNOWLEDGEMENTS
All the people who wrote Kate and the syntax highlight xml files.
AUTHOR AND COPYRIGHT
This module is written and maintained by:
Hans Jeuken < hanje at cpan dot org>
Copyright (c) 2017 - 2023 by Hans Jeuken, all rights reserved.
Published under the same license as Perl.
BUGS, ERRORS and DISCLAIMER
We know for a fact that the supplied xml files from Kate do not all produce accurate results. Most of them do though.
There are a few instances (Lilypond we know of) where Perl treats regular expressions slightly differently from Kate. This became obvious as of Perl 5.26.
We have also chosen to not integrate some of Kate's tag features. We think they are a poor design choice. Kate allows you to specify tags like 'bold', 'italic' or a colour. We have foregone those. So for some xml definition files you will get different output when comparing it with a Kate editor window.
If you bump into one of these, unfortunately you are on your own.
What you can do is use your own set of xml definitions in a folder of your choice. Then edit the xml to your liking.
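One possible way to set up such a private folder is sketched below. It looks for the shipped definitions the way the xmlfolder default is described (a 'Syntax/Kamelon/XML' directory in @INC) and copies the XML files to a folder you can edit. The helper names find_shipped_xml and copy_definitions and the target folder 'my-xml' are hypothetical and not part of Kamelon.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Path qw(make_path);
use File::Spec;

# Return the first 'Syntax/Kamelon/XML' directory found in @INC, or undef.
sub find_shipped_xml {
    for my $inc (grep { !ref } @INC) {
        my $dir = File::Spec->catdir($inc, 'Syntax', 'Kamelon', 'XML');
        return $dir if -d $dir;
    }
    return;
}

# Copy every *.xml file from $from into $to and return how many were copied.
sub copy_definitions {
    my ($from, $to) = @_;
    make_path($to);
    opendir(my $dh, $from) or die "Cannot open $from: $!";
    my @names = grep { /\.xml$/i } readdir $dh;
    closedir $dh;
    for my $name (@names) {
        copy(File::Spec->catfile($from, $name), File::Spec->catfile($to, $name))
            or die "Cannot copy $name: $!";
    }
    return scalar @names;
}

my $private = 'my-xml';    # hypothetical target folder
if (defined(my $shipped = find_shipped_xml())) {
    my $n = copy_definitions($shipped, $private);
    print "Copied $n definition files to $private\n";
}
```

After editing the copies, pass the folder to the constructor with the xmlfolder option. Setting noindex as well ensures that added or removed files are picked up on every start.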
SEE ALSO
Syntax::Kamelon::Builder, Syntax::Kamelon::Debugger, Syntax::Kamelon::Diagnostics, Syntax::Kamelon::Indexer, Syntax::Kamelon::XMLData, Syntax::Kamelon::Format::Base, Syntax::Kamelon::Format::ANSI, Syntax::Kamelon::Format::HTML4, Syntax::Kamelon::Syntaxes.