NAME
Org::Parser - Parse Org documents
VERSION
version 0.02
SYNOPSIS
use 5.010;
use Org::Parser;
my $orgp = Org::Parser->new();
# parse into a document object
my $doc = $orgp->parse_file("$ENV{HOME}/todo.org");
# print out elements while parsing
$orgp->handler(sub {
my ($orgp, $event, $args) = @_;
return unless $event eq 'element';
my $el = $args->{element};
return unless $el->isa('Org::Element::Headline') &&
$el->is_todo && !$el->is_done;
say "found todo item: ", $el->title->as_string;
});
$orgp->parse(<<EOF);
* heading1a
** TODO heading2a
** DONE heading2b
* TODO heading1b
* heading1c
EOF
will print something like:
found todo item: heading2a
found todo item: heading1b
DESCRIPTION
This module parses Org documents. See http://orgmode.org/ for more details on Org documents.
This module uses the Log::Any logging framework.
This module uses the Moo object system.
NOTE: This module is in alpha stage. See "BUGS/TODO/LIMITATIONS" for the list of features not yet implemented.
Already implemented/parsed:
in-buffer settings
blocks
headlines & TODO items
Including custom TODO keywords, custom priorities
schedule timestamps (subset of)
drawers & properties
$orgp->parse_inline($str, $doc[, $parent])
Inline elements are elements that can be put under a heading, table cell, heading title, etc. These include normal text (and text with markup), timestamps, links, etc.
Found elements will be added into $parent's children. If $parent is not specified, it will be set to $orgp->_last_headline (or, if undef, $doc).
ATTRIBUTES
handler => CODEREF (default undef)
If set, the handler will be called repeatedly by the parser during parsing. This can be used to quickly filter/extract wanted elements (e.g. headlines, timestamps, etc.) from an Org document.
Handler will be passed these arguments:
$orgp, $event, $args
$orgp is the parser instance, $event is the type of event (currently only 'element', triggered after the parser parses an element) and $args is a hashref containing extra information depending on $event and the type of element. For $event eq 'element', $args->{element} will be set to the element object.
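The calling convention above can be sketched as follows. This is a minimal illustration, not the module's own code: the element methods (is_todo, is_done, title->as_string) are taken from the SYNOPSIS, while the names $handler and @open_todos are made up for this example. The callback leaves early with return, since it is an ordinary subroutine rather than a loop body.

```perl
use strict;
use warnings;

# Titles of headlines that are TODO items and not yet done.
my @open_todos;

# Handler following the documented ($orgp, $event, $args) convention.
my $handler = sub {
    my ($orgp, $event, $args) = @_;
    return unless $event eq 'element';   # the only event type described here
    my $el = $args->{element};
    return unless $el->isa('Org::Element::Headline');
    return unless $el->is_todo && !$el->is_done;
    push @open_todos, $el->title->as_string;
    return;
};

# Assumed wiring, mirroring the SYNOPSIS (requires Org::Parser):
#   my $orgp = Org::Parser->new();
#   $orgp->handler($handler);
#   $orgp->parse_file("$ENV{HOME}/todo.org");
#   print "$_\n" for @open_todos;
```

Collecting into an array instead of printing keeps the handler easy to reuse, for example to sort or count the items after parsing has finished.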
METHODS
new()
Create a new parser instance.
$orgp->parse($str | $arrayref | $coderef | $filehandle) => $doc
Parse document (which can be contained in a scalar $str, an array of lines $arrayref, a subroutine which will be called for chunks until it returns undef, or a filehandle).
Returns Org::Document object.
If the 'handler' attribute is set, the handler is called repeatedly during parsing. See the 'handler' attribute for more details.
Will die if there are syntax errors in the document.
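To make the four accepted input forms concrete, here is a rough sketch of how they could be reduced to a single "give me the next chunk" interface. The helper make_reader is hypothetical and is not part of Org::Parser; it only illustrates the dispatch on input type, using the core module Scalar::Util.

```perl
use strict;
use warnings;
use Scalar::Util qw(openhandle);

# Hypothetical helper (not part of Org::Parser): turn any of the four
# accepted input forms into a coderef that returns one chunk per call
# and undef when the input is exhausted.
sub make_reader {
    my ($input) = @_;

    if (ref $input eq 'CODE') {
        return $input;                            # already a chunk source
    }
    if (ref $input eq 'ARRAY') {
        my $i = 0;
        return sub { $i < @$input ? $input->[$i++] : undef };
    }
    if (defined openhandle($input)) {
        return sub { scalar readline($input) };   # undef at end of file
    }
    if (!ref $input) {
        # Split before each line start so that line endings are kept.
        my @lines = split /^/m, $input;
        my $i = 0;
        return sub { $i < @lines ? $lines[$i++] : undef };
    }
    die "Unsupported input type: " . ref($input);
}

# Usage: drain a reader built from a plain string.
my $reader = make_reader("* heading1a\n** TODO heading2a\n");
my @chunks;
while (defined(my $chunk = $reader->())) {
    push @chunks, $chunk;
}
```

The coderef form is the most general of the four, which is why the sketch converts everything else to it.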
$orgp->parse_file($filename) => $doc
Just like parse(), but will load document from file instead.
BUGS/TODO/LIMITATIONS
Single-pass parser
Parser is currently a single-pass parser, so settings must be declared before they are used. For example, when declaring custom TODO keywords, write:
#+TODO: TODO | DONE
#+TODO: BUG WISHLIST | FIXED CANTREPRO
* FIXED blah
and not:
* FIXED blah (at this point, custom TODO keywords not yet recognized)
#+TODO: TODO | DONE
#+TODO: BUG WISHLIST | FIXED CANTREPRO
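The effect of declaration order can be illustrated with a toy single-pass scanner. This is only a sketch of the general technique, not Org::Parser's implementation: the function scan_headlines and its return format are invented for the example, and the default keywords TODO and DONE are an assumption.

```perl
use strict;
use warnings;

# Toy single-pass scanner (an illustration only, not Org::Parser's code):
# a headline's first word counts as a TODO keyword only if a "#+TODO:"
# line declaring it has already been seen.
sub scan_headlines {
    my ($text) = @_;
    my %known = map { $_ => 1 } qw(TODO DONE);   # assumed defaults
    my @found;
    for my $line (split /\n/, $text) {
        if ($line =~ /^\#\+TODO:\s*(.*)$/) {
            # Register every keyword on the line; "|" only separates
            # not-done states from done states.
            $known{$_} = 1 for grep { $_ ne '|' } split ' ', $1;
        }
        elsif ($line =~ /^\*+\s+(\S+)\s*(.*)$/) {
            my ($first, $rest) = ($1, $2);
            if ($known{$first}) {
                push @found, { keyword => $first, title => $rest };
            }
            else {
                # Unknown word: treat it as part of the title.
                my $title = length($rest) ? "$first $rest" : $first;
                push @found, { keyword => undef, title => $title };
            }
        }
    }
    return @found;
}

my $setting = "#+TODO: BUG WISHLIST | FIXED CANTREPRO\n";
my ($early) = scan_headlines($setting . "* FIXED blah\n");
my ($late)  = scan_headlines("* FIXED blah\n" . $setting);
# $early->{keyword} is 'FIXED'; $late->{keyword} is undef and
# $late->{title} is 'FIXED blah'.
```

A two-pass design would collect all settings first and only then classify headlines, at the cost of reading the input twice or buffering it.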
What's the syntax for multiple in-buffer settings on a single line?
Currently the parser assumes a single in-buffer setting per line.
Difference between TYP_TODO and TODO/SEQ_TODO?
Currently we assume it to be the same as the other two.
Parse link & link abbreviations (#+LINK)
Parse timestamps & timestamp pairs
Parse repeats in schedule timestamps
Set table's caption, etc from settings
#+CAPTION: A long table
#+LABEL: tbl:long
|...|...|
|...|...|
Question: is this still a valid caption?
#+CAPTION: A long table
some text
#+LABEL: tbl:long
some more text
|...|...|
|...|...|
Parse text markups
Parse headline percentages
Parse {unordered,ordered,description,check} lists
Process includes (#+INCLUDE)
Parse buffer-wide header arguments (#+BABEL, 14.8.1)
SEE ALSO
AUTHOR
Steven Haryanto <stevenharyanto@gmail.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2011 by Steven Haryanto.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.