NAME

Org::Parser - Parse Org documents

VERSION

version 0.03

SYNOPSIS

use 5.010;
use Org::Parser;
my $orgp = Org::Parser->new();

# parse into a document object
my $doc  = $orgp->parse_file("$ENV{HOME}/todo.org");

# print out elements while parsing
$orgp->handler(sub {
    my ($orgp, $event, $args) = @_;
    return unless $event eq 'element';
    my $el = $args->{element};
    return unless $el->isa('Org::Element::Headline') &&
        $el->is_todo && !$el->is_done;
    say "found todo item: ", $el->title->as_string;
});
$orgp->parse(<<EOF);
* heading1a
** TODO heading2a
** DONE heading2b
* TODO heading1b
* heading1c
EOF

will print something like:

found todo item: heading2a
found todo item: heading1b

DESCRIPTION

This module parses Org documents. See http://orgmode.org/ for more details on Org documents.

This module uses the Log::Any logging framework.

This module uses the Moo object system.

NOTE: This module is in alpha stage. See "BUGS/TODO/LIMITATIONS" for the list of features not yet implemented.

Already implemented/parsed:

  • text & markups (bold, italic, etc)

  • in-buffer settings

  • blocks

  • headlines & TODO items

    Including custom TODO keywords and custom priorities

  • schedule timestamps (a subset)

  • drawers & properties

  • tables

$orgp->parse_inline($str, $doc[, $parent])

Inline elements are elements that can appear under a heading, in a table cell, in a heading title, and so on. They include normal text (with or without markup), timestamps, links, etc.

Elements found are added to $parent's children. If $parent is not specified, it defaults to $orgp->_last_headline or, if that is undef, to $doc.

ATTRIBUTES

handler => CODEREF (default undef)

If set, the handler will be called repeatedly by the parser during parsing. This can be used to quickly filter or extract wanted elements (e.g. headlines, timestamps, etc) from an Org document.

Handler will be passed these arguments:

$orgp, $event, $args

$orgp is the parser instance, $event is the type of event (currently only 'element', triggered after the parser parses an element), and $args is a hashref containing extra information that depends on $event and on the type of element. For $event eq 'element', $args->{element} is set to the element object.
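The calling convention described above can be sketched as follows. This is only an illustration: the handler is exercised by hand with a stand-in object blessed into Org::Element::Headline, and the 'finished' event name is hypothetical (only 'element' is documented). No parser is involved.

```perl
use strict;
use warnings;

my @headlines;

# Handler following the documented ($orgp, $event, $args) convention.
my $handler = sub {
    my ($orgp, $event, $args) = @_;
    return unless $event eq 'element';   # the only event type documented so far
    my $el = $args->{element};
    return unless $el->isa('Org::Element::Headline');
    push @headlines, $el;                # keep the element for later use
    return;
};

# Stand-in element object, used only to exercise the handler (an assumption,
# not a real parsed element).
my $fake = bless {}, 'Org::Element::Headline';

$handler->(undef, 'element',  { element => $fake });
$handler->(undef, 'finished', {});       # hypothetical event, ignored

print "collected ", scalar(@headlines), " headline element(s)\n";
```

With a real parser, the same coderef would be installed with $orgp->handler($handler) before calling parse() or parse_file().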

METHODS

new()

Create a new parser instance.

$orgp->parse($str | $arrayref | $coderef | $filehandle) => $doc

Parse document (which can be contained in a scalar $str, an array of lines $arrayref, a subroutine which will be called for chunks until it returns undef, or a filehandle).

Returns an Org::Document object.

If the 'handler' attribute is set, the handler is called repeatedly during parsing. See the 'handler' attribute for more details.

Dies if the document contains syntax errors.
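A minimal sketch of the four input forms accepted by parse(), with eval used to trap the exception raised on syntax errors. Two assumptions are made here: lines in the arrayref keep their trailing newlines, and an in-memory filehandle is acceptable as the filehandle form. The parse step is guarded so the sketch also runs where Org::Parser is not installed.

```perl
use strict;
use warnings;

my $text = "* TODO write docs\n** DONE outline\n";

# 1. scalar: $text itself
# 2. arrayref of lines (newlines kept; an assumption)
my $lines = [ split /^/m, $text ];
# 3. coderef that returns one chunk per call, then undef when exhausted
my @chunks = @$lines;
my $reader = sub { shift @chunks };
# 4. filehandle (in-memory here; a handle on a real file works the same way)
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my @inputs = (
    [ 'scalar'     => $text   ],
    [ 'arrayref'   => $lines  ],
    [ 'coderef'    => $reader ],
    [ 'filehandle' => $fh     ],
);

if (eval { require Org::Parser; 1 }) {
    my $orgp = Org::Parser->new;
    for my $input (@inputs) {
        my ($kind, $source) = @$input;
        # parse() dies on syntax errors, so trap the exception with eval
        my $doc = eval { $orgp->parse($source) };
        if ($doc) { print "$kind: parsed into a ", ref($doc), " object\n" }
        else      { warn "$kind: parse failed: $@" }
    }
}
else {
    print "Org::Parser is not installed; skipping the parse step\n";
}
```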

$orgp->parse_file($filename) => $doc

Just like parse(), but will load document from file instead.

BUGS/TODO/LIMITATIONS

  • Single-pass parser

    The parser is currently single-pass, so settings must be declared before they are used. For example, custom TODO keywords must be declared before the headlines that use them:

    #+TODO: TODO | DONE
    #+TODO: BUG WISHLIST | FIXED CANTREPRO
    
    * FIXED blah

    and not:

    * FIXED blah (at this point, custom TODO keywords not yet recognized)
    
    #+TODO: TODO | DONE
    #+TODO: BUG WISHLIST | FIXED CANTREPRO

  • What's the syntax for multiple in-buffer settings on a single line?

    Currently the parser assumes a single in-buffer setting per line.

  • Difference between TYP_TODO and TODO/SEQ_TODO?

    Currently TYP_TODO is assumed to behave the same as TODO and SEQ_TODO.

  • Parse links & link abbreviations (#+LINK)

  • Parse timestamps & timestamp pairs

  • Parse repeats in schedule timestamps

  • Set table's caption, etc from settings

    #+CAPTION: A long table
    #+LABEL: tbl:long
    |...|...|
    |...|...|

    Question: is this still a valid caption?

    #+CAPTION: A long table
    some text
    #+LABEL: tbl:long
    some more text
    |...|...|
    |...|...|

  • Parse headline percentages

  • Parse {unordered,ordered,description,check} lists

  • Process includes (#+INCLUDE)

  • Parse buffer-wide header arguments (#+BABEL, 14.8.1)
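The single-pass limitation in the first item above can be illustrated with a toy scanner. This is not Org::Parser's actual code: the function name todo_headlines and its regexes are hypothetical, and the default keywords TODO and DONE are an assumption. It only shows why a #+TODO line has to come before the headlines that use its keywords.

```perl
use strict;
use warnings;

# Toy single-pass scanner: reads lines once, top to bottom, and recognizes a
# headline keyword only if it is already known at that point.
sub todo_headlines {
    my ($text) = @_;
    my %keywords = map { $_ => 1 } qw(TODO DONE);   # assumed defaults
    my @found;
    for my $line (split /\n/, $text) {
        if ($line =~ /^#\+TODO:\s*(.*)$/) {
            # register custom keywords, skipping the "|" separator
            $keywords{$_} = 1 for grep { $_ ne '|' } split ' ', $1;
        }
        elsif ($line =~ /^\*+\s+(\S+)\s+(.*)$/ && $keywords{$1}) {
            push @found, "$1: $2";
        }
    }
    return @found;
}

my $setting  = "#+TODO: BUG WISHLIST | FIXED CANTREPRO\n";
my $headline = "* FIXED blah\n";

my @before = todo_headlines($setting . $headline);   # setting comes first
my @after  = todo_headlines($headline . $setting);   # setting comes last

print scalar(@before), " keyword headline(s) when the setting comes first\n";
print scalar(@after),  " keyword headline(s) when the setting comes last\n";
```

A two-pass design would collect all in-buffer settings first and then parse headlines, at the cost of reading the input twice.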

SEE ALSO

AUTHOR

Steven Haryanto <stevenharyanto@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2011 by Steven Haryanto.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.