NAME

Plucene::SearchEngine::Index::PPT - a Plucene backend for indexing Microsoft PowerPoint presentations

VERSION

version 0.001

DESCRIPTION

This backend analyzes a PPT file. The module uses a tool called ppthtml, provided by the xlhtml package, which is available from http://chicago.sourceforge.net/xlhtml/ or through your operating system's package manager.
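As a rough illustration of the conversion step, the sketch below runs a converter on a file and captures the HTML it prints. It assumes that ppthtml is on the PATH and writes its HTML to standard output. The helper name ppt_to_html and its optional second argument are hypothetical and are not part of this distribution.

```perl
use strict;
use warnings;

# Hypothetical helper: run an external converter on a file and return
# everything it prints to standard output. The default command name
# 'ppthtml' is an assumption about how the tool is installed.
sub ppt_to_html {
    my ($file, $converter) = @_;
    $converter = 'ppthtml' unless defined $converter;

    # List-form piped open avoids passing the file name through a shell.
    open(my $fh, '-|', $converter, $file)
        or die "Cannot run $converter: $!";

    local $/;            # slurp mode: read all output in one go
    my $html = <$fh>;

    close($fh)
        or warn "$converter exited abnormally (status $?)\n";
    return $html;
}
```

Calling ppt_to_html('slides.ppt') would then return the generated HTML as a single string, or die if the converter cannot be started. The file name slides.ppt is only a placeholder.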

This code is not currently actively maintained.

The following Plucene fields are produced:

text

The text part of the PPT

A list of links in the HTML

Additionally, any META tags are turned into Plucene fields.

METHODS

gather_data_from_file

Overrides the method from Plucene::SearchEngine::Index::HTML to provide PPT parsing.
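One plausible shape for such an override is sketched below: convert the presentation first, then hand the resulting HTML to the parent method. This is not taken from the module's source. My::HTMLBackend and My::PPTBackend are stand-in packages invented for the example, and the converter is assumed to be ppthtml on the PATH, printing HTML to standard output.

```perl
use strict;
use warnings;
use File::Temp ();

# Stand-in for the HTML backend (hypothetical; the real parent class is
# Plucene::SearchEngine::Index::HTML, whose internals are not shown here).
package My::HTMLBackend;

sub new { return bless { seen => [] }, shift }

# Pretend to parse an HTML file: simply remember its contents.
sub gather_data_from_file {
    my ($self, $filename) = @_;
    open(my $in, '<', $filename) or die "Cannot read $filename: $!";
    my $content = do { local $/; <$in> };
    close $in;
    push @{ $self->{seen} }, $content;
    return $self;
}

# Stand-in for the PPT backend: convert first, then delegate to the parent.
package My::PPTBackend;
our @ISA = ('My::HTMLBackend');

# Converter command; 'ppthtml' is assumed to be installed and on the PATH.
our @CONVERTER = ('ppthtml');

sub gather_data_from_file {
    my ($self, $filename) = @_;

    # Capture the converter's HTML output without going through a shell.
    open(my $pipe, '-|', @CONVERTER, $filename)
        or die "Cannot run $CONVERTER[0]: $!";
    my $html = do { local $/; <$pipe> };
    close $pipe;
    return unless defined $html && length $html;

    # Write the HTML to a temporary file that the parent method can read.
    my ($tmp_fh, $tmp_name) = File::Temp::tempfile(UNLINK => 1);
    print {$tmp_fh} $html;
    close $tmp_fh or die "Cannot write $tmp_name: $!";

    return $self->SUPER::gather_data_from_file($tmp_name);
}

package main;
```

Keeping the conversion in the subclass means the HTML handling stays in one place, so the PPT backend only has to produce HTML that the parent can read.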

AVAILABILITY

The latest version of this module is available from the Comprehensive Perl Archive Network (CPAN). Visit http://www.perl.com/CPAN/ to find a CPAN site near you, or see https://metacpan.org/module/Plucene::SearchEngine::Index::MSOffice/.

SOURCE

The development version is on GitHub at http://github.com/doherty/Plucene-SearchEngine-Index-MSOffice and may be cloned from git://github.com/doherty/Plucene-SearchEngine-Index-MSOffice.git

BUGS AND LIMITATIONS

You can make new bug reports, and view existing ones, through the web interface at https://github.com/doherty/Plucene-SearchEngine-Index-MSOffice/issues.

AUTHORS

  • Sopan Shewale <sopan.shewale@gmail.com>

  • Mike Doherty <doherty@pythian.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2012 by Sopan Shewale <sopan.shewale@gmail.com>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.