NAME

PDF::OCR2 - extract all text and all image OCR from a PDF file

SYNOPSIS

use PDF::OCR2;

my $p = PDF::OCR2->new('./path/to/file.pdf');

my $text_all   = $p->text;
my @text_pages = $p->text;

DESCRIPTION

This module is meant to replace PDF::OCR. The backend complexity of the process has been isolated in separate modules:

PDF::GetImages
PDF::Burst
Image::OCR::Tesseract
PDF::OCR2::Pages - in this distro.

Why not just modify PDF::OCR? The breakdown of code hierarchy and interdependencies is so extensive, and the interface so different, that a new module made more sense. PDF::OCR worked, but it was messy; this design is considerably cleaner.

METHODS

new()

The argument is the path to a PDF file.

text()

Takes no arguments. In scalar context, returns the text of all pages joined with a form feed (\f) page-break character. In list context, returns the text of the pages, one page per element.
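
The context-sensitive behaviour of text() can be illustrated with a small self-contained sketch. The sub text_of_pages below is a hypothetical stand-in, not the module's actual code, and the sample page strings are made up; it only assumes the documented convention (form feed joined string in scalar context, one element per page in list context).

```perl
use strict;
use warnings;

# Hypothetical stand-in for PDF::OCR2's text(); NOT the module's real code.
# It only mimics the documented calling convention.
sub text_of_pages {
    my @pages = @_;
    # wantarray is true in list context and false in scalar context.
    return wantarray ? @pages : join("\f", @pages);
}

# Made-up sample data standing in for extracted page text.
my @sample = ("page one text", "page two text", "page three text");

my $all   = text_of_pages(@sample);   # scalar context: one string
my @pages = text_of_pages(@sample);   # list context: one element per page

# A scalar result can be split back into pages on the form feed character.
# The limit of -1 keeps trailing empty pages, which split would otherwise drop.
my @recovered = split /\f/, $all, -1;

printf "%d pages, %d recovered\n", scalar(@pages), scalar(@recovered);
```

If only per-page text is needed, calling text() in list context avoids the extra split step.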

CAVEATS

This only works on POSIX systems.

ERRORS

If PDF::API2 reports that the PDF is corrupt (most likely via PDF::Burst), try this:

use PDF::OCR2;

$PDF::Burst::BURST_METHOD = 'CAM_PDF';

# and then...
my $pdf = PDF::OCR2->new('./pathtofile.pdf');
print $pdf->text;

CRIT AND SUGGESTIONS

The AUTHOR is open to any suggestions and requests.

REPLACES

PDF::OCR - deprecated by this module.

AUTHOR

Leo Charre leocharre at cpan dot org

THANKS

Long Nguyen

COPYRIGHT

LICENSE

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, i.e., under the terms of the "Artistic License" or the "GNU General Public License".

DISCLAIMER

This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See the "GNU General Public License" for more details.

