NAME

PDF::OCR2::Page

DESCRIPTION

Extracts the text of a single-page PDF document: the text embedded in the document itself and, if the page contains images, the text recognized in those images via tesseract OCR.

The constructor argument is a hashref. It must contain abs_pdf, the path to the PDF file. If abs_pdf is missing or the file does not exist on disk, an exception is thrown.
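A minimal sketch of constructing a page object and catching the exception described above. The hashref argument and the abs_pdf key come from this documentation; the constructor name new() is assumed from Perl convention, and the file path is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical path, used only for illustration.
my $abs_pdf = '/tmp/scan-page-1.pdf';

my $page = eval {
    require PDF::OCR2::Page;
    # new() is an assumed constructor name; the hashref and the
    # abs_pdf key are taken from the documentation.
    PDF::OCR2::Page->new({ abs_pdf => $abs_pdf });
};

# Capture the error right after the eval, before $@ can be clobbered.
my $err = $page ? '' : ( $@ || 'unknown error' );

if ($page) {
    print "Loaded page from $abs_pdf\n";
}
else {
    # Reached when the module is not installed, abs_pdf is missing,
    # or the file does not exist on disk.
    print "Could not create page object: $err\n";
}
```

Wrapping the call in eval keeps a missing or unreadable file from terminating the whole program, which matters when many pages are processed in a loop.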

abs_pdf is a set/get accessor. The argument is the path to a PDF file representing one page; the file must exist on disk.

Returns the images found in the page: an array reference in scalar context, a list in list context. Uses PDF::GetImages, which is slow.

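The following package variables control optional behavior. Defaults are shown.

Set to a true value to evaluate the PDF with PDF::API2 and check it for correctness.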

$PDF::OCR2::Page::CHECK_PDF = 0;

Set to a true value to skip the cleanup of trash on DESTROY.

$PDF::OCR2::Page::NO_TRASH_CLEANUP = 0;

Set to a true value to turn debug output on.

$PDF::OCR2::Page::DEBUG = 0;
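As a rough sketch, the three flags above could be switched on as follows. Loading the module before assigning is an assumption, made so that any initialization the module performs at load time does not overwrite the values; the script also runs when the module is not installed.

```perl
use strict;
use warnings;

# Load the module first if it is available (assumption: the module
# may set its own defaults at load time).
my $loaded = eval { require PDF::OCR2::Page; 1 };

# Fully qualified package variables can be assigned under strict.
$PDF::OCR2::Page::CHECK_PDF        = 1;  # validate the PDF with PDF::API2
$PDF::OCR2::Page::NO_TRASH_CLEANUP = 1;  # keep trash on DESTROY
$PDF::OCR2::Page::DEBUG            = 1;  # debug output on

printf "module loaded: %s\n", $loaded ? 'yes' : 'no';
printf "CHECK_PDF=%s NO_TRASH_CLEANUP=%s DEBUG=%s\n",
    $PDF::OCR2::Page::CHECK_PDF,
    $PDF::OCR2::Page::NO_TRASH_CLEANUP,
    $PDF::OCR2::Page::DEBUG;
```

Keeping NO_TRASH_CLEANUP and DEBUG on together can be convenient while troubleshooting, since the leftover files can then be inspected alongside the debug output.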

AUTHOR

Leo Charre leocharre at cpan dot org

