NAME
File::Extract - Extract Text From Arbitrary File Types
SYNOPSIS
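use File::Extract;

# Extract text using the default settings
my $e = File::Extract->new();
my $r = $e->extract($filename);

# Or list the encodings the input files are expected to be in
$e = File::Extract->new(encodings => [...]);

# Register a custom extractor class
my $class = "MyExtractor";
File::Extract->register($class);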
DESCRIPTION
File::Extract is a framework to extract text data out of arbitrary file types, useful to collect data for indexing.
CLASS METHODS
register($class)
Registers a new text-extractor. The specified class needs to implement two functions:
- mime_type(void)
Returns the MIME type that $class can extract files from.
- extract($file)
Extracts the text from $file. Returns a File::Extract::Result object.
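As a rough illustration of the two functions described above, here is a minimal sketch of an extractor class. The MIME type 'text/x-example' is hypothetical. The new() constructor is included on the assumption that the framework may instantiate the class before calling extract(). The named arguments passed to File::Extract::Result->new() are also an assumption; check that class's documentation before relying on them.

```perl
package MyExtractor;
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical MIME type, used here for illustration only.
sub mime_type { return 'text/x-example' }

# Assumption: the framework may instantiate the class before calling
# extract(), so a trivial constructor is provided.
sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub extract {
    my ($self, $file) = @_;

    # Read the whole file; a real extractor would parse its format here.
    open my $fh, '<', $file or croak "Cannot open $file: $!";
    my $text = do { local $/; <$fh> };
    close $fh;

    # Loaded lazily so that this class compiles on its own.
    require File::Extract::Result;

    # Assumption: the result constructor accepts these named arguments.
    return File::Extract::Result->new(
        text      => $text,
        filename  => $file,
        mime_type => $self->mime_type,
    );
}

1;
```

Saved as MyExtractor.pm somewhere on @INC, the class would then be registered with File::Extract->register("MyExtractor"), as in the SYNOPSIS.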
METHODS
- encodings
List of encodings that you expect your files to be in. This is used to re-encode and normalize the contents of the file via Encode::Guess.
- output_encoding
The final encoding that you want the extracted text to be in. The default encoding is UTF8.
new(%args)
extract($file)
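The following sketch shows how new() and extract() might be combined with a list of expected encodings. The helper extract_text() is hypothetical, and the encoding names are only examples. Two further assumptions are made: that the result object exposes the extracted text through a text() accessor, and that extract() returns a false value when nothing could be extracted.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the extracted text of $file, or undef
# when no result comes back.
sub extract_text {
    my ($file, @encodings) = @_;

    require File::Extract;    # loaded on first use
    my $extractor = File::Extract->new(encodings => \@encodings);
    my $result    = $extractor->extract($file);

    # Assumption: the result object provides a text() accessor.
    return $result ? $result->text : undef;
}

# Example run: perl script.pl some_file
if (@ARGV) {
    my $text = extract_text($ARGV[0], 'euc-jp', 'shiftjis');
    print defined $text ? $text : "No text extracted from $ARGV[0]\n";
}
```

Listing only the encodings that are actually likely to occur keeps the guessing step less ambiguous, since Encode::Guess has to pick among the candidates it is given.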
SEE ALSO
AUTHOR
Copyright 2005 Daisuke Maki <dmaki@cpan.org>. All rights reserved. Development funded by Brazil, Ltd. <http://b.razil.jp>