NAME
Biblio::Document::Parser::Utils - utility module for handling international characters and document conversion
DESCRIPTION
Biblio::Document::Parser::Utils provides some utility functions for handling international characters and for conversion of documents to plaintext.
SYNOPSIS
use Biblio::Document::Parser::Utils qw( normalise_multichars );
print normalise_multichars( $str );
METHODS
- $str = normalise_multichars( $str )
Convert multi-character sequences for international characters into single UTF-8 characters, e.g. ¨o => ö. These sequences appear in pdftotext output from PDFs generated by pdflatex.
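The conversion described above can be sketched as follows. This is an illustration only, not the module's actual implementation: the name normalise_multichars_sketch, the %COMBINING table, and the choice of Unicode::Normalize are all assumptions, and only two accents are covered.

```perl
use strict;
use warnings;
use Unicode::Normalize qw( NFC );

# Hypothetical table: spacing accent => equivalent combining mark.
# Only an illustrative subset; the real module may use a different mapping.
my %COMBINING = (
    "\x{00A8}" => "\x{0308}",    # diaeresis
    "\x{00B4}" => "\x{0301}",    # acute accent
);

sub normalise_multichars_sketch {
    my ($str) = @_;

    # Move each spacing accent behind its base letter as a combining mark.
    $str =~ s/([\x{00A8}\x{00B4}])([A-Za-z])/$2$COMBINING{$1}/g;

    # Compose base letter + combining mark into a single character.
    return NFC($str);
}
```

Assuming the input is a decoded character string, normalise_multichars_sketch("K\x{A8}onig") yields "K\x{F6}nig", i.e. the two-character sequence ¨o becomes the single character ö.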
- $content = ParaTools::Utils::get_content($location)
This function takes either a filename or a URL as a parameter, and aims to return a string containing the lines in the file. A hash of converters is provided in ParaTools/Utils.pm, which should be customised for your system.
For URLs, the file is first downloaded to a temporary directory, then converted, whereas local files are copied straight into the temporary directory. For this reason, some care should be taken when handling very large files.
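A minimal sketch of the local-file path described above, under stated assumptions: the name get_content_sketch and the %CONVERTERS table keyed by file extension are hypothetical, the single "txt" converter just reads the file, and the URL download step is left out entirely. The converter hash shipped with the module may be organised differently.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use File::Copy qw( copy );
use File::Basename qw( basename );

# Hypothetical converter table: extension => code that returns plain text.
# A real setup would add entries that call external tools for other formats.
my %CONVERTERS = (
    txt => sub {
        my ($path) = @_;
        open my $fh, '<', $path or die "Cannot open $path: $!";
        local $/;                      # slurp the whole file at once
        my $text = <$fh>;
        close $fh;
        return $text;
    },
);

sub get_content_sketch {
    my ($location) = @_;

    # Local files are copied into a temporary directory before conversion.
    my $dir  = tempdir( CLEANUP => 1 );
    my $copy = "$dir/" . basename($location);
    copy( $location, $copy ) or die "Cannot copy $location: $!";

    # Pick a converter from the file extension.
    my ($ext) = $copy =~ /\.([^.\/]+)$/;
    my $converter = $CONVERTERS{ lc( $ext // '' ) }
        or die "No converter for $location";

    return $converter->($copy);
}
```

Because the whole file is copied and then read into memory, the sketch shares the caveat about very large files.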
- $escaped_url = ParaTools::Utils::url_escape($string)
Simple function to convert a string into an encoded URL (e.g. spaces become %20). Takes the unencoded URL as a parameter and returns the encoded version.
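Percent-encoding of this kind can be sketched in a few lines. The name url_escape_sketch is hypothetical, and the sketch assumes a byte string as input and encodes every character outside the RFC 3986 unreserved set, including ":" and "/"; the module's own function may leave more characters untouched.

```perl
use strict;
use warnings;

sub url_escape_sketch {
    my ($string) = @_;

    # Replace each byte outside A-Z a-z 0-9 - . _ ~ with %XX (uppercase hex).
    $string =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;

    return $string;
}
```

For example, url_escape_sketch("a b") returns "a%20b".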
AUTHOR
Tim Brody <tdb01r@ecs.soton.ac.uk>
Mike Jewell <moj@ecs.soton.ac.uk> (packaging)