NAME
Alvis::NLPPlatform::Convert - Perl extension for converting files in any format into the ALVIS XML format.
SYNOPSIS
use Alvis::NLPPlatform::Convert;
my %config = &Alvis::NLPPlatform::load_config($rcfile);
my $mm = Alvis::NLPPlatform::Convert::load_MagicNumber(\%config);
my $AlvisConverter = Alvis::NLPPlatform::Convert::html2alvis_init(\%config);
Alvis::NLPPlatform::Convert::conversion_file_to_alvis_xml($ARGV[0], $AlvisConverter, \%config, $mm);
DESCRIPTION
This module provides methods to convert input files into the ALVIS XML format. It determines the type of each input file from its magic number and applies the matching converter. Output files are stored in a temporary spool.
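The detect-then-convert flow described above can be sketched as a dispatch table keyed by file type. This is an illustration only: the type labels, the %converter_for table and convert_by_type are hypothetical names and are not part of Alvis::NLPPlatform::Convert.

```perl
use strict;
use warnings;

# Hypothetical dispatch table: detected type => converter routine.
# Each converter here only labels its input; a real converter would
# produce ALVIS XML.
my %converter_for = (
    'text/html'  => sub { return "html-converter($_[0])" },
    'text/plain' => sub { return "text-converter($_[0])" },
);

# Look up the converter registered for a detected type and apply it.
# Returns undef (in scalar context) when no converter is registered.
sub convert_by_type {
    my ($type, $file) = @_;
    my $converter = $converter_for{$type};
    return unless defined $converter;
    return $converter->($file);
}

my $result = convert_by_type('text/html', 'page.html');
print "$result\n" if defined $result;
```

Keeping the mapping in a table means a new format only needs a new entry, and an unknown type is reported by an undefined return value rather than by a failure deep inside a converter.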
METHODS
load_MagicNumber
load_MagicNumber(\%config);
This method loads additional magic-number definitions from a supplementary file. The file is named by the variable SupplMagicFile in the section CONVERTER of the configuration.
It returns the object containing the list of magic numbers.
html2alvis_init
html2alvis_init(\%config);
The method initializes the HTML2XML Alvis converter. It also determines the directory where the output files will be stored: either the directory given by the variable ALVISTMP or, by default, the current directory. The start number of the files is also determined.
The method returns the Alvis converter (i.e. from HTML file to Alvis DTD XML).
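The fallback from ALVISTMP to the current directory can be sketched as follows. This is a minimal sketch, not the module's code: choose_output_dir is a hypothetical helper, and reading ALVISTMP as a top-level key of the configuration hash is an assumption.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Return the configured temporary directory if one is set,
# otherwise fall back to the current working directory.
# ASSUMPTION: ALVISTMP is a top-level key of the config hash;
# the real module may store it elsewhere.
sub choose_output_dir {
    my ($config) = @_;
    my $dir = $config->{ALVISTMP};
    return (defined $dir && length $dir) ? $dir : getcwd();
}

print choose_output_dir({ ALVISTMP => '/tmp/alvis' }), "\n";
print choose_output_dir({}), "\n";
```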
conversion_file_to_alvis_xml
conversion_file_to_alvis_xml($file, $AlvisConv, $config, $mm);
The method converts the input file $file into the Alvis XML format. The other arguments are the Alvis converter $AlvisConv, the NLP platform configuration ($config), which provides the command lines for conversion, and the additional magic numbers ($mm).
html2alvis
html2alvis($file, $Alvis_converter);
The method converts the HTML file $file into the ALVIS XML format (using the ALVIS converter $Alvis_converter) and stores the output file in the temporary spool directory.
It returns a nonzero value if it fails.
make_meta
make_meta($filename);
The method generates the meta information associated with $filename, filled with default values (title, date and url), and returns it.
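To make the idea of default meta information concrete, here is a hedged sketch that fills the three fields named above. The helper default_meta, the hash layout and the chosen defaults (base name as title, today's date, the path as url) are all assumptions for illustration; the format actually produced by make_meta is not described in this document.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use POSIX qw(strftime);

# Build default metadata for a file and return it as a hash reference.
# ASSUMPTION: these defaults are illustrative, not taken from the module.
sub default_meta {
    my ($filename) = @_;
    return {
        title => basename($filename),
        date  => strftime('%Y-%m-%d', localtime),
        url   => $filename,
    };
}

my $meta = default_meta('/data/docs/report.html');
print "$_: $meta->{$_}\n" for sort keys %$meta;
```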
outputting_empty_xmlns_file
outputting_default_xmlns_file($outdata, $outfile, $AlvisConverter, $config, $mm);
The method prints the output data $outdata (defined in an empty XML namespace) into the temporary file $outfile, and carries out the conversion to the ALVIS XML format with $AlvisConverter.
Additional parameters are the configuration $config and the additional magic numbers $mm.
applying_stylesheet
applying_stylesheet($file, $xmlns, $config);
This method applies the XML style sheet defined for the namespace $xmlns in the configuration $config to the file $file.
The method returns a two-element array containing the XML namespace and the XML data.
get_type_file
get_type_file($file, $mm);
The method determines and returns the type of the file $file according to its magic number (using the list $mm) and, in the case of "msword", according to the extension of the file (for PowerPoint and Excel files).
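The extension check matters because legacy Word, PowerPoint and Excel files share the same OLE2 container signature, so the leading bytes alone cannot tell them apart. The sketch below shows the general technique under stated assumptions: guess_type, the signature table and the labels "powerpoint", "excel" and "unknown" are hypothetical, and the real method works from the list $mm rather than a hard-coded table.

```perl
use strict;
use warnings;

# Hypothetical signature table: leading bytes => type label.
# The OLE2 signature is common to legacy Word, PowerPoint and Excel files.
my @signatures = (
    [ "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1" => 'msword' ],
    [ '%PDF'                             => 'pdf'    ],
);

# Hypothetical refinement of the "msword" label by file extension.
my %ole_type_for = ( ppt => 'powerpoint', xls => 'excel' );

sub guess_type {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    my $head = '';
    read $fh, $head, 8;    # 8 bytes cover the longest signature above
    close $fh;

    for my $entry (@signatures) {
        my ($magic, $type) = @$entry;
        next unless index($head, $magic) == 0;   # must match at offset 0
        if ($type eq 'msword' && $file =~ /\.(ppt|xls)\z/i) {
            return $ole_type_for{ lc $1 };
        }
        return $type;
    }
    return 'unknown';
}
```

Calling guess_type on a file whose first bytes match no entry yields "unknown", which leaves the caller free to skip the file or try another detection method.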
outputting_alvis_from_file
outputting_alvis_from_file($alvisfile, $Alvis_converter, $config);
The method formats and outputs the file $alvisfile. It loads the file and applies the ALVIS converter. The language of the document(s) is identified at this point by the method defined in the Alvis::NLPPlatform::Document module.
The $config parameter is the hash table containing the configuration variables.
outputting_alvis
outputting_alvis($alvisXML, $Alvis_converter, $config);
The method outputs the data contained in $alvisXML using the ALVIS converter. The $config parameter is the hash table containing the configuration variables.
making_spool
making_spool($config, $outputRootDir);
The method creates the spool directory defined in $config, starting from $outputRootDir.
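Creating a spool directory beneath a root directory can be sketched with core modules only. This is a minimal sketch, not the module's implementation: ensure_spool is a hypothetical helper, and it takes the spool name as a plain argument because the configuration key that holds it is not named in this document.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Create a spool directory beneath a root directory and return its path.
# ASSUMPTION: the spool name is passed in directly; the real method reads
# it from the configuration.
sub ensure_spool {
    my ($root, $spool_name) = @_;
    my $dir = File::Spec->catdir($root, $spool_name);
    make_path($dir) unless -d $dir;    # also creates missing parents
    die "Cannot create $dir" unless -d $dir;
    return $dir;
}

# Usage with a throwaway root so the example leaves nothing behind.
my $root = tempdir(CLEANUP => 1);
print ensure_spool($root, 'spool'), "\n";
```

Calling ensure_spool a second time with the same arguments is harmless, since the directory is only created when it does not already exist.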
SEE ALSO
Alvis web site: http://www.alvis.info
AUTHOR
Thierry Hamon <thierry.hamon@lipn.univ-paris13.fr>
LICENSE
Copyright (C) 2007 by Thierry Hamon
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.