news_xml2alvis.pl - news XML to Alvis XML converter
SYNOPSIS
news_xml2alvis.pl [options] [source directory ...]
Options:
--xml-ext          XML file identifying filename extension
--meta-ext         meta file identifying filename extension
--out-dir          output directory
--N-per-out-dir    number of records per output directory
--meta-encoding    the encoding of the meta files
--help             brief help message
--man              full documentation
--[no]warnings     warnings output flag
OPTIONS
--xml-ext
Sets the XML file identifying filename extension.
Default value: 'xml'.
--meta-ext
Sets the meta file identifying filename extension.
Default value: 'meta'.
--out-dir
Sets the output directory. Default value: '.'.
--N-per-out-dir
Sets the number of records per output directory. Default value: 1000.
--meta-encoding
Specifies the encoding of the meta files. Default value: 'iso-8859-1'.
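As a rough illustration of what --N-per-out-dir implies, the following Python sketch assigns records to numbered subdirectories of the output directory, N records each. The function name out_dir_for and the subdirectory numbering scheme are assumptions made for illustration only; the layout actually produced by news_xml2alvis.pl may differ.

```python
import os

def out_dir_for(record_index, out_dir=".", n_per_out_dir=1000):
    """Return a hypothetical output subdirectory for a 0-based record index."""
    # Assumption: subdirectories are simply numbered 0, 1, 2, ...
    # and each one receives n_per_out_dir consecutive records.
    return os.path.join(out_dir, str(record_index // n_per_out_dir))
```

With the default of 1000, records 0 to 999 would share one subdirectory and record 1000 would start the next one.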
DESCRIPTION
Recursively goes through the files under each source directory
and converts them to Alvis XML files. Meta information (such
as the URL, the detected character set, or the title of the document)
can be given in a separate meta file, one per document,
recognized by the shared basename. For example, if the XML document is
called foo.news and the meta information is in foo.meta,
news_xml2alvis.pl should be called like this:
news_xml2alvis.pl --xml-ext news --meta-ext meta
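The basename matching described above can be sketched in Python as follows. This is not the script's own code: the function name find_pairs and the choice to yield None when no meta file exists are assumptions made for illustration.

```python
import os

def find_pairs(source_dir, xml_ext="xml", meta_ext="meta"):
    """Yield (xml_path, meta_path_or_None) for every XML file under source_dir."""
    # os.walk descends recursively into all subdirectories.
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        names = set(filenames)
        for name in sorted(filenames):
            base, ext = os.path.splitext(name)
            if ext != "." + xml_ext:
                continue  # not an XML document according to its extension
            # The meta file shares the basename, e.g. foo.news and foo.meta.
            meta_name = base + "." + meta_ext
            meta_path = os.path.join(dirpath, meta_name) if meta_name in names else None
            yield os.path.join(dirpath, name), meta_path
```

Calling find_pairs(some_dir, "news", "meta") mirrors the command line shown above: a file foo.news is paired with foo.meta from the same directory when that file exists.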
The news XML files are expected to be of the format
<DOCUMENT>
  <article>
    <date></date>
    <iso-date></iso-date>
    <title></title>
    <content></content>
    <links>
      <link type="a">
        <location></location>
      </link>
    </links>
  </article>
</DOCUMENT>
and meta files of the format
<feature name>\t<feature value>\n
Special features are url, title, date, and detectedCharSet.
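To make the two input formats concrete, here is a minimal Python sketch that reads them. It is not how news_xml2alvis.pl itself is implemented. The function names parse_news and parse_meta, the sample values, and the closing </DOCUMENT> tag in the sample are assumptions for illustration; the default encoding matches the documented --meta-encoding default.

```python
import xml.etree.ElementTree as ET

# Made-up sample document following the element layout described above.
SAMPLE_XML = """<DOCUMENT>
<article>
<date>1 January 2006</date>
<iso-date>2006-01-01</iso-date>
<title>Example title</title>
<content>Example content.</content>
<links>
<link type="a">
<location>http://www.example.com/</location>
</link>
</links>
</article>
</DOCUMENT>"""

def parse_news(xml_text):
    """Extract the fields of each <article> element into a dictionary."""
    root = ET.fromstring(xml_text)
    articles = []
    for article in root.iter("article"):
        articles.append({
            "date": article.findtext("date", default=""),
            "iso-date": article.findtext("iso-date", default=""),
            "title": article.findtext("title", default=""),
            "content": article.findtext("content", default=""),
            # Each link is kept as a (type attribute, location text) pair.
            "links": [
                (link.get("type"), link.findtext("location", default=""))
                for link in article.iter("link")
            ],
        })
    return articles

def parse_meta(raw_bytes, encoding="iso-8859-1"):
    """Parse tab-separated feature name/value lines into a dictionary."""
    features = {}
    for line in raw_bytes.decode(encoding).split("\n"):
        line = line.rstrip("\r")
        if "\t" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("\t", 1)
        features[name] = value
    return features
```

A meta file would be read in binary mode and passed to parse_meta, after which the special features can be looked up by name, for example features.get("url") or features.get("detectedCharSet").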