Name

Data::Edit::Xml::Lint - Lint xml files in parallel using xmllint and report the failure rate

Synopsis

Create some sample xml files, some with errors, lint them in parallel and retrieve the number of errors and failing files:

 for my $n(1..$N)                                                              # Some projects
  {my $x = Data::Edit::Xml::Lint::new();                                       # New xml file linter

   my $catalog = $x->catalog = catalogName;                                    # Use catalog if possible
   my $project = $x->project = projectName($n);                                # Project name
   my $file    = $x->file    =    fileName($n);                                # Target file

   $x->source = <<END;                                                         # Sample source
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE concept PUBLIC "-//HPE//DTD HPE DITA Concept//EN" "concept.dtd" []>
<concept id="$project">
<title>Project $project</title>
<conbody>
  <p>Body of $project</p>
</conbody>
</concept>
END

   $x->source =~ s/id="\w+?"//gs if addError($n);                              # Introduce an error into some projects

   $x->lint(foo=>1);                                                           # Write the source to the target file, lint it with xmllint, and record some attributes as comments at the end of the target file
  }

 Data::Edit::Xml::Lint::wait;                                                  # Wait for lints to complete

 for my $n(1..$N)                                                              # Check each linted file
  {my $x = Data::Edit::Xml::Lint::read(fileName($n));                          # Reload the linted file
   ok $x->{foo}   == 1;                                                        # Check the reloaded attributes
   ok $x->project eq projectName($n);                                          # Check project name for file
   ok $x->errors  == addError($n);                                             # Check errors in file
  }

 my $report = Data::Edit::Xml::Lint::report($outDir, "xml");                   # Report total pass fail rate
 ok $report->passRatePercent  == 50;
 ok $report->numberOfProjects ==  3;
 ok $report->numberOfFiles    == $N;
 say STDERR $report->print;                                                    # Print report

Produces:

50 % success converting 3 projects containing 10 xml files on 2017-07-13 at 17:43:24

ProjectStatistics
   #  Percent   Pass  Fail  Total  Project
   1  33.3333      1     2      3  aaa
   2  50.0000      2     2      4  bbb
   3  66.6667      2     1      3  ccc

FailingFiles
   #  Errors  Project       File
   1       1  ccc           out/ccc5.xml
   2       1  aaa           out/aaa9.xml
   3       1  bbb           out/bbb1.xml
   4       1  bbb           out/bbb7.xml
   5       1  aaa           out/aaa3.xml
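
The synopsis calls several helpers that it does not define. The sketch below is one hypothetical set of definitions, chosen only because they reproduce the project names, file names and failure counts in the sample report above. The names $N, $outDir, projectName, fileName and addError are taken from the synopsis, but their bodies are assumptions, and the catalogName path is a placeholder.

```perl
use strict;
use warnings;

# Hypothetical helpers: assumptions that happen to reproduce the sample report
my $N      = 10;                                # Number of sample files
my $outDir = "out";                             # Output directory seen in the report
my @names  = qw(aaa bbb ccc);                   # The three projects in the report

sub projectName {my ($n) = @_; $names[$n % @names]}                  # Project that file $n belongs to
sub fileName    {my ($n) = @_; "$outDir/".projectName($n)."$n.xml"}  # Target file for file $n
sub addError    {my ($n) = @_; $n % 2}                               # 1 for odd $n: damage this file
sub catalogName {"$outDir/catalog.xml"}                              # Placeholder catalog path

# List the files that will be given an error
print join(" ", map {fileName($_)} grep {addError($_)} 1..$N), "\n";
# out/bbb1.xml out/aaa3.xml out/ccc5.xml out/bbb7.xml out/aaa9.xml
```

With these definitions the odd numbered files fail, which gives the five failing files and the 50 % pass rate shown in the report.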

Description

Constructor

Construct a new linter

new

Create a new xml linter - call this method statically as in Data::Edit::Xml::Lint::new()

Attributes

Attributes describing a lint

file :lvalue

File that the xml will be written to and read from

catalog :lvalue

Optional catalog file containing the locations of the DTDs used to validate the xml

dtds :lvalue

Optional directory containing the DTDs used to validate the xml

errors :lvalue

Number of lint errors detected by xmllint

linted :lvalue

Date the lint was performed

project :lvalue

Optional project name to allow error counts to be aggregated by project

processes :lvalue

Maximum number of lint processes to run in parallel - 8 by default

sha256 :lvalue

SHA-256 digest of the string containing the xml written or read

source :lvalue

String containing the xml to be written or the xml read
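
Each attribute above is marked :lvalue, which means the method call can appear on the left of an assignment or be modified in place, as the synopsis does with $x->source. A minimal sketch of how such accessors behave, using a hypothetical class Sketch::Linter rather than the module's own code:

```perl
use strict;
use warnings;

package Sketch::Linter;                         # Hypothetical class for illustration only

sub new              {bless {}, shift}          # Empty hash based object
sub file    :lvalue  {$_[0]{file}}              # Assignable accessor backed by a hash slot
sub project :lvalue  {$_[0]{project}}           # Assignable accessor backed by a hash slot

package main;

my $x = Sketch::Linter->new;
$x->file    = "out/aaa1.xml";                   # Assign through the method call
$x->project = "aaa";
$x->file    =~ s/\.xml\z/.dita/;                # Modify the stored value in place

print $x->file, " ", $x->project, "\n";         # out/aaa1.dita aaa
```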

Lint

Lint xml files in parallel

lint

Store some xml in a file and apply xmllint in parallel

   Parameter    Description
1  $lint        Linter
2  %attributes  Attributes to be recorded as xml comments

read

Reload a linted xml file and extract attributes

   Parameter  Description
1  $file      File containing xml
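
lint records attributes as xml comments and read extracts them again. The sketch below illustrates that round trip under an assumed comment layout of <!--name: value -->; the layout and the helpers appendAttributes and extractAttributes are hypothetical and may differ from what the module actually writes.

```perl
use strict;
use warnings;

sub appendAttributes                            # Hypothetical: add attributes as trailing comments
 {my ($xml, %attributes) = @_;
  $xml .= "\n" unless $xml =~ /\n\z/;           # Make sure the comments start on a new line
  $xml .= "<!--$_: $attributes{$_} -->\n" for sort keys %attributes;
  $xml
 }

sub extractAttributes                           # Hypothetical: recover attributes from comments
 {my ($xml) = @_;
  my %attributes = $xml =~ /<!--(\w+):\s*(.*?)\s*-->/g;   # Name and value pairs
  \%attributes
 }

my $linted = appendAttributes(qq(<concept id="aaa"/>\n), foo=>1, project=>"aaa");
my $found  = extractAttributes($linted);
print "$_=$found->{$_}\n" for sort keys %$found;
```

Values containing -- cannot be stored this way because xml comments do not allow that sequence.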

wait()

Wait for all lints to finish
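
The processes attribute caps the number of parallel lints and wait blocks until all of them finish. One common way to get this behaviour in Perl is fork with a counter of live children. The sketch below shows that technique with hypothetical helpers startJob, reapOne and waitAll; it is not the module's implementation, and a short sleep stands in for running xmllint.

```perl
use strict;
use warnings;

my $maxProcesses = 8;                           # Assumed cap, mirroring the documented default
my $running      = 0;                           # Child processes currently alive
my $finished     = 0;                           # Child processes reaped so far

sub reapOne                                     # Block until one child exits, false if none remain
 {my $pid = wait;
  return 0 if $pid < 0;                         # wait returns -1 when there are no children
  $running--; $finished++;
  1
 }

sub startJob                                    # Run a code reference in a child process
 {my ($job) = @_;
  while ($running >= $maxProcesses)             # At the cap: wait for a free slot
   {reapOne() or last;
   }
  my $pid = fork;
  die "Cannot fork: $!" unless defined $pid;
  if ($pid == 0) {$job->(); exit 0}             # Child: do the work, then exit
  $running++;                                   # Parent: record the new child
 }

sub waitAll {1 while reapOne()}                 # Reap every remaining child

startJob(sub {select undef, undef, undef, 0.01}) for 1..20;   # Stand-in for 20 lint jobs
waitAll();
print "$finished jobs finished, $running still running\n";
```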

clear

Clear the results of a prior run

   Parameter         Description
1  $outputDirectory  Directory to clear
2  @fileExtensions   Extensions of files to remove
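
As a rough illustration of what clearing a prior run involves, the sketch below removes the files with the given extensions from one directory. clearDirectory is a hypothetical stand-in, and it assumes a non-recursive scan; the module's clear may behave differently.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

sub clearDirectory                              # Hypothetical stand-in for clear, non-recursive
 {my ($outputDirectory, @fileExtensions) = @_;
  my %wanted  = map {$_ => 1} @fileExtensions;  # Extensions to remove
  my $removed = 0;
  opendir(my $handle, $outputDirectory) or die "Cannot open $outputDirectory: $!";
  for my $file (readdir $handle)
   {my ($extension) = $file =~ /\.([^.]+)\z/ or next;          # Skip names with no extension
    next unless $wanted{$extension} and -f "$outputDirectory/$file";
    $removed += unlink "$outputDirectory/$file";               # unlink returns files deleted
   }
  closedir $handle;
  $removed
 }

my $dir = tempdir(CLEANUP => 1);                # Scratch directory for the demonstration
for my $name (qw(a.xml b.xml c.txt))
 {open(my $out, '>', "$dir/$name") or die "Cannot write $name: $!";
  close $out;
 }
my $removed = clearDirectory($dir, "xml");
print "$removed files removed\n";               # 2 files removed
```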

Report

Methods for reporting the results of linting several files

report

Analyse the results of prior lints and return a hash reporting various statistics and a printable report

   Parameter         Description
1  $outputDirectory  Directory to search
2  @fileExtensions   Types of files to analyze

Attributes

passRatePercent :lvalue

Number of files that passed as a percentage of all files analyzed

timestamp :lvalue

Timestamp of report

numberOfProjects :lvalue

Number of projects defined - each project can contain zero or more files

numberOfFiles :lvalue

Number of files encountered

failingFiles :lvalue

Array of [number of errors, project, file] ordered from least to most errors

print :lvalue

A printable report of the above
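
The arithmetic behind these attributes can be sketched from a list of [number of errors, project, file] entries. The data below is rebuilt from the sample report in the synopsis; the variable names are illustrative and the code is not the module's implementation.

```perl
use strict;
use warnings;

# One [errors, project, file] entry per file, matching the sample report
my @files = map
 {my $project = (qw(aaa bbb ccc))[$_ % 3];
  [$_ % 2, $project, "out/$project$_.xml"]
 } 1..10;

my @failingFiles    = sort {$a->[0] <=> $b->[0]}        # Least to most errors
                      grep {$_->[0] > 0} @files;
my $passRatePercent = 100 * (scalar(@files) - scalar(@failingFiles)) / scalar(@files);

my %projects;                                           # Pass, fail and total per project
for my $f (@files)
 {my ($errors, $project) = @$f;
  $projects{$project}{total}++;
  $projects{$project}{$errors ? 'fail' : 'pass'}++;
 }

printf "%d %% success, %d projects, %d files\n",
  $passRatePercent, scalar(keys %projects), scalar(@files);
# 50 % success, 3 projects, 10 files

printf "%-3s %8.4f\n", $_, 100 * ($projects{$_}{pass} // 0) / $projects{$_}{total}
  for sort keys %projects;
```

The per project percentages come out as 33.3333, 50.0000 and 66.6667, which agrees with the ProjectStatistics table in the synopsis.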

Index

catalog

clear

dtds

errors

failingFiles

file

lint

linted

new

numberOfFiles

numberOfProjects

passRatePercent

print

processes

project

read

report

sha256

source

timestamp

wait()

Installation

This module is written in 100% Pure Perl and is thus easy to read, use, modify and install.

Standard Module::Build process for building and installing modules:

perl Build.PL
./Build
./Build test
./Build install

Author

philiprbrenan@gmail.com

http://www.appaapps.com

Copyright

Copyright (c) 2016-2017 Philip R Brenan.

This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.