Name

Data::Edit::Xml::Xref - Cross reference Dita XML.

Synopsis

Check the references in a set of Dita XML documents held in folder inputFolder:

 use Data::Edit::Xml::Xref;

 my $x = xref(inputFolder=>q(in));
 ok nws($x->statusLine) eq nws(<<END);
Xref:
10 bad first lines,
10 bad second lines,
 9 bad conrefs,
 9 bad xrefs,
 8 duplicate ids,
 8 missing image files,
 8 missing image references,
 3 bad topicrefs,
 2 duplicate topic ids,
 1 bad book map,
 1 file failed to parse,
 1 file not referenced
END

The counts listed in the statusLine are counts of the files that have the described problem, not counts of every instance of the problem across all the files, which would be larger.
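
The difference between the two kinds of count can be illustrated with a small, self-contained sketch. The hash below is hypothetical sample data shaped like the {file}{href} attributes described later in this document; it is not produced by the module.

```perl
use strict;
use warnings;

# Hypothetical sample data: {file}{href} = number of times the bad xref occurs
my %badXrefs = (
  'a.dita' => { 'missing.dita#t1' => 2, 'gone.dita' => 1 },
  'b.dita' => { 'missing.dita#t1' => 1 },
);

# Number of files with at least one problem: the style of count shown in the status line
my $filesWithProblem = keys %badXrefs;

# Number of instances of the problem across all files: normally larger
my $instances = 0;
for my $file (sort keys %badXrefs) {
  $instances += $_ for values %{ $badXrefs{$file} };
}

print "$filesWithProblem files, $instances instances\n";
```

With this sample data, two files are affected although the bad xrefs occur four times in total.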

More detailed reports are produced in the reports folder:

$x->reports

Description

Cross reference Dita XML.

Version 20190121.

The following sections describe the methods in each functional area of this module. For an alphabetic listing of all methods by name see Index.

Cross reference

Check the cross references in a set of Dita files and report the results.

xref(%)

Check the cross references in a set of Dita files held in inputFolder and report the results in the reports folder. The possible attributes are listed under Data::Edit::Xml::Xref Definition.

   Parameter    Description
1  %attributes  Attributes

Example:

  my $N = 8;

Hash Definitions

Data::Edit::Xml::Xref Definition

Attributes used by the Xref cross referencer.

attributeCount - {file}{attribute} == count of the different xml attributes found in the xml files.

author - {file} = author of file

badBookMaps - Bad book maps

badConRefs - {sourceFile} = [file, href] indicating the file has at least one bad conref

badConRefsList - Bad conrefs - by file

badGuidHrefs - Bad conrefs - all

badImageRefs - Consolidated images missing.

badTables - Array of tables that need fixing

badTopicRefs - [file, href] Invalid href attributes found on topicref tags.

badXRefs - Bad Xrefs - by file

badXRefsList - Bad Xrefs - all

badXml1 - [Files] with a bad xml encoding header on the first line.

badXml2 - [Files] with a bad xml doc type on the second line.

conRefs - {file}{href} Count of conref definitions in each file.

docType - {file} == docType: the docType for each xml file.

duplicateIds - [file, id] Duplicate id definitions within each file.

duplicateTopicIds - [topicId, [files]] Files with duplicate topic ids - the id on the outermost tag.

fileExtensions - Default file extensions to load

fixBadRefs - Try to fix bad references in these files where possible, either by changing a guid to a file name (assuming the right file is present) or, failing that, by moving the failing reference to the "xtrf" attribute.

fixRefs - {file}{ref} where the href or conref target is not present.

fixedRefs - [] hrefs and conrefs from fixRefs

goodBookMaps - Good book maps

goodConRefs - Good con refs - by file

goodConRefsList - Good con refs - all

goodGuidHrefs - {file}{href}{location}++ where a href that starts with GUID- has been correctly resolved

goodImageRefs - Consolidated images found.

goodTopicRefs - Good topic refs

goodXRefs - Good xrefs - by file

goodXRefsList - Good xrefs - all

guidHrefs - {file}{href} = location where href starts with GUID- and is thus probably a guid

guidToFile - {topic id which is a guid} = file defining topic id

ids - {file}{id} Id definitions across all files.

images - {file}{href} Count of image references in each file.

improvements - Improvements needed

inputFiles - Input files from inputFolder.

inputFolder - A folder containing the dita and ditamap files to be cross referenced.

inputFolderImages - {filename} = full file name which works well for images because the md5 sum in their name is probably unique

maximumNumberOfProcesses - Maximum number of processes to run in parallel at any one time.

md5Sum - MD5 sum for each input file

missingImageFiles - [file, href] == Missing images in each file.

missingTopicIds - Missing topic ids

noHref - Tags that should have an href but do not have one

notReferenced - Files in the input area that are not referenced by a conref, image, topicref or xref tag and are not bookmaps.

parseFailed - [file] files that failed to parse

reports - Reports folder: the cross referencer will write reports to files in this folder.

results - Summary of results table

sourceFile - The source file from which this structure was generated

statusLine - Status line summarizing the cross reference.

statusTable - Status table summarizing the cross reference.

summary - Print the summary line.

tagCount - {file}{tags} == count of the different tag names found in the xml files.

title - {file} = title of file

topicIds - {file} = topic id - the id on the outermost tag.

topicRefs - {file}{href}++ References from bookmaps to topics via appendix, chapter, topicref.

unixPath - Path to be used to name files on unix in reports

validationErrors - True means that Lint detected errors in the xml contained in the file

windowsPath - Path to be used to name files on windows in reports

xRefs - {file}{href}++ Xrefs references.

xrefBadFormat - External xrefs with no format=html

xrefBadScope - External xrefs with no scope=external

Attributes

The following is a list of all the attributes in this package. A method coded with the same name in your package will override the method of the same name in this package and thus provide your value for the attribute in place of the default value supplied for this attribute by this package.

Replaceable Attribute List

improvementLength

improvementLength

Improvement length

Private Methods

lll(@)

Write a message

   Parameter  Description
1  @m         Message text

countLevels($$)

Count hash elements to the specified number of levels

   Parameter  Description
1  $l         Levels
2  $h         Hash
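
One plausible reading of this description, sketched below, is that level 1 counts the top-level keys and level 2 counts the keys one level further down. The helper countToDepth and the sample data are hypothetical; this is not the module's own implementation and its exact semantics may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: count the keys found at depth $levels of a nested hash
sub countToDepth {
  my ($levels, $hash) = @_;
  return scalar keys %$hash if $levels <= 1;          # Count the keys at this level
  my $n = 0;
  for my $value (values %$hash) {                     # Descend into each nested hash
    $n += countToDepth($levels - 1, $value) if ref($value) eq 'HASH';
  }
  return $n;
}

# Hypothetical sample data shaped like {file}{id}
my %ids = (
  'a.dita' => { i1 => 1, i2 => 1 },
  'b.dita' => { i1 => 1 },
);

print countToDepth(1, \%ids), "\n";                   # Number of files
print countToDepth(2, \%ids), "\n";                   # Number of {file}{id} pairs
```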

windowsFile($$)

Format file name for easy use on windows

   Parameter  Description
1  $xref      Xref
2  $file      File

unixFile($$)

Format file name for easy use on unix

   Parameter  Description
1  $xref      Xref
2  $file      File
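
As a rough illustration of what formatting a file name for windows could involve, the sketch below swaps a leading unix folder for a windows folder and converts the separators. The helper toWindowsName, its parameters and the sample paths are all assumptions made for this example; the module's windowsFile and unixFile methods may work differently.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite a unix file name so that it can be pasted into a windows tool
sub toWindowsName {
  my ($file, $unixPrefix, $windowsPrefix) = @_;
  my $name = $file;
  $name =~ s/\A\Q$unixPrefix\E/$windowsPrefix/;       # Swap the leading folder
  $name =~ s{/}{\\}g;                                 # Convert the remaining separators
  return $name;
}

print toWindowsName('/home/docs/in/a.dita', '/home/docs/', 'C:\\docs\\'), "\n";
```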

formatFileNames($$$)

Format file names for easy use on unix and windows

   Parameter  Description
1  $xref      Xref
2  $array     Array of arrays containing file names in unix format
3  $column    Column containing file names

loadInputFiles($)

Load the names of the files to be processed

   Parameter  Description
1  $xref      Cross referencer

analyzeOneFile($)

Analyze one input file

   Parameter  Description
1  $iFile     File to analyze

reportGuidsToFiles($)

Map and report guids to files

   Parameter  Description
1  $xref      Xref results

fixOneFile($$)

Fix one file by moving unresolved references to the xtrf attribute

   Parameter  Description
1  $xref      Xref results
2  $file      File to fix

fixFiles($)

Fix files by moving unresolved references to the xtrf attribute

   Parameter  Description
1  $xref      Xref results

analyze($)

Analyze the input files

   Parameter  Description
1  $xref      Cross referencer

reportDuplicateIds($)

Report duplicate ids

   Parameter  Description
1  $xref      Cross referencer
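
Detecting duplicate ids within one file comes down to counting how often each id is defined. A minimal sketch, using a hypothetical helper duplicatedIds and made-up ids rather than the module's own code:

```perl
use strict;
use warnings;

# Hypothetical helper: return, in sorted order, the ids that occur more than once
sub duplicatedIds {
  my (@ids) = @_;
  my %seen;
  $seen{$_}++ for @ids;                               # Count each id definition
  my @duplicates = grep { $seen{$_} > 1 } sort keys %seen;
  return @duplicates;
}

# Ids as they might be collected from a single file
my @dups = duplicatedIds(qw(intro step1 step2 step1 intro summary));
print "@dups\n";
```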

reportDuplicateTopicIds($)

Report duplicate topic ids

   Parameter  Description
1  $xref      Cross referencer

reportNoHrefs($)

Report locations where an href was expected but not found

   Parameter  Description
1  $xref      Cross referencer

reportRefs($$)

Report bad references found in xrefs or conrefs as they have the same structure

   Parameter  Description
1  $xref      Cross referencer
2  $type      Type of reference to be processed

reportGuidHrefs($)

Report on guid hrefs

   Parameter  Description
1  $xref      Cross referencer

reportXrefs($)

Report bad xrefs

   Parameter  Description
1  $xref      Cross referencer

reportTopicRefs($)

Report bad topic refs

   Parameter  Description
1  $xref      Cross referencer

reportConrefs($)

Report bad conrefs

   Parameter  Description
1  $xref      Cross referencer

reportImages($)

Reports on images and references to images

   Parameter  Description
1  $xref      Cross referencer

reportParseFailed($)

Report failed parses

   Parameter  Description
1  $xref      Cross referencer

reportXml1($)

Report bad xml on line 1

   Parameter  Description
1  $xref      Cross referencer

reportXml2($)

Report bad xml on line 2

   Parameter  Description
1  $xref      Cross referencer
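
The two checks above look at the first and second lines of each file. The sketch below shows one way such a check might be written. It assumes that a good first line is an XML declaration and that a good second line starts a DOCTYPE; the module's actual criteria may be stricter, and the helper checkFirstTwoLines is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return (badXml1, badXml2) flags for the text of one document
sub checkFirstTwoLines {
  my ($text) = @_;
  my ($line1, $line2) = split /\n/, $text;
  # Assumption: line 1 must be an XML declaration
  my $badXml1 = (defined $line1 && $line1 =~ /\A<\?xml\s[^>]*\?>\s*\z/) ? 0 : 1;
  # Assumption: line 2 must start a DOCTYPE
  my $badXml2 = (defined $line2 && $line2 =~ /\A<!DOCTYPE\s/) ? 0 : 1;
  return ($badXml1, $badXml2);
}

my $good = qq(<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">\n<concept id="c1"/>\n);
my $bad  = qq(<concept id="c1"/>\n);

print join(' ', checkFirstTwoLines($good)), "\n";     # Both lines acceptable
print join(' ', checkFirstTwoLines($bad)),  "\n";     # Both lines flagged
```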

reportDocTypeCount($)

Report doc type count

   Parameter  Description
1  $xref      Cross referencer

reportTagCount($)

Report tag counts

   Parameter  Description
1  $xref      Cross referencer

reportAttributeCount($)

Report attribute counts

   Parameter  Description
1  $xref      Cross referencer

reportValidationErrors($)

Report the files known to have validation errors

   Parameter  Description
1  $xref      Cross referencer

checkBookMap($$)

Check whether a bookmap is valid or not

   Parameter  Description
1  $xref      Cross referencer
2  $bookMap   Bookmap

reportBookMaps($)

Report on whether each bookmap is good or bad

   Parameter  Description
1  $xref      Cross referencer

reportTables($)

Report on tables that have problems

   Parameter  Description
1  $xref      Cross referencer

reportFileExtensionCount($)

Report file extension counts

   Parameter  Description
1  $xref      Cross referencer

reportFileTypes($)

Report file type counts - takes too long in series

   Parameter  Description
1  $xref      Cross referencer

reportNotReferenced($)

Report files that are not referenced by any conref, image, topicref or xref and that are not bookmaps.

   Parameter  Description
1  $xref      Cross referencer
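
Conceptually this report is a set difference: the input files, minus every file that something references, minus the bookmaps. A minimal sketch with hypothetical file names, assuming that bookmaps can be recognised by a .ditamap extension:

```perl
use strict;
use warnings;

# Hypothetical inputs: all input files, and the files reached by some conref, image, topicref or xref
my @inputFiles = qw(book.ditamap c1.dita c2.dita orphan.dita logo.png);
my %referenced = map { $_ => 1 } qw(c1.dita c2.dita logo.png);

# Keep the files that nothing references and that are not bookmaps
# Assumption: a bookmap is any file whose name ends in .ditamap
my @notReferenced = grep { !$referenced{$_} && !/\.ditamap\z/ } @inputFiles;

print "@notReferenced\n";
```

Here only orphan.dita would be reported: the bookmap is excluded even though nothing refers to it.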

reportExternalXrefs($)

Report external xrefs missing other attributes

   Parameter  Description
1  $xref      Cross referencer

reportPossibleImprovements($)

Report improvements possible

   Parameter  Description
1  $xref      Cross referencer

reportTopicDetails($)

Things that occur once in each file

   Parameter  Description
1  $xref      Cross referencer

reportMd5Sum($)

Good files have short names which uniquely represent their content and thus can be used instead of their md5sum to generate unique names

   Parameter  Description
1  $xref      Cross referencer

createSampleInputFiles($)

Create sample input files for testing. The attribute inputFolder supplies the name of the folder in which to create the sample files.

   Parameter  Description
1  $N         Number of sample files

Index

1 analyze - Analyze the input files

2 analyzeOneFile - Analyze one input file

3 checkBookMap - Check whether a bookmap is valid or not

4 countLevels - Count hash elements to the specified number of levels

5 createSampleInputFiles - Create sample input files for testing.

6 fixFiles - Fix files by moving unresolved references to the xtrf attribute

7 fixOneFile - Fix one file by moving unresolved references to the xtrf attribute

8 formatFileNames - Format file names for easy use on unix and windows

9 lll - Write a message

10 loadInputFiles - Load the names of the files to be processed

11 reportAttributeCount - Report attribute counts

12 reportBookMaps - Report on whether each bookmap is good or bad

13 reportConrefs - Report bad conrefs

14 reportDocTypeCount - Report doc type count

15 reportDuplicateIds - Report duplicate ids

16 reportDuplicateTopicIds - Report duplicate topic ids

17 reportExternalXrefs - Report external xrefs missing other attributes

18 reportFileExtensionCount - Report file extension counts

19 reportFileTypes - Report file type counts - takes too long in series

20 reportGuidHrefs - Report on guid hrefs

21 reportGuidsToFiles - Map and report guids to files

22 reportImages - Reports on images and references to images

23 reportMd5Sum - Good files have short names which uniquely represent their content and thus can be used instead of their md5sum to generate unique names

24 reportNoHrefs - Report locations where an href was expected but not found

25 reportNotReferenced - Report files that are not referenced by any conref, image, topicref or xref and that are not bookmaps.

26 reportParseFailed - Report failed parses

27 reportPossibleImprovements - Report improvements possible

28 reportRefs - Report bad references found in xrefs or conrefs as they have the same structure

29 reportTables - Report on tables that have problems

30 reportTagCount - Report tag counts

31 reportTopicDetails - Things that occur once in each file

32 reportTopicRefs - Report bad topic refs

33 reportValidationErrors - Report the files known to have validation errors

34 reportXml1 - Report bad xml on line 1

35 reportXml2 - Report bad xml on line 2

36 reportXrefs - Report bad xrefs

37 unixFile - Format file name for easy use on unix

38 windowsFile - Format file name for easy use on windows

39 xref - Check the cross references in a set of Dita files held in inputFolder and report the results in the reports folder.

Installation

This module is written in 100% Pure Perl and, thus, it is easy to read, comprehend, use, modify and install via cpan:

sudo cpan install Data::Edit::Xml::Xref

Author

philiprbrenan@gmail.com

http://www.appaapps.com

Copyright

Copyright (c) 2016-2018 Philip R Brenan.

This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.
