Name
Data::Edit::Conversion - Perform a restartable series of steps in parallel.
Synopsis
Launch the conversion of several files, each represented by a project, in parallel processes. The state of each project is saved after each step of the conversion so that a later launch can restart at a later step, which speeds up development by bypassing the initial processing steps unless they are really needed. The data and stepTimes of each project are transferred back from the project's sub process to the main calling process so that the main process can process the results further.
use warnings FATAL=>qw(all);
use strict;
use Test::More tests=>90;
use File::Touch;
use Data::Table::Text qw(:all); # Supplies makePath, clearFolder, writeFile, fpe, readFile
use Data::Edit::Conversion;
my $N = 8; # Number of test files == projects per launch
makePath(my $inDir = q(in)); clearFolder($inDir, 20); # Create and clear folders
my $tAge = File::Touch->new(mtime=>int time - 100); # Age file
$tAge->touch(writeFile(fpe($inDir, $_, q(xml)), <<END)) for 1..$N; # Create and age $N test files
$_
END
my $convert = sub {my ($p) = @_; $p->data = $p->data =~ s(\s) ()gsr x 2}; # Convert one project
my $l = Data::Edit::Conversion::new # Convert $N projects in parallel
(projects => Data::Edit::Conversion::loadProjectsFromFolder($inDir,qw(xml)),
convert =>
[[load => sub {my ($p) = @_; $p->data = readFile($p->source)}], # Load a project
[c1 => $convert],
[c2 => $convert],
[c3 => $convert],
],
maximumNumberOfProcesses => $N,
);
my $verify = sub # Verify launch results
{my (@stepsExecuted) = @_; # Steps that should have been executed
ok $l->projectData($_) eq $_ x 8 for 1..$N; # Check result of each conversion
is_deeply [sort keys %{$l->projectSteps($_)}], [@stepsExecuted] for 1..$N; # Check expected steps have been executed
};
$l->launch; &$verify(qw(c1 c2 c3 load)); # Full run
$l->restart(q(load)); &$verify(qw(c1 c2 c3 load)); # Restart the launch at various points
$l->restart(q(c1)); &$verify(qw(c1 c2 c3));
$l->restart(q(c2)); &$verify(qw(c2 c3));
$l->restart(q(c3)); &$verify(qw(c3));
File::Touch->new(mtime=>int time + 100)->touch(qq($inDir/1.xml)); # Renew source file to force all the steps to be redone despite requesting a restart
$l->restart(q(c2), "After touch");
ok $l->projectData($_) eq $_ x 8 for 1..$N;
is_deeply [sort keys %{$l->projectSteps(1)}], [qw(c1 c2 c3 load)];
is_deeply [sort keys %{$l->projectSteps(2)}], [qw(c2 c3)];
Description
The following sections describe the methods in each functional area of this module. For an alphabetic listing of all methods by name see Index.
Methods
Specify and run the restartable conversion of zero or more files in parallel
new(@)
Create a conversion specification for zero or more files represented by projects.
Parameter Description
1 @attributes Launch Attributes describing the launch
This is a static method and so should be invoked as:
Data::Edit::Conversion::new
launch($$$)
Launch the conversion of several files, each represented by a project, in parallel.
Parameter Description
1 $launch Launch specification
2 $title Optional title
3 $restart Optional name of latest step to restart at.
restart($$$)
Launch the conversion of several files, each represented by a project, in parallel, starting at the specified step. The data saved at the end of the previous step is restored. If that data does not exist, the conversion runs from the latest step prior to this one for which saved data is available, or right from the start if there is none.
Parameter Description
1 $launch Launch specification
2 $restart Step to restart at
3 $title Optional title
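The restart rule described above can be sketched as a small selection function. This is only an illustration under stated assumptions, not the module's code: chooseStartStep, its arguments and the idea of passing in a "source is newer" flag are hypothetical, the last being suggested by the synopsis, where renewing a source file forces all of its steps to be redone.

```perl
use strict;
use warnings;

# Hypothetical helper: return the index of the step at which to start.
#   $steps       - array ref of step names in execution order
#   $restart     - name of the step the caller asked to restart at
#   $saved       - hash ref: step name => true if data saved after that step exists
#   $sourceNewer - true if the source file changed after the data was saved
sub chooseStartStep
 {my ($steps, $restart, $saved, $sourceNewer) = @_;
  return 0 if $sourceNewer;                              # Saved data is stale: redo everything
  my ($want) = grep {$steps->[$_] eq $restart} 0..$#$steps;
  return 0 unless defined $want;                         # Unknown step: start from the beginning
  for(my $i = $want; $i > 0; --$i)                       # Walk back towards the first step
   {return $i if $saved->{$steps->[$i-1]};               # Data from the previous step is available
   }
  0                                                      # Nothing usable was saved
 }
```

With steps load, c1, c2, c3 and saved data for load and c1 only, a request to restart at c3 falls back to c2, because the output of c2 was never saved.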
Launch Attributes
Use these attributes to configure a launch.
convert :lvalue
I [[step name => sub]...] A list of steps and the sub that processes each step. At the end of each step, the data held in the project's data attribute is saved to allow a later restart at the next step.
maximumNumberOfProcesses :lvalue
I Maximum number of processes to run in parallel
out :lvalue
I Optional file output area. This area will be cleared at the start of each launch.
outFileLimit :lvalue
I Limit on the number of files to be cleared from the out folder at the start of each launch.
projects :lvalue
I A reference to a hash of Data::Edit::Conversion::Project definitions. This can be most easily created by using loadProjectsFromFolder.
save :lvalue
I Temporary files will be stored in this folder
stepNumberByName :lvalue
O Get the number of a step from its name
stepsByNumber :lvalue
O Array of steps to be performed. The subs in this array call the user supplied subs after appropriate set up and then perform the required tear down after each step has executed.
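To make the relationship between convert and stepNumberByName concrete, here is a minimal sketch of how a name to number map could be derived from a step list of the documented shape. It is an assumption, not the module's code: the steps here take and return plain data rather than a project, step numbers are taken to start at zero, and runFrom is a hypothetical helper.

```perl
use strict;
use warnings;

# A step list in the documented shape: [[step name => sub], ...]
my @convert =
 ([load => sub {my ($d) = @_; $d}],                      # Pass the data through unchanged
  [c1   => sub {my ($d) = @_; $d x 2}],                  # Double the data
  [c2   => sub {my ($d) = @_; $d x 2}],                  # Double the data again
 );

# Hypothetical derivation of a name to number map (numbering from zero is an assumption)
my @stepNames        = map {$_->[0]} @convert;
my %stepNumberByName = map {$stepNames[$_] => $_} 0..$#stepNames;

# Hypothetical helper: run the steps in order, starting at the named step
sub runFrom
 {my ($start, $data) = @_;
  for my $i ($stepNumberByName{$start}..$#convert)
   {$data = $convert[$i][1]->($data);
   }
  $data
 }
```

Keeping the steps in an array of pairs, rather than a hash, preserves their order, which is what makes a restart "at step N" meaningful.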
loadProjectsFromFolder($@)
Create a project for each file in and below the specified folder that has one of the specified extensions and return the projects created.
Parameter Description
1 $dir Folder to search
2 @extensions List of file extensions to search for
This is a static method and so should be invoked as:
Data::Edit::Conversion::loadProjectsFromFolder
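The kind of search this method performs can be sketched with the core File::Find module. The sketch below is an assumption about the behaviour, not the module's implementation: findProjects is a hypothetical stand-in that returns plain hashes rather than project objects, and it names each project after its file name without the extension, as the synopsis suggests (1.xml becomes project 1).

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(fileparse);

# Hypothetical stand-in: return {name => {name, number, source}} for each matching file
sub findProjects
 {my ($dir, @extensions) = @_;
  my %want = map {lc($_) => 1} @extensions;              # Extensions to accept
  my @files;
  find({no_chdir => 1, wanted => sub                     # With no_chdir, $_ is the full path
   {return unless -f $_;
    my ($ext) = /\.([^.\/\\]+)\z/;                       # Text after the last dot
    push @files, $_ if defined $ext and $want{lc $ext};
   }}, $dir);

  my %projects; my $n = 0;
  for my $file (sort @files)
   {my ($name) = fileparse($file, qr/\.[^.]*/);          # File name without path or extension
    $projects{$name} = {name => $name, number => ++$n, source => $file};
   }
  \%projects
 }
```

Note that this sketch lets a later file silently replace an earlier one with the same base name in a different sub folder; how the module treats such collisions is not stated here.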
projectData($$)
Get data for a project after a launch has completed
Parameter Description
1 $launch Launch specification
2 $projectName Project
projectSteps($$)
Get the step times, showing the execution time in seconds for each step of a project, after a launch has completed. If a step name is not present in this hash then the step was not run.
Parameter Description
1 $launch Launch specification
2 $projectName Project
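As an illustration of how the returned hash might be consumed, the sketch below reports which steps ran and which were skipped. stepReport is a hypothetical helper and the hash passed to it stands in for the value returned by projectSteps; the only property relied upon is the documented one, that a step absent from the hash was not run.

```perl
use strict;
use warnings;

# Hypothetical helper: one report line per step, in execution order
#   $allSteps  - array ref of every step name in the launch
#   $stepTimes - hash ref of step name => seconds, as described for projectSteps
sub stepReport
 {my ($allSteps, $stepTimes) = @_;
  map
   {exists $stepTimes->{$_}
      ? sprintf("%-6s ran in %.3f seconds", $_, $stepTimes->{$_})
      : sprintf("%-6s skipped", $_)
   } @$allSteps
 }
```

A call such as print "$_\n" for stepReport([qw(load c1 c2 c3)], $l->projectSteps(1)) after a restart would then show the bypassed steps as skipped.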
Project
A project is one input file to be converted in one or more restartable steps.
new()
Create a project to describe the conversion of a source file containing xml representing documentation into one or more Dita topics.
This is a static method and so should be invoked as:
Data::Edit::Conversion::Project::new
name :lvalue
I Name of project.
number :lvalue
I Number of the project.
source :lvalue
I Input file containing the source xml.
data :lvalue
O Per project data being converted
stepTimes :lvalue
O Hash of steps processed during a launch
title :lvalue
I Title of the project.
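The :lvalue marking on these attributes is what allows the assignment form $p->data = ... used in the synopsis. A minimal sketch of such an accessor follows; My::Project is a hypothetical class written for illustration and is not the module's project class.

```perl
use strict;
use warnings;

package My::Project;                                     # Hypothetical class for illustration

sub new {my ($class, %attributes) = @_; bless {%attributes}, $class}

sub data   :lvalue {$_[0]{data}}                         # Can be read or assigned to
sub source :lvalue {$_[0]{source}}

package main;

my $p = My::Project->new(source => q(in/1.xml));
$p->data = q(a b);                                       # Assign through the accessor
$p->data = $p->data =~ s(\s) ()gsr x 2;                  # Remove white space, then double
```

The last statement has the same form as the convert sub in the synopsis: the substitution with the r flag returns the modified copy, x 2 repeats it, and the result is assigned back through the accessor.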
Private Methods
defaultMaximumNumberOfProcesses()
Default maximum number of processes to use during the conversion
defaultOutFileLimit()
Default maximum number of files to clear at a time.
stepSaveFile($$$)
Save file for a project and a step
Parameter Description
1 $launch Launch specification
2 $projectName Project
3 $step Step name
deleteProject($$$)
Delete results before executing a particular step
Parameter Description
1 $launch Launch specification
2 $projectName Project
3 $step Step
saveProject($$$)
Save project at a particular step
Parameter Description
1 $launch Launch specification
2 $projectName Project
3 $step Step
loadProject($$$)
Load a project at a particular step
Parameter Description
1 $launch Launch specification
2 $projectName Project
3 $stepNumber Step to reload
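Saving and reloading a project at a step can be sketched with the core Storable module. Everything in this sketch is an assumption made for illustration: the file naming scheme, the function names saveFile, saveState and loadState, and the use of Storable itself, since the on disk format the module uses is not described here.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Spec;

# Hypothetical naming scheme: <save folder>/<project>_<step>.data
sub saveFile
 {my ($saveDir, $project, $step) = @_;
  File::Spec->catfile($saveDir, "${project}_$step.data")
 }

# Save the state reached by a project at the end of a step
sub saveState
 {my ($saveDir, $project, $step, $state) = @_;
  store($state, saveFile($saveDir, $project, $step));    # $state must be a reference
 }

# Reload the state saved at a step, or undef if nothing was saved
sub loadState
 {my ($saveDir, $project, $step) = @_;
  my $file = saveFile($saveDir, $project, $step);
  -e $file ? retrieve($file) : undef
 }
```

Returning undef for a missing file lets the caller fall back to an earlier step instead of failing.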
launchProject($$$)
Convert a single project in a separate process
Parameter Description
1 $launch Launch specification
2 $projectName Project to be processed
3 $restart Optional latest step to restart at
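Running each project in its own process while respecting a process limit, and passing the results back to the caller, can be sketched with fork and Storable. This is an illustration under assumptions rather than the module's mechanism: runInParallel is a hypothetical helper, results are handed back through temporary files named after the child's process id, and the limit is assumed to be at least one.

```perl
use strict;
use warnings;
use POSIX qw(_exit);
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical helper: call $work->($name) for each name in a child process,
# with at most $max children at a time, and return {name => result}.
sub runInParallel
 {my ($max, $work, @names) = @_;
  my $dir = tempdir(CLEANUP => 1);                       # Results are passed back via files here
  my (%nameByPid, %results);

  my $reap = sub                                         # Wait for one child and collect its result
   {my $pid = wait;
    die "No child process to wait for" if $pid == -1;
    my $status = $?;
    my $name = delete $nameByPid{$pid};
    return unless defined $name;                         # Not one of our children
    die "Project $name failed" if $status;
    $results{$name} = ${retrieve("$dir/$pid.data")};
   };

  for my $name (@names)
   {$reap->() while keys(%nameByPid) >= $max;            # Respect the process limit
    my $pid = fork;
    die "Cannot fork: $!" unless defined $pid;
    if ($pid == 0)                                       # In the child
     {my $ok = eval
       {my $result = $work->($name);
        store(\$result, "$dir/$$.data");                 # Hand the result back to the parent
        1
       };
      _exit($ok ? 0 : 1);                                # Leave without running the parent's clean up
     }
    $nameByPid{$pid} = $name;                            # In the parent
   }
  $reap->() while keys %nameByPid;                       # Collect the remaining children
  \%results
 }
```

For example, runInParallel(2, sub {$_[0] x 2}, 1..5) runs five jobs two at a time and returns a hash in which key 3 maps to 33.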
Index
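1 convert
2 data
3 defaultMaximumNumberOfProcesses
4 defaultOutFileLimit
5 deleteProject
6 launch
7 launchProject
8 loadProject
9 loadProjectsFromFolder
10 maximumNumberOfProcesses
11 name
12 new
13 number
14 out
15 outFileLimit
16 projectData
17 projects
18 projectSteps
19 restart
20 save
21 saveProject
22 source
23 stepNumberByName
24 stepSaveFile
25 stepsByNumber
26 stepTimes
27 title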
Installation
This module is written in 100% Pure Perl and, thus, it is easy to read, comprehend, use, modify and install via cpan:
sudo cpan install Data::Edit::Conversion
Author
Copyright
Copyright (c) 2016-2018 Philip R Brenan.
This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.