NAME
Tree::PseudoIncLib - Perl class encapsulating a description of a pseudo-INC array.
ABSTRACT
This module encapsulates the description data of a Perl-type library and provides methods for manipulating that data. It is in no way associated with any real @INC array on the system. Instead, it works with a so-called pseudo_INC incoming array that may or may not be directly associated with the @INC defined for a particular user or process on the system.
SYNOPSIS
# make sure to configure the log system properly.
#
use Tree::PseudoIncLib;
#
# class default object:
#
my $tree_obj = Tree::PseudoIncLib->new();
#
# another instance:
#
my $sp_obj = $tree_obj->new (
max_nodes => $my_max_nodes, # limit number of nodes
p_INC => $my_INC_copy_ref,
);
unless ( $sp_obj->from_scratch ) {
# something went wrong:
print ($sp_obj->status_as_string);
die;
}
# we've got a description inside the object.
# we can export it to an appropriate form now...
#
my $src_html = $sp_obj->export_to_DHTML (
title => 'Test-Debug',
image_dir => 'data/images/',
icon_shaded => 'file_x.gif',
icon_folder_opened => 'folder_opened.gif',
icon_symlink => 'hand.right.gif',
tree_intend => 18,
row_class => 'r0',
css => '', # use 'inline' css
jslib => '', # no jslib
overlib => 'js/overlib.js',
);
# ... and deploy the document from $src_html then...
DESCRIPTION
A detailed description of the Perl library on the system is extremely helpful for every Perl developer. It can be beneficial for the system administrator too, in order to ensure a proper structure of the system libraries.
This module encapsulates the description data and provides methods for manipulating that data. It was initially developed as a tool incorporated into Apache for mod_perl development. The idea behind it was pretty simple: to provide developers with the tree of all Perl modules installed on the system and make all sources and documents viewable over the network.
As a side effect of the first prototype, it also proved useful for checking the configuration of the @INC array on the system, particularly because some Perl modules can be shaded by other ones carrying the same CPAN class name. It is pretty easy to mark all shaded modules on the tree, providing helpful information for the system administrator.
It was also noticed that the creation of the tree is extremely time consuming, especially on busy web servers equipped with rich Perl libraries. On the other hand, the content of the libraries usually remains unchanged for a fairly long time, measured in days and weeks. Therefore, separating the creation of the tree from the deployment of the view to the client browser seems beneficial from the perspective of performance on busy systems. That was the main reason for creating this module: it makes it possible to use the same API from a command line script or from one running under cron control.
Despite the initial purpose, this version of the module is in no way associated with any real @INC array on the system. Instead, this module works with a so-called pseudo_INC incoming array that may or may not be directly associated with the current @INC for a particular user or process on the system.
Object Identification
It is sometimes required to keep several @INC descriptions on one system, mainly because @INC is often user-specific. Apache::App::PerlLibTree is capable of managing several descriptions simultaneously when every description has a unique name.
A part of the problem is addressed through the creation of unique file names for the results of export_to_DHTML for every tree within Apache::App::PerlLibTree (in the associated cron scripts). That has nothing to do with the class itself, except that we need a human-readable identification of the tree inside the screen view in order to make it clear to the user which tree he or she is viewing.
This version of the module uses the following internal key for the object identification:
- tree_id
-
a string of identification that is displayed on the screen for the end user
It is assumed that tree_id contains sufficient information for the object to be recognized by a human user. It may contain blank spaces if necessary. This data is usually provided as an incoming parameter to the new method. A special public method exists to check or update tree_id.
Note: Since the removal of the archive functionality, this item is important for export_to_DHTML only. The presence of this item in the area of the main object data is subject to possible future changes.
Internal Object Data
The internal data of the object is basically a hash. However, some keys reference other structured data, such as arrays or arrays of hashes.
The full list of primary internal keys contains:
- tree_id
-
a string of identification that is displayed on the screen for the human end user
- application_directory
-
a URL mask of the application from the Document Root of the web server
- max_nodes
-
a watch-dog down-counter of the nodes represented in the final document. It terminates all further recursion when it reaches zero.
- skip_empty_dir
-
boolean variable; determines whether empty directories are skipped in the final tree representation.
Note: A directory is considered empty when it does not contain any files of known types. See allow_files for details.
- skip_mode
-
boolean variable; determines whether the information about permissions is skipped in the final tree representation.
- skip_owner
-
boolean variable; determines whether the information about the owner is skipped in the final tree representation.
- skip_group
-
boolean variable; determines whether the information about the group is skipped in the final tree representation.
- descript
-
a reference to the array of hashes that holds the final description
- descript_internal_start_time
-
in internal date-time format
- descript_internal_finish_time
-
in internal date-time format
- descript_start_time_text
-
in text format:
"%B %d, %Y at %H:%M"
- descript_finish_time_text
-
in text format:
"%B %d, %Y at %H:%M"
- rpm_type
-
the type of packager used on the system
This version of the module recognizes only:
- RPM
- dpkg
Only RPM is supported currently.
- rpm_active
-
boolean variable; determines whether RPM information is presented in the final document.
Note: TRUE is allowed for known RPM types only; FALSE has no restrictions.
- lib_index_prefix
-
prefix for internal names
- p_INC
-
a reference to the array that contains the paths representing the pseudo-INC library
- allow_files
-
a reference to the array of hashes that describe the allowed file types
- plog
-
a reference to the Log::Log4perl logger
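The text timestamps listed above use a strftime-style pattern. The following is a minimal sketch of how such a string can be produced, assuming the standard POSIX strftime interprets the pattern; the helper name format_descript_time is hypothetical, and gmtime is used here only to make the result reproducible (the module itself may well use local time).

```perl
use strict;
use warnings;
use POSIX qw(strftime setlocale LC_TIME);

# Pin the locale so that %B yields English month names in this sketch.
setlocale(LC_TIME, 'C');

# Hypothetical helper: render an epoch value in the documented text format.
sub format_descript_time {
    my ($epoch) = @_;
    return strftime('%B %d, %Y at %H:%M', gmtime($epoch));
}

print format_descript_time(0), "\n";    # January 01, 1970 at 00:00
```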
Watch-Dog
In order to prevent the program from entering an infinite loop during the creation of descriptions, I use one watch-dog inside the code:
- max_nodes
-
a global number of nodes that may be stored in a tree.
All recursion is terminated when the max_nodes counter is exhausted. The final return value of the method from_scratch depends on whether this counter was exhausted.
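The watch-dog idea can be illustrated with a small self-contained sketch. It is not the module's code: the in-memory tree, the walk function and the way the counter is passed by reference are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical in-memory tree: each node may carry a list of children.
my $tree = {
    name     => 'lib',
    children => [
        { name => 'A', children => [ { name => 'A1' }, { name => 'A2' } ] },
        { name => 'B', children => [ { name => 'B1' } ] },
    ],
};

# Walk the tree, decrementing a shared down-counter for every stored node.
# Recursion stops as soon as the counter reaches zero.
sub walk {
    my ($node, $budget_ref, $seen) = @_;
    return if $$budget_ref <= 0;    # watch-dog exhausted: stop recursion
    $$budget_ref--;
    push @$seen, $node->{name};
    walk($_, $budget_ref, $seen) for @{ $node->{children} || [] };
    return;
}

my $max_nodes = 4;
my @seen;
walk($tree, \$max_nodes, \@seen);

print "stored: @seen\n";            # stored: lib A A1 A2
print "watch-dog exhausted\n" if $max_nodes <= 0;
```

With a budget of 4 the walk stores four nodes and never reaches B, so the caller can tell from the exhausted counter that the description is incomplete.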
Class Defaults
This version of the module provides the following defaults:
APPLICATION_DIRECTORY => '/app/pltree/';# URL mask from the Apache Document_Root
TREE_ID_DEFAULT => 'Default_Tree';
LIB_INDEX_PREFIX => 'lib';# default prefix for root library name
MIN_LIMIT_NODES => 15; # min value for max_nodes setting validation
LIMIT_NODES => 15000;# default for max_nodes
RPM_TYPE => 'RPM';# default type of packaging system
NO_RPM_OWNER => undef;# '-' is not that convenient...
SKIP_EMPTY_DIR_DEFAULT => 1; # true
SKIP_MODE_DEFAULT => 0; # false
SKIP_OWNER_DEFAULT => 0; # false
SKIP_GROUP_DEFAULT => 0; # false
Logging Policy
I use the open source Log::Log4perl from CPAN to log information from my module. It makes the logging system extremely flexible for the user. This module logs error, warn, info, and debug messages.
I would recommend using a global info level of logging for routine jobs. On this level the module logs only two messages:
- start_message
-
contains the identification of the object and the start time of the method from_scratch
- end_message
-
contains the end time of the method from_scratch and a real-time duration of the process
These info messages can help to identify description problems caused by a possible library update during the creation of the description.
With Log::Log4perl one can choose the required log level for each source method when necessary. All log configurations are code-independent. My log configuration file (for code testing) is available at data/log.config inside the distribution. It can be used as an example for creating a quick-start config. Please see the documentation of Log::Log4perl in order to configure your logging system in accordance with your real needs.
Polymorphism
The class is polymorphism-ready. One can inherit everything with the simple declaration:
package MyOwnTree;
use Tree::PseudoIncLib;
@ISA = ("Tree::PseudoIncLib");
# do what you need...
1;
PUBLIC METHODS
new
Creates an instance of the class. A new instance can be created directly from the class:
my $tree_obj = Tree::PseudoIncLib->new();
or from another (existing) object of this class:
my $another_tree_object = $tree_obj->new();
In the latter case it does not copy the content of the existing base instance. Instead, it always creates the default class object when it is called with no incoming parameters. Otherwise, incoming parameters take precedence over the class default values. See method clone for a copy constructor when necessary.
Method new accepts a set of optional incoming parameters, which should be passed as a hash:
- tree_id
-
a string of the tree identification that will be printed in view
- application_directory
-
a string of the URL mask of the application
- max_nodes
-
an integer number, big enough to count all nodes in the tree
- p_INC
-
a reference to an array. See the corresponding method for details.
- skip_empty_dir
-
an integer, either 0 or 1
- skip_mode
-
an integer, either 0 or 1
- skip_owner
-
an integer, either 0 or 1
- skip_group
-
an integer, either 0 or 1
- allow_files
-
a reference to an array of hashes. See the corresponding method for details.
- rpm_type
-
a string
- rpm_active
-
an integer, either 0 or 1
This method returns a blessed object reference upon success.
Example of the use:
my @pseudo_inc = ( $dir.'/data/testlibs/lib2',);
# ...
my $obj = Tree::PseudoIncLib->new(
max_nodes => 100,
tree_id => 'My Lovely Tree',
p_INC => \@pseudo_inc,
);
allow_files
This method provides access to the array of hashes that defines the set of files that we wish to keep (and display) within the tree of the library. Technically, this method works with the internal variable $self->{allow_files}, which keeps the reference to the mentioned array of hashes. As an example of the structure, we can write:
$self->{allow_files} = [
{ mask => '.pm$', icon => 'file.gif',
name_on_click_action => 'source',
icon_on_click_action => 'ps2html',
name_mouse_over_prompt => 'view source',
icon_mouse_over_prompt => 'view document',},
{ mask => '.pod$', icon => 'file_note.gif',
name_on_click_action => 'source',
icon_on_click_action => 'ps2html',
name_mouse_over_prompt => 'view source',
icon_mouse_over_prompt => 'view document',},
];
This method takes one optional parameter: a new reference to an array of a similar structure. It updates $self->{allow_files} when it is called with a valid incoming value. In this version the incoming data validation is pretty simple: it just checks whether the incoming parameter is a reference to an ARRAY.
This method returns the current value of $self->{allow_files} upon success. Otherwise, it returns undef.
Example of the use:
my $obj = Tree::PseudoIncLib->new();
my $target_files = [
{ mask => '.pm$', icon => 'file.gif',
name_on_click_action => 'source',
icon_on_click_action => 'ps2html',
name_mouse_over_prompt => 'view source',
icon_mouse_over_prompt => 'view document',},
{ mask => '.pod$', icon => 'file_note.gif',
name_on_click_action => 'source',
icon_on_click_action => 'ps2html',
name_mouse_over_prompt => 'view source',
icon_mouse_over_prompt => 'view document',},
];
$obj->allow_files( $target_files );
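How a mask selects files is not spelled out above. The sketch below assumes that each mask is applied to the file name as a regular expression (the trailing $ suggests this, but it is an assumption), and that the first matching entry wins. The helper allow_index_for is hypothetical and not part of the module.

```perl
use strict;
use warnings;

my $allow_files = [
    { mask => '.pm$',  icon => 'file.gif' },
    { mask => '.pod$', icon => 'file_note.gif' },
];

# Hypothetical helper: return the index of the first entry whose mask
# matches the file name, or undef when the file type is not allowed.
sub allow_index_for {
    my ($file_name, $allow) = @_;
    for my $i (0 .. $#$allow) {
        my $mask = $allow->[$i]{mask};
        return $i if $file_name =~ /$mask/;
    }
    return;
}

for my $name (qw(Carp.pm perlfunc.pod README)) {
    my $i = allow_index_for($name, $allow_files);
    printf "%-12s %s\n", $name,
        defined $i ? $allow_files->[$i]{icon} : '(skipped)';
}
```

Under that assumption the unescaped dot matches any character, so a name such as icon.xpm would also match '.pm$'; a mask of '\.pm$' would be stricter.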
pseudo_INC
This method provides access to the array of paths that defines the current pseudo_INC for the object. Technically, this method works with the internal variable $self->{p_INC}, which keeps the reference to the mentioned array. The data structure is pretty simple and can be illustrated with the following example:
$self->{p_INC} = [
'/usr/lib/perl5/5.6.1/i386-linux',
'/usr/lib/perl5/5.6.1',
'/usr/lib/perl5/site_perl/5.6.1/i386-linux',
'/usr/lib/perl5/site_perl/5.6.1',
'/usr/lib/perl5/site_perl/5.6.0',
'/usr/lib/perl5/site_perl',
'/usr/lib/perl5/vendor_perl/5.6.1/i386-linux',
'/usr/lib/perl5/vendor_perl/5.6.1',
'/usr/lib/perl5/vendor_perl'
];
This method takes one optional parameter: a new reference to an array of a similar structure. It updates $self->{p_INC} when it is called with a valid incoming value. In this version the incoming data validation is pretty simple: it just checks whether the incoming parameter is a reference to an ARRAY.
This method always returns the current value of $self->{p_INC}.
Example of the use:
my $obj = Tree::PseudoIncLib->new();
my $target_dirs = [
'/usr/lib/perl5/5.6.1/i386-linux',
'/usr/lib/perl5/site_perl/5.6.1/i386-linux',
'/usr/lib/perl5/site_perl',
];
$obj->pseudo_INC( $target_dirs );
tree_id
Provides access to the internal variable $self->{tree_id}. This method takes one optional parameter: a new value for $self->{tree_id}. It updates $self->{tree_id} when it is called with any incoming value that evaluates to TRUE.
It always returns the current value of $self->{tree_id}.
application_directory
Provides access to the internal variable $self->{application_directory} that stores the URL mask of the application. This method takes one optional parameter: a new value for $self->{application_directory}. It updates $self->{application_directory} when it is called with any incoming value that evaluates to TRUE.
It always returns the current value of $self->{application_directory}.
rpm_type
Provides access to the internal variable $self->{rpm_type}. This method takes one optional parameter: a new value for $self->{rpm_type}. It updates $self->{rpm_type} when it is called with a defined incoming value (even when the value is an empty string).
Note: This method additionally changes the value of $self->{rpm_active} to 0 when it is called with a defined empty string as the incoming parameter, because an unknown type of RPM cannot be active.
It always returns the current value of $self->{rpm_type}.
rpm_active
Provides access to the internal boolean variable $self->{rpm_active}. This method takes one optional parameter: a new value for $self->{rpm_active}. It updates $self->{rpm_active} when it is called with a defined incoming value (even when the value is 0, which sets RPM inactive).
Note: It is prohibited to set an unknown RPM type to active. An error message is logged in this case.
This method always returns the current value of $self->{rpm_active}.
skip_empty_dir
Provides access to the internal boolean variable $self->{skip_empty_dir}. This method takes one optional parameter: a new value for $self->{skip_empty_dir}. It updates $self->{skip_empty_dir} when it is called with a defined incoming value (even when the value is 0, which allows empty directories to be stored and displayed in the tree structure).
Example code of the use:
$obj->skip_empty_dir(0); # keep empty dirs in description
# ...
$obj->skip_empty_dir(1); # skip empty dirs
This method always returns the current value of $self->{skip_empty_dir}.
skip_mode
Provides access to the internal boolean variable $self->{skip_mode}. This method takes one optional parameter: a new value for $self->{skip_mode}. It updates $self->{skip_mode} when it is called with a defined incoming value (even when the value is 0, which allows permissions to be stored and displayed in the tree structure).
skip_owner
Provides access to the internal boolean variable $self->{skip_owner}. This method takes one optional parameter: a new value for $self->{skip_owner}. It updates $self->{skip_owner} when it is called with a defined incoming value (even when the value is 0, which allows the node owner name to be stored and displayed in the tree structure).
skip_group
Provides access to the internal boolean variable $self->{skip_group}. This method takes one optional parameter: a new value for $self->{skip_group}. It updates $self->{skip_group} when it is called with a defined incoming value (even when the value is 0, which allows the node group name to be stored and displayed in the tree structure).
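The accessors described so far follow two different update policies: tree_id and application_directory change only on a TRUE value, while the boolean switches change on any defined value. The difference can be sketched as follows; the class Sketch::Accessors is hypothetical and only mimics the documented behavior.

```perl
use strict;
use warnings;

package Sketch::Accessors;    # hypothetical class, not part of the module

sub new {
    my ($class) = @_;
    my $self = { tree_id => 'Default_Tree', skip_mode => 0 };
    return bless $self, $class;
}

# Policy 1: update only when the incoming value evaluates to TRUE.
sub tree_id {
    my ($self, $value) = @_;
    $self->{tree_id} = $value if $value;
    return $self->{tree_id};
}

# Policy 2: update whenever the incoming value is defined, so 0 is accepted.
sub skip_mode {
    my ($self, $value) = @_;
    $self->{skip_mode} = $value if defined $value;
    return $self->{skip_mode};
}

package main;

my $obj = Sketch::Accessors->new;
$obj->skip_mode(1);
print $obj->skip_mode(0), "\n";          # 0: a defined false value is stored
print $obj->tree_id(''), "\n";           # Default_Tree: an empty string is ignored
print $obj->tree_id('My Tree'), "\n";    # My Tree
```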
max_nodes
Provides access to the internal watch-dog variable $self->{max_nodes}. This method takes one optional numeric parameter: a new value for $self->{max_nodes}. It updates $self->{max_nodes} when it is called with a valid incoming value. When it is called with an invalid (too low) value, it logs an error message and sets $self->{max_nodes} to MIN_LIMIT_NODES, a constant defined by the class that is equal to 15 in this version.
Example of the use:
$obj->max_nodes(50); # should be sufficient for the test...
This method always returns the current value of $self->{max_nodes}.
from_scratch
Creates a primary description of the Perl library defined by the $self->{p_INC} reference. The result is a reference to the array of hashes that is stored internally in $self->{descript}.
This method does not require any incoming parameters. It returns the number of created records upon success; otherwise it returns undef.
export_to_DHTML
Creates a DHTML page-code of the tree.
This method returns the multi-line string of the created page upon success; otherwise it returns undef.
Example of the use:
# assume $obj->from_scratch() is done OK, then:
#
my $src_html = $obj->export_to_DHTML (
title => 'Test-Debug',
image_dir => 'data/images/',
icon_shaded => 'file_x.gif',
icon_folder_opened => 'folder_opened.gif',
icon_symlink => 'hand.right.gif',
tree_intend => 18,
row_class => 'r0',
css => '', # use 'inline' css
jslib => '', # no jslib
overlib => 'js/overlib.js',
);
status_as_string
This method provides the internal status of the object. It takes no parameters and returns a human-readable multi-line string. It can be helpful for tracing or debugging an application that uses this object:
$message = "Internal Status:\n".$self->status_as_string;
$self->{plog}->debug($message);
list_descript_keys
This method takes no parameters. It returns a reference to the array that contains the alphabetically sorted names of the keys used anywhere inside the descriptions. The array is generated dynamically from the current content of the descriptions.
list_simple_keys
This method takes no parameters. It returns a reference to the array that contains an alphabetically sorted set of names of the simple keys of the object. The list does not contain any keys that reference other arrays or hashes.
w3c_doctype
Creates and returns the string of the W3C document type. This method accepts one mandatory incoming hash parameter:
- type
-
defines the XHTML or HTML type of the exported document
This version of the module uses the HTML document type only.
my $self = shift;
my $res = $self->w3c_doctype( type => 'html' );
inline_CSS
Creates and returns a multi-line string of DHTML code representing the in-line CSS for the page. This method does not require incoming parameters in this version.
inc_html_table
Creates and returns a multi-line string of DHTML code representing the pseudo-INC array. Every line of this table is a link to the appropriate row of the main tree table. This method requires one mandatory incoming hash parameter:
- title
-
of the table to display
Example of the use:
$res .= $self->inc_html_table ( title => 'Library' );
PRIVATE METHODS
Methods of this group are not supposed to be used in applications directly; they are subject to change in future versions. All private methods can be inherited automatically, or overridden within the child class if necessary.
_dir_description
This method is used by the from_scratch method in order to create a primary description of the so-called 'root directory', recursing into every child directory.
This version of the module distinguishes 3 types of nodes: directories, files, and symlinks.
This method takes a mandatory hash of incoming parameters:
- root_dir
-
absolute address of the directory to explore (the trailing slash / might be omitted);
- pseudo_cpan_root_name
-
estimation of the CPAN name for root_dir;
- parent_index
-
unique object name for the root_dir;
- parent_depth_level
-
depth level of root_dir inside the result tree;
- prior_libs
-
a reference to the array of previously described libraries that should not be repeated in the description;
- inc_lib
-
a name of the current library as it appears in @INC;
The result of _dir_description is a pretty complicated structure of arrays and hashes. Primarily, it is an array of hashes, where some keys may reference other (child) arrays of hashes, and so on.
Every file/directory/symlink is described with the hash using the following set of keys:
- type
-
can be 'd', 'f', or 'l' in first position (stand for 'directory', 'file', or 'link');
- inode
-
associated with the item;
- permissions_octal_text
-
like '0755';
- size
-
in bytes;
- owner
-
name of the owner;
- group
-
name of the group;
- level
-
depth in the tree (beginning with 1 for the names listed in @INC);
- name
-
local name of the file/link/directory (inside the parent directory);
- full_name
-
absolute address like /full/path/to/the/file
- pseudo_cpan_name
-
makes sense for .pm files only, but is generated recursively for directories too;
- last_mod_time_text
-
date/time of last modification in format "%B %d, %Y at %H:%M"
- parent_index
-
unique name of the parent node/object;
- self_index
-
unique name for the self node/object;
- child_dir_list
-
a reference to the array of children descriptions;
- rpm_package_name
-
optional key, used for real files only;
Note: One directory can belong to many packages. Appropriate description features might be a matter of further improvement; they are not relevant at the moment.
- link_target
-
optional key, used for symlinks only;
All children in every array are sorted alphabetically by name.
Upon success _dir_description returns a reference to the created array of hashes.
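The shape of such a description can be illustrated with a heavily simplified sketch. It is not the module's implementation: describe_dir is a hypothetical function that accepts three of the documented parameters, fills in only a handful of the documented keys, and derives pseudo_cpan_name by a naive rule (strip .pm, join with ::). It builds its own temporary directory so that it can run anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical simplified describer: only a few of the documented keys.
sub describe_dir {
    my (%arg) = @_;
    my $dir       = $arg{root_dir};
    my $cpan_root = $arg{pseudo_cpan_root_name};
    my $level     = $arg{parent_depth_level} + 1;

    opendir(my $dh, $dir) or return [];
    my @names = sort grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    my @result;
    for my $name (@names) {
        my $full = "$dir/$name";
        my @st = lstat($full) or next;          # lstat: do not follow symlinks
        my $type = -l _ ? 'l' : -d _ ? 'd' : 'f';
        (my $base = $name) =~ s/\.pm$//;
        my $cpan = $cpan_root eq '' ? $base : "${cpan_root}::$base";
        my %node = (
            type                   => $type,
            name                   => $name,
            full_name              => $full,
            level                  => $level,
            size                   => $st[7],
            permissions_octal_text => sprintf('%04o', $st[2] & 07777),
            pseudo_cpan_name       => $cpan,
        );
        $node{link_target} = readlink $full if $type eq 'l';
        $node{child_dir_list} = describe_dir(
            root_dir              => $full,
            pseudo_cpan_root_name => $cpan,
            parent_depth_level    => $level,
        ) if $type eq 'd';
        push @result, \%node;
    }
    return \@result;
}

# Build a tiny throw-away library: Foo.pm and Foo/Bar.pm.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/Foo" or die "Cannot create $root/Foo: $!";
for my $file ("$root/Foo.pm", "$root/Foo/Bar.pm") {
    open(my $fh, '>', $file) or die "Cannot create $file: $!";
    print {$fh} "1;\n";
    close $fh;
}

my $tree = describe_dir(
    root_dir => $root, pseudo_cpan_root_name => '', parent_depth_level => 1,
);
for my $node (@$tree) {
    print "$node->{level} $node->{type} $node->{pseudo_cpan_name}\n";
    print "  $_->{level} $_->{type} $_->{pseudo_cpan_name}\n"
        for @{ $node->{child_dir_list} || [] };
}
```

Symlinks are recorded but never followed in this sketch, which avoids loops through links; whether the module does the same is not stated here.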
_object_list
Transforms the internal description of the tree into a simple (regular) array of simple hashes. This method takes one mandatory incoming parameter: a reference to the array of the primary tree description.
This method returns a reference to the array of hashes upon success. Otherwise, it returns undef. Every hash contains (some of) the following keys:
- pseudo_cpan_name
- level
- inc_lib
- parent_obj_name
-
= $_->{parent_index};
- self_obj_name
-
= $_->{self_index};
- name
-
= $_->{name};
- type
- size
- last_mod_time_text
- full_name
-
including absolute path
- permissions_octal_text
- owner
- group
- inode
- icon
-
= $_->{icon} if $_->{icon}; # defined for files only
- allow_index
-
= $_->{allow_index} if defined $_->{allow_index}; # for files only
- rpm_package_name
-
= $_->{rpm_package_name} if $_->{rpm_package_name};
- link_target
-
= $_->{link_target} if $_->{link_target};
Note: This is not a full list of incoming keys.
Example of the use:
$self->{descript} = $self->_object_list ($lib_list_ref);
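The transformation amounts to a depth-first flattening of the nested structure. A minimal sketch, assuming that children are stored under child_dir_list as described for _dir_description; the function flatten and the sample data are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical nested description using a few of the documented keys.
my $nested = [
    {   name => 'Foo', type => 'd', level => 1,
        child_dir_list => [
            { name => 'Bar.pm', type => 'f', level => 2 },
        ],
    },
    { name => 'Foo.pm', type => 'f', level => 1 },
];

# Depth-first flattening: emit a copy of each node without its children,
# then append the flattened children right after it.
sub flatten {
    my ($list) = @_;
    return unless ref($list) eq 'ARRAY';
    my @flat;
    for my $node (@$list) {
        my %copy     = %$node;
        my $children = delete $copy{child_dir_list};
        push @flat, \%copy;
        push @flat, @{ flatten($children) } if ref($children) eq 'ARRAY';
    }
    return \@flat;
}

my $flat = flatten($nested);
print join(' ', map { "$_->{level}:$_->{name}" } @$flat), "\n";
# 1:Foo 2:Bar.pm 1:Foo.pm
```

Copying each hash keeps the nested input intact, so the original tree can still be inspected after flattening.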
_mark_shaded_names
Creates extended descriptions for shaded .pm files, indicating which module will really be loaded (executed) for a given CPAN name. Every shaded module is additionally supplied with the following keys:
- shaded_by_lib
- shaded_by_inode
- shaded_by_last_modified
This method takes no incoming parameters. It returns a reference to the array that contains the list of shaded names in CPAN representation.
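Shading follows from the way Perl searches its library path: the first directory that provides a given module name wins, and later copies are never loaded. A minimal sketch of the marking step over a flat description, assuming the records are listed in pseudo-INC order; mark_shaded and the sample records are hypothetical, and only two of the three documented shaded_by_* keys are filled in:

```perl
use strict;
use warnings;

# Hypothetical flat description, listed in pseudo-INC order.
my @descript = (
    { pseudo_cpan_name => 'Foo::Bar', type => 'f', name => 'Bar.pm',
      inc_lib => '/opt/lib1', inode => 101 },
    { pseudo_cpan_name => 'Foo::Baz', type => 'f', name => 'Baz.pm',
      inc_lib => '/opt/lib1', inode => 102 },
    { pseudo_cpan_name => 'Foo::Bar', type => 'f', name => 'Bar.pm',
      inc_lib => '/opt/lib2', inode => 201 },
);

# The first .pm file seen for a CPAN name is the one that would be loaded;
# every later file with the same name is marked as shaded by it.
sub mark_shaded {
    my ($list) = @_;
    my (%winner, %shaded);
    for my $item (@$list) {
        next unless $item->{type} eq 'f' && $item->{name} =~ /\.pm$/;
        my $cpan = $item->{pseudo_cpan_name};
        if (my $first = $winner{$cpan}) {
            $item->{shaded_by_lib}   = $first->{inc_lib};
            $item->{shaded_by_inode} = $first->{inode};
            $shaded{$cpan} = 1;
        }
        else {
            $winner{$cpan} = $item;
        }
    }
    return [ sort keys %shaded ];
}

my $shaded = mark_shaded(\@descript);
print "shaded: @$shaded\n";                                   # shaded: Foo::Bar
print "shaded by: $descript[2]{shaded_by_lib}\n";             # shaded by: /opt/lib1
```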
_html_head
Creates and returns the HTML code of the head section of the DHTML page.
Takes the following incoming parameters in a hash:
- title
-
of the page
- jslib
-
optional file of external JavaScript for the page (to serve collapsible branches of the tree)
- css
-
Cascading Style Sheet of the page. May be an external file, or just inline.
- overLib
-
file of external JavaScript overLIB.
_descript_html_table_head_row
Creates and returns the string of the head row of the main DHTML table of the tree description. Takes no incoming parameters.
_link_icon_overLib
Creates and returns the string of DHTML providing the overLib call on the client side.
Takes the hash of the following incoming parameters:
- icon_src
- on_click_href
- on_mouse_over_message
- hspace
-
for the image
- border
-
for the image
- align
-
for the image
_link_text_overLib
Creates and returns the string of DHTML providing the overLib call on the client side.
Takes the hash of the following incoming parameters:
- text
- on_click_href
- on_mouse_over_message
_data_row_HTML
This method creates one regular row of the DHTML description table. It takes the hash of mandatory incoming parameters:
- current_row_description
-
a reference to the hash of internal description of the tree item
- image_dir
-
might be an absolute or relative path to the directory containing all icons
- icon_shaded
-
to mark shaded files
- icon_folder_opened
- icon_symlink
- tree_intend
-
indentation (in pixels) between levels of the tree
Example of the use:
foreach ( @{$self->{descript}} ) {
$res .= $self->_data_row_HTML(
current_row_description => $_,
image_dir => $image_dir,
icon_shaded => $icon_shaded,
icon_folder_opened => $icon_folder_opened,
icon_symlink => $icon_symlink,
tree_intend => $tree_intend,
row_class => $row_class,
)."\n";
}
AUTHOR
Slava Bizyayev <slava@cpan.org> - Freelance Software Developer & Consultant.
COPYRIGHT AND LICENSE
Copyright (C) 2004 Slava Bizyayev. All rights reserved.
This package is free software. You can use it, redistribute it, and/or modify it under the same terms as Perl itself.
The latest version of this module can be found on CPAN.
SEE ALSO
Apache::App::PerlLibTree - mod_perl web application.
overLIB 3.51
Copyright Erik Bosrup 1998-2002. All rights reserved. Available at http://www.bosrup.com/web/overlib/