NAME

CPAN::Access::AdHoc - Retrieve stuff from an arbitrary CPAN repository

SYNOPSIS

use CPAN::Access::AdHoc;

my ( $module ) = @ARGV;
my $cad = CPAN::Access::AdHoc->new();
my $index = $cad->fetch_module_index();
if ( $index->{$module} ) {
    print "$module is in $index->{$module}{distribution}\n";
} else {
    print "$module is not indexed\n";
}

NOTICE

Effective with version 0.000_03:

* Methods whose names contain 'package' are removed. Use the correspondingly-named 'distribution' methods.

* Method fetch_registered_module_index() now returns a hash.

DESCRIPTION

This class provides a lowish-level interface to an arbitrary CPAN repository. You can fetch anything, but there is particular support for the author and module indices, distributions, and their metadata.

What it does not provide is module installation, dependency resolution, or what-have-you. There are already plenty of tools for that.

The intent is that this should be a zero-configuration system, or at least a configuration-optional system.

Attributes can be specified explicitly either when the object is instantiated or afterwards. The default is from the global section of a Config::Tiny configuration file, CPAN-Access-AdHoc.ini, which is found in directory File::HomeDir->my_dist_config( 'CPAN-Access-AdHoc' ). The named sections are currently unused, though CPAN-Access-AdHoc reserves to itself all section names which contain no uppercase letters.

In addition, it is possible to take the default CPAN repository URL from the user's CPAN::Mini, cpanm, CPAN, or CPANPLUS configuration. They are accessed in this order by default, and the first available is used. Which of these are considered, and in what order, is under the user's control via the default_cpan_source attribute/configuration item.

What actually happened here is that I got an RT ticket on one of my CPAN distributions, pointing out that the Free Software Foundation had moved, and I needed to update the copy of the GNU GPL that I distributed. Well, it's the same text for all my distributions, so I wanted a tool to tell me which ones had already been updated in CPAN.

A little later, I realized that a clobbered version of one of my author tests got shipped in a couple distributions, so I wrote another Perl script to see how far the rot had spread.

Then I found out about an interesting but somewhat heavyweight module, and wanted to know what I really needed to install to get it going. Yes, cpanm will do this, but I have not taken that step yet.

So I found myself writing mostly the same code for the third time, and decided there ought to be a better way. Hence this module.

METHODS

This class supports the following public methods:

Instantiator

new

This static method instantiates the object. You can specify attribute values by passing name/value argument pairs. Defaults are documented with the individual attributes.

If you do not specify an explicit cpan argument, and a default CPAN URL cannot be computed, an exception is thrown. See the cpan attribute documentation for a few more details.
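As a minimal sketch of the above, the following passes an explicit cpan argument and traps any exception from the constructor. The URL is a placeholder chosen for illustration, not a real mirror, and the eval also covers the case where the module is not installed.

```perl
use strict;
use warnings;

# Assumption: http://cpan.example.org/ is a placeholder URL, not a real
# mirror. Substitute a repository you can actually reach.
my $cad = eval {
    require CPAN::Access::AdHoc;
    CPAN::Access::AdHoc->new( cpan => 'http://cpan.example.org/' );
};

if ( $cad ) {
    print 'Using CPAN repository ', $cad->cpan(), "\n";
} else {
    # $@ holds the exception text, whether the module could not be
    # loaded or the constructor threw.
    warn "Unable to instantiate CPAN::Access::AdHoc: $@";
}
```

Omitting the cpan argument in the same eval would let you detect the case where no default URL can be computed.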

Accessors/Mutators

config

When called with no arguments, this method acts as an accessor, and returns the current configuration as a Config::Tiny object.

When called with an argument, this method acts as a mutator. If the argument is a Config::Tiny object it becomes the new configuration. If the argument is undef, file CPAN-Access-AdHoc.ini in File::HomeDir->my_dist_config( 'CPAN-Access-AdHoc' ) is read for the configuration. If this file does not exist, the configuration is set to an empty Config::Tiny object.

cpan

When called with no arguments, this method acts as an accessor, and returns the URL of the CPAN repository accessed by this object.

When called with an argument, this method acts as a mutator. It sets the URL of the CPAN repository accessed by this object, and (for reasons of sanity) calls flush() to purge any data cached from the old repository.

If the argument is undef, the default URL as computed from the sources in default_cpan_source is used. If no URL can be computed from any source, an exception is thrown.

default_cpan_source

When called with no arguments, this method acts as an accessor, and returns the current list of default CPAN sources as a comma-delimited string.

When called with an argument, this method acts as a mutator, and sets the list of default CPAN sources. This list is a comma-delimited string, and consists of the names of zero or more CPAN::Access::AdHoc::Default::CPAN::* classes, with the common prefix removed. See the documentation of these classes for more information.

If any of the elements in the string does not represent an existing CPAN::Access::AdHoc::Default::CPAN:: class, an exception is thrown and the value of the attribute remains unmodified.

If the argument is undef, the default is restored.

The default is 'CPAN::Mini,cpanm,CPAN,CPANPLUS'.
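The naming convention can be sketched as follows. The helper expand_default_cpan_source() is hypothetical and only illustrates how an attribute value maps to class names; it does not check that the classes exist, and it is not necessarily how this module does the job.

```perl
use strict;
use warnings;

# Expand a default_cpan_source value into the class names it stands for.
# Hypothetical helper: it only prepends the common prefix to each
# comma-delimited element.
sub expand_default_cpan_source {
    my ( $source ) = @_;
    my $prefix = 'CPAN::Access::AdHoc::Default::CPAN::';
    return map { $prefix . $_ } split qr{,}, $source;
}

print "$_\n"
    for expand_default_cpan_source( 'CPAN::Mini,cpanm,CPAN,CPANPLUS' );
```

Note that the first element of the default, 'CPAN::Mini', therefore names a class ending in 'CPAN::CPAN::Mini'.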

Functionality

These methods are what all the rest is in aid of.

corpus

This convenience method returns a list of the indexed distributions by the author with the given CPAN ID. This information is derived from the output of indexed_distributions(). The argument is converted to upper case before use.
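The selection this method performs might look something like the sketch below. The helper corpus_of() is hypothetical, and it assumes distribution names are relative to authors/id/, so that the CPAN ID is the third path component. The MOZART entry is invented sample data.

```perl
use strict;
use warnings;

# Return the distributions that belong to the given author.
# Hypothetical helper: assumes names such as 'B/BA/BACH/PDQ-0.000_01.zip',
# with the CPAN ID as the third path component.
sub corpus_of {
    my ( $cpan_id, @distributions ) = @_;
    $cpan_id = uc $cpan_id;
    return grep { m{ \A [^/]+ / [^/]+ / \Q$cpan_id\E / }smx }
        @distributions;
}

# Sample data. In real use this list would come from
# $cad->indexed_distributions().
my @sample = (
    'B/BA/BACH/Johann-0.001.tar.bz2',
    'B/BA/BACH/PDQ-0.000_01.zip',
    'M/MO/MOZART/Wolfgang-1.000.tar.gz',
);

print "$_\n" for corpus_of( 'bach', @sample );
```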

fetch

This method fetches the named file from the CPAN repository. Its argument is the name of the file relative to the root of the repository.

If this method determines that there might be checksums for this file, it attempts to retrieve them, and if successful will compare the SHA256 checksum of the retrieved data to the retrieved value.

If the file is compressed in some way it will be decompressed.

If the fetched file is an archive of some sort, an object representing the archive will be returned. This object will be of one of the CPAN::Access::AdHoc::Archive::* classes, each of which wraps the corresponding Archive::* class and provides CPAN::Access::AdHoc with a consistent interface. These classes will be initialized with

content => the literal content of the archive, as downloaded,
encoding => the MIME encoding used to decode the archive,
path => the path to the archive, relative to the base URL.

If the fetched file is not an archive, it is wrapped in a CPAN::Access::AdHoc::Archive::Null object and returned.

All other fetch functionality is implemented in terms of this method.

fetch_author_index

This method fetches the author index, authors/01mailrc.txt.gz. It is expanded and interpreted, and returned as a hash reference keyed by the authors' CPAN IDs. The data for each author is an anonymous hash with the following keys:

name => the name of the author;
address => the electronic mail address of the author.

The results of the first fetch are cached; subsequent calls are supplied from cache.
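A small sketch of consuming the returned structure follows. The helper author_line() and the sample entry are made up for illustration; the upper-casing assumes that the index keys are upper-case CPAN IDs.

```perl
use strict;
use warnings;

# Format one author as 'Name <address>', or return nothing if the ID is
# not in the index. Hypothetical helper, assuming upper-case keys.
sub author_line {
    my ( $index, $cpan_id ) = @_;
    my $author = $index->{ uc $cpan_id }
        or return;
    return "$author->{name} <$author->{address}>";
}

# Invented sample data in the documented shape. In real use:
#     my $author_index = $cad->fetch_author_index();
my $author_index = {
    BACH => {
        name    => 'Johann Sebastian Bach',
        address => 'bach@example.org',
    },
};

print author_line( $author_index, 'bach' ), "\n";
```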

fetch_module_index

This method fetches the module index, modules/02packages.details.txt.gz. It is expanded and interpreted, and returned as a hash reference keyed by the module names. The data for each module is an anonymous hash with the following keys:

distribution => the name of the distribution that contains the module, relative to the authors/id/ directory;
version => the version of the module.

If called in list context, the first return is the index, and the second is another hash reference containing the metadata that appears at the top of the expanded index file.

The results of the first fetch are cached; subsequent calls are supplied from cache.
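Because the index is keyed by module, a common need is to turn it around and ask which modules a distribution provides. The sketch below does that with a hypothetical helper, modules_by_distribution(), run against invented sample data in the documented shape.

```perl
use strict;
use warnings;

# Invert a module index into a hash keyed by distribution, whose values
# are references to sorted lists of module names. Hypothetical helper.
sub modules_by_distribution {
    my ( $index ) = @_;
    my %by_dist;
    foreach my $module ( sort keys %{ $index } ) {
        push @{ $by_dist{ $index->{$module}{distribution} } }, $module;
    }
    return \%by_dist;
}

# Invented sample data. In real use:
#     my $module_index = $cad->fetch_module_index();
my $module_index = {
    'PDQ' => {
        distribution => 'B/BA/BACH/PDQ-0.000_01.zip',
        version      => '0.000_01',
    },
    'PDQ::Opus' => {
        distribution => 'B/BA/BACH/PDQ-0.000_01.zip',
        version      => '0.000_01',
    },
    'Johann' => {
        distribution => 'B/BA/BACH/Johann-0.001.tar.bz2',
        version      => '0.001',
    },
};

my $by_dist = modules_by_distribution( $module_index );
foreach my $dist ( sort keys %{ $by_dist } ) {
    print "$dist: @{ $by_dist->{$dist} }\n";
}
```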

fetch_distribution_archive

This method takes as its argument the name of a distribution file relative to the archive's authors/id/ directory, and returns the distribution as a CPAN::Access::AdHoc::Archive::* object.

Note that since this method is implemented in terms of fetch(), the archive object's path attribute will be set to its path relative to the base URL of the CPAN repository, not its path relative to the authors/id/ directory. So, for example,

$arc = $cad->fetch_distribution_archive(
    'B/BA/BACH/PDQ-0.000_01.zip' );
say $arc->path(); # authors/id/B/BA/BACH/PDQ-0.000_01.zip

For convenience, either the top or the top two directories can be omitted, since they can be reconstructed from the rest. So the above example can also be written as

$arc = $cad->fetch_distribution_archive(
    'BACH/PDQ-0.000_01.zip' );
say $arc->path(); # authors/id/B/BA/BACH/PDQ-0.000_01.zip
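The reconstruction of the omitted directories can be sketched as below. The helper full_distribution_path() is hypothetical and shows only the convention (first letter, then first two letters, of the CPAN ID); it is not necessarily how this module does it, and it assumes CPAN IDs are at least three characters long.

```perl
use strict;
use warnings;

# Expand a possibly-abbreviated distribution name into its path
# relative to the repository root. Hypothetical helper.
sub full_distribution_path {
    my ( $name ) = @_;
    # Drop the one- and two-character directories if they are present,
    # leaving 'AUTHOR/file'.
    ( my $rest = $name ) =~ s{ \A (?: [^/] / )? (?: [^/]{2} / )? }{}smx;
    my ( $author ) = split qr{/}, $rest;
    return join '/', 'authors/id', substr( $author, 0, 1 ),
        substr( $author, 0, 2 ), $rest;
}

print full_distribution_path( $_ ), "\n" for qw{
    B/BA/BACH/PDQ-0.000_01.zip
    BA/BACH/PDQ-0.000_01.zip
    BACH/PDQ-0.000_01.zip
};
```

All three forms expand to authors/id/B/BA/BACH/PDQ-0.000_01.zip.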

fetch_distribution_checksums

use YAML::Any;
print Dump( $cad->fetch_distribution_checksums(
    'B/BA/BACH/' ) );
print Dump( $cad->fetch_distribution_checksums(
    'B/BA/BACH/Johann-0.001.tar.bz2' ) );

This method takes as its argument either a file name or a directory name relative to authors/id/. A directory is indicated by a trailing slash.

If the request is for the CHECKSUMS file, the return is a reference to a hash which contains the interpreted contents of the entire file.

If the argument is a file name other than CHECKSUMS, the return is a reference to the CHECKSUMS entry for that file, provided it exists.

If the argument is a directory name, it is treated like a request for the CHECKSUMS file in that directory.

If the CHECKSUMS file does not exist, an exception is raised. If the argument was a file name and the file has no entry in the CHECKSUMS file, nothing is returned.

For convenience, either the top or the top two directories can be omitted, since they can be reconstructed from the rest.

The result of the first fetch for a given directory is cached, and subsequent calls for the same directory are supplied from cache.
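One use for a per-file entry is to verify content you already have in hand. The sketch below assumes the entry is a hash reference with a 'sha256' key holding a hexadecimal digest, as is usual in CPAN CHECKSUMS files. The helper sha256_matches() and the sample data are invented for illustration.

```perl
use strict;
use warnings;
use Digest::SHA qw{ sha256_hex };

# True if the content's SHA256 digest matches the CHECKSUMS entry.
# Hypothetical helper: a missing entry, or one without a 'sha256' key,
# is treated as a failure to verify.
sub sha256_matches {
    my ( $content, $entry ) = @_;
    ref $entry eq 'HASH' && defined $entry->{sha256}
        or return 0;
    return sha256_hex( $content ) eq lc( $entry->{sha256} ) ? 1 : 0;
}

# Invented sample data. In real use the entry would come from
#     $cad->fetch_distribution_checksums( 'B/BA/BACH/Johann-0.001.tar.bz2' );
my $content = 'Not really a tarball';
my $entry   = { sha256 => sha256_hex( $content ) };

print sha256_matches( $content, $entry ) ? "OK\n" : "Mismatch\n";
```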

fetch_registered_module_index

This method fetches the registered module index, modules/03modlist.data.gz. It is interpreted, and returned as a hash reference keyed by module name.

If called in list context, the first return is the index, and the second is a hash reference containing the metadata that appears at the top of the expanded index file.

The results of the first fetch are cached; subsequent calls are supplied from cache.

flush

This method deletes all cached results, causing them to be re-fetched when needed.

indexed_distributions

This convenience method returns a list of all indexed distributions in ASCIIbetical order. This information is derived from the results of fetch_module_index(), and is cached.
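The derivation from the module index can be sketched as follows. The helper distributions_in() is hypothetical and works on invented sample data in the shape that fetch_module_index() is documented to return. Perl's default sort compares strings, which gives the ASCIIbetical order.

```perl
use strict;
use warnings;

# Return the unique distribution names in a module index, sorted with
# Perl's default (string) comparison. Hypothetical helper.
sub distributions_in {
    my ( $index ) = @_;
    my %seen = map { $_->{distribution} => 1 } values %{ $index };
    my @sorted = sort keys %seen;
    return @sorted;
}

# Invented sample data. In real use:
#     my $dist_index = $cad->fetch_module_index();
my $dist_index = {
    'PDQ' => {
        distribution => 'B/BA/BACH/PDQ-0.000_01.zip',
        version      => '0.000_01',
    },
    'PDQ::Opus' => {
        distribution => 'B/BA/BACH/PDQ-0.000_01.zip',
        version      => '0.000_01',
    },
    'Johann' => {
        distribution => 'B/BA/BACH/Johann-0.001.tar.bz2',
        version      => '0.001',
    },
};

print "$_\n" for distributions_in( $dist_index );
```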

SEE ALSO

CPAN::DistnameInfo, which parses distribution name and version (among other things) from the name of a particular distribution archive. This was very helpful in some of my CPAN ad-hocery.

SUPPORT

Support is by the author. Please file bug reports at http://rt.cpan.org, or by electronic mail to the author.

AUTHOR

Thomas R. Wyant, III wyant at cpan dot org

COPYRIGHT AND LICENSE

Copyright (C) 2012 by Thomas R. Wyant, III

This program is free software; you can redistribute it and/or modify it under the same terms as Perl 5.10.0. For more details, see the full text of the licenses in the directory LICENSES.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.