NAME

CPANDB::Distribution - CPANDB class for the distribution table

DESCRIPTION

CPANDB::Distribution provides an object representation of a distribution in the CPAN. Because many CPAN websites are oriented around distributions, this class serves as one of the primary integration points for the various CPAN databases that provide popularity and testing data.

Distributions are also the primary plane on which graph-aware algorithms are run, and on which metrics are calculated.

METHODS

base

my $namespace = CPANDB::Distribution->base; # Returns 'CPANDB'

Normally you will only need to work directly with a table class, and only with one ORLite package.

However, if for some reason you need to work with multiple ORLite packages at the same time without hardcoding the root namespace all the time, you can determine the root namespace from an object or table class with the base method.

table

print CPANDB::Distribution->table; # Returns 'distribution'

While you should not need the name of the table for any simple operations, from time to time you may need it programmatically. If you do need it, you can use the table method to get the table name.

load

my $object = CPANDB::Distribution->load( $distribution );

If your table has a single-column primary key, a load method will be generated in the class. If there is no primary key, the method is not created.

The load method provides a shortcut mechanism for fetching a single object based on the value of the primary key. However, it should only be used in cases where your code trusts the record to already exist.

It returns a CPANDB::Distribution object, or throws an exception if the object does not exist.

select

# Get all objects in list context
my @list = CPANDB::Distribution->select;

# Get a subset of objects in scalar context
my $array_ref = CPANDB::Distribution->select(
    'where distribution > ? order by distribution',
    1000,
);

The select method executes a typical SQL SELECT query on the distribution table.

It takes an optional argument of a SQL phrase to be added after the FROM distribution section of the query, followed by variables to be bound to the placeholders in the SQL phrase. Any SQL that is compatible with SQLite can be used in the parameter.

Returns a list of CPANDB::Distribution objects when called in list context, or a reference to an ARRAY of CPANDB::Distribution objects when called in scalar context.

Throws an exception on error, typically directly from the DBI layer.
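The list-versus-scalar behaviour described above follows the standard Perl wantarray idiom. The following is a minimal sketch of that idiom only: select_rows and its sample data are hypothetical stand-ins, not part of CPANDB or ORLite, and no database is involved.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a select-style method: it returns a list
# in list context and an ARRAY reference in scalar context.
sub select_rows {
    my @rows = map { +{ distribution => $_ } } qw( Config-Tiny File-Remove PPI );
    return wantarray ? @rows : \@rows;
}

my @list      = select_rows();    # list context
my $array_ref = select_rows();    # scalar context

printf "%d rows in list context\n", scalar @list;
printf "%d rows via the ARRAY reference\n", scalar @$array_ref;
```

The calling context alone decides which form comes back, so assigning the result to an array or to a scalar is all that is needed to choose between them.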

iterate

CPANDB::Distribution->iterate( sub { print $_->distribution . "\n"; } );

The iterate method enables the processing of large tables one record at a time without having to load them all into memory in advance.

This plays to the strengths of SQLite, allowing it to do the work of loading an arbitrarily large stream of records from disk while retaining the full power of Perl when processing the records.

The last argument to iterate must be a subroutine reference that will be called for each element in the list, with the object provided in the topic variable $_.

This makes the iterate code fragment above functionally equivalent to the following, except with an O(1) memory cost instead of O(n).

foreach ( CPANDB::Distribution->select ) {
    print $_->distribution . "\n";
}

You can filter the list via SQL in the same way you can with select.

CPANDB::Distribution->iterate(
    'where author = ? order by distribution',
    'ADAMK',
    sub {
        print $_->distribution . "\n";
    }
);

You can also use it in raw form from the root namespace for better control. Using this form also allows for the use of arbitrarily complex queries, including joins. Instead of being objects, rows are provided as ARRAY references when used in this form.

CPANDB->iterate(
    'select distribution from distribution order by distribution',
    sub {
        print $_->[0] . "\n";
    }
);
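The calling convention described for iterate (a handler invoked once per record, with the record in $_) can be illustrated without a database. This is a minimal sketch under stated assumptions: iterate_rows and the sample row source are hypothetical, and the real method reads rows from SQLite rather than from an array.

```perl
use strict;
use warnings;

# Hypothetical iterate-style helper: pulls one row at a time from a
# fetch callback and hands it to the handler in the topic variable $_.
sub iterate_rows {
    my ( $fetch, $handler ) = @_;
    while ( defined( my $row = $fetch->() ) ) {
        local $_ = $row;    # only the current row is held at any time
        $handler->();
    }
    return 1;
}

# Sample row source standing in for a database cursor
my @pending = qw( Config-Tiny File-Remove PPI );
my $fetch   = sub { shift @pending };

my @seen;
iterate_rows( $fetch, sub { push @seen, $_; print "$_\n" } );
```

Because each row is fetched, handled, and then dropped before the next fetch, the helper itself never holds more than one row, which is the property that gives the O(1) memory cost.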

count

# How many objects are in the table
my $rows = CPANDB::Distribution->count;

# How many objects match a condition
my $small = CPANDB::Distribution->count(
    'where distribution > ?',
    1000,
);

The count method executes a SELECT COUNT(*) query on the distribution table.

It takes an optional argument of a SQL phrase to be added after the FROM distribution section of the query, followed by variables to be bound to the placeholders in the SQL phrase. Any SQL that is compatible with SQLite can be used in the parameter.

Returns the number of objects that match the condition.

Throws an exception on error, typically directly from the DBI layer.

uploaded_datetime

For situations in which date math will be done, the uploaded_datetime method can be used to access the date of the most recent release as a DateTime object.

Returns a DateTime object with date attributes, without a time set, and set at the UTC timezone in the 'C' locale.

age

The age method returns the age of a distribution as a duration, where a high age is considered a negative property. It is calculated as the time between the current day and the date of the last release of the distribution.

Returns a DateTime::Duration object, with maximum resolution at the 'day' level.

age_months

When expressing the age of a module in the CPAN, the most natural unit to use is the month. Releasing more than once a month is considered a high pace for a CPAN distribution, while releasing every few months, or even less often, is fairly reasonable for mature modules.

For situations when a single numeric value is desired to represent a distribution's age, the age_months method provides a direct calculation of this value.

Returns the age in months as an integer, with a value of zero if the module has been released within the last month.
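To make the "whole months, zero within the first month" behaviour concrete, here is a minimal sketch using only the core Time::Piece module. It is not the module's own implementation: whole_months_between is a hypothetical helper, the YYYY-MM-DD date format is an assumption, and both dates are passed in explicitly rather than taken from the database or the system clock.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper: whole months elapsed between two YYYY-MM-DD dates.
sub whole_months_between {
    my ( $from, $to ) = map { Time::Piece->strptime( $_, '%Y-%m-%d' ) } @_;
    my $months = ( $to->year - $from->year ) * 12 + ( $to->mon - $from->mon );
    $months-- if $to->mday < $from->mday;    # last month not yet complete
    return $months < 0 ? 0 : $months;
}

print whole_months_between( '2010-01-15', '2010-02-10' ), "\n";    # 0
print whole_months_between( '2010-01-15', '2010-03-15' ), "\n";    # 2
print whole_months_between( '2009-11-30', '2010-03-01' ), "\n";    # 3
```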

quartile

The quartile method determines which statistical age quartile the distribution falls into, between 1 and 4.

Quartile 1 represents "new" or "current" modules, which have seen a release within the last year or so.

Distributions in the first quartile are usually going to work and be up to date, as the author is likely to be present (for new modules) or maintains the module to a reasonable level of diligence (for older modules).

Quartile 2 represents "mature" or "stale" modules, which have seen a release in the last several years, but not recently.

Smaller distributions with high CPAN Testers PASS rates and low or no bug counts are often simply "mature" as they are essentially "finished" and don't need to be extended. These will often see a new release only every 2 or 3 years to fix trivial issues or match changes in some underlying dependency which has changed.

Larger distributions with non-perfect CPAN Testers PASS rates (or those with high bug counts) can be considered to be "stale". However, their authors are still likely to be around. Contacting the author may result in new releases due to your attention, or the author might be interested in handing off the module for maintenance.

Quartile 3 represents "old" or "rotten" modules, which have not seen a release in the last several years.

Modules in this range with a high number of other modules depending on them may simply be "old" and suffering from benign neglect due to the author moving on to other careers, languages, or projects and not anointing a replacement.

They aren't actively broken, but nobody remains to maintain them and they may have crufty and hard to learn codebases. Due to entrenched workarounds for any bugs they have, they can also be risky to change.

Modules without downstream dependencies in this zone are often "rotten", broken due to changes in the modules around them and abandoned by anything that used to depend on them.

It can also be common to find modules in this range labelled as "DEPRECATED" in the abstract.

Quartile 4 represents the garbage dump of the CPAN. These tend to be largely abandoned modules, ideas that failed dramatically, Acme:: joke modules that have never needed updated releases, or modules that have been replaced wholesale by updated core techniques or dramatically superior replacements.

The oldest of these modules date from the age of Perl 4 and the earliest Perl 5.
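As an illustration of what an "age quartile" means, the following sketch assigns a rank-based quartile to one age within a population of ages. It rests on assumptions throughout: age_quartile is a hypothetical helper, the sample ages are invented for the example, and the text above does not state how the quartile method actually derives its boundaries.

```perl
use strict;
use warnings;

# Hypothetical helper: rank-based quartile (1 to 4) of one age within a
# population of ages, where quartile 1 holds the youngest quarter.
sub age_quartile {
    my ( $age, @population ) = @_;
    return undef unless @population;
    my $younger  = grep { $_ < $age } @population;
    my $quartile = int( 4 * $younger / scalar @population ) + 1;
    return $quartile > 4 ? 4 : $quartile;
}

my @ages = ( 1, 3, 8, 14, 30, 45, 70, 120 );    # sample ages in months
print age_quartile( 3,   @ages ), "\n";    # 1
print age_quartile( 30,  @ages ), "\n";    # 3
print age_quartile( 120, @ages ), "\n";    # 4
```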

dependency_graph

This method generates a Graph::Directed object linking the distribution to all the other distributions that are needed by it to work, recursively.
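The recursive linking described above amounts to a graph walk that collects one edge per direct dependency. The sketch below shows that walk in plain Perl under stated assumptions: the %requires data and the dependency_edges helper are hypothetical, and neither Graph::Directed nor the CPANDB database is used.

```perl
use strict;
use warnings;

# Hypothetical sample data: each distribution and its direct dependencies
my %requires = (
    'App-Example' => [ 'PPI', 'Config-Tiny' ],
    'PPI'         => [ 'Clone', 'Params-Util' ],
    'Config-Tiny' => [],
    'Clone'       => [],
    'Params-Util' => [],
);

# Collect every edge reachable from $root, visiting each node only once
# so that shared or circular dependencies cannot cause endless recursion.
sub dependency_edges {
    my ( $root, $seen ) = @_;
    $seen ||= {};
    return () if $seen->{$root}++;
    my @edges;
    for my $dep ( @{ $requires{$root} || [] } ) {
        push @edges, [ $root, $dep ], dependency_edges( $dep, $seen );
    }
    return @edges;
}

print join( ' -> ', @$_ ), "\n" for dependency_edges('App-Example');
```

Reversing the direction of the lookup (from a distribution to the distributions that require it) gives the corresponding walk for dependants.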

dependants_graph

This method generates a Graph::Directed object linking the distribution to all the other distributions that use the distribution, recursively.

dependency_easy

This method generates a Graph::Easy object linking the distribution to all the other distributions that are needed by it to work, recursively.

dependants_easy

This method generates a Graph::Easy object linking the distribution to all the other distributions that use the distribution, recursively.

dependency_graphviz

This method generates a GraphViz object linking the distribution to all the other distributions that are needed by it to work, recursively.

dependants_graphviz

This method generates a GraphViz object linking the distribution to all the other distributions that use the distribution, recursively.

dependency_xgmml

This method generates a Graph::XGMML object linking the distribution to all the other distributions that are needed by it to work, recursively.

dependants_xgmml

This method generates a Graph::XGMML object linking the distribution to all the other distributions that use the distribution, recursively.

ACCESSORS

distribution

if ( $object->distribution ) {
    print "Object has been inserted\n";
} else {
    print "Object has not been inserted\n";
}

Returns true, or throws an exception on error.

REMAINING ACCESSORS TO BE COMPLETED

SQL

The distribution table was originally created with the following SQL command.

CREATE TABLE distribution (
    distribution TEXT NOT NULL PRIMARY KEY,
    version TEXT NULL,
    author TEXT NOT NULL,
    meta INTEGER NOT NULL,
    license TEXT NULL,
    release TEXT NOT NULL,
    uploaded TEXT NOT NULL,
    pass INTEGER NOT NULL,
    fail INTEGER NOT NULL,
    unknown INTEGER NOT NULL,
    na INTEGER NOT NULL,
    rating TEXT NULL,
    ratings INTEGER NOT NULL,
    weight INTEGER NOT NULL,
    volatility INTEGER NOT NULL,
    FOREIGN KEY (
        author
    )
    REFERENCES author (
        author
    )
)

SUPPORT

CPANDB::Distribution is part of the CPANDB API.

See the documentation for CPANDB for more information.

COPYRIGHT

Copyright 2009 - 2010 Adam Kennedy.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.