NAME

CGI::Uploader - Manage CGI uploads using an SQL database

Synopsis

# Create an upload object
# -----------------------

my($u) = CGI::Uploader -> new # Mandatory.
(
	dbh      => $dbh,  # Optional. Or specify in call to upload().
	dsn      => [...], # Optional. Or specify in call to upload().
	imager   => $obj,  # Optional. Or specify in call to upload's transform.
	manager  => $obj,  # Optional. Or specify in call to upload().
	query    => $q,    # Optional.
	temp_dir => $t,    # Optional.
);

# Upload N files
# --------------

my($meta_data) = $u -> upload # Mandatory.
(
	form_field_1 => # An arrayref of hashrefs. The keys are CGI form field names.
	[
		{ # First, mandatory, set of options for storing the uploaded file.
			column_map    => {...}, # Optional.
			dbh           => $dbh,  # Optional. But one of dbh or dsn is
			dsn           => [...], # Optional. mandatory if no manager.
			file_scheme   => $s,    # Optional.
			manager       => $obj,  # Optional. If present, all other params are optional.
			sequence_name => $s,    # Optional, but mandatory if Postgres and no manager.
			table_name    => $s,    # Optional if manager, but mandatory if no manager.
			transform     => {...}  # Optional.
		},
		{ # Second, etc, optional sets of options for storing copies of the file.
		},
	],
	form_field_2 => [...], # Another arrayref of hashrefs.
);

# Delete N files for each uploaded file
# -------------------------------------

my($report) = $u -> delete # Optional.
(
	column_map => {...}, # Mandatory.
	dbh        => $dbh,  # Optional. But one of dbh or dsn is
	dsn        => [...], # Optional. mandatory.
	id         => $id,   # Mandatory.
	table_name => $s,    # Mandatory.
);

# Generate N files from each uploaded file
# ----------------------------------------

$u -> generate # Optional.
(
	form_field_1 => [...], # Mandatory. An arrayref of hashrefs.
	form_field_2 => [...], # Mandatory. Another arrayref of hashrefs.
);

The simplest option, then, is to use

CGI::Uploader -> new() -> upload(file_name => [{dbh => $dbh, table_name => 'uploads'}]);

and let CGI::Uploader do all the work.

For Postgres, make that

CGI::Uploader -> new() -> upload(file_name => [{dbh => $dbh, sequence_name => 'uploads_id_seq', table_name => 'uploads'}]);

Description

CGI::Uploader is a pure Perl module.

Warning: V 2 'v' V 3

The API for CGI::Uploader version 3 is not compatible with the API for version 2.

This is because V 3 is a complete rewrite of the code, taking into account all the things learned from V 2.

Constructor and initialization

new() returns a CGI::Uploader object.

This is the class's constructor.

You must pass a hash to new().

Options:

dbh => $dbh

This key may be specified globally or in the call to upload().

See Details for an explanation, including how this key interacts with dsn.

This key (dbh) is optional.

dsn => [...]

This key may be specified globally or in the call to upload().

See Details for an explanation, including how this key interacts with dbh.

This key (dsn) is optional.

imager => $obj

This key may be specified globally or in the call to upload's transform.

This object is used to handle the transformation of images.

This key (imager) is optional.

manager => $obj

This key may be specified globally or in the call to upload().

This object is used to handle the transfer of meta-data into the database. See Meta-data.

This key (manager) is optional.

query => $q

Use this to pass in a query object.

This object is expected to belong to one of these classes:

Apache::Request
Apache2::Request
CGI

If not provided, an object of type CGI will be created and used to do the uploading.

If you want to use a different type of object, just ensure it has these CGI-compatible methods:

cgi_error()

This is only called if something goes wrong.

upload()
uploadInfo()
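
A minimal sketch of checking a candidate query object for those methods before passing it to new(). The package My::Query and the helper missing_query_methods() are hypothetical names used only for illustration. CGI::Uploader itself may not perform any such check.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a query class; not part of CGI::Uploader.
package My::Query;

sub new        { return bless {}, shift }
sub cgi_error  { return '' }
sub upload     { return }
sub uploadInfo { return {} }

package main;

# Return the names of the required methods which $q does not provide.
sub missing_query_methods
{
	my($q) = @_;

	return grep { ! $q -> can($_) } qw(cgi_error upload uploadInfo);
}

my(@missing) = missing_query_methods(My::Query -> new);
my($message) = @missing ? "Missing: @missing" : 'Query object looks usable';

print "$message\n";
```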

Warning: CGI::Simple cannot be supported. See this ticket, which is not resolved:

http://rt.cpan.org/Ticket/Display.html?id=14838

There is a comment in the source code of CGI::Simple about this issue. Search for 14838.

This key (query) is optional.

temp_dir => $string

Note the spelling of temp_dir.

If not provided, an object of type File::Spec will be created and its tmpdir() method called.

This key (temp_dir) is optional.
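
A small sketch of that fallback, assuming the options arrive in a plain hash. The %option hash is a stand-in for the hash passed to new(), and the sketch calls tmpdir() as a class method rather than on an object.

```perl
use strict;
use warnings;
use File::Spec;

# %option stands in for the hash passed to new(). It is empty here,
# so the File::Spec fallback is used.
my(%option)   = ();
my($temp_dir) = defined $option{temp_dir} ? $option{temp_dir} : File::Spec -> tmpdir;

print "Temporary files go in: $temp_dir\n";
```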

Transformation 'v' Generation

Transform is an optional component in the call to upload().

Generate() is a separate method.

This section discusses these 2 processes.

Transformation:

You must specify a CGI form field

This means transformation takes exactly 1 input file.

The file is uploaded before being transformed
The uploaded file is transformed and saved
The uploaded file is discarded
The transformed file's meta-data goes in the database

This means transformation outputs exactly 1 file.

Generation:

There is no upload associated with generation
The file used as a basis for generation must be in the database

This means generation takes exactly 1 input file.

So this input file was, presumably, uploaded at some time in the past, and may have been transformed at that time.

You specify how to generate a new file based on an old file

That is, you specify a set of options which control the generation of 1 new file.

You specify N >= 1 sets of such options

This means generation outputs N >= 1 new files.

The old file stays in the database
All the generated files' meta-data go in the database.

A typical use of generation would be to produce thumbnails of large images.

Method: delete(%hash)

Note: Methods are listed here in alphabetical order. So delete() comes before upload(). Nevertheless, the most detailed explanations of options are under upload(), with only brief notes here under delete().

You must pass a hash to delete().

delete(%hash) deletes everything associated with a given database table id.

The keys of this hash are reserved words, and the values are your options.

column_map => {...}

See Details for a discussion of column_map.

Note: If your column map does not contain the server_file_name key, delete(%hash) will do nothing because it won't be able to find any file names to delete.

This key (column_map) is optional.

dbh => $dbh

This key may be specified globally or in the call to delete().

See Details for an explanation, including how this key interacts with dsn.

This key (dbh) is optional.

dsn => [...]

This key may be specified globally or in the call to delete().

See Details for an explanation, including how this key interacts with dbh.

This key (dsn) is optional.

id => $id

This is the (primary) key of the database table which will be processed.

To specify a column name other than id, use the column_map option.

This key (id) is mandatory.

table_name => $string

This is the name of the database table.

This key (table_name) is mandatory.

There is no manager key because there is no point in you passing all these options to delete(%hash) just so this method can pass them all back to your manager.

The items deleted are:

All files generated from the uploaded file

They can be identified because their parent_id column matches $id, and their file names come from the server_file_name column.

The records in the table whose parent_id matches $id
The uploaded file

It can be identified because its id column matches $id, and its file name comes from the server_file_name column.

The record in the table whose id matches $id

delete(%hash) returns an array ref of hashrefs.

Each hashref has 2 keys and 2 values:

id => $id

$id is the value of the (primary) key column of a deleted file.

One of these $id values will be the $id you passed in to delete(%hash).

file_name => $string

$string is the name of a deleted file.

Method: generate(%hash)

You must pass a hash to generate().

The keys to this hash are:

column_map => {...}

The default column_map is documented under Details.

This key (column_map) is optional.

dbh => $dbh

Dbh is documented under Details.

At least one of dbh and dsn must be provided.

dsn => [...]

Dsn is documented under Details.

At least one of dbh and dsn must be provided.

file_scheme => $string

file_scheme is documented under Details.

File_scheme defaults to simple.

This key (file_scheme) is optional.

manager => $obj

Manager is documented under Details.

This key (manager) is optional.

path => $string

Path is documented under Details.

This key (path) is mandatory.

records => {...}

Records specifies which (primary) keys in the table are used to find files to process.

These files are input files, and the options in the hashref specify how to use those files to generate output files.

The keys in the hashref are the keys in the table. E.g.:

records => {1 => [...], 99 => [...]}

specifies that only records with ids of 1 and 99 are to be processed.

The name of the (primary) key column defaults to id, but you can use column_map to change that.

The name of the input file comes from the server_file_name column of the table. Use column_map to change that column name.

The arrayrefs are used to specify N >= 1 output files for each input file.

So, each arrayref contains N >= 1 hashrefs, and each hashref specifies how to generate 1 output file. E.g.:

records => {1 => [{...}, {...}], 99 => [{...}]}

This says use id 1 to generate 2 output files, and use id 99 to generate 1 output file.

To make life easier, if you only wish to generate a single output file, you can reduce this:

records => {99 => [{...}]}

to this:

records => {99 => {...} }

The structure of the inner-most hashrefs is exactly the same as the hashrefs pointed to by the transform key, documented at "transform". E.g.:

For an imager object of type Image::Magick:

records => {1 => [{imager => $obj, options => {width => $w, height => $h} }, {...}], 99 => [{...}]}

or, for an imager object of type Imager:

records => {1 => [{imager => $obj, options => {xpixels => $x, ypixels => $y} }, {...}], 99 => [{...}]}

CGI::Uploader takes care of the meta-data for each generated file. See Meta-data.

This key (records) is mandatory.
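
The single-hashref shorthand can be pictured as a normalization step. The sketch below is an assumption about how such a step might look, not the module's own code, and normalize_records() is a hypothetical name.

```perl
use strict;
use warnings;

# Wrap a lone hashref in an arrayref, so every id maps to a list of option sets.
sub normalize_records
{
	my($records) = @_;
	my(%normal);

	for my $id (keys %$records)
	{
		my($spec)    = $records -> {$id};
		$normal{$id} = ref($spec) eq 'HASH' ? [$spec] : $spec;
	}

	return \%normal;
}

my($records) = normalize_records
({
	1  => [{options => {width => 100} }, {options => {width => 50} }],
	99 => {options => {width => 100} },
});

for my $id (sort { $a <=> $b } keys %$records)
{
	my($count) = scalar @{$records -> {$id} };

	print "id $id: $count output file(s)\n";
}
```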

sequence_name => $string

Sequence_name is documented under Details.

This key is mandatory if you are using Postgres, and optional if not.

table_name => $string

This key (table_name) is mandatory.

Note: generate() returns a hashref of arrayrefs, where the keys of the hashref are the ids provided in the records hashref, and the arrayrefs list the ids of the files generated.

You can use this data, e.g., to read the meta-data from the database and populate form fields to inform the user of the results of the generation process.
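
As a sketch, the returned structure can be inverted to find the input file behind each generated file. The ids below are invented; only the hashref-of-arrayrefs shape comes from the note above.

```perl
use strict;
use warnings;

# Hypothetical return value of generate(): input id => arrayref of generated ids.
my($generated) = {1 => [101, 102], 99 => [103]};

# Map each generated id back to the id it was generated from.
my(%parent_of);

for my $input_id (keys %$generated)
{
	$parent_of{$_} = $input_id for @{$generated -> {$input_id} };
}

for my $id (sort { $a <=> $b } keys %parent_of)
{
	print "File $id was generated from file $parent_of{$id}\n";
}
```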

Method: upload(%hash)

You must pass a hash to upload().

The keys of this hash are CGI form field names (where the fields are of type file).

CGI::Uploader cycles through these keys, using each one in turn to drive a single upload.

Note: upload() returns an arrayref of hashrefs, one hashref for each uploaded file stored.

The hashrefs returned are not the meta-data associated with each uploaded file, but more like status reports.

These status reports are explained here, and the meta-data is explained in the next section.

The structure of these status hashrefs is 2 keys and 2 values:

field => CGI form field name
id => The value of the id column in the database

You can use this data, e.g., to read the meta-data from the database and populate form fields to inform the user of the results of the upload.
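
A sketch of grouping those status reports by form field, which is one way to prepare a per-field summary for the user. The $status data is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical return value of upload(): one status hashref per stored file.
my($status) =
[
	{field => 'file_name_1', id => 11},
	{field => 'file_name_1', id => 12},
	{field => 'file_name_2', id => 13},
];

# Group the database ids by the form field which produced them.
my(%ids_for);

push @{$ids_for{$_ -> {field} } }, $_ -> {id} for @$status;

for my $field (sort keys %ids_for)
{
	print "$field: ", join(', ', @{$ids_for{$field} }), "\n";
}
```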

Meta-data

Meta-data associated with each uploaded file is accumulated while upload() works.

Meta-data is a hashref, with these keys:

client_file_name

The client_file_name is the name supplied by the web client to CGI::Uploader. It may or may not have path information prepended, depending on the web client.

date_stamp

This value is the string 'now()', until the meta-data is saved in the database.

At that time, the value of the function now() is stored, except for SQLite, which just stores the string 'now()'.

Date_stamp has an underscore in it in case your database regards datestamp as a reserved word.

extension

This is provided by the File::Basename module.

The extension is a string without the leading dot.

If an extension cannot be determined, the value will be '', the empty string.

height

This is provided by the Image::Size module, if it recognizes the type of the file.

For non-image files, the value will be 0.

id

The id is (presumably) the primary key of your table.

This value is 0 until the meta-data is saved in the database.

In the case of Postgres, it will be populated by the sequence named with the sequence_name key.

mime_type

This is provided by the MIME::Types module, if it can determine the type.

If not, it is '', the empty string.

parent_id

This is populated when a file is generated from the uploaded file. Its value will be the id of the uploaded file's record.

For the uploaded file itself, the value will be 0.

server_file_name

The server_file_name is the name under which the file is finally stored on the file system of the web server. It is not the temporary file name used during the upload process.

size

This is the size in bytes of the uploaded file.

width

This is determined by the Image::Size module, if it recognizes the type of the file.

For non-image files, the value will be 0.
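
A sketch of what the meta-data might look like before anything is saved, using the placeholder values described above. The helper initial_meta_data() is hypothetical, and the empty server_file_name is an assumption, since the text does not state that key's initial value.

```perl
use strict;
use warnings;
use File::Basename;

# Build the part of the meta-data which is known before the database insert.
sub initial_meta_data
{
	my($client_file_name, $size) = @_;
	my(undef, undef, $suffix)    = fileparse($client_file_name, qr/\.[^.]*/);

	$suffix =~ s/^\.//; # Store 'png', not '.png'.

	my(%meta_data) =
	(
		client_file_name => $client_file_name,
		date_stamp       => 'now()',
		extension        => $suffix,
		height           => 0,
		id               => 0,
		mime_type        => '',
		parent_id        => 0,
		server_file_name => '', # Assumption: unknown until the permanent copy exists.
		size             => $size,
		width            => 0,
	);

	return \%meta_data;
}

my($meta_data) = initial_meta_data('photo.png', 2048);

print "$_ => '$meta_data->{$_}'\n" for sort keys %$meta_data;
```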

Processing Steps

A mini-synopsis:

$u -> upload
(
file_name_1 =>
[
{First set of storage options for this file},
{Second set of storage options for the same file},
{...},
],
);

Upload file

upload() calls do_upload() to do the work of uploading the caller's file to a temporary file.

This is done once, whereas the following steps are done once for each hashref of storage options you specify in the arrayref pointed to by the 'current' CGI form field's name.

do_upload() returns a hashref of meta-data associated with the file.

Transform the file

If requested, call do_transform().

Save the meta-data

upload() calls the do_insert() method on the manager object to insert the meta-data into the database.

The default manager is CGI::Uploader itself.

do_insert() saves the last insert id from that insert in the meta-data hashref.

Create the permanent file

upload() calls copy_temp_file() to save the file permanently.

copy_temp_file() saves the permanent file name in the meta-data hashref.

Determine the height and width of images

upload() calls the get_size() method to get the image size, which delegates the work to Image::Size.

get_size() saves the image's dimensions in the meta-data hashref.

Update the database with the permanent file's name and image size

upload() calls the do_update() method on the manager object to put the permanent file's name into the database record, along with the height and width.

Details

Each key in the hash passed in to upload() points to an arrayref of options which specifies how to process the form field.

Use multiple elements in the arrayref to store multiple sets of meta-data, all based on the same uploaded file.

Each hashref contains one or more of the following keys:

column_map => {...}

This hashref maps column names used by CGI::Uploader to column names used by your database table.

The default column_map is:

{
client_file_name => 'client_file_name',
date_stamp       => 'date_stamp',
extension        => 'extension',
height           => 'height',
id               => 'id',
mime_type        => 'mime_type',
parent_id        => 'parent_id',
server_file_name => 'server_file_name',
size             => 'size',
width            => 'width',
}

If you supply a different column map, the values on the right-hand side are the ones you change.

Points to note:

Omitting keys

If you omit any keys from your map, the corresponding meta-data will not be saved.

This key (column_map) is optional.
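
A sketch of how a column map could be applied to a meta-data hashref, including the rule that omitted keys are not saved. The helper map_columns() and the column names original_name and bytes are hypothetical.

```perl
use strict;
use warnings;

# Translate meta-data keys into table column names. Keys missing from the
# column map are dropped, so that meta-data is not saved.
sub map_columns
{
	my($meta_data, $column_map) = @_;
	my(%row);

	for my $key (keys %$column_map)
	{
		$row{$column_map -> {$key} } = $meta_data -> {$key} if exists $meta_data -> {$key};
	}

	return \%row;
}

my($row) = map_columns
(
	{client_file_name => 'cat.png', size => 1234, width => 640},
	{client_file_name => 'original_name', size => 'bytes'},
);

print "$_ => $row->{$_}\n" for sort keys %$row;
```

Here width is absent from the map, so it does not appear in the row to be stored.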

dbh => $dbh

This is a database handle for use by the default manager class (which is just CGI::Uploader) discussed below, under manager.

This key is optional if you use the manager key, since in that case you can do anything in your own storage manager code.

If you do provide the dbh key, it is passed in to your manager just in case you need it.

Also, if you provide dbh, the dsn key, below, is ignored.

If you do not provide the dbh key, the default manager uses the dsn arrayref to create a dbh via DBI.

dsn => [...]

This key is optional if you use the manager key, since in that case you can do anything in your own storage manager code.

If you do provide the dsn key, it is passed in to your manager just in case you need it.

Using the default manager, this key is ignored if you provide a dbh key, but it is mandatory when you do not provide a dbh key.

The elements in the arrayref are:

A connection string

E.g.: 'dbi:Pg:dbname=test'

This element is mandatory.

A username string

This element is mandatory, even if it's just the empty string.

A password string

This element is mandatory, even if it's just the empty string.

A connection attributes hashref

This element is optional.

The default manager class calls DBI -> connect(@$dsn) to connect to the database, i.e. in order to generate a dbh, when you don't provide a dbh key.
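
A sketch of checking a dsn arrayref against the rules above before it is handed to DBI. The helper dsn_problem() is hypothetical. It only inspects the structure and does not connect to a database.

```perl
use strict;
use warnings;

# Return a description of the first problem found, or '' if none.
# Expected layout: connection string, username, password, optional attributes.
sub dsn_problem
{
	my($dsn) = @_;

	return 'dsn must be an arrayref'          if ref($dsn) ne 'ARRAY';
	return 'dsn needs 3 or 4 elements'        if @$dsn < 3 || @$dsn > 4;
	return 'connection string is mandatory'   if ! defined($dsn -> [0]) || ! length($dsn -> [0]);
	return 'username and password must be defined'
		if ! defined($dsn -> [1]) || ! defined($dsn -> [2]);
	return 'attributes must be a hashref'     if @$dsn == 4 && ref($dsn -> [3]) ne 'HASH';
	return '';
}

my($dsn)     = ['dbi:Pg:dbname=test', '', '', {AutoCommit => 1, RaiseError => 1}];
my($message) = dsn_problem($dsn) || 'dsn looks usable';

print "$message\n";
```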

file_scheme => $string

File_scheme controls how files are stored on the web server's file system.

All files are stored in the directory specified by the path option.

Each file name has the appropriate extension appended (as determined by MIME::Types).

The possible values of file_scheme are:

md5

The file name is determined like this:

Digest::MD5

Use the (primary key) id (returned by storing the meta-data in the database) to seed the Digest::MD5 module.

Create 3 subdirectories

Use the first 3 digits of the hex digest of the id to generate 3 levels of sub-directories.

Add the name

The file name is the (primary key) id.

simple

The file name is the (primary key) id.

Simple is the default.

This key (file_scheme) is optional.
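
A sketch of both schemes. It assumes that each of the first 3 hex digits names one directory level and that the extension is joined to the id with a dot. The module's exact layout may differ, and permanent_file_name() is a hypothetical helper.

```perl
use strict;
use warnings;
use Digest::MD5 'md5_hex';
use File::Spec;

# Build the permanent file name for a given path, scheme, id and extension.
sub permanent_file_name
{
	my($path, $scheme, $id, $extension) = @_;
	my($name) = length($extension) ? "$id.$extension" : $id;

	# Assumption: md5 uses one hex digit of the digest per directory level.
	my(@dir) = $scheme eq 'md5' ? split(//, substr(md5_hex($id), 0, 3) ) : ();

	return File::Spec -> catfile($path, @dir, $name);
}

my($simple) = permanent_file_name('/tmp/uploads', 'simple', 42, 'png');
my($md5)    = permanent_file_name('/tmp/uploads', 'md5', 42, 'png');

print "$simple\n$md5\n";
```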

manager => $object

This is an instance of your class which will manage the transfer of meta-data to a database table.

In the case you provide the manager key, your object is responsible for saving (or discarding!) the meta-data.

If you provide an object here, CGI::Uploader will call $object -> do_insert($field_name, $meta_data, $store_option).

Parameters are:

$field_name

$field_name will be the 'current' CGI form field.

Remember, upload() is iterating over all your CGI form field parameters at this point.

$meta_data

$meta_data will be a hashref of options generated by the uploading process.

See Meta-data, for the definition of meta-data.

$store_option

$store_option will be the 'current' hashref of storage options, one of the arrayref elements associated with the 'current' form field.

If you do not provide the manager key, CGI::Uploader will do the work itself.

Later, CGI::Uploader will call $object -> do_update($field_name, $meta_data, $store_option), as explained above, under Processing Steps.

This key (manager) is optional.
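
A sketch of a custom manager which keeps meta-data in memory instead of a database. The class My::Manager and its internals are hypothetical. Only the method names and parameter lists come from the description above, and the calls at the end imitate what upload() is described as doing.

```perl
use strict;
use warnings;

# Hypothetical manager; not part of CGI::Uploader.
package My::Manager;

sub new { return bless {next_id => 1, row => {} }, shift }

# Store the meta-data and record the new id in the meta-data hashref.
sub do_insert
{
	my($self, $field_name, $meta_data, $store_option) = @_;
	my($id) = $self -> {next_id}++;

	$meta_data -> {id}    = $id;
	$self -> {row}{$id}   = {%$meta_data, field => $field_name};

	return;
}

# Copy the permanent file's name and image size into the stored record.
sub do_update
{
	my($self, $field_name, $meta_data, $store_option) = @_;
	my($row) = $self -> {row}{$meta_data -> {id} };

	$row -> {$_} = $meta_data -> {$_} for qw(server_file_name height width);

	return;
}

package main;

my($manager)   = My::Manager -> new;
my($meta_data) = {client_file_name => 'cat.png', id => 0, height => 0, width => 0};

$manager -> do_insert('file_name_1', $meta_data, {});

# Stand-ins for the work done between the two manager calls.
@$meta_data{qw(server_file_name height width)} = ('/tmp/uploads/1.png', 480, 640);

$manager -> do_update('file_name_1', $meta_data, {});

print "Stored id $meta_data->{id} as $manager->{row}{1}{server_file_name}\n";
```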

path => $string

This is a path on the web server's file system where a permanent copy of the uploaded file will be saved.

This key (path) is mandatory.

sequence_name => $string

This is the name of the sequence used to generate values for the primary key of the table.

You would normally only need this when using Postgres.

This key is optional if you use the manager key, since in that case you can do anything in your own storage manager code. If you do provide the sequence_name key, it is passed in to your manager just in case you need it.

This key is mandatory if you use Postgres and do not use the manager key, since without the manager key, sequence_name must be passed in to the default manager (CGI::Uploader).

table_name => $string

This is the name of the table into which to store the meta-data.

This key is optional if you use the manager key, since in that case you can do anything in your own storage manager code. If you do provide the table_name key, it is passed in to your manager just in case you need it.

This key is mandatory if you do not use the manager key, since without the manager key, table_name must be passed in to the default manager (CGI::Uploader).

transform => {...}

This key points to a set of options which are used to transform the uploaded file.

As stated above, transformation takes 1 input file, uploads it, transforms it, saves the transformed file, and discards the uploaded file.

See also generate(), for a completely different way of processing files.

Here are the 2 examples I used in testing, but not at the same time!

 transform =>
 {
	 imager  => Image::Magick -> new(), # Optional. Default.
	 options => {height => 400, width => 500},
 }

 transform =>
 {
	 imager  => Imager -> new(),
	 options => {xpixels => 400, ypixels => 500},
 }

Clearly, transform points to a hashref:

imager => $obj

The imager key is optional. If omitted, CGI::Uploader creates an object of type Image::Magick, and uses that.

You can pass in an object whose class is a descendant of Image::Magick or Imager.

They are treated differently, as explained next.

height => 'Int', width => 'Int'

If the $obj isa('Image::Magick') you must pass in at least 1 of height and width.

The missing one is calculated from the size of the input image and the given parameter.

Here's what happens:

if ($$option{'imager'} -> isa('Image::Magick') )
{
	my($result)     = $$option{'imager'} -> Read($old_file_name);
	my($dimensions) = $self -> calculate_dimensions($$option{'imager'}, $option);
	$result         = $$option{'imager'} -> Resize($dimensions);
	$result         = $$option{'imager'} -> Write($temp_file_name);
}
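
A sketch of how a missing dimension could be derived, assuming the aspect ratio of the input image is preserved. This is an illustration, not the body of calculate_dimensions(), and fill_dimensions() is a hypothetical name.

```perl
use strict;
use warnings;

# Fill in a missing height or width, assuming the aspect ratio is preserved.
sub fill_dimensions
{
	my($old_width, $old_height, %want) = @_;

	die "Specify at least one of height and width\n" if ! ($want{height} || $want{width});

	# Only the missing value is computed; ||= leaves a supplied value alone.
	$want{width}  ||= int($old_width  * $want{height} / $old_height + 0.5);
	$want{height} ||= int($old_height * $want{width}  / $old_width  + 0.5);

	return ($want{width}, $want{height});
}

my($width, $height) = fill_dimensions(1000, 800, width => 500);

print "Resize to ${width}x$height\n";
```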

Note: calculate_dimensions() calls Get('width', 'height').

This means if you wish to intercept these calls with a custom object, your Image::Magick-based object must respond to these calls:

Get()
Read()
Resize()
Write()

options => {xpixels => 400, ypixels => 500}

If the $obj isa('Imager') you must pass in suitable parameters for Imager's scale() method.

Any such parameters are acceptable. I just used xpixels and ypixels during testing.

Here's what happens:

if ($$option{'imager'} -> isa('Imager') )
{
	my($result)     = $$option{'imager'} -> read(file => $old_file_name, type => $$meta_data{'extension'});
	my($new_image)  = $$option{'imager'} -> scale(%{$$option{'options'} });
	my($extension)  = $$meta_data{'extension'};
	$extension      = $extension ? ".$extension" : '';
	$temp_file_name = "$temp_file_name$extension";
	$result         = $new_image -> write(file => $temp_file_name, type => $$meta_data{'extension'});
}

So, to intercept these calls, a descendant of Imager must respond to these calls:

read()
scale()
write()

This key (transform) is optional.

Sample Code

Most of the features in CGI::Uploader are demonstrated in samples shipped with the distro:

Config data

Patch lib/CGI/Uploader/.ht.cgi.uploader.conf as desired.

This is used by CGI::Uploader::Config and hence by CGI::Uploader::Test.

CGI forms

Copy the directory htdocs/uploads/ to the doc root of your web server.

CGI scripts

Copy the files in cgi-bin/ to your cgi-bin directory.

As explained above, don't expect use.cgi.simple.pl to work.

Also, use.cgi.uploader.v2.pl will not run if you have installed V 3 over the top of V 2.

Run the CGI scripts

Point your web client at:

/cgi-bin/use.cgi.pl
/cgi-bin/use.cgi.uploader.v3.pl

You can enter 1 or 2 file names in each CGI form.

The code executed is actually in CGI::Uploader::Test.

See the method use_cgi_uploader_v3() in that module for one way of utilizing the data returned by upload().

Command line scripts

The scripts/ directory contains various sample programs.

In particular, see scripts/test.generate.pl.

Note: to run this program you will have already uploaded one or more files, and Apache will have created a directory structure according to your path option, and will own that path.

So, you may need to use sudo to run scripts/test.generate.pl, since it will write temporary files to the same path.

Modules Used and Required

Both Build.PL and Makefile.PL list the modules used by CGI::Uploader.

Further to those, user options can trigger the use of these modules:

Config::IniFiles

If you use CGI::Uploader::Test, it uses CGI::Uploader::Config, which uses Config::IniFiles.

DBD::Pg

I used Postgres when writing and testing V 3, and hence I used DBD::Pg.

Examine lib/CGI/Uploader/.ht.cgi.uploader.conf for details. This file is read in by CGI::Uploader::Config.

DBD::SQLite

A quick test with SQLite worked, too.

The test only requires changing .ht.cgi.uploader.conf and re-running scripts/create.table.pl. E.g.:

dsn=dbi:SQLite:dbname=/tmp/test
password=
table_name=uploads
username=

Also, after running scripts/create.table.pl, use 'chmod a+w /tmp/test' so that the Apache daemon can write to the database.

One last thing. SQLite does not interpret the function now(); it just puts that string in the date_stamp column. Oh, well.

DBI

If you do not specify a manager object, CGI::Uploader uses DBI.

DBIx::Admin::CreateTable

If you use CGI::Uploader::Test to create the table, via scripts/create.table.pl, you'll need DBIx::Admin::CreateTable.

Digest::MD5

If you set the file_scheme option to md5, you'll need Digest::MD5.

HTML::Template

If you want to run any of the test scripts in cgi-bin/, you'll need HTML::Template.

Image::Magick

If you specify the transform option without the imager option, CGI::Uploader uses Image::Magick.

FAQ

Specifying the file name on the server

This feature is not provided, for various reasons.

One problem is sabotage.

Another problem is users specifying characters which are illegal in file names on the server.

In other words, this feature was considered and rejected.

API changes from V 2 to V 3

API changes between V 2 and V 3 are obviously enormous. A direct comparison doesn't make much sense.

However, here are some things to watch out for:

Various columns have different (default) names

Default file extension

Under V 2, a file called 'x' would be saved by force with a name of 'x.bin'.

V 3 does not change file names, so 'x' will be stored in the database as 'x'.

The dot in the file extension

Under V 2, a file called 'x.png' would have '.png' stored in the extension column of the database.

V 3 only stores 'png'.

The id of the last record inserted

Under V 2, various mechanisms were used to retrieve this value.

V 3 calls $dbh -> last_insert_id(), unless of course you've circumvented this by supplying your own manager object.

The file name on the server

Under V 2, the permanent file name was not stored as part of the meta-data.

V 3 stores this information.

Datestamps

Under V 2, the datestamp of when the file was uploaded was not saved.

V 3 stores this information.

How come there is no update option like there was in V 2?

Errr, it's been renamed to delete() and upload().

Changes

See Changes and Changelog.ini. The latter is machine-readable, using Module::Metadata::Changes.

Public Repository

V 3 is available from github: git:github.com/ronsavage/cgi--uploader.git

Authors

V 2 was written by Mark Stosberg <mark@summersault.com>.

V 3 was written by Ron Savage <ron@savage.net.au>.

Ron's home page: http://savage.net.au/index.html

Licence

Artistic.