NAME
CGI::Uploader - Manage CGI uploads using an SQL database
Synopsis
# Create an upload object
# -----------------------
my($u) = CGI::Uploader -> new # Mandatory.
(
dbh => $dbh, # Optional. Or specify in call to upload().
dsn => [...], # Optional. Or specify in call to upload().
imager => $obj, # Optional. Or specify in call to upload's transform.
manager => $obj, # Optional. Or specify in call to upload().
query => $q, # Optional.
temp_dir => $t, # Optional.
);
# Upload N files
# --------------
my($meta_data) = $u -> upload # Mandatory.
(
form_field_1 => # An arrayref of hashrefs. The keys are CGI form field names.
[
{ # First, mandatory, set of options for storing the uploaded file.
column_map => {...}, # Optional.
dbh => $dbh, # Optional. But one of dbh or dsn is
dsn => [...], # Optional. mandatory if no manager.
file_scheme => $s, # Optional.
manager => $obj, # Optional. If present, all other params are optional.
sequence_name => $s, # Optional, but mandatory if Postgres and no manager.
table_name => $s, # Optional if manager, but mandatory if no manager.
transform => {...} # Optional.
},
{ # Second, etc, optional sets of options for storing copies of the file.
},
],
form_field_2 => [...], # Another arrayref of hashrefs.
);
# Delete N files for each uploaded file
# -------------------------------------
my($report) = $u -> delete # Optional.
(
column_map => {...}, # Mandatory.
dbh => $dbh, # Optional. But one of dbh or dsn is
dsn => [...], # Optional. mandatory.
id => $id, # Mandatory.
table_name => $s, # Mandatory.
);
# Generate N files from each uploaded file
# ----------------------------------------
$u -> generate # Optional.
(
form_field_1 => [...], # Mandatory. An arrayref of hashrefs.
form_field_2 => [...], # Mandatory. Another arrayref of hashrefs.
);
The simplest option, then, is to use
CGI::Uploader -> new() -> upload(file_name => [{dbh => $dbh, table_name => 'uploads'}]);
and let CGI::Uploader do all the work.
For Postgres, make that
CGI::Uploader -> new() -> upload(file_name => [{dbh => $dbh, sequence_name => 'uploads_id_seq', table_name => 'uploads'}]);
Description
CGI::Uploader is a pure Perl module.
Warning: V 2 'v' V 3
The API for CGI::Uploader version 3 is not compatible with the API for version 2.
This is because V 3 is a complete rewrite of the code, taking into account all the things learned from V 2.
Constructor and initialization
new() returns a CGI::Uploader object.
This is the class's constructor.
You must pass a hash to new().
Options:
- dbh => $dbh
-
This key may be specified globally or in the call to upload().
See Details for an explanation, including how this key interacts with dsn.
This key (dbh) is optional.
- dsn => [...]
-
This key may be specified globally or in the call to upload().
See Details for an explanation, including how this key interacts with dbh.
This key (dsn) is optional.
- imager => $obj
-
This key may be specified globally or in the call to upload()'s transform.
This object is used to handle the transformation of images.
This key (imager) is optional.
- manager => $obj
-
This key may be specified globally or in the call to upload().
This object is used to handle the transfer of meta-data into the database. See Meta-data.
This key (manager) is optional.
- query => $q
-
Use this to pass in a query object.
This object is expected to belong to one of these classes:
If not provided, an object of type CGI will be created and used to do the uploading.
If you want to use a different type of object, just ensure it has these CGI-compatible methods:
Warning: CGI::Simple cannot be supported. See this ticket, which is not resolved:
http://rt.cpan.org/Ticket/Display.html?id=14838
There is a comment in the source code of CGI::Simple about this issue. Search for 14838.
This key (query) is optional.
- temp_dir => $string
-
Note the spelling of temp_dir.
If not provided, an object of type File::Spec will be created and its tmpdir() method called.
This key (temp_dir) is optional.
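The temp_dir fallback described above can be sketched with File::Spec alone. This is only an illustration, not the module's own code: resolve_temp_dir() is a hypothetical helper name, and the directory /var/tmp/uploads is made up.

```perl
use strict;
use warnings;

use File::Spec;

# Hypothetical helper: return the caller's temp_dir if one was given,
# otherwise fall back to the system's temporary directory.
sub resolve_temp_dir
{
    my(%option) = @_;

    return defined($option{'temp_dir'}) ? $option{'temp_dir'} : File::Spec -> tmpdir();
}

my($default)  = resolve_temp_dir();
my($explicit) = resolve_temp_dir(temp_dir => '/var/tmp/uploads');

print "Default:  $default\n";
print "Explicit: $explicit\n";
```

The value printed for the default case depends on the operating system and environment.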
Transformation 'v' Generation
Transform is an optional component in the call to upload().
Generate() is a separate method.
This section discusses these 2 processes.
Transformation:
- You must specify a CGI form field
-
This means transformation takes exactly 1 input file.
- The file is uploaded before being transformed
- The uploaded file is transformed and saved
- The uploaded file is discarded
- The transformed file's meta-data goes in the database
-
This means transformation outputs exactly 1 file.
Generation:
- There is no upload associated with generation
- The file used as a basis for generation must be in the database
-
This means generation takes exactly 1 input file.
So this input file was, presumably, uploaded at some time in the past, and may have been transformed at that time.
- You specify how to generate a new file based on an old file
-
That is, you specify a set of options which control the generation of 1 new file.
- You specify N >= 1 sets of such options
-
This means generation outputs N >= 1 new files.
- The old file stays in the database
- All the generated files' meta-data go in the database.
A typical use of generation would be to produce thumbnails of large images.
Method: delete(%hash)
Note: Methods are listed here in alphabetical order. So delete() comes before upload().
Nevertheless, the most detailed explanations of options are under upload(), with only brief notes here under delete().
You must pass a hash to delete().
delete(%hash) deletes everything associated with a given database table id.
The keys of this hash are reserved words, and the values are your options.
- column_map => {...}
-
See Details for a discussion of column_map.
Note: If your column map does not contain the server_file_name key, delete(%hash) will do nothing because it won't be able to find any file names to delete.
This key (column_map) is optional.
- dbh => $dbh
-
This key may be specified globally or in the call to delete().
See Details for an explanation, including how this key interacts with dsn.
This key (dbh) is optional.
- dsn => [...]
-
This key may be specified globally or in the call to delete().
See Details for an explanation, including how this key interacts with dbh.
This key (dsn) is optional.
- id => $id
-
This is the (primary) key of the database table which will be processed.
To specify a column name other than id, use the column_map option.
This key (id) is mandatory.
- table_name => $string
-
This is the name of the database table.
This key (table_name) is mandatory.
There is no manager key because there is no point in you passing all these options to delete(%hash) just so this method can pass them all back to your manager.
The items deleted are:
- All files generated from the uploaded file
-
They can be identified because their parent_id column matches $id, and their file names come from the server_file_name column.
- The records in the table whose parent_id matches $id
- The uploaded file
-
It can be identified because its id column matches $id, and its file name comes from the server_file_name column.
- The record in the table whose id matches $id
delete(%hash) returns an arrayref of hashrefs.
Each hashref has 2 keys and 2 values:
- id => $id
-
$id is the value of the (primary) key column of a deleted file.
One of these $id values will be the $id you passed in to delete(%hash).
- file_name => $string
-
$string is the name of a deleted file.
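As a rough sketch of how the report returned by delete() might be consumed, the code below walks an arrayref of hashrefs with the 2 documented keys. The ids and file names are made-up sample data, and no database or call to delete() is involved.

```perl
use strict;
use warnings;

# Sample data in the shape delete() is documented to return:
# an arrayref of hashrefs, each with the keys id and file_name.
# The ids and file names here are invented for illustration.
my($report) =
[
    {id => 1, file_name => '/tmp/uploads/1.png'},
    {id => 7, file_name => '/tmp/uploads/7.png'},
    {id => 8, file_name => '/tmp/uploads/8.png'},
];

# Build one line per deleted file, e.g. for display to the user.
my(@line) = map{sprintf('Deleted id %s: %s', $$_{'id'}, $$_{'file_name'})} @$report;

print "$_\n" for @line;
```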
Method: generate(%hash)
You must pass a hash to generate().
The keys to this hash are:
- column_map => {...}
-
The default column_map is documented under Details.
This key (column_map) is optional.
- dbh => $dbh
-
Dbh is documented under Details.
At least one of dbh and dsn must be provided.
- dsn => [...]
-
Dsn is documented under Details.
At least one of dbh and dsn must be provided.
- file_scheme => $string
-
file_scheme is documented under Details.
File_scheme defaults to simple.
This key (file_scheme) is optional.
- manager => $obj
-
Manager is documented under Details.
This key (manager) is optional.
- path => $string
-
Path is documented under Details.
This key (path) is mandatory.
- records => {...}
-
Records specifies which (primary) keys in the table are used to find files to process.
These files are input files, and the options in the hashref specify how to use those files to generate output files.
The keys in the hashref are the keys in the table. E.g.:
records => {1 => [...], 99 => [...]}
specifies that only records with ids of 1 and 99 are to be processed.
The name of the (primary) key column defaults to id, but you can use column_map to change that.
The name of the input file comes from the server_file_name column of the table. Use column_map to change that column name.
The arrayrefs are used to specify N >= 1 output files for each input file.
So, each arrayref contains N >= 1 hashrefs, and each hashref specifies how to generate 1 output file. E.g.:
records => {1 => [{...}, {...}], 99 => [{...}]}
This says use id 1 to generate 2 output files, and use id 99 to generate 1 output file.
To make life easier, if you only wish to generate a single output file, you can reduce this:
records => {99 => [{...}]}
to this:
records => {99 => {...} }
The structure of the inner-most hashrefs is exactly the same as the hashrefs pointed to by the transform key, documented under "transform". E.g.:
For an imager object of type Image::Magick:
records => {1 => [{imager => $obj, options => {width => $w, height => $h} }, {...}], 99 => [{...}]}
or, for an imager object of type Imager:
records => {1 => [{imager => $obj, options => {xpixels => $x, ypixels => $y} }, {...}], 99 => [{...}]}
CGI::Uploader takes care of the meta-data for each generated file. See Meta-data.
This key (records) is mandatory.
- sequence_name => $string
-
Sequence_name is documented under Details.
This key is mandatory if you are using Postgres, and optional if not.
- table_name => $string
-
This key (table_name) is mandatory.
Note: generate() returns a hashref of arrayrefs, where the keys of the hashref are the ids provided in the records hashref, and the arrayrefs list the ids of the files generated.
You can use this data, e.g., to read the meta-data from the database and populate form fields to inform the user of the results of the generation process.
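The shortcut for the records key, where a single hashref stands in for an arrayref of 1 hashref, can be sketched as follows. This is an assumption about one way to handle the 2 forms, not the module's own code: normalize_records() is a hypothetical helper, and the option values are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap any bare hashref in an arrayref, so that
# every id maps to an arrayref of option hashrefs.
sub normalize_records
{
    my($records) = @_;
    my(%normal);

    for my $id (keys %$records)
    {
        my($spec)    = $$records{$id};
        $normal{$id} = ref($spec) eq 'HASH' ? [$spec] : $spec;
    }

    return \%normal;
}

# Id 1 uses the full form, and id 99 uses the shortcut.
my($input) =
{
    1  => [{options => {width => 100} }, {options => {width => 200} }],
    99 => {options => {width => 50} },
};
my($records) = normalize_records($input);

for my $id (sort{$a <=> $b} keys %$records)
{
    print "Id $id generates ", scalar(@{$$records{$id} }), " file(s)\n";
}
```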
Method: upload(%hash)
You must pass a hash to upload().
The keys of this hash are CGI form field names (where the fields are of type file).
CGI::Uploader cycles through these keys, using each one in turn to drive a single upload.
Note: upload() returns an arrayref of hashrefs, one hashref for each uploaded file stored.
The hashrefs returned are not the meta-data associated with each uploaded file, but more like status reports.
These status reports are explained here, and the meta-data is explained in the next section.
The structure of these status hashrefs is 2 keys and 2 values:
You can use this data, e.g., to read the meta-data from the database and populate form fields to inform the user of the results of the upload.
Meta-data
Meta-data associated with each uploaded file is accumulated while upload() works.
Meta-data is a hashref, with these keys:
- client_file_name
-
The client_file_name is the name supplied by the web client to CGI::Uploader.
It may or may not have path information prepended, depending on the web client.
- date_stamp
-
This value is the string 'now()', until the meta-data is saved in the database.
At that time, the value of the function now() is stored, except for SQLite, which just stores the string 'now()'.
Date_stamp has an underscore in it in case your database regards datestamp as a reserved word.
- extension
-
This is provided by the File::Basename module.
The extension is a string without the leading dot.
If an extension cannot be determined, the value will be '', the empty string.
- height
-
This is provided by the Image::Size module, if it recognizes the type of the file.
For non-image files, the value will be 0.
- id
-
The id is (presumably) the primary key of your table.
This value is 0 until the meta-data is saved in the database.
In the case of Postgres, it will be populated by the sequence named with the sequence_name key.
- mime_type
-
This is provided by the MIME::Types module, if it can determine the type.
If not, it is '', the empty string.
- parent_id
-
This is populated when a file is generated from the uploaded file. Its value will be the id of the uploaded file's record.
For the uploaded file itself, the value will be 0.
- server_file_name
-
The server_file_name is the name under which the file is finally stored on the file system of the web server. It is not the temporary file name used during the upload process.
- size
-
This is the size in bytes of the uploaded file.
- width
-
This is determined by the Image::Size module, if it recognizes the type of the file.
For non-image files, the value will be 0.
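The extension rule above (no leading dot, and the empty string when there is no extension) can be sketched with File::Basename's fileparse(). This is a minimal illustration under my own assumptions, not necessarily how CGI::Uploader calls File::Basename: extension_of() is a hypothetical helper, and the suffix pattern is one I chose.

```perl
use strict;
use warnings;

use File::Basename;

# Hypothetical helper: return the extension without its leading dot,
# or the empty string when the file name has no extension.
sub extension_of
{
    my($file_name) = @_;

    # The pattern matches the last dot and whatever follows it.
    my($name, $path, $suffix) = fileparse($file_name, qr/\.[^.]*/);

    $suffix =~ s/^\.//;

    return $suffix;
}

print "x.png => '", extension_of('x.png'), "'\n";
print "x     => '", extension_of('x'), "'\n";
```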
Processing Steps
A mini-synopsis:
$u -> upload
(
file_name_1 =>
[
{First set of storage options for this file},
{Second set of storage options for the same file},
{...},
],
);
- Upload file
-
upload() calls do_upload() to do the work of uploading the caller's file to a temporary file.
This is done once, whereas the following steps are done once for each hashref of storage options you specify in the arrayref pointed to by the 'current' CGI form field's name.
do_upload() returns a hashref of meta-data associated with the file.
- Transform the file
-
If requested, call do_transform().
- Save the meta-data
-
upload() calls the do_insert() method on the manager object to insert the meta-data into the database.
The default manager is CGI::Uploader itself.
do_insert() saves the last insert id from that insert in the meta-data hashref.
- Create the permanent file
-
upload() calls copy_temp_file() to save the file permanently.
copy_temp_file() saves the permanent file name in the meta-data hashref.
- Determine the height and width of images
-
upload() calls the get_size() method to get the image size, which delegates the work to Image::Size.
get_size() saves the image's dimensions in the meta-data hashref.
- Update the database with the permanent file's name and image size
-
upload() calls the do_update() method on the manager object to put the permanent file's name into the database record, along with the height and width.
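The steps above call do_insert() and do_update() on the manager object, so a custom manager only needs those 2 methods. Here is a minimal sketch of such a class, which keeps the meta-data in memory rather than in a database. The class name My::MemoryManager, its internals, and the sample values are all assumptions for illustration; only the method names and the parameter order come from this document.

```perl
use strict;
use warnings;

# Hypothetical manager class which keeps meta-data in memory instead of
# in a database. Only the method names and parameter order are taken
# from the documentation; everything else is an assumption.
package My::MemoryManager;

sub new
{
    my($class) = @_;

    return bless {next_id => 1, row => {} }, $class;
}

# Store a copy of the meta-data, and record the new id in the caller's hashref.
sub do_insert
{
    my($self, $field_name, $meta_data, $store_option) = @_;
    my($id) = $$self{'next_id'}++;

    $$meta_data{'id'}  = $id;
    $$self{'row'}{$id} = {%$meta_data};

    return $id;
}

# Refresh the stored copy once the permanent file name and size are known.
sub do_update
{
    my($self, $field_name, $meta_data, $store_option) = @_;

    $$self{'row'}{$$meta_data{'id'} } = {%$meta_data};

    return;
}

package main;

my($manager)   = My::MemoryManager -> new();
my($meta_data) = {client_file_name => 'x.png', id => 0, server_file_name => ''};

# Simulate the insert step, the file copy, and then the update step.
$manager -> do_insert('file_name_1', $meta_data, {});

$$meta_data{'server_file_name'} = '/tmp/uploads/' . $$meta_data{'id'} . '.png';

$manager -> do_update('file_name_1', $meta_data, {});

print 'Stored as ', $$manager{'row'}{1}{'server_file_name'}, "\n";
```

An object like this would be passed via the manager key, in which case your own code decides what, if anything, is written to a database.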
Details
Each key in the hash passed in to upload() points to an arrayref of options which specifies how to process the form field.
Use multiple elements in the arrayref to store multiple sets of meta-data, all based on the same uploaded file.
Each hashref contains 1 .. 5 of the following keys:
- column_map => {...}
-
This hashref maps column names used by CGI::Uploader to column names used by your database table.
The default column_map is:
{
    client_file_name => 'client_file_name',
    date_stamp => 'date_stamp',
    extension => 'extension',
    height => 'height',
    id => 'id',
    mime_type => 'mime_type',
    parent_id => 'parent_id',
    server_file_name => 'server_file_name',
    size => 'size',
    width => 'width',
}
If you supply a different column map, the values on the right-hand side are the ones you change.
Points to note:
This key (column_map) is optional.
- dbh => $dbh
-
This is a database handle for use by the default manager class (which is just CGI::Uploader) discussed below, under manager.
This key is optional if you use the manager key, since in that case you can do anything in your own storage manager code.
If you do provide the dbh key, it is passed in to your manager just in case you need it.
Also, if you provide dbh, the dsn key, below, is ignored.
If you do not provide the dbh key, the default manager uses the dsn arrayref to create a dbh via DBI.
- dsn => [...]
-
This key is optional if you use the manager key, since in that case you can do anything in your own storage manager code.
If you do provide the dsn key, it is passed in to your manager just in case you need it.
Using the default manager, this key is ignored if you provide a dbh key, but it is mandatory when you do not provide a dbh key.
The elements in the arrayref are:
- A connection string
-
E.g.: 'dbi:Pg:dbname=test'
This element is mandatory.
- A username string
-
This element is mandatory, even if it's just the empty string.
- A password string
-
This element is mandatory, even if it's just the empty string.
- A connection attributes hashref
-
This element is optional.
The default manager class calls DBI -> connect(@$dsn) to connect to the database, i.e. in order to generate a dbh, when you don't provide a dbh key.
- file_scheme => $string
-
File_scheme controls how files are stored on the web server's file system.
All files are stored in the directory specified by the path option.
Each file name has the appropriate extension appended (as determined by MIME::Types).
The possible values of file_scheme are:
- md5
-
The file name is determined like this:
- simple
-
The file name is the (primary key) id.
Simple is the default.
This key (file_scheme) is optional.
- manager => $object
-
This is an instance of your class which will manage the transfer of meta-data to a database table.
In the case you provide the manager key, your object is responsible for saving (or discarding!) the meta-data.
If you provide an object here, CGI::Uploader will call $object -> do_insert($field_name, $meta_data, $store_option).
Parameters are:
- $field_name
-
$field_name will be the 'current' CGI form field.
Remember, upload() is iterating over all your CGI form field parameters at this point.
- $meta_data
-
$meta_data will be a hashref of options generated by the uploading process.
See Meta-data, for the definition of meta-data.
- $store_option
-
$store_option will be the 'current' hashref of storage options, one of the arrayref elements associated with the 'current' form field.
If you do not provide the manager key, CGI::Uploader will do the work itself.
Later, CGI::Uploader will call $object -> do_update($field_name, $meta_data, $store_option), as explained above, under Processing Steps.
This key (manager) is optional.
- path => $string
-
This is a path on the web server's file system where a permanent copy of the uploaded file will be saved.
This key (path) is mandatory.
- sequence_name => $string
-
This is the name of the sequence used to generate values for the primary key of the table.
You would normally only need this when using Postgres.
This key is optional if you use the manager key, since in that case you can do anything in your own storage manager code. If you do provide the sequence_name key, it is passed in to your manager just in case you need it.
This key is mandatory if you use Postgres and do not use the manager key, since without the manager key, sequence_name must be passed in to the default manager (CGI::Uploader).
- table_name => $string
-
This is the name of the table into which to store the meta-data.
This key is optional if you use the manager key, since in that case you can do anything in your own storage manager code. If you do provide the table_name key, it is passed in to your manager just in case you need it.
This key is mandatory if you do not use the manager key, since without the manager key, table_name must be passed in to the default manager (CGI::Uploader).
- transform => {...}
-
This key points to a set of options which are used to transform the uploaded file.
As stated above, transformation takes 1 input file, uploads it, transforms it, saves the transformed file, and discards the uploaded file.
See also generate(), for a completely different way of processing files.
Here are the 2 examples I used in testing, but not at the same time!
transform =>
{
    imager => Image::Magick -> new(), # Optional. Default.
    options => {height => 400, width => 500},
}
transform =>
{
    imager => Imager -> new(),
    options => {xpixels => 400, ypixels => 500},
}
Clearly, transform points to a hashref:
- imager => $obj
-
The imager key is optional. If omitted, CGI::Uploader creates an object of type Image::Magick, and uses that.
You can pass in an object whose class is a descendant of Image::Magick or Imager.
They are treated differently, as explained next.
- height => 'Int', width => 'Int'
-
If the $obj isa('Image::Magick') you must pass in at least 1 of height and width.
The missing one is calculated from the size of the input image and the given parameter.
Here's what happens:
if ($$option{'imager'} -> isa('Image::Magick') )
{
    my($result) = $$option{'imager'} -> Read($old_file_name);
    my($dimensions) = $self -> calculate_dimensions($$option{'imager'}, $option);
    $result = $$option{'imager'} -> Resize($dimensions);
    $result = $$option{'imager'} -> Write($temp_file_name);
}
Note: calculate_dimensions() calls Get('width', 'height').
This means if you wish to intercept these calls with a custom object, your Image::Magick-based object must respond to these calls:
- options => {xpixels => 400, ypixels => 500}
-
If the $obj isa('Imager') you must pass in suitable parameters for Imager's scale() method.
Any such parameters are acceptable. I just used xpixels and ypixels during testing.
Here's what happens:
if ($$option{'imager'} -> isa('Imager') )
{
    my($result) = $$option{'imager'} -> read(file => $old_file_name, type => $$meta_data{'extension'});
    my($new_image) = $$option{'imager'} -> scale(%{$$option{'options'} });
    my($extension) = $$meta_data{'extension'};
    $extension = $extension ? ".$extension" : '';
    $temp_file_name = "$temp_file_name$extension";
    $result = $new_image -> write(file => $temp_file_name, type => $$meta_data{'extension'});
}
So, to intercept these calls, a descendant of Imager must respond to these calls:
This key (transform) is optional.
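For the Image::Magick case, the text above says that when only 1 of height and width is given, the missing one is calculated from the size of the input image. A sketch of that arithmetic, assuming the aim is to preserve the aspect ratio, is below. This is not the module's calculate_dimensions(): scale_dimensions() is a hypothetical helper, and rounding to the nearest pixel is my assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: given the original size and at least 1 of
# height and width, scale the missing one to keep the aspect ratio.
# Results are rounded to the nearest pixel.
sub scale_dimensions
{
    my($old_width, $old_height, %option) = @_;
    my($width, $height) = @option{qw/width height/};

    die "Specify at least 1 of height and width\n" if (! defined($width) && ! defined($height) );

    $height = int($old_height * $width / $old_width + 0.5) if (! defined($height) );
    $width  = int($old_width * $height / $old_height + 0.5) if (! defined($width) );

    return ($width, $height);
}

# A 1000 x 800 image, with only the new width specified.
my($w, $h) = scale_dimensions(1000, 800, width => 500);

print "Scaled to $w x $h\n";
```

If both values are supplied they are returned unchanged, so the caller keeps full control over the output size.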
Sample Code
Most of the features in CGI::Uploader are demonstrated in samples shipped with the distro:
- Config data
-
Patch lib/CGI/Uploader/.ht.cgi.uploader.conf as desired.
This is used by CGI::Uploader::Config and hence by CGI::Uploader::Test.
- CGI forms
-
Copy the directory htdocs/uploads/ to the doc root of your web server.
- CGI scripts
-
Copy the files in cgi-bin/ to your cgi-bin directory.
As explained above, don't expect use.cgi.simple.pl to work.
Also, use.cgi.uploader.v2.pl will not run if you have installed V 3 over the top of V 2.
- Run the CGI scripts
-
Point your web client at:
You can enter 1 or 2 file names in each CGI form.
The code executed is actually in CGI::Uploader::Test.
See the method use_cgi_uploader_v3() in that module for one way of utilizing the data returned by upload().
- Command line scripts
-
The scripts/ directory contains various sample programs.
In particular, see scripts/test.generate.pl.
Note: to run this program you will have already uploaded one or more files, and Apache will have created a directory structure according to your path option, and will own that path.
So, you may need to use sudo to run scripts/test.generate.pl, since it will write temporary files to the same path.
Modules Used and Required
Both Build.PL and Makefile.PL list the modules used by CGI::Uploader.
Further to those, user options can trigger the use of these modules:
- Config::IniFiles
-
If you use CGI::Uploader::Test, it uses CGI::Uploader::Config, which uses Config::IniFiles.
- DBD::Pg
-
I used Postgres when writing and testing V 3, and hence I used DBD::Pg.
Examine lib/CGI/Uploader/.ht.cgi.uploader.conf for details. This file is read in by CGI::Uploader::Config.
- DBD::SQLite
-
A quick test with SQLite worked, too.
The test only requires changing .ht.cgi.uploader.conf and re-running scripts/create.table.pl. E.g.:
dsn=dbi:SQLite:dbname=/tmp/test
password=
table_name=uploads
username=
Also, after running scripts/create.table.pl, use 'chmod a+w /tmp/test' so that the Apache daemon can write to the database.
One last thing. SQLite does not interpret the function now(); it just puts that string in the date_stamp column. Oh, well.
- DBI
-
If you do not specify a manager object, CGI::Uploader uses DBI.
- DBIx::Admin::CreateTable
-
If you use CGI::Uploader::Test to create the table, via scripts/create.table.pl, you'll need DBIx::Admin::CreateTable.
- Digest::MD5
-
If you set the file_scheme option to md5, you'll need Digest::MD5.
- HTML::Template
-
If you want to run any of the test scripts in cgi-bin/, you'll need HTML::Template.
- Image::Magick
-
If you specify the transform option without the imager option, CGI::Uploader uses Image::Magick.
FAQ
- Specifying the file name on the server
-
This feature is not provided, for various reasons.
One problem is sabotage.
Another problem is users specifying characters which are illegal in file names on the server.
In other words, this feature was considered and rejected.
- API changes from V 2 to V 3
-
API changes between V 2 and V 3 are obviously enormous. A direct comparison doesn't make much sense.
However, here are some things to watch out for:
- Various columns have different (default) names
- Default file extension
-
Under V 2, a file called 'x' would be saved by force with a name of 'x.bin'.
V 3 does not change file names, so 'x' will be stored in the database as 'x'.
- The dot in the file extension
-
Under V 2, a file called 'x.png' would have '.png' stored in the extension column of the database.
V 3 only stores 'png'.
- The id of the last record inserted
-
Under V 2, various mechanisms were used to retrieve this value.
V 3 calls $dbh -> last_insert_id(), unless of course you've circumvented this by supplying your own manager object.
- The file name on the server
-
Under V 2, the permanent file name was not stored as part of the meta-data.
V 3 stores this information.
- Datestamps
-
Under V 2, the datestamp of when the file was uploaded was not saved.
V 3 stores this information.
- How come there is no update option like there was in V 2?
-
Errr, it's been renamed to delete() and upload().
Changes
See Changes and Changelog.ini. The latter is machine-readable, using Module::Metadata::Changes.
Public Repository
V 3 is available from github: git:github.com/ronsavage/cgi--uploader.git
Authors
V 2 was written by Mark Stosberg <mark@summersault.com>.
V 3 was written by Ron Savage <ron@savage.net.au>.
Ron's home page: http://savage.net.au/index.html
Licence
Artistic.