NAME
SPOPS::Manual::ImportExport - Moving data (and more) with SPOPS
SYNOPSIS
This part of the SPOPS manual describes how to get data in and out of SPOPS datasources, and also how to create portable database structures.
DESCRIPTION
Once you can retrofit object-oriented access and behaviors on top of existing datasources, the world is your oyster, right?
Except...
Getting data in and out of these datasources can be an issue. One of the benefits of SPOPS is that databases become more transparent -- you don't really care where the data are stored, just that you can access them.
Of course, one of the downsides of this transparency is that you might actually take advantage of it! Being able to move from MySQL to PostgreSQL may become much more important, particularly if you're consolidating disparate datasources into one or two.
To address this, SPOPS (as of 0.55) comes with importing and exporting capabilities. And, of course, it's fairly simple for you to extend these capabilities and create your own.
This manual section describes the basics of getting data in and out (with examples) and along the way points out ways you can extend the framework.
IMPORTING
There are two types of importing we'll talk about: data and structures. Currently the only structural support is for DBI database tables, but the framework exists for other datastores.
Importing Data
There is no better way to introduce a topic than to show working code, so here's a simple example of a script to import data:
  #!/usr/bin/perl

  use strict;
  use SPOPS::Import;
  use SPOPS::Initialize;

  {
      SPOPS::Initialize->process({ filename => 'spops.conf' });
      my $import = SPOPS::Import->new( 'object' )
                                ->data_from_fh( \*DATA );
      my $status = $import->run;
      foreach my $item ( @{ $status } ) {
          print "Status for $item->[1][0]: ";
          if ( $item->[0] ) { print "OK\n" }
          else              { print "FAILED ($item->[2])\n" }
      }
  }

  __DATA__
  $item = [
    { spops_class => 'My::Doodad',
      field_order => [ qw/ name description unit_cost factory created_by / ] },
    [ q{Amazing Melonhead}, q{Spits seeds}, q{100.75}, q{Saskatoon, Saskatchewan, Canada}, q{2} ],
    [ q{Dipsy Doodler}, q{Who knows?}, q{5.00}, q{Chicago, Illinois, USA}, q{2} ],
    [ q{Greg Kihn Band}, q{I lost on Jeopardy}, q{11.99}, q{Topeka, Kansas, USA}, q{2} ]
  ];
As seen here, there are three main steps to importing:

1. Create an importer object. Instantiating the import object must include at least the type of import you will be performing. In the example, we set the type of import we're doing with the parameter 'object' passed into the new() method.

2. Set properties for the importer object. The importer object needs to know some basic information about what you're importing. At a minimum, it needs the SPOPS object class and the data you're importing. Different types of imports may need additional information as well. In the example, we read the properties for the object from a filehandle using the data_from_fh() method.

3. Run the import. Every import subclass includes the run() method, which actually performs the import. Until you execute run(), no data are written to the datasource.

No matter what the import format, you will always need to execute these three steps. The first two can be combined, either explicitly as we did in the example or by passing the properties to the new() method; a sketch of the latter follows.
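Here is a hedged sketch of that second style. The assumption that new() accepts a hashref of importer properties as its second argument, and the property names used ('object_class', 'field_order', 'data'), follow the general shape of SPOPS::Import::Object rather than being quoted from it, so check that module for the exact names; the class and values are the same placeholders used above:

  #!/usr/bin/perl

  # A minimal sketch: hand the importer its properties at construction time
  # instead of reading them from a filehandle. The second-argument hashref and
  # the property names below are assumptions -- see SPOPS::Import::Object.

  use strict;
  use SPOPS::Import;
  use SPOPS::Initialize;

  SPOPS::Initialize->process({ filename => 'spops.conf' });

  my $import = SPOPS::Import->new( 'object', {
      object_class => 'My::Doodad',
      field_order  => [ qw/ name description unit_cost factory created_by / ],
      data         => [
          [ q{Amazing Melonhead}, q{Spits seeds}, q{100.75},
            q{Saskatoon, Saskatchewan, Canada}, q{2} ],
      ],
  });
  my $status = $import->run;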
Currently, SPOPS offers two data import formats: an object import, called using the 'object' key, and a raw data import for DBI tables, called using the 'dbdata' key. See SPOPS::Import::Object and SPOPS::Import::DBI::Data for more information specific to each import format.
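To give a rough idea of the raw data import, here is a sketch. The property names used (db, table, field_order, data) and the connection details are assumptions about SPOPS::Import::DBI::Data, not quotes from it, so treat them as placeholders and consult that module's documentation:

  #!/usr/bin/perl

  # A hedged sketch of a raw DBI data import: no SPOPS object class involved,
  # just rows pushed into a table through an existing database handle. The
  # property names below are assumptions -- see SPOPS::Import::DBI::Data.

  use strict;
  use DBI;
  use SPOPS::Import;

  my $dbh = DBI->connect( 'DBI:mysql:test', 'user', 'password' )
      || die "Cannot connect: $DBI::errstr";

  my $importer = SPOPS::Import->new( 'dbdata' );
  $importer->db( $dbh );
  $importer->table( 'doodad' );
  $importer->field_order( [ qw/ name description unit_cost factory created_by / ] );
  $importer->data( [
      [ q{Dipsy Doodler}, q{Who knows?}, q{5.00}, q{Chicago, Illinois, USA}, q{2} ],
  ] );
  $importer->run;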
Importing DBI Tables
Another goal of SPOPS importing and exporting is to make the structures used by the datasources portable as well. SPOPS has support for simple, generic DBI table importing. It works by embedding keys in the CREATE TABLE SQL statement and replacing them with database-dependent structures on import.
One of the most common uses of this is to create a table that supports an 'auto-incrementing' field on different databases. Since databases use very different schemes for generating this value, it's abstracted into a key that's replaced on import.
For instance, take this basic table:
  CREATE TABLE user (
    user_id      %%INCREMENT%%,
    login_name   varchar(50) not null,
    email        varchar(75) not null,
    primary key( user_id )
  )
When we run the import, the %%INCREMENT%% key gets translated to a database-specific expression. In MySQL it would become 'INT NOT NULL AUTO_INCREMENT' and in PostgreSQL it would become 'SERIAL'.
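Running such an import is just the three steps from above with a 'table' importer. Here is a minimal sketch, using print_only so the translated SQL is printed rather than sent to a database (the filename is hypothetical):

  use strict;
  use SPOPS::Import;

  # Translate the %%INCREMENT%% key for MySQL and print the resulting
  # CREATE TABLE statement instead of executing it.
  my $table_import = SPOPS::Import->new( 'table' );
  $table_import->database_type( 'mysql' );
  $table_import->print_only( 1 );
  $table_import->read_table_from_file( 'user_table.sql' );
  $table_import->run;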
You can also add your own translation behaviors. For instance, you can keep a central store of datatypes for your application in a simple text file:
  address   varchar(100)
  email     varchar(75)
  phone     varchar(25)
  city      varchar(40)
  postal    varchar(12)
And then you'd create a custom behavior and add it to the import routine as in this example:
  #!/usr/bin/perl

  use strict;
  use SPOPS::Import;

  my %DATATYPES = initialize_datatypes();

  {
      my $table_import = SPOPS::Import->new( 'table' );
      $table_import->database_type( 'mysql' );
      $table_import->print_only( 1 );
      $table_import->transforms([ \&my_transform ]);
      $table_import->read_table_from_file( 'ie_import_table_custom_before' );
      $table_import->run;
  }

  # Replace each %%KEY%% in the SQL with the matching datatype definition
  sub my_transform {
      my ( $self, $sql, $importer ) = @_;
      foreach my $datatype ( keys %DATATYPES ) {
          $$sql =~ s/%%$datatype%%/$DATATYPES{ $datatype }/g;
      }
  }

  # Read the 'datatype  sql-definition' pairs from the central datatype file
  sub initialize_datatypes {
      my %h = ();
      open( my $dt_fh, '<', 'ie_import_table_datatypes' )
          || die "Cannot import datatypes: $!";
      while ( <$dt_fh> ) {
          chomp;
          s/^\s+//;
          s/\s+$//;
          my ( $datatype, $sql ) = split /\s+/, $_, 2;
          $h{ uc $datatype } = $sql;
      }
      close( $dt_fh );
      return %h;
  }
So that a table definition like:
  CREATE TABLE address (
    address_id     %%INCREMENT%%,
    company        varchar(50) null,
    address1       %%ADDRESS%% null,
    address2       %%ADDRESS%% null,
    city           %%CITY%% null,
    state          char(2) null,
    postal         %%POSTAL%% null,
    email          %%EMAIL%% null,
    main_phone     %%PHONE%% not null,
    main_fax       %%PHONE%% null,
    mobile_phone   %%PHONE%% null,
    primary key ( address_id )
  )
would become:
  CREATE TABLE address (
    address_id     INT NOT NULL AUTO_INCREMENT,
    company        varchar(50) null,
    address1       varchar(100) null,
    address2       varchar(100) null,
    city           varchar(40) null,
    state          char(2) null,
    postal         varchar(12) null,
    email          varchar(75) null,
    main_phone     varchar(25) not null,
    main_fax       varchar(25) null,
    mobile_phone   varchar(25) null,
    primary key ( address_id )
  )
It's not as powerful as custom datatypes -- with inheritance and dynamic schema updating capability -- but it's still very useful for enforcing data standards within and across applications. (Besides, how often do you modify the schema once something is created?)
See SPOPS::Import::DBI::Table for the currently supported keys.
EXPORTING
Exporting data works a little differently from importing it. With importing, you do everything required in the run() method. With exporting, the parent class defines the run() method but triggers callbacks in the process. Each callback returns content, which gets concatenated together to form the set of exported objects.
These callbacks are:

create_header
    Only called once, before any records have been processed.

create_record
    Called for each record.

create_footer
    Only called once, after all the records have been processed.
The export implementation is free to use these callbacks however it sees fit. For instance, SPOPS::Export::Perl just keeps a copy of every object it sees and then dumps them all with a single call to Data::Dumper in the create_footer callback. (Due to memory use, you'd probably want to modify this approach if you were exporting hundreds of thousands of records. But it's just an example.)
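For completeness, here is a hedged sketch of how you might kick off an export; the type key 'perl' and the 'object_class' property are assumptions about SPOPS::Export's constructor, so check that module for the actual names:

  use strict;
  use SPOPS::Export;

  # Assumed constructor style: a type key plus a hashref of properties.
  my $exporter = SPOPS::Export->new( 'perl', { object_class => 'My::Doodad' } );
  print $exporter->run;

And, to illustrate the callback flow, a sketch of what a simple comma-separated exporter might look like. The callback arguments (what create_header() and create_record() are passed) and the base-class inheritance are assumptions, not quotes from SPOPS::Export:

  package My::Export::CSV;

  # A hedged sketch of an export subclass built around the three callbacks.
  use strict;
  use base qw( SPOPS::Export );

  # Called once, before any records: emit a line of column names.
  sub create_header {
      my ( $self, $fields ) = @_;                 # assumed arguments
      return join( ',', @{ $fields } ) . "\n";
  }

  # Called for every object: emit one comma-separated row.
  sub create_record {
      my ( $self, $object, $fields ) = @_;        # assumed arguments
      return join( ',', map { $object->{ $_ } } @{ $fields } ) . "\n";
  }

  # Called once, after all records: nothing extra needed for CSV.
  sub create_footer { return '' }

  1;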
COPYRIGHT
Copyright (c) 2001-2002 Chris Winters. All rights reserved.
See SPOPS::Manual for license.
AUTHORS
Chris Winters <chris@cwinters.com>