NAME
FlatFile::DataStore - Perl module that implements a flatfile datastore.
SYNOPSIS
use FlatFile::DataStore;
# new datastore object
my $dir = "/my/datastore/directory";
my $name = "dsname";
my $ds = FlatFile::DataStore->new( { dir => $dir, name => $name } );
# create a record
my $record_data = "This is a test record.";
my $user_data = "Test1";
my $record = $ds->create( {
    data => \$record_data,
    user => $user_data,
    } );
my $record_number = $record->keynum;
# retrieve it
$record = $ds->retrieve( $record_number );
# update it
$record->data( "Updating the test record." );
$record = $ds->update( $record );
# delete it
$record = $ds->delete( $record );
# get its history
my @records = $ds->history( $record_number );
DESCRIPTION
FlatFile::DataStore implements a simple flatfile datastore. When you create (store) a new record, it is appended to the flatfile. When you update an existing record, the existing entry in the flatfile is flagged as updated, and the updated record is appended to the flatfile. When you delete a record, the existing entry is flagged as deleted, and a "delete record" is appended to the flatfile.
The result is that all versions of a record are retained in the datastore, and running a history will return all of them. Another result is that each record in the datastore represents a transaction: create, update, or delete.
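The flagging behavior described above can be sketched with a small in-memory model. This is only an illustration, not the module's code: the names create_rec, update_rec, delete_rec, and history_inds are hypothetical, key numbers starting at 0 is an assumption of the sketch, and the indicator characters are the ones used in the examples under URI Configuration.

```perl
use strict;
use warnings;

# Hypothetical in-memory model; the module itself writes to flat files.
my @log;        # every entry ever appended, oldest first
my %current;    # keynum => index in @log of that record's current entry

sub create_rec {
    my ($data) = @_;
    my $keynum = scalar keys %current;    # next sequence number in this sketch
    push @log, { keynum => $keynum, ind => '+', data => $data };
    $current{$keynum} = $#log;
    return $keynum;
}

sub update_rec {
    my ( $keynum, $data ) = @_;
    $log[ $current{$keynum} ]{ind} = '#';    # flag the existing entry as updated
    push @log, { keynum => $keynum, ind => '=', data => $data };
    $current{$keynum} = $#log;
    return;
}

sub delete_rec {
    my ($keynum) = @_;
    my $old = $log[ $current{$keynum} ];
    $old->{ind} = '*';                       # flag the existing entry as deleted
    push @log, { keynum => $keynum, ind => '-', data => $old->{data} };
    $current{$keynum} = $#log;
    return;
}

# Indicators of all versions of a record, newest first.
sub history_inds {
    my ($keynum) = @_;
    return join '', map { $_->{ind} } reverse grep { $_->{keynum} == $keynum } @log;
}

my $k = create_rec('This is a test record.');
update_rec( $k, 'Updating the test record.' );
delete_rec($k);
print history_inds($k), "\n";
```

Nothing is ever removed from @log in this model, which is why every entry doubles as a transaction record.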
Methods support the following actions:
- create
- retrieve
- update
- delete
- history
Additionally, FlatFile::DataStore::Utils provides the methods
- validate
- migrate
and others.
See FlatFile::DataStore::Tiehash for a tied interface.
VERSION
FlatFile::DataStore version 1.03
CLASS METHODS
FlatFile::DataStore->new();
Constructs a new FlatFile::DataStore object.
Accepts a hash ref giving values for dir and name.
my $ds = FlatFile::DataStore->new(
    { dir  => $dir,
      name => $name,
    } );
To initialize a new datastore, edit the "$dir/$name.uri" file and enter a configuration URI (as the only line in the file), or pass the URI as the value of the uri parameter, e.g.,
my $ds = FlatFile::DataStore->new(
    { dir  => $dir,
      name => $name,
      uri  => join( ";" =>
          "http://example.com?name=$name",
          "desc=My%20Data%20Store",
          "defaults=medium",
          "user=8-%20-%7E",
          "recsep=%0A",
          ),
    } );
(See URI Configuration below.)
Also accepts a userdata parameter, which sets the default user data for this instance, e.g.,
my $ds = FlatFile::DataStore->new(
    { dir      => $dir,
      name     => $name,
      userdata => ':',
    } );
Returns a reference to the FlatFile::DataStore object.
OBJECT METHODS, Record Processing (CRUD)
create( $record )
or create( { data => \$record_data, user => $user_data } )
or create( { record => $record[, data => \$record_data][, user => $user_data] } )
Creates a record. If the parameter is a record object, the record data and user data will be gotten from it. Otherwise, if the parameter is a hash reference, the expected keys are:
- record => FlatFile::DataStore::Record object
- data => string or scalar reference
- user => string
If no record is passed, both 'data' and 'user' are required. Otherwise, if a record is passed, the record data and user data will be gotten from it unless one or both are explicitly provided.
Returns a FlatFile::DataStore::Record object.
Note: the record data (but not the user data) is stored in the FF::DS::Record object as a scalar reference. This is done for efficiency in the cases where the record data may be very large. Likewise, the data parm passed to create() may be a scalar reference.
retrieve( $num[, $pos] )
Retrieves a record. The parm $num may be one of
- a key number, i.e., record sequence number
- a file number
The parm $pos is required if $num is a file number.
Here's why: When $num is a record key sequence number (key number), a preamble is retrieved from the datastore key file. In that preamble is the file number and seek position where the record data may be gotten. Otherwise, when $num is a file number, the application (you) must supply the seek position into that file. Working from an array of record history is the most likely time you would do this.
Returns a FlatFile::DataStore::Record object.
retrieve_preamble( $keynum )
Retrieves a preamble. The parm $keynum is a key number, i.e., record sequence number.
Returns a FlatFile::DataStore::Preamble object.
This method allows getting information about the record, e.g., if it's deleted, what's in the user data, etc., without the overhead of retrieving the full record data.
locate_record_data( $num[, $pos] )
Rather than retrieving a record, this subroutine positions you at the record data in the data file. This might be handy if, for example, the record data is text, and you just want part of it. You can scan the data and get what you want without having to read the entire record. Or the data might be XML and you could parse it using SAX without reading it all into memory.
The parm $num may be one of
- a key number, i.e., record sequence number
- a file number
The parm $pos is required if $num is a file number. See retrieve() above for why.
Returns a list containing the file handle (which is already locked for reading in binmode), the seek position, and the record length.
You will be positioned at the seek position, so you could begin reading data, e.g., via <$fh>:
my( $fh, $pos, $len ) = $ds->locate_record_data( $keynum );
my $got = 0;
while( <$fh> ) {
    last if ($got += length) > $len;  # in case we read the recsep
    # [do something with $_ ...]
    last if $got == $len;
}
close $fh;
The above loop assumes you know each line of the data ends in a newline. Also keep in mind that the file is opened in binmode, so you will be reading bytes (octets), not necessarily characters. Decoding these octets is up to you.
Open question: because the file is opened in binmode, the example above may need adjusting on non-Unix operating systems.
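When the record data is not line-oriented, reading a fixed number of bytes avoids depending on newlines at all. The following is a minimal sketch under stated assumptions: an in-memory string stands in for a data file, the layout shown (preamble, record data, recsep) is assumed for illustration, and $pos and $len are made-up values of the kind locate_record_data() returns.

```perl
use strict;
use warnings;

# Stand-in for a data file (layout assumed for illustration only).
my $file = "PREAMBLE1" . "line one\nline two\n" . "\n" . "PREAMBLE2" . "other\n";
open my $fh, '<', \$file or die "open: $!";
binmode $fh;

# Hypothetical values of the kind locate_record_data() returns.
my $pos = length "PREAMBLE1";
my $len = length "line one\nline two\n";
seek $fh, $pos, 0 or die "seek: $!";

# Read the record in small chunks, never going past $len bytes.
my $chunk_size = 8;
my $remaining  = $len;
my $record     = '';
while ( $remaining > 0 ) {
    my $want = $remaining < $chunk_size ? $remaining : $chunk_size;
    my $got  = read $fh, my $buf, $want;
    die "read: $!" unless defined $got;
    last if $got == 0;    # unexpected end of file
    $record    .= $buf;   # [or process $buf and discard it]
    $remaining -= $got;
}
close $fh;
```

Because the loop counts bytes rather than lines, it never reads into the recsep or the next preamble.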
update( $record )
or update( { string => $preamble_string, data => \$record_data, user => $user_data } )
or update( { preamble => $preamble_obj, data => \$record_data, user => $user_data } )
or update( { record => $record_obj
[, preamble => $preamble_obj]
[, string => $preamble_string]
[, data => \$record_data]
[, user => $user_data] } )
Updates a record. If the parameter is a record object, the preamble, record data, and user data will be gotten from it. Otherwise, if the parameter is a hash reference, the expected keys are:
- record => FlatFile::DataStore::Record object
- preamble => FlatFile::DataStore::Preamble object
- string => a preamble string (the string attribute of a preamble object)
- data => string or scalar reference
- user => string
If no record is passed, 'preamble' (or 'string'), 'data', and 'user' are required. Otherwise, if a record is passed, the preamble, record data and user data will be gotten from it unless any of them are explicitly provided.
Returns a FlatFile::DataStore::Record object.
delete( $record )
or delete( { string => $preamble_string, data => \$record_data, user => $user_data } )
or delete( { preamble => $preamble_obj, data => \$record_data, user => $user_data } )
or delete( { record => $record_obj
[, preamble => $preamble_obj]
[, string => $preamble_string]
[, data => \$record_data]
[, user => $user_data] } )
Deletes a record. The parameters are the same as for update().
Returns a FlatFile::DataStore::Record object.
exists()
Tests if a datastore exists. Currently, a datastore "exists" if there is a .uri file -- whether the file is valid or not.
May be called on a datastore object, e.g.,
$ds->exists()
Or may be called as a class method, e.g.,
FlatFile::DataStore->exists({
    name => 'example',
    dir  => '/dbs/example',
    })
If called as a class method, you must pass a hashref that provides values for 'name' and 'dir'.
history( $keynum )
Retrieves a record's history. The parm $keynum is always a key number, i.e., a record sequence number.
Returns an array of FlatFile::DataStore::Record objects.
The first element of this array is the current record. The last element is the original record. That is, the array is in reverse chronological order.
OBJECT METHODS, Accessors
In the specifications below, square braces ([]) denote optional parameters, not anonymous arrays, e.g., [$omap] indicates that $omap is optional, instead of implying that you need to pass it inside an array.
$ds->specs( [$omap] )
Sets and returns the specs attribute value if $omap is given, otherwise just returns the value.
An 'omap' is an ordered hash as defined in http://yaml.org/type/omap.html and implemented here using Data::Omap. That is, it's an array of single-key hashes. This ordered hash contains the specifications for constructing and parsing a record preamble as defined in the name.uri file.
In list context, the value returned is a list of hashrefs. In scalar context, the value returned is an arrayref containing the list of hashrefs.
$ds->dir( [$dir] )
Sets and returns the dir attribute value if $dir is given, otherwise just returns the value.
If $dir is given and is a null string, the dir object attribute is removed from the object. If $dir is not null, the directory must already exist. In other words, this module will not create the directory where the database is to be stored.
Preamble accessors (from the uri)
The following methods set and return their respective attribute values if $value is given. Otherwise, they just return the value.
$ds->indicator( [$value] ); # length-characters
$ds->transind( [$value] ); # length-characters
$ds->date( [$value] ); # length-format
$ds->transnum( [$value] ); # length-base
$ds->keynum( [$value] ); # length-base
$ds->reclen( [$value] ); # length-base
$ds->thisfnum( [$value] ); # length-base
$ds->thisseek( [$value] ); # length-base
$ds->prevfnum( [$value] ); # length-base
$ds->prevseek( [$value] ); # length-base
$ds->nextfnum( [$value] ); # length-base
$ds->nextseek( [$value] ); # length-base
$ds->user( [$value] ); # length-characters
Other accessors
$ds->name( [$value] ); # from uri, name of datastore
$ds->desc( [$value] ); # from uri, description of datastore
$ds->recsep( [$value] ); # from uri, character(s)
$ds->uri( [$value] ); # full uri as is
$ds->preamblelen( [$value] ); # length of preamble string
$ds->toclen( [$value] ); # length of toc entry
$ds->keylen( [$value] ); # length of stored keynum
$ds->keybase( [$value] ); # base of stored keynum
$ds->translen( [$value] ); # length of stored transaction number
$ds->transbase( [$value] ); # base of stored transaction number
$ds->fnumlen( [$value] ); # length of stored file number
$ds->fnumbase( [$value] ); # base of stored file number
$ds->userlen( [$value] ); # length from uri
$ds->dateformat( [$value] ); # format from uri
$ds->regx( [$value] ); # capturing regx for preamble string
$ds->datamax( [$value] ); # maximum bytes in a data file
$ds->crud( [$value] ); # hash ref, e.g.,
{
    create => '+',
    oldupd => '#',
    update => '=',
    olddel => '*',
    delete => '-',
    '+' => 'create',
    '#' => 'oldupd',
    '=' => 'update',
    '*' => 'olddel',
    '-' => 'delete',
}
(logical actions <=> symbolic indicators)
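A two-way map like the one above can be built from the five-character indicator string. This is a minimal sketch, assuming the action order create, oldupd, update, olddel, delete given under the indicator parameter; it is an illustration, not necessarily how the module builds the hash.

```perl
use strict;
use warnings;

my @actions = qw( create oldupd update olddel delete );
my @chars   = split //, '+#=*-';    # from indicator=1-+#=*-

my %crud;
@crud{@actions} = @chars;      # logical action => symbolic indicator
@crud{@chars}   = @actions;    # symbolic indicator => logical action

print "$crud{update} means $crud{ $crud{update} }\n";
```

A single hash works in both directions here because the action names and the indicator characters never collide.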
Accessors for optional attributes
$ds->dirmax( [$value] ); # maximum files in a directory
$ds->dirlev( [$value] ); # number of directory levels
$ds->tocmax( [$value] ); # maximum toc entries
$ds->keymax( [$value] ); # maximum key entries
$ds->userdata( [$value] ); # default user data
If no dirmax is given, directories will keep being added to.
If no dirlev is given, the toc, key, and data files will reside in the top-level directory. If dirmax is given, dirlev defaults to 1.
If no tocmax is given, there will be only one toc file, which will grow indefinitely.
If no keymax is given, there will be only one key file, which will grow indefinitely.
If no userdata is given, it will default to a null string (padded with spaces) unless supplied another way.
OBJECT METHODS, Other
howmany( [$regx] )
Returns count of records whose indicators match regx, e.g.,
$self->howmany( qr/create|update/ );
$self->howmany( qr/delete/ );
$self->howmany( qr/oldupd|olddel/ );
If no regx, howmany() returns numrecs from the toc file, which should give the same number as qr/create|update/.
lastkeynum()
Returns the last key number used, i.e., the sequence number of the last record added to the datastore, as an integer.
nextkeynum()
Returns lastkeynum()+1 (a convenience method). This could be useful for adding a new record to a hash tied to a datastore, e.g.,
$h{ $ds->nextkeynum } = "New record data.";
(but also note that there is a "null key" convention for this -- see FlatFile::DataStore::Tiehash)
URI Configuration
It may seem odd to use a URI as a configuration file. I needed some configuration approach and wanted to stay as lightweight as possible. The specs for a URI are fairly well-known, and it allows for everything we need, so I chose that approach.
The examples all show a URL, because I thought it would be a nice touch to be able to visit the URL and have the page tell you things about the datastore. This is what the utils/flatfile-datastore.cgi program is intended to do, but it is in a very young/rough state so far.
Following are the URI configuration parameters. The order of the preamble parameters does matter: that's the order those fields will appear in each record preamble. Otherwise the order of the URI parameters doesn't matter.
Parameter values should be percent-encoded (uri escaped). Use %20 for space (don't be tempted to use '+'). Use URI::Escape::uri_escape, if desired, e.g.,
my $name = 'example';
my $dir = '/example/dir';
use URI::Escape;
my $datastore = FlatFile::DataStore::->new( {
    name => $name,
    dir  => $dir,
    uri  => join( ';' =>
        "http://example.com?name=$name",
        "desc=" . uri_escape( 'My DataStore' ),
        "defaults=medium",
        "user=" . uri_escape( '8- -~' ),
        "recsep=%0A",
        ) }
    );
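If URI::Escape is not available, the same percent-encoding can be approximated with core Perl. This is a minimal sketch, assuming byte (ASCII) input; pct_encode is a hypothetical helper, not part of FlatFile::DataStore, and it leaves only unreserved characters (letters, digits, '-', '.', '_', '~') unescaped.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything except unreserved characters.
# Assumes the input is a byte string (no wide characters).
sub pct_encode {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $s;
}

my $name = 'example';
my $uri  = join ';' =>
    "http://example.com?name=$name",
    'desc=' . pct_encode('My DataStore'),
    'defaults=medium',
    'user=' . pct_encode('8- -~'),
    'recsep=' . pct_encode("\n");

print "$uri\n";
```

The resulting string could then be passed as the uri parameter to new() or written as the only line of the .uri file.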
Preamble parameters
All of the preamble parameters are required.
(In fact, four of them are optional, but leaving them out means that you're giving up keeping the linked list of record history, so don't do that unless you have a good reason.)
- indicator
The indicator parameter specifies the single-character record indicators that appear in each record preamble. This parameter has the following form: indicator=length-5CharacterString, e.g.,
indicator=1-+#=*-
The length is always 1. The five characters represent the five states of a record in the datastore (in this order):
create(+): the record has not changed since being added
oldupd(#): the record was updated, and this entry is an old version
update(=): this entry is the updated version of a record
olddel(*): the record was deleted, and this entry is the old version
delete(-): the record is deleted, and this entry is the "delete record"
(The reason for a "delete record" is for storing information about the delete process, such as when it was deleted and by whom.)
The five characters shown in the example are the ones used by all examples in the documentation. You're free to use your own characters, but the length must always be 1.
- transind
The transind parameter describes the single-character transaction indicators that appear in each record preamble. This parameter has the same format and meanings as the indicator parameter, e.g.,
transind=1-+#=*-
(Note that only three of these are used, but all five must be given and must match the indicator parameter.)
The three characters that are used are create(+), update(=), and delete(-). While the record indicators will change, e.g., from create to oldupd, or from update to olddel, etc., the transaction indicators never change from their original values. So a transaction that created a record will always have the create value, and the same for update and delete.
- date
The date parameter specifies how the transaction date is stored in the preamble. It has the form: date=length-format, e.g.,
date=8-yyyymmdd
date=14-yyyymmddtttttt
date=4-yymd
date=7-yymdttt
The examples show the four choices for length: 4, 7, 8, or 14. When the length is 8, the format must contain 'yyyy', 'mm', and 'dd' in some order. When the length is 14, add 'tttttt' (hhmmss) in there somewhere.
When the length is 4, the format must contain 'yy', 'm', and 'd' in some order. When the length is 7, add 'ttt' (hms) in there somewhere, e.g.
date=8-mmddyyyy, date=8-ddmmyyyy, etc.
date=14-mmddyyyytttttt, date=14-ttttttddmmyyyy, etc.
date=4-mdyy, date=4-dmyy, etc.
date=7-mdyyttt, date=7-tttdmyy, etc.
When the length is 8 (or 14), the year, month, and day (and hours, minutes, seconds) are stored as decimal numbers, e.g., '20100615' for June 15, 2010 (or '20101224114208' for Dec 24, 2010 11:42:08).
When the length is 4 (or 7), they are stored as base62 numbers, e.g. 'WQ6F' (yymd) for June 15, 2010, or 'WQCOBg8' (yymdttt) for Dec 24, 2010 11:42:08.
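The base62 date examples above can be reproduced with a few lines of Perl. This is a sketch, not the module's implementation: to_base62 and encode_date are hypothetical names, and the digit order 0-9, A-Z, a-z is an assumption that happens to agree with the examples in this document.

```perl
use strict;
use warnings;

# Assumed digit order: 0-9, then A-Z, then a-z.
my @digits = ( 0 .. 9, 'A' .. 'Z', 'a' .. 'z' );

# Encode a non-negative integer as a zero-padded base62 string of $len digits.
sub to_base62 {
    my ( $n, $len ) = @_;
    my $s = '';
    do {
        $s = $digits[ $n % 62 ] . $s;
        $n = int( $n / 62 );
    } while $n > 0;
    die "value does not fit in $len digit(s)" if length($s) > $len;
    return ( '0' x ( $len - length($s) ) ) . $s;
}

# yymd (4) or yymdttt (7): two digits for the year, one for each other field.
sub encode_date {
    my ( $year, $mon, $day, @hms ) = @_;
    return join '', to_base62( $year, 2 ), map { to_base62( $_, 1 ) } $mon, $day, @hms;
}

print encode_date( 2010, 6, 15 ), "\n";
print encode_date( 2010, 12, 24, 11, 42, 8 ), "\n";
```

Every field other than the year fits in a single base62 digit because months, days, hours, minutes, and seconds are all below 62.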
- transnum
The transnum parameter specifies how the transaction number is stored in the preamble. It has the form: transnum=length-base, e.g.,
transnum=4-62
The example says the number is stored as a four-digit base62 integer. The highest transaction number this allows is 'zzzz' base62 which is 14,776,335 decimal. Therefore, the datastore will accommodate up to that many transactions (creates, updates, deletes).
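The limit quoted above can be checked by decoding the largest string of a given length. This is a minimal sketch; from_base is a hypothetical helper, and the digit order 0-9, A-Z, a-z is assumed. The same arithmetic applies to the other length-base parameters.

```perl
use strict;
use warnings;

# Assumed digit order: 0-9, then A-Z, then a-z.
my $digits = join '', 0 .. 9, 'A' .. 'Z', 'a' .. 'z';

# Decode a string of digits in the given base (2 to 62) to an integer.
sub from_base {
    my ( $str, $base ) = @_;
    my $n = 0;
    for my $ch ( split //, $str ) {
        my $v = index $digits, $ch;
        die "bad digit '$ch' for base $base" if $v < 0 || $v >= $base;
        $n = $n * $base + $v;
    }
    return $n;
}

print from_base( 'zzzz', 62 ), "\n";    # highest value for a 4-62 field
```

Equivalently, the highest value for a length-base pair is base ** length - 1.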
- keynum
The keynum parameter specifies how the record sequence number is stored in the preamble. It has the form: keynum=length-base, e.g.,
keynum=4-62
As with the transnum example above, the keynum would be stored as a four-digit base62 integer, and the highest record sequence number allowed would be 14,776,335 ('zzzz' base62). Therefore, the datastore could not store more than this many records.
- reclen
The reclen parameter specifies how the record length is stored in the preamble. It has the form: reclen=length-base, e.g.,
reclen=4-62
This example allows records to be up to 14,776,335 bytes long.
- thisfnum
The thisfnum parameter specifies how the file numbers are stored in the preamble. There are three file number parameters, thisfnum, prevfnum, and nextfnum. They must match each other in length and base. The parameter has the form: thisfnum=length-base, e.g.,
thisfnum=2-36
There is an extra constraint imposed on the file number parameters: they may not use a number base higher than 36. The reason is that the file number appears in file names, and base36 numbers match [0-9A-Z]. By limiting to base36, file names will therefore never differ only by case, e.g., there may be a file named example.Z.data, but never one named example.z.data.
The above example states that the file numbers will be stored as two-digit base36 integers. The highest file number is 'ZZ' base36, which is 1,295 decimal. Therefore, the datastore will allow up to that many data files before filling up. (If a datastore "fills up", it must be migrated to a newly configured datastore that has bigger numbers where needed.)
In a preamble, thisfnum is the number of the datafile where the record is stored. This number combined with the thisseek value and the reclen value gives the precise location of the record data.
- thisseek
The thisseek parameter specifies how the seek positions are stored in the preamble. There are three seek parameters, thisseek, prevseek, and nextseek. They must match each other in length and base. The parameter has the form: thisseek=length-base, e.g.,
thisseek=5-62
This example states that the seek positions will be stored as five-digit base62 integers. So the highest seek position is 'zzzzz' base62, which is 916,132,831 decimal. Therefore, each of the datastore's data files may contain up to that many bytes (record data plus preambles).
Incidentally, no record (plus its preamble) may be longer than this, because it just wouldn't fit in a data file.
Also, the size of each data file may be further limited using the datamax parameter (see below). For example, a seek value of 4-62 would allow datafiles up to 14,776,335 bytes long. If you want bigger files, but don't want them bigger than 500 Meg, you can give thisseek=5-62 and datamax=500M.
- prevfnum (optional)
The prevfnum parameter specifies how the "previous" file numbers are stored in the preamble. The value of this parameter must exactly match thisfnum (see thisfnum above for more details). It has the form: prevfnum=length-base, e.g.,
prevfnum=2-36
In a preamble, the prevfnum is the number of the datafile where the previous version of the record is stored. This number combined with the prevseek value gives the beginning location of the previous record's data.
This is the first of the four "optional" preamble parameters. If you don't provide this one, don't provide the other three either. If you leave these off, you will not be able to get a record's history of changes, and you will not be able to migrate any history to a new datastore.
So why would you not provide these? You might have a datastore that has very transient data, e.g., indexes, and you don't care about change history. By not including these four optional parameters, when the module updates a record, it will not perform the extra bit of IO to update a previous record's nextfnum and nextseek values. And the preambles will be a little bit shorter.
- prevseek (optional)
The prevseek parameter specifies how the "previous" seek positions are stored in the preamble. The value of this parameter must exactly match thisseek (see thisseek above for more details). It has the form: prevseek=length-base, e.g.,
prevseek=5-62
- nextfnum (optional)
The nextfnum parameter specifies how the "next" file numbers are stored in the preamble. The value of this parameter must exactly match thisfnum (see thisfnum above for more details). It has the form: nextfnum=length-base, e.g.,
nextfnum=2-36
In a preamble, the nextfnum is the number of the datafile where the next version of the record is stored. This number combined with the nextseek value gives the beginning location of the next version of the record's data.
- nextseek (optional)
The nextseek parameter specifies how the "next" seek positions are stored in the preamble. The value of this parameter must exactly match thisseek (see thisseek above for more details). It has the form: nextseek=length-base, e.g.,
nextseek=5-62
You would have a nextfnum and nextseek in a preamble when it's a previous version of a record whose current version appears later in the datastore. While thisfnum and thisseek are critical for all record retrievals, prevfnum, prevseek, nextfnum, and nextseek are only needed for getting a record's history. They are also used during a migration to help validate that all the data (and transactions) were migrated intact.
- user
The user parameter specifies the length and character class for extra user data stored in the preamble. It has the form: user=length-CharacterClass, e.g.,
user=8-%20-~   (must match /[ -~]+ */ and not be longer than 8)
user=10-0-9    (must match /[0-9]+ */ and not be longer than 10)
user=1-:       (must be literally ':')
When a record is created, the application supplies a value to store as "user" data. This might be a userid, an md5 digest, multiple fixed-length fields -- whatever is needed or wanted.
This field is required but may be preassigned using the userdata parameter (see below). If no user data is provided or preassigned, it will default to a null string (which will be padded with spaces).
When this data is stored in the preamble, it is padded on the right with spaces.
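The length and character-class rules for user data can be sketched as a small checker. This is an illustration under assumptions, not the module's code: format_user is a hypothetical helper, it does not handle an empty value, and it assumes the character class contains nothing that needs escaping inside a regex bracket expression.

```perl
use strict;
use warnings;

# Hypothetical helper: validate a value against a user=length-CharacterClass
# spec and return it padded on the right with spaces to the full length.
sub format_user {
    my ( $spec, $value ) = @_;
    my ( $len, $class ) = split /-/, $spec, 2;
    $class =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;    # undo percent-encoding
    die "user data too long (max $len)" if length($value) > $len;
    die "user data has characters outside the class '$class'"
        unless $value =~ /^[$class]+ *$/;
    return sprintf '%-*s', $len, $value;
}

printf "[%s]\n", format_user( '8-%20-~', 'Test1' );
printf "[%s]\n", format_user( '1-:',     ':' );
```

Padding to a fixed width keeps every preamble the same length, which is what lets the other preamble fields be found at fixed offsets.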
Preamble defaults
All of the preamble parameters -- except user -- may be set using one of the defaults provided, e.g.,
http://example.com?name=example;defaults=medium;user=8-%20-~
http://example.com?name=example;defaults=large;user=10-0-9
Note that these are in a default order also. And the user parameter is still part of the preamble, so you can make it appear first if you want, e.g.,
http://example.com?name=example;user=8-%20-~;defaults=medium
http://example.com?name=example;user=10-0-9;defaults=large
The _nohist versions leave out the optional preamble parameters -- the above caveat about record history still applies.
Finally, if none of these suits, they may still be good starting points for defining your own preambles.
- xsmall, xsmall_nohist
When the URI contains defaults=xsmall, the following values are set:
indicator=1-+#=*-
transind=1-+#=*-
date=7-yymdttt
transnum=2-62     3,843 transactions
keynum=2-62       3,843 records
reclen=2-62       3,843 bytes/record
thisfnum=1-36     35 data files
thisseek=4-62     14,776,335 bytes/file
prevfnum=1-36
prevseek=4-62
nextfnum=1-36
nextseek=4-62
The last four are not set for defaults=xsmall_nohist.
Rough estimates: 3800 records (or transactions), no larger than 3800 bytes each; 517 Megs total (35 * 14.7M).
- small, small_nohist
For defaults=small:
indicator=1-+#=*-
transind=1-+#=*-
date=7-yymdttt
transnum=3-62     238,327 transactions
keynum=3-62       238,327 records
reclen=3-62       238,327 bytes/record
thisfnum=1-36     35 data files
thisseek=5-62     916,132,831 bytes/file
prevfnum=1-36
prevseek=5-62
nextfnum=1-36
nextseek=5-62
The last four are not set for defaults=small_nohist.
Rough estimates: 238K records (or transactions), no larger than 238K bytes each; 32 Gigs total (35 * 916M).
- medium, medium_nohist
For defaults=medium:
indicator=1-+#=*-
transind=1-+#=*-
date=7-yymdttt
transnum=4-62     14,776,335 transactions
keynum=4-62       14,776,335 records
reclen=4-62       14,776,335 bytes/record
thisfnum=2-36     1,295 data files
thisseek=5-62     916,132,831 bytes/file
prevfnum=2-36
prevseek=5-62
nextfnum=2-36
nextseek=5-62
The last four are not set for defaults=medium_nohist.
Rough estimates: 14.7M records (or transactions), no larger than 14.7M bytes each; 1 Terabyte total (1,295 * 916M).
- large, large_nohist
For defaults=large:
datamax=1.9G      1,900,000,000 bytes/file
dirmax=300
keymax=100_000
indicator=1-+#=*-
transind=1-+#=*-
date=7-yymdttt
transnum=5-62     916,132,831 transactions
keynum=5-62       916,132,831 records
reclen=5-62       916,132,831 bytes/record
thisfnum=3-36     46,655 data files
thisseek=6-62     56G per file (but see datamax)
prevfnum=3-36
prevseek=6-62
nextfnum=3-36
nextseek=6-62
The last four are not set for defaults=large_nohist.
Rough estimates: 916M records/transactions, no larger than 916M bytes each; 88 Terabytes total (46,655 * 1.9G).
- xlarge, xlarge_nohist
For defaults=xlarge:
datamax=1.9G      1,900,000,000 bytes/file
dirmax=300
dirlev=2
keymax=100_000
tocmax=100_000
indicator=1-+#=*-
transind=1-+#=*-
date=7-yymdttt
transnum=6-62     56B transactions
keynum=6-62       56B records
reclen=6-62       56G per record (limited to 1.9G by datamax)
thisfnum=4-36     1,679,615 data files
thisseek=6-62     56G per file (but see datamax)
prevfnum=4-36
prevseek=6-62
nextfnum=4-36
nextseek=6-62
The last four are not set for defaults=xlarge_nohist.
Rough estimates: 56B records/transactions, no larger than 1.9G bytes each; 3 Petabytes total (1,679,615 * 1.9G).
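The "rough estimates" totals above follow from multiplying the number of data files by the bytes allowed per file. This is a sketch of that arithmetic only; max_value and total_bytes are hypothetical helpers, and datamax is passed here as a plain byte count.

```perl
use strict;
use warnings;

# Highest value a length-base spec such as '4-62' can hold.
sub max_value {
    my ($spec) = @_;
    my ( $len, $base ) = split /-/, $spec;
    return $base**$len - 1;
}

# Rough total: number of data files times bytes per file,
# where datamax (in bytes), if given, caps the per-file size.
sub total_bytes {
    my (%p) = @_;
    my $per_file = max_value( $p{thisseek} );
    $per_file = $p{datamax}
        if defined $p{datamax} && $p{datamax} < $per_file;
    return max_value( $p{thisfnum} ) * $per_file;
}

printf "xsmall: %.0f bytes\n",
    total_bytes( thisfnum => '1-36', thisseek => '4-62' );
printf "large:  %.0f bytes\n",
    total_bytes( thisfnum => '3-36', thisseek => '6-62', datamax => 1_900_000_000 );
```

These totals count preambles and record separators as well as record data, so the usable space for data is somewhat smaller.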
Other required parameters
- name
The name parameter identifies the datastore by name. This name should be short and uncomplicated, because it is used as the root for the datastore's files.
- recsep
The recsep parameter gives the ascii character(s) that will make up the record separator. The "flatfile" strategy suggests that these characters ought to match what your OS considers to be a "newline". But in fact, you could use any string of ascii characters.
recsep=%0A         (LF)
recsep=%0D%0A      (CR+LF)
recsep=%0D         (CR)
recsep=%0A---%0A   (HR -- sort of)
(But keep in mind that the recsep is also used for the key files and toc files. So a simpler recsep is probably best.)
Also, if you develop your data on unix with recsep=%0A and then copy it to a windows machine, the module will continue to use the configured recsep, i.e., it is not tied to the OS.
Other optional parameters
- desc
The desc parameter provides a means to give a short description (or perhaps a longer name) for the datastore.
- datamax
The datamax parameter gives the maximum number of bytes a data file may contain. If you don't provide a datamax, it will be computed from the thisseek value (see thisseek above for more details).
The datamax value is simply a number, e.g.,
datamax=1000000000 (1 Gig)
To make things easier to read, you can add underscores, e.g.,
datamax=1_000_000_000 (1 Gig)
You can also shorten the number with an 'M' for megabytes (10**6) or a 'G' for gigabytes (10**9), e.g.,
datamax=1000M   (1 Gig)
datamax=1G      (1 Gig)
Finally, with 'M' or 'G', you can use fractions, e.g.
datamax=.5M     (500_000)
datamax=1.9G    (1_900_000_000)
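The datamax notations above can be expanded to a byte count along these lines. This is a hedged sketch rather than the module's parser: parse_datamax is a hypothetical name, and it assumes the suffix is an uppercase 'M' or 'G' and that fractions are only allowed together with a suffix.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a datamax value such as '1_000_000_000',
# '1000M', '.5M', or '1.9G' into a whole number of bytes.
sub parse_datamax {
    my ($spec) = @_;
    ( my $clean = $spec ) =~ tr/_//d;    # underscores are only for readability
    my ( $num, $unit ) = $clean =~ /^(\d*\.?\d+)([MG]?)$/
        or die "bad datamax: $spec";
    die "fractions need an M or G suffix: $spec"
        if $num =~ /\./ && $unit eq '';
    my %mult = ( '' => 1, M => 10**6, G => 10**9 );
    return sprintf '%.0f', $num * $mult{$unit};
}

print "$_ => ", parse_datamax($_), "\n"
    for qw( 1000000000 1_000_000_000 1000M 1G .5M 1.9G );
```

The sprintf rounds away floating-point noise, so a value like 1.9G comes back as a whole number of bytes.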
- keymax
The keymax parameter gives the number of record keys that may be stored in a key file. This simply limits the size of the key files, e.g.,
keymax=10_000
The maximum bytes would be:
keymax * (preamble length + recsep length)
The numeric value may use underscores and 'M' or 'G' as described above for datamax.
- tocmax
The tocmax parameter gives the number of data file entries that may be stored in a toc (table of contents) file. This simply limits the size of the toc files, e.g.,
tocmax=10_000
Each (fairly short) line in a toc file describes a single data file, so you would need a tocmax only in the extreme case of a datastore with thousands or millions of data files.
The numeric value may use underscores and 'M' or 'G' as described above for datamax.
- dirmax
The dirmax parameter gives the number of files (and directories) that may be stored in a datastore directory, e.g.,
dirmax=300
This allows a large number of data files (and key/toc files) to be created without there being too many files in a single directory.
(The numeric value may use underscores and 'M' or 'G' as described above for datamax.)
If you specify dirmax without dirlev (see below), dirlev will default to 1.
Without dirmax and dirlev, a datastore's data files (and key/toc files) will reside in the same directory as the uri file, and the module will not limit how many you may create (though the size of your filesystem might).
With dirmax and dirlev, these files will reside in subdirectories.
Giving a value for dirmax will also limit the number of data files (and key/toc files) a datastore may have, by this formula:
max files = dirmax ** (dirlev + 1)
So dirmax=300 and dirlev=1 would result in a limit of 90,000 data files. If you go to dirlev=2, the limit becomes 27,000,000, which is why you're unlikely to need a dirlev greater than 2.
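The formula above is easy to check directly. This is a minimal sketch; max_files is a hypothetical helper, and the default of dirlev=1 when only dirmax is given follows the description above.

```perl
use strict;
use warnings;

# max files = dirmax ** (dirlev + 1); dirlev defaults to 1 when omitted.
sub max_files {
    my ( $dirmax, $dirlev ) = @_;
    $dirlev = 1 unless defined $dirlev;
    return $dirmax**( $dirlev + 1 );
}

printf "dirmax=300, dirlev=%d: %d files\n", $_, max_files( 300, $_ ) for 1, 2;
```

Each extra directory level multiplies the limit by dirmax.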
- dirlev
The dirlev parameter gives the number of levels of directories that a datastore may use, e.g.,
dirlev=1
You can give a dirlev without a dirmax, which would store the data files (and key/toc files) in subdirectories, but wouldn't limit how many files may be in each directory.
- userdata
The userdata parameter is similar to the userdata parameter in the call to new(). It specifies the default value to use if the application does not provide a value when creating, updating, or deleting a record.
Those provided values will override the value given in the call to new(), which will override the value given here in the uri.
If you don't specify a default value here or in the call to new(), the value defaults to a null string (which would be padded with spaces).
userdata=:
The example is contrived for a hypothetical datastore that doesn't need this field. Since the field is required, the above setting will always store a colon (and the user parameter might be user=1-:).
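The order of precedence described above can be written as a small fallback chain. This is a sketch of the rule only; effective_userdata is a hypothetical helper, and treating undef as "not provided" is an assumption about how a missing value is represented.

```perl
use strict;
use warnings;

# Precedence: value given with the record operation, then the userdata
# passed to new(), then userdata from the uri, then a null string.
sub effective_userdata {
    my ( $call_value, $new_value, $uri_value ) = @_;
    for my $candidate ( $call_value, $new_value, $uri_value ) {
        return $candidate if defined $candidate;
    }
    return '';    # null string; padded with spaces when stored
}

print "[", effective_userdata( undef, ':', 'x' ), "]\n";
```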
CAVEATS
This module is still in an experimental state. The tests are sparse. When I start using it in production, I'll bump the version to 1.00.
Until then (afterwards, too) please use with care.
AUTHOR
Brad Baxter, <bbaxter@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2011 by Brad Baxter
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.