NAME

Tie::DB_File::SplitHash - Divides a DB_File hash across multiple files

SYNOPSIS

use Tie::DB_File::SplitHash;

[$X =] tie %hash,  'Tie::DB_File::SplitHash', $filename, $flags, $mode, $DB_HASH, $multi_n;

$status = $X->del($key [, $flags]) ;
$status = $X->put($key, $value [, $flags]) ;
$status = $X->get($key, $value [, $flags]) ;
$status = $X->seq($key, $value, $flags) ;
$status = $X->sync([$flags]) ;
$status = $X->fd ;

$hash{'this'} = 'that';
my $entry     = $hash{'this'};

undef $X;  # See the 'untie() Gotcha' in the DB_File documentation
untie %hash;

$multi_n determines the 'spread out' or number of files the hash will be split between. The larger the number, the larger the final hash can be.

The other parameters are as defined in DB_File.

DESCRIPTION

Transparently splits a Berkeley DB_File database into multiple files to allow the exceeding of file system limits on file size. From the outside, it behaves identically with Berkeley DB DB_File hash support in general with the exception of 'seq' in object oriented mode. This has the potential to greatly expand the amount of data that can be stored on a file size limited file system.

It does so by taking a hash of the key to be stored, folding the resulting hash into a value from 0 to X and storing the data to a db file selected by the value 0 to X. The randomizing behavior of the hash and subsequent fold down distribute the records essentially randomly between the X+1 database files, raising the capacity of the database to X+1 times the capacity of a single file database on the average.

In other words: If your filesystem is limited to (for example) 2 gigabyte files, but you need to store more than that much data in a Berkeley hash, you can use this module to efficiently do so.

NOTE: Using an 'in-memory' database is not supported by this. Use DB_File directly if you want to do that.

BTREE and RECNO databases are not supported by this module either.

The module by default exports the following constants and variables from DB_File (see DB_File for full details on what they are for):

$DB_HASH       DB_LOCK         DB_SHMEM        DB_TXN      HASHMAGIC
HASHVERSION    MAX_PAGE_NUMBER MAX_PAGE_OFFSET MAX_REC_NUMBER
RET_ERROR      RET_SPECIAL     RET_SUCCESS     R_CURSOR
R_DUP          R_FIRST         R_FIXEDLEN      R_IAFTER
R_IBEFORE      R_LAST          R_NEXT          R_NOKEY
R_NOOVERWRITE  R_PREV          R_RECNOSYNC     R_SETCURSOR
R_SNAPSHOT __R_UNUSED

We also export the Fcntl O_xxxx constants.

You can suppress those exports by 'use'ing the module with an empty parameter list:

use Tie::DB_File::SplitHash ();

For documentation on the methods and features of a DB_File hash - see the documentation for DB_File. This module is essentially a wrapper around DB_File that layers on the additional functionality of using multiple files to store the data.

WARNING - changing the 'split' factor on an existing database will result in data loss. Don't do it.

CHANGES

1.07 2020.10.04 - Fixes for MSWin directory permissions
                  and file descriptor behavior breaking tests.

1.06 2020.10.03 - Fixes for temp directory issue in tests (thanks goes
                  to jkeenan [...] cpan.org for identifying the problem
                  and submitting a patch for it). Updated maintainer info.
                  Updated build support. Relicensed under MIT license.

1.05 2005.11.18 - Added version requirement for Pod::Coverage to tests

1.04 2005.10.03 - Fixed build test failures under MSWindows.
                  Merged db creation tests with newer tests.

1.03 2005.09.29 - Fixed build test failure caused by root being able
                  to create directories and files even in 'forbidden'
                  directories. No functional changes.

1.02 2005.09.27 - Added Build.PL support, META.yml and Changes. Revised
                  documentation. Extended build test coverage to 100%
                  code coverage. Removed unneeded usage of 'Tie::Hash'.
                  Fixed bug in NEXTKEY causing CLEAR to throw errors.
                  Added LICENSES, GPL_License.txt and
                  Artistic_License.txt.

1.01 2000.03.06 - Removed 'dependencies' on built-ins that caused
                  'make' failures and added install tests.

METHODS

The object methods only apply if you are using the object oriented interface instead of the tied hash interface. There is the significant limitation that 'seq' does not work correctly in a split database.

$status = $X->get($key, $value [, $flags]) ;

See DB_File.

$status = $X->put($key, $value [, $flags]);

See DB_File.

$status = $X->del($key [, $flags]);

See DB_File.

$status = $X->fd;

See DB_File. Note - since multiple databases are actually open, only the file descriptor for the '1st' underlaying database is returned.

$status = $X->seq($key, $value, $flags);

'seq' DOES NOT WORK. DO NOT USE IT. This DB_File method is difficult to make work correctly in a split database.

$status = $X->sync([$flags]);

See DB_File.

$result = $X->exists($key);

Returns true if the specified key exists in the database.

$result = $X->clear;

Clears (removes all keys and values) the entire hash.

COPYRIGHT

Copyright 1999-2020, Jerilyn Franz and FreeRun Technologies, Inc. All Rights Reserved.

LICENSE

MIT License

Copyright (c) 2020 Jerilyn Franz and Freerun Technologies, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

DISCLAIMER

THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

Use of this software in any way or in any form, source or binary, is not allowed in any country which prohibits disclaimers of any implied warranties of merchantability or fitness for a particular purpose or any disclaimers of a similar nature.

IN NO EVENT SHALL I BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION (INCLUDING, BUT NOT LIMITED TO, LOST PROFITS) EVEN IF I HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE

AUTHOR

Jerilyn Franz

TODO

'seq' functionality

VERSION

1.07 - 2020.10.04

SEE ALSO

DB_File