NAME

ObjStore - Perl Extension For ObjectStore OODBMS

SYNOPSIS

The new SQL and the sunset of relational databases.

DESCRIPTION

ObjectStore is the leading object-oriented database. It is engineered by Object Design, Inc. ( http://www.odi.com ) (NASDAQ: ODIS). The database uses the virtual memory mechanism to make persistent data available in the most efficient manner possible.

In case you didn't know, Object Design's Persistent Storage Engine has been licensed by Sun, Microsoft, Netscape, and Symantec for inclusion in their Java development environments. While this (presumably) gains ODI credibility, the real strength of ObjectStore is its integration with perl.

Prior to this joining of forces,

  • ObjectStore was too radical a design decision for many applications.

  • Perl5 did not have a simple way of storing complex data persistently.

Now there is an easy way to build databases, especially if you care about preserving your ideals of data encapsulation. (See below!)

TUTORIAL

The best way to get started is to win the tutorial! See ObjStore::Tutorial!

WHAT IS PERSISTENT PERL?

It's just like normal perl, except that you can create data that doesn't go away when your program exits. This more permanent data lives in files or raw disk partitions that are divided into databases. And databases are comprised of...

Segments

Segments dynamically resize from very small to very big. You should split your data into lots of segments when it makes sense. Segments improve locality and can be a unit of locking or caching.

When you create a database object you must specify the segment in which it is to be allocated. All objects use the form 'new $class($near, ...)'. You may pass any persistent object (or database, or segment) in place of $near and the new object will be created appropriately!

Hashes

The following code snippet creates a persistent hash reference with an expected cardinality of ten elements.

my $h7 = new ObjStore::HV($store, 10);

An array representation is used for low cardinalities. Arrays do not scale well, but they do afford a pleasingly compact representation. ObjectStore's os_Dictionary is transparently used for large cardinalities [MAYCHANGE].

Persistent data structures can be built with the normal perl construction:

$h7->{foo} = { 'fwaz'=> { 1=>'blort', 'snorf'=>3 }, b=>'ouph' };

Or the equally effective, but unbearably tedious:

my $h1 = $h7->{foo} ||= new ObjStore::HV($h7);
my $h2 = $h1->{fwaz} ||= new ObjStore::HV($h1);
$h2->{1}='blort';
$h2->{snorf}=3;
$h1->{b}='ouph';

Perl saves us again! Relief.

Arrays

The following code snippet creates a persistent array reference with an expected cardinality of ten elements.

my $a7 = new ObjStore::AV($store, 10);

None of the usual array operations are supported except fetch and store. (Actually push, pop, shift and unshift might be available but undocumented.) At least the following works:

$a7->[1] = [1,2,3,[4,5],6];

Complete array support will be available as soon as Larry and friends fix the TIEARRAY interface. (See perltie(3) or http://www.perl.com for more info.)

References

You can generate a reference to any persistent object with the method new_ref($segment). Since refcnts are not updated remotely, refs are the safest way to refer across databases. They are also designed to be allocated transiently.

$r->open($how);              # attempts to open the focus' database
$yes = $r->deleted;          # is the focus deleted?
$f = $r->focus;              # returns the focus of the ref
$str = $r->dump;             # the ref as a string

Be aware that references can return garbage if they are not open. You will need to open them explicitly (see ObjStore::Ref::POSH_ENTER). Also note that references use significantly more memory than pointers. (Look up os_reference_protected in the ODI FAQ.)

Unsafe, unprotected references are also available:

$r = $o->new_ref('transient', 'unsafe');

Care must be taken that these hard references do not point to objects that have already been deleted. SEGV or garbled data can result. It is always safe to use hard references provided they are used merely to avoid circular references within a single database.

Cursors

All containers have a method, new_cursor($near), that creates a persistent cursor for the given container. The following methods are available.

$cs->focus();                 # returns the cursor's collection
$cs->moveto($pos);            # seek to the nth element
($k,$v) = $cs->at;            # returns the current element
($k,$v) = $cs->each(1);       # returns the next element

  • First-class cursors for arrays & hashes are incomplete and under construction. If you can avoid them, do so!

  • Array cursors return (index,value) pairs. Hash cursors return (key,value) pairs. All cursors return the empty list () when no more elements are available.

  • For hashes & arrays, you should not assume the order of iteration will follow any particular pattern (but it probably will).

  • If you change membership of a collection while you're iterating through it, something could break, so don't.
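The cursor contract described above can be modelled in plain perl without a database. This is only a sketch: make_array_cursor is a hypothetical helper, not part of ObjStore, and it mimics just the documented behaviour (index,value pairs, and the empty list when no more elements are available).

```perl
use strict;
use warnings;

# Hypothetical stand-in for an array cursor; NOT the ObjStore implementation.
sub make_array_cursor {
    my ($array) = @_;
    my $pos = -1;                        # positioned before the first element

    # Return (index, value), or the empty list when out of range.
    my $current = sub {
        return () if $pos < 0 || $pos > $#$array;
        return ($pos, $array->[$pos]);
    };

    my %cursor = (
        moveto => sub { $pos = $_[0]; return },
        at     => $current,
        each   => sub { $pos += $_[0]; return $current->() },
    );
    return \%cursor;
}

my $cs = make_array_cursor([qw(a b c)]);
my @seen;
while (my ($k, $v) = $cs->{each}->(1)) {   # loop ends on the empty list
    push @seen, "$k=$v";
}
print join(',', @seen), "\n";
```

Because the exhausted cursor returns the empty list, the list assignment in the while condition evaluates to zero and the loop stops without any extra end-of-collection test.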

Indices

SQL has this great facility called indices. ObjStore does too, and they work in almost the same way!

my $nx = ObjStore::Index->new($near);
$nx->configure(unique => 1, path=>"name");

$nx->add({name=> 'Square'}, {name=> 'Round'}, {name=> 'Triangular'});

my $c = $nx->new_cursor;
$c->seek('T');
$c->step(1);
warn $c->at()->{name}; # Triangular

Index cursors are a lot more powerful than hash or array cursors. Here are the available methods:

$c->focus();
$c->moveto();
$c->step($delta);
$c->seek(@keys);
my $pos = $c->pos();            
my @keys = $c->keys();
my $v = $c->at();
my $v = $c->each($delta);

Where the following invariants hold:

$c->moveto($c->pos() + $delta) is $c->step($delta)
$c->each($delta) is { $c->step($delta); $c->at(); }
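The two invariants can be checked against a toy model. The ToyCursor class below is an assumption made purely for illustration (an in-memory sorted list with a position); it is not ObjStore::Index and says nothing about how the real cursor is implemented.

```perl
use strict;
use warnings;

# Hypothetical in-memory model of an index cursor; NOT ObjStore::Index.
package ToyCursor;

sub new {
    my ($class, @sorted) = @_;
    return bless { rows => [@sorted], pos => -1 }, $class;
}
sub pos    { $_[0]{pos} }
sub moveto { $_[0]{pos} = $_[1]; return }
sub step   { $_[0]{pos} += $_[1]; return }
sub at {
    my ($self) = @_;
    my $p = $self->{pos};
    return undef if $p < 0 || $p > $#{ $self->{rows} };
    return $self->{rows}[$p];
}
# each($delta) is defined as step($delta) followed by at()
sub each { my ($self, $delta) = @_; $self->step($delta); return $self->at }

package main;

my @names = sort qw(Square Round Triangular);
my $x = ToyCursor->new(@names);
my $y = ToyCursor->new(@names);

$x->moveto($x->pos + 2);   # first invariant, left-hand side
$y->step(2);               # first invariant, right-hand side
print join(' ', $x->at, $y->at), "\n";
```

Both cursors end up on the same element, which is all the first invariant promises; the second invariant holds by construction in this model.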

To eliminate the possibility of indices becoming out-of-sync with actual data, keys are marked read-only as they are indexed. (Another scheme for keeping indices up-to-date is to use os_backptr. This scheme is not supported because it has considerable memory overhead and provides little benefit beyond our read-only scheme.) For example,

my $category = { name => 'Bath Toy' };
my $row = { name => 'Rubber Ducky', category => $category };

$index->configure(path => 'category/name, name');
$index->add($row);

The first key is the category's name. The second key is the row's name. You cannot index direct keys more than once. There is no such restriction on indirect keys.
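To make the path notation concrete, here is a small sketch of how a path such as 'category/name, name' could be resolved into keys for one row. The helper keys_for_path is hypothetical and works on ordinary transient hashes; the real index does its own traversal.

```perl
use strict;
use warnings;

# Hypothetical helper: resolve an index path such as 'category/name, name'
# into the list of key values for one row. NOT the ObjStore implementation.
sub keys_for_path {
    my ($row, $path) = @_;
    my @keys;
    for my $spec (split /\s*,\s*/, $path) {
        my $node = $row;
        $node = $node->{$_} for split m{/}, $spec;   # walk one hash per step
        push @keys, $node;
    }
    return @keys;
}

my $category = { name => 'Bath Toy' };
my $row      = { name => 'Rubber Ducky', category => $category };

my @keys = keys_for_path($row, 'category/name, name');
print join(' | ', @keys), "\n";
```

Here 'category/name' is an indirect key (reached through another object) and 'name' is a direct key of the row itself.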

And Access Paths (Oh My!)

If you cross your eyes, you will see an array of references or cursors as an access path. Two simple implementations are provided for manipulating access paths:

ObjStore::Path::Ref            # access path composed of refs
ObjStore::Path::Cursor         # cursor based access path

See the source code for details.

DATABASE DESIGN

The best design is to be flexible!

ospeek [-all] [-addr] [-refcnt] [-raw] <database>

While there is no formalized schema for a perl database, the ospeek utility generates a sample of data content and structure. ospeek never outputs more than a short summary, regardless of the size of your database.

You can also get the same thing from within perl:

ObjStore::peek($whatever);

Wait! No Schema?! How Can This Scale?

How can a relational database scale?! When you write down a central schema, you are violating the principle of encapsulation. This is dumb. None of the usual database management operations require a central schema. Why create artificial dependencies between your classes when you can avoid it?

The Theory of Lazy Evolution

A database can be so large, it might be impossible to evolve it all at once! As databases grow, lazy evolution is really the only way to scale up. Fortunately, it's not as complicated as it sounds since all changes can be sorted into three categories. The following examples will involve a hypothetical Spoon class and its associated instances.

  • INTERFACE

    If you want to change the interface to class Spoon, you have to do a bit of planning. All the Spoons in the database probably rely on themselves working as they have in the past. Therefore, you need to copy the class and rename it to Spoon2. The evolution process can be triggered by bless (see the section on customizing bless).

  • IMPLEMENTATION

    Changing the implementation is easier. By using version numbers or careful coding, you should be able to add new representations while at the same time maintaining backward compatibility. Both soup-spoons and tea-spoons are great for ice cream. is_evolved and evolve methods should be used to upgrade the representation whenever it is deemed necessary.

  • EVERYTHING ELSE

    Most Americans would agree that using chop-sticks is significantly different from using a spoon. Naturally, the fuzzy case will require thought proportional to the triangulation between the old and new sport utility functions. Fortunately, the upside is that your database can afford to be lazy. It can learn to use chop-sticks one stick at a time.

Also see the is_corrupted method for integrity verification.
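The IMPLEMENTATION case can be sketched with a version number stored in each object. Everything here is an assumption for illustration: the field names, the version numbering, and the choice to evolve inside an accessor are not prescribed by ObjStore, and the objects are ordinary transient hashes.

```perl
use strict;
use warnings;

package Spoon;

our $CURRENT_VERSION = 2;   # assumed versioning scheme, for illustration only

# Old-style constructor: version 1 stored the length in centimetres.
sub new_v1 {
    my ($class, $cm) = @_;
    return bless { version => 1, length_cm => $cm }, $class;
}

sub is_evolved { $_[0]{version} == $CURRENT_VERSION }

# Upgrade one object in place; version 2 stores millimetres instead.
sub evolve {
    my ($self) = @_;
    return $self if $self->is_evolved;
    $self->{length_mm} = delete($self->{length_cm}) * 10;
    $self->{version}   = $CURRENT_VERSION;
    return $self;
}

# Accessors evolve on first touch, so old spoons are upgraded lazily.
sub length_mm { $_[0]->evolve->{length_mm} }

package main;

my $spoon = Spoon->new_v1(15);
my $before = $spoon->is_evolved ? 'new' : 'old';
my $mm     = $spoon->length_mm;            # triggers the upgrade
my $after  = $spoon->is_evolved ? 'new' : 'old';
print "$before -> $after, $mm mm\n";
```

Only the spoons that are actually touched pay the cost of the upgrade, which is the point of evolving lazily.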

RDBMS Emulation

Un-structured perl databases are probably under-constrained for most applications. Fortunately, RDBMS style tables have been adapted, adopted, and included with this package. While they are a little different from traditional tables, hopefully relational developers will feel right at home. See ObjStore::Table3.

API REFERENCE

Fortunately, you probably will not need to use most of the API. It is exhibited here mainly to make it seem like this extension has a difficult and steep learning curve. In general, the API mostly mirrors the C++ API. Refer to the ObjectStore documentation for exact semantics. The API for ::UNIVERSAL is probably of most interest to ex-C++ developers. If you need a function that isn't available in perl, send mail to the OS/Perl mailing list (see the README).

Quick Start!

 #!/usr/local/bin/perl -w

 use ObjStore;
 use ObjStore::Config;

 my $db = ObjStore::open(TMP_DBDIR."/myjunk", 'update', 0666);

 begin 'update', sub {

   # $junk might be any arbitrary non-circular structure up to 2GB
   my $junk = { 
                 pork => [1,2,3],
                 chicken => [1,2,3],
                 wheat => [1,2,3],
                 tests => 0,
              };

   $db->root("junk", sub { $junk });
   
   my $pjunk = $db->root("junk");
   $pjunk->{tests} ++;

   ObjStore::peek($db);
 };

ObjStore

  • $db = ObjStore::open($pathname, $read_only, $mode);

    Also see ObjStore::HV::Database & ObjStore::Table3::Database.

  • $name = ObjStore::release_name()

  • $major = ObjStore::release_major()

  • $minor = ObjStore::release_minor()

  • $maintenance = ObjStore::release_maintenance()

  • $yes = ObjStore::network_servers_available();

  • $num = ObjStore::return_all_pages();

  • $size = ObjStore::get_page_size();

  • @Servers = ObjStore::get_all_servers();

  • $in_abort = ObjStore::abort_in_progress();

  • $num = ObjStore::get_n_databases();

::Server

  • $name = $s->get_host_name();

  • $is_broken = $s->connection_is_broken();

  • $s->disconnect();

  • $s->reconnect();

  • @Databases = $s->get_databases();

::Database

See ObjStore::HV::Database, ObjStore::Table3::Database

  • $open_mode = $db->is_open();

  • $s = $db->create_segment();

  • $value = $db->root($root_name => sub{ $new_value });

    This is the recommended API for roots. If the given root is not found, creates a new one. Returns the root's current value.

  • $s = $db->get_segment($segment_number);

    Note that this method (correctly) never returns an error. The only way to know which segments are actually created in a database is to iterate through get_all_segments.

  • @Segments = $db->get_all_segments();

  • $db->close();

  • $db->destroy();

  • $db->get_default_segment_size();

  • $db->get_sector_size();

  • $db->size();

  • $db->size_in_sectors();

  • $ctime = $db->time_created();

  • $can_write = $db->is_writable();

  • $db->set_fetch_policy(policy[, blocksize]);

    Policy can be one of segment, page, or stream.

  • $db->set_lock_whole_segment(policy);

    Policy can be one of as_used, read, or write.

  • @Roots = $db->get_all_roots();

  • $root = $db->create_root($root_name);

  • $root = $db->find_root($root_name);

  • $db->destroy_root($root_name);

    Destroys the root with the given name if it exists.

::Root

  • $root->get_name();

  • $root->get_value();

  • $root->set_value($new_value);

  • $root->destroy();

::Transaction

ObjectStore transactions and exceptions are seamlessly integrated into perl. ObjectStore exceptions cause a die in perl just as perl exceptions cause a transaction abort.

begin 'update', sub {
    $top = $db->root('top');
    $top->{abc} = 3;
    die "Oops!  abc should not change!";       # aborts the transaction
};

There are three types of transactions: read, update, and abort_only. The default is read. Read transactions are blindingly fast.

    begin 'read', sub {
	my $var = $db->root('top');
	$var->{abc} = 7;	# write to $var triggers die(...)
    };

(In a read transaction, you are not allowed to modify persistent data.)
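The abort-on-die behaviour can be pictured with a transient model. The function toy_begin below is hypothetical and works on an ordinary hash by taking a snapshot and restoring it; the real begin is supplied by ObjStore and relies on the database, not on copying.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical model of abort-on-die; the real begin() is provided by ObjStore.
sub toy_begin {
    my ($store, $code) = @_;
    my $snapshot = dclone($store);          # remember the state before the transaction
    my $ok = eval { $code->($store); 1 };
    %$store = %$snapshot unless $ok;        # roll back everything on die
    return $ok;
}

my %top = (abc => 1);
toy_begin(\%top, sub {
    my ($t) = @_;
    $t->{abc} = 3;
    die "Oops!  abc should not change!\n";  # aborts the toy transaction
});
print "abc is $top{abc}\n";
```

The write to abc is undone because the die propagated out of the transaction body.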

  • $T = ObjStore::Transaction::get_current();

  • $type = $T->get_type();

  • $pop = $T->get_parent();

  • $T->prepare_to_commit();

  • $yes = $T->is_prepare_to_commit_invoked();

  • $yes = $T->is_prepare_to_commit_completed();

  • ObjStore::set_transaction_priority($very_low);

  • ObjStore::set_max_retries($oops);

  • ObjStore::fatal_exceptions($yes);

  • my $oops = ObjStore::get_max_retries();

  • my $yes = ObjStore::is_lock_contention();

  • my $type = ObjStore::get_lock_status($ref);

  • my $tm = ObjStore::get_readlock_timeout();

  • my $tm = ObjStore::get_writelock_timeout();

  • ObjStore::set_readlock_timeout($tm);

  • ObjStore::set_writelock_timeout($tm);

::Segment

  • $s->set_comment($comment);

  • $s->destroy();

  • $size = $s->size();

  • $yes = $s->is_empty();

  • $yes = $s->is_deleted();

  • $num = $s->get_number();

  • $comment = $s->get_comment();

  • $s->lock_into_cache();

  • $s->unlock_from_cache();

  • $s->set_fetch_policy($policy[, $size]);

    Policy can be one of segment, page, or stream.

  • $s->set_lock_whole_segment($policy);

    Policy can be one of as_used, read, or write.

::Notification

  • ObjStore::subscribe(...);

  • ObjStore::unsubscribe(...);

  • set_queue_size($size);

  • ($size, $pending, $overflow) = queue_status();

  • $fd = _get_fd();

  • $n = receive([$timeout]);

  • $db = $n->get_database;

  • $p = $n->focus;

  • $why = $n->why;

::UNIVERSAL

All persistent objects inherit from ObjStore::UNIVERSAL.

  • overload

    Stringify, boolean coercion, and equality tests.

  • bless

    bless and isa work according to the moment of the blessing, rather than searching the current @ISA tree. (UNIVERSAL::can remains un-modified.) The method os_class reports the natural persistent class of an object.

  • $o->notify($why, ['now']);

    Without the 'now' parameter, notification will take place after commit.

  • $errs = $o->is_corrupted($verbosity_level)

    Application specific integrity checking can be achieved by providing an _is_corrupted method.

  • Lazy Evolution

    Use is_evolved() to know if evolution is needed. $o->evolve() should bring stuff up-to-date.

  • posh

    posh behavior can be customized by adding special methods. See the section on posh.

  • Of

    database_of and segment_of are always available as methods.

To make everything seem apparently consistent, ObjStore::Database is completely special-cased to support most of the features above.

THE ADVANCED CHAPTER

Performance Check List

The word tuning implies too high a brain-level requirement. Getting performance out of ObjectStore is not rocket science.

  • COMPACTNESS

    You get 90% of your performance because you can fit your whole working data set into RAM. If you are doing a good job, your un-indexed database should be less than twice the size of its un-compressed ASCII dump; i.e., less than 2 times expansion. (See the section on representation.)

  • SEGMENTS

    Is your data partitioned into as many segments as possible?

  • DO AS MUCH AS POSSIBLE PER TRANSACTION

    Transactions, especially update transactions, involve a good deal of setup/cleanup. The more you do per transaction the better.

  • AVOID THE NETWORK

    Run your program on the same machine as the ObjectStore server.

  • DO STUFF IN PARALLEL

    If you have an MP machine, you can do reads/updates in parallel (even without perl threads).

  • WHERE IS THE REAL BOTTLENECK?

    Use Devel::*Prof or a similar tool to analyze your program. Make your client-side cache bigger/smaller.

  • SPEED UP PERL

    Try using the perl compiler. See http://www.perl.com

  • LOCKING AND CACHING

    Object Design claims that caching and locking parameters also impact performance. (See os_segment::set_lock_whole_segment and os_database::set_fetch_policy.)

  • THROW MONEY AT THE PROBLEM

    Get more memory, more CPUs, and upgrade your network.

Transactions Redux

  • A BRIEF HISTORY OF TRANSACTIONAL MEMORY USAGE

    Each time you access a persistent object, a small amount of transient memory is reserved until the transaction completes (to cope with perl scoping rules). For this reason, and for speed, you should avoid repeating long access paths.

    ++ $at->{bob}{house}{fridge}{beer};  #go to the minimart
    -- $at->{bob}{house}{fridge}{beer};  #big gulp

    Instead, use a lexical variable to keep the fridge door open:

    my $fridge = $at->{bob}{house}{fridge};
    ++ $fridge->{beer};
    -- $fridge->{beer};

  • NESTING

    Nested transactions are supported, but transaction modes must match. You can nest reads within reads or updates within updates, but not reads within updates nor updates within reads. If you need to do a read but you don't care if the parent transaction is an update or not, you can leave the mode unspecified.

    sub do_extra_push_ups_in_a_transaction {
      begin sub {
        ...
        # Unspecified assumes a read or the same mode as the parent.
        ...
      };
    }

  • RELAXING EXCEPTION SEVERITY

    Transactions are always executed within an implicit eval. If you do not want your program to become suicidal when an ObjectStore exception occurs, you should indicate that you want to have control over your own reflexive behavior:

    ObjStore::fatal_exceptions(0);

    This is global to the whole process. After a transaction, you absolutely must remember to check the value of $@ to see if anything went wrong.

    begin(sub {
       ...
    });
    die if $@;    # Don't forget to remember!  Always check for errors!

  • MIXING WITH EVAL

    It is possible to use eval within transactions, but inside the eval you absolutely must not use the ObjectStore API or access any persistent memory.

    begin('read', sub {
      ...
      eval { $db->root('new root' => [1,2,3]); };
      ...
    });

    In the above code, the update in a read transaction will cause an exception that jumps through the eval and out of the begin. This is due to the excellent but imperfect integration of ObjectStore exceptions and perl exceptions. In general, it's much safer to replace evals with begins.

  • DEADLOCK

    Top level transactions are automatically retried in the case of a deadlock. You can increase the number of retries with ObjStore::set_max_retries($retries). Or if you need to handle deadlocks yourself, you can set the number of retries to zero. (There is not much point to retrying non-top-level transactions because locks are released only at the top-level [OS 4-5.0].)
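If you set the number of retries to zero and handle deadlocks yourself, the loop might look roughly like the sketch below. The helper with_retries is hypothetical, and recognising a deadlock by matching the text of $@ is an assumption made for this example; check what your ObjectStore release actually reports.

```perl
use strict;
use warnings;

# Hypothetical retry loop; ObjStore does this for you for top-level transactions.
sub with_retries {
    my ($max_retries, $code) = @_;
    for my $attempt (0 .. $max_retries) {
        my @result = eval { $code->($attempt) };
        return @result unless $@;
        die $@ unless $@ =~ /deadlock/i;     # assumed error text; other errors propagate
    }
    die "gave up after $max_retries retries\n";
}

my $tries = 0;
my ($answer) = with_retries(3, sub {
    $tries++;
    die "deadlock detected\n" if $tries < 3;   # simulate two failures, then success
    return 'committed';
});
print "$answer after $tries tries\n";
```

In real code the body passed to with_retries would wrap a single top-level begin, since only top-level transactions release their locks.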

Stargate Mechanics

Create hashes and arrays pre-sized to exactly the right number of slots:

new ObjStore::HV($near, { key => 'value' });  # 1 slot
new ObjStore::AV($near, [1..3]);              # 3 slots

(Be aware that any transient structures you pass through the stargate are no longer dismembered as they are copied.)

You can use the stargate directly:

my $persistent_junk = ObjStore::translate($near, [1,2,3,{fat=>'dog'}]);

If you want to design your own stargate, you may inspect the default stargate in ObjStore.pm for inspiration.

How Can I Rescue Persistent Objects From Oblivion?

All data stored in ObjectStore is reference counted. This is a fantastically efficient way to manage memory (for most applications). It has very good locality and low overhead. However, as soon as an object's refcnt reaches zero, it is permanently deleted from the database. You get only one chance to save the object: the NOREFS method is invoked just prior to deletion. You must create a new persistent reference to it, or kiss the object goodbye.

Note that the DESTROY method is still invoked every time an object becomes unreachable from the current scope. However, contrary to transient objects, this method does not preview object destruction. [Hacking DESTROY such that it can be used as NOREFS is desirable, but would require changes to the core perl code-base. This change is under consideration...]

posh

posh is your interactive window into databases.

It is designed to treat your data in an application specific manner. Customize by providing your own implementation for these methods:

  • $o->POSH_PEEK($peeker, $o_name);

  • $o->POSH_CD($path);

  • $o->POSH_ENTER();

There are lots of good examples throughout the standard ObjStore:: libraries.

Arrays-as-Hashes

use base 'ObjStore::AVHV';
use Class::Fields qw(f1 f2 f3);

$ObjStore::COMPILE_TIME XXX

See ObjStore::AVHV XXX

Autoloading

When you use a database, ObjStore tries to require each class to which it finds reference that doesn't seem to be loaded. This means that you can write generic data processing programs that load the appropriate libraries to manipulate data in application specific ways.

To disable class autoloading behavior call this function before you open any databases:

ObjStore::disable_class_auto_loading();

This mechanism is orthogonal to the AUTOLOAD mechanism for autoloading functions.

Cross Database POINTERS

This feature is highly deprecated and will likely be discontinued, but at the moment you can allow cross database pointers with:

$db->_allow_external_pointers;    #never do this!

But you should avoid this if at all possible. Using real pointers will affect refcnts, even between two different databases. Your refcnts will be wrong if you simply osrm a random database. This will cause some of your data to become permanently un-deletable. Currently, there is no way to safely delete un-deletable data.

Instead, you can use references or cursors to refer to data in other databases. References may use the os_reference_protected class which is designed precisely to address this problem. Refcnts will not be updated remotely, but you'll still be protected from accessing deleted objects or removed databases. Imagine the freedom.

IMPLEMENTATION

You don't have to understand anything about the technical implementation. Just know that:

  • ObjectStore is outrageously powerful; sophisticated; and even over-engineered.

  • The perl interface is optimized to be fun and easy. And since ObjectStore is also blindingly fast, you can happily leave relational databases on the bookshelf where they belong.

Perl & C++ APIs: What's The Difference?

Most stuff should be roughly the same. The few exceptions have generally arisen because there was an easy way to make the interface more programmer friendly.

  • Transactions are perl-ified.

  • Some static methods sit directly under ObjStore:: instead of under their own classes. (Easier to import.)

  • Databases are always blessed according to your pleasure. Above and beyond, lookup, open, and is_open are augmented with multi-color, pop-tart style interfaces.

Representation

Memory usage is much more important in a database than in transient memory. When databases can be as large as or larger than ten million megabytes, a few percent difference in compactness can be noticeable.

All values take a minimum of 8 bytes (OSSV). These 8 bytes are used to store a 16-bit value type, a pointer, and a general purpose 16-bit integer.

value stored                   extra allocation (in addition to OSSV)
------------------------------ -------------------------------------
undef                          none
pointer                        none
16-bit signed integers         none
32-bit signed integers         4 byte block (OSPV_iv)
double                         8 byte block (OSPV_nv)
string                         length of string (char*)
object (ref or container)      sizeof object (see subclasses of OSSVPV)
bless                          .5-1k bytes per class (zero per object)

splash collections XXX
ObjectStore collections XXX

The ODI FAQ also states: In addition, there is an associated entry in the info segment for the segment in question for each allocation of the object. This is done in the tag table. The overhead is 16 bits (i.e., 2 bytes) for each singleton (i.e., non-array) allocation, 32 bits for each character array allocation for character arrays <= 255 characters, and 48 bits for each character array allocation > 255 characters, or any array allocation of an object of another type. Also, depending on the size of an object (i.e., if you allocate a "huge" object - one that is >64Kb) there is other overhead caused by alignment constraints.
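As a worked example, the figures above can be combined into a rough per-value estimate. The sketch below assumes that each extra allocation (OSPV_iv, OSPV_nv, or a string) also pays the tag table entry quoted from the FAQ, and it ignores alignment and any string terminator; estimate_bytes is a hypothetical helper, so treat the results as ballpark numbers only.

```perl
use strict;
use warnings;

# Rough estimate only, using the figures quoted in this section.
# Assumes every extra allocation also costs one tag table entry.
sub estimate_bytes {
    my ($kind, $len) = @_;
    my $ossv = 8;                                   # every value takes an OSSV
    return $ossv          if $kind eq 'undef' || $kind eq 'int16';
    return $ossv + 4 + 2  if $kind eq 'int32';      # OSPV_iv block + singleton tag
    return $ossv + 8 + 2  if $kind eq 'double';     # OSPV_nv block + singleton tag
    if ($kind eq 'string') {
        my $tag = $len <= 255 ? 4 : 6;              # 32 or 48 bit tag entry
        return $ossv + $len + $tag;
    }
    die "unknown kind: $kind\n";
}

printf "%-8s %d\n", $_->[0], estimate_bytes(@$_)
    for ['int16'], ['int32'], ['double'], ['string', 11], ['string', 300];
```

Under these assumptions a 16-bit integer costs 8 bytes while a 32-bit integer costs 14, which is why small integers are worth keeping small.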

If this seems like a lot of overhead, consider that it is not really possible to directly compare these numbers to RDBMS statistics. (Part of the problem is that no RDBMS vendor can even give you these numbers.) At least, note that relational data can be stored with much less duplication in ObjectStore. (Definitely true if you write C++ extensions.) Of course, the real test must always be to code up your problem and make experimental measurements.

Hard-Coded Limits

  • Reference counts are only 32 bits unsigned.

  • Strings are limited to a length of 32767 bytes. (This limit will be relaxed.)

Bless

If you are a suspicious person like my mom, you might have suspected that the ObjStore module installs its own version of bless. Naturally, it does. The augmented bless implements extra quality assurance to ensure that blessings are stored persistently. For example:

package Scottie;
use ObjStore;
use base 'ObjStore::HV';
sub new {
    my ($class, $store) = @_;
    my $o = $class->SUPER::new($store, { fur => 'buffy' });
    $o;
}

package main;

my Scottie $dog = new Scottie($db);
# once a Scottie, always a Scottie

Technically speaking, bless is re-implemented such that it can be extended by the bless from and the bless to classes. (This is intrinsically confusing, so take a deep breath and prepare yourself.)

sub BLESS {
    my ($r1,$r2) = @_;
    if (ref $r1) { warn "$r1 leaving ".ref $r1." for a new life in $r2\n"; }
    else         { warn "$r2 entering $r1\n"; }
    $r1->SUPER::BLESS($r2);
}

The isa method is also tweaked such that it reports according to the moment of the bless as opposed to the current @ISA setup. (Note that UNIVERSAL::can is unmodified. If you train a puppy to growl threateningly upon command, the adult dog will not immediately forget the training it had as a puppy.)

UNLOADED

Generic tools such as posh or ospeek must bless objects when reading from an arbitrary database. Prior to trying to locate the implementations of arbitrary objects, get_INC is used to fetch the stored @INC and synchronize it with the transient @INC. Then, each class found in the database is require'd. However, if the require fails, a package must be faked-up. The UNLOADED package is added to the @ISA. This signals that the @ISA tree should not be assumed authoritative.
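The fallback idea can be shown in a few lines of plain perl. The helper load_or_fake and the class name are hypothetical; the real logic lives inside ObjStore and may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper showing the fallback idea; NOT the ObjStore code.
sub load_or_fake {
    my ($class) = @_;
    (my $file = "$class.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return 'loaded' if eval { require $file; 1 };
    no strict 'refs';
    push @{"${class}::ISA"}, 'UNLOADED';     # mark the @ISA tree as non-authoritative
    return 'faked';
}

my $status = load_or_fake('No::Such::Class::Here');
my $marked = 'No::Such::Class::Here'->isa('UNLOADED') ? 'marked' : 'unmarked';
print "$status $marked\n";
```

A generic tool can then test isa('UNLOADED') before trusting any method lookup on the object.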

Go Extension Crazy

You cannot directly access persistent scalars from perl. They are always immediately copied into transient scalars.

While all persistent objects are blessed, they are not considered blessed in the database unless they are members of some non-default class. NOREFS is not invoked on non-blessed database objects.

$ObjStore::COMPILE_TIME XXX

ObjStore::File will be the base class for large binary data.

Each subclass of ObjStore::UNIVERSAL::Container has a %REP hash. Persistent object implementations add their create functions to the hash. The new method decides on the best representation, calls the best creation function from the %REP hash, returning the newly minted persistent object.
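The registry idea can be sketched as follows. The names in this %REP, the threshold of 20, and the new_container function are all made up for illustration; the actual keys and selection rules are in the ObjStore source.

```perl
use strict;
use warnings;

# Hypothetical registry in the spirit of %REP; names and threshold are made up.
my %REP = (
    small_array => sub { my ($n) = @_; return +{ rep => 'array', slots => $n } },
    dictionary  => sub { my ($n) = @_; return +{ rep => 'dict',  slots => $n } },
);

# Pick a creation function from the registry based on expected cardinality.
sub new_container {
    my ($cardinality) = @_;
    my $which = $cardinality <= 20 ? 'small_array' : 'dictionary';
    return $REP{$which}->($cardinality);
}

my $small = new_container(10);
my $large = new_container(10_000);
print "$small->{rep} $large->{rep}\n";
```

A new representation only has to add its creation function to the hash; callers of new never need to know which one was chosen.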

You can add your own C++ representation. If you want to know the specifics, look at the code for the built-in representations (GENERIC.*).

You can add new families of objects that inherit from ObjStore::UNIVERSAL. Suppose you want highly optimized, persistent bit vectors? Or matrices? These would not be difficult to add. Especially once Object Design figures out how to support multiple application schemas within the same executable. They claim that this tonal facility will be available in the next release.

ObjStore::Index

Indices are extremely efficient because they do not copy their keys. It is critical that the copy is avoided, since keys can be relocated when arrays need to grow. OSSVPV pointers are never relocated.

DIRECTION

  • PERFECT NATURAL CLARITY

    The overwhelming top priority is to make this extension work seamlessly, obviously, and effortlessly. Really, the only difference between lisp and perl is ease of use. No detail will be overlooked, all must conform to effortless stylistic perfection.

  • APIs

    Support for database access control and any other interesting ObjectStore APIs.

  • MORE BUILT-IN DATA TYPES

    File objects compatible with IO::Handle. Support for a Text Object Manager? Support for bit vectors and matrices (PDL) ?

Why Is Perl a Better Fit For Databases Than SQL, C++, or Java?

  struct CXX_or_Java_style {
	char *name;
	char *title;
	double size;
  };

When you write a structure declaration in C++ or Java you are declaring field-names, field-types, and field-order. Programs almost always require a re-compile to change such specific declarations. This is fine for small applications but becomes cumbersome quickly. It is too hard to change. An SQL-style language is needed. When you create a table in SQL you are declaring only field-names and field-types.

create table SQL_style
(name varchar(80),
 title varchar(80),
 size double)

This is more flexible, but SQL gives you far less expressive power than C++ or Java. Applications end up being written in C++ or Java, while their data is stored with SQL. Managing the synchronization between the two languages creates enormous extra complexity. So much so that there are lots of software companies that exist solely to address this headache. Perl is better because it transparently spans all the requirements in a single language.

my $h1 = { name => undef, title => undef, size => 'perl' };

Only the field-names are specified. This declaration is actually even more flexible than SQL because the field-types are left dynamic. But not only is perl more flexible, it's also fast. Malcolm Beattie is working on a perl compiler which is currently in beta. Here is his brief description of a new hybrid hash-array that is supported: An array ref $a can be dereferenced as if it were a hash ref. $a->{foo} looks up the key "foo" in %{$a->[0]}. The value is the index in the true underlying array @$a. As an addition, if the array ref is in a lexical variable tagged with a classname ("my CXX $obj" to match your example above) then constant key dereferences of the form $obj->{foo} are mapped to $obj->[123] at compile time by looking up the index in %CXX::FIELDS.

For example:

my $schema_hashref = { 'field1' => 1, 'field2' => 2 };
my $arr = [$schema_hashref, 'fwaz', 'snorf'];
print "$arr->{field1} : $arr->{field2}\n";      # "fwaz : snorf"
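The implicit hash-style dereference of an array ref shown above was an experimental perl feature (pseudo-hashes) that was removed in perl 5.10, so that snippet does not run on current perls. The same lookup can be spelled out explicitly; the accessor field below is a hypothetical helper written for this sketch.

```perl
use strict;
use warnings;

# Explicit version of the lookup that pseudo-hashes did implicitly.
my %FIELDS = (field1 => 1, field2 => 2);         # field name -> array index
my $arr    = [\%FIELDS, 'fwaz', 'snorf'];

# Hypothetical accessor: map the name to an index, then index the array.
sub field {
    my ($aref, $name) = @_;
    my $index = $aref->[0]{$name};
    die "no such field: $name\n" unless defined $index;
    return $aref->[$index];
}

my $line = join ' : ', field($arr, 'field1'), field($arr, 'field2');
print "$line\n";
```

The storage is still a compact array; only the name-to-index mapping is shared, which is the property the compile-time optimisation described above exploits.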

Why Is Perl Easier Than Other Programming Languages?

I have no idea!

Summary (LONG)

  • SQL

    All perl databases use the same flexible schema that can be examined and updated with generic tools. This is the key advantage of SQL, now available in perl. In addition, Perl / ObjectStore is blatantly faster than SQL / C++. Not to mention that perl is a general purpose programming language and SQL is at best a query language.

  • C++

    Special purpose data types can be coded in C++ and dynamically linked into perl. Since C++ will always be faster than Java, this gives perl an edge in the long run. Perl is to C/C++ as C/C++ is to assembly language.

  • JAVA

    Java has the buzz, but:

    • Just like C++, the lack of a universal generic schema limits use to single applications. Without some sort of tie mechanism, I can't imagine how this could be remedied.

    • All Java databases must serialize data to store it. Until Java supports memory-mapped persistent allocation, database operations will always be sluggish compared to C++.

    • Perl now integrates with Java and the SwingSet / AWT API!

Summary (SHORT)

Perl can store data

  • optimized for flexibility and/or for speed

  • in transient memory and persistent memory

without violating the principle of encapsulation or obstructing general ease of use.

ETA

  • NOW TO 3 MONTHS

    Dynamically loaded application schemas; perl kernel-level threads; perl compiler

  • 1-6 MONTHS

    Proper tied arrays & repaired tie interface

EXPORTS

bless, begin, try_read, try_update, try_abort_only by default. Most other static methods can also be exported. try_* functions are deprecated.

BUGS

  • LEAKS TRANSIENT XPVRVs

    The problem is thoroughly understood. Work-arounds or a real fix have been discussed on the perl-porters mailing list.

  • os_reference_protected

    Allocates persistent memory that cannot be reclaimed without destroying the segment. This makes it non-trivial to determine whether a segment is empty or not. The needed change is listed as ODI feature request #SE055496_O#.

  • TRANSACTIONS

    Transactions hold onto transient memory longer than necessary. The solution is to use doubly-linked lists. This was proven to work in an earlier version but unfortunately I took the code out because I thought it was too complicated.

  • MOP

    This is not a general purpose ObjectStore editor with complete MOP support. Actually, I don't think this is a bug.

  • HIGH VOLATILITY

    Everything is subject to change without notice. (But backward compatibility will be preserved when possible. :-)

  • POOR QUALITY DOCUMENTATION

    I didn't get a Ph.D in English. Sorry!

AUTHOR

Copyright © 1997-1998 Joshua Nathaniel Pritikin. All rights reserved.

This package is free software and is provided "as is" without express or implied warranty. It may be used, redistributed and/or modified under the terms of the Perl Artistic License (see http://www.perl.com/perl/misc/Artistic.html)

Perl / ObjectStore extension is available via any CPAN mirror site. See http://www.perl.com/CPAN/authors/id/JPRIT/ !

Portions of the collection code snapped from splash, Jim Morris's delightful C++ library ftp://ftp.wolfman.com/users/morris/public/splash .

Also, a poignant thanks to all the wonderful teachers with whom I've had the opportunity of studying. If you have never had a teacher, I highly recommend it!

SEE ALSO

ObjStore::Tutorial, ObjStore::Table3, examples in the t/ directory, and SQL (never again!)
