NAME
ObjStore - Perl extension for ObjectStore OODBMS
SYNOPSIS
use ObjStore;
my $osdir = ObjStore->schema_dir;
my $DB = ObjStore::Database->open($osdir . "/perltest.db", 0, 0666);
try_update {
my $top = $DB->root('whiteboard');
$top ||= $DB->root('whiteboard', new ObjStore::HV($DB, 1000));
for (my $x=1; $x < 10000; $x++) {
$top->{$x} ||= {
id => $x,
m1 => "I will not talk in ObjectStore/Perl class.",
m2 => "I will study the documentation before asking questions.",
};
}
print "Very impressive. I see you are already an expert.\n";
};
print "[Abort] $@\n" if $@;
DESCRIPTION
The new SQL and the sunset of relational databases.
ObjectStore is the leading object-oriented database. It is engineered by Object Design, Inc. (http://www.odi.com) (NASDAQ: ODIS). The database uses the virtual memory mechanism to make persistent data available in the most efficient manner possible.
In case you didn't know, Object Design's Persistent Storage Engine has been licensed by Sun, Microsoft, Netscape, and Symantec for inclusion in their Java development environments.
Prior to this joining of forces,
ObjectStore was too radical a design decision for many applications.
Perl5 did not have a simple way of storing complex data persistently.
Now there is an easy way to build databases, especially if you care about preserving your ideals of data encapsulation. (See below!)
API
Much of the Perl API is a direct interface to the C++ API. Refer to the ObjectStore documentation for exact semantics. If you need a function that isn't available in Perl, send mail to the OS/Perl mailing list (see the README).
OBJSTORE
$name = ObjStore::release_name()
$major = ObjStore::release_major()
$minor = ObjStore::release_minor()
$maintenance = ObjStore::release_maintenance()
$yes = ObjStore::network_servers_available();
ObjStore::set_auto_open_mode(mode, fp, [sz]);
$num = ObjStore::return_all_pages();
$size = ObjStore::get_page_size();
@Servers = ObjStore::get_all_servers();
$in_abort = ObjStore::abort_in_progress();
$DB = ObjStore::open($pathname, $read_only, $mode);
$num = ObjStore::get_n_databases();
::SERVER
$name = $s->get_host_name();
$is_broken = $s->connection_is_broken();
$s->disconnect();
$s->reconnect();
@Databases = $s->get_databases();
::DATABASE
$DB->close();
$DB->destroy();
$DB->get_default_segment_size();
$DB->get_sector_size();
$DB->size();
$DB->size_in_sectors();
$ctime = $DB->time_created();
$is_open = $DB->is_open();
$DB->open_mvcc();
$is_mvcc = $DB->is_open_mvcc();
$read_only = $DB->is_open_read_only();
$can_write = $DB->is_writable();
$DB->set_fetch_policy(policy[, blocksize]);
Policy can be one of segment, page, or stream.
$DB->set_lock_whole_segment(policy);
Policy can be one of as_used, read, or write.
$DB = ObjStore::Database::of($pvar);
$Seg = $DB->create_segment();
$Seg = $DB->get_segment($segment_number);
@Segments = $DB->get_all_segments();
@Roots = $DB->get_all_roots();
$root = $DB->create_root($root_name);
$root = $DB->find_root($root_name);
$value = $DB->root($root_name[, $new_value]);
This is the recommended API for roots. If the given root is not found, creates a new one. Sets the root's value if $new_value is defined. Returns the root's current value.
$DB->destroy_root($root_name);
Destroys the root with the given name if it exists.
::ROOT
$root->get_name();
$root->get_value();
$root->set_value($new_value);
$root->destroy();
::TRANSACTION
ObjectStore transactions and exceptions are seamlessly integrated into Perl.
try_update {
$top = $DB->root('top');
$top->{abc} = 3;
die "Oops! abc should not change!"; # aborts the transaction
};
print $@ if $@;
There are three types of transactions: try_read, try_update, and try_abort_only. Each executes the given block within an implicit eval. After a transaction, be sure to check the value of $@ to see if anything went wrong. For example,
try_read {
my $var = $DB->root('top');
$var->{abc} = 7; # write to $var triggers exception
};
die if $@; # exception rethrown to top-level
Needless to say, you cannot access or modify persistent data outside of a transaction.
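The "implicit eval" behavior can be pictured with a plain-Perl sketch. The function try_sketch below is hypothetical and is not part of ObjStore; it only shows how a block-taking function can trap a die and leave the message in $@, which is what the transaction functions look like from the caller's side. It makes no attempt to begin, commit, or abort a database transaction.

```perl
use strict;
use warnings;

# Hypothetical stand-in for try_read / try_update. This is NOT the
# ObjStore implementation; it only demonstrates the eval wrapping.
sub try_sketch (&) {
    my ($code) = @_;
    # eval traps any die raised inside the block; on success $@ is empty.
    my $ok = eval { $code->(); 1 };
    return $ok ? 1 : 0;
}

my %data = (abc => 1);

try_sketch {
    $data{abc} = 3;
    die "Oops! abc should not change!\n";
};
print "[Abort] $@" if $@;

# Unlike a real aborted transaction, this sketch does not roll back
# the assignment: $data{abc} is still 3 here.
```

The (&) prototype is what lets the caller write a bare block after the function name, just as with try_update in the examples above.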
$T = ObjStore::Transaction::get_current();
$type = $T->get_type();
$pop = $T->get_parent();
$T->prepare_to_commit();
$yes = $T->is_prepare_to_commit_invoked();
$yes = $T->is_prepare_to_commit_completed();
ObjStore::set_transaction_priority($very_low);
ObjStore::set_max_retries($oops);
my $oops = ObjStore::get_max_retries();
my $yes = ObjStore::is_lock_contention();
my $type = ObjStore::get_lock_status($ref);
my $tm = ObjStore::get_readlock_timeout();
my $tm = ObjStore::get_writelock_timeout();
ObjStore::set_readlock_timeout($tm);
ObjStore::set_writelock_timeout($tm);
::SEGMENT
$Seg->destroy();
$size = $Seg->size();
$yes = $Seg->is_empty();
$yes = $Seg->is_deleted();
$num = $Seg->get_number();
$comment = $Seg->get_comment();
$Seg->set_comment($comment);
$Seg->lock_into_cache();
$Seg->unlock_from_cache();
$Seg->set_fetch_policy($policy[, $size]);
Policy can be one of segment, page, or stream.
$Seg->set_lock_whole_segment($policy);
Policy can be one of as_used, read, or write.
$Seg = ObjStore::Segment::of($pvar);
CREATING CONTAINERS
Databases are composed of segments. Segments resize dynamically from very small to very big. You should split your data into lots of segments when it makes sense. Segments improve locality of reference and can be a unit of locking or caching.
When you create a container you must specify the segment in which it is to be allocated. All containers are created using the form 'new ObjStore::$type($store, $cardinality)'. You may pass any persistent object in place of $store and the new container will be created in the same segment as the $store object!
ARRAYS
The following code snippet creates a persistent array reference with an expected cardinality of ten elements.
my $a7 = new ObjStore::AV($store, 10);
None of the usual array operations are supported except fetch and store.
$a7->[1] = [1,2,3,[4,5],6];
Complete array support will be available as soon as Larry and friends fix the TIEARRAY interface. (See perltie(3) or http://www.perl.com for more info.)
HASHES
The following code snippet creates a persistent hash reference with an expected cardinality of ten elements.
my $h7 = new ObjStore::HV($store, 10);
An array representation is used for low cardinalities. Arrays do not scale well, but they do afford a compact representation. ObjectStore's os_Dictionary is used for large cardinalities.
$h7->{foo} = { 'fwaz'=> { 1=>'blort', 'snorf'=>3 }, b=>'ouph' };
Or the equally effective but unbearably tedious:
my $h1 = $dict->{foo} ||= new ObjStore::HV($dict);
my $h2 = $h1->{fwaz} ||= new ObjStore::HV($h1);
$h2->{1}='blort';
$h2->{snorf}=3;
$h1->{b}='ouph';
Perl saves us again! Relief.
SETS
The following code snippet creates a set with an expected cardinality of ten elements.
my $set = new ObjStore::Set($store, 10);
Sets are simple collections. They do not support duplicates. The following methods are supported:
$set->a($obj, { hello=>1 });
$set->r($obj);
$yes = $set->contains($obj);
for (my $obj = $set->first; $obj; $obj = $set->next) {
# do something with $obj
}
An array representation is used for low cardinalities. Arrays are not efficient, but they are compact. ObjectStore's os_set is used for large cardinalities.
Changing the membership of a set while iterating over the members has undefined results.
BLESS
The ObjStore module replaces bless. Once you create a container, you may bless it in your own package. The blessing is persistent. For example,
package MyObject;
use ObjStore;
@ISA = qw(ObjStore::HV);
sub new {
my ($class, $store) = @_;
my $o = bless(new ObjStore::HV($store), $class);
$o->{attribute} = 5;
$o;
}
If you store each class in a separate .pm file in your @INC path (see require), then the classes will be autoloaded as you access them.
CLASS AUTOLOADING
ObjStore tries to require each class the first time you access a persistent instance of it. This means that you can write generic data processing programs that automatically load the appropriate libraries to manipulate data as the data is accessed.
To disable the class autoloading behavior:
ObjStore::disable_auto_class_loading();
This mechanism is orthogonal to the AUTOLOAD mechanism for autoloading functions.
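The idea of requiring a class on first access can be sketched in plain Perl, without ObjStore. The helper name load_class_once and the %loaded cache are hypothetical; they illustrate loading a package by name the first time it is needed and are not a description of how ObjStore does it internally.

```perl
use strict;
use warnings;

my %loaded;    # class names we have already tried to load

# Hypothetical helper: require the package named by $class at most once.
# Returns 1 if the class could be loaded, 0 otherwise.
sub load_class_once {
    my ($class) = @_;
    return $loaded{$class} if exists $loaded{$class};

    # Turn Some::Class into Some/Class.pm, the form require expects
    # when it is handed a string at run time.
    (my $file = "$class.pm") =~ s{::}{/}g;
    my $ok = eval { require $file; 1 };
    return $loaded{$class} = $ok ? 1 : 0;
}

# File::Spec ships with Perl, so it stands in for an application class.
print "File::Spec loaded\n" if load_class_once('File::Spec');
print "No::Such::Class missing\n" unless load_class_once('No::Such::Class');
```

A failed require is trapped rather than fatal, so data blessed into a class whose .pm file is missing can still be inspected generically.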
OSPEEK
While there is no official schema for a Perl database, the ospeek utility generates a sample of data content and structure. The following output was snapped from a database that supports a CGI application we have developed. Note how circular references and pointers between objects are summarized.
WAIT! NO SCHEMA?! HOW CAN THIS WORK?
When you write down a central schema, you are violating the principle of encapsulation. This is dumb. None of the usual database operations require a central schema. Why create artificial dependencies between your classes when you can avoid it?
LAZY EVOLUTION
Even schema evolution can be done piecemeal. Give all your objects an evolve method that ensures that the representation is up-to-date.
Either version stamp your objects.
Or intelligently figure out how to evolve objects by examining their current structure.
The main thing is to keep an archive of prior formats of object instances to regression test your new evolve methods. If you can do extracts to a mini-database, that would do the trick. Then just run your new code over a historical collection of mini-databases.
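A minimal sketch of the version-stamp approach follows, using an ordinary transient hash so that it runs without a database. The package name Account, its fields, and the two migration steps are hypothetical and exist only for illustration; with ObjStore the object would be a blessed persistent hash and evolve would be called inside an update transaction.

```perl
use strict;
use warnings;

package Account;

# Bring one object up to the current representation, one step at a time.
sub evolve {
    my ($o) = @_;
    my $v = $o->{VERSION} || 0;

    if ($v < 1) {
        # Hypothetical step 0 -> 1: 'fullname' was renamed to 'name'.
        $o->{name} = delete $o->{fullname} if exists $o->{fullname};
        $v = 1;
    }
    if ($v < 2) {
        # Hypothetical step 1 -> 2: a 'hits' counter was introduced.
        $o->{hits} = 0 unless defined $o->{hits};
        $v = 2;
    }
    $o->{VERSION} = $v;
    return $o;
}

package main;

# An instance in the oldest format, as a regression archive might hold it.
my $old = bless { fullname => 'joshua' }, 'Account';
$old->evolve;
print join(', ', map { "$_=$old->{$_}" } sort keys %$old), "\n";
```

Because each step only runs when the stamp is too low, calling evolve on an object that is already current changes nothing, so it is safe to call on every access.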
OSPEEK EXAMPLE
ObjStore::Root=SCALAR(0x18aa6c) Bright = Node {
VERSION => 5,
center => 1,
ctime => '19970814113317',
daily_hits => ObjStore::HV {
19970814 => 1,
},
desc => 'We are what we think. All that we are arises with our thoughts. With our thoughts we make the world.',
hits => 1,
name => 'Bright',
owner => '0',
reflected => '0',
rel => ObjStore::Set [
Node {
ctime => '19970814113317',
daily_hits => ObjStore::HV {
19970814 => 8,
},
desc => '',
hits => 8,
n_anon => 6,
name => 'Joe's Store',
owner => Node { ... }
reflected => 1,
rel => ObjStore::Set [
Node { ... }
Node {
ctime => '19970814113317',
desc => 'New users arrive here.',
hits => '0',
index => ObjStore::HV {
Anon-4 => User {
ctime => '19970814130657',
daily_hits => ObjStore::HV {
19970814 => 1,
},
desc => 'Anonymous temporary login.',
expire => '19970915130657',
hits => 1,
name => 'Anon-4',
owner => User { ... }
reflected => 3,
rel => ObjStore::Set [
Node { ... }
],
views => ObjStore::HV {
1 => User::View {
at => Node {
color => 'light green',
ctime => '19970814124048',
daily_hits => ObjStore::HV {
19970814 => 8,
},
desc => '',
hits => 8,
name => 'Research',
owner => Node { ... }
reflected => 1,
rel => ObjStore::Set [
Node { ... }
Node { ... }
Node { ... }
Node { ... }
],
url => '',
},
prior => ObjStore::HV {
0 => Node { ... }
1 => Node { ... }
2 => User { ... }
},
},
2 => User::View {
at => User { ... }
},
},
},
Anon-5 => User {
ctime => '19970814191636',
desc => 'Anonymous temporary login.',
expire => '19971013191636',
hits => '0',
name => 'Anon-5',
owner => User { ... }
reflected => 3,
rel => ObjStore::Set [
Node { ... }
],
views => ObjStore::HV {
1 => User::View {
at => User { ... }
prior => ObjStore::HV {
0 => User { ... }
},
},
2 => User::View {
at => User { ... }
},
},
},
Anon-6 => User {
ctime => '19970814191724',
desc => 'Anonymous temporary login.',
expire => '19971013191724',
hits => '0',
name => 'Anon-6',
owner => User { ... }
reflected => 3,
rel => ObjStore::Set [
Node { ... }
],
views => ObjStore::HV {
1 => User::View { ... }
2 => User::View { ... }
},
},
joshua => User {
ctime => '19970814113317',
daily_hits => ObjStore::HV {
19970814 => 13,
},
desc => '',
dreamer => 1,
expire => '19971013182157',
hits => 13,
name => 'joshua',
owner => User { ... }
passwd => 'zzReR55rX6.JA',
proposals => ObjStore::Set [
User::Proposal {
about => Node { ... }
ctime => '19970814195243',
from => User { ... }
to => User { ... }
},
],
reflected => 3,
rel => ObjStore::Set [
Node { ... }
],
url => '',
views => ObjStore::HV {
1 => User::View { ... }
2 => User::View { ... }
},
},
},
name => 'Users',
owner => Node { ... }
reflected => 2,
rel => ObjStore::Set [
Node { ... }
User { ... }
User { ... }
User { ... }
User { ... }
],
},
Node { ... }
],
},
Node { ... }
Node { ... }
],
},
WHY IS PERL A BETTER FIT FOR DATABASES THAN SQL, C++, OR JAVA?
When you write a structure declaration in C++ or Java you are fixing field-names, field-types, and field-order all at once.
struct CXX {
char *name;
char *title;
double size;
};
Programs almost always require a recompile to change any of these attributes. This is fine for small to medium size applications but is not suitable for large databases. It is too inflexible. An SQL-type language is needed.
When you create a table in SQL you are assigning only field-names and field-types.
create table CXX
(name varchar(80),
title varchar(80),
size double)
This is a more flexible data declaration, but SQL gives you far less expressive power than C++ or Java. Applications end up being written in C++ or Java while their data is stored in SQL. Managing the synchronization between the two languages creates a lot of extra complexity. So much so that there are many software companies that exist solely to help address this headache.
Perl is better because it spans all the requirements in a single language. For example, this is similar to an SQL table:
my $h1 = { name => undef, title => undef, size => undef };
Only the field-names are specified.
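To make the contrast concrete, here is a small sketch in plain Perl (no ObjStore required). The record contents and the added color field are hypothetical; the point is only that a field can be added, or hold a value of a different type, without redeclaring anything.

```perl
use strict;
use warnings;

# Only the field names are given up front; values may be of any type.
my $h1 = { name => undef, title => undef, size => undef };

$h1->{name} = 'Bright';
$h1->{size} = 12.5;                  # a number ...
$h1->{size} = [ 12.5, 'inches' ];    # ... or later a nested structure

# A new field needs no recompile and no table alteration.
$h1->{color} = 'light green';

print join(', ', sort keys %$h1), "\n";
```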
To address the other side of the spectrum, Malcolm Beattie is working on a Perl compiler which is currently in beta-test. Here is his brief description of a new hybrid hash-array that is supported:
An array ref $a can be dereferenced as if it were a hash
ref. $a->{foo} looks up the key "foo" in %{$a->[0]}. The value is the
index in the true underlying array @$a. As an addition, if the array
ref is in a lexical variable tagged with a classname ("my CXX $obj" to
match your example above) then constant key dereferences of the form
$obj->{foo} are mapped to $obj->[123] at compile time by looking up
the index in %CXX::FIELDS.
For example:
my $schema_hashref = { 'field1' => 1, 'field2' => 2 };
my $arr = [$schema_hashref, 'fwaz', 'snorf'];
print "$arr->{field1} : $arr->{field2}\n"; # "fwaz : snorf"
I haven't done benchmarks yet, but considering the implementation, compiled fake hashes should make Perl very competitive with Java / ObjectStore database applications in terms of raw performance.
SUMMARY (LONG)
SQL
All Perl databases use the same flexible schema that can be examined and updated with generic tools. This is the key advantage of SQL, now available in Perl.
Perl / ObjectStore is definitely faster than SQL too. Not to mention that Perl is a general purpose programming language and SQL is at best a 'query language'.
C++
Special purpose data types can be coded in C++ and dynamically linked into Perl. Since C++ will always be faster than Java, this gives Perl an edge in the long run. Perl is to C/C++ as C/C++ is to assembly language.
JAVA
Java has the buzz, but!
Just like C++, the lack of a universal generic schema limits use to a single application at a time. Without some sort of tie mechanism, I don't see how this can be remedied.
All Java databases must serialize data to store it. Until Java supports persistent allocation directly, database operations will always be slower than C++.
Perl will soon integrate with Java enough to use SwingSet - AWT.
I'd like to see some comparisons of code length when solving the same problems in Java and in Perl. I have a strong suspicion that it is easier to do data processing in Perl.
ETA
0-3 MONTHS
Perl compiler; kernel threads; fake hashes
3-6 MONTHS
Dynamically loaded application schemas; Perl-Java integration
SUMMARY (SHORT)
Perl can store data
optimized for flexibility or for speed
in transient memory or persistent memory
without violating the principle of encapsulation or obstructing general ease of use.
TECHNICAL IMPLEMENTATION
You don't have to understand anything about the technical implementation. Just know that:
ObjectStore is outrageously powerful, sophisticated, even over-engineered.
The Perl interface is optimized for simplicity and ease of use. (If it's not fun, why bother?)
The performance of raw ObjectStore is so good that even with a gunky Perl layer, benchmarks will show that relational databases can be safely left on the bookshelf where they belong.
UNDER THE HOOD
It is not practical to simply make Perl's internal data structures persistent. Values in the database have different requirements than transient values. The dress code is upscale, New York style attire.
MEMORY
Memory usage is much more important in a database than in transient memory. When databases can be as large as or larger than ten million megabytes, a few percent difference in compactness can mean a lot.
REFERENCE COUNTING
Persistent data is reference counted separately from transient data.
Sounds like New York, yes?
DIFFERENCES BETWEEN THE PERL AND C++ APIs
Most stuff should be exactly the same. However,
Some static methods are directly under ObjStore::.
Transactions are significantly different: simple.
GO EXTENSION CRAZY
ObjStore::UNIVERSAL is the base class for all persistent types.
You cannot directly access persistent scalars from Perl. They are always immediately copied into transient scalars. So the ObjStore::UNIVERSAL base class is only for objects (or collections).
ObjStore::Set is the base class for sets.
ObjStore::HV is the base class for tied hashes.
ObjStore::AV is the base class for tied arrays.
ObjStore::Text will be the base class for large binary data.
When an ObjectStore exception occurs, $ObjStore::EXCEPTION is called with an explanation. You can replace the default handler with your own function.
Each subclass of ObjStore::UNIVERSAL has a %REP hash. Persistent object implementations add their creation functions to the hash. Each package's new method decides on the best representation, calls the creation function, and returns the persistent object.
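The %REP arrangement can be sketched in plain Perl. Everything below is hypothetical (the package name Sketch::HV, the representation names, and the cutoff of 20 elements) and merely illustrates a constructor choosing among registered creation functions by expected cardinality. The real representations are C++ objects allocated in a segment, and the real creation functions may take different arguments.

```perl
use strict;
use warnings;

package Sketch::HV;

# Registry of creation functions, keyed by representation name.
our %REP = (
    # Compact representation, meant for low cardinalities.
    array => sub { return { rep => 'array', data => [] } },
    # Scalable representation, meant for large cardinalities.
    dict  => sub { return { rep => 'dict', data => {} } },
);

# Hypothetical cutoff between the compact and the scalable representation.
our $CUTOFF = 20;

sub new {
    my ($class, $cardinality) = @_;
    my $rep = ($cardinality || 0) <= $CUTOFF ? 'array' : 'dict';
    # Call the registered creation function and bless its result.
    return bless($REP{$rep}->(), $class);
}

package main;

my $small = Sketch::HV->new(10);
my $large = Sketch::HV->new(1000);
print "$small->{rep} $large->{rep}\n";
```

Keeping the creation functions in a package-level hash means a new representation can be registered from outside the package without touching new.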
How does the ObjStore module know how to translate nested transient structures into nested persistent structures? ObjStore::DEFAULT_GATEWAY makes the determination. You can replace it with your own code by using my $oldgw = ObjStore::set_gateway($coderef).
You can add your own C++ representations for each of Set, AV, and HV. If you want to know the specifics, look at the code for the standard built-in representations (GENERIC.*).
You can add new families of objects that inherit from ObjStore::UNIVERSAL. Suppose you want highly optimized, persistent bit vectors? Or matrices? These would not be difficult to add. Especially once Object Design figures out how to support multiple application schemas within the same executable. They claim that this tonal facility will be available in the next release.
OSSV_BRIDGE
The following explanation may be helpful to developers trying to understand the inner workings of the typemap. If you don't know what a typemap is, just skip to the next section.
The struct ossv_bridge is used to bridge between Perl and C++ objects. It contains transient cursors and transient pointers to persistent data. OSSV's are scalars. OSSVPV are non-scalars. An OSSV must be used to store a reference to an OSSVPV. This corresponds roughly to how things are done in the Perl internals.
An instance of ossv_bridge stores pointers to both an OSSV and OSSVPV. Only one pointer need be set, but if both are set they must refer to the same persistent object. The preference is that the OSSV be used, but the OSSVPV should be used if the OSSV is not already created or may become invalid (e.g. &ar[4] then array resizes).
DIRECTION
MORE BUILT-IN DATA TYPES
Text objects implemented using osmmtype and subclassed from IO::Handle. Support for one of Object Design's Text Object Managers. Support for bit vectors and matrices.
MORE APIS
Support for notification, database access control, and any other interesting ObjectStore APIs.
BUGS
NESTED TRANSACTIONS
Retry in the event of deadlock seems to run into an infinite loop within nested transactions. Nested transactions are disabled until further notice.
TROMPED KEYWORDS
The ObjStore version of bless replaces the built-in and is imported by default.
STRING CAVEAT
The lengths of string values are not stored in the database. Since lengths are calculated at each access, lengthy strings should be stored in a text object. Text objects will be available in a future release.
NUMERICS
Numbers (integers and doubles) are stored in separately allocated memory blocks. This is probably not as efficient as a union, but unions are nearly impossible to manage in ObjectStore. Would pool allocation be a good trade-off?
CURSED OBJECTS
The strings used to record the blessed nature of persistent objects are allocated in a private hash in the default segment of a database (See 'ospeek -all'). If you accidentally mess up or change any of these strings, your objects will be cursed. You will need to re-bless each to fix the broken pointers.
REFERENCE CAVEAT
Object reference counts are 32 bits wide, but scalar (OSSV) reference counts are only 16 bits. This does not restrict Perl developers from creating zillions of references to a single hash (a hash is not a scalar), but it can cause some confusion in arcane cases. [Give example and work-around.]
AUTHOR
Copyright (c) 1997 Joshua Nathaniel Pritikin. All rights reserved.
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Perl / ObjectStore is available via any CPAN mirror site. See http://www.perl.com/CPAN/modules/by-module/ObjStore
Portions of the collection code snapped from splash, Jim Morris's delightful C++ library ftp://ftp.wolfman.com/users/morris/public/splash .
Also, a poignant thanks to all the wonderful teachers with whom I've had the opportunity of studying.
SEE ALSO
Examples in the t/ directory, Perl5, ObjectStore, and happily not SQL!