NAME
ObjStore - perl extension for ObjectStore OODBMS
SYNOPSIS
use ObjStore ':ALL';
my $db = ObjStore::open(&schema_dir . "/perltest.db", 0, 0666);
try_update {
my $top = $db->root('whiteboard') ||
$db->root('whiteboard', new ObjStore::AV($db, 1000));
for (my $x=1; $x < 10000; $x++) {
my $z= $top->[$x];
$top->[$x] ||= {
id => $x,
m1 => "I will not talk in ObjectStore/perl class.",
m2 => "I will study the documentation before asking questions.",
};
}
print "Very impressive. I see you are already an expert.\n";
};
DESCRIPTION
The new SQL and the sunset of relational databases.
ObjectStore is the leading object-oriented database. It is engineered by Object Design, Inc. (http://www.odi.com) (NASDAQ: ODIS). The database uses the virtual memory mechanism to make persistent data available in the most efficient manner possible.
In case you didn't know, Object Design's Persistent Storage Engine has been licensed by Sun, Microsoft, Netscape, and Symantec for inclusion in their Java development environments.
Prior to this joining of forces,
ObjectStore was too radical a design decision for many applications.
perl5 did not have a simple way of storing complex data persistently.
Now there is an easy way to build databases, especially if you care about preserving your ideals of data encapsulation. (See below!)
API
Much of the perl API is a direct interface to the C++ API. Refer to the ObjectStore documentation for exact semantics. If you need a function that isn't available in perl, send mail to the OS/perl mailing list (see the README).
Fortunately, you probably won't need to use most of the API. It is listed below simply to make you feel more comfortable.
ObjStore
$name = ObjStore::release_name()
$major = ObjStore::release_major()
$minor = ObjStore::release_minor()
$maintenance = ObjStore::release_maintenance()
$yes = ObjStore::network_servers_available();
ObjStore::set_auto_open_mode(mode, fp, [sz]);
$num = ObjStore::return_all_pages();
$size = ObjStore::get_page_size();
@Servers = ObjStore::get_all_servers();
$in_abort = ObjStore::abort_in_progress();
$db = ObjStore::open($pathname, $read_only, $mode);
$num = ObjStore::get_n_databases();
::Server
$name = $s->get_host_name();
$is_broken = $s->connection_is_broken();
$s->disconnect();
$s->reconnect();
@Databases = $s->get_databases();
::Database
$db->close();
$db->destroy();
$db->get_default_segment_size();
$db->get_sector_size();
$db->size();
$db->size_in_sectors();
$ctime = $db->time_created();
$is_open = $db->is_open();
$db->open_mvcc();
$is_mvcc = $db->is_open_mvcc();
$read_only = $db->is_open_read_only();
$can_write = $db->is_writable();
$db->set_fetch_policy(policy[, blocksize]);
Policy can be one of segment, page, or stream.
$db->set_lock_whole_segment(policy);
Policy can be one of as_used, read, or write.
$db = ObjStore::Database::of($pvar);
$Seg = $db->create_segment();
$Seg = $db->get_segment($segment_number);
@Segments = $db->get_all_segments();
@Roots = $db->get_all_roots();
$root = $db->create_root($root_name);
$root = $db->find_root($root_name);
$value = $db->root($root_name[, $new_value]);
This is the recommended API for roots. If the given root is not found, creates a new one. Sets the root's value if $new_value is defined. Returns the root's current value.
$db->destroy_root($root_name);
Destroys the root with the given name if it exists.
::Root
$root->get_name();
$root->get_value();
$root->set_value($new_value);
$root->destroy();
::Transaction
ObjectStore transactions and exceptions are seamlessly integrated into perl. ObjectStore exceptions cause a die in perl just as perl exceptions cause a transaction abort.
try_update {
$top = $db->root('top');
$top->{abc} = 3;
die "Oops! abc should not change!"; # aborts the transaction
};
There are three types of transactions: try_read, try_update, and try_abort_only. In a read transaction, you are not allowed to modify persistent data.
try_read {
my $var = $db->root('top');
$var->{abc} = 7; # write to $var triggers die(...)
};
$T = ObjStore::Transaction::get_current();
$type = $T->get_type();
$pop = $T->get_parent();
$T->prepare_to_commit();
$yes = $T->is_prepare_to_commit_invoked();
$yes = $T->is_prepare_to_commit_completed();
ObjStore::set_transaction_priority($very_low);
ObjStore::set_max_retries($oops);
ObjStore::rethrow_exceptions($yes);
my $oops = ObjStore::get_max_retries();
my $yes = ObjStore::is_lock_contention();
my $type = ObjStore::get_lock_status($ref);
my $tm = ObjStore::get_readlock_timeout();
my $tm = ObjStore::get_writelock_timeout();
ObjStore::set_readlock_timeout($tm);
ObjStore::set_writelock_timeout($tm);
::Segment
$Seg->destroy();
$size = $Seg->size();
$yes = $Seg->is_empty();
$yes = $Seg->is_deleted();
$num = $Seg->get_number();
$comment = $Seg->get_comment();
$Seg->set_comment($comment);
$Seg->lock_into_cache();
$Seg->unlock_from_cache();
$Seg->set_fetch_policy($policy[, $size]);
Policy can be one of segment, page, or stream.
$Seg->set_lock_whole_segment($policy);
Policy can be one of as_used, read, or write.
$Seg = ObjStore::Segment::of($pvar);
CREATING CONTAINERS
Databases are comprised of segments. Segments dynamically resize from very small to very big. You should split your data into lots of segments when it makes sense. Segments improve locality of reference and can be a unit of locking or caching.
When you create a container you must specify the segment in which it is to be allocated. All containers are created using the form 'new ObjStore::$type($store, $cardinality)'. You may pass any persistent object in place of $store and the new container will be created in the same segment as the $store object!
Arrays
The following code snippet creates a persistent array reference with an expected cardinality of ten elements.
my $a7 = new ObjStore::AV($store, 10);
None of the usual array operations are supported except fetch and store. At least the following works:
$a7->[1] = [1,2,3,[4,5],6];
Complete array support will be available as soon as Larry and friends fix the TIEARRAY interface. (See perltie(3) or http://www.perl.com for more info.)
Hashes
The following code snippet creates a persistent hash reference with an expected cardinality of ten elements.
my $h7 = new ObjStore::HV($store, 10);
An array representation is used for low cardinalities. Arrays do not scale well, but they do afford a compact representation. ObjectStore's os_Dictionary is used for large cardinalities.
Data structures can be built with the normal perl syntax:
$h7->{foo} = { 'fwaz'=> { 1=>'blort', 'snorf'=>3 }, b=>'ouph' };
Or the equally effective but unbearably tedious:
my $h1 = $dict->{foo} ||= new ObjStore::HV($dict);
my $h2 = $h1->{fwaz} ||= new ObjStore::HV($h1);
$h2->{1}='blort';
$h2->{snorf}=3;
$h1->{b}='ouph';
Perl saves us again! Relief.
Sets
The following code snippet creates a set with an expected cardinality of ten elements.
my $set = new ObjStore::Set($store, 10);
Sets are simple collections. They do not support duplicates. The following methods are supported:
$set->add($obj, { hello=>1 });
$set->rm($obj);
$yes = $set->contains($obj);
for (my $obj = $set->first; $obj; $obj = $set->next) {
# do something with $obj
}
An array representation is used for low cardinalities. Arrays are not efficient, but they are compact. ObjectStore's os_set is used for large cardinalities.
Changing the membership of a set while iterating over the members has undefined results.
OSPEEK
While there is no official schema for a perl database, the ospeek utility generates a sample of data content and structure. The following output was snapped from a database that supports a CGI application we have developed. Note how circular references and pointers between objects are summarized.
Wait! No Schema?! How Can This Scale?
How can relational databases scale?! When you write down a central schema, you are violating the principle of encapsulation. This is dumb. None of the usual database operations require a central schema. Why create artificial dependencies between your classes when you can avoid it?
Lazy Evolution
Even schema evolution can be done piecemeal. Give all your objects an evolve method that ensures that the representation is up-to-date. Either tag your objects with version numbers, or intelligently figure out how to evolve objects by examining their current structure.
The main thing is to keep an archive of prior formats of object instances to regression test your new evolve methods. If you can do extracts to a mini-database, that would do the trick. Then just run your new code through a historical mini-database.
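The version-number approach might look roughly like the sketch below. Everything in it is made up for illustration: the Node class, the field names, and the two version steps are hypothetical, and a plain transient hash stands in for a persistent object.

```perl
use strict;
use warnings;

package Node;    # hypothetical class, used only for this sketch

# Bring one object up to date, one version step at a time.
sub evolve {
    my ($o) = @_;
    my $v = $o->{VERSION} || 0;    # untagged objects count as version 0
    if ($v < 1) {
        # version 0 -> 1: 'title' was renamed to 'name'
        $o->{name} = delete $o->{title} if exists $o->{title};
        $v = 1;
    }
    if ($v < 2) {
        # version 1 -> 2: a hit counter was added
        $o->{hits} = 0 unless defined $o->{hits};
        $v = 2;
    }
    $o->{VERSION} = $v;
    return $o;
}

package main;

# An instance in the oldest format, as it might sit in an archive.
my $old = bless { title => 'Bright' }, 'Node';
$old->evolve;
print "$old->{name} is at version $old->{VERSION}\n";
```

Because each step only looks at the version tag, the same method can be run over an archive of old instances as a regression test.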
ospeek Example Output
ObjStore::Root Bright = Node {
VERSION => 5,
center => 1,
ctime => '19970814113317',
daily_hits => ObjStore::HV {
19970814 => 1,
},
desc => 'We are what we think. All that we are arises with our thoughts. With our thoughts we make the world.',
hits => 1,
name => 'Bright',
owner => '0',
reflected => '0',
rel => ObjStore::Set [
Node {
ctime => '19970814113317',
daily_hits => ObjStore::HV {
19970814 => 8,
},
desc => '',
hits => 8,
n_anon => 6,
name => 'Joe's Store',
owner => Node { ... }
reflected => 1,
rel => ObjStore::Set [
Node { ... }
Node {
ctime => '19970814113317',
desc => 'New users arrive here.',
hits => '0',
index => ObjStore::HV {
Anon-4 => User {
ctime => '19970814130657',
daily_hits => ObjStore::HV {
19970814 => 1,
},
desc => 'Anonymous temporary login.',
expire => '19970915130657',
hits => 1,
name => 'Anon-4',
owner => User { ... }
reflected => 3,
rel => ObjStore::Set [
Node { ... }
],
views => ObjStore::HV {
1 => User::View {
at => Node {
color => 'light green',
ctime => '19970814124048',
daily_hits => ObjStore::HV {
19970814 => 8,
},
desc => '',
hits => 8,
name => 'Research',
owner => Node { ... }
reflected => 1,
rel => ObjStore::Set [
Node { ... }
Node { ... }
Node { ... }
Node { ... }
],
url => '',
},
prior => ObjStore::HV {
0 => Node { ... }
1 => Node { ... }
2 => User { ... }
},
},
2 => User::View {
at => User { ... }
},
},
},
Anon-5 => User {
ctime => '19970814191636',
desc => 'Anonymous temporary login.',
expire => '19971013191636',
hits => '0',
name => 'Anon-5',
owner => User { ... }
reflected => 3,
rel => ObjStore::Set [
Node { ... }
],
views => ObjStore::HV {
1 => User::View {
at => User { ... }
prior => ObjStore::HV {
0 => User { ... }
},
},
2 => User::View {
at => User { ... }
},
},
},
Anon-6 => User {
ctime => '19970814191724',
desc => 'Anonymous temporary login.',
expire => '19971013191724',
hits => '0',
name => 'Anon-6',
owner => User { ... }
reflected => 3,
rel => ObjStore::Set [
Node { ... }
],
views => ObjStore::HV {
1 => User::View { ... }
2 => User::View { ... }
},
},
joshua => User {
ctime => '19970814113317',
daily_hits => ObjStore::HV {
19970814 => 13,
},
desc => '',
dreamer => 1,
expire => '19971013182157',
hits => 13,
name => 'joshua',
owner => User { ... }
passwd => 'zzReR55rX6.JA',
proposals => ObjStore::Set [
User::Proposal {
about => Node { ... }
ctime => '19970814195243',
from => User { ... }
to => User { ... }
},
],
reflected => 3,
rel => ObjStore::Set [
Node { ... }
],
url => '',
views => ObjStore::HV {
1 => User::View { ... }
2 => User::View { ... }
},
},
},
name => 'Users',
owner => Node { ... }
reflected => 2,
rel => ObjStore::Set [
Node { ... }
User { ... }
User { ... }
User { ... }
User { ... }
],
},
Node { ... }
],
},
Node { ... }
Node { ... }
],
},
WHY IS PERL A BETTER FIT FOR DATABASES THAN SQL, C++, OR JAVA?
When you write a structure declaration in C++ or Java you are assigning field-names, field-types, and field-order.
struct CXX {
char *name;
char *title;
double size;
};
Programs almost always require a recompile to change any of these attributes. This is fine for small to medium size applications but is not suitable for large databases. It is too inflexible. An SQL-type language is needed.
When you create a table in SQL you are assigning only field-names and field-types.
create table CXX
(name varchar(80),
title varchar(80),
size double)
This is a more flexible data declaration, but SQL gives you far less expressive power than C++ or Java. Applications end up being written in C++ or Java while their data is stored in SQL. Managing the synchronization between the two languages creates a lot of extra complexity. So much so that there are many software companies that exist solely to help address this headache.
perl is better because it spans all the requirements in a single language. For example, this is similar to an SQL table:
my $h1 = { name => undef, title => undef, size => undef };
Only the field-names are specified.
To address the other side of the spectrum, Malcolm Beattie is working on a perl compiler which is currently in beta-test. Here is his brief description of a new hybrid hash-array that is supported:
An array ref $a can be dereferenced as if it were a hash
ref. $a->{foo} looks up the key "foo" in %{$a->[0]}. The value is the
index in the true underlying array @$a. As an addition, if the array
ref is in a lexical variable tagged with a classname ("my CXX $obj" to
match your example above) then constant key dereferences of the form
$obj->{foo} are mapped to $obj->[123] at compile time by looking up
the index in %CXX::FIELDS.
For example:
my $schema_hashref = { 'field1' => 1, 'field2' => 2 };
my $arr = [$schema_hashref, 'fwaz', 'snorf'];
print "$arr->{field1} : $arr->{field2}\n"; # "fwaz : snorf"
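The lookup described above can also be spelled out by hand, which may make the mechanism easier to see. This is only a sketch: the fetch helper and the %FIELDS map are hypothetical names for illustration, and nothing here is resolved at compile time the way the quoted description promises.

```perl
use strict;
use warnings;

my %FIELDS = (field1 => 1, field2 => 2);    # field name => array index
my @obj    = (\%FIELDS, 'fwaz', 'snorf');   # slot 0 holds the index map

# Look a field up by name: find its index in slot 0, then fetch that slot.
sub fetch {
    my ($aref, $key) = @_;
    my $idx = $aref->[0]{$key};
    die "no such field: $key\n" unless defined $idx;
    return $aref->[$idx];
}

print fetch(\@obj, 'field1'), " : ", fetch(\@obj, 'field2'), "\n";
```

The values live in a compact array while the names live once in a shared map, which is where the expected speed and space advantage comes from.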
I haven't done benchmarks yet, but considering the implementation, compiled fake hashes should make perl very competitive with Java / ObjectStore database applications in terms of raw performance.
Summary (long)
SQL
All perl databases use the same flexible schema that can be examined and updated with generic tools. This is the key advantage of SQL, now available in perl.
Perl / ObjectStore is definitely faster than SQL too. Not to mention that perl is a general purpose programming language and SQL is at best a 'query language'.
C++
Special purpose data types can be coded in C++ and dynamically linked into perl. Since C++ will always be faster than Java, this gives perl an edge in the long run. perl is to C/C++ as C/C++ is to assembly language.
JAVA
Java has the buzz, but!
Just like C++, the lack of a universal generic schema limits use to a single application at a time. Without some sort of tie mechanism, I don't see how this can be remedied.
All Java databases must serialize data to store it. Until Java supports persistent allocation directly, database operations will always be slower than C++.
Perl will soon integrate with Java enough to use SwingSet - AWT.
I'd like to see some comparisons of code length when solving the same problems in Java and in perl. I have a strong suspicion that it is easier to do data processing in perl.
ETA
0-3 MONTHS
Perl compiler; kernel threads; fake hashes
3-6 MONTHS
Dynamically loaded application schemas; proper tied arrays; debugged tie interface; perl-Java integration
Summary (short)
Perl can store data
optimized for flexibility or for speed
in transient memory or persistent memory
without violating the principle of encapsulation or obstructing general ease of use.
ADVANCED FEATURES
Bless
The ObjStore module installs its own version of bless which assures that blessings are persistent. For example:
package MyObject;
use ObjStore;
@ISA = qw(ObjStore::HV);
sub new {
my ($class, $store) = @_;
my $o = $class->SUPER::new($store, $class);
$o->{attribute} = 5;
$o;
}
package main;
my $o = new MyObject($db);
If you store each class in a separate .pm file in your @INC path (see require), then the classes will be autoloaded as you traverse your data.
Class Autoloading
ObjStore tries to require each class the first time you access one of its persistent instances. This means that you can write generic data processing programs that automatically load the appropriate libraries to manipulate data as the data is accessed.
To disable the class autoloading behavior:
ObjStore::disable_class_auto_loading();
This mechanism is orthogonal to the AUTOLOAD mechanism for autoloading functions.
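The general idea of requiring a class on first sight can be sketched in plain perl. This is not ObjStore's actual implementation: the load_class helper and its %loaded cache are hypothetical, and the class names are just convenient test cases.

```perl
use strict;
use warnings;

my %loaded;    # class name => 1 if require succeeded, 0 if it failed

# Try to require a class the first time it is seen; remember the outcome.
sub load_class {
    my ($class) = @_;
    return $loaded{$class} if exists $loaded{$class};
    # Only accept plain package names before handing them to string eval.
    die "bad class name: $class\n" unless $class =~ /^\w+(?:::\w+)*$/;
    $loaded{$class} = eval("require $class; 1") ? 1 : 0;
    return $loaded{$class};
}

print load_class('File::Spec')          ? "loaded\n" : "missing\n";
print load_class('No::Such::Class::XY') ? "loaded\n" : "missing\n";
```

A failed require is remembered rather than retried, so data blessed into a class with no .pm file costs one lookup at most.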
Transactions Part Two
EVAL
Transactions are always executed within an implicit eval. If you do not want to abort your program when an ObjectStore exception occurs, you should indicate that you want to check errors yourself:
ObjStore::rethrow_exceptions(0);
After a transaction, you will need to check the value of $@ to see if anything went wrong and determine how to proceed.
try_update { ... }; die if $@; # check for errors!
DEADLOCK
Transactions are automatically retried in the case of a deadlock. If you need to handle deadlocks specially, you can use ObjStore::set_max_retries(0) and write the logic (or illogic) yourself.
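If you do take over the retry logic, it might be shaped like the sketch below. Several things here are assumptions: a plain eval block stands in for the transaction, the with_retries helper is hypothetical, and the test for a deadlock (a message containing the word "deadlock") is invented, since the real exception text is not described here.

```perl
use strict;
use warnings;

# Run $txn up to $max times, retrying only when the error looks like a deadlock.
# ASSUMPTION: a deadlock is recognised by the word "deadlock" in the message.
sub with_retries {
    my ($max, $txn) = @_;
    my $err;
    for my $attempt (1 .. $max) {
        my $ok = eval { $txn->($attempt); 1 };
        return $attempt if $ok;
        $err = $@;
        die $err unless $err =~ /deadlock/i;    # other errors are not retried
    }
    die "giving up after $max attempts: $err";
}

# Stand-in for a transaction body: fails twice, then succeeds.
my $tries = with_retries(5, sub {
    my ($attempt) = @_;
    die "simulated deadlock\n" if $attempt < 3;
});
print "succeeded on attempt $tries\n";
```

Errors that are not deadlocks are passed straight through, so a genuine bug in the transaction body still stops the program.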
Stargate
The stargate determines which collection representations are used to store implicitly created hashes and arrays. It is called recursively on data structures in order to copy them into persistent memory. If you replace the default stargate with your own, make sure to dismember the transient structures as they are processed to ensure that circular structures will be collected in transient memory.
ObjStore::set_stargate(sub {
my ($seg, $sv) = @_;
my $type = reftype $sv;
my $class = ref $sv;
if ($type eq 'HASH') {
my $hv = new ObjStore::HV($seg, ...);
while (my($hk,$v) = each %$sv) { $hv->STORE($hk, $v); }
%$sv = ();
if ($class ne 'HASH') { ObjStore::bless $hv, $class; }
$hv
} elsif ($type eq 'ARRAY') {
...
} else {
croak("Stargate: Don't know how to translate $sv");
}
};
TECHNICAL IMPLEMENTATION
You don't have to understand anything about the technical implementation. Just know that:
ObjectStore is outrageously powerful, sophisticated, even over-engineered.
The perl interface is optimized for simplicity and ease of use. (If it's not fun, why bother?)
The performance of raw ObjectStore is so good that even with a gunky perl layer, benchmarks will show that relational databases can be safely left on the bookshelf where they belong.
Differences Between The Perl And C++ APIs
Most stuff should be exactly the same. However,
Some static methods sit directly under ObjStore::.
Transactions are simplified.
Data Representation
Memory usage is much more important in a database than in transient memory. When databases can be as large as or larger than ten million megabytes, a few percent difference in compactness can mean a lot. Therefore, I am always thinking about ways of conserving persistent memory.
enum ossvtype {
ossv_undef=1,
ossv_iv=2, // integer
ossv_nv=3, // double
ossv_pv=4, // string
ossv_obj=5 // ref counted objects (containers or complex objects)
};
struct OSSV { // persistent scalar
void *vptr;
os_unsigned_int16 _refs; //unused
os_int16 _type;
};
struct hkey { // hash key
char *pv;
os_unsigned_int32 len;
};
struct hent { // hash element
hkey hk;
OSSV hv;
};
struct OSPV_iv { // IV storage
os_int32 iv;
};
struct OSPV_nv { // NV storage
double nv;
};
There are a number of weaknesses in the current schema:
OSSV
The refcnt is no longer used and the type of an OSSV could be inferred instead of stored (save 4 bytes per OSSV). The same I32 can be used for an integer value or string length (save an allocation per I32).
HASH KEYS
Hash keys store their length but not their hash. Actually, hash keys probably shouldn't even cache their hashed value, just a straight char* to minimize memory usage (save 4 bytes & an allocation).
STRINGS
Strings do not store their length so you can't store strings with embedded NULLs.
NO WEAK REFERENCES
Changes will be made as soon as I finish the database evolver.
Go Extension Crazy
ObjStore::UNIVERSAL is the base class for all persistent objects. You cannot directly access persistent scalars from perl. They are always immediately copied into transient scalars. So the ObjStore::UNIVERSAL base class is only for objects (or collections).
ObjStore::UNIVERSAL::Container is the base class for all containers.
ObjStore::Set is the base class for sets.
ObjStore::HV is the base class for tied hashes.
ObjStore::AV is the base class for tied arrays.
ObjStore::Cursor is the base class for cursors.
ObjStore::File will be the base class for large binary data.
When an ObjectStore exception occurs, $ObjStore::EXCEPTION is called with an explanation. You can replace the default handler with your own function.
Each subclass of ObjStore::UNIVERSAL has a %REP hash. Persistent object implementations add their creation functions to the hash. Each package's new method decides on the best representation, calls the creation function, and returns the persistent object.
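The %REP arrangement can be pictured with a small transient sketch. It does not reproduce ObjStore's internals: the My::HV class, the representation names, the shape of the creation functions, and the cutoff of 20 are all hypothetical.

```perl
use strict;
use warnings;

package My::HV;    # hypothetical class, not part of ObjStore

our %REP;          # representation name => creation function

# Two toy representations register their creation functions.
$REP{array} = sub { my ($card) = @_; return { rep => 'array', card => $card } };
$REP{dict}  = sub { my ($card) = @_; return { rep => 'dict',  card => $card } };

# new() picks a representation from the expected cardinality,
# calls its creation function, and blesses the result.
# ASSUMPTION: the cutoff of 20 is arbitrary, chosen only for this sketch.
sub new {
    my ($class, $card) = @_;
    $card = 0 unless defined $card;
    my $rep = $card < 20 ? 'array' : 'dict';
    return bless $REP{$rep}->($card), $class;
}

package main;

my $small = My::HV->new(10);
my $large = My::HV->new(10_000);
print "$small->{rep} $large->{rep}\n";
```

Adding a representation then means adding one entry to the hash and teaching new when to choose it.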
You can add your own C++ representations for each of Set, AV, and HV. If you want to know the specifics, look at the code for the provided built-in representations (GENERIC.*).
You can add new families of objects that inherit from ObjStore::UNIVERSAL. Suppose you want highly optimized, persistent bit vectors? Or matrices? These would not be difficult to add. Especially once Object Design figures out how to support multiple application schemas within the same executable. They claim that this tonal facility will be available in the next release.
ossv_bridge typemap
The following explanation may be helpful to developers trying to understand the ObjStore typemap. If you don't know what a typemap is, just skip to the next section.
The struct ossv_bridge is used to bridge between perl and C++ objects. It contains transient cursors and transient pointers to persistent data. Immediately after a transaction finishes, invalidate is invoked on all outstanding bridges. This is necessary in order to update the reference counts properly. This was also the most difficult part to get right. But hey, how many databases do reference counting?
DIRECTION
MORE BUILT-IN DATA TYPES
Text objects implemented using osmmtype and subclassed from IO::Handle. Support for one of Object Design's Text Object Managers. Support for bit vectors and matrices.
MORE APIS
Support for notification, database access control, and any other interesting ObjectStore APIs.
EXPORTS
bless, try_read, try_update, and try_abort_only are exported by default. Most other static methods can also be exported.
BUGS
NESTED TRANSACTIONS
Disabled until the transaction support is cleaned up.
CURSED OBJECTS
The strings used to record the blessed nature of persistent objects are allocated in a private hash in the default segment of a database (See 'ospeek -all'). If you accidentally mess up or change any of these strings, your objects will be cursed. You will need to re-bless each to fix the broken pointers. A database copy script is in the works.
AUTHOR
Copyright (c) 1997 Joshua Nathaniel Pritikin. All rights reserved.
This package is free software; you can redistribute it and/or modify it under the same terms as perl itself. perl / ObjectStore is available via any CPAN mirror site. See http://www.perl.com/CPAN/modules/by-module/ObjStore
Portions of the collection code snapped from splash, Jim Morris's delightful C++ library ftp://ftp.wolfman.com/users/morris/public/splash .
Also, a poignant thanks to all the wonderful teachers with whom I've had the opportunity of studying.
SEE ALSO
Examples in the t/ directory, perl5, ObjectStore, and happily not SQL!