NAME

Rosetta::Framework - Main Rosetta purpose and design documentation

ABSTRACT

Rosetta is a comprehensive framework for database-using applications of any size or function that allows them to be easily portable across multiple database implementations because any proprietary details of each are abstracted away. At the same time, it is designed to be fast and efficient. Applications use Rosetta as a virtual embedded database, whose API is called the "Rosetta Native Interface" or "RNI", and whose feature set is an opaque normalized superset of all common database feature sets. The feature superset includes both data manipulation (with multi-table selects and updates plus subqueries or stored procedure calls) and schema manipulation (tables, views, procedures). Rosetta is designed to work equally well with both embedded and client-server databases; in the latter case, it is the client.

The RNI is implemented using mainly a "Command" design pattern, meaning that it has few real functions or methods, but those use objects as input and output, which are flexible enough to define any task or result. The RNI is verbose and intended to provide non-ambiguous structured definitions of all tasks, so that the results of executing them are easy to predict; the definitions are multi-dimensional data structures (or objects) having atomic values (which also have native data type formats). Rosetta has this as an advantage over other database abstractions that use serialized strings like SQL (such as ODBC/JDBC), because each database has its own SQL dialect, and applications using them must be coded differently for each one. Rosetta is especially suited for data-driven applications, since the composite scalar values in their data dictionaries can often be copied directly to RNI structures, saving applications the tedious work of generating SQL themselves. Rosetta also provides native internationalization support, for example allowing system messages to be in multiple user languages simultaneously.

Rosetta makes it easy to layer alternative APIs on top of RNI, so that you can simplify or customize it to your specific needs. As practical examples of this, there are several emulators provided for common existing database APIs (such as ODBC/JDBC), so that most applications can simply use Rosetta as a hot-swappable replacement for them; you do not have to "learn yet another language" or re-code your application in order for it to just work with more databases. While the Rosetta core must always be embedded in an application to be used, an extension is available that will allow it to be used in a client-server arrangement instead (as ODBC does), where the server is a proxy for the client; the client is embedded in the main application, and it talks to the server in network-serialized RNI, which then translates the request into native database actions. Some utilities built on Rosetta are also available, for such tasks as cloning or backing up a database (schema and/or data; this includes scanning the database to make a data dictionary), or editing one through a web interface (like PHPPgAdmin or PHPMyAdmin but for any RDBMS).

Rosetta is not a complete database by itself and you need to separately have an actual database to use it with. (But there is a new embedded database available, suited for small data sets, that was created just for Rosetta.) Rosetta does not usually implement features that are missing in a database being abstracted (such as foreign key constraints, transactional integrity, or geographical data types), in order to give it an identical feature set to a more capable database; Rosetta just allows for the features that do exist to be called in an identical way. A consequence of this is that your choice of database implementation will indeed affect what features you have available; your application will port without changes only to databases which support the features that you use. The RNI may not interface to every single feature of a particular database (neglecting esoteric ones), so you can't use those features with Rosetta (but support can be added). Rosetta does not automate installation of any separate database software or configure it (like a package manager); you will have to do that yourself.

Note: the file DataDictionary.pod should be read next, after Framework.pod.

COPYRIGHT AND LICENSE

This file is part of the Rosetta database abstraction framework.

Rosetta is Copyright (c) 1999-2003, Darren R. Duncan. All rights reserved. Address comments, suggestions, and bug reports to perl@DarrenDuncan.net, or visit "http://www.DarrenDuncan.net" for more information.

Rosetta is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License (GPL) version 2 as published by the Free Software Foundation (http://www.fsf.org/). You should have received a copy of the GPL as part of the Rosetta distribution, in the file named "LICENSE"; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

Any versions of Rosetta that you modify and distribute must carry prominent notices stating that you changed the files and the date of any changes, in addition to preserving this original copyright notice and other credits. Rosetta is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GPL for more details.

Linking Rosetta statically or dynamically with other modules is making a combined work based on Rosetta. Thus, the terms and conditions of the GPL cover the whole combination.

As a special exception, the copyright holders of Rosetta give you permission to link Rosetta with independent modules that are interfaces to or implementations of databases, regardless of the license terms of these independent modules, and to copy and distribute the resulting combined work under terms of your choice, provided that every copy of the combined work is accompanied by a complete copy of the source code of Rosetta (the version of Rosetta used to produce the combined work), being distributed under the terms of the GPL plus this exception. An independent module is a module which is not derived from or based on Rosetta, and which is fully useable when not linked to Rosetta in any form.

Note that people who make modified versions of Rosetta are not obligated to grant this special exception for their modified versions; it is their choice whether to do so. The GPL gives permission to release a modified version without this exception; this exception also makes it possible to release a modified version which carries forward this exception.

While it is by no means required, the copyright holders of Rosetta would appreciate being informed any time you create a modified version of Rosetta that you are willing to distribute, because that is a practical way of suggesting improvements to the standard version.

DEPENDENCIES

All of my modules require Perl 5.004 or newer, even though only some of them need it, in the interest of consistency. That said, Perl 5.004 is more than 3 years old, is available as a binary on practically every OS that supports Perl at all, and is a very solid release, so I see no reason to support anything older.

The Rosetta core modules, meaning all modules that are in the distribution that you hold right now, don't require anything else.

However, other distributions which hold extensions to the Rosetta core, such as the almost always used Rosetta-Drivers distributions, do have other dependencies. These are specified in said other distributions, and often include the popular DBI and DBD::* modules, which most Driver modules use.

This distribution is getting split up soon. When that happens, the "Rosetta" distribution will be dependant on the new "SQL-ObjectModel" distribution.

Perl Version

5.004

Standard Modules

I<none>

Nonstandard Modules

I<none>

SYNOPSIS

Content of settings file "survey_prefs.pl", used by script below:

my $rh_prefs = {
	database_driver => 'Rosetta::Driver::MySQL',
	database_open_args => {
		server => 'survey1',
		user => 'joebloe',
		pass => 'fdDF9X0sd7zy',
	},
	question_list => [
		{
			visible_title => "What's your name?",
			type => 'str',
			name => 'name',
			is_required => 1,
		}, {
			visible_title => "What's the combination?",
			type => 'int',
			name => 'words',
		}, {
			visible_title => "What's your favorite colour?",
			type => 'str',
			name => 'color',
		},
	],
};

Content of a simple CGI script for implementing a web survey:

#!/usr/bin/perl
use strict;
use warnings;

&script_main();

sub script_main {
	my $base_url = 'http://'.($ENV{'HTTP_HOST'} || '127.0.0.1').$ENV{'SCRIPT_NAME'};
	my ($curr_mode) = $ENV{'QUERY_STRING'} =~ m/mode=([^&]*)/;

	my $form_data_str = '';
	read( STDIN, $form_data_str, $ENV{'CONTENT_LENGTH'} );
	chomp( $form_data_str );
	my %form_values = ();
	foreach my $pair (split( '&', $form_data_str )) {
		my ($key, $value) = split( '=', $pair, 2 );
		next if( $key eq "" );
		$key =~ tr/+/ /;
		$key =~ s/%([0-9a-fA-F]{2})/pack("c",hex($1))/ge;
		$value =~ tr/+/ /;
		$value =~ s/%([0-9a-fA-F]{2})/pack("c",hex($1))/ge;
		$form_values{$key} = $value;
	}

	my $fn_prefs = 'survey_prefs.pl';

	print
		"Status: 200 OK\n",
		"Content-type: text/html\n\n",
		"<html><head>\n",
		"<title>Simple Web Survey</title>\n",
		"</head><body>\n",
		"<p><a href=\"$base_url?mode=install\">Install Schema</a>\n",
		" | <a href=\"$base_url?mode=remove\">Remove Schema</a>\n",
		" | <a href=\"$base_url?mode=fillin\">Fill In Form</a>\n",
		" | <a href=\"$base_url?mode=report\">See Report</a></p>\n",
		"<hr />\n",
		"<form method=\"POST\" action=\"$base_url?mode=$curr_mode\">\n",
		"<p>\n",
		(&script_make_screen( $fn_prefs, $curr_mode, \%form_values )),
		"</p>\n",
		"<p><input type=\"submit\" name=\"OK\" value=\"Do It Now\" /></p>\n",
		"</form>\n",
		"</body></html>\n";
}

sub script_make_screen {
	my ($fn_prefs, $curr_mode, $form_values) = @_;

	my $prefs = do $fn_prefs;
	unless( ref( $prefs ) eq 'HASH' ) {
		return( "Error: can't obtain required preferences hash from '$fn_prefs': ".
			(defined( $prefs ) ? "result not a hash ref, but '$prefs'" : 
			$@ ? "compilation or runtime error of '$@'" : $!) );
	}

	eval {
		require Rosetta; # also compiles SQL::ObjectModel
	};
	if( $@ ) {
		return( "Error: can't compile Rosetta modules: $@" );
	}

	my $engine = Rosetta->new_engine();
	$engine->throw_error( 0 ); # on error, ret result obj, do not throw exception

	my $dbh = $engine->execute_command( {
		'type' => 'database_open',
		'driver' => $prefs->{database_driver},
		'args' => $prefs->{database_open_args}, # includes server name, user/pass
	} );
	if( $dbh->is_error() ) {
		return( "Error: can't open database: ".$dbh->get_error() );
	}

	my $html_output = &script_while_opened( $prefs, $dbh, $curr_mode, $form_values );

	my $rv = $dbh->execute_command( {
		'type' => 'database_close',
	} );
	if( $rv->is_error() ) {
		return( "Error: can't close database: ".$rv->get_error() );
	}

	return( $html_output );
}

sub script_while_opened {
	my ($prefs, $dbh, $curr_mode, $form_values) = @_;

	my $questions = $prefs->{question_list};
	unless( ref( $questions ) eq 'ARRAY' and scalar( @{$questions} ) > 0 ) {
		return( "Error: no survey question list defined in prefs file" );
	}

	my $dd_table = SQL::ObjectModel->new_table( 'survey_data' );

	foreach my $question (@{$questions}) {
		unless( ref( $question ) eq 'HASH' and $question->{visible_title} ) {
			return( "Error: invalid question defined in prefs file" );
		}
		$dd_table->add_column( { 
			'name' => $question->{name},
			'data_type' => { 'base_type' => $question->{type}, },
			'is_req' => $question->{is_required},
		} ) or return( "Error: invalid question defined in prefs file" );
	}

	if( $curr_mode eq 'install' ) {
		return( &script_do_install( $dbh, $dd_table, $questions, $form_values ) );
	}

	if( $curr_mode eq 'remove' ) {
		return( &script_do_remove( $dbh, $dd_table, $questions, $form_values ) );
	}

	if( $curr_mode eq 'fillin' ) {
		return( &script_do_fillin( $dbh, $dd_table, $questions, $form_values ) );
	}

	if( $curr_mode eq 'report' ) {
		return( &script_do_report( $dbh, $dd_table, $questions, $form_values ) );
	}

	return( "This is a simple demo.  Click on the menu items to do them." );
}

sub script_to_install {
	my ($dbh, $dd_table, $questions, $form_values) = @_;

	unless( $form_values->{OK} ) {  
		# user is seeing screen for first time (did not click 'OK' button)
		return( join( "", 
			"<h1>Install Schema</h1>\n",
			"<p>Do you want to install new schema to store answers for ", 
			"the following questions?</p>\n",
			"<ol>\n",
			(map { "<li>".$_->{visible_title}."</li>\n" } @{$questions}),
			"</ol>\n",
		) );
	}

	# user saw screen and clicked the 'OK' button, so try to install;
	# the following makes a Command of type 'table_create' and executes it

	my $rv = $dbh->execute_command( $dd_table->new_command_create() );
	if( $rv->is_error() ) {
		return( "Error: can't create survey table: ".$rv->get_error() );
	}

	return( "The new schema was successfully created." );
}

sub script_to_remove {
	my ($dbh, $dd_table, $questions, $form_values) = @_;

	unless( $form_values->{OK} ) {  
		# user is seeing screen for first time (did not click 'OK' button)
		return( join( "", 
			"<h1>Remove Schema</h1>\n",
			"<p>Do you want to remove existing schema to store answers for ", 
			"the following questions?</p>\n",
			"<ol>\n",
			(map { "<li>".$_->{visible_title}."</li>\n" } @{$questions}),
			"</ol>\n",
		) );
	}

	# user saw screen and clicked the 'OK' button, so try to destroy;
	# the following makes a Command of type 'table_destroy' and executes it

	my $rv = $dbh->execute_command( $dd_table->new_command_destroy() );
	if( $rv->is_error() ) {
		return( "Error: can't remove survey table: ".$rv->get_error() );
	}

	return( "The new schema was successfully removed." );
}

sub script_to_fillin {
	my ($dbh, $dd_table, $questions, $form_values) = @_;

	unless( $form_values->{OK} ) {  
		# user is seeing screen for first time (did not click 'OK' button)
		return( join( "", 
			"<h1>Fill In Form</h1>\n",
			"<p>Please answer the following questions.  ",
			"Those marked with a '*' are required.</p>\n",
			(map { 
					'<p>'.($_->{is_required} ? '*' : '').$_->{visible_title}.":".
					'<input type="text" name="'.$_->{name}.'" /></p>'."\n"
				} @{$questions}),
		) );
	}

	# user saw screen and clicked the 'OK' button, so try to fillin;
	# the following makes a Command of type 'data_insert' and executes it
	my $dd_view = SQL::ObjectModel->new_view( $dd_table );
	my $rv = $dbh->execute_command( $dd_view->new_command_insert( $form_values ) );
	if( $rv->is_error() ) {
		return( "Error: can't save form values in database: ".$rv->get_error() );
	}

	return( "Your form submission was saved successfully." );
}

sub script_to_report {
	my ($dbh, $dd_table, $questions, $form_values) = @_;

	# the following makes a Command of type 'data_select' and executes it
	my $dd_view = SQL::ObjectModel->new_view( $dd_table );
	my $cursor = $dbh->execute_command( $dd_view->new_command_select() );
	if( $cursor->is_error() ) {
		return( "Error: can't fetch form values from database: ".$cursor->get_error() );
	}
	my $rowset = $cursor->get_all_rows();

	my @html_output = (
		"<h1>See Report</h1>\n",
		"<p>Here are the answers that previous visitors gave:</p>\n",
		"<table>\n",
		"<tr>\n",
		(map { "<th>".$_->{visible_title}."</th>\n" } @{$questions}),
		"</tr>\n",
	);
	my @question_names = map { $_->{name} } @{$questions};
	foreach my $row (@{$rowset}) {
		push( @html_output, 
			"<tr>\n",
			(map { "<td>".$row->{$_}."</td>\n" } @question_names),
			"</tr>\n",
		);
	}
	push( @html_output, "</table>\n" );

	return( join( "", @html_output ) );
}

1;

DESCRIPTION

The Rosetta framework is intended to support complex (or simple) database-using applications that are easily portable across databases because common product-specific details are abstracted away. These include the RDBMS product and vendor name, what dialect of SQL its scripting or query interface uses, whether the product uses SQL at all or some other method of querying, how query results are returned, what features the RDBMS supports, how to manage connections, how to manage schema, how to manage stored procedures, and perhaps how to manage users. The main thing that this framework will not be doing in the forseeable future is managing the installation and configuration of the RDBMS itself, which may be on the same machine or a different one.

There are two main types of functionality that the Rosetta framework is designed to implement; this functionality may be better described in different groupings.

The first functionality type is the management (creation, modification, deletion) of the schema in a database, including: tables, keys, constraints, relations, sequences, views, stored procedures, triggers, and users. This type of functionality typically is used infrequently and sets things up for the main functionality of your database-using application(s). In some cases, typically with single-user desktop applications, the application may install its own schema, and/or create new database files, when it starts up or upon the user's prompting; this can be analogous to the result of a "New..." (or "Save As...") command in a desktop financial management or file archiving application; the application would then carry on to use the schema as its personal working space. In other cases, typically with multiple-user client-server applications, one "Installer" or "Manager" type application or process with exclusive access will be run once to create the schema, and then a separate application or process will be run to make use of it as a shared working space.

The second functionality type is the management (creation, modification, deletion) of the data in a database, including such operations as: direct selects from single or multiple tables or views, direct inserts or updates or deletes of records, calling stored procedures, using sequences, managing temporary tables, managing transactions, managing data integrity. This type of functionality typically is used frequently and comprises the main functionality of your database-using application(s). In some cases, typically with public-accessible websites or services, all or most users will just be viewing data and not changing anything; everyone would use the same database user and they would not be prompted for passwords or other security credentials. In other cases, typically with private or restricted-access websites or services, all or most users will also be changing data; everyone would have their own real or application-simulated database user, whom they log in as with a password or other credentials; as the application implements, these users can have different activity privileges, and their actions can be audited.

The Rosetta framework can be considered a low-level service because it allows a fine level of granularity or detail for the commands you can make of it and the results you get back; you get a detailed level of control. But it is not low-level in the way that you would be entering any raw SQL, or even small fragments of raw SQL; that is expressly avoided because it would expose implementation details that aren't true on all databases. Rather, this framework provides the means for you to specify in an RDBMS-generic fashion exactly what it is you want to happen, and your request is mapped to native or emulated functionality for the actual RDBMS that is being used, to do the work. The implementation or mapping is different for each RDBMS being abstracted away, and makes maximum use of that database's built-in functionality. Thereby, the Rosetta framework achieves the greatest performance possible while still being 100% RDBMS-generic.

This differs from other database abstraction modules or frameworks that I am aware of on CPAN, since the others tend to either work towards the lowest-common-denominator database while emulating more complex functionality, which is very slow, or more often they provide a much more limited number of abstracted functions and expect you to do things manually (which is specific to single databases or non-portable) with any other functionality you need. With many modules, even the abstracted functions tend to accept sql fragments as part of their input, which in the broadest sense makes those non-portable as well. With my framework I am attempting the "holy grail" of maximum portability with maximum features and maximum speed, which to my knowledge none of the existing solutions on CPAN are doing, or would be able to do short of a full rewrite. This is largely why I am starting a new module framework rather than trying to help patch an existing solution; I believe a rewrite is needed.

The Rosetta framework is best used through its native interface (RNI), which accepts and returns only atomic values (or multi-dimensional data structures containing them); no "parsing" or such analysis is done such as with SQL statements. The main reason is that this framework is intended primarily for a data-driven application programming model, where the applications use a "data dictionary" to control what work it is doing; the applications can simply copy the composite scalar values of the data dictionary, without having to encode them into a single string. The RNI is designed to allow entry of a non-ambiguous structured definition of any task that you would want a database to do. Rosetta is intended to support a superset of features from all common generic RDBMS products, so it should have a native way of expressing any task that you can do now. For cases where you don't already have a data dictionary, Rosetta can scan your existing database to create one.

One would think that, despite all the advantages that Rosetta can bring to a new application that is designed around RNI (or a simplifying wrapper of it), it wouldn't be very helpful to an existing older application that is built around "a different way of doing things". From the latter perspective, there looks to be just as much work involved in porting their application to use Rosetta as there would be to port it to a new database or other interface framework. The problem would be the all-too-common having to "learn yet another language", and then port the application to it (for that matter, it would be a new language for new app builders as well, although that may not be the same problem). Either transition could be a significant cost and the hurdle can deter upgrades to making apps portable.

But to help with this situation, Rosetta also includes several emulators (each of which is a higher-level layer that translates its input into RNI calls) for common existing database interfaces, so that most applications can simply use Rosetta as a hot-swappable replacement for them; you do not have to "learn yet another language" or re-code your application in order for it to just work with more databases. It should be possible to emulate any existing interface, and if a new one comes along with features that Rosetta can't handle (interface to or implement), then this is a reasonable excuse to update the core so that it is possible. That said, the success of an emulator depends largely on whether code that was using the original module is using the original the way it was designed or not; code that was hacking the internals of a module (as Perl makes so easy) is less likely to work (sort of like how an app used to using un-documented APIs on an operating sytem, or doing direct OS data structure access on a non-memory-protected OS, would break if the implementation of that OS changed).

Included in the Rosetta distribution will be several applications which serve as examples of Rosetta in use, but in some cases are useful themselves. One example will be a utility for copying one database to another, such as for backup or restore, or just migration. Another example will be a web app that works sort of like "PHPPgAdmin" or "PHPMyAdmin" (letting users manually edit schema and data) except that it works with many RDBMS products. Some code porting utilities could also be available, to help makers of old applications migrate to RNI, for better control and performance than an emulator would provide.

STRUCTURE

The modules composing the Rosetta framework are grouped into several main categories, which can be called: "Locale", "Schema" (SQL::ObjectModel), "Engine", "Driver", "Proxy" (Server/Client), "Validator", "Emulator", and others.

Note: the file DataDictionary.pod should be read next, after Framework.pod.

These classes do not export any functions or methods, so you need to call them using object notation. This means using Class->function() for functions and $object->method() for methods. If you are inheriting any class for your own modules, then that often means something like $self->method().

This briefly illustrates the relationship of the module groups:

           DATABASE
              |
              |
     Proxy    |
     Client;Driver--------------\
    called by |   uses   used by \
              |                   \
      invokes |                    \
            Engine--------------Schema;Locale (both used by other 7 groups)
    called by | \ uses   used by /  |
              |  \              /   |
      invokes/|   \            /   /
            //| Emulator------/   /
Validator--// | | uses   used by /
           /  | |               /
     Proxy/   | |              /
     Server APPLICATION-------/
                  uses   used by

The Locale modules function as constant data resources, mainly holding user-readable text such as database error messages and so forth, so that this text is separate from normal module code and easy to edit. These are part of the Rosetta core, and are used by the Engine and Driver modules. There are multiple Locale files rather than just one partly so that each file can have a different localization; adding a new user language is as easy as adding another file. The set of text strings being stored should be comprehensive enough that any Driver module can use them, and therefore not need their own Locale files for displaying error messages. All strings are mapped to short codes (eg: 'R001372'); these codes are what the normal module code uses to display a particular message.

The "Schema" and "Engine" modules are collectively referred to as the "Rosetta Native Interface" or "RNI", and they are the "Core" of the Rosetta framework to which all else is attached. The distinction between the two is that Schema modules are purely container objects which hold descriptions of "things" (such as data types or tables or views), while Engine modules typically are not containers and represent "actions" (such as connections or cursors). Schema objects are complete on their own and can be serialized or stored indefinately off site for later retrieval and use, such as with a "data dictionary" describing a table or view. Engine objects only make sense within the context where they were created and often contain time-sensitive data; it wouldn't make sense to store them except during the short term, such as with a pool of active and reusable database connections. It is common for Engine objects to hold Schema objects as properties, to help them know how to do their actions, such as how to create a table or select from it.

The Driver modules are what enable Rosetta to use each type of database. They are what handle (interface to or implement) all the details of using a specific RDBMS product; you need at least one for each unique RDBMS product you plan to run your application on; they are "action" modules. They take Engine/Schema objects as input and convert them into the actual SQL or other method calls that the RDBMS products use, then they invoke the generated, and then interpret the results into other Engine/Schema objects to return as output. The Driver modules also deal with extracting any existing schema stored in an RDBMS so they can generate an RNI data dictionary from it when desired. Nothing talks to an RDBMS product except a Driver module, and nothing talks to a driver module except the Engine modules.

Note that application specific "stored procedures" which are implemented in Perl, and that are RDBMS product specific, would sit beside "Driver" in the above diagram, since they would have direct access to the product-specific code that the Driver uses to talk to the DATABASE. Those are not shown in the above diagram for simplicity. That said, all calls to such "stored procedures" from the main APPLICATION will still always go through Engine.

The Proxy modules comprise an extension that allows the normally embedded-only Rosetta to be used in a client-server situation (as ODBC is). But this extension is implemented in such a way that neither the Driver modules nor the APPLICATION need to know the difference of whether the extension is being used or not. The Proxy modules correspond to both a "Driver" (network client) and an "APPLICATION" (network server); they involve two applications that have the Rosetta core embedded talking to each other, with one handing all the database details on behalf of the other. See the following diagram:

  DATABASE
     |
   Driver
     |
   Engine:::::::
     |         =
Proxy Server   =
     |         =
  NETWORK      = (Driver and APP think these are the same single Engine)
     |         =
Proxy Client   =
     |         =
   Engine:::::::
     |
APPLICATION

The Validator modules are a common comprehensive test suite to run against all drivers in order to tell that they are implementing the Rosetta interface properly or not. This also saves each Driver maker from having to write their own tests. These would be used similarly to how Sun has an official validation suite for Java Virtual Machines to make sure they implement the official Java specification. Since not all databases are alike, each driver is expected to report programmatically which kinds of features it supports, and the validator suite will only test interface compliance for those features; this way, a missing feature does not mean the driver will fail its tests unless it explicitely says it has said feature.

The Emulator modules are extensions that are purely optional for use, and they are intended to facilitate rapid adaption (hot-swapping) of Rosetta into an existing application that is already built around a different database interface (such as DBI or ODBC or OCI). Each Emulator module is a higher-level layer which translates its input into pure RNI calls and translates the output correspondingly; each should have an identical public interface to what it emulates. So applications often won't need to be changed to use Rosetta, but they still just become more portable. Many Emulator modules would include SQL parsing functionality (as SQL is their input), and would convert SQL statements from known dialects into Engine/Schema objects

All of the Rosetta components are implemented using pure Perl 5 code for now and during the near future (although some drivers or extensions may have prerequisites that are not). Once Rosetta is feature complete in this form, it will be rewritten as a C library which has bindings for multiple languages, including Perl 5 and Parrot. Advantages of this change are not only greater speed and memory efficiency but also compatability; it will allow Rosetta (and through it Perl) to use databases which previously had no Perl bindings or DBI drivers before. Moreover, the potential user base would increase sharply because users of other languages like PHP or Python would be able to use it. The C version of Rosetta will probably have the "Schema" (SQL::ObjectModel) and "Engine" modules being mostly reduced to struct definitions, macros for manipulating the structs, header files, and a small amount of implementation code. Most of the normal implementation C code would be derived from the other modules, particularly the "Driver", "Proxy", "Validator", "Emulator" ones. The pure Perl version will probably stay around after that conversion should it be useful, but it won't have as large of a database support. I have no plans to make a pure Java version, but rather, the C version would bind to Java environments using a Java Bean.

BRIEF MODULE LIST

The following module names and descriptions are undergoing a state of transition, and several parts of what was the Rosetta core are planning to be split off into separate distributions which the remaining Rosetta core is dependant on but which are also useful by themselves. See below for details. The split will not be done immediately, although the modules have been renamed already. Most of these modules don't exist yet, but are being planned to exist, as they are implemented over time.

Any indented package names represent inner classes of what they are beneath.

In distribution "Rosetta"

To become "Locale-KeyedText" or similar:

Locale::KeyedText (Refer to user messages in programs by keys)

To become "SQL-ObjectModel" or similar:

SQL::ObjectModel (Unserialized SQL objects, use like XML DOM)
	SQL::ObjectModel::_::Codes (declare base types or enums; meanings hard-coded in SQL::ObjectModel)
	SQL::ObjectModel::_::Command (describes an action to do and what bind vars should be prompted for)
	SQL::ObjectModel::_::DataType (metadata for individual atomic or scalar values)
	SQL::ObjectModel::_::Database (describes a database, that which you connect to)
	SQL::ObjectModel::_::Namespace (a namespace in a database for schema objects)
	SQL::ObjectModel::_::Table (details of table columns are part of this class)
	SQL::ObjectModel::_::View (a parsed select, used for any DML like IUD, gen stored views)
	SQL::ObjectModel::_::Sequence (a sequence generator)
	SQL::ObjectModel::_::Block (a block of code to run in a db; part of a proc, function, trigger)
	SQL::ObjectModel::_::Trigger (a code block assoc with a table, only auto-called by db on events)
	SQL::ObjectModel::_::User (an entity who may connect to a database and order it around)
	SQL::ObjectModel::_::* ... like Privilege, PrivilegeType, whatever

SQL::ObjectModel::Locale::en (English localization of input error messages)
SQL::ObjectModel::Locale::fr (French localization of input error messages)
SQL::ObjectModel::Locale::* ... like German, Spanish, Dutch, Chinese, etc

SQL::ObjectModel::DataDictionary (POD only, describes a db suitable for storing SQL::ObjectModels)
SQL::ObjectModel::API_C (POD only, describes the core C API for SQL::ObjectModel)

To stay "Rosetta":

Rosetta (Framework for RDBMS-generic apps and schemas)
	Rosetta::_::Engine (base class to provide 'execute' function; calls Drivers)
	Rosetta::_::Command (describes an action to do, and bundles resolved bind vars)
	Rosetta::_::Result (result of action including errors)
	Rosetta::_::Connection (represents a connection)
	Rosetta::_::Transaction (represents context within connection)
	Rosetta::_::Cursor (result of a selection command; can be input to insert/update/delete)
	Rosetta::_::* ... like DriverEngineGlobals or whatever

Rosetta::Locale::en (English localization of execution/input error messages)
Rosetta::Locale::fr (French localization of execution/input error messages)
Rosetta::Locale::* ... like German, Spanish, Dutch, Chinese, etc

Rosetta::Framework (POD only, holds main framework documentation)
Rosetta::SimilarModules (POD only, compares and contrasts Rosetta with other modules)

Rosetta::Driver::Null (a driver with no functionality, for testing core)

In distribution "Rosetta-Drivers" or a multiplicity of similarly named

Note: more databases (and data access protocols) can be found listed at "http://freshmeat.net/articles/view/305/" and "http://freshmeat.net/articles/view/307/"

Rosetta::Driver::Native (a fully functional small RDBMS for when normal db not needed)
Rosetta::Driver::Oracle (interface to Oracle db 8, 9)
Rosetta::Driver::Sybase (interface to Sybase db)
Rosetta::Driver::PostgreSQL (interface to PostgreSQL db)
Rosetta::Driver::MySQL (interface/binding to MySQL cli-serv/embed db 3.23.x, 4.x, 5.x)
Rosetta::Driver::Informix (interface to IBM Informix db)
Rosetta::Driver::DB2 (interface to IBM's DB/2 db and its DB/2 Express db)
Rosetta::Driver::SQLServer (interface to MS SQLServer db)
Rosetta::Driver::Adabas (interface to Software AG's Adabas db)
Rosetta::Driver::SAPdb (interface to the SAP open source db; forked from Adabas apparently)
Rosetta::Driver::Unify (interface to Unify db)
Rosetta::Driver::Empress (interface to the Empress db)
Rosetta::Driver::PrimeBase (interface to PrimeBase db)
Rosetta::Driver::OpenBase (interface to OpenBase db)
Rosetta::Driver::FrontBase (interface to FrontBase db)
Rosetta::Driver::InterBase (binding to Borland InterBase embedded db)
Rosetta::Driver::Firebird (interface to Firebird db)
Rosetta::Driver::Solid (interface to Solid db)
Rosetta::Driver::Cache (interface to InterSystems Cache db)
Rosetta::Driver::xBase (binding to XBase db accessor library)
Rosetta::Driver::Ingres (interface to Ingres db)
Rosetta::Driver::Illustra (interface to Illustra database)
Rosetta::Driver::Valentina (binding to Paradigma Software's Valentina embedded db) 
Rosetta::Driver::RelX (binding to relX' embedded db) 
Rosetta::Driver::4D (interface to 4D db)
Rosetta::Driver::mSQL (interface to mSQL db)
Rosetta::Driver::Helix (interface to Helix db)
Rosetta::Driver::Pick (interface to Pick Systems' db)
Rosetta::Driver::PointBase (interface to PointBase db, if linking to Java is possible)
Rosetta::Driver::FileMaker (interface to FileMaker Pro db)
Rosetta::Driver::FoxPro (interface to FoxPro db)
Rosetta::Driver::SQLite (binding to public domain SQLite embedded db)
Rosetta::Driver::Access (interface to MS Access db)
Rosetta::Driver::Paradox (interface to Borland Paradox db)
Rosetta::Driver::dBase (interface to dBase db)
Rosetta::Driver::CSV (a database driver using CSV files for storage)
Rosetta::Driver::* ... any of the 100+ other databases or services not listed above

Rosetta::Driver::Common::RWViews (convert IUD against views into IUD against tables)
Rosetta::Driver::Common::SQLBuilder (generate ANSI-99 SQL and other SQL variants)
Rosetta::Driver::Common::DBIWrapper (helps drivers implemented using DBI/DBD::*)
Rosetta::Driver::Common::* ... other code for sharing between multiple drivers

In distribution "Rosetta-Proxy"

Rosetta::Proxy::Server (an application on top of RNI implementing a network server)
Rosetta::Proxy::Client (a "Driver" implementing a network client of the server)
Rosetta::Proxy::Command (network-safe packed repr of ::Engine::Command obj)
Rosetta::Proxy::Result (network-safe packed repr of ::Engine::Result obj)

In distribution "Rosetta-Validators"

Rosetta::Validator (a common comprehensive test suite to run against all drivers)
Rosetta::Validator::DB1 (tests include: new db, open db, close db)
Rosetta::Validator::Table1 (tests include: create/modify/delete tables, fetch table schema)
Rosetta::Validator::Select1 (tests Include: select recs from 1 table)
Rosetta::Validator::IUD1 (tests include: insert/update/delete recs in 1 table)
Rosetta::Validator::View1 (tests include: create/modify/delete views, fetch view schema)
Rosetta::Validator::Select2 (tests Include: select from mult tables using joins only)
Rosetta::Validator::Select3 (tests Include: select from mult tables using unions only)
Rosetta::Validator::Select4 (tests Include: select from mult tables using subqueries)
Rosetta::Validator::IUD2 (tests include: i/u/d against views using joins only)
Rosetta::Validator::IUD3 (tests include: i/u/d against views using unions only)
Rosetta::Validator::IUD4 (tests include: i/u/d against views using subqueries)
Rosetta::Validator::Select5 (tests include: various built-in functions)
Rosetta::Validator::IUD5 (tests include: i/u/d against views using built-in functions)
Rosetta::Validator::* ... like stored procedures, triggers, various others

In distribution "Rosetta-Emulators"

Rosetta::Emulator::DBI (emulates DBI/DBD::*, but result is more portable)
Rosetta::Emulator::ODBC (emulates an ODBC module, result more portable)
Rosetta::Emulator::OCI (emulates an OCI module, result more portable)
Rosetta::Emulator::* ... like some of the other abstraction modules
Rosetta::Emulator::*::* (helpers for emulated frameworks of several modules)

Rosetta::Emulator::Common::SQLParser (parse ANSI-99 SQL and other SQL variants)
Rosetta::Emulator::Common::* ... other code for sharing between multiple emulators

In distribution "Rosetta-Applications"

Rosetta::Utility::* ... higher level functionality like: Backup, Restore, Copy, ...

In unknown distributions

Rosetta::* ... who knows ...

Rosetta::Adapter::* ... I like the name so it might be used for something

RosettaX::* ... unofficial extensions or wrappers to Rosetta are allowed here ...

EXPANDED MODULE LIST

Note that the exact module names and descriptions listed within each grouping are an older draft and are subject to near-future revisions or rewrites, including addition of more modules or removal of some existing ones.

Note the third: now that some of these modules have been implemented, their updated descriptions here are more accurate.

Schema (SQL::ObjectModel) Modules

0

SQL::ObjectModel::_::Command - This Schema class describes an action that needs to be done against a database; the action may include several steps, and all of them must be done when executing the Command. A Command object has one mandatory string property named 'type' (eg: 'database_open', 'table_create', 'data_insert'), which sets the context for all of its other properties, which are in a hash property named 'args'. Elements of 'args' often include other Schema class objects like 'Table' or 'DataType'. Like all Schema objects, this is static and context-independant; it is intended to be serialized and passed to code like DBI->prepare(). While bind vars are often used with them (such as values to insert or update, or record matching conditions), it is always up to the calling code to resolve the bind vars such as with a user prompt (as SQL*Plus does), and pass those to code like DBI->execute(). See also Rosetta::Engine::_::Command.

0

SQL::ObjectModel::_::DataType - This Schema class describes a simple data type, which serves as metadata for a single atomic or scalar unit of data, or a column whose members are all of the same data type, such as in a regular database table or in row sets read from or to be written to one. This class would be used both when manipulating database schema and when manipulating database data.

0

SQL::ObjectModel::_::Table - This Schema class describes a single database table, and would be used for such things as managing schema for the table (eg: create, alter, destroy), and describing the table's "public interface" so other functionality like views or various DML operations know how to use the table. In its simplest sense, a Table object consists of a table name, a list of table columns, a list of keys, a list of constraints, and a few other implementation details. This class does not describe anything that is changed by DML activity, such as a count of stored records, or the current values of sequences attached to columns. This class would be used both when manipulating database schema and when manipulating database data. This class can generate SQL::ObjectModel::Command objects having types of: 'table_verify', 'table_create', 'table_alter', 'table_destroy'.

0

SQL::ObjectModel::_::View - This Schema class describes a single database view, which conceptually looks like a table, but it is used differently. Tables and views are similar in that they both represent or store a matrix of data, which has uniquely identifiable columns, and rows which can be uniquely identifiable but may not be. With the way that Rosetta implements views, you can do all of the same DML operations with them that you can do with tables: select, insert, update, delete rows; that said, the process for doing any of those with views is more complicated than with tables, but this complexity is usually internal to Rosetta so you shouldn't have to code any differently between them. Tables and views are different in that tables actually store data in themselves, while views don't. A view is actually a custom abstracted interface to one or more database tables which are related to each other in a specific way; when you issue DML against a view, you are actually fetching from or modifying the data stored in one (simplest case) or more tables. This class can generate SQL::ObjectModel::Command objects having types of: 'data_select', 'data_insert', 'data_update', 'data_delete', 'data_lock', 'data_unlock', 'view_verify', 'view_create', 'view_alter', 'view_destroy'.

Engine Modules

0

Rosetta::_::Engine - This Engine class is inherited by all other Engine Engine classes, and it provides functionality to talk to or manage Driver modules. Its main task is to define the execute_command() method, which takes a Command object saying what should be done next and returns or throws a Result object saying what actually was done (or what errors there were). For some command types, execute_command() may only start the process that needs doing (eg: get a select cursor), and invoking execute_command() again on the Result object (which is a subclass) will continue or finish the process (eg: fetch a row). Instantiated by itself, this class stores globals that are shared by all drivers or connections. Subclasses include: Result, Database.

0

Rosetta::_::Command - This Engine class describes an action that needs to be done against a database; its objects encapsulate SQL::ObjectModel::Command objects and also encapsulates the resolved bind vars, or live objects like 'Cursor'. Code using these objects should be doing something like both calling DBI->prepare() and DBI->execute(). See also SQL::ObjectModel::_::Command.

0

Rosetta::_::Result - This Engine class is inherited by all Engine classes that would be returned from or thrown by an execute_command() method, and it contains the return values or errors of a Command. Its main task is to implement the is_error() and get_error() methods, which say whether the Command failed or not, and if so then why. Some commands (eg: 'database_close') have no other meta-data or data to return, while others do (eg: 'data_select'). Subclasses include: Database.

0

Rosetta::_::Connection - This Engine class represents a connection to a database instance, and the simplest database applications use only one. You instantiate a Database object by executing a Command of type 'database_open'; that command usually takes 4 arguments, the first of which is mandatory: 'driver' is a string having the name of the Driver module to use, which also defines what RDBMS product is being used; 'server' is the name of the specific database instance to use; 'user' is the username to authenticate yourself against a multi-user database as; 'pass' is the associated password.

0

Rosetta::_::Cursor - This Engine class represents a cursor over a rowset that is being selected from a database. You instantiate a Cursor object by executing a command of type 'data_select'; that command usually takes 1 argument, which is mandatory: 'view' is a View object that describes the select statement being run, including what columns it has and their datatypes, what the source tables are, how they are joined, what the row filters are, sort order, and row limiting or paging.

Driver Modules

0

Rosetta::Driver::Native - This class implements a fully-functional stand-alone embedded RDBMS; it will implement all database features that the RNI can interface to, even if some standard databases can't. This includes transactions, locking, foreign keys, stored procedures, multiple 'users', and more. That said, it is intended mainly as a proof of concept and won't scale well beyond small data sets. The entire database, both schema and data will be loaded into RAM at once, so your use of it is limited by what will fit in RAM. Persistence is achieved by dumping all data structures used by the database to disk at once, into a single disk file, at appropriate times, and loading it from disk all at once. So it will probably work best for data sets that don't exceed a few megabytes.

0

Rosetta::Driver::Oracle - This class implements a driver for talking to Oracle 8.x and 9.x databases. Note that Oracle 9 is the first version of the Oracle database that runs under Mac OS X (10.2 and later). Oracle versions before 8.x, like 7.3, will not be supported unless there is a great demand.

0

Rosetta::Driver::MySQL - This class implements a driver for talking to MySQL databases. It is intended to be used only with MySQL versions 4.0.13 and later, with version 4.1 or greater being strongly recommended. That said, which version you should use depends on your stage of development; MySQL AB has declared the 4.1.x series to be alpha quality now (2003.04.10), and expects it to be production ready in eight months (and 5.x six months after that), whereas the 4.0.x series is production quality right now (2003.03.15). Given the current large installed base of MySQL 3.23.x, this class will temporarily support that version as well, with that support being deprecated, until most users have migrated to 4.x. 3.22.x will not be supported at all. Alternately, the older (3.x) MySQL versions will have a separate driver for them, so they can be supported long term.

0

Rosetta::Driver::Common::RWViews - This is a utility class containing common functionality for multiple Drivers. Given that most, if not all, existing RDBMS systems don't support updateable views, meaning treating a view like a table in every way, this class will implement that Rosetta feature, translating selects, inserts, updates, or deletes against a view into one or more corresponding actions against the underlying tables that the view interfaces.

0

Rosetta::Driver::Common::SQLBuilder - This is a utility class containing common functionality for multiple Drivers. Given that a majority of commonly used RDBMS systems use SQL as their primary command language, this class will take care of translating RNI commands into SQL commands. While a large part of SQL is implemented in a common way on most SQL-using databases, this class will take arguments that instruct it how to vary the SQL output to the dialect that a particular RDBMS uses. Note that the default SQL being produced will be compliant with the ANSI-SQL-99 standard.

0

Rosetta::Driver::Common::DBIWrapper - This is a utility class containing common functionality for multiple Drivers. Given that the pure perl versions of Rosetta, which are its first releases, will be implemented on top of the popular DBI and DBD modules, this class provides some Rosetta-specific wrapper functionality for DBI that makes it easier for Rosetta drivers to use it.

All other databases in common use should be supported as well; the ones in the above module list are products that I have used personally; I need to research others to know what versions exist or are stable or are in common use. Other RDBMS products include: Sybase, PostgreSQL, DB2, SQL-Server, OpenBase, FrontBase, Valentina, Informix, and others.

Miscellaneous Modules From Afar

In this context, "Wrapper" is a type of Rosetta extension that sits between the application and the RNI, such as Emulators. But Wrappers take many forms, most of which will not be included with this distribution.

One form of Wrapper is a value-added extension, possibly more application-specific, such as an interpreter for data dictionaries. For example, a data dictionary could say that an application is composed of screens or forms that are related in a certain way; each screen would contain several controls of various types, and some controls may correspond to specific columns in database tables. The module in question would determine from the data dictionary what needs to be retrieved from the database to support a particular screen, and ask the Engine modules to go get it. Similarly, if the application user edits data on the screens that should then be saved back to the database, the Wrapper module would ask the Engine modules to save it. On the other side of things, it is quite possible that the data dictionary for the application is itself stored in the database, and so the Engine modules can be asked to fetch portions of it as the Wrapper module requires.

Another form of Wrapper is an interface customizer or simplifier. if you know that certain details of your commands to Engine will always be the same, or you just like to express your needs in a different way, you can take care of the default values in a wrapper module, so that the rest of your application simply has to provide inputs that aren't always the same.

Another form of Wrapper is a data parser or serializer. For example, to convert database output to XML or convert XML to a database command (although, certain kinds of XML processing may be better implemented in the Engine/Driver layers for performance reasons, but if so it would still be an extension).

Another form of Wrapper is a command parser for various SQL dialects. For example, if you want to quickly port an application, which already includes SQL statements that are tailored to a specific database product, to a different database for which it is incompatible, a Wrapper module could parse that statement into the object representation that Engine uses. This is effectively an SQL-to-SQL translator. I would expect that, citing reasons of performance or application code simplicity, one wouldn't want to use this functionality long-term, but replace the SQL with Engine object definitions later.

Finally, one could also make Wrappers which emulate other database abstraction solutions for similar reasons to the above, which is a different type of quick porting. Since the intended feature set of Rosetta should be a superset of existing solutions' feature sets, it should be possible to emulate them with it.

SEE ALSO

perl(1), Rosetta, Rosetta::SimilarModules, SQL::ObjectModel, SQL::ObjectModel::DataDictionary, SQL::ObjectModel::XMLSchema, SQL::ObjectModel::API_C, DBI.

12 POD Errors

The following errors were encountered while parsing the POD:

Around line 879:

Expected text after =item, not a number

Around line 888:

Expected text after =item, not a number

Around line 903:

Expected text after =item, not a number

Around line 942:

Expected text after =item, not a number

Around line 951:

Expected text after =item, not a number

Around line 961:

Expected text after =item, not a number

Around line 972:

Expected text after =item, not a number

Around line 1002:

Expected text after =item, not a number

Around line 1009:

Expected text after =item, not a number

Around line 1024:

Expected text after =item, not a number

Around line 1033:

Expected text after =item, not a number

Around line 1044:

Expected text after =item, not a number