PDD 21: Namespaces

Abstract

Description and implementation of Parrot namespaces.

Description

- Namespaces should be stored under first-level namespaces corresponding to the name of the HLL
- Namespaces should be hierarchical
- The get_namespace opcode takes a multidimensional hash key or an array of name strings
- Namespaces follow the semantics of the HLL in which they're defined
- Exports follow the semantics of the library's language
- Two interfaces: typed and untyped

Definitions

"HLL"

A High Level Language, such as Perl, Python, or Tcl, in contrast to PIR, which is a low-level language.

"current namespace"

The current namespace at runtime is the namespace associated with the currently executing subroutine. PASM assigns each subroutine a namespace when compilation of the subroutine begins. Don't change the associated namespace of a subroutine unless you're prepared for weird consequences.

(PASM also has its own separate concept of current namespace which is used to initialize the runtime current namespace as well as determine where to store compiled symbols.)

Implementation

Namespace Indexing Syntax

Namespaces are denoted in Parrot as simple strings, multidimensional hash keys, or arrays of name strings.

A namespace may appear in Parrot source code as the string "a" or the key ["a"].

A nested namespace "b" inside the namespace "a" will appear as the key ["a"; "b"].

There is no limit to namespace nesting.
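As a rough illustration of the key syntax, the sketch below declares a sub inside the nested namespace ["a"; "b"] and then fetches it by key from the root of the HLL. The sub name hello is made up for the example.

```pir
# Compile the next sub into the nested namespace a;b.
.namespace ['a'; 'b']
.sub hello
    say 'hello from a;b'
.end

# An empty key switches back to the HLL root namespace.
.namespace []
.sub main :main
    # Look up 'hello' by key, relative to the current namespace.
    $P0 = get_global ['a'; 'b'], 'hello'
    $P0()
.end
```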

Naming Conventions

Parrot's target languages have a wide variety of namespace models. A common API and standard conventions should make interoperability possible while still letting each language choose the best internal representation.

True Root Namespace

The true root namespace is hidden from common usage, but it is available via the get_root_namespace opcode. For example:

$P0 = get_root_namespace

This root namespace stringifies to the empty string.

HLL Root Namespaces

Each HLL must store public items in a namespace named with the lowercased name of the HLL. This is the HLL root namespace. For instance, Tcl's user-created namespaces should live in the tcl namespace. This eliminates any accidental collisions between languages.

An HLL root namespace must be stored at the first level in Parrot's namespace hierarchy. These top-level namespaces should also be specified in a standard Unicode encoding. The reason for these restrictions is to allow compilers to remain completely ignorant of each other.

Parrot internals are stored in the default HLL root namespace parrot.
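The relationship between the true root and an HLL root can be sketched as follows. This assumes a Parrot release in which the .HLL directive takes a single argument (older releases also expected a library name), and it treats the true root namespace as a hash, which is an assumption about the untyped interface rather than a guarantee.

```pir
.HLL 'tcl'

.sub main :main
    .local pmc root, hll_ns, entry
    root   = get_root_namespace      # the true root
    hll_ns = get_hll_namespace       # the HLL root of the running sub

    # The HLL root should be stored at the first level, under 'tcl'.
    entry = root['tcl']
    $I0 = issame hll_ns, entry       # 1 if both refer to the same PMC
    say $I0
.end
```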

HLL Implementation Namespaces

Each HLL must store implementation internals (private items) in an HLL root namespace named with an underscore and the lowercased name of the HLL. For instance, Tcl's implementation internals should live in the _tcl namespace.

HLL User-Created Namespaces

Each HLL must store all user-created namespaces under the HLL root namespace. It is suggested that HLLs use hierarchical namespaces to the extent practical. A single flat namespace can be made to work, but it complicates symbol exportation.

Namespace PMC API

Most languages leave their symbols plain, which makes lookups quite straightforward. Others use sigils or other mangling techniques, complicating the problem of interoperability.

Parrot namespaces assist with interoperability by providing two interface subsets: the untyped interface and the typed interface.

Untyped Interface

Each HLL may, when working with its own namespace objects, use the untyped interface, which allows direct naming in the native style of the namespace's HLL.

This interface consists of the standard Parrot hash interface, with all its keys, values, lookups, deletions, etc. Just treat the namespace like a hash. (It probably is one, really, deep down.)

The untyped interface also has one method:

get_name

Gets the name of the namespace $P2 as an array of strings. For example, if $P2 is a Perl 5 namespace "Some::Module", within the Perl 5 HLL, then get_name() on $P2 returns an array of "perl5", "Some", "Module". It returns the literal namespace names as the HLL stored them, without filtering for name mangling.

NOTE: Due to aliasing, this value may be wrong -- i.e. it may disagree with the namespace name with which you found the namespace in the first place.
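The register names in the description of get_name correspond to a call of roughly the following shape. This is a minimal sketch that mirrors the "Some::Module" example; the single-argument .HLL directive and the lowercase HLL name perl5 are assumptions.

```pir
.HLL 'perl5'
.namespace ['Some'; 'Module']

.sub main :main
    # $P2: the namespace to inspect (here, the current namespace).
    $P2 = get_namespace
    # $P1: an array of name strings, starting with the HLL root name.
    $P1 = $P2.'get_name'()
    # Join the elements only to display them.
    $S0 = join ';', $P1
    say $S0
.end
```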

Typed Interface

When a given namespace's HLL is either different from the current HLL or unknown, an HLL should generally use only the language-agnostic namespace interface. This interface isolates HLLs from each others' naming quirks. It consists of add_foo(), find_foo(), and del_foo() methods, for values of "foo" including "sub" (something executable), "namespace" (something in which to find more names), and "var" (anything).

NOTE: The job of the typed interface is to bridge naming differences, and only naming differences. Therefore: 1) It does not enforce, nor even notice, the interface requirements of "sub" or "namespace": e.g. execution of add_sub("foo", $P0) does not automatically guarantee that $P0 is an invokable subroutine; and 2) it does not prevent overwriting one type with another.

add_namespace

Store $P3 as a namespace under the namespace $P1, with the name of $S2.

add_sub

Store $P3 as a subroutine with the name of $S2 in the namespace $P1.

add_var

Store $P3 as a variable with the name of $S2 in the namespace $P1.

IMPLEMENTATION NOTE: Perl namespace implementations may choose to implement add_var() by checking which parts of the variable interface are implemented by $P3 (scalar, array, and/or hash) so it can decide on an appropriate sigil.
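The add_* descriptions above suggest calls of the form $P1.'add_sub'($S2, $P3). The sketch below follows that shape; the names Child and greet are invented for the example, and creating a namespace with new 'NameSpace' is an assumption about how the default namespace PMC is instantiated.

```pir
.sub greet
    say 'hello'
.end

.sub main :main
    .local pmc ns, child, greeter
    ns = get_namespace                   # $P1: the namespace being added to

    child = new 'NameSpace'              # $P3: the namespace to store
    ns.'add_namespace'('Child', child)   # $S2 is 'Child'

    greeter = get_global 'greet'         # $P3: the subroutine to store
    child.'add_sub'('greet', greeter)    # no check that greeter is invokable
.end
```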

del_namespace, del_sub, del_var

Delete the namespace, sub, or variable named $S2 from the namespace $P1.

find_namespace, find_sub, find_var

Find the namespace, sub, or variable named $S3 in the namespace $P2.

IMPLEMENTATION NOTE: Perl namespace implementations should implement find_var() to check all variable sigils, but the order is not to be counted on by users. If you're planning to let Python code see your module, you should avoid exporting both our $A and our @A. (Well, you might want to consider not exporting variables at all, but that's a style issue.)
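A lookup and deletion through the typed interface might look like the sketch below. It assumes the call shapes $P1 = $P2.'find_var'($S3) and $P1.'del_var'($S2) implied by the descriptions above; the variable name answer is made up.

```pir
.sub main :main
    .local pmc ns, answer, found
    ns = get_namespace

    # Store a variable first so there is something to find.
    answer = new 'Integer'
    answer = 42
    ns.'add_var'('answer', answer)

    # Look the variable up by its plain, unmangled name.
    found = ns.'find_var'('answer')
    say found

    # Remove it again.
    ns.'del_var'('answer')
.end
```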

export_to

Export items from the namespace $P1 into the namespace $P2. The items to export are named in $P3, which may be an array of strings, a hash, or null. If $P3 is an array of strings, interpretation of items in an array follows the conventions of the source (exporting) namespace. If $P3 is a hash, the keys correspond to the names in the source namespace, and the values correspond to the names in the destination namespace. If a hash value is null or an empty string, the name in the hash key is used. A null $P3 requests the 'default' set of items. Any other type passed into $P3 throws an exception.

The base Parrot namespace export_to() function interprets item names as literals -- no wildcards or other special meaning. There is no default list of items to export, so $P3 of null and $P3 of an empty array have the same behavior.
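The hash form of export_to can be sketched as follows, assuming the call shape $P1.'export_to'($P2, $P3) and the renaming behavior described above. The namespace Source and the names foo and bar are hypothetical.

```pir
.namespace ['Source']
.sub foo
    say 'foo called'
.end

.namespace []
.sub main :main
    .local pmc src, dest, names
    src  = get_namespace ['Source']   # $P1: the exporting namespace
    dest = get_namespace              # $P2: the importing namespace

    # Key: name in the source. Value: name in the destination.
    names = new 'Hash'
    names['foo'] = 'bar'
    src.'export_to'(dest, names)

    # If the export succeeded, the sub is now reachable as 'bar'.
    $P0 = get_global 'bar'
    $P0()
.end
```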

NOTE: Exportation may entail non-obvious, odd, or even mischievous behavior. For example, Perl's pragmata are implemented as exports, and they don't actually export anything.

IMPLEMENTATION EXAMPLES: Suppose a Perl program were to import some Tcl module with an import pattern of "c*" -- something that might be expressed in Perl 6 as use tcl:Some::Module 'c*'. This operation would import all the commands that start with 'c' from the given Tcl namespace into the current Perl namespace. This is so because, regardless of whether 'c*' is a Perl 6 style export pattern, it is a valid Tcl export pattern.

{XXX - The ':' for HLL is just proposed. This example will need to be updated later.}

IMPLEMENTATION NOTE: Most namespace export_to implementations will restrict themselves to using the typed interface on the target namespace. However, they may also decide to check the type of the target namespace and, if it turns out to be of a compatible type, to use same-language shortcuts.

DESIGN TODO: Figure out a good convention for a default export list in the base namespace PMC. Maybe a standard method "expand_export_list()"?

Compiler PMC API

Methods

parse_name

Parse the name in $S3 using the rules specific to the compiler $P2, and return an array of individual name elements.

For example, a Java compiler would turn 'a.b.c' into ['a','b','c'], while a Perl compiler would turn 'a::b::c' into the same result. Meanwhile, due to Perl's sigil rules, '$a::b::c' would become ['a','b','$c'].

get_namespace

Ask the compiler $P2 to find its namespace which is named by the elements of the array in $P3. If $P3 is a null PMC or an empty array, get_namespace retrieves the base namespace for the HLL. It returns a namespace PMC on success and a null PMC on failure.

This method allows other HLLs to know one name (the HLL) and then work with that HLL's modules without having to know the name it chose for its namespace tree. (If you really want to know the name, the get_name() method should work on the returned namespace PMC.)

Note that this method is basically a convenience and/or performance hack, as it does the equivalent of get_root_namespace followed by zero or more calls to <namespace>.find_namespace(). However, any compiler is free to cheat if it doesn't get caught, e.g. to use the untyped namespace interface if the language doesn't mangle namespace names.
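Taken together, parse_name and get_namespace let one HLL find another HLL's namespace from a name in that HLL's own syntax. The sketch below assumes a compiler registered under the name perl5 that implements both methods as described; neither ships with Parrot itself.

```pir
.sub main :main
    .local pmc perl5, parts, ns
    perl5 = compreg 'perl5'                    # assumption: such a compiler is registered

    # Split the name using the compiler's own rules.
    parts = perl5.'parse_name'('Some::Module')

    # Ask the compiler for the matching namespace.
    ns = perl5.'get_namespace'(parts)
    if null ns goto not_found

    # Show where the compiler actually keeps it.
    $P0 = ns.'get_name'()
    $S0 = join ';', $P0
    say $S0
    .return ()

  not_found:
    say 'no such namespace'
.end
```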

load_library

Ask this compiler to load a library/module named by the elements of the array in $P2, with optional control information in $P3.

For example, Perl 5's module named "Some::Module" should be loaded using (in pseudo Perl 6): perl5.load_library(["Some", "Module"], null).

The meaning of $P3 is compiler-specific. The only universally legal value is null, which requests a "normal" load. The meaning of "normal" varies, but the ideal would be to perform only the minimal actions required.

On failure, an exception is thrown.

Subroutine PMC API

Subroutines must expose some information so that namespaces can behave correctly.

Methods

get_namespace

Retrieve the namespace $P1 where the subroutine $P2 was defined. (As opposed to the namespace(s) that it may have been exported to.)
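A sketch of the call, assuming the shape $P1 = $P2.'get_namespace'(); the namespace Foo and the sub bar are made up for the example.

```pir
.namespace ['Foo']
.sub bar
.end

.namespace []
.sub main :main
    # $P2: a subroutine defined in the Foo namespace.
    $P2 = get_global ['Foo'], 'bar'
    # $P1: the namespace in which that subroutine was defined.
    $P1 = $P2.'get_namespace'()

    $P0 = $P1.'get_name'()
    $S0 = join ';', $P0
    say $S0
.end
```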

Namespace Opcodes

The namespace opcodes all have three variants: one that operates from the currently selected namespace (i.e. the namespace of the currently executing subroutine), one that operates from the HLL root namespace (identified by "hll" in the opcode name), and one that operates from the true root namespace (identified by "root" in the name).

set_namespace

Add the namespace PMC $P1 under the name denoted by a multidimensional hash key.

Add the namespace PMC $P2 under the name denoted by an array of name strings $P1.

get_namespace

Retrieve the current namespace, the HLL root namespace, or the true root namespace and store it in $P1.

Retrieve the namespace denoted by a multidimensional hash key and store it in $P1.

Retrieve the namespace denoted by the array of names $P2 and store it in $P1.

Thus, if the name is known at compile time, you can retrieve the "Foo::Bar" namespace from the top level of the HLL with a key.
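The keyed lookup can be sketched like this; the enclosing sub exists only to make the fragment complete.

```pir
.sub main :main
    # The key is a compile-time constant.
    $P0 = get_hll_namespace ['Foo'; 'Bar']
    if null $P0 goto missing
    say 'found Foo;Bar'
    .return ()
  missing:
    say 'Foo;Bar does not exist'
.end
```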

If the name is not known until run time, you would retrieve the namespace with an array instead.
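The array lookup can be sketched like this, with the name held in a string and split at run time.

```pir
.sub main :main
    # Build the array of name strings at run time.
    $S0 = 'Foo::Bar'
    $P1 = split '::', $S0
    $P0 = get_hll_namespace $P1
    if null $P0 goto missing
    say 'found Foo;Bar'
    .return ()
  missing:
    say 'Foo;Bar does not exist'
.end
```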

make_namespace

Create and retrieve the namespace denoted by a multidimensional hash key and store it in $P1. If the namespace already exists, only retrieve it.

Create and retrieve the namespace denoted by the array of names $P2 and store it in $P1. If the namespace already exists, only retrieve it.

get_global

Retrieve the symbol named $S2 in the current namespace, HLL root namespace, or true root namespace.

Retrieve the symbol named $S2 from the namespace denoted by a multidimensional hash key, relative to the current namespace, HLL root namespace, or true root namespace.

Retrieve the symbol named $S3 from the namespace denoted by the array of names $P2, relative to the current namespace, HLL root namespace, or true root namespace.
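The three starting points can be compared side by side. In this sketch the namespace Foo;Bar and the sub baz are invented, and the path from the true root assumes the code runs in the default parrot HLL.

```pir
.namespace ['Foo'; 'Bar']
.sub baz
    say 'baz called'
.end

.namespace []
.sub main :main
    # Keyed form, relative to the current namespace.
    $P1 = get_global ['Foo'; 'Bar'], 'baz'
    $P1()

    # Array form, relative to the HLL root namespace.
    $P2 = split '::', 'Foo::Bar'
    $P3 = get_hll_global $P2, 'baz'
    $P3()

    # Array form, relative to the true root: the path starts with the HLL name.
    $P4 = split '::', 'parrot::Foo::Bar'
    $P5 = get_root_global $P4, 'baz'
    $P5()
.end
```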

set_global

Store $P2 as the symbol named $S1 in the current namespace, HLL root namespace, or true root namespace.

Store $P2 as the symbol named $S1 in the namespace denoted by a multidimensional hash key, relative to the current namespace, HLL root namespace, or true root namespace. If the given namespace does not exist, it is created.

Store $P3 as the symbol named $S2 in the namespace denoted by the array of names $P1, relative to the current namespace, HLL root namespace, or true root namespace. If the given namespace does not exist, it is created.
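A short sketch of storing symbols, including the case where the target namespace does not exist yet. The names x, Foo, and Bar are placeholders.

```pir
.sub main :main
    $P0 = new 'Integer'
    $P0 = 5

    # Plain form: store into the current namespace.
    set_global 'x', $P0

    # Keyed form from the HLL root: Foo;Bar is created if it is missing.
    set_hll_global ['Foo'; 'Bar'], 'x', $P0

    # Read the value back through the same path.
    $P1 = get_hll_global ['Foo'; 'Bar'], 'x'
    say $P1
.end
```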

HLL Namespace Mapping

To make this work, Parrot must determine which type of namespace PMC to create.

Default Namespace

The default namespace PMC will implement Parrot's current behavior.

Compile-time Creation

This Perl:

#!/usr/bin/perl
package Foo;
$x = 5;

should map roughly to this PIR:
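One plausible rendering is sketched below. The HLL name perl5 and the choice to store the variable under the mangled name "$x" are assumptions about how a Perl 5 compiler would behave, and the single-argument .HLL form assumes a reasonably recent Parrot.

```pir
.HLL 'perl5'
.namespace ['Foo']

.sub main :main
    # $x = 5;
    $P0 = new 'Integer'
    $P0 = 5
    set_global '$x', $P0      # stored in the current namespace, Foo
.end
```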

In this case, the main sub would be tied to Perl 5 by the .HLL directive, so a Perl 5 namespace would be created.

Run-time Creation

Consider the following Perl 5 program:

#!/usr/bin/perl
$a = 'x';
${"Foo::$a"} = 5;

The Foo:: namespace is created at run-time (without any optimizations). In these cases, Parrot should create the namespace based on the HLL of the PIR subroutine that calls the store function.
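A sketch of PIR that a Perl 5 compiler might emit for this program follows. As before, the HLL name perl5 and the sigil-prefixed symbol names are assumptions, not something this document mandates.

```pir
.HLL 'perl5'

.sub main :main
    # $a = 'x';
    $P0 = new 'String'
    $P0 = 'x'
    set_global '$a', $P0

    # ${"Foo::$a"} = 5;
    $S0 = $P0
    $S0 = 'Foo::' . $S0           # the full name, "Foo::x"
    $P2 = split '::', $S0         # the array ["Foo", "x"]
    $S1 = pop $P2                 # "x"; $P2 is now ["Foo"]
    $S1 = '$' . $S1               # the mangled variable name, "$x"
    $P3 = new 'Integer'
    $P3 = 5
    set_global $P2, $S1, $P3      # Foo is created here if it does not exist
.end
```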

In this case, set_global should see that it was called from "main", which is in a Perl 5 namespace, so it will create the "Foo" namespace as a Perl 5 namespace.

Language Notes

Perl 6

Sigils

Perl 6 may wish to be able to access the namespace as a hash with sigils. That is certainly possible, even with subroutines and methods. It is not important that an HLL use the typed namespace API itself; it is only important that it provide the API for others to use.

So Perl 6 may implement get_keyed and set_keyed VTABLE slots that allow the namespace PMC to be used as a hash. The find_sub method would, in this case, prepend a "&" sigil to the sub/method name and search in the internal hash.

Python

Importing from Python

Since functions and variables overlap in Python's namespaces, when exporting to another HLL's namespace, the Python namespace PMC's export_to method should use introspection to determine whether x should be added using add_var or add_sub. $I0 = does $P0, "Sub" may be enough to decide correctly.
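The introspection step might be sketched as a small helper. The helper name export_one is hypothetical, and whether does with "Sub" is the right test for every invokable PMC is left open, as the text above notes.

```pir
# Hypothetical helper: add one item to dest as a sub or as a variable.
.sub export_one
    .param pmc dest
    .param string name
    .param pmc value

    $I0 = does value, 'Sub'
    if $I0 goto as_sub
    dest.'add_var'(name, value)
    .return ()

  as_sub:
    dest.'add_sub'(name, value)
.end

.sub main :main
    .local pmc dest, item
    dest = get_namespace
    item = new 'Integer'
    item = 1
    'export_one'(dest, 'x', item)    # if the test is false, this goes through add_var
.end
```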

Subroutines and Namespaces

Since Python's subroutines and namespaces are just variables (they all share a single set of names), the Python PMC's find_var method may return subroutines as variables.

Examples

Aliasing

Perl:

#!/usr/bin/perl6
sub foo {...}
%Foo::{"&bar"} = &foo;

PIR:
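A sketch of the equivalent PIR, assuming the Perl 6 compiler stores subs under "&"-prefixed names. The empty body of &foo stands in for the elided Perl body.

```pir
.sub '&foo'
.end

.sub main :main
    # Fetch the sub from the current namespace.
    $P0 = get_global '&foo'
    # Store the same PMC under a second name in Foo.
    set_global ['Foo'], '&bar', $P0
.end
```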

Cross-language Exportation

Perl 5:

#!/usr/bin/perl
use tcl:Some::Module 'w*';   # XXX - ':' after HLL may change in Perl 6
write("this is a tcl command");

PIR (without error checking):
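The sketch below strings together the compiler and namespace methods described earlier. It assumes a Tcl compiler registered under the name tcl that implements load_library and get_namespace, and a Tcl namespace whose export_to understands the pattern 'w*'. None of this ships with Parrot itself.

```pir
.sub main :main
    .local pmc tcl, mod_name, src, dest, patterns
    tcl = compreg 'tcl'                    # assumption: a Tcl compiler is registered

    # use tcl:Some::Module ...
    mod_name = split '::', 'Some::Module'
    null $P0
    tcl.'load_library'(mod_name, $P0)      # null requests a "normal" load

    # ... 'w*';
    src  = tcl.'get_namespace'(mod_name)   # exporting namespace
    dest = get_namespace                   # importing (current) namespace
    patterns = new 'ResizableStringArray'
    push patterns, 'w*'                    # interpreted by the Tcl namespace
    src.'export_to'(dest, patterns)

    # write("this is a tcl command");
    'write'('this is a tcl command')
.end
```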

References

None.