# Copyright (C) 2005-2007, The Perl Foundation.
# $Id: pdd21_namespaces.pod 21480 2007-09-22 10:00:33Z paultcochrane $

=head1 NAME

docs/pdds/pdd21_namespaces.pod - Parrot Namespaces

=head1 VERSION

$Revision: 21480 $

=head1 DESCRIPTION

=over 4

=item - Namespaces should be stored under first-level namespaces corresponding
to the name of the HLL

=item - Namespaces should be hierarchical

=item - The C<get_namespace> opcode takes a multidimensional hash key or an
array of name strings

=item - Namespaces follow the semantics of the HLL in which they're defined

=item - Exports follow the semantics of the library's language

=item - Two interfaces: typed and untyped

=back

=head1 DEFINITIONS

=head2 "HLL"

A High Level Language, such as Perl, Python, or Tcl, in contrast to PIR, which
is a low-level language.

=head2 "current namespace"

The I<current namespace> at runtime is the namespace associated with the
currently executing subroutine.  PASM assigns each subroutine a namespace when
compilation of the subroutine begins.  Don't change the associated namespace
of a subroutine unless you're prepared for weird consequences.

(PASM also has its own separate concept of current namespace which is used to
initialize the runtime current namespace as well as determine where to store
compiled symbols.)


=head1 IMPLEMENTATION

=head2 Namespace Indexing Syntax

Namespaces are denoted in Parrot as simple strings, multidimensional
hash keys, or arrays of name strings.

A namespace may appear in Parrot source code as the string C<"a"> or the
key C<["a"]>.

A nested namespace "b" inside the namespace "a" will appear as the key
C<["a"; "b"]>.

There is no limit to namespace nesting.

=head2 Naming Conventions

Parrot's target languages have a wide variety of namespace models. By
implementing an API and standard conventions, it should be possible to
allow interoperability while still allowing each one to choose the best
internal representation.


=over 4

=item True Root Namespace

The true root namespace is hidden from common usage, but it is available
via the C<get_root_namespace> opcode. For example:

  $P0 = get_root_namespace

=item HLL Root Namespaces

Each HLL must store public items in a namespace named with the lowercased
name of the HLL.  This is the HLL root namespace.  For instance, Tcl's
user-created namespaces should live in the C<tcl> namespace.  This
eliminates any accidental collisions between languages.

An HLL root namespace must be stored at the first level in Parrot's namespace
hierarchy.  These top-level namespaces should also be specified in a standard
Unicode encoding.  The reason for these restrictions is to allow compilers
to remain completely ignorant of each other.

=item HLL Implementation Namespaces

Each HLL must store implementation internals (private items) in an HLL
root namespace named with an underscore and the lowercased name of the
HLL.  For instance, Tcl's implementation internals should live in the
C<_tcl> namespace.

=item HLL User-Created Namespaces

Each HLL must store all user-created namespaces under the HLL root
namespace.  It is suggested that HLLs use hierarchical namespaces to the
extent practical.  A single flat namespace can be made to work, but it
complicates symbol exportation.

=back
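
As an illustration of the layout described above, here is a minimal sketch.
It assumes a Tcl implementation that has already created its C<tcl> and
C<_tcl> namespaces, and it uses only the keyed form of the
C<get_root_namespace> opcode described in this document:

  .sub main :main
    # Public items: the HLL root namespace for Tcl.
    $P0 = get_root_namespace ["tcl"]

    # Private implementation internals for the same HLL.
    $P1 = get_root_namespace ["_tcl"]
  .end

Both lookups start at the true root namespace, which is why the first key
element is the HLL name rather than a user-level namespace.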

=head2 Namespace PMC API

Most languages leave their symbols plain, which makes lookups quite
straightforward.  Others use sigils or other mangling techniques, complicating
the problem of interoperability.

Parrot namespaces assist with interoperability by providing two interface
subsets: the I<untyped interface> and the I<typed interface>.

=head3 Untyped Interface

Each HLL may, when working with its own namespace objects, use the I<untyped
interface>, which allows direct naming in the native style of the namespace's
HLL.

This interface consists of the standard Parrot hash interface, with all its
keys, values, lookups, deletions, etc.  Just treat the namespace like a
hash.  (It probably is one, really, deep down.)

The untyped interface also has one method:

=over 4

=item C<get_name>

    $P1 = $P2.get_name()

Gets the name of the namespace $P2 as an array of strings.  For example,
if $P2 is a Perl 5 namespace "Some::Module", within the Perl 5 HLL, then
get_name() on $P2 returns an array of "perl5", "Some", "Module". It
returns the literal namespace names as the HLL stored them, without
filtering for name mangling.

NOTE: Due to aliasing, this value may be wrong -- i.e. it may disagree with
the namespace name with which you found the namespace in the first place.

=back
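
A minimal sketch of the untyped interface, assuming the code runs in the same
HLL that owns the namespace.  The symbol name "answer" and the use of an
C<Integer> PMC are illustrative choices only:

  .sub main :main
    # The namespace of the currently executing subroutine.
    $P0 = get_namespace

    # Store and fetch a symbol exactly as with a hash.
    $P1 = new 'Integer'
    $P1 = 42
    $P0["answer"] = $P1
    $P2 = $P0["answer"]

    # Ask the namespace for its own name, as an array of strings.
    $P3 = $P0.get_name()
  .end

Because no name translation happens here, the key must already be in the form
the owning HLL stores, including any sigil that HLL uses.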

=head3 Typed Interface

When a given namespace's HLL is either different from the current HLL or
unknown, an HLL should generally use only the language-agnostic namespace
interface.  This interface isolates HLLs from each others' naming quirks.  It
consists of C<add_foo()>, C<find_foo()>, and C<del_foo()> methods, for
values of "foo" including "sub" (something executable), "namespace"
(something in which to find more names), and "var" (anything).

NOTE: The job of the typed interface is to bridge I<naming> differences, and
I<only> naming differences.  Therefore: 1) It does not enforce, nor even
notice, the interface requirements of "sub" or "namespace": e.g.
execution of C<add_sub("foo", $P0)> does I<not> automatically guarantee
that $P0 is an invokable subroutine; and 2) it does not prevent
overwriting one type with another.

=over 4

=item C<add_namespace>

    $P1.add_namespace($S2, $P3)

Store $P3 as a namespace under the namespace $P1, with the name of $S2.

=item C<add_sub>

    $P1.add_sub($S2, $P3)

Store $P3 as a subroutine with the name of $S2 in the namespace $P1.

=item C<add_var>

    $P1.add_var($S2, $P3)

Store $P3 as a variable with the name of $S2 in the namespace $P1.

IMPLEMENTATION NOTE: Perl namespace implementations may choose to implement
add_var() by checking which parts of the variable interface are
implemented by $P3 (scalar, array, and/or hash) so it can decide on an
appropriate sigil.

=item C<del_namespace>, C<del_sub>, C<del_var>

    $P1.del_namespace($S2)
    $P1.del_sub($S2)
    $P1.del_var($S2)

Delete the sub, namespace, or variable named $S2 from the namespace $P1.

=item C<find_namespace>, C<find_sub>, C<find_var>

    $P1 = $P2.find_namespace($S3)
    $P1 = $P2.find_sub($S3)
    $P1 = $P2.find_var($S3)

Find the sub, namespace, or variable named $S3 in the namespace $P2.

IMPLEMENTATION NOTE: Perl namespace implementations should implement
find_var() to check all variable sigils, but the order is not to be counted on
by users.  If you're planning to let Python code see your module, you should
avoid exporting both C<our $A> and C<our @A>.  (Well, you might want to
consider not exporting variables at all, but that's a style issue.)

=item C<export_to>

    $P1.export_to($P2, $P3)

Export items from the namespace $P1 into the namespace $P2.  The items to
export are named in $P3, which may be an array of strings, a hash, or null.
If $P3 is an array of strings, interpretation of items in an array follows
the conventions of the source (exporting) namespace.
If $P3 is a hash, the keys correspond to the names in the source namespace,
and the values correspond to the names in the destination namespace.
If a hash value is null or an empty string, the name in the hash key is used.
A null $P3 requests the 'default' set of items.
Any other type passed into $P3 throws an exception.

The base Parrot namespace export_to() function interprets item names as
literals -- no wildcards or other special meaning.  There is no default list
of items to export, so $P3 of null and $P3 of an empty array have the same
behavior.

NOTE: Exportation may entail non-obvious, odd, or even mischievous behavior.
For example, Perl's pragmata are implemented as exports, and they don't
actually export anything.

IMPLEMENTATION EXAMPLES: Suppose a Perl program were to import some Tcl module
with an import pattern of "c*" -- something that might be expressed in Perl 6
as C<use tcl:Some::Module 'c*'>.  This operation would import all the commands
that start with 'c' from the given Tcl namespace into the current Perl
namespace.  This is so because, regardless of whether 'c*' is a Perl 6 style
export pattern, it I<is> a valid Tcl export pattern.

{XXX - The ':' for HLL is just proposed. This example will need to be
updated later.}

IMPLEMENTATION NOTE: Most namespace C<export_to> implementations will restrict
themselves to using the typed interface on the target namespace.  However,
they may also decide to check the type of the target namespace and, if it
turns out to be of a compatible type, to use same-language shortcuts.

DESIGN TODO: Figure out a good convention for a default export list in the
base namespace PMC.  Maybe a standard method "expand_export_list()"?

=back
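
The typed methods above can be combined as in the following sketch.  The
namespace C<["Some"; "Module"]>, the symbol names "greet" and "hello", and the
local variable names are all hypothetical.  The sketch assumes the source
namespace exists and contains the sub:

  .sub main :main
    .local pmc src_ns
    .local pmc dest_ns
    .local pmc renames
    src_ns  = get_hll_namespace ["Some"; "Module"]
    dest_ns = get_namespace

    # Look up a sub without knowing how the source HLL mangles names,
    # then install it under a different name in the current namespace.
    $P0 = src_ns.find_sub("greet")
    dest_ns.add_sub("hello", $P0)

    # The same rename expressed through export_to() with a hash:
    # key = name in the source, value = name in the destination.
    renames = new 'Hash'
    renames["greet"] = "hello"
    src_ns.export_to(dest_ns, renames)
  .end

Either half is sufficient on its own.  The C<export_to> form leaves the choice
of how to install the item to the source namespace, which matters when the
source language attaches special meaning to exports.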

=head2 Compiler PMC API

=head3 Methods

=over 4

=item C<parse_name>

    $P1 = $P2.parse_name($S3)

Parse the name in $S3 using the rules specific to the compiler $P2, and
return an array of individual name elements.

For example, a Java compiler would turn 'C<a.b.c>' into C<['a','b','c']>,
while a Perl compiler would turn 'C<a::b::c>' into the same result.
Meanwhile, due to Perl's sigil rules, 'C<$a::b::c>' would become
C<['a','b','$c']>.

=item C<get_namespace>

    $P1 = $P2.get_namespace($P3)

Ask the compiler $P2 to find its namespace which is named by the
elements of the array in $P3.  If $P3 is a null PMC or an empty array,
C<get_namespace> retrieves the base namespace for the HLL.  It returns a
namespace PMC on success and a null PMC on failure.

This method allows other HLLs to know one name (the HLL) and then work with
that HLL's modules without having to know the name it chose for its namespace
tree.  (If you really want to know the name, the get_name() method should work
on the returned namespace PMC.)

Note that this method is basically a convenience and/or performance hack, as
it does the equivalent of C<get_root_namespace> followed by
zero or more calls to <namespace>.find_namespace().  However, any compiler is
free to cheat if it doesn't get caught, e.g. to use the untyped namespace
interface if the language doesn't mangle namespace names.

=item C<load_library>

    $P1.load_library($P2, $P3)

Ask this compiler to load a library/module named by the elements of the array
in $P2, with optional control information in $P3.

For example, Perl 5's module named "Some::Module" should be loaded using (in
pseudo Perl 6): C<perl5.load_library(["Some", "Module"], null)>.

The meaning of $P3 is compiler-specific.  The only universal legal value is
Null, which requests a "normal" load.  The meaning of "normal" varies, but
the ideal would be to perform only the minimal actions required.

On failure, an exception is thrown.  

=back
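
A sketch of how another HLL might use these compiler methods together.  It
assumes a Perl 5 compiler has been registered under the name "perl5", which
is an illustrative name and not one fixed by this document:

  .sub main :main
    .local pmc perl5
    perl5 = compreg "perl5"          # assumed registration name

    # Let the compiler split the name according to its own rules.
    $P0 = perl5.parse_name("Some::Module")

    # Look up the namespace; a null PMC signals failure.
    $P1 = perl5.get_namespace($P0)
    if null $P1 goto not_found

    # The full name, including whatever HLL root the compiler chose.
    $P2 = $P1.get_name()
    .return ()

  not_found:
    .return ()
  .end

The caller never needs to know where the Perl 5 namespace tree is rooted.
Only the compiler name and the module name are required.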

=head2 Subroutine PMC API

Some information about subroutines must be available in order to implement
correct namespace behavior.

=head3 Methods

=over 4

=item C<get_namespace>

    $P1 = $P2.get_namespace()

Retrieve the namespace $P1 where the subroutine $P2 was defined. (As
opposed to the namespace(s) that it may have been exported to.)

=back
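
A minimal sketch of this method, using a hypothetical sub C<bar> defined in a
hypothetical namespace C<Foo>:

  .sub main :main
    # Fetch the sub from the namespace where it is stored.
    $P0 = get_global ["Foo"], "bar"

    # Ask the sub where it was defined, then name that namespace.
    $P1 = $P0.get_namespace()
    $P2 = $P1.get_name()
  .end

  .namespace ["Foo"]
  .sub bar
    .return ()
  .end

If C<bar> were later exported into another namespace, C<get_namespace> on the
sub would still be expected to report C<Foo>.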

=head2 Namespace Opcodes

The namespace opcodes all have three variants: one that operates from the
currently selected namespace (i.e. the namespace of the currently
executing subroutine), one that operates from the HLL root namespace
(identified by "hll" in the opcode name), and one that operates from the
true root namespace (identified by "root" in the name).

=over 4

=item C<set_namespace>

    set_namespace [key], $P1
    set_hll_namespace [key], $P1
    set_root_namespace [key], $P1

Add the namespace PMC $P1 under the name denoted by a multidimensional
hash key.

    set_namespace $P1, $P2
    set_hll_namespace $P1, $P2
    set_root_namespace $P1, $P2

Add the namespace PMC $P2 under the name denoted by an array of name
strings $P1.

=item C<get_namespace>

    $P1 = get_namespace
    $P1 = get_hll_namespace
    $P1 = get_root_namespace

Retrieve the current namespace, the HLL root namespace, or the true
root namespace and store it in $P1.

    $P1 = get_namespace [key]
    $P1 = get_hll_namespace [key]
    $P1 = get_root_namespace [key]

Retrieve the namespace denoted by a multidimensional hash key and
store it in C<$P1>.

    $P1 = get_namespace $P2
    $P1 = get_hll_namespace $P2
    $P1 = get_root_namespace $P2

Retrieve the namespace denoted by the array of names $P2 and store it in
C<$P1>.

Thus, to get the "Foo::Bar" namespace from the top-level of the HLL if
the name was known at compile time, you could retrieve the namespace
with a key:

  $P0 = get_hll_namespace ["Foo"; "Bar"]

If the name was not known at compile time, you would retrieve the
namespace with an array instead:

  $P1 = split "::", "Foo::Bar"
  $P0 = get_hll_namespace $P1

=item C<make_namespace>

    $P1 = make_namespace [key]
    $P1 = make_hll_namespace [key]
    $P1 = make_root_namespace [key]

Create and retrieve the namespace denoted by a multidimensional hash key
and store it in C<$P1>. If the namespace already exists, only retrieve
it.

    $P1 = make_namespace $P2
    $P1 = make_hll_namespace $P2
    $P1 = make_root_namespace $P2

Create and retrieve the namespace denoted by the array of names $P2 and
store it in C<$P1>. If the namespace already exists, only retrieve it.

=item C<get_global>

    $P1 = get_global $S2
    $P1 = get_hll_global $S2
    $P1 = get_root_global $S2

Retrieve the symbol named $S2 in the current namespace, HLL root
namespace, or true root namespace.

    $P1 = get_global [key], $S2
    $P1 = get_hll_global [key], $S2
    $P1 = get_root_global [key], $S2

Retrieve the symbol named $S2 by a multidimensional hash key relative
to the current namespace, HLL root namespace, or true root namespace.

    $P1 = get_global $P2, $S3
    $P1 = get_hll_global $P2, $S3
    $P1 = get_root_global $P2, $S3

Retrieve the symbol named $S3 by the array of names $P2 relative to the
current namespace, HLL root namespace, or true root namespace.

=item C<set_global>

    set_global $S1, $P2
    set_hll_global $S1, $P2
    set_root_global $S1, $P2

Store $P2 as the symbol named $S1 in the current namespace, HLL root
namespace, or true root namespace.

    set_global [key], $S1, $P2
    set_hll_global [key], $S1, $P2
    set_root_global [key], $S1, $P2

Store $P2 as the symbol named $S1 by a multidimensional hash key,
relative to the current namespace, HLL root namespace, or true root
namespace.  If the given namespace does not exist it is created.

    set_global $P1, $S2, $P3
    set_hll_global $P1, $S2, $P3
    set_root_global $P1, $S2, $P3

Store $P3 as the symbol named $S2 by the array of names $P1, relative to
the current namespace, HLL root namespace, or true root namespace.  If
the given namespace does not exist it is created.

=back
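
The opcodes above can be combined into a simple round trip.  This sketch uses
the "hll" variants with a hypothetical namespace C<["Foo"; "Bar"]> and a
hypothetical symbol "x".  The C<Integer> PMC is also an illustrative choice:

  .sub main :main
    # Create the namespace, or retrieve it if it already exists.
    $P0 = make_hll_namespace ["Foo"; "Bar"]

    # Store a symbol relative to the HLL root namespace.
    $P1 = new 'Integer'
    $P1 = 5
    set_hll_global ["Foo"; "Bar"], "x", $P1

    # Fetch it back through the same key.
    $P2 = get_hll_global ["Foo"; "Bar"], "x"
  .end

The explicit C<make_hll_namespace> call is not strictly needed here, since
C<set_hll_global> creates a missing namespace.  It is included to show the
opcode in use.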

=head2 HLL Namespace Mapping

For namespaces to follow the semantics of their HLL, Parrot must determine
what type of namespace PMC to create.

=head3 Default Namespace

The default namespace PMC will implement Parrot's current behavior.

=head3 Compile-time Creation

This Perl:

  #!/usr/bin/perl
  package Foo;
  $x = 5;

should map roughly to this PIR:

  .HLL "Perl5", "perl5_group"
  .namespace [ "Foo" ]
  .sub main :main
    $P0 = new 'PerlInt'
    $P0 = 5
    set_global "$x", $P0
  .end

In this case, the C<main> sub would be tied to Perl 5 by the C<.HLL> directive,
so a Perl 5 namespace would be created.

=head3 Run-time Creation

Consider the following Perl 5 program:

  #!/usr/bin/perl
  $a = 'x';
  ${"Foo::$a"} = 5;

The C<Foo::> namespace is created at run-time (without any optimizations).  In
these cases, Parrot should create the namespace based on the HLL of the PIR
subroutine that calls the store function.

  .HLL "Perl5", "perl5_group"
  .sub main :main
    # $a = 'x';
    $P0 = new 'PerlString'
    $P0 = "x"
    set_global "$a", $P0
    # ${"Foo::$a"} = 5;
    $P1 = new 'PerlString'
    $P1 = "Foo::"
    $P1 .= $P0
    $S0 = $P1
    $P2 = split "::", $S0
    $S0 = pop $P2
    $S0 = "$" . $S0
    $P3 = new 'PerlInt'
    $P3 = 5
    set_global $P2, $S0, $P3
  .end

In this case, C<set_global> should see that it was called from "main",
which is in a Perl 5 namespace, so it will create the "Foo" namespace as
a Perl 5 namespace.

=head1 LANGUAGE NOTES

=head2 Perl 6

=head3 Sigils

Perl 6 may wish to be able to access the namespace as a hash with sigils.  That
is certainly possible, even with subroutines and methods.  It's not important
that an HLL use the typed namespace API; it is only important that it provide
it for others to use.

So Perl 6 may implement C<get_keyed> and C<set_keyed> VTABLE slots that
allow the namespace PMC to be used as a hash.  The C<find_sub> method would,
in this case, prepend a "&" sigil to the sub/method name and
search in the internal hash.

=head2 Python

=head3 Importing from Python

Since functions and variables overlap in Python's namespaces, when exporting
to another HLL's namespace, the Python namespace PMC's C<export_to> method
should use introspection to determine whether C<x> should be added using
C<add_var> or C<add_sub>.  C<$I0 = does $P0, "Sub"> may be enough to decide
correctly.
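
A sketch of that introspection step, written as a helper that a Python
namespace PMC's C<export_to> might call once per exported item.  The sub name
C<export_one> and its parameter names are hypothetical, and whether
C<does "Sub"> is a sufficient test is an assumption carried over from the
paragraph above:

  .sub export_one
    .param pmc target_ns
    .param string sym_name
    .param pmc item

    # Decide which typed method to use by asking what the item does.
    $I0 = does item, "Sub"
    if $I0 goto as_sub
    target_ns.add_var(sym_name, item)
    .return ()

  as_sub:
    target_ns.add_sub(sym_name, item)
    .return ()
  .end

Using only C<add_var> and C<add_sub> on the target keeps the helper
independent of the destination HLL's naming rules.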

=head3 Subroutines and Namespaces

Since Python's subroutines and namespaces are just variables (all three share
a single set of names), the Python PMC's C<find_var> method may return
subroutines as variables.


=head2 Examples

=head3 Aliasing

Perl 6:

  #!/usr/bin/perl6
  sub foo {...}
  %Foo::{"&bar"} = &foo;

PIR:

  .sub main :main
    $P0 = get_global "&foo"
    $P1 = get_namespace ["Foo"]

    # A smart perl6 compiler would emit this,
    # because it knows that Foo is a perl6 namespace:
    $P1["&bar"] = $P0

    # But a naive perl6 compiler would emit this:
    $P1.add_sub("bar", $P0)

  .end

  .sub foo
    ...
  .end

=head3 Cross-language Exportation

Perl 6:

  #!/usr/bin/perl
  use tcl:Some::Module 'w*';   # XXX - ':' after HLL may change in Perl 6
  write("this is a tcl command");

PIR (without error checking):

  .sub main :main
    .local pmc tcl
    .local pmc ns
    tcl = compreg "tcl"
    ns = new 'Array'
    ns = 2
    ns[0] = "Some"
    ns[1] = "Module"
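    null $P0
    tcl.load_library(ns, $P0)
    $P0 = tcl.get_namespace(ns)
    $P1 = get_namespace
    # export_to() takes an array of names, a hash, or null, not a bare string
    $P2 = new 'ResizableStringArray'
    push $P2, 'w*'
    $P0.export_to($P1, $P2)
    "write"("this is a tcl command")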
  .end

=head1 ATTACHMENTS

None.

=head1 FOOTNOTES

None.

=head1 REFERENCES

None.

=cut

__END__
Local Variables:
  fill-column:78
End: