PDD 31: HLL Compilers and Libraries

Abstract

This PDD describes the standard compiler API and support for cross-library communication between high-level languages (HLLs).

Description

Parrot's support for HLL interoperability focuses primarily on enabling programs written in one language to use libraries and code written in a different language. At the same time, language implementors should not be overly restricted by a global specification.

This PDD describes an API that HLL compiler objects can use to promote library sharing among languages. It's intended to make it easy for a program to request loading of a local or foreign module, determine the capabilities provided by the module, and potentially import and integrate them into its own namespaces. In general, the API treats library-level interoperability as a negotiation among HLL compiler objects, with each HLL compiler maintaining primary control over the operations performed in its HLL space.

In particular, this HLL API does not attempt to prescribe how languages should organize their internal capabilities, objects, classes, namespaces, methods, data structures, and the like.

Implementation

Compiler API

This section describes the abstract API for HLL compiler objects.

Locating a compiler object

Generally, HLL compilers are loaded via the load_language opcode and register themselves using the compreg opcode. By convention, each HLL compiler should at minimum register itself using the name of its HLL namespace (see PDD 21), although a compiler can choose to register itself under other names as well.

Methods

compile
$P0 = compiler.'compile'(source [, options :named :slurpy])

Return the result of compiling source according to options. Common options include:

target

Stop the compilation process when the stage given by target has been reached. Common values for target include "parse", "past", "pir", and "pbc". The exact target types supported are dependent on the compiler itself and are not limited or standardized.
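The effect of the target option can be modeled outside Parrot. The following Python sketch is illustrative only: the stage names come from the common values listed above, while the stage functions, the compile_source name, and the default target of "pbc" are assumptions made for the example and are not part of any Parrot API.

```python
# Placeholder stages named after the common targets in the text.
# Each one only wraps its input; a real compiler would do actual work here.
def to_parse(source):
    return {"stage": "parse", "tokens": source.split()}

def to_past(parse_tree):
    return {"stage": "past", "nodes": parse_tree["tokens"]}

def to_pir(past):
    return "# pir for: " + " ".join(past["nodes"])

def to_pbc(pir):
    return pir.encode("utf-8")

STAGES = [("parse", to_parse), ("past", to_past), ("pir", to_pir), ("pbc", to_pbc)]

def compile_source(source, target="pbc"):
    """Run the stages in order and stop once the stage named by target is done."""
    if target not in [name for name, _ in STAGES]:
        raise ValueError("unsupported target: %r" % (target,))
    result = source
    for name, stage in STAGES:
        result = stage(result)
        if name == target:
            break
    return result
```

Because the supported targets are compiler-specific, a caller that asks for an unknown target gets an error in this model; an actual compiler may choose different behavior.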

outer_ctx

Use the supplied context as the outer (lexical) context for the compilation. Some languages require this option to be able to look up lexical symbols in outer scopes when performing a dynamic compilation at runtime.

eval
$P0 = compiler.'eval'(source [, args :slurpy] [, options :named :slurpy])

Compile and evaluate (execute) the code given by source with args and according to options. The available options are generally the same as for the compile method above; in particular, the outer_ctx option can be used to specify the outer lexical context for the evaluated source.

parse_name
$P0 = compiler.'parse_name'(name)

Parse the string name using the rules specific to compiler, and return an array of individual name elements.

For example, a Java compiler would turn 'a.b.c' into ['a','b','c'], while a Perl compiler would turn 'a::b::c' into the same result. Perl's sigil rules would likely turn '$a::b::c' into ['a','b','$c'].
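The name-splitting rules in this example can be sketched in Python. This is a model of the behavior described, not code from any Parrot compiler; the function names and the set of recognized sigils are assumptions.

```python
def parse_java_name(name):
    """Split a dotted name into its elements."""
    return name.split(".")

def parse_perl_name(name):
    """Split on '::' and move a leading sigil onto the final element."""
    sigil = ""
    if name and name[0] in "$@%&":  # assumed sigil set, for illustration only
        sigil, name = name[0], name[1:]
    parts = name.split("::")
    parts[-1] = sigil + parts[-1]
    return parts
```

Both functions produce the same element list for equivalent names, which is what lets a caller hand a foreign compiler a name string without knowing that language's separator.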

load_module
module = compiler.'load_module'(name)

Locate and load the module given by name using the rules for libraries specific to compiler, and return a module handle for the module just loaded. The name argument is typically an array or a string to be processed as in parse_name above. In general the module handle returned should be considered opaque by the caller, but specific HLL compilers are allowed to specify the nature of the handle returned (e.g., a namespace for the loaded module, or a specific "handle" object).

get_module
module = compiler.'get_module'(name)

Similar to load_module above, this method returns a handle to an already-loaded module given by name.

get_exports
$P0 = compiler.'get_exports'(module [,name,name,...] [, 'tagset'=>tagset])

Requests the exported objects given by name and/or tagset for module within the given compiler. The module argument should be a module handle as obtained by load_module or get_module above.

A tagset argument provides an identifier that a compiler and/or module can use to supply their own lists of items to be exported. By convention, a tagset of "DEFAULT" refers to the default set of exported items for the module, while "ALL" returns all available exports. Compilers and modules are free to define their own custom tagsets beyond these.

Any name arguments supplied generally limit the export list to the tagset items corresponding to the supplied names (as determined by the compiler invocant). If names are provided without an explicit tagset, then "ALL" is assumed. If neither names nor a tagset are provided, then symbols from "DEFAULT" are returned.

The returned export list is a hash of hashes; each entry in the top level hash has a key identifying the type of exported object (one of 'namespace', 'sub', or 'var') and a value hash containing the corresponding exported symbol names and objects. This hash-of-hashes approach is intended to generally correspond to the "Typed Interface" section of PDD 21 ("Namespaces"), and allows the module's source HLL to indicate the type of exported object to the caller. The hash-of-hashes approach also accommodates languages where a single name might be used to refer to several objects that differ in type. (This PDD explicitly rejects the notion that a HLL should be directly exporting or injecting symbols into a foreign HLL's namespaces.)
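The tagset and name defaulting rules, together with the shape of the returned hash of hashes, can be modeled as follows. This Python sketch is a simplification under stated assumptions: the tagsets table layout, the select_exports name, and the choice to silently skip unknown names are all hypothetical, since the PDD leaves those details to each compiler.

```python
def select_exports(tagsets, names=(), tagset=None):
    """Model of the selection rules.

    tagsets maps a tagset name to {symbol_name: (kind, obj)}, where kind is
    'namespace', 'sub', or 'var'. This layout is an assumption for the model.
    """
    if tagset is None:
        # Names without a tagset imply "ALL"; nothing at all implies "DEFAULT".
        tagset = "ALL" if names else "DEFAULT"
    candidates = tagsets.get(tagset, {})
    if names:
        # Limit the tagset's items to the requested names (unknown names skipped).
        candidates = {n: candidates[n] for n in names if n in candidates}
    exports = {}
    for name, (kind, obj) in candidates.items():
        exports.setdefault(kind, {})[name] = obj
    return exports

# Hypothetical export table for a module.
TAGSETS = {
    "DEFAULT": {"boom": ("sub", "SUB boom")},
    "ALL": {
        "boom": ("sub", "SUB boom"),
        "fuse": ("var", 3),
        "Acme": ("namespace", "NS Acme"),
    },
}
```

With this table, a call with no names and no tagset yields only the "DEFAULT" sub, while a call naming "fuse" draws from "ALL" and returns it under the 'var' key.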

HLL::Compiler class

HLL::Compiler is a common base class for compiler objects based on the Parrot Compiler Toolkit (PCT) and NQP (Not Quite Perl) libraries. It provides a default implementation of the abstract Compiler API above, plus some additional methods for simple symbol table export and import. The default methods are intended to support importing and exporting symbols using standard Parrot namespace objects (PDD 21). However, it's normal (and expected) that languages will subclass HLL::Compiler to provide language-specific semantics where needed.

Methods

language
$S0 = compiler.'language'([name])

If name is provided, sets the language name of the invocant and registers the invocant as the compiler for name via the compreg opcode.

Returns the language name of the compiler.

parse_name
$P0 = compiler.'parse_name'(name)

Splits a name based on double-colons, such that "A::B::C" becomes ['A','B','C'].

get_module
module = compiler.'get_module'(name)

Returns a handle to the HLL namespace associated with name (which is processed via the invocant's parse_name method if needed).

load_module
module = compiler.'load_module'(name)

Loads a module name via the load_bytecode opcode using both ".pbc" and ".pir" extensions. Parrot's standard library paths for load_bytecode are searched.

Returns the HLL namespace associated with name (which may be PMCNULL if loading failed or if the requested module did not create an associated namespace).

get_exports
$P0 = compiler.'get_exports'(module [,name,name,...] [, 'tagset'=>tagset])

Implements a simple exporting interface that meets the "Compiler API" above. The module argument is expected to be something that supports a hash interface, such as NameSpace or LexPad. (Note that this is what gets returned by the default get_module and load_module methods above.) The module["EXPORT"] entry should return another hash-like object keyed by tagset names; each of those tagset names then identifies the exportable symbols associated with that tagset.

With this default arrangement, it's entirely possible for a module to indicate its tagsets by using symbol entries in namespaces. For example, a module with namespace ['XYZ'] can define its default exports by binding symbols in the ['XYZ';'EXPORT';'DEFAULT'] namespace. (Modules aren't required to use exactly this mechanism; it's just one possibility of many.)

If the "ALL" tagset is requested and there is no "ALL" entry in the module['EXPORT'] hash, then module itself is used as the source of exportable symbols for this method. This enables get_exports to be used to obtain symbols from modules that do not follow the "EXPORT" convention above (e.g., core Parrot modules).

As described in the Compiler API section above, the return value from get_exports is a hash-of-hashes with exported namespaces in the namespace hash, exported subroutines in the sub hash, and all other exports in the var hash.
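The default behavior described in this section, including the fallback to the module itself when "ALL" is requested but not defined, can be approximated in Python. This is a sketch rather than the HLL::Compiler implementation: the NameSpace stand-in class, the classify helper, and the use of Python callables to represent subroutines are assumptions.

```python
class NameSpace(dict):
    """Stand-in for a Parrot namespace: any hash-like container."""

def classify(obj):
    """Sort an exported object into 'namespace', 'sub', or 'var'."""
    if isinstance(obj, NameSpace):
        return "namespace"
    if callable(obj):  # Python callables stand in for Parrot subs here
        return "sub"
    return "var"

def default_get_exports(module, *names, tagset=None):
    if tagset is None:
        tagset = "ALL" if names else "DEFAULT"
    source = module.get("EXPORT", {}).get(tagset)
    if source is None:
        # No such tagset: "ALL" falls back to the module itself.
        source = module if tagset == "ALL" else {}
    selected = names if names else list(source)
    result = {"namespace": {}, "sub": {}, "var": {}}
    for name in selected:
        if name in source:
            result[classify(source[name])][name] = source[name]
    return result

def boom():
    return "boom"

# A module that follows the EXPORT convention, and one that does not.
xyz = NameSpace(EXPORT=NameSpace(DEFAULT=NameSpace(boom=boom)), boom=boom, fuse=3)
core = NameSpace(boom=boom, fuse=3, Inner=NameSpace())
```

Here default_get_exports(xyz) returns only boom from the "DEFAULT" tagset, while default_get_exports(core, tagset="ALL") classifies every symbol in core even though core has no "EXPORT" entry.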

import
compiler.'import'(target, export_hash)

Import the entries from export_hash (typically obtained via get_exports above) into target according to the rules for compiler. Any entries in export_hash['namespace'] are imported first, followed by entries in export_hash['sub'], followed by entries in export_hash['var'].

Note that this method is not part of the abstract Compiler API -- a HLL compiler is able to implement importing in any way it deems appropriate. The HLL::Compiler class provides this method as a useful default for many HLL compilers.

For each exported item of export_hash, import takes place by checking the invocant for an import_[type] method and using that if it exists (where [type] is one of "namespace", "sub", or "var"). These methods are used to implement "typed imports", and allow the compiler object to perform any name mangling or other operations needed to properly import an object.

If the compiler invocant doesn't define an import_[type] method, import attempts to use any add_[type] method that exists on target (e.g., for the case where target is a namespace PMC supporting the typed interface defined by PDD 21).

If neither of these methods is available, then import simply binds the symbol using target's hash interface.
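The three-step dispatch described above (compiler hook, then target hook, then plain hash binding) can be sketched in Python. The method argument order, the example classes, and the name mangling they perform are assumptions made for illustration; only the lookup order and the namespace, sub, var sequence come from the text.

```python
def import_exports(compiler, target, export_hash):
    """Import entries in the order namespace, sub, var."""
    for kind in ("namespace", "sub", "var"):
        for name, obj in export_hash.get(kind, {}).items():
            compiler_hook = getattr(compiler, "import_" + kind, None)
            target_hook = getattr(target, "add_" + kind, None)
            if compiler_hook is not None:
                compiler_hook(target, name, obj)  # typed import on the compiler
            elif target_hook is not None:
                target_hook(name, obj)            # typed interface on the target
            else:
                target[name] = obj                # plain hash binding

class PlainCompiler:
    """Hypothetical compiler with no import_[type] methods."""

class SigilCompiler:
    """Hypothetical compiler that mangles variable names on import."""
    def import_var(self, target, name, obj):
        target["$" + name] = obj

class TypedTarget(dict):
    """Hypothetical target offering a typed add_sub method."""
    def add_sub(self, name, obj):
        self["&" + name] = obj

EXPORTS = {"sub": {"boom": "SUB boom"}, "var": {"fuse": 3}}
```

Importing EXPORTS with PlainCompiler into an ordinary dict binds both names unchanged, whereas SigilCompiler with a TypedTarget stores the variable through the compiler's hook and the sub through the target's add_sub.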

Examples

Importing a module Acme::Boom from language xyz into language abc

References

pdd21_namespaces.pod