NAME

Language::MuldisD::PerlHosted - How to format Perl hosted Abstract Muldis D

VERSION

This document is Language::MuldisD::PerlHosted version 0.3.0.

PREFACE

This document is part of the Muldis D language specification, whose root document is Language::MuldisD; you should read that root document before you read this one, which provides subservient details.

DESCRIPTION

This document outlines the specification of Abstract Muldis D as hosted in either Perl 5 or Perl 6, and as composed of just|mainly core Perl types; for brevity, the term PHMD will be used to refer to this spec.

Where Perl 5 and 6 differ, this documentation uses Perl 6 terminology and examples by default, and adds analogous Perl 5 terminology as necessary.

Fundamentally, the various Muldis D scalar and collection types are represented by their equivalent Perl 5 or 6 native scalar and collection types. But since Muldis D is more strongly typed, or at least differently typed, than Perl, each Muldis D literal is represented by a Perl Array, whose elements include both the payload Perl literal and explicit meta-data for how to interpret that Perl literal for mapping to Muldis D.

This document mainly just specifies a way to represent Muldis D values as Perl values. Since the fundamental way to do data definition in Muldis D is to update catalog (information schema) variables, aka the Muldis D meta-model, which are themselves just data, this document only needs to tell you how to define values to put in the catalog variables. Defining data types or routines is done by defining catalog values describing them.

See instead Language::MuldisD::Core for how to actually define the tuples and relations that define your data types and routines and queries and so forth.

Note that this document (along with the aforementioned) is also intended to serve as a proposal for a generic portable AST that various Perl applications and components can use to represent their database schemas and queries, regardless of whether a native Muldis D implementation is in use; or this document can be used as a point of departure for documenting some alternative AST for that purpose.

This documentation is pending.

GENERAL STRUCTURE

A PHMD value is composed mainly of a tree of Perl Arrays, such that each Array is a tree node. The elements of each node/Array typically include a native Perl payload value, which may be a PHMD value itself, plus meta-data for that payload; that meta-data typically includes the analogy of a class name, were PHMD nodes instead represented by a tree of PHMD-specific objects.

It should be emphasized that no Perl undefined values are allowed anywhere in a PHMD value; you must use only defined values instead. This documentation also assumes that only defined values are used, and that supplying a Perl undef will result in an error. If you genuinely want to represent that a value is unknown, then the Maybe node type is provided as one way you can explicitly say so. This policy may be reconsidered.
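The no-undefined-values rule can be checked mechanically. The following Perl 5 sketch uses a hypothetical helper, has_no_undef, which is not part of any Muldis D distribution; it assumes that a PHMD value is built only from plain scalars, Arrays and Hashes, and treats anything else as an opaque leaf.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if no undef appears anywhere in a
# nested structure of Perl scalars, Arrays and Hashes, else returns 0.
sub has_no_undef {
    my ($value) = @_;
    return 0 if !defined $value;
    my $kind = ref $value;
    if ($kind eq 'ARRAY') {
        for my $elem (@{$value}) {
            return 0 if !has_no_undef($elem);
        }
    }
    elsif ($kind eq 'HASH') {
        for my $elem (values %{$value}) {
            return 0 if !has_no_undef($elem);
        }
    }
    # Plain defined scalars (and any other reference kinds) count as leaves.
    return 1;
}
```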

The root Perl Array of a PHMD value has 3 elements, which are:

  • The AST language/schema name, specifically the Perl Str value MuldisD for this spec.

  • The AST language/schema version plus authority, encoded either as a single Perl Str value, or as a Perl Array value containing separate values for the language version and authority.

  • The payload, which is any other PHMD node.

Examples of a root node:

[ 'MuldisD', [ '1.2.3', 'cpan:DUNCAND' ],
    [ 'Bool', 'md_enum', 'false' ] ]

[ 'MuldisD', ':ver<1.2.3>:auth<cpan:DUNCAND>',
    [ 'Bool', 'md_enum', 'false' ] ]

Both the payload node under the root node and every other node is a Perl Array with usually 2+ elements, where the first element is a Perl Str saying what kind of node it is, and the last element is the typically-single payload, and any sometimes-optional intermediate elements give extra meta-data to specify which of possibly several representation formats the payload is, so that it is correctly interpreted. Typically speaking, only the payload element is a Perl collection type, and typically all the other elements are Perl scalars.
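As a rough Perl 5 illustration of the root node shape described above, the sketch below checks only the outer structure. The helper name is_phmd_root is hypothetical, and the requirement that the Array form of the version plus authority has exactly 2 string elements is an assumption of this sketch rather than something this document states.

```perl
use strict;
use warnings;

# Hypothetical helper: checks only the outer shape of a PHMD root node.
sub is_phmd_root {
    my ($node) = @_;
    return 0 if ref($node) ne 'ARRAY' || @{$node} != 3;
    my ($lang, $ver_auth, $payload) = @{$node};
    # No undefined values are allowed anywhere in a PHMD value.
    return 0 if grep { !defined($_) } $lang, $ver_auth, $payload;
    return 0 if ref($lang) || $lang ne 'MuldisD';
    # Version plus authority: one string, or (assumed) an Array of 2 strings.
    if (ref $ver_auth) {
        return 0 if ref($ver_auth) ne 'ARRAY' || @{$ver_auth} != 2;
        return 0 if grep { !defined($_) || ref($_) } @{$ver_auth};
    }
    # The payload is itself a node: an Array whose first element names its kind.
    return 0 if ref($payload) ne 'ARRAY' || !@{$payload};
    return 0 if !defined($payload->[0]) || ref($payload->[0]);
    return 1;
}
```

The check is deliberately shallow; validating the payload node itself is left to per-node-type logic.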

SCALAR VALUES

sys.Core.Bool.Bool

This node type represents a logical boolean value. It has 3 elements:

  • Node type: the Perl Str value Bool.

  • Format; one of: md_enum, perl_bool, any_perl.

  • The payload.

This node is interpreted as a Muldis D sys.Core.Bool.Bool value as follows:

  • If the format is md_enum, then the payload must be a Perl Str having one of the values false, true. This format specifically is what the Concrete Muldis D grammar uses, and is the result of parsing it.

  • If the format is perl_bool, then: Under Perl 6, the payload must be a Perl Bool, and so Bool::False and Bool::True are mapped directly. Under Perl 5, the payload must be just the specific result of a Perl 5 logical expression, such as (1 == 0) or (1 == 1), and nothing else; said values are probably the empty string and number 1, respectively.

  • If the format is any_perl, then the payload may be any Perl value, and it is simply coerced into a boolean context as per Perl's own semantics; typically for built-in scalars, the empty string and number zero are considered false, and everything else true.

Examples:

[ 'Bool', 'md_enum', 'true' ]

[ 'Bool', 'perl_bool', Bool::False ] # Perl 6 only

[ 'Bool', 'perl_bool', (1 == 0) ]

[ 'Bool', 'any_perl', 42 ]
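The three Bool formats might be interpreted along the following lines under Perl 5. This is a minimal sketch; bool_from_node is a hypothetical helper, and its perl_bool branch assumes that a Perl 5 logical expression yields either 1 or the empty string, which the text above describes only as probable.

```perl
use strict;
use warnings;

# Hypothetical helper: maps a PHMD Bool node to a Perl 5 truth value,
# returned as 1 for true or the empty string for false.
sub bool_from_node {
    my ($node) = @_;
    my ($type, $format, $payload) = @{$node};
    die "not a Bool node\n" if $type ne 'Bool';
    die "undefined payload\n" if !defined $payload;
    if ($format eq 'md_enum') {
        return 1  if $payload eq 'true';
        return '' if $payload eq 'false';
        die "md_enum payload must be 'false' or 'true'\n";
    }
    if ($format eq 'perl_bool') {
        # Assumption: accept only the usual results of a logical expression.
        return 1  if !ref($payload) && $payload eq '1';
        return '' if !ref($payload) && $payload eq '';
        die "perl_bool payload must be the result of a logical expression\n";
    }
    if ($format eq 'any_perl') {
        # Plain coercion under Perl's own truthiness rules.
        return $payload ? 1 : '';
    }
    die "unknown Bool format '$format'\n";
}
```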

sys.Core.Order.Order

This node type represents an order-determination. It has 3 elements:

  • Node type: the Perl Str value Order.

  • Format; one of: md_enum, perl_order.

  • The payload.

This node is interpreted as a Muldis D sys.Core.Order.Order value as follows:

  • If the format is md_enum, then the payload must be a Perl Str having one of the values increase, same, decrease. This format specifically is what the Concrete Muldis D grammar uses, and is the result of parsing it.

  • If the format is perl_order, then: Under Perl 6, the payload must be a Perl Order, and so Order::Increase and Order::Same and Order::Decrease are mapped directly. Under Perl 5, the payload must be just the specific result of a Perl 5 order-determining expression, such as (1 <=> 2) or (1 <=> 1) or (2 <=> 1), and nothing else; said values are probably the numbers [-1, 0, 1], respectively.

Examples:

[ 'Order', 'md_enum', 'same' ]

[ 'Order', 'perl_order', Order::Increase ] # Perl 6 only

[ 'Order', 'perl_order', (2 <=> 1) ]
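A Perl 5 reader for Order nodes could look roughly like this. The helper order_from_node is hypothetical; it normalizes both formats to the numbers -1, 0 and 1, following the correspondence given above in which increase pairs with the result of (1 <=> 2).

```perl
use strict;
use warnings;

# Enum names mapped to the results of the Perl 5 <=> operator.
my %ORDER_FROM_ENUM = ( increase => -1, same => 0, decrease => 1 );

# Hypothetical helper: maps a PHMD Order node to -1, 0 or 1.
sub order_from_node {
    my ($node) = @_;
    my ($type, $format, $payload) = @{$node};
    die "not an Order node\n" if $type ne 'Order';
    die "undefined payload\n" if !defined $payload;
    if ($format eq 'md_enum') {
        die "md_enum payload must be increase, same or decrease\n"
            if !exists $ORDER_FROM_ENUM{$payload};
        return $ORDER_FROM_ENUM{$payload};
    }
    if ($format eq 'perl_order') {
        die "perl_order payload must be the result of an order comparison\n"
            if $payload !~ /\A(?:-1|0|1)\z/;
        return $payload + 0;
    }
    die "unknown Order format '$format'\n";
}
```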

sys.Core.Int.Int

This node type represents an integer value. It has 3-4 elements:

  • Node type: the Perl Str value Int.

  • Format; one of: md_int, perl_int, any_perl.

  • Only when format is md_int; the max-col-val.

  • The payload.

This node is interpreted as a Muldis D sys.Core.Int.Int value as follows:

  • If the format is md_int, then the max-col-val must be a Perl Str composed of a single [1-9A-Z] character, and the payload must be a Perl Str of the format 0 or \-?<[1-9A-Z]><[0-9A-Z]>*. This format specifically is what the Concrete Muldis D grammar uses, and is the result of parsing it. The payload is interpreted as a base-N integer where N might be between 2 and 36, and the given max-col-val says which possible value of N to use. Assuming all column values are between zero and N-minus-one, the max-col-val contains that N-minus-one. So to specify, eg, bases [2,8,10,16], use max-col-val of [1,7,9,F].

  • If the format is perl_int, then: Under Perl 6, the payload must be a Perl Int, which is mapped directly. Under Perl 5, the payload must be just a canonical integer value according to Perl.

  • If the format is any_perl, then the payload may be any Perl value, and it is simply coerced into an integer context as per Perl's own semantics, meaning base-10 where applicable. If something doesn't look numeric, it becomes zero; if something looks like a fractional number, it is truncated.

Examples:

[ 'Int', 'md_int', '1', '11001001' ] # binary

[ 'Int', 'md_int', '7', '0' ] # octal

[ 'Int', 'md_int', '7', '644' ] # octal

[ 'Int', 'md_int', '9', '-34' ] # decimal

[ 'Int', 'md_int', '9', '42' ] # decimal

[ 'Int', 'md_int', 'F', 'DEADBEEF' ] # hexadecimal

[ 'Int', 'md_int', 'Z', '-HELLOWORLD' ] # base-36

[ 'Int', 'perl_int', 21 ]

[ 'Int', 'any_perl', ' 171 ' ]
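To make the max-col-val rule concrete, here is a minimal Perl 5 sketch that decodes an md_int node. The helpers digit_value and int_from_md_int_node are hypothetical names. The sketch returns a native Perl number, so it assumes the value fits in one; an arbitrary-precision type such as the core Math::BigInt module would be needed for larger payloads.

```perl
use strict;
use warnings;

# Value of a single [0-9A-Z] column: 0..9, then A=10 through Z=35.
sub digit_value {
    my ($char) = @_;
    return $char =~ /\A[0-9]\z/ ? $char + 0 : ord($char) - ord('A') + 10;
}

# Hypothetical helper: decodes an md_int node into a native Perl integer.
sub int_from_md_int_node {
    my ($node) = @_;
    my ($type, $format, $max_col_val, $payload) = @{$node};
    die "not an md_int node\n" if $type ne 'Int' || $format ne 'md_int';
    die "bad max-col-val\n" if $max_col_val !~ /\A[1-9A-Z]\z/;
    die "bad payload\n" if $payload !~ /\A(?:0|-?[1-9A-Z][0-9A-Z]*)\z/;
    # The max-col-val holds N-minus-one, so the base N is one more.
    my $base = digit_value($max_col_val) + 1;
    my ($sign, $digits) = $payload =~ /\A(-?)(.+)\z/;
    my $value = 0;
    for my $char (split //, $digits) {
        my $digit = digit_value($char);
        die "column '$char' is out of range for base $base\n"
            if $digit >= $base;
        $value = $value * $base + $digit;
    }
    return $sign ? -$value : $value;
}
```

For instance, the binary example '11001001' with max-col-val '1' decodes to 201, and the octal example '644' with max-col-val '7' decodes to 420.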

sys.Core.Int.UInt

This node type represents an unsigned / non-negative integer value; it is interpreted as a Muldis D sys.Core.Int.UInt. Its format is the same as for sys.Core.Int.Int but that the node type is 'UInt', its formats are respectively named for 'uint', and the payload may not have a leading -.

Examples:

[ 'UInt', 'md_uint', '3', '301' ] # base-4

[ 'UInt', 'perl_uint', 0 ]

sys.Core.Int.PInt

This node type represents a positive integer value; it is interpreted as a Muldis D sys.Core.Int.PInt. Its format is the same as for sys.Core.Int.UInt but that the node type is 'PInt', formats 'pint', and the payload may not be 0.

Examples:

[ 'PInt', 'md_pint', 'B', 'A09B' ] # base-12

[ 'PInt', 'perl_pint', 101 ]

sys.Core.Rat.Rat

This node type represents a rational value. It has 3-4 elements:

  • Node type: the Perl Str value Rat.

  • Format; one of: md_rat, perl_rat, any_perl.

  • Only when format is md_rat; the max-col-val.

  • The payload.

This node is interpreted as a Muldis D sys.Core.Rat.Rat value as follows:

  • If the format is md_rat, then the max-col-val must be a Perl Str composed of a single [1-9A-Z] character, and the payload must be a Perl Str of the format 0 or \-?<[1-9A-Z]><[0-9A-Z]>*\.?<[0-9A-Z]>*. This format specifically is what the Concrete Muldis D grammar uses, and is the result of parsing it. The payload is interpreted as a base-N rational where N might be between 2 and 36, and the given max-col-val says which possible value of N to use. Assuming all column values are between zero and N-minus-one, the max-col-val contains that N-minus-one. So to specify, eg, bases [2,8,10,16], use max-col-val of [1,7,9,F].

  • If the format is perl_rat, then: Under Perl 6, the payload must be a Perl Rat (or Num), which is mapped directly. Under Perl 5, the payload must be just a canonical rational or numeric value according to Perl.

  • If the format is any_perl, then the payload may be any Perl value, and it is simply coerced into a numeric context as per Perl's own semantics, meaning base-10 where applicable. If something doesn't look numeric, it becomes zero.

Examples:

[ 'Rat', 'md_rat', '1', '-1.1' ]

[ 'Rat', 'md_rat', '9', '-1.5' ] # same val as prev

[ 'Rat', 'md_rat', '9', '3.14159' ]

[ 'Rat', 'md_rat', 'A', '0.0' ]

[ 'Rat', 'md_rat', 'F', 'DEADBEEF.FACE' ]

[ 'Rat', 'md_rat', 'Z', '0.000AZE' ]

[ 'Rat', 'perl_rat', 21.003 ]

[ 'Rat', 'any_perl', ' 54.67 ' ]
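The md_rat format can be decoded exactly, without floating point, by reading all columns as one base-N integer and dividing by N raised to the number of fractional columns. The Perl 5 sketch below does this with hypothetical helpers digit_value and ratio_from_md_rat_node. It assumes the result fits in native Perl numbers, and it uses a slightly looser payload pattern than the one stated above so that payloads such as '0.0' from the examples are accepted.

```perl
use strict;
use warnings;

# Value of a single [0-9A-Z] column: 0..9, then A=10 through Z=35.
sub digit_value {
    my ($char) = @_;
    return $char =~ /\A[0-9]\z/ ? $char + 0 : ord($char) - ord('A') + 10;
}

# Hypothetical helper: decodes an md_rat node into an unreduced
# [ numerator, denominator ] pair of native Perl numbers.
sub ratio_from_md_rat_node {
    my ($node) = @_;
    my ($type, $format, $max_col_val, $payload) = @{$node};
    die "not an md_rat node\n" if $type ne 'Rat' || $format ne 'md_rat';
    die "bad max-col-val\n" if $max_col_val !~ /\A[1-9A-Z]\z/;
    # Assumption: leading zero columns are tolerated, as in '0.0'.
    my ($sign, $whole, $fraction)
        = $payload =~ /\A(-?)([0-9A-Z]+)(?:\.([0-9A-Z]*))?\z/
        or die "bad payload\n";
    $fraction = '' if !defined $fraction;
    my $base = digit_value($max_col_val) + 1;
    my $numerator = 0;
    for my $char (split //, $whole . $fraction) {
        my $digit = digit_value($char);
        die "column '$char' is out of range for base $base\n"
            if $digit >= $base;
        $numerator = $numerator * $base + $digit;
    }
    my $denominator = $base ** length($fraction);
    return [ ($sign ? -$numerator : $numerator), $denominator ];
}
```

With this sketch, the binary payload '-1.1' yields the pair -3 over 2 and the decimal payload '-1.5' yields -15 over 10, which cross-multiply to the same value, matching the note on the second example above.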

sys.Core.Rat.URat

This node type represents an unsigned / non-negative rational value; it is interpreted as a Muldis D sys.Core.Rat.URat. Its format is the same as for sys.Core.Rat.Rat but that the node type is 'URat', its formats are respectively named for 'urat', and the payload may not have a leading -.

Examples:

[ 'URat', 'md_urat', '6', '500.001' ]

[ 'URat', 'perl_urat', 0.01 ]

sys.Core.Rat.PRat

This node type represents a positive rational value; it is interpreted as a Muldis D sys.Core.Rat.PRat. Its format is the same as for sys.Core.Rat.URat but that the node type is 'PRat', formats 'prat', and the payload may not be 0.

Examples:

[ 'PRat', 'md_prat', 'B', 'A09.B' ]

[ 'PRat', 'perl_prat', 0.101 ]

sys.Core.Blob.Blob

This node type represents a bit string. It has 3-4 elements:

  • Node type: the Perl Str value Blob.

  • Format; one of: md_blob, perl_blob.

  • Only when format is md_blob; the max-col-val.

  • The payload.

This node is interpreted as a Muldis D sys.Core.Blob.Blob value as follows:

  • If the format is md_blob, then the max-col-val must be a Perl Str composed of a single [137F] character, and the payload must be a Perl Str of the format <[0-9A-F]>*. This format specifically is what the Concrete Muldis D grammar uses, and is the result of parsing it. Each column of the payload specifies a sequence of one of [1,2,3,4] bits, depending on whether max-col-val is [1,3,7,F].

  • If the format is perl_blob, then: Under Perl 6, the payload must be a Perl Blob, which is mapped directly. Under Perl 5, the payload must be just a canonical Perl bit string, which is a scalar whose utf-8 flag is false.

Examples:

[ 'Blob', 'md_blob', '1', '00101110100010' ] # binary

[ 'Blob', 'md_blob', '3', '' ]

[ 'Blob', 'md_blob', 'F', 'A705E' ] # hexadecimal

[ 'Blob', 'perl_blob', (pack 'H2', 'P') ]
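The relationship between max-col-val and bits per column in md_blob can be sketched in Perl 5 as follows. The helper bits_from_md_blob_node is hypothetical, and for readability it returns the bit string as text made of 0 and 1 characters rather than as a packed Perl bit string.

```perl
use strict;
use warnings;

# Bits contributed by each payload column, keyed by max-col-val.
my %BITS_PER_COLUMN = ( '1' => 1, '3' => 2, '7' => 3, 'F' => 4 );

# Hypothetical helper: expands an md_blob node into a string of
# '0' and '1' characters, one character per bit.
sub bits_from_md_blob_node {
    my ($node) = @_;
    my ($type, $format, $max_col_val, $payload) = @{$node};
    die "not an md_blob node\n" if $type ne 'Blob' || $format ne 'md_blob';
    my $width = $BITS_PER_COLUMN{$max_col_val}
        or die "max-col-val must be one of 1, 3, 7, F\n";
    die "bad payload\n" if $payload !~ /\A[0-9A-F]*\z/;
    my $bits = '';
    for my $char (split //, $payload) {
        my $digit = hex($char);
        die "column '$char' exceeds max-col-val '$max_col_val'\n"
            if $digit > hex($max_col_val);
        # Zero-pad each column to its fixed width in bits.
        $bits .= sprintf '%0*b', $width, $digit;
    }
    return $bits;
}
```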

sys.Core.Blob.NEBlob

This node type represents a non-empty bit-string value; it is interpreted as a Muldis D sys.Core.Blob.NEBlob. Its format is the same as for sys.Core.Blob.Blob but that the node type is 'NEBlob', its formats are respectively named for 'neblob', and the payload may not be the empty string.

Examples:

[ 'NEBlob', 'md_neblob', '7', '523504376' ]

[ 'NEBlob', 'perl_neblob', (pack 'H2', 'Z') ]

sys.Core.Text.Text

This node type represents a character string. It has 2 elements:

  • Node type: the Perl Str value Text.

  • The payload.

This node is interpreted as a Muldis D sys.Core.Text.Text value by directly mapping the payload. Note that, while Concrete Muldis D may contain a few escape sequences, those would be replaced with what they represent prior to making a PHMD node. Under Perl 6, the payload must be a Perl Str, which is mapped directly. Under Perl 5, the payload must be just a canonical Perl character string, which is a scalar whose utf-8 flag is true, or that doesn't contain any octets with a 1-valued highest bit.

Examples:

[ 'Text', 'Ceres' ]

[ 'Text', 'サンプル' ] # note: Perl 5 needs "use utf8;" pragma to work

[ 'Text', '' ]

sys.Core.Text.NEText

This node type represents a non-empty character-string value; it is interpreted as a Muldis D sys.Core.Text.NEText. Its format is the same as for sys.Core.Text.Text but that the node type is 'NEText', and the payload may not be the empty string.

Examples:

[ 'NEText', 'Perl' ]

NONSCALAR VALUES

sys.Core.Tuple.Tuple

This node type represents a tuple value. It has 3 elements:

  • Node type: the Perl Str value Tuple.

  • Type name; a Perl Str value (or char-mode Perl scalar).

  • The payload; a Perl Hash|Mapping value.

This node is interpreted as a Muldis D sys.Core.Tuple.Tuple value whose heading was predefined, as a tuple data type, for referencing now by the type name, and whose body is defined now by the payload. Each key+value pair of the payload defines a named attribute of the new tuple; the pair's key and value are, respectively, a Perl Str that specifies the attribute name, and a PHMD node that specifies the attribute value. The tuple body defined by the payload must correspond to the tuple heading named by the type name; that is, they must have the same degree, same attribute names, and compatible types.

Examples:

[ 'Tuple', 'sys.Core.Tuple.Tuple.D0', {} ]

[ 'Tuple', 'glo.the_db.account.user_t', {
    'login_name' => [ 'Text', 'hartmark' ],
    'login_pass' => [ 'Text', 'letmein' ],
    'is_special' => [ 'Bool', 'md_enum', 'true' ],
} ]

[ 'Tuple', 'glo.the_db.gene.person_t', {
    'name' => [ 'Text', 'Michelle' ],
    'age'  => [ 'Int', 'perl_int', 17 ],
} ]

sys.Core.Tuple.Database

This node type represents a database value; it is interpreted as a Muldis D sys.Core.Tuple.Database. Its format is the same as for sys.Core.Tuple.Tuple but that the node type is 'Database', the type name must be of a database type rather than just a tuple type, and all payload PHMD values must be of relation types.

Examples:

[ 'Database', 'sys.Core.Database.Database.D0', {} ]

[ 'Database', 'glo.the_db.account', {
    'user' => [ 'Relation', 'glo.the_db.account.user_r', ... ],
} ]

[ 'Database', 'glo.the_db.gene', {
    'person' => [ 'Relation', 'glo.the_db.gene.person_r', ... ],
} ]

sys.Core.Relation.Relation

This node type represents a relation value. It has 3 elements:

  • Node type: the Perl Str value Relation.

  • Type name; a Perl Str value (or char-mode Perl scalar).

  • The payload; a Perl Array|Seq|Set|KeySet of Hash|Mapping value.

This node is interpreted as a Muldis D sys.Core.Relation.Relation value whose heading was predefined, as a relation data type, for referencing now by the type name, and whose body is defined now by the payload. Each element of the payload defines a tuple of the new relation; each element is as per the payload of a tuple-defining PHMD node, including the need to correspond to the relation heading, which is common to all tuples in it.

Examples:

[ 'Relation', 'sys.Core.Relation.Relation.D0C0', [] ]

[ 'Relation', 'sys.Core.Relation.Relation.D0C1', [ {} ] ]

[ 'Relation', 'glo.the_db.account.user_r', [
    {
        'login_name' => [ 'Text', 'hartmark' ],
        'login_pass' => [ 'Text', 'letmein' ],
        'is_special' => [ 'Bool', 'md_enum', 'true' ],
    },
] ]

[ 'Relation', 'glo.the_db.gene.person_r', [
    {
        'name' => [ 'Text', 'Michelle' ],
        'age'  => [ 'Int', 'perl_int', 17 ],
    },
] ]

sys.Core.Relation.Set

This node type represents a set value. It has 3 elements:

  • Node type: the Perl Str value Set.

  • Type name; a Perl Str value (or char-mode Perl scalar).

  • The payload; a Perl Array|Seq|Set|KeySet value.

This node is interpreted as a Muldis D sys.Core.Relation.Set value whose heading was predefined, as a set data type, for referencing now by the type name, and whose body is defined now by the payload. Each element of the payload defines a unary tuple of the new set; each element is a PHMD node that defines the value attribute of the tuple.

Examples:

[ 'Set', 'glo.the_db.account.country_name', [
    [ 'Text', 'Canada' ],
    [ 'Text', 'Spain' ],
    [ 'Text', 'Jordan' ],
    [ 'Text', 'Thailand' ],
] ]

[ 'Set', 'glo.the_db.stats.some_ages', [
    [ 'Int', 'perl_int', 3 ],
    [ 'Int', 'perl_int', 16 ],
    [ 'Int', 'perl_int', 85 ],
] ]

sys.Core.Relation.Maybe

This node type represents a maybe value; it is interpreted as a Muldis D sys.Core.Relation.Maybe. Its format is the same as for sys.Core.Relation.Set but that the node type is 'Maybe', and the payload must have at most 1 element.

Examples:

[ 'Maybe', 'glo.the_db.gene.person_death_date', [] ]

[ 'Maybe', 'glo.the_db.gene.person_death_date', [
    [ 'Text', '2003.07.24' ],
] ]
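The at-most-1-element rule for Maybe nodes is simple to enforce. In this Perl 5 sketch the hypothetical helper maybe_payload_nodes returns a list of zero or one PHMD nodes, so that an unknown value is reported as an empty list rather than as a Perl undef.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the 0 or 1 payload nodes of a Maybe node.
sub maybe_payload_nodes {
    my ($node) = @_;
    my ($type, $type_name, $payload) = @{$node};
    die "not a Maybe node\n" if $type ne 'Maybe';
    die "payload must be an Array\n" if ref($payload) ne 'ARRAY';
    die "a Maybe payload may have at most 1 element\n" if @{$payload} > 1;
    return @{$payload};
}
```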

sys.Core.Relation.Seq

This node type represents a sequence value. It has 3 elements:

  • Node type: the Perl Str value Seq.

  • Type name; a Perl Str value (or char-mode Perl scalar).

  • The payload; a Perl Array|Seq value.

This node is interpreted as a Muldis D sys.Core.Relation.Seq value whose heading was predefined, as a sequence data type, for referencing now by the type name, and whose body is defined now by the payload. Each element of the payload defines a binary tuple of the new sequence; the element value is a PHMD node that defines the value attribute of the tuple, and the element index is used as the index attribute of the tuple.

Examples:

[ 'Seq', 'glo.the_db.gene.sorted_person_name', [
    [ 'Text', 'Alphonse' ],
    [ 'Text', 'Edward' ],
    [ 'Text', 'Winry' ],
] ]

[ 'Seq', 'glo.the_db.stats.samples_by_order', [
    [ 'Int', 'perl_int', 57 ],
    [ 'Int', 'perl_int', 45 ],
    [ 'Int', 'perl_int', 63 ],
    [ 'Int', 'perl_int', 61 ],
] ]
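The way a Seq payload turns into binary tuples can be sketched in Perl 5 as below. The helper tuples_from_seq_node is hypothetical; it assumes the index attribute takes the Perl Array index as is, so the first element has index 0, and it represents each tuple as a plain Hash with index and value keys purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: expands a Seq node's payload into a list of
# { index => ..., value => ... } Hashes, one per payload element.
sub tuples_from_seq_node {
    my ($node) = @_;
    my ($type, $type_name, $payload) = @{$node};
    die "not a Seq node\n" if $type ne 'Seq';
    die "payload must be an Array\n" if ref($payload) ne 'ARRAY';
    # The leading + tells Perl that the inner braces build a Hash.
    return [
        map { +{ index => $_, value => $payload->[$_] } } 0 .. $#{$payload}
    ];
}
```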

sys.Core.Relation.Bag

This node type represents a bag value. It has 4 elements:

  • Node type: the Perl Str value Bag.

  • Type name; a Perl Str value (or char-mode Perl scalar).

  • Format; one of: aoa_counted, array_repeated, perl_bag (p6).

  • The payload; a Perl Bag|KeyBag value or Array|Seq or Array|Seq of Array|Seq.

This node is interpreted as a Muldis D sys.Core.Relation.Bag value whose heading was predefined, as a bag data type, for referencing now by the type name, and whose body is defined now by the payload. The payload is interpreted as follows:

  • If the format is aoa_counted, then the payload must be a Perl Array|Seq, and each element of the payload defines a binary tuple of the new bag; the element is a 2-element Array|Seq, and those 2 elements, by index order, are PHMD nodes that define the value and count attributes of the tuple; the count must be a positive integer.

  • If the format is array_repeated, then the payload must be a Perl Array|Seq, and each element of the payload contributes to a binary tuple of the new bag; the element value is a PHMD node that defines the value attribute of the tuple. The bag has 1 tuple for every distinct (after format normalization) element value in the payload, and the count attribute of that tuple says how many instances of said element were in the payload.

  • If the format is perl_bag, then the payload must be a Perl 6 (there is no Perl 5 analogy) Bag|KeyBag value; the payload elements are PHMD nodes corresponding to the value attribute of the new bag's tuples, and the mapping is as you should expect.

Examples:

[ 'Bag', 'glo.the_db.inventory.fruit', 'aoa_counted', [
    [
        [ 'Text', 'Apple' ],
        [ 'PInt', 'perl_pint', 500 ],
    ],
    [
        [ 'Text', 'Orange' ],
        [ 'PInt', 'perl_pint', 300 ],
    ],
    [
        [ 'Text', 'Banana' ],
        [ 'PInt', 'perl_pint', 400 ],
    ],
] ]

[ 'Bag', 'glo.the_db.inventory.whatsits', 'array_repeated', [
    [ 'Text', 'Foo' ],
    [ 'Text', 'Quux' ],
    [ 'Text', 'Foo' ],
    [ 'Text', 'Bar' ],
    [ 'Text', 'Baz' ],
    [ 'Text', 'Baz' ],
] ]
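The relationship between the array_repeated and aoa_counted formats can be sketched in Perl 5 as follows. The helper counted_from_repeated is hypothetical. It skips the format normalization mentioned above and simply treats two nodes as the same value when all of their elements are identical strings, so it only handles nodes made of plain scalars.

```perl
use strict;
use warnings;

# Hypothetical helper: converts an array_repeated payload into an
# aoa_counted payload, keeping first-seen order of distinct values.
sub counted_from_repeated {
    my ($payload) = @_;
    my (%count_of, %node_of, @order);
    for my $node (@{$payload}) {
        die "this sketch handles only nodes made of plain scalars\n"
            if grep { ref($_) } @{$node};
        # Simplification: no format normalization, just exact comparison.
        my $key = join "\0", @{$node};
        if (!exists $count_of{$key}) {
            push @order, $key;
            $node_of{$key} = $node;
        }
        $count_of{$key}++;
    }
    return [
        map { [ $node_of{$_}, [ 'PInt', 'perl_pint', $count_of{$_} ] ] } @order
    ];
}
```

Applied to the whatsits payload above, this yields 4 pairs, with Foo and Baz each counted twice and Quux and Bar once.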

SEE ALSO

Go to Language::MuldisD for the majority of distribution-internal references, and Language::MuldisD::SeeAlso for the majority of distribution-external references.

AUTHOR

Darren Duncan (perl@DarrenDuncan.net)

LICENSE AND COPYRIGHT

This file is part of the formal specification of the Muldis D language.

Muldis D is Copyright © 2002-2007, Darren Duncan.

See the LICENSE AND COPYRIGHT of Language::MuldisD for details.

ACKNOWLEDGEMENTS

The ACKNOWLEDGEMENTS in Language::MuldisD apply to this file too.