NAME
Muldis::D::Dialect::PTMD_Tiny - How to format Plain Text Muldis D
VERSION
This document is Muldis::D::Dialect::PTMD_Tiny version 0.60.0.
PREFACE
This document is part of the Muldis D language specification, whose root document is Muldis::D; you should read that root document before you read this one, which provides subservient details.
DESCRIPTION
This document outlines the grammar of the Plain Text Muldis D dialect named PTMD_Tiny. The fully-qualified name of this Muldis D dialect, in combination with the base language spec it is bundled with, is Muldis_D:'http://muldis.com':'N.N.N':PTMD_Tiny (when the bundled base language version is substituted for the N.N.N).
This dialect is designed to exactly match the Muldis D system catalog (the possible representation of Muldis D code that is visible to or updateable by Muldis D programs at runtime) as to what non-critical meta-data it explicitly stores; so code in the PTMD_Tiny dialect should be round-trippable with the system catalog, with the result retaining all the details it started with. Since it matches the system catalog, this dialect should be able to exactly represent all possible Muldis D base language code (and probably all extensions too), rather than a subset of it. That said, the PTMD_Tiny dialect does provide a choice of multiple syntax options for writing Muldis D value literals and DBMS entity (e.g. type and routine) declarations, so several very distinct PTMD_Tiny code artifacts may parse into the same system catalog entries. There is even a considerable level of abstraction in some cases, so that it is easier for programmers to write and understand typical PTMD_Tiny code, and so that this code isn't absurdly verbose.
This dialect is designed to be as small as possible while meeting the above criteria, and is designed such that a parser that handles all of this dialect can be tiny, hence the dialect's Tiny name. Likewise, a code generator for this dialect from the system catalog can be tiny.
A significant quality of the PTMD_Tiny dialect is that it is designed to work easily for a single-pass parser, or at least a single-pass lexer; all the context that one needs to know for how to parse or lex any arbitrary substring of code is provided by prior code. Therefore, a PTMD_Tiny parser can easily work on a streaming input like a file-handle where you can't go back earlier in the stream. Often this means a parser can work with little RAM.
The dialect is also designed so that any amount of whitespace can be added or omitted next to most non-alphanumeric characters (which happen to be next to alphanumeric tokens) without affecting the meaning of the code at all, except within character string literals. Long binary or character strings can be split into arbitrary-size substrings without affecting the meaning. Many elements are identified by name rather than ordinal position, so to some degree the order in which they appear has no effect on the meaning. So programmers can easily format (separate, indent, linewrap, order) code how they like, and making an automated code reformatter shouldn't be difficult. Often, named elements can also be omitted entirely for brevity, in which case the parser would use context to supply default values for those elements.
Given that plain text is (more or less) universally unambiguously portable between all general purpose languages that could be used to implement a DBMS, it is expected that every single Muldis D implementation will natively accept input in the PTMD_Tiny dialect, which isn't dependent on any specific host language and should be easy enough to process, so it should be considered the safest official Muldis D dialect to write in by default, when you don't have a specific reason to use some other dialect.
See also the dialects HDMD_Perl6_Tiny and HDMD_Perl5_Tiny, which are derived directly from PTMD_Tiny, and represent possible Perl 6 and 5 concrete syntax trees for it; in fact, most of the details in common with those other dialects are described just in the current file, for all 3 dialects.
GENERAL STRUCTURE
A PTMD_Tiny Muldis D code file consists just of a full or partial Muldis D bootloader routine definition, which begins with a language name declaration, and otherwise is simply an ordered sequence of imperative routine calls, where earlier routine calls are to system-defined data-definition routines (their arguments are values to put in the system catalog), and later ones are then to user-defined routines that the earlier statements either loaded or defined. This is conceptually what a PTMD_Tiny file is, and it can even be that literally, but PTMD_Tiny provides a canonical further abstraction which should be used when doing data-definition. And so you typically use syntax resembling routine and type declarations in a general purpose programming language, where simply declaring such an entity will cause it to be written into the system catalog for subsequent use.
The grammar in this file is formatted as a Perl 6 grammar (see http://perlcabal.org/syn/S05.html for details) which could be used to parse it, but it is only meant to be illustrative, and would need further additions or changes to actually function in Perl 6. The grammar consists mainly of named tokens which define matching rules. A token's name is declared as a bareword following the keyword token and it is subsequently referenced with '<' and '>' delimiters. Any other bareword in a token definition consisting of alphanumerics is matched literally, and all non-quoted whitespace is not significant. Any pairs of parentheses ('(' and ')') in token definitions are capturing groups, and each parser match by a pair corresponds to a capture node or node element in the concrete syntax tree resulting from the parse.
The grammar of the PTMD_Tiny dialect has 3 main subsections, the first being the syntax for declaring a Muldis D language name, the second being the syntax for Muldis D value literals, and the third being the syntax for DBMS entity definition and routine invocation. The subsection for a language name (having the root grammar token language_name) is quite small and is defined partly in terms of the value literals subsection. The subsection for value literals (having the root grammar token value) is completely self-defined and can be used in isolation from the wider grammar as a Muldis D sub-language; for example, a hosted-data Muldis D implementation may have an object representing a Muldis D value, which is initialized using code written in that sub-language. The subsection for entity definition and invocation (having the root grammar token boot_stmt) is defined partly in terms of the value literals subsection. The root grammar token for the entire dialect is bootloader.
REWRITE PROGRESS MARKER
What follows next in this file, between this point and the SEE ALSO, still has to be rewritten according to certain TODO plans, after which this whole file should reflect what the DESCRIPTION etc above says.
LANGUAGE NAME
Grammar:
token language_name {
<ln_base_name>
<value_node_elem_sep>
<ln_base_authority>
<value_node_elem_sep>
<ln_base_version_number>
<value_node_elem_sep>
<ln_dialect>
<value_node_elem_sep>
<ln_extensions>
}
token ln_base_name { Muldis_D }
token ln_base_authority { <char_str> }
token ln_base_version_number { <char_str> }
token ln_dialect { PTMD_Tiny }
token ln_extensions { <qtuple_payload> }
Examples:
Muldis_D:'http://muldis.com':'1.2.3':PTMD_Tiny:{}
Muldis_D:'http://muldis.com':'1.2.3':PTMD_Tiny:{
auto_add_attrs => Bool:true,
auto_unabbrev_std_names => Bool:true,
auto_chains_from_names => Bool:true
}
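The five-element shape of a language name can be illustrated with a short Python sketch. This is not part of the specification: the helper name split_language_name is hypothetical, it handles only an isolated language name (not one followed by boot statements), and it assumes the authority and version are each written as a single quoted segment with no escapes. The extensions element is returned as raw text rather than parsed.

```python
import re

# Hypothetical helper, not part of the spec: splits an isolated language
# name into its five elements. Assumes authority and version are each one
# quoted segment without escapes; the extensions tuple is left unparsed.
_LANG_NAME = re.compile(
    r"""\s*(?P<base_name>Muldis_D)
        \s*:\s*'(?P<authority>[^'\\]*)'
        \s*:\s*'(?P<version>[^'\\]*)'
        \s*:\s*(?P<dialect>PTMD_Tiny)
        \s*:\s*(?P<extensions>\{.*\})\s*\Z""",
    re.VERBOSE | re.DOTALL,
)

def split_language_name(src):
    """Return a dict of the five language name elements as strings."""
    m = _LANG_NAME.match(src)
    if m is None:
        raise ValueError("not a PTMD_Tiny language name")
    return m.groupdict()
```

For example, split_language_name("Muldis_D:'http://muldis.com':'1.2.3':PTMD_Tiny:{}") yields the authority http://muldis.com, the version 1.2.3, and the extensions text {}.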
VALUE LITERAL COMMON ELEMENTS
A generic context value literal (or GCVL) is a value literal that includes explicit value kind meta-data (such as, "this is an Int" or "this is a Name") so that the literal can be properly interpreted in a context that is expecting a value but has no expectation that said value belongs to a specific data type. For example, each element of a generic Muldis D collection value, such as a member of an array or tuple, could potentially have any type at all. In contrast, a specific context value literal (or SCVL) is a value literal that does not include explicit value kind meta-data, because the context of its use supplies said meta-data. For example, in a tuple value literal it is assumed that a value literal in an attribute name position must denote a Name. The grammar token value denotes a GCVL, as do most short-named grammar tokens, like int or name; in contrast, a grammar token containing payload denotes a SCVL, like int_payload or name_payload.
Every GCVL has 1-3 elements, illustrated by this grammar:
token value {
(
(<value_kind>)
[
<value_node_elem_sep>
[(<type_name>) <value_node_elem_sep>]?
(<payload>)
]?
)
}
token value_node_elem_sep { \s* ':' \s* }
token value_kind {
Bool
| Int
| String
| Blob
| Text
| Rat
| Instant
| Duration
| Name
| NameChain
| DeclNameChain
| Comment
| Order
| RatRoundMeth
| Q? Scalar
| Q? Tuple
| Q? Relation
| Q? Set
| Nothing
| Q? Single
| Q? Array
| Q? Bag
| [UTC | Float] Instant
| UTCDuration
}
token type_name { <name_chain_payload> }
token payload {
<bool_payload>
| <int_payload>
| <string_payload>
| <blob_payload>
| <text_payload>
| <rat_payload>
| <instant_payload>
| <duration_payload>
| <name_payload>
| <name_chain_payload>
| <decl_name_chain_payload>
| <comment_payload>
| <order_payload>
| <rat_round_meth_payload>
| <qscalar_payload>
| <qtuple_payload>
| <qrelation_payload>
| <utc_instant_payload>
| <utc_duration_payload>
}
So a value node has 1-3 elements in general:
value_kind - This is a character string of the format <[A..Z]> <[ a..z A..Z ]>+; it identifies the data type of the value literal in broad terms and is the only external meta-data of payload necessary to interpret the latter; what grammars are valid for payload depend just on value_kind.
type_name - This is a Muldis D data type name, for example sys.std.Core.Type.Int; it identifies a specific subtype of the generic type denoted by value_kind, and serves as an assertion that the Muldis D value denoted by payload is a member of the named subtype. Iff value_kind is (Q|)Scalar then type_name is mandatory; otherwise, type_name is optional for all value, except that type_name must be omitted when value_kind is one of the 4 [Bool, Nothing, Order, RatRoundMeth]; this isn't because those 4 types can't be subtyped, but because in practice doing so isn't useful.
payload - This is mandatory for all value except when value_kind is Nothing, in which case it must be omitted; since Nothing is a singleton type, it isn't useful to say anything for that value other than the value_kind.
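The presence rules for the three elements can be restated as a small Python check. This is only a sketch: the function name check_value_node and its message strings are hypothetical, and it tests only which elements are present, not whether a payload is well-formed for its value_kind.

```python
# Kinds for which type_name must be omitted, and kinds for which it is
# mandatory, as stated in the element descriptions above.
NO_TYPE_NAME = {"Bool", "Nothing", "Order", "RatRoundMeth"}
TYPE_NAME_REQUIRED = {"Scalar", "QScalar"}

def check_value_node(value_kind, type_name=None, payload=None):
    """Return a list of rule violations for one value node (empty if none)."""
    problems = []
    if value_kind in TYPE_NAME_REQUIRED and type_name is None:
        problems.append("type_name is mandatory for " + value_kind)
    if value_kind in NO_TYPE_NAME and type_name is not None:
        problems.append("type_name must be omitted for " + value_kind)
    if value_kind == "Nothing":
        if payload is not None:
            problems.append("payload must be omitted for Nothing")
    elif payload is None:
        problems.append("payload is mandatory for " + value_kind)
    return problems
```

Here None stands for an omitted element, so check_value_node("Nothing") reports no problems while check_value_node("Scalar", None, "x;{}") reports the missing type_name.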
Now the above-mentioned grammar for value would be more appropriate if your PTMD_Tiny parser is bare-metal hard-coded since it is more procedural. But if you are using a general purpose grammar processor utility to generate a parser (like an actual Perl 6 grammar), then you would typically want to use the following grammar for value instead:
token value {
<bool>
| <int>
| <string>
| <blob>
| <text>
| <rat>
| <instant>
| <duration>
| <name>
| <name_chain>
| <decl_name_chain>
| <comment>
| <order>
| <rat_round_meth>
| <qscalar>
| <qtuple>
| <qrelation>
| <utc_instant>
| <utc_duration>
}
For GCVL examples, see the subsequent documentation sections.
SIMPLE CORE SCALAR VALUE LITERALS
sys.std.Core.Type.Bool
Grammar:
token bool {
Bool <value_node_elem_sep>
<bool_payload>
}
token bool_payload {
[false | true | 0 | 1]
}
Examples:
Bool:true
sys.std.Core.Type.Int
Grammar:
token int {
Int <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<int_payload>
}
token int_payload {
[<int_max_col_val> <value_payload_elem_sep>]?
<int_body>
}
token value_payload_elem_sep { \s* ';' \s* }
token int_max_col_val { <pint_head> }
token int_body { [0 | \-?<pint_body>] }
token nnint_body { [0 | <pint_body>] }
token pint_body { <pint_head> <pint_tail>? }
token pint_head { <[ 1..9 A..Z ]> }
token pint_tail { [<[ 0..9 A..Z _ ]>+] ** <segment_sep> }
token segment_sep { \s* '~' \s* }
Examples:
Int:1;11001001
Int:7;0
Int:7;644
Int:-34
Int:42
Int:F;DEADBEEF
Int:Z;-HELLOWORLD
Int:3;301
Int:B;A09B
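As an illustration of how an int_payload might be interpreted, here is a Python sketch. It rests on assumptions inferred from the examples rather than stated in this section: that the optional prefix names the largest digit allowed in any column (so the radix is that value plus one), that the radix defaults to 10 when the prefix is absent, and that '_' characters in the tail are visual separators carrying no value. The function name decode_int_payload is hypothetical.

```python
import re

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def decode_int_payload(payload):
    """Decode an int_payload such as 'F;DEADBEEF' into a Python int."""
    head, sep, body = payload.partition(";")
    if sep:
        radix = DIGITS.index(head.strip()) + 1  # assumed: radix = max column value + 1
    else:
        radix, body = 10, payload               # assumed default when no prefix is given
    # Drop segment separators ('~' with optional whitespace) and, as an
    # assumption, '_' characters used as visual digit-group separators.
    body = re.sub(r"\s*~\s*|_", "", body.strip())
    negative = body.startswith("-")
    value = 0
    for ch in (body[1:] if negative else body):
        digit = DIGITS.index(ch)                # raises ValueError on a non-digit
        if digit >= radix:
            raise ValueError(f"digit {ch!r} exceeds max column value")
        value = value * radix + digit
    return -value if negative else value
```

Under these assumptions, the payload 1;11001001 decodes to 201, 7;644 to 420, and 3;301 to 49.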
sys.std.Core.Type.String
Grammar:
token string {
String <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<string_payload>
}
token string_payload {
[<int_max_col_val> <value_payload_elem_sep>]?
<ord_list_open>
[<int_body> ** <list_elem_sep>]?
<ord_list_close>
}
token ord_list_open { \s* '[' \s* }
token ord_list_close { \s* ']' \s* }
token list_elem_sep { \s* ',' \s* }
Examples:
String:F;[50,65,72,6C]
sys.std.Core.Type.Blob
Grammar:
token blob {
Blob <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<blob_payload>
}
token blob_payload {
<[137F]> <value_payload_elem_sep>
[[<[ 0..9 A..F ]>+] ** <segment_sep>]?
}
Examples:
Blob:1;00101110100010
Blob:3;
Blob:F;A705E
Blob:7;523504376
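A blob_payload could be expanded into its bits along the following lines. This Python sketch assumes, from the permitted prefixes 1, 3, 7 and F, that each digit stands for 1, 2, 3 or 4 bits respectively; that mapping and the function name decode_blob_payload are illustrative assumptions, not statements of the specification.

```python
import re

# Assumed mapping from the max column value prefix to bits per digit.
BITS_PER_DIGIT = {"1": 1, "3": 2, "7": 3, "F": 4}

def decode_blob_payload(payload):
    """Decode a blob_payload such as 'F;A705E' into a string of '0'/'1' bits."""
    head, sep, body = payload.partition(";")
    if not sep:
        raise ValueError("a blob payload needs a max column value prefix")
    width = BITS_PER_DIGIT[head.strip()]
    body = re.sub(r"\s*~\s*", "", body.strip())  # join '~'-separated segments
    bits = []
    for ch in body:
        digit = int(ch, 16)
        if digit >= 2 ** width:
            raise ValueError(f"digit {ch!r} exceeds max column value")
        bits.append(format(digit, f"0{width}b"))
    return "".join(bits)
```

With this reading, F;A705E denotes the 20 bits 10100111000001011110, and 3; denotes the empty bit string.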
sys.std.Core.Type.Text
Grammar:
token text {
Text <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<text_payload>
}
token text_payload {
<char_str>
}
token char_str {
[<char_str_seg> ** <segment_sep>]?
}
token char_str_seg {
<quoted_char_str_seg>
| <nonquoted_char_str_seg>
}
token quoted_char_str_seg {
<[']>
['\b' | '\q' | <-[\\\']>]*
<[']>
}
token nonquoted_char_str_seg { <[ a..z A..Z 0..9 _ - ]>+ }
Examples:
Text:'Ceres'
Text:'サンプル'
Text:''
Text:'Perl'
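The char_str grammar lets a string be written as several quoted or unquoted segments joined by '~'. The Python sketch below shows one way to reassemble such a string. The meanings given to the two escapes (\b as a backslash and \q as a single quote) are assumptions, since this section lists the escapes without defining them, and decode_char_str is a hypothetical name.

```python
import re

# One segment: either a quoted run (with \b and \q escapes) or a bareword.
_SEG = re.compile(r"'((?:\\b|\\q|[^\\'])*)'|([A-Za-z0-9_-]+)")
_SEP = re.compile(r"\s*~\s*")
# Assumed meaning of the escapes: \b is a backslash, \q is a single quote.
_ESCAPES = {"b": "\\", "q": "'"}

def decode_char_str(src):
    """Join the segments of a char_str into one Python string."""
    if src == "":
        return ""                      # the grammar allows zero segments
    out, pos = [], 0
    while True:
        m = _SEG.match(src, pos)
        if m is None:
            raise ValueError(f"bad char_str segment at offset {pos}")
        if m.group(1) is not None:
            out.append(re.sub(r"\\([bq])", lambda e: _ESCAPES[e.group(1)], m.group(1)))
        else:
            out.append(m.group(2))
        pos = m.end()
        if pos == len(src):
            return "".join(out)
        sep = _SEP.match(src, pos)
        if sep is None:
            raise ValueError(f"expected '~' at offset {pos}")
        pos = sep.end()
```

For instance, the two-segment form 'Hello, ' ~ 'world' decodes to the same string as the single segment 'Hello, world'.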
sys.std.Core.Type.Rat
Grammar:
token rat {
Rat <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<rat_payload>
}
token rat_payload {
[<int_max_col_val> <value_payload_elem_sep>]?
<rat_body>
}
token rat_body {
<int_body>\.?<pint_tail>?
| <int_body> \s* \/ \s* <pint_body>
| <int_body> \s* \* \s* <pint_body> \s* \^ \s* <int_body>
}
token nnrat_body {
<nnint_body>\.?<pint_tail>?
| <nnint_body> \s* \/ \s* <pint_body>
| <nnint_body> \s* \* \s* <pint_body> \s* \^ \s* <int_body>
}
Examples:
Rat:1;-1.1
Rat:-1.5
Rat:3.14159
Rat:A;0.0
Rat:F;DEADBEEF.FACE
Rat:Z;0.000AZE
Rat:6;500001/1000
Rat:B;A09B/A
Rat:1;1011101101*10^-11011
Rat:45207196*10^37
Rat:1/43
Rat:314159*10^-5
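The three rat_body forms (radix point, numerator/denominator, and mantissa*radix^exponent) can be compared by decoding each into an exact fraction. The following Python sketch assumes, as the examples suggest, that the optional prefix gives the largest digit per column, that the default radix is 10, and that every number in the body (including the exponent) is written in that radix. The function names are hypothetical.

```python
from fractions import Fraction
import re

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def _to_int(text, radix):
    """Convert an optionally negative digit string written in `radix`."""
    negative = text.startswith("-")
    value = 0
    for ch in (text[1:] if negative else text):
        digit = DIGITS.index(ch)
        if digit >= radix:
            raise ValueError(f"digit {ch!r} exceeds max column value")
        value = value * radix + digit
    return -value if negative else value

def decode_rat_payload(payload):
    """Decode a rat_payload such as '1/43' or '314159*10^-5' into a Fraction."""
    head, sep, body = payload.partition(";")
    radix = DIGITS.index(head.strip()) + 1 if sep else 10  # assumed radix rule
    if not sep:
        body = payload
    body = re.sub(r"\s+|~|_", "", body)  # drop whitespace and separators
    if "/" in body:
        numerator, denominator = body.split("/")
        return Fraction(_to_int(numerator, radix), _to_int(denominator, radix))
    if "*" in body:
        mantissa, rest = body.split("*")
        base, exponent = rest.split("^")
        return _to_int(mantissa, radix) * Fraction(_to_int(base, radix)) ** _to_int(exponent, radix)
    whole, _, frac = body.partition(".")
    return Fraction(_to_int(whole + frac, radix), radix ** len(frac))
```

Under these assumptions the payloads 3.14159 and 314159*10^-5 decode to the same value, and both -1.5 and 1;-1.1 decode to -3/2.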
sys.std.Core.Type.Instant
Grammar:
token instant {
Instant <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<instant_payload>
}
token instant_payload {
[<int_max_col_val> <value_payload_elem_sep>]?
<rat_body>
}
Examples:
Instant:1235556432
sys.std.Core.Type.Duration
Grammar:
token duration {
Duration <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<duration_payload>
}
token duration_payload {
[<int_max_col_val> <value_payload_elem_sep>]?
<rat_body>
}
Examples:
Duration:3600
Duration:-50
Duration:3.14159
Duration:1;1011101101*10^-11011
Duration:1/43
sys.std.Core.Type.Cat.Name
Grammar:
token name {
Name <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<name_payload>
}
token name_payload {
<char_str>
}
Examples:
Name:login_pass
Name:'First Name'
sys.std.Core.Type.Cat.NameChain
Grammar:
token name_chain {
NameChain <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<name_chain_payload>
}
token name_chain_payload {
<name_payload> [<name_chain_elem_sep> <name_payload>]+
}
token name_chain_elem_sep { \s* '.' \s* }
Examples:
NameChain:fed.data.the_db.gene.sorted_person_name
NameChain:fed.data.the_db.stats.'samples by order'
sys.std.Core.Type.Cat.DeclNameChain
Grammar:
token decl_name_chain {
DeclNameChain <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<decl_name_chain_payload>
}
token decl_name_chain_payload {
[<name_payload> ** <name_chain_elem_sep>]?
}
Examples:
DeclNameChain:gene.sorted_person_name
DeclNameChain:stats.'samples by order'
sys.std.Core.Type.Cat.Comment
Grammar:
token comment {
Comment <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<comment_payload>
}
token comment_payload {
<char_str>
}
Examples:
Comment:'This does something.'
sys.std.Core.Type.Cat.Order
Grammar:
token order {
Order <value_node_elem_sep>
<order_payload>
}
token order_payload {
[increase | same | decrease | -1 | 0 | 1]
}
Examples:
Order:same
sys.std.Core.Type.Cat.RatRoundMeth
Grammar:
token rat_round_meth {
RatRoundMeth <value_node_elem_sep>
<rat_round_meth_payload>
}
token rat_round_meth_payload {
[
half_down | half_up
| half_even
| to_floor | to_ceiling
| to_zero | to_inf
]
}
Examples:
RatRoundMeth:half_up
GENERIC Q/SCALAR AND Q/NONSCALAR VALUE LITERALS
sys.std.Core.Type.(Q|)Scalar
Grammar:
token qscalar {
Q? Scalar <value_node_elem_sep>
<type_name> <value_node_elem_sep>
<qscalar_payload>
}
token qscalar_payload {
<possrep_name> <value_payload_elem_sep>
<possrep_attrs>
}
token possrep_name { <name_payload> }
token possrep_attrs { <qtuple_payload> }
Examples:
Scalar:sys.std.Core.Type.Rat:float;{
mantissa => Int:45207196,
radix => Int:10,
exponent => Int:37
}
Scalar:sys.std.Temporal.Type.UTCDateTime:datetime;{
year => Int:2003,
month => Int:10,
day => Int:26,
hour => Int:1,
minute => Int:30,
second => Rat:0
}
Scalar:fed.lib.the_db.WeekDay:name;{
'' => Text:'monday'
}
Scalar:fed.lib.the_db.WeekDay:number;{
'' => Int:5
}
sys.std.Core.Type.(Q|)Tuple
Grammar:
token qtuple {
Q? Tuple <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<qtuple_payload>
}
token qtuple_payload {
<list_open>
[[<name_payload> <pair_elem_sep> <value>]
** <list_elem_sep>]?
<list_close>
}
token list_open { \s* '{' \s* }
token list_close { \s* '}' \s* }
token pair_elem_sep { \s* '=>' \s* }
Examples:
Tuple:{}
Tuple:type.tuple_from.var.fed.data.the_db.account.users:{
login_name => Text:'hartmark',
login_pass => Text:'letmein',
is_special => Bool:true
}
Tuple:{
name => Text:'Michelle',
age => Int:17
}
sys.std.Core.Type.(Q|)Relation
Grammar:
token qrelation {
<generic_qrelation>
| <qset>
| <nothing>
| <qsingle>
| <qarray>
| <qbag>
}
token generic_qrelation {
Q? Relation <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<generic_qrelation_payload>
}
token generic_qrelation_payload {
<generic_relation_empty_qbody_payload>
| <generic_relation_nonordered_qattr_payload>
| <generic_relation_ordered_qattr_payload>
}
token generic_relation_empty_qbody_payload {
<list_open>
[<name_payload> ** <list_elem_sep>]?
<list_close>
}
token generic_relation_nonordered_qattr_payload {
<list_open>
[<qtuple_payload> ** <list_elem_sep>]?
<list_close>
}
token generic_relation_ordered_qattr_payload {
<ord_list_open>
[<name_payload> ** <list_elem_sep>]?
<ord_list_close>
<value_payload_elem_sep>
<list_open>
[[
<ord_list_open>
[<value> ** <list_elem_sep>]?
<ord_list_close>
] ** <list_elem_sep>]?
<list_close>
}
Examples:
Relation:{}
Relation:{ x, y, z }
Relation:{ {} }
Relation:{
{
login_name => Text:'hartmark',
login_pass => Text:'letmein',
is_special => Bool:true
}
}
Relation:fed.lib.the_db.gene.Person:[ name, age ];{
[ Text:'Michelle', Int:17 ]
}
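The ordered-attribute form lists the attribute names once and then gives each tuple as a positional list of values. A Python sketch of expanding that form into per-tuple name/value mappings follows. It uses plain Python values in place of parsed Muldis D values, and the function name and the two checks (unique attribute names, row length equal to the number of names) are assumptions about what a parser would enforce.

```python
def expand_ordered_relation(attr_names, rows):
    """Pair each row's values with the shared list of attribute names."""
    if len(set(attr_names)) != len(attr_names):
        raise ValueError("duplicate attribute name")
    body = []
    for row in rows:
        if len(row) != len(attr_names):
            raise ValueError("row length does not match attribute list")
        tup = dict(zip(attr_names, row))
        if tup not in body:  # a relation body holds no duplicate tuples
            body.append(tup)
    return body
```

Applied to the last example above, expand_ordered_relation(["name", "age"], [["Michelle", 17]]) gives a single tuple with name 'Michelle' and age 17, the same content the nonordered form would spell out attribute by attribute.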
sys.std.Core.Type.(Q|)Set
Grammar:
token qset {
Q? Set <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<qset_payload>
}
token qset_payload {
<list_open>
[<value> ** <list_elem_sep>]?
<list_close>
}
Examples:
Set:fed.lib.the_db.account.Country_Names:{
Text:'Canada',
Text:'Spain',
Text:'Jordan',
Text:'Thailand'
}
Set:{
Int:3,
Int:16,
Int:85
}
sys.std.Core.Type.Nothing
Grammar:
token nothing {
Nothing
}
Examples:
Nothing
sys.std.Core.Type.(Q|)Single
Grammar:
token qsingle {
Q? Single <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<qsingle_payload>
}
token qsingle_payload {
<list_open>
<value>
<list_close>
}
Examples:
Single:{ Text:'I know this one!' }
sys.std.Core.Type.(Q|)Array
Grammar:
token qarray {
Q? Array <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<qarray_payload>
}
token qarray_payload {
<ord_list_open>
[<value> ** <list_elem_sep>]?
<ord_list_close>
}
Examples:
Array:[
Text:'Alphonse',
Text:'Edward',
Text:'Winry'
]
Array:fed.lib.the_db.stats.Samples_By_Order:[
Int:57,
Int:45,
Int:63,
Int:61
]
sys.std.Core.Type.(Q|)Bag
Grammar:
token qbag {
Q? Bag <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<qbag_payload>
}
token qbag_payload {
<qbag_payload_counted_values>
| <qbag_payload_repeated_values>
}
token qbag_payload_counted_values {
<list_open>
[[<value> <pair_elem_sep> <count>] ** <list_elem_sep>]?
<list_close>
}
token count {
[<int_max_col_val> <value_payload_elem_sep>]?
<pint_body>
}
token qbag_payload_repeated_values {
<list_open>
[<value> ** <list_elem_sep>]?
<list_close>
}
Examples:
Bag:fed.lib.the_db.inventory.Fruit:{
Text:'Apple' => 500,
Text:'Orange' => 300,
Text:'Banana' => 400
}
Bag:{
Text:'Foo',
Text:'Quux',
Text:'Foo',
Text:'Bar',
Text:'Baz',
Text:'Baz'
}
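The two bag payload forms describe the same kind of value: one pairs each distinct value with a count, the other simply repeats values. The Python sketch below models both with collections.Counter to show the correspondence; it uses plain Python values in place of parsed Muldis D values, and the function names are hypothetical.

```python
from collections import Counter

def bag_from_counted(pairs):
    """Build a bag from (value, count) pairs, as in the counted-values form."""
    bag = Counter()
    for value, count in pairs:
        if count < 1:
            # The count token is a pint_body, so zero and negatives are excluded.
            raise ValueError("count must be a positive integer")
        bag[value] += count
    return bag

def bag_from_repeated(values):
    """Build a bag from a list in which each value may simply recur."""
    return Counter(values)
```

For example, the repeated-values list Foo, Quux, Foo, Bar, Baz, Baz yields the same bag as the counted pairs Foo => 2, Quux => 1, Bar => 1, Baz => 2.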
TEMPORAL EXTENSION SCALAR VALUE LITERALS
sys.std.Temporal.Type.(UTC|Float)Instant
Grammar:
token utc_instant {
[UTC | Float] Instant <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<utc_instant_payload>
}
token utc_instant_payload {
[<int_max_col_val> <value_payload_elem_sep>]?
<ord_list_open>
<int_body>? <list_elem_sep>
[<pint_body>? <list_elem_sep>] ** 2
[<nnint_body>? <list_elem_sep>] ** 2
<nnrat_body>?
<ord_list_close>
}
Examples:
UTCInstant:[1964,10,16,16,12,47.5]
UTCInstant:[2002,12,16,,,]
UTCInstant:[,,,14,2,29]
FloatInstant:[2003,4,5,2,,]
FloatInstant:[1407,,,,,]
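The six positional elements of an instant payload, any of which may be left empty, can be read as in this Python sketch. It assumes the positions correspond to year, month, day, hour, minute and second (the attribute names used in the UTCDateTime example earlier in this document), and it handles only the default base-10 form without a max column value prefix. The function name is hypothetical.

```python
from fractions import Fraction

# Assumed meaning of the six positions, taken from the UTCDateTime example.
FIELDS = ("year", "month", "day", "hour", "minute", "second")

def decode_utc_instant_payload(payload):
    """Map a payload like '[2002,12,16,,,]' to a dict; omitted parts are None."""
    text = payload.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected an ordered list in square brackets")
    parts = [p.strip() for p in text[1:-1].split(",")]
    if len(parts) != len(FIELDS):
        raise ValueError("expected exactly six elements")
    result = {}
    for field, part in zip(FIELDS, parts):
        if part == "":
            result[field] = None           # element omitted in the literal
        elif field == "second":
            result[field] = Fraction(part)  # seconds may carry a fraction
        else:
            result[field] = int(part)
    return result
```

So the payload [,,,14,2,29] is read as a time of day with no date parts, and [1407,,,,,] as a year with everything else unspecified.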
sys.std.Temporal.Type.UTCDuration
Grammar:
token utc_duration {
UTCDuration <value_node_elem_sep>
[<type_name> <value_node_elem_sep>]?
<utc_duration_payload>
}
token utc_duration_payload {
[<int_max_col_val> <value_payload_elem_sep>]?
<ord_list_open>
[<int_body>? <list_elem_sep>] ** 5
<rat_body>
<ord_list_close>
}
Examples:
UTCDuration:[3,5,1,6,15,45.000012]
BOOTLOADER
Grammar:
token bootloader {
<language_name>
<boot_stmt>*
}
Examples:
Muldis_D:'http://muldis.com':'1.2.3':PTMD_Tiny:{}
boot_stmt:sys.std.Core.Cat.create_depot_procedure:{}:{ ... }
BOOTLOADER STATEMENT
Grammar:
token boot_stmt {
boot_stmt
<value_node_elem_sep>
<imperative_routine_name>
<value_node_elem_sep>
<imperative_routine_upd_args>
<value_node_elem_sep>
<imperative_routine_ro_args>
}
token imperative_routine_name { <name_chain_payload> }
token imperative_routine_upd_args {
<list_open>
[[<name_payload> <pair_elem_sep> <name_chain_payload>]
** <list_elem_sep>]?
<list_close>
}
token imperative_routine_ro_args { <qtuple_payload> }
Examples:
boot_stmt:sys.std.Core.Cat.create_depot_procedure:{}:{ ... }
MULDIS D TINY DIALECT PRAGMAS
All of the following pragmas apply to both the PTMD_Tiny and HDMD_Perl(6|5)_Tiny dialects, and have the same semantics with both.
auto_add_attrs
All Muldis D values, besides scalars lacking any possreps, are defined in terms of a collection of attribute values, and there is no such thing as an attribute being undefined; normally when one selects a value of a particular attribute-based type, they must supply values for all of its attributes; this is true with values comprising the system catalog as with any other values. Code written in the Muldis D PTMD_Tiny or HDMD_Perl(6|5)_Tiny dialect is composed almost entirely of value literals, and by default all of the attribute values of said values must be explicitly given in the literals as sub-literals, even in the common case where some attributes just have the default values for their type.
While this fact allows for parsers to be very simple and for sub-literals to be compilable into values without knowing the context they're compiled into, it means that programmers would have to write maybe about twice as much code as they otherwise would if they could simply not write out the default-valued attributes.
If the 5th Extensions portion of the fully-qualified Muldis D language name contains a name+value pair of auto_add_attrs + Bool:true, then this activates the optional auto_add_attrs pragma, which provides one kind of automatic code completion. When auto_add_attrs is active, programmers may omit any literal attributes that they want, and those attributes will be automatically defined by the parser to have the default values for their type. More specifically, the wider literal whose attributes are missing is extended to become the default value of its type, except that any of its attributes that were explicitly given override the default's values for those attributes. The actual behaviour is essentially what the sys.std.QTuple.subst_in_default function does.
But the auto_add_attrs pragma is not simply an automatically invoked pre-processing Muldis D function, because it also serves the common case where one is defining relation literals that have different attributes specified per tuple; such a thing by itself isn't even valid as a generic relation, so it certainly can't be given to a Muldis D function; so the pragma has at least that advantage unique to itself.
Note that the lexer is exactly the same regardless of whether the auto_add_attrs pragma is turned on or off, because the matters of missing attributes were never tested or enforced at the lexical level in the first place; rather the pragma only affects the parsing stage that follows the lexing. In other words, the actual syntax or grammar is identical regardless of the setting of this pragma.
Now one consequence of using the auto_add_attrs pragma is that in general the parser must be more complicated, and read type definitions from the DBMS information schema so that it knows what attributes each literal is supposed to have, and their declared types, and also sub-literals can no longer in general be fully converted to values in isolation; now the parent-most literal must be evaluated first, because its declared type generally determines the declared types of its attributes, and then their attributes recursively. For nonscalar types, the initial declared type being looked at is the declared type of the bootloader-invoked routine's parameter that the literal is being given to as an argument.
Now if the declared type of said parameter is just a generic type, such as Relation or Array, then often no information can be gleaned from this context for what attributes should exist, and so you will need to make the arg literal include treat-as-type metadata that explicitly provides the specific type information needed; otherwise, auto_add_attrs won't help you and you must then fully define relation values with the same attributes per tuple. But fortunately for brevity, many of the places where auto_add_attrs would help you the most are where the bootloader is invoking system-defined data-defining procedures, and their parameters are all of attribute-specifying types, and it is in such data definition that you may be most likely to face a large number of default-valued attributes, such as comment.
Note that the reason the auto_add_attrs behaviour is turned off by default is twofold. First, the parser can be a lot simpler / more tiny with it off. Second, requiring users to explicitly define even default-valued attributes can make the code more self-documenting and can help users avoid some kinds of bugs due to action from unseen values, or due to some default values "silently" changing between language versions. So then essentially, turning on auto_add_attrs means the programmer is telling the parser "I know what I'm doing" by explicitly asking for potentially less-safe behaviour. Of course, even with auto_add_attrs turned on, you can still explicitly define attribute values that are their type's default values, so it is possible to compromise such as you like.
Also note that it should be trivial for a Muldis D implementation to let users input code written with auto_add_attrs turned on, and then output the version of that code for their perusal with it turned off, so they can see what extra values were filled in without having to write them manually.
auto_unabbrev_std_names
Normally when one is specifying a NameChain literal that is a reference to a standard system-defined type or routine, they must write out the name in full, starting with sys.std and so on through the unique part of the entity name. While this allows for clearly self-documenting code, as well as for relatively simple parsers, it can also be tedious for programmers who would prefer to write out the names in a less verbose manner, especially since to a point, a slightly more complicated parser could still unambiguously resolve a much shorter substring of the name.
If the 5th Extensions portion of the fully-qualified Muldis D language name contains a name+value pair of auto_unabbrev_std_names + Bool:true, then this activates the optional auto_unabbrev_std_names pragma, which provides one kind of automatic code completion. When auto_unabbrev_std_names is active, programmers may omit any number of consecutive leading chain elements from such a NameChain literal, so long as the remaining unqualified chain is distinct among all standard system-defined (sys.std-prefix) DBMS entities (as an exception, a non-distinct abbreviation is allowed iff exactly 1 of the candidate entities is in the language core, sys.std.Core-prefix, in which case that 1 is unambiguously the entity that is resolved to). This feature has no effect on the namespace prefixes like tuple_from or array_of; one still writes those as normal prepended to the otherwise shortened chains.
So for example, one can just write Int rather than sys.std.Core.Type.Int, is_identical rather than sys.std.Core.Universal.is_identical, QTuple.attr rather than sys.std.Core.QTuple.attr, min rather than sys.std.Ordered.min, array_of.Rat rather than array_of.sys.std.Core.Type.Rat, and so on.
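The resolution rule can be sketched in Python as a suffix match against the set of standard entity names, with the sys.std.Core exception applied when the match is not unique. This is an illustration only: the function name unabbreviate is hypothetical, the list of known entities must be supplied by the caller, and namespace prefixes such as tuple_from or array_of are not handled.

```python
def unabbreviate(short_name, std_entities):
    """Resolve an abbreviated chain against the sys.std entity names."""
    short = short_name.split(".")
    candidates = [
        full for full in std_entities
        if full.startswith("sys.std.") and full.split(".")[-len(short):] == short
    ]
    if len(candidates) == 1:
        return candidates[0]
    # Exception: a non-distinct abbreviation is allowed iff exactly one
    # candidate lives in the language core.
    core = [c for c in candidates if c.startswith("sys.std.Core.")]
    if len(core) == 1:
        return core[0]
    raise LookupError(f"cannot resolve {short_name!r}: {len(candidates)} candidates")
```

Given a catalog containing sys.std.Core.Type.Int and sys.std.Ordered.min, unabbreviate("Int", catalog) returns sys.std.Core.Type.Int and unabbreviate("min", catalog) returns sys.std.Ordered.min. If the catalog also held a second, non-core entity ending in Int, the core one would still be chosen.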
The auto_unabbrev_std_names pragma intentionally does not empower auto un-abbreviations of any namespaces other than sys.std, to keep things simple for users to predict and for systems to implement; it does not affect sys.(imp|cat), nor any other top-level namespace. When one is referencing either any system-defined implementation-specific (non-standard) types or routines, or any user-defined types or routines, or any dbvars or constraints or whatever, their names cannot be written abbreviated due to the auto_unabbrev_std_names pragma.
Note that the lexer is exactly the same regardless of whether the auto_unabbrev_std_names pragma is turned on or off, as per the auto_add_attrs pragma. Many other comments about the other pragma also apply to this one.
auto_chains_from_names
Iff both the auto_add_attrs and auto_unabbrev_std_names pragmas are active, then the optional auto_chains_from_names dependent pragma may be activated in the same manner (as an Extensions name+value pair with Bool:true). When auto_chains_from_names is active, programmers may write an otherwise abbreviated-to-one-chain-element NameChain literal as a plain Name literal; this can chop the literal down to a third or fourth of its otherwise-length such as in the case of a reference to the Int type. When the parent literal of such a faux-Name literal is examined for missing attributes, or examined that existing attributes are of the correct type, any attributes whose declared type says they are supposed to be NameChain but that have an explicitly defined Name child literal will have that literal mapped to and replaced with a single element NameChain literal, which can be subsequently un-abbreviated into a standard system-defined type or routine name. The auto_chains_from_names pragma will not work when the declared type being applied to a faux-Name is not a NameChain subtype, and such literals will then be taken as actual Name; where such declared type information is missing, you will need to write out the abbreviated chain as an actual NameChain literal. Note that the auto_chains_from_names pragma has no effect on NameChain literal bodies that don't comprise the payload portion of their parent literal, such as with the imperative routine name composed into a boot_stmt literal; literal bodies in those positions will always be interpreted according to NameChain literal body syntax.
SEE ALSO
Go to Muldis::D for the majority of distribution-internal references, and Muldis::D::SeeAlso for the majority of distribution-external references.
AUTHOR
Darren Duncan (perl@DarrenDuncan.net)
LICENSE AND COPYRIGHT
This file is part of the formal specification of the Muldis D language.
Muldis D is Copyright © 2002-2009, Muldis Data Systems, Inc.
See the LICENSE AND COPYRIGHT of Muldis::D for details.
TRADEMARK POLICY
The TRADEMARK POLICY in Muldis::D applies to this file too.
ACKNOWLEDGEMENTS
The ACKNOWLEDGEMENTS in Muldis::D apply to this file too.