NAME
Muldis::D::Dialect::PTMD_Tiny - How to format Plain Text Muldis D
VERSION
This document is Muldis::D::Dialect::PTMD_Tiny version 0.56.0.
PREFACE
This document is part of the Muldis D language specification, whose root document is Muldis::D; you should read that root document before you read this one, which provides subservient details.
DESCRIPTION
This document outlines the grammar of the Plain Text Muldis D dialect named PTMD_Tiny. The fully-qualified name of this Muldis D dialect, in combination with the base language spec it is bundled with, is Muldis_D:'http://muldis.com':'N.N.N':PTMD_Tiny (when the bundled base language version is substituted for the N.N.N).
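As a small illustration of the naming scheme just described, the following Python sketch substitutes a base language version for the N.N.N placeholder and splits such a name back into its parts. The helper names dialect_name and split_dialect_name are hypothetical, and the sketch assumes the version always has the three-part numeric form used in this document; it does not handle the trailing extensions element that a full language name declaration carries.

```python
import re
from typing import Tuple

# Authority, version and dialect, as laid out in the fully-qualified name.
_NAME = re.compile(r"Muldis_D:'([^']*)':'(\d+\.\d+\.\d+)':(\w+)")


def dialect_name(base_version: str) -> str:
    """Substitute a base language version for the N.N.N placeholder."""
    return f"Muldis_D:'http://muldis.com':'{base_version}':PTMD_Tiny"


def split_dialect_name(name: str) -> Tuple[str, str, str]:
    """Return (authority, version, dialect) from a fully-qualified name."""
    m = _NAME.fullmatch(name)
    if m is None:
        raise ValueError(f"not a recognised dialect name: {name!r}")
    return m.group(1), m.group(2), m.group(3)


print(dialect_name("1.2.3"))
print(split_dialect_name(dialect_name("1.2.3")))
```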
This dialect is designed to exactly match the Muldis D system catalog (the possible representation of Muldis D code that is visible to or updateable by Muldis D programs at runtime) as to what non-critical meta-data it explicitly stores; so code in the PTMD_Tiny dialect should be round-trippable with the system catalog, with the result preserving all of the original details. Since it matches the system catalog, this dialect should be able to exactly represent all possible Muldis D base language code (and probably all extensions too), rather than a subset of it. That said, the PTMD_Tiny dialect does provide a choice of multiple syntax options for writing Muldis D value literals and DBMS entity (e.g. type and routine) declarations, so several very distinct PTMD_Tiny code artifacts may parse into the same system catalog entries. There is even a considerable level of abstraction in some cases, so that it is easier for programmers to write and understand typical PTMD_Tiny code, and so that this code isn't absurdly verbose.
This dialect is designed to be as small as possible while meeting the above criteria, and is designed such that a parser that handles all of this dialect can be tiny, hence the dialect's Tiny name. Likewise, a code generator for this dialect from the system catalog can be tiny.
A significant quality of the PTMD_Tiny dialect is that it is designed to work easily for a single-pass parser, or at least a single-pass lexer; all the context that one needs to know for how to parse or lex any arbitrary substring of code is provided by prior code. Therefore, a PTMD_Tiny parser can easily work on a streaming input like a file-handle where you can't go back earlier in the stream. Often this means a parser can work with little RAM.
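To make the single-pass property concrete, here is a minimal Python sketch of a forward-only lexer that reads one character at a time from a stream and never seeks backwards. The function name lex and its token categories (barewords, quoted segments, the '=>' pair separator, and single punctuation characters) are simplifying assumptions made for illustration; they are not the token set of the grammar in this file.

```python
import io
from typing import Iterator, TextIO

# Characters allowed in a non-quoted character string segment.
BAREWORD = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-")


def lex(stream: TextIO) -> Iterator[str]:
    """Yield tokens from `stream`, reading strictly forward."""
    ch = stream.read(1)
    while ch:
        if ch.isspace():
            ch = stream.read(1)  # whitespace between tokens is skipped
        elif ch == "'":
            # Quoted segment: raw quotes cannot occur inside it, so the
            # next quote character always ends the segment.
            buf = [ch]
            ch = stream.read(1)
            while ch and ch != "'":
                buf.append(ch)
                ch = stream.read(1)
            if not ch:
                raise ValueError("unterminated quoted segment")
            buf.append(ch)
            yield "".join(buf)
            ch = stream.read(1)
        elif ch in BAREWORD:
            buf = []
            while ch and ch in BAREWORD:
                buf.append(ch)
                ch = stream.read(1)
            yield "".join(buf)  # `ch` already holds the next unread character
        elif ch == "=":
            if stream.read(1) != ">":
                raise ValueError("expected '=>'")
            yield "=>"
            ch = stream.read(1)
        else:
            yield ch  # any other character is a one-character token
            ch = stream.read(1)


source = io.StringIO("Tuple:{ name => Text:'Michelle', age => Int:17 }")
print(list(lex(source)))
```

Because the generator holds only the current character and the token being built, memory use stays small regardless of input size, which matches the point made above about streaming input.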
The dialect is also designed so that any amount of whitespace can be added or omitted next to most non-alphanumeric characters (including those that are adjacent to alphanumeric tokens) without affecting the meaning of the code, except within character string literals. Long binary or character strings can be split into arbitrary-size substrings without affecting the meaning. Many elements are identified by name rather than ordinal position, so to some degree the order in which they appear has no effect on the meaning. Programmers can therefore easily format (separate, indent, linewrap, order) code how they like, and making an automated code reformatter shouldn't be difficult. Often, named elements can also be omitted entirely for brevity, in which case the parser would use context to supply default values for those elements.
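The string-splitting property can be sketched in Python as follows, based on the char_str and blob_payload tokens in the grammar of this file, where segments are joined with '~'. The function names are hypothetical. Two points are assumptions rather than statements of this document: that the escapes \b and \q stand for a backslash and a single quote, and that each Blob digit expands to a fixed number of bits (1, 2, 3 or 4 for the leading 1, 3, 7 or F), most significant bit first.

```python
import re

_SEGMENT = re.compile(r"'((?:\\[bq]|[^\\'])*)'|([A-Za-z0-9_-]+)")
_SEP = re.compile(r"\s*~\s*")
_ESCAPES = {"b": "\\", "q": "'"}  # assumed meanings of \b and \q
_BLOB_BITS = {"1": 1, "3": 2, "7": 3, "F": 4}  # assumed bits per digit


def decode_char_str(src: str) -> str:
    """Concatenate the quoted and non-quoted segments of a char_str."""
    src = src.strip()
    parts, pos = [], 0
    while pos < len(src):
        m = _SEGMENT.match(src, pos)
        if m is None:
            raise ValueError(f"bad segment at offset {pos}")
        if m.group(1) is not None:
            body = m.group(1)
            parts.append(re.sub(r"\\([bq])", lambda e: _ESCAPES[e.group(1)], body))
        else:
            parts.append(m.group(2))
        pos = m.end()
        if pos < len(src):
            sep = _SEP.match(src, pos)
            if sep is None or sep.end() == len(src):
                raise ValueError(f"expected '~' and a segment at offset {pos}")
            pos = sep.end()
    return "".join(parts)


def decode_blob_payload(payload: str) -> str:
    """Expand a blob payload such as 'F;A7~05E' into a string of bits."""
    head, sep, body = payload.partition(";")
    if not sep or head.strip() not in _BLOB_BITS:
        raise ValueError("blob payload must start with 1, 3, 7 or F and ';'")
    width = _BLOB_BITS[head.strip()]
    bits = []
    for digit in re.sub(r"[\s~]", "", body):
        value = int(digit, 16)
        if value >= 1 << width:
            raise ValueError(f"digit {digit!r} too large for this blob")
        bits.append(format(value, f"0{width}b"))
    return "".join(bits)


# Splitting a string into segments does not change its value.
print(decode_char_str("'Hello, ' ~ 'World'") == decode_char_str("'Hello, World'"))
print(decode_blob_payload("F;A7 ~ 05E") == decode_blob_payload("F;A705E"))
```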
Given that plain text is (more or less) universally unambiguously portable between all general purpose languages that could be used to implement a DBMS, it is expected that every single Muldis D implementation will natively accept input in the PTMD_Tiny dialect, which isn't dependent on any specific host language and should be easy enough to process, so it should be considered the safest official Muldis D dialect to write in by default, when you don't have a specific reason to use some other dialect.
See also the dialects HDMD_Perl6_Tiny and HDMD_Perl5_Tiny, which are derived directly from PTMD_Tiny, and represent possible Perl 6 and 5 concrete syntax trees for it; in fact, most of the details in common with those other dialects are described just in the current file, for all 3 dialects.
GENERAL STRUCTURE
A PTMD_Tiny Muldis D code file consists just of a full or partial Muldis D bootloader routine definition, which begins with a language name declaration, and otherwise is simply an ordered sequence of imperative routine calls, where earlier routine calls are to system-defined data-definition routines (their arguments are values to put in the system catalog), and later ones are then to user-defined routines that the earlier statements either loaded or defined. This is conceptually what a PTMD_Tiny file is, and it can even be that literally, but PTMD_Tiny provides a canonical further abstraction which should be used when doing data-definition. And so you typically use syntax resembling routine and type declarations in a general purpose programming language, where simply declaring such an entity will cause it to be written into the system catalog for subsequent use.
The grammar in this file is formatted as a Perl 6 grammar (see http://perlcabal.org/syn/S05.html for details) which could be used to parse it, but it is only meant to be illustrative, and would need further additions or changes to actually function in Perl 6. The grammar consists mainly of named tokens which define matching rules. A token's name is declared as a bareword following the keyword token and it is subsequently referenced with '<' and '>' delimiters. Any other bareword in a token definition consisting of alphanumerics is matched literally, and all non-quoted whitespace is not significant. Any pairs of parentheses ('(' and ')') in token definitions are capturing groups, and each parser match by a pair corresponds to a capture node or node element in the concrete syntax tree resulting from the parse.
The grammar of the PTMD_Tiny dialect has 3 main subsections, the first being the syntax for declaring a Muldis D language name, the second being the syntax for Muldis D value literals, and the third being the syntax for DBMS entity definition and routine invocation. The subsection for a language name (having the root grammar token language_name) is quite small and is defined partly in terms of the value literals subsection. The subsection for value literals (having the root grammar token value) is completely self-defined and can be used in isolation from the wider grammar as a Muldis D sub-language; for example, a hosted-data Muldis D implementation may have an object representing a Muldis D value, which is initialized using code written in that sub-language. The subsection for entity definition and invocation (having the root grammar token boot_stmt) is defined partly in terms of the value literals subsection. The root grammar token for the entire dialect is bootloader.
REWRITE PROGRESS MARKER
What follows next in this file, between this point and the SEE ALSO, still has to be rewritten according to certain TODO plans, after which this whole file should reflect what the DESCRIPTION etc above says.
GRAMMAR OF TINY PLAIN TEXT MULDIS D
token bootloader {
<language_name>
<bootloader_imperative_routine_call>*
}
token language_name {
Muldis_D
<val_node_elem_sep>
<ln_authority>
<val_node_elem_sep>
<ln_version>
<val_node_elem_sep>
<ln_dialect>
<val_node_elem_sep>
<ln_extensions>
}
token ln_authority { <char_str> }
token ln_version { <char_str> }
token ln_dialect { PTMD_Tiny }
token ln_extensions { <tuple_or_qv_payload> }
token bootloader_imperative_routine_call {
boot_call
<val_node_elem_sep>
<imperative_routine_name>
<val_node_elem_sep>
<imperative_routine_upd_args>
<val_node_elem_sep>
<imperative_routine_ro_args>
}
token imperative_routine_name { <name_chain_payload> }
token imperative_routine_upd_args {
<list_open>
[[<name_payload> <pair_elem_sep> <name_chain_payload>]
** <list_elem_sep>]?
<list_close>
}
token imperative_routine_ro_args { <tuple_or_qv_payload> }
token literal {
<scalar_or_qv>
| <bool>
| <int>
| <string>
| <blob>
| <text>
| <tuple_or_qv>
| <relation_or_qv>
| <name>
| <name_chain>
| <decl_name_chain>
| <comment>
| <order>
| <rat>
| <rat_round_meth>
| <instant>
| <duration>
}
token scalar_or_qv {
[Q]? Scalar <val_node_elem_sep>
<type_name> <val_node_elem_sep>
<scalar_or_qv_payload>
}
token scalar_or_qv_payload {
<possrep_name> <val_payload_elem_sep>
<possrep_attrs>
}
token possrep_name { <name_payload> }
token possrep_attrs { <tuple_or_qv_payload> }
token bool {
Bool <val_node_elem_sep>
[false|true|0|1]
}
token int {
Int <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<int_payload>
}
token int_payload {
[<int_max_col_val> <val_payload_elem_sep>]?
<int_body>
}
token string {
String <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<string_payload>
}
token string_payload {
[<int_max_col_val> <val_payload_elem_sep>]?
<ord_list_open>
[<int_body> ** <list_elem_sep>]?
<ord_list_close>
}
token blob {
Blob <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<blob_payload>
}
token blob_payload {
<[137F]> <val_payload_elem_sep>
[[<[ 0..9 A..F ]>+] ** <segment_sep>]?
}
token text {
Text <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<text_payload>
}
token text_payload {
<char_str>
}
token tuple_or_qv {
[Q]? Tuple <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<tuple_or_qv_payload>
}
token relation_or_qv {
<generic_relation_or_qv>
| <set_or_qv>
| <nothing>
| <single_or_qv>
| <array_or_qv>
| <bag_or_qv>
}
token generic_relation_or_qv {
[Q]? Relation <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<generic_relation_or_qv_payload>
}
token generic_relation_or_qv_payload {
<generic_relation_empty_body_or_qv_payload>
| <generic_relation_nonordered_attr_or_qv_payload>
| <generic_relation_ordered_attr_or_qv_payload>
}
token generic_relation_empty_body_or_qv_payload {
<list_open>
[<name_payload> ** <list_elem_sep>]?
<list_close>
}
token generic_relation_nonordered_attr_or_qv_payload {
<list_open>
[<tuple_or_qv_payload> ** <list_elem_sep>]?
<list_close>
}
token generic_relation_ordered_attr_or_qv_payload {
<ord_list_open>
[<name_payload> ** <list_elem_sep>]?
<ord_list_close>
<val_payload_elem_sep>
<list_open>
[[
<ord_list_open>
[<literal> ** <list_elem_sep>]?
<ord_list_close>
] ** <list_elem_sep>]?
<list_close>
}
token tuple_or_qv_payload {
<list_open>
[[<name_payload> <pair_elem_sep> <literal>]
** <list_elem_sep>]?
<list_close>
}
token set_or_qv {
[Q]? Set <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<set_or_qv_payload>
}
token set_or_qv_payload {
<list_open>
[<literal> ** <list_elem_sep>]?
<list_close>
}
token nothing {
Nothing
}
token single_or_qv {
[Q]? Single <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<single_or_qv_payload>
}
token single_or_qv_payload {
<list_open>
<literal>
<list_close>
}
token array_or_qv {
[Q]? Array <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<array_or_qv_payload>
}
token array_or_qv_payload {
<ord_list_open>
[<literal> ** <list_elem_sep>]?
<ord_list_close>
}
token bag_or_qv {
[Q]? Bag <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<bag_or_qv_payload>
}
token bag_or_qv_payload {
<bag_or_qv_payload_counted_values>
| <bag_or_qv_payload_repeated_values>
}
token bag_or_qv_payload_counted_values {
<list_open>
[[<literal> <pair_elem_sep> <count>] ** <list_elem_sep>]?
<list_close>
}
token count {
[<int_max_col_val> <val_payload_elem_sep>]?
<pint_body>
}
token bag_or_qv_payload_repeated_values {
<list_open>
[<literal> ** <list_elem_sep>]?
<list_close>
}
token list_open { \s* '{' \s* }
token list_close { \s* '}' \s* }
token ord_list_open { \s* '[' \s* }
token ord_list_close { \s* ']' \s* }
token list_elem_sep { \s* ',' \s* }
token pair_elem_sep { \s* '=>' \s* }
token val_node_elem_sep { \s* ':' \s* }
token val_payload_elem_sep { \s* ';' \s* }
token char_str {
[<char_str_seg> ** <segment_sep>]?
}
token segment_sep { \s* '~' \s* }
token char_str_seg {
<quoted_char_str_seg>
| <nonquoted_char_str_seg>
}
token quoted_char_str_seg {
<[']>
['\b'|'\q'|<-[\\\']>]*
<[']>
}
token nonquoted_char_str_seg { <[ a..z A..Z 0..9 _ - ]>+ }
token int_max_col_val { <pint_head> }
token int_body { [0|\-?<pint_body>] }
token nnint_body { [0|<pint_body>] }
token pint_body { <pint_head> <pint_tail>? }
token pint_head { <[ 1..9 A..Z ]> }
token pint_tail { [<[ 0..9 A..Z _ ]>+] ** <segment_sep> }
token name {
Name <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<name_payload>
}
token name_payload {
<char_str>
}
token name_chain {
NameChain <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<name_chain_payload>
}
token name_chain_payload {
<name_payload> [\s* <nc_elem_sep> \s* <name_payload>]+
}
token nc_elem_sep { '.' }
token decl_name_chain {
DeclNameChain <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<decl_name_chain_payload>
}
token decl_name_chain_payload {
[<name_payload> [\s* <nc_elem_sep> \s* <name_payload>]*]?
}
token comment {
Comment <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<comment_payload>
}
token comment_payload {
<char_str>
}
token order {
Order <val_node_elem_sep>
[increase|same|decrease|-1|0|1]
}
token rat {
Rat <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<rat_payload>
}
token rat_payload {
[<int_max_col_val> <val_payload_elem_sep>]?
<rat_body>
}
token rat_body {
<int_body>\.?<pint_tail>?
| <int_body> \s* \/ \s* <pint_body>
| <int_body> \s* \* \s* <pint_body> \s* \^ \s* <int_body>
}
token nnrat_body {
<nnint_body>\.?<pint_tail>?
| <nnint_body> \s* \/ \s* <pint_body>
| <nnint_body> \s* \* \s* <pint_body> \s* \^ \s* <int_body>
}
token rat_round_meth {
RatRoundMeth <val_node_elem_sep>
[half_down|half_up|half_even|to_floor|to_ceiling|to_zero|to_inf]
}
token instant {
[UTC|Float] Instant <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<instant_payload>
}
token instant_payload {
[<int_max_col_val> <val_payload_elem_sep>]?
<ord_list_open>
<int_body>? <list_elem_sep>
[<pint_body>? <list_elem_sep>] ** 2
[<nnint_body>? <list_elem_sep>] ** 2
<nnrat_body>?
<ord_list_close>
}
token duration {
Duration <val_node_elem_sep>
[<type_name> <val_node_elem_sep>]?
<duration_payload>
}
token duration_payload {
[<int_max_col_val> <val_payload_elem_sep>]?
<ord_list_open>
[<int_body>? <list_elem_sep>] ** 5
<rat_body>
<ord_list_close>
}
token type_name { <name_chain_payload> }
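The int_payload, int_max_col_val and int_body tokens above allow an integer to be written in a chosen radix. The Python sketch below shows one way to read such a payload. It rests on assumptions that this file does not state outright: that the optional leading character is the largest digit of the radix (so F means base 16 and Z means base 36), that a payload without it is base 10, and that underscores and '~' separators are purely visual. The function name decode_int_payload is hypothetical.

```python
import re


def decode_int_payload(payload: str) -> int:
    """Decode an int payload such as 'F;DEADBEEF' or '-34' to a Python int."""
    head, sep, body = payload.partition(";")
    if sep:
        head = head.strip()
        if not re.fullmatch(r"[1-9A-Z]", head):
            raise ValueError(f"bad max column value: {head!r}")
        radix = int(head, 36) + 1  # largest digit value plus one
    else:
        radix, body = 10, head  # assumed default when no radix is given
    # Drop whitespace, '~' segment separators and '_' digit separators.
    digits = re.sub(r"[\s~_]", "", body)
    if not re.fullmatch(r"-?[0-9A-Z]+", digits):
        raise ValueError(f"bad integer body: {body!r}")
    return int(digits, radix)  # raises ValueError for out-of-range digits


for text in ["1;11001001", "7;644", "-34", "F;DEADBEEF", "3;301", "B;A09B"]:
    print(text, "=", decode_int_payload(text))
```

Under these assumptions the Int fragments in EXAMPLES decode as expected; for instance, '1;11001001' and '7;644' give 201 and 420 respectively.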
EXAMPLES
The following are fragments of actual Plain Text Muldis D code.
Muldis_D:'http://muldis.com':'1.2.3':PTMD_Tiny:{}
Muldis_D:'http://muldis.com':'1.2.3':PTMD_Tiny:{
auto_add_attrs => Bool:true,
auto_unabbrev_std_names => Bool:true,
auto_chains_from_names => Bool:true
}
boot_call:sys.std.Core.Cat.create_depot_procedure:{}:{ ... }
Scalar:sys.std.Rational.Type.Rat:float;{
mantissa => Int:45207196,
radix => Int:10,
exponent => Int:37
}
Scalar:sys.std.Temporal.Type.UTCDateTime:datetime;{
year => Int:2003,
month => Int:10,
day => Int:26,
hour => Int:1,
minute => Int:30,
second => Rat:0
}
Scalar:fed.lib.the_db.WeekDay:name;{
'' => Text:'monday'
}
Scalar:fed.lib.the_db.WeekDay:number;{
'' => Int:5
}
Bool:true
Int:1;11001001
Int:7;0
Int:7;644
Int:-34
Int:42
Int:F;DEADBEEF
Int:Z;-HELLOWORLD
Int:3;301
Int:B;A09B
String:F;[50,65,72,6C]
Blob:1;00101110100010
Blob:3;
Blob:F;A705E
Blob:7;523504376
Text:'Ceres'
Text:'サンプル'
Text:''
Text:'Perl'
Tuple:{}
Tuple:type.tuple_from.var.fed.data.the_db.account.users:{
login_name => Text:'hartmark',
login_pass => Text:'letmein',
is_special => Bool:true
}
Tuple:{
name => Text:'Michelle',
age => Int:17
}
Relation:{}
Relation:{ x, y, z }
Relation:{ {} }
Relation:{
{
login_name => Text:'hartmark',
login_pass => Text:'letmein',
is_special => Bool:true
}
}
Relation:fed.lib.the_db.gene.Person:[ name, age ];{
[ Text:'Michelle', Int:17 ]
}
Set:fed.lib.the_db.account.Country_Names:{
Text:'Canada',
Text:'Spain',
Text:'Jordan',
Text:'Thailand'
}
Set:{
Int:3,
Int:16,
Int:85
}
Nothing
Single:{ Text:'I know this one!' }
Array:[
Text:'Alphonse',
Text:'Edward',
Text:'Winry'
]
Array:fed.lib.the_db.stats.Samples_By_Order:[
Int:57,
Int:45,
Int:63,
Int:61
]
Bag:fed.lib.the_db.inventory.Fruit:{
Text:'Apple' => 500,
Text:'Orange' => 300,
Text:'Banana' => 400
}
Bag:{
Text:'Foo',
Text:'Quux',
Text:'Foo',
Text:'Bar',
Text:'Baz',
Text:'Baz'
}
Name:login_pass
Name:'First Name'
NameChain:fed.data.the_db.gene.sorted_person_name
NameChain:fed.data.the_db.stats.'samples by order'
DeclNameChain:gene.sorted_person_name
DeclNameChain:stats.'samples by order'
Comment:'This does something.'
Order:same
Rat:1;-1.1
Rat:-1.5
Rat:3.14159
Rat:A;0.0
Rat:F;DEADBEEF.FACE
Rat:Z;0.000AZE
Rat:6;500001/1000
Rat:B;A09B/A
Rat:1;1011101101*10^-11011
Rat:45207196*10^37
Rat:1/43
Rat:314159*10^-5
RatRoundMeth:half_up
UTCInstant:[1964,10,16,16,12,47.5]
UTCInstant:[2002,12,16,,,]
UTCInstant:[,,,14,2,29]
FloatInstant:[2003,4,5,2,,]
FloatInstant:[1407,,,,,]
Duration:[3,5,1,6,15,45.000012]
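The Rat fragments above use the three alternatives of the rat_body token: a radix-point form, a numerator/denominator form, and a mantissa*radix^exponent form. The following Python sketch converts each form to an exact fraction. It assumes, as the fragments suggest but this file does not spell out, that every number in a payload is written in the same base, that the optional leading character is the largest digit of that base, and that the default base is 10. The function name decode_rat_payload is hypothetical.

```python
import re
from fractions import Fraction


def _to_int(text: str, radix: int) -> int:
    """Parse an integer body, ignoring whitespace, '~' and '_'."""
    return int(re.sub(r"[\s~_]", "", text), radix)


def decode_rat_payload(payload: str) -> Fraction:
    """Decode a rat payload such as '1;-1.1' or '314159*10^-5'."""
    head, sep, body = payload.partition(";")
    if sep:
        radix = int(head.strip(), 36) + 1  # largest digit value plus one
    else:
        radix, body = 10, head  # assumed default base
    if "/" in body:
        num, den = body.split("/")
        return Fraction(_to_int(num, radix), _to_int(den, radix))
    if "*" in body:
        mantissa, rest = body.split("*")
        base, exponent = rest.split("^")
        return _to_int(mantissa, radix) * Fraction(_to_int(base, radix)) ** _to_int(exponent, radix)
    # Radix-point form: handle the sign separately so that '-0.5' works.
    whole, _, frac = body.partition(".")
    frac = re.sub(r"[\s~_]", "", frac)
    value = Fraction(abs(_to_int(whole, radix)))
    if frac:
        value += Fraction(int(frac, radix), radix ** len(frac))
    return -value if whole.strip().startswith("-") else value


for text in ["1;-1.1", "-1.5", "3.14159", "314159*10^-5", "1/43", "B;A09B/A"]:
    print(text, "=", decode_rat_payload(text))
```

With these assumptions, '1;-1.1' and '-1.5' denote the same value, as do '3.14159' and '314159*10^-5'.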
SEE ALSO
Go to Muldis::D for the majority of distribution-internal references, and Muldis::D::SeeAlso for the majority of distribution-external references.
AUTHOR
Darren Duncan (perl@DarrenDuncan.net)
LICENSE AND COPYRIGHT
This file is part of the formal specification of the Muldis D language.
Muldis D is Copyright © 2002-2008, Darren Duncan.
See the LICENSE AND COPYRIGHT of Muldis::D for details.
TRADEMARK POLICY
The TRADEMARK POLICY in Muldis::D applies to this file too.
ACKNOWLEDGEMENTS
The ACKNOWLEDGEMENTS in Muldis::D apply to this file too.