Name
SPVM::Document::Language::SyntaxParsing - Syntax Parsing in the SPVM Language
Description
This document describes the grammar of the SPVM language and its syntax parsing.
Syntax Parsing
Syntax parsing is the step that builds an abstract syntax tree (AST).
This step runs immediately after tokenization.
Syntax parsing is performed according to the grammar of the SPVM language.
Grammar
The grammar of the SPVM language is described using GNU Bison syntax.
%token <opval> CLASS HAS METHOD OUR ENUM MY USE AS REQUIRE ALIAS ALLOW OUTMOST_CLASS MUTABLE
%token <opval> ATTRIBUTE MAKE_READ_ONLY INTERFACE EVAL_ERROR_ID ARGS_WIDTH VERSION_DECL
%token <opval> IF UNLESS ELSIF ELSE FOR WHILE LAST NEXT SWITCH CASE DEFAULT BREAK EVAL
%token <opval> SYMBOL_NAME VAR_NAME CONSTANT EXCEPTION_VAR
%token <opval> UNDEF VOID BYTE SHORT INT LONG FLOAT DOUBLE STRING OBJECT TRUE FALSE END_OF_FILE
%token <opval> FATCAMMA RW RO WO INIT NEW OF BASIC_TYPE_ID EXTENDS SUPER
%token <opval> RETURN WEAKEN DIE WARN PRINT SAY OUTMOST_CLASS_NAME UNWEAKEN '[' '{' '('
%type <opval> grammar
%type <opval> field_name method_name class_name
%type <opval> type qualified_type basic_type array_type opt_basic_type
%type <opval> array_type_with_length ref_type return_type type_comment opt_type_comment union_type
%type <opval> opt_classes classes class class_block opt_extends version_decl
%type <opval> opt_definitions definitions definition
%type <opval> enumeration enumeration_block opt_enumeration_items enumeration_items enumeration_item
%type <opval> method anon_method opt_args args arg use require class_alias our has anon_method_fields anon_method_field interface allow
%type <opval> opt_attributes attributes
%type <opval> opt_statements statements statement if_statement else_statement
%type <opval> for_statement while_statement foreach_statement
%type <opval> switch_statement case_statement case_statements opt_case_statements default_statement
%type <opval> block eval_block init_statement switch_block if_require_statement
%type <opval> die
%type <opval> var_decl var
%type <opval> operator opt_operators operators opt_operator
%type <opval> void_return_operator warn
%type <opval> unary_operator array_length
%type <opval> inc dec
%type <opval> binary_operator arithmetic_operator bit_operator comparison_operator string_concatenation logical_operator
%type <opval> assign
%type <opval> new array_init
%type <opval> type_check type_cast can
%type <opval> call_method
%type <opval> array_access field_access
%type <opval> weaken_field unweaken_field isweak_field
%type <opval> sequential
%right <opval> ASSIGN SPECIAL_ASSIGN
%left <opval> LOGICAL_OR
%left <opval> LOGICAL_AND
%left <opval> BIT_OR BIT_XOR
%left <opval> BIT_AND
%nonassoc <opval> NUMEQ NUMNE STREQ STRNE
%nonassoc <opval> NUMGT NUMGE NUMLT NUMLE STRGT STRGE STRLT STRLE ISA ISA_ERROR IS_TYPE IS_ERROR IS_COMPILE_TYPE NUMERIC_CMP STRING_CMP CAN
%left <opval> SHIFT
%left <opval> '+' '-' '.'
%left <opval> '*' DIVIDE DIVIDE_UNSIGNED_INT DIVIDE_UNSIGNED_LONG MODULO MODULO_UNSIGNED_INT MODULO_UNSIGNED_LONG
%right <opval> LOGICAL_NOT BIT_NOT '@' REFERENCE DEREFERENCE PLUS MINUS CONVERT SCALAR STRING_LENGTH ISWEAK TYPE_NAME COMPILE_TYPE_NAME DUMP NEW_STRING_LEN IS_READ_ONLY COPY AS_BOOL
%nonassoc <opval> INC DEC
%left <opval> ARROW
grammar
: opt_classes
field_name
: SYMBOL_NAME
method_name
: SYMBOL_NAME
class_name
: SYMBOL_NAME
qualified_type
: type
| MUTABLE type
type
: basic_type
| array_type
| ref_type
basic_type
: SYMBOL_NAME
| BYTE
| SHORT
| INT
| LONG
| FLOAT
| DOUBLE
| OBJECT
| STRING
ref_type
: basic_type '*'
array_type
: basic_type '[' ']'
| array_type '[' ']'
array_type_with_length
: basic_type '[' operator ']'
| array_type '[' operator ']'
return_type
: qualified_type opt_type_comment
| VOID
opt_type_comment
: /* Empty */
| type_comment
type_comment
: OF union_type
union_type
: union_type BIT_OR type
| type
opt_classes
: /* Empty */
| classes
classes
: classes class
| class
class
: CLASS opt_basic_type opt_extends class_block END_OF_FILE
| CLASS opt_basic_type opt_extends ':' opt_attributes class_block END_OF_FILE
| CLASS opt_basic_type opt_extends ';' END_OF_FILE
| CLASS opt_basic_type opt_extends ':' opt_attributes ';' END_OF_FILE
opt_basic_type
: /* Empty */
| basic_type
opt_extends
: /* Empty */
| EXTENDS basic_type
class_block
: '{' opt_definitions '}'
opt_definitions
: /* Empty */
| definitions
definitions
: definitions definition
| definition
definition
: version_decl
| use
| class_alias
| allow
| interface
| init_statement
| enumeration
| our
| has ';'
| method
init_statement
: INIT block
version_decl
: VERSION_DECL CONSTANT ';'
use
: USE basic_type ';'
| USE basic_type AS class_name ';'
require
: REQUIRE basic_type
class_alias
: ALIAS basic_type AS class_name ';'
allow
: ALLOW basic_type ';'
interface
: INTERFACE basic_type ';'
enumeration
: opt_attributes ENUM enumeration_block
enumeration_block
: '{' opt_enumeration_items '}'
opt_enumeration_items
: /* Empty */
| enumeration_items
enumeration_items
: enumeration_items ',' enumeration_item
| enumeration_items ','
| enumeration_item
enumeration_item
: method_name
| method_name ASSIGN CONSTANT
our
: OUR VAR_NAME ':' opt_attributes qualified_type opt_type_comment ';'
has
: HAS field_name ':' opt_attributes qualified_type opt_type_comment
method
: opt_attributes METHOD method_name ':' return_type '(' opt_args ')' block
| opt_attributes METHOD method_name ':' return_type '(' opt_args ')' ';'
| opt_attributes METHOD ':' return_type '(' opt_args ')' block
| opt_attributes METHOD ':' return_type '(' opt_args ')' ';'
anon_method
: opt_attributes METHOD ':' return_type '(' opt_args ')' block
| '[' anon_method_fields ']' opt_attributes METHOD ':' return_type '(' opt_args ')' block
opt_args
: /* Empty */
| args
args
: args ',' arg
| args ','
| arg
arg
: var ':' qualified_type opt_type_comment
| var ':' qualified_type opt_type_comment ASSIGN operator
anon_method_fields
: anon_method_fields ',' anon_method_field
| anon_method_fields ','
| anon_method_field
anon_method_field
: HAS field_name ':' opt_attributes qualified_type opt_type_comment
| HAS field_name ':' opt_attributes qualified_type opt_type_comment ASSIGN operator
| var ':' opt_attributes qualified_type opt_type_comment
| var ':' opt_attributes qualified_type opt_type_comment ASSIGN operator
opt_attributes
: /* Empty */
| attributes
attributes
: attributes ATTRIBUTE
| ATTRIBUTE
opt_statements
: /* Empty */
| statements
statements
: statements statement
| statement
statement
: if_statement
| for_statement
| foreach_statement
| while_statement
| block
| switch_statement
| case_statement
| default_statement
| eval_block
| if_require_statement
| LAST ';'
| NEXT ';'
| BREAK ';'
| RETURN ';'
| RETURN operator ';'
| operator ';'
| void_return_operator ';'
| ';'
| die ';'
die
: DIE operator
| DIE
| DIE type operator
| DIE type
| DIE operator ',' operator
void_return_operator
: warn
| PRINT operator
| SAY operator
| weaken_field
| unweaken_field
| MAKE_READ_ONLY operator
warn
: WARN operator
| WARN
for_statement
: FOR '(' opt_operator ';' operator ';' opt_operator ')' block
foreach_statement
: FOR var_decl '(' '@' operator ')' block
| FOR var_decl '(' '@' '{' operator '}' ')' block
while_statement
: WHILE '(' operator ')' block
switch_statement
: SWITCH '(' operator ')' switch_block
switch_block
: '{' opt_case_statements '}'
| '{' opt_case_statements default_statement '}'
opt_case_statements
: /* Empty */
| case_statements
case_statements
: case_statements case_statement
| case_statement
case_statement
: CASE operator ':' block
| CASE operator ':'
default_statement
: DEFAULT ':' block
| DEFAULT ':'
if_require_statement
: IF '(' require ')' block
| IF '(' require ')' block ELSE block
if_statement
: IF '(' operator ')' block else_statement
| UNLESS '(' operator ')' block else_statement
else_statement
: /* NULL */
| ELSE block
| ELSIF '(' operator ')' block else_statement
block
: '{' opt_statements '}'
eval_block
: EVAL block
var_decl
: MY var ':' qualified_type opt_type_comment
| MY var
var
: VAR_NAME
opt_operators
: /* Empty */
| operators
opt_operator
: /* Empty */
| operator
operator
: var
| EXCEPTION_VAR
| CONSTANT
| UNDEF
| type_cast
| new
| var_decl
| EVAL_ERROR_ID
| ARGS_WIDTH
| TRUE
| FALSE
| OUTMOST_CLASS_NAME
| unary_operator
| binary_operator
| assign
| inc
| dec
| type_check
| BASIC_TYPE_ID type
| can
| array_init
| array_access
| field_access
| isweak_field
| call_method
| sequential
sequential
: '(' operators ')'
operators
: operators ',' operator
| operators ','
| operator
unary_operator
: '+' operator %prec PLUS
| '-' operator %prec MINUS
| BIT_NOT operator
| TYPE_NAME operator
| COMPILE_TYPE_NAME operator
| STRING_LENGTH operator
| DUMP operator
| DEREFERENCE var
| REFERENCE operator
| NEW_STRING_LEN operator
| COPY operator
| IS_READ_ONLY operator
| array_length
| AS_BOOL operator
array_length
: '@' operator
| '@' '{' operator '}'
| SCALAR '@' operator
| SCALAR '@' '{' operator '}'
inc
: INC operator
| operator INC
dec
: DEC operator
| operator DEC
binary_operator
: arithmetic_operator
| bit_operator
| comparison_operator
| string_concatenation
| logical_operator
arithmetic_operator
: operator '+' operator
| operator '-' operator
| operator '*' operator
| operator DIVIDE operator
| operator DIVIDE_UNSIGNED_INT operator
| operator DIVIDE_UNSIGNED_LONG operator
| operator MODULO operator
| operator MODULO_UNSIGNED_INT operator
| operator MODULO_UNSIGNED_LONG operator
bit_operator
: operator BIT_XOR operator
| operator BIT_AND operator
| operator BIT_OR operator
| operator SHIFT operator
comparison_operator
: operator NUMEQ operator
| operator NUMNE operator
| operator NUMGT operator
| operator NUMGE operator
| operator NUMLT operator
| operator NUMLE operator
| operator NUMERIC_CMP operator
| operator STREQ operator
| operator STRNE operator
| operator STRGT operator
| operator STRGE operator
| operator STRLT operator
| operator STRLE operator
| operator STRING_CMP operator
string_concatenation
: operator '.' operator
logical_operator
: operator LOGICAL_OR operator
| operator LOGICAL_AND operator
| LOGICAL_NOT operator
type_check
: operator ISA type
| operator ISA_ERROR type
| operator IS_TYPE type
| operator IS_ERROR type
| operator IS_COMPILE_TYPE type
type_cast
: '(' qualified_type ')' operator %prec CONVERT
| operator ARROW '(' qualified_type ')' %prec CONVERT
can
: operator CAN method_name
| operator CAN CONSTANT
assign
: operator ASSIGN operator
| operator SPECIAL_ASSIGN operator
new
: NEW basic_type
| NEW array_type_with_length
| anon_method
array_init
: '[' opt_operators ']'
| '{' operators '}'
| '{' '}'
call_method
: OUTMOST_CLASS SYMBOL_NAME '(' opt_operators ')'
| OUTMOST_CLASS SYMBOL_NAME
| basic_type ARROW method_name '(' opt_operators ')'
| basic_type ARROW method_name
| operator ARROW method_name '(' opt_operators ')'
| operator ARROW method_name
| operator ARROW '(' opt_operators ')'
array_access
: operator ARROW '[' operator ']'
| array_access '[' operator ']'
| field_access '[' operator ']'
field_access
: operator ARROW '{' field_name '}'
| field_access '{' field_name '}'
| array_access '{' field_name '}'
weaken_field
: WEAKEN var ARROW '{' field_name '}'
unweaken_field
: UNWEAKEN var ARROW '{' field_name '}'
isweak_field
: ISWEAK var ARROW '{' field_name '}'
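Several list rules in the grammar (args, enumeration_items, anon_method_fields, operators) share one left-recursive shape: list ',' item, or list ',', or item. Read literally, that shape accepts one item followed by any number of commas, each optionally followed by another item, so a trailing comma is allowed. The Python sketch below recognizes that shape over a flat token list. The names parse_list, parse_opt_list, and is_item are hypothetical helpers for this example only, and any checks performed by the parser's semantic actions are outside its scope.

```python
def parse_list(tokens, is_item):
    """Recognize: list : list ',' item | list ',' | item. Returns the items."""
    if not tokens or not is_item(tokens[0]):
        raise SyntaxError("a list must start with an item")
    items = [tokens[0]]
    pos = 1
    while pos < len(tokens):
        if tokens[pos] != ",":
            raise SyntaxError(f"expected ',' but found {tokens[pos]!r}")
        pos += 1  # consume ',' (the `list ','` production)
        if pos < len(tokens) and is_item(tokens[pos]):
            items.append(tokens[pos])  # the `list ',' item` production
            pos += 1
    return items


def parse_opt_list(tokens, is_item):
    """Recognize: opt_list : /* Empty */ | list."""
    return [] if not tokens else parse_list(tokens, is_item)


def not_comma(token):
    """Treat every token except ',' as an item (an assumption for the demo)."""
    return token != ","


with_trailing_comma = parse_list(["$x", ",", "$y", ","], not_comma)
```

The opt_ prefix used throughout the grammar (opt_args, opt_operators, and so on) corresponds to parse_opt_list here: the empty input is accepted and yields an empty list.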
Grammar Tokens
These are the tokens used in the grammar.
Tokens | Token Values |
---|---|
ALIAS | alias |
ALLOW | allow |
ARROW | -> |
AS | as |
AS_BOOL | as_bool |
ASSIGN | = |
BIT_AND | & |
BASIC_TYPE_ID | basic_type_id |
BIT_NOT | ~ |
BIT_OR | \| |
BIT_XOR | ^ |
BREAK | break |
BYTE | byte |
CASE | case |
CLASS | class |
VAR_NAME | A variable name |
COMPILE_TYPE_NAME | compile_type_name |
CONSTANT | A literal |
CONVERT | (TYPE_NAME) |
COPY | copy |
OUTMOST_CLASS | & |
OUTMOST_CLASS_NAME | __PACKAGE__ |
DEC | -- |
DEFAULT | default |
DEREFERENCE | $ |
ATTRIBUTE | An attribute name |
DIE | die |
DIVIDE | / |
DIVIDE_UNSIGNED_INT | div_uint |
DIVIDE_UNSIGNED_LONG | div_ulong |
DOUBLE | double |
DUMP | dump |
ELSE | else |
ELSIF | elsif |
END_OF_FILE | The end of the file |
ENUM | enum |
EVAL_ERROR_ID | eval_error_id |
EXTENDS | extends |
EVAL | eval |
EXCEPTION_VAR | $@ |
FATCAMMA | => |
FLOAT | float |
FOR | for |
HAS | has |
CAN | can |
IF | if |
INTERFACE | interface |
INC | ++ |
INIT | INIT |
INT | int |
ISA | isa |
ISWEAK | isweak |
IS_TYPE | is_type |
IS_READ_ONLY | is_read_only |
LAST | last |
STRING_LENGTH | length |
LOGICAL_AND | && |
LOGICAL_NOT | ! |
LOGICAL_OR | \|\| |
LONG | long |
MAKE_READ_ONLY | make_read_only |
METHOD | method |
MINUS | - |
MUTABLE | mutable |
MY | my |
SYMBOL_NAME | A symbol name |
NEW | new |
NEW_STRING_LEN | new_string_len |
OF | of |
NEXT | next |
NUMEQ | == |
NUMERIC_CMP | <=> |
NUMGE | >= |
NUMGT | > |
NUMLE | <= |
NUMLT | < |
NUMNE | != |
OBJECT | object |
OUR | our |
PLUS | + |
REFERENCE | \ |
TYPE_NAME | type_name |
MODULO | % |
MODULO_UNSIGNED_INT | mod_uint |
MODULO_UNSIGNED_LONG | mod_ulong |
REQUIRE | require |
RETURN | return |
RO | ro |
RW | rw |
SAY | say |
SCALAR | scalar |
SELF | self |
SHIFT | << >> >>> |
SHORT | short |
SPECIAL_ASSIGN | += -= *= /= &= \|= ^= %= <<= >>= >>>= .= |
STRING_CMP | cmp |
STREQ | eq |
STRGE | ge |
STRGT | gt |
STRING | string |
STRLE | le |
STRLT | lt |
STRNE | ne |
SWITCH | switch |
UNDEF | undef |
UNLESS | unless |
UNWEAKEN | unweaken |
USE | use |
VAR | var |
VERSION_DECL | version |
VOID | void |
WARN | warn |
WEAKEN | weaken |
WHILE | while |
WO | wo |
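One consequence of the table is that a spelling does not always identify a token on its own: & is listed for both BIT_AND and OUTMOST_CLASS. The Python sketch below inverts a few rows copied from the table to find such shared spellings. The dictionary holds only a subset of the rows, and the helper name tokens_by_spelling is an assumption for this example; how the tokenizer chooses between the candidates is not described in this document.

```python
from collections import defaultdict

# A few rows copied from the token table (token name -> spelling).
TOKEN_VALUES = {
    "ARROW": "->",
    "ASSIGN": "=",
    "BIT_AND": "&",
    "OUTMOST_CLASS": "&",
    "LOGICAL_AND": "&&",
    "NUMEQ": "==",
    "STREQ": "eq",
    "IF": "if",
    "ELSIF": "elsif",
}


def tokens_by_spelling(token_values):
    """Invert the table: spelling -> sorted list of token names."""
    inverse = defaultdict(list)
    for name, spelling in token_values.items():
        inverse[spelling].append(name)
    return {spelling: sorted(names) for spelling, names in inverse.items()}


BY_SPELLING = tokens_by_spelling(TOKEN_VALUES)

# Spellings that map to more than one token need context to be resolved.
AMBIGUOUS = {s: names for s, names in BY_SPELLING.items() if len(names) > 1}
```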
Operator Precedence
The operator precedence in the SPVM language is described using GNU Bison syntax.
Precedence increases from top to bottom: the bottom line has the highest precedence and the top line has the lowest precedence.
%right <opval> ASSIGN SPECIAL_ASSIGN
%left <opval> LOGICAL_OR
%left <opval> LOGICAL_AND
%left <opval> BIT_OR BIT_XOR
%left <opval> BIT_AND
%nonassoc <opval> NUMEQ NUMNE STREQ STRNE
%nonassoc <opval> NUMGT NUMGE NUMLT NUMLE STRGT STRGE STRLT STRLE ISA ISA_ERROR IS_TYPE IS_ERROR IS_COMPILE_TYPE NUMERIC_CMP STRING_CMP CAN
%left <opval> SHIFT
%left <opval> '+' '-' '.'
%left <opval> '*' DIVIDE DIVIDE_UNSIGNED_INT DIVIDE_UNSIGNED_LONG MODULO MODULO_UNSIGNED_INT MODULO_UNSIGNED_LONG
%right <opval> LOGICAL_NOT BIT_NOT '@' REFERENCE DEREFERENCE PLUS MINUS CONVERT SCALAR STRING_LENGTH ISWEAK TYPE_NAME COMPILE_TYPE_NAME DUMP NEW_STRING_LEN IS_READ_ONLY COPY AS_BOOL
%nonassoc <opval> INC DEC
%left <opval> ARROW
The operator precedence can be overridden using parentheses ().
# a * b is calculated first
a * b + c
# b + c is calculated first
a * (b + c)
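The declarations in this section can be read as a table of levels and associativities: each line is one level, later lines bind tighter, %left groups to the left, %right groups to the right, and %nonassoc forbids chaining operators of the same level. The Python sketch below applies that reading with a small precedence-climbing parser over a handful of binary operators. The level numbers simply count the declaration lines from the top, the operator spellings come from the token table, and the parse function with its tuple-shaped tree is an assumption for illustration; it is not how the Bison-generated SPVM parser is implemented.

```python
# spelling -> (level, associativity); level N is the N-th declaration line.
BINARY_OPS = {
    "=": (1, "right"),      # ASSIGN
    "||": (2, "left"),      # LOGICAL_OR
    "&&": (3, "left"),      # LOGICAL_AND
    "==": (6, "nonassoc"),  # NUMEQ
    "+": (9, "left"),
    "-": (9, "left"),
    ".": (9, "left"),
    "*": (10, "left"),
    "/": (10, "left"),      # DIVIDE
}


def parse(tokens):
    """Parse a flat token list into nested (op, left, right) tuples."""
    pos = 0

    def primary():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expression(1)  # parentheses restart at the lowest level
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return node
        if tok in BINARY_OPS or tok == ")":
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    def expression(min_level):
        nonlocal pos
        left = primary()
        while pos < len(tokens) and tokens[pos] in BINARY_OPS:
            op = tokens[pos]
            level, assoc = BINARY_OPS[op]
            if level < min_level:
                break
            pos += 1
            # Right-associative operators let the same level recur on the right;
            # left and non-associative operators require a tighter right side.
            right = expression(level if assoc == "right" else level + 1)
            left = (op, left, right)
            if assoc == "nonassoc" and pos < len(tokens):
                nxt = BINARY_OPS.get(tokens[pos])
                if nxt is not None and nxt[0] == level:
                    raise SyntaxError(f"{op!r} is non-associative")
        return left

    tree = expression(1)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


without_parens = parse(["a", "*", "b", "+", "c"])
with_parens = parse(["a", "*", "(", "b", "+", "c", ")"])
```

In this sketch without_parens groups a * b first and with_parens groups b + c first, matching the two examples above. A chained assignment such as a = b = c groups to the right, while a == b == c is rejected because NUMEQ is declared with %nonassoc.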
See Also
Copyright & License
Copyright (c) 2023 Yuki Kimoto
MIT License