=encoding utf8

=head1 Name

SPVM::Document::Language::Tokenization - Tokenization in the SPVM Language

=head1 Description

This document describes tokenization in the SPVM language.

=head1 Tokenization

This section describes L<lexical analysis|https://en.wikipedia.org/wiki/Lexical_analysis> in the SPVM language. This is called tokenization.

See L<SPVM::Document::Language::SyntaxParsing> about syntax parsing.

=head2 Character Encoding

The character encoding of SPVM source code is UTF-8.

If a character is an ASCII character, it must be an ASCII printable character or a L<space character|/"Space Characters">.

Compilation Errors:

The character encoding of SPVM source code must be UTF-8. Otherwise a compilation error occurs.

If a character is an ASCII character, it must be an L<ASCII printable character|https://en.wikipedia.org/wiki/ASCII#Printable_characters> or a L<space character|/"Space Characters">. Otherwise a compilation error occurs.

=head2 Line Terminators

The line terminator is ASCII C<LF>.

When a line terminator appears, the current line number is incremented by 1.

=head2 Space Characters

The space characters are ASCII C<SP>, C<HT>, C<FF>, C<LF>.

=head2 Word Characters

The word characters are ASCII C<a-zA-Z>, C<0-9>, C<_>.

=head2 Names

This section describes names.

=head3 Symbol Name

A symbol name consists of L<word characters|/"Word Characters"> and C<::>.

It does not contain C<__>.

It does not begin with C<0-9>.

It does not begin with C<::>.

It does not end with C<::>.

It does not contain C<::::>.

Compilation Errors:

If a symbol name is invalid, a compilation error occurs.

Examples:

  # Symbol names
  foo
  foo_bar2
  Foo::Bar

  # Invalid symbol names
  2foo
  foo__bar
  ::Foo
  Foo::
  Foo::::Bar

=head3 Class Name

A class name is a L<symbol name|/"Symbol Name">.

Each partial name of a class name must begin with an uppercase letter.

Partial names are individual names separated by C<::>.
For example, the partial names of C<Foo::Bar::Baz> are C<Foo>, C<Bar>, and C<Baz>.

Compilation Errors:

If a class name is invalid, a compilation error occurs.

Examples:

  # Class names
  Foo
  Foo::Bar
  Foo::Bar::Baz3
  Foo_Bar::Baz_Baz

  # Invalid class names
  foo
  Foo::::Bar
  Foo::Bar::
  Foo__Bar
  Foo::bar

=head3 Method Name

A method name is a L<symbol name|/"Symbol Name"> without C<::> or an empty string C<"">.

Method names with the same name as L<keywords|/"Keywords"> are allowed.

Compilation Errors:

If a method name is invalid, a compilation error occurs.

Examples:

  # Method names
  FOO
  FOO_BAR3
  foo
  foo_bar
  _foo
  _foo_bar_

  # Invalid method names
  foo__bar
  3foo

=head3 Field Name

A field name is a L<symbol name|/"Symbol Name"> without C<::>.

Field names with the same name as L<keywords|/"Keywords"> are allowed.

Compilation Errors:

If a field name is invalid, a compilation error occurs.

Examples:

  # Field names
  FOO
  FOO_BAR3
  foo
  foo_bar
  _foo
  _foo_bar_

  # Invalid field names
  foo__bar
  3foo
  Foo::Bar

=head3 Variable Name

A variable name begins with C<$> and is followed by a L<symbol name|/"Symbol Name">.

The symbol name in a variable name can be surrounded by C<{> and C<}>.

Compilation Errors:

If a variable name is invalid, a compilation error occurs.

If an opening C<{> exists and the closing C<}> does not exist, a compilation error occurs.

Examples:

  # Variable names
  $name
  $my_name
  ${name}
  $Foo::name
  $Foo::Bar::name
  ${Foo::name}

  # Invalid variable names
  $::name
  $name::
  $Foo::::name
  $my__name
  ${name

=head4 Class Variable Name

A class variable name is a L<variable name|/"Variable Name">.

Examples:

  # Class variable names
  $NAME
  $MY_NAME
  ${NAME}
  $FOO::NAME
  $FOO::BAR::NAME
  ${FOO::NAME_BRACE}
  $FOO::name

  # Invalid class variable names
  $::NAME
  $NAME::
  $FOO::::NAME
  $MY__NAME
  $3FOO
  ${NAME

=head4 Local Variable Name

A local variable name is a L<variable name|/"Variable Name"> without C<::>.
Examples:

  # Local variable names
  $name
  $my_name
  ${name_brace}
  $_name
  $NAME

  # Invalid local variable names
  $::name
  $name::
  $Foo::name
  $Foo::::name
  $my__name
  ${name
  $3foo

=head2 Keywords

The List of Keywords:

  alias
  allow
  as
  basic_type_id
  break
  byte
  can
  case
  cmp
  class
  compile_type_name
  copy
  default
  die
  div_uint
  div_ulong
  double
  dump
  elsif
  else
  enum
  eq
  eval
  eval_error_id
  extends
  for
  float
  false
  gt
  ge
  has
  if
  interface
  int
  interface_t
  isa
  isa_error
  isweak
  is_compile_type
  is_type
  is_error
  is_read_only
  args_width
  last
  length
  lt
  le
  long
  make_read_only
  my
  mulnum_t
  method
  mod_uint
  mod_ulong
  mutable
  native
  ne
  next
  new
  new_string_len
  of
  our
  object
  print
  private
  protected
  public
  precompile
  pointer
  return
  require
  required
  rw
  ro
  say
  static
  switch
  string
  short
  scalar
  true
  type_name
  undef
  unless
  unweaken
  use
  version
  void
  warn
  while
  weaken
  wo
  INIT
  __END__
  __PACKAGE__
  __FILE__
  __LINE__

=head2 Operator Tokens

The List of Operator Tokens:

  !
  !=
  $
  %
  &
  &&
  &=
  =
  ==
  ^
  ^=
  |
  ||
  |=
  -
  --
  -=
  ~
  @
  +
  ++
  +=
  *
  *=
  <
  <=
  >
  >=
  <=>
  %
  %=
  <<
  <<=
  >>=
  >>
  >>>
  >>>=
  .
  .=
  /
  /=
  \
  (
  )
  {
  }
  [
  ]
  ;
  :
  ,
  ->
  =>

=head2 Comment

Comments have no meaning.

  #COMMENT

A comment begins with C<#>. It is followed by any string I<COMMENT>. It ends with ASCII C<LF>.

L<Line directives|/"Line Directive"> take precedence over comments.

L<File directives|/"File Directive"> take precedence over comments.

Examples:

  # This is a comment line

=head2 Line Directive

A line directive sets the current line number.

  #line NUMBER

A line directive begins with C<#line> at the beginning of a line. It is followed by one or more ASCII C<SP>. It is followed by I<NUMBER>. I<NUMBER> is a positive 32-bit integer. It ends with ASCII C<LF>.

The current line number of the source code is set to I<NUMBER>.

Line directives take precedence over L<comments|/"Comment">.

Compilation Errors:

A line directive must begin at the beginning of a line. Otherwise a compilation error occurs.

A line directive must end with "\n". Otherwise a compilation error occurs.
A line directive must have a line number. Otherwise a compilation error occurs.

The line number given to a line directive must be a positive 32-bit integer. Otherwise a compilation error occurs.

Examples:

  class MyClass {
    static method main : void () {

  #line 39

    }
  }

=head2 File Directive

A file directive sets the current file path.

  #file "FILE_PATH"

A file directive begins with C<#file> at the beginning of the source code, excluding a shebang line.

A shebang line before a file directive is allowed.

  #!command
  #file "FILE_PATH"

It is followed by one or more ASCII C<SP>. It is followed by C<">. It is followed by I<FILE_PATH>. I<FILE_PATH> is a string that represents a file path. It is closed with C<">. It ends with ASCII C<LF>.

The current file path is set to I<FILE_PATH>.

File directives take precedence over L<comments|/"Comment">.

Compilation Errors:

A file directive must begin at the beginning of the source code. Otherwise a compilation error occurs.

A file directive must end with "\n". Otherwise a compilation error occurs.

A file directive must have a file path. Otherwise a compilation error occurs.

The file path of a file directive must be closed with C<">. Otherwise a compilation error occurs.

Examples:

  #file "/path/MyClass.spvm"
  class MyClass {

  }

=head2 lib Directive

A lib directive gives a hint for a class search directory to the L<spvm> command and the L<spvmcc> command.

  #lib "CLASS_SEARCH_DIRECTORY"

A lib directive begins with C<#lib> at the beginning of a line. It is followed by one or more ASCII C<SP>. It is followed by C<">. It is followed by I<CLASS_SEARCH_DIRECTORY>. I<CLASS_SEARCH_DIRECTORY> is a string that represents a L<class search directory|SPVM::Document::Language::Class/"Class Search Directories">. It is closed with C<">. It ends with ASCII C<LF>.

Lib directives take precedence over L<comments|/"Comment">.

I<CLASS_SEARCH_DIRECTORY> can contain C<$FindBin::Bin>. This is expanded to the directory where the SPVM script is placed.
For example:

  #lib "$FindBin::Bin/lib/SPVM"

Compilation Errors:

A lib directive must begin at the beginning of a line. Otherwise a compilation error occurs.

A lib directive must end with "\n". Otherwise a compilation error occurs.

The directory specified by a lib directive must not be an empty string. Otherwise a compilation error occurs.

The directory specified by a lib directive must be closed with C<">. Otherwise a compilation error occurs.

Examples:

C<my_script.spvm>:

  #lib "$FindBin::Bin/lib/SPVM"

  class {

  }

=head2 __END__

If a line begins with C<__END__> and ends with ASCII C<LF>, the line with C<__END__> and the lines below it are interpreted as L<comments|/"Comment">.

Examples:

  class MyClass {

  }

  __END__

  foo
  bar

=head2 POD

POD is a syntax for writing multiline comments. POD has no meaning.

The Beginning of a POD:

  =NAME

The beginning of a POD begins with C<=> at the beginning of a line. It is followed by I<NAME>. I<NAME> is any string that begins with ASCII C<a-zA-Z>. It ends with ASCII C<LF>.

The End of a POD:

  =cut

The end of a POD begins with C<=> at the beginning of a line. It is followed by C<cut>. It ends with ASCII C<LF>.

Examples:

  =pod

  Comment1

  Comment2

  =cut

  =head1

  Comment1

  Comment2

  =cut

=head2 Fat Comma

A fat comma is the following token.

  =>

The fat comma is an alias for a comma C<,>.

  # Comma
  ["a", "b", "c", "d"]

  # Fat Comma
  ["a" => "b", "c" => "d"]

If the left operand of a fat comma is a L<symbol name|/"Symbol Name"> without C<::>, it is wrapped by C<"> and is treated as a L<string literal|/"String Literal">.

  # foo_bar2 is treated as "foo_bar2"
  [foo_bar2 => "Mark"]

  ["foo_bar2" => "Mark"]

=head1 Literals

A literal represents a constant value.

=head2 Numeric Literals

A numeric literal represents a constant L<number|SPVM::Document::Language::Types/"Number">.

=head2 Integer Literals

An integer literal represents a constant number of an L<integer type|SPVM::Document::Language::Types/"Integer Types">.
=head3 Integer Literal Decimal Notation

The integer literal decimal notation represents a number of int type or long type using decimal numbers C<0-9>.

It can begin with a minus C<->.

It is followed by one or more of C<0-9>.

C<_> can be placed at any position after the first C<0-9> as a separator. C<_> has no meaning.

It can end with the suffix C<L> or C<l>.

If the suffix C<L> or C<l> exists, the return type is long type. Otherwise the return type is int type.

Compilation Errors:

If the return type is int type and the value is greater than the maximum value of int type or less than the minimum value of int type, a compilation error occurs.

If the return type is long type and the value is greater than the maximum value of long type or less than the minimum value of long type, a compilation error occurs.

Examples:

  123
  -123
  123L
  123l
  123_456_789
  -123_456_789L

=head3 Integer Literal Hexadecimal Notation

The integer literal hexadecimal notation represents a number of int type or long type using hexadecimal numbers C<0-9a-fA-F>.

It can begin with a minus C<->.

It is followed by C<0x> or C<0X>.

It is followed by one or more C<0-9a-fA-F>. This is called the hexadecimal numbers part.

C<_> can be placed at any position after C<0x> or C<0X> as a separator. C<_> has no meaning.

It can end with the suffix C<L> or C<l>.

If the suffix C<L> or C<l> exists, the return type is long type. Otherwise the return type is int type.

If the return type is int type, the hexadecimal numbers part is interpreted as an unsigned 32-bit integer, and is converted to a signed 32-bit integer without changing the bits. For example, C<0xFFFFFFFF> is -1.

If the return type is long type, the hexadecimal numbers part is interpreted as an unsigned 64-bit integer, and is converted to a signed 64-bit integer without changing the bits. For example, C<0xFFFFFFFFFFFFFFFFL> is C<-1L>.
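The reinterpretation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the SPVM compiler's implementation: the function name C<parse_hex_int_literal> is hypothetical, and the sketch deliberately does not accept a leading minus C<->, because this section does not spell out how the sign combines with the reinterpreted value.

```python
import string


def parse_hex_int_literal(literal: str) -> int:
    """Return the signed value of a hexadecimal integer literal.

    Hypothetical helper for illustration only; a leading minus is not handled.
    """
    # A trailing L or l selects long type (64 bits); otherwise int type (32 bits).
    is_long = literal[-1:] in ("L", "l")
    body = literal[:-1] if is_long else literal

    if body[:2] not in ("0x", "0X"):
        raise ValueError("a hexadecimal literal must begin with 0x or 0X")

    # "_" is only a separator and carries no meaning.
    digits = body[2:].replace("_", "")
    if not digits or any(ch not in string.hexdigits for ch in digits):
        raise ValueError("the hexadecimal numbers part must be one or more of 0-9a-fA-F")

    bits = 64 if is_long else 32
    unsigned = int(digits, 16)
    if unsigned >= 1 << bits:
        raise ValueError("the hexadecimal numbers part is out of range")

    # Keep the bit pattern, but read it as a two's-complement signed integer.
    if unsigned >= 1 << (bits - 1):
        return unsigned - (1 << bits)
    return unsigned
```

Under these assumptions, C<parse_hex_int_literal("0xFFFFFFFF")> and C<parse_hex_int_literal("0xFFFFFFFFFFFFFFFFL")> both evaluate to -1, which agrees with the two examples in the text.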
Compilation Errors:

If the return type is int type and the hexadecimal numbers part is greater than hexadecimal C<FFFFFFFF>, a compilation error occurs.

If the return type is long type and the hexadecimal numbers part is greater than hexadecimal C<FFFFFFFFFFFFFFFF>, a compilation error occurs.

Examples:

  0x3b4f
  0X3b4f
  -0x3F1A
  0xDeL
  0xFFFFFFFF
  0xFF_FF_FF_FF
  0xFFFFFFFFFFFFFFFFL

=head3 Integer Literal Octal Notation

The integer literal octal notation represents a number of int type or long type using octal numbers C<0-7>.

It can begin with a minus C<->.

It is followed by C<0>.

It is followed by one or more C<0-7>. This is called the octal numbers part.

C<_> can be placed at any position after C<0> as a separator. C<_> has no meaning.

It can end with the suffix C<L> or C<l>.

If the suffix C<L> or C<l> exists, the return type is long type. Otherwise the return type is int type.

If the return type is int type, the octal numbers part is interpreted as an unsigned 32-bit integer, and is converted to a signed 32-bit integer without changing the bits. For example, C<037777777777> is -1.

If the return type is long type, the octal numbers part is interpreted as an unsigned 64-bit integer (C<uint64_t> type in the C language), and is converted to a signed 64-bit integer without changing the bits. For example, C<01777777777777777777777L> is C<-1L>.

Compilation Errors:

If the return type is int type and the octal numbers part is greater than octal C<37777777777>, a compilation error occurs.

If the return type is long type and the octal numbers part is greater than octal C<1777777777777777777777>, a compilation error occurs.

Examples:

  0755
  -0644
  0666L
  0655_755

=head3 Integer Literal Binary Notation

The integer literal binary notation represents a number of int type or long type using binary numbers C<0> and C<1>.
It can begin with a minus C<->.

It is followed by C<0b> or C<0B>.

It is followed by one or more C<0> and C<1>. This is called the binary numbers part.

C<_> can be placed at any position after C<0b> or C<0B> as a separator. C<_> has no meaning.

It can end with the suffix C<L> or C<l>.

If the suffix C<L> or C<l> exists, the return type is long type. Otherwise the return type is int type.

If the return type is int type, the binary numbers part is interpreted as an unsigned 32-bit integer, and is converted to a signed 32-bit integer without changing the bits. For example, C<0b11111111111111111111111111111111> is -1.

If the return type is long type, the binary numbers part is interpreted as an unsigned 64-bit integer, and is converted to a signed 64-bit integer without changing the bits. For example, C<0b1111111111111111111111111111111111111111111111111111111111111111L> is C<-1L>.

Compilation Errors:

If the return type is int type and the binary numbers part is greater than binary C<11111111111111111111111111111111>, a compilation error occurs.

If the return type is long type and the binary numbers part is greater than binary C<1111111111111111111111111111111111111111111111111111111111111111>, a compilation error occurs.

Examples:

  0b0101
  -0b1010
  0b110000L
  0b10101010_10101010

=head2 Floating Point Literals

A floating point literal represents a floating point number.

=head3 Floating Point Literal Decimal Notation

The floating point literal decimal notation represents a number of float type or double type using decimal numbers C<0-9>.

It can begin with a minus C<->.

It is followed by one or more C<0-9>.

C<_> can be placed at any position after the first C<0-9>.

It can be followed by a floating point part, an exponent part, or a combination of a floating point part and an exponent part.

[Floating Point Part Begin]

A floating point part begins with C<.>.

It is followed by one or more C<0-9>.
[Floating Point Part End]

[Exponent Part Begin]

An exponent part begins with C<e> or C<E>.

It can be followed by C<+> or C<->.

It is followed by one or more C<0-9>.

[Exponent Part End]

A floating point literal decimal notation can end with a suffix C<f>, C<F>, C<d>, or C<D>.

If a suffix does not exist, a floating point literal decimal notation must have a floating point part or an exponent part.

If the suffix C<f> or C<F> exists, the return type is float type. Otherwise the return type is double type.

Compilation Errors:

If the return type is float type, the floating point literal decimal notation without the suffix must be parsable by the C<strtof> function in the C language. Otherwise, a compilation error occurs.

If the return type is double type, the floating point literal decimal notation without the suffix must be parsable by the C<strtod> function in the C language. Otherwise, a compilation error occurs.

Examples:

  1.32
  -1.32
  1.32f
  1.32F
  1.32d
  1.32D
  1.32e3
  1.32e-3
  1.32E+3
  1.32E-3
  1.32e3f
  12e7

=head3 Floating Point Literal Hexadecimal Notation

The floating point literal hexadecimal notation represents a number of float type or double type using hexadecimal numbers C<0-9a-fA-F>.

It can begin with a minus C<->.

It is followed by C<0x> or C<0X>.

It is followed by one or more C<0-9a-fA-F>.

C<_> can be placed at any position after C<0x> or C<0X>.

It can be followed by a floating point part, an exponent part, or a combination of a floating point part and an exponent part.

[Floating Point Part Begin]

A floating point part begins with C<.>.

It is followed by one or more C<0-9a-fA-F>.

[Floating Point Part End]

[Exponent Part Begin]

An exponent part begins with C<p> or C<P>.

It can be followed by C<+> or C<->.

It is followed by one or more C<0-9>.

[Exponent Part End]

A floating point literal hexadecimal notation can end with a suffix C<f>, C<F>, C<d>, or C<D>.
If a suffix does not exist, a floating point literal hexadecimal notation must have a floating point part or an exponent part.

If the suffix C<f> or C<F> exists, the return type is float type. Otherwise the return type is double type.

Compilation Errors:

If the return type is float type, the floating point literal hexadecimal notation without the suffix must be parsable by the C<strtof> function in the C language. Otherwise, a compilation error occurs.

If the return type is double type, the floating point literal hexadecimal notation without the suffix must be parsable by the C<strtod> function in the C language. Otherwise, a compilation error occurs.

Examples:

  0x3d3d.edp0
  0x3d3d.edp3
  0x3d3d.edP3
  0x3d3d.edP+3
  0x3d3d.edP-3f
  0x3d3d.edP-3F
  0x3d3d.edP-3d
  0x3d3d.edP-3D
  0x3d3dP+3

=head2 Bool Literals

The bool literal represents a bool object.

=head3 true

C<true> is the alias for L<Bool#TRUE|SPVM::Bool/"TRUE">.

  true

Examples:

  # true
  my $bool_object_true = true;

=head3 false

C<false> is the alias for L<Bool#FALSE|SPVM::Bool/"FALSE">.

  false

Examples:

  # false
  my $bool_object_false = false;

=head2 Character Literal

A character literal represents a number of L<byte type|SPVM::Document::Language::Types/"byte Type"> that normally represents an ASCII character.

It begins with C<'>.

It is followed by a printable ASCII character C<0x20-0x7e> or a L<character literal escape character|/"Character Literal Escape Characters">.

It ends with C<'>.

The return type is byte type.

Compilation Errors:

If the format of the character literal is invalid, a compilation error occurs.
=head3 Character Literal Escape Characters

The List of Character Literal Escape Characters:

=begin html

<table>
  <tr><th>Character Literal Escape Characters</th><th>Values</th></tr>
  <tr><td>\a</td><td><code>0x07</code> BEL</td></tr>
  <tr><td>\t</td><td><code>0x09</code> HT</td></tr>
  <tr><td>\n</td><td><code>0x0A</code> LF</td></tr>
  <tr><td>\f</td><td><code>0x0C</code> FF</td></tr>
  <tr><td>\r</td><td><code>0x0D</code> CR</td></tr>
  <tr><td>\"</td><td><code>0x22</code> "</td></tr>
  <tr><td>\'</td><td><code>0x27</code> '</td></tr>
  <tr><td>\\</td><td><code>0x5C</code> \</td></tr>
  <tr><td><a href="#Octal-Escape-Character">Octal Escape Character</a></td><td>A number represented by an octal escape character</td></tr>
  <tr><td><a href="#Hexadecimal-Escape-Character">Hexadecimal Escape Character</a></td><td>A number represented by a hexadecimal escape character</td></tr>
</table>

=end html

The type of every character literal escape character is byte type.

Examples:

  # Character literals
  'a'
  'x'
  '\a'
  '\t'
  '\n'
  '\f'
  '\r'
  '\"'
  '\''
  '\\'
  ' '
  '\0'
  '\012'
  '\377'
  '\o{1}'
  '\xab'
  '\xAB'
  '\x0D'
  '\x0A'
  '\xD'
  '\xA'
  '\xFF'
  '\x{A}'

=head2 Octal Escape Character

The octal escape character represents an unsigned 8-bit integer using octal numbers C<0-7>.

The octal escape character is a part of a L<string literal|/"String Literal"> and a L<character literal|/"Character Literal">.

It begins with C<\0>, C<\1>, C<\2>, C<\3>, C<\4>, C<\5>, C<\6>, C<\7>, or C<\o{>.

If it begins with C<\0>, C<\1>, C<\2>, C<\3>, C<\4>, C<\5>, C<\6>, or C<\7>, it is followed by zero to two C<0-7>.

If it begins with C<\o{>, it is followed by one to three C<0-7>, and ends with C<}>.

The octal numbers after C<\> or C<\o{> are called the octal numbers part.

The octal numbers part is interpreted as an unsigned 8-bit integer, and is converted to a number of byte type without changing the bits.
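The rules above can be sketched as a small decoder. This is an illustrative Python sketch rather than the SPVM tokenizer itself: the name C<decode_octal_escape> is hypothetical, byte type is assumed to be a signed 8-bit integer, and, following the examples in this document, a lone C<\0> is accepted.

```python
import re

# Either \o{N}, \o{NN}, \o{NNN} or a backslash followed by one to three octal digits.
_OCTAL_ESCAPE = re.compile(r"\\(?:o\{([0-7]{1,3})\}|([0-7]{1,3}))")


def decode_octal_escape(text: str) -> int:
    """Return the byte value (assumed signed 8-bit) of one octal escape character.

    Hypothetical helper for illustration only.
    """
    match = _OCTAL_ESCAPE.fullmatch(text)
    if match is None:
        raise ValueError("invalid octal escape character")

    octal_part = match.group(1) or match.group(2)
    unsigned = int(octal_part, 8)
    if unsigned > 0o377:
        raise ValueError("the octal numbers part must be less than or equal to 377")

    # Same bits, read as a signed 8-bit integer.
    return unsigned - 256 if unsigned >= 128 else unsigned
```

With these assumptions, C<\012> decodes to 10 (ASCII C<LF>) and C<\377> decodes to -1, because the bit pattern C<0xFF> is kept and reread as a signed value.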
Compilation Errors:

The octal numbers part must be less than or equal to C<377>. Otherwise a compilation error occurs.

If an octal escape character begins with C<\o{>, the closing C<}> must exist. Otherwise a compilation error occurs.

Examples:

  # Octal escape characters
  \0
  \01
  \03
  \012
  \001
  \077
  \377
  \o{1}
  \o{12}

=head2 Hexadecimal Escape Character

The hexadecimal escape character represents an unsigned 8-bit integer using hexadecimal numbers C<0-9a-fA-F>.

The hexadecimal escape character is a part of a L<string literal|/"String Literal"> and a L<character literal|/"Character Literal">.

The hexadecimal escape character begins with C<\x>.

It can be followed by C<{>.

It is followed by one or two C<0-9a-fA-F>. This is called the hexadecimal numbers part.

If it contains C<{>, it must be followed by C<}>.

The hexadecimal numbers part is interpreted as an unsigned 8-bit integer, and is converted to a number of byte type without changing the bits.

Compilation Errors:

If the format of the hexadecimal escape character is invalid, a compilation error occurs.

Examples:

  # Hexadecimal escape characters
  \xab
  \xAB
  \x0D
  \x0A
  \xD
  \xA
  \xFF
  \x{A}

=head2 String Literal

A string literal represents a constant L<string|SPVM::Document::Language::Types/"String">.

A string literal begins with C<">.

It is followed by zero or more UTF-8 characters, L<string literal escape characters|/"String Literal Escape Characters">, or L<variable expansions|/"Variable Expansion">.

It ends with C<">.

The return type is L<string type|SPVM::Document::Language::Types/"string Type">.

Compilation Errors:

If the format of the string literal is invalid, a compilation error occurs.
Examples:

  # String literals
  ""
  "abc";
  "あいう"
  "hello\tworld\n"
  "hello\x0D\x0A"
  "hello\xA"
  "hello\x{0A}"
  "hello\0"
  "hello\012"
  "hello\377"
  "AAA $foo BBB"
  "AAA $FOO BBB"
  "AAA $$foo BBB"
  "AAA $foo->{x} BBB"
  "AAA $foo->[3] BBB"
  "AAA $foo->{x}[3] BBB"
  "AAA $@ BBB"
  "\N{U+3042}\N{U+3044}\N{U+3046}"

=head3 String Literal Escape Characters

The List of String Literal Escape Characters:

=begin html

<table>
  <tr><th>String Literal Escape Characters</th><th>Values</th></tr>
  <tr><td>\a</td><td><code>0x07</code> BEL</td></tr>
  <tr><td>\t</td><td><code>0x09</code> HT</td></tr>
  <tr><td>\n</td><td><code>0x0A</code> LF</td></tr>
  <tr><td>\f</td><td><code>0x0C</code> FF</td></tr>
  <tr><td>\r</td><td><code>0x0D</code> CR</td></tr>
  <tr><td>\"</td><td><code>0x22</code> "</td></tr>
  <tr><td>\$</td><td><code>0x24</code> $</td></tr>
  <tr><td>\'</td><td><code>0x27</code> '</td></tr>
  <tr><td>\\</td><td><code>0x5C</code> \</td></tr>
  <tr><td><a href="#Octal-Escape-Character">Octal Escape Character</a></td><td>A number represented by an octal escape character</td></tr>
  <tr><td><a href="#Hexadecimal-Escape-Character">Hexadecimal Escape Character</a></td><td>A number represented by a hexadecimal escape character</td></tr>
  <tr><td><a href="#Unicode-Escape-Character">A Unicode escape character</a></td><td>Numbers represented by a Unicode escape character</td></tr>
  <tr><td><a href="#Raw-Escape-Characters">A raw escape character</a></td><td>Numbers represented by a raw escape character</td></tr>
</table>

=end html

The type of every string literal escape character other than the Unicode escape character and the raw escape character is byte type.

The type of each number contained in the Unicode escape character and the raw escape character is byte type.

=head3 Unicode Escape Character

The Unicode escape character represents a UTF-8 character.
A UTF-8 character is represented by a Unicode code point with hexadecimal numbers C<0-9a-fA-F>. This is one to four numbers of byte type.

The Unicode escape character is a part of a L<string literal|/"String Literal">.

It begins with C<\N{U+>.

It is followed by one or more C<0-9a-fA-F>. This is called the code point part.

It ends with C<}>.

Compilation Errors:

If a code point part is not a Unicode scalar value, a compilation error occurs.

Examples:

  # Unicode escape characters

  # あ
  \N{U+3042}

  # い
  \N{U+3044}

  # う
  \N{U+3046}

=head3 Raw Escape Characters

A raw escape character is an escape character in which C<\> is interpreted as ASCII C<\> and the following character is interpreted as itself.

For example, the raw escape character C<\s> is the two ASCII characters C<\> and C<s>.

A raw escape character is a part of a L<string literal|/"String Literal">.

The List of Raw Escape Characters:

=begin html

<table>
  <tr><th>Raw Escape Characters</th></tr>
  <tr><td>\!</td></tr>
  <tr><td>\#</td></tr>
  <tr><td>\%</td></tr>
  <tr><td>\&</td></tr>
  <tr><td>\(</td></tr>
  <tr><td>\)</td></tr>
  <tr><td>\*</td></tr>
  <tr><td>\+</td></tr>
  <tr><td>\,</td></tr>
  <tr><td>\-</td></tr>
  <tr><td>\.</td></tr>
  <tr><td>\/</td></tr>
  <tr><td>\:</td></tr>
  <tr><td>\;</td></tr>
  <tr><td>\<</td></tr>
  <tr><td>\=</td></tr>
  <tr><td>\></td></tr>
  <tr><td>\?</td></tr>
  <tr><td>\@</td></tr>
  <tr><td>\A</td></tr>
  <tr><td>\B</td></tr>
  <tr><td>\D</td></tr>
  <tr><td>\G</td></tr>
  <tr><td>\H</td></tr>
  <tr><td>\K</td></tr>
  <tr><td>\N</td></tr>
  <tr><td>\P</td></tr>
  <tr><td>\R</td></tr>
  <tr><td>\S</td></tr>
  <tr><td>\V</td></tr>
  <tr><td>\W</td></tr>
  <tr><td>\X</td></tr>
  <tr><td>\Z</td></tr>
  <tr><td>\[</td></tr>
  <tr><td>\]</td></tr>
  <tr><td>\^</td></tr>
  <tr><td>\_</td></tr>
  <tr><td>\`</td></tr>
  <tr><td>\b</td></tr>
  <tr><td>\d</td></tr>
  <tr><td>\g</td></tr>
  <tr><td>\h</td></tr>
  <tr><td>\k</td></tr>
  <tr><td>\p</td></tr>
  <tr><td>\s</td></tr>
  <tr><td>\v</td></tr>
  <tr><td>\w</td></tr>
  <tr><td>\z</td></tr>
  <tr><td>\{</td></tr>
  <tr><td>\|</td></tr>
  <tr><td>\}</td></tr>
  <tr><td>\~</td></tr>
</table>

=end html

=head3 Variable Expansion

The variable expansion is a syntax to embed L<getting a local variable|SPVM::Document::Language::Operators/"Getting a Local Variable">, L<getting a class variable|SPVM::Document::Language::Operators/"Getting a Class Variable">, a L<dereference|SPVM::Document::Language::Operators/"Dereference Operator">, L<getting a field|SPVM::Document::Language::Operators/"Getting a Field">, L<getting an array element|SPVM::Document::Language::Operators/"Getting an Array Element">, or L<getting the exception variable|SPVM::Document::Language::Operators/"Getting the Exception Variable"> into a L<string literal|/"String Literal">.

  "AAA $foo BBB"
  "AAA $FOO BBB"
  "AAA $$foo BBB"
  "AAA $foo->{x} BBB"
  "AAA $foo->[3] BBB"
  "AAA $foo->{x}[3] BBB"
  "AAA $foo->{x}->[3] BBB"
  "AAA $@ BBB"
  "AAA ${foo}BBB"

The above code is expanded to the following code.

  "AAA " . $foo . " BBB"
  "AAA " . $FOO . " BBB"
  "AAA " . $$foo . " BBB"
  "AAA " . $foo->{x} . " BBB"
  "AAA " . $foo->[3] . " BBB"
  "AAA " . $foo->{x}[3] . " BBB"
  "AAA " . $foo->{x}->[3] . " BBB"
  "AAA " . $@ . " BBB"
  "AAA " . ${foo} . "BBB"

Getting a field in a variable expansion cannot contain L<space characters|/"Space Characters"> between C<{> and C<}>.

The index of getting an array element must be a constant integer.

Getting an array element in a variable expansion cannot contain L<space characters|/"Space Characters"> between C<[> and C<]>.

A C<$> at the end of a string literal is interpreted as the character C<$>, not as a variable expansion.

  # AAA$
  "AAA$"

=head2 Single-Quoted String Literal

A single-quoted string literal represents a constant string that has no variable expansions and only a few escape characters.

It begins with C<q'>.

It is followed by zero or more UTF-8 characters, or L<single-quoted string literal escape characters|/"Single-Quoted String Literal Escape Characters">.

It ends with C<'>.

The return type is L<string type|SPVM::Document::Language::Types/"string Type">.
Compilation Errors:

A single-quoted string literal must end with C<'>. Otherwise a compilation error occurs.

If the escape character in a single-quoted string literal is invalid, a compilation error occurs.

Examples:

  # Single-quoted string literals
  q'abc';
  q'abc\'\\';

=head3 Single-Quoted String Literal Escape Characters

The List of Single-Quoted String Literal Escape Characters:

=begin html

<table>
  <tr><th>Single-Quoted String Literal Escape Characters</th><th>Values</th></tr>
  <tr><td>\'</td><td><code>0x27</code> '</td></tr>
  <tr><td>\\</td><td><code>0x5C</code> \</td></tr>
</table>

=end html

The type of every single-quoted string literal escape character is byte type.

=head2 Here Document

A here document represents a constant string in multiple lines without escape characters and L<variable expansions|/"Variable Expansion">.

  <<'HERE_DOCUMENT_NAME';
  LINE1
  LINE2
  LINEn
  HERE_DOCUMENT_NAME

A here document begins with C<< <<'HERE_DOCUMENT_NAME'; >> and ASCII C<LF>. I<HERE_DOCUMENT_NAME> is a L<here document name|/"Here Document Name">.

It is followed by a string in multiple lines.

It ends with I<HERE_DOCUMENT_NAME> at the beginning of a line and ASCII C<LF>.

Compilation Errors:

C<< <<'HERE_DOCUMENT_NAME'; >> must not contain L<space characters|/"Space Characters">. Otherwise a compilation error occurs.

Examples:

  # Here document
  my $string = <<'EOS';
  Hello
  World
  EOS

=head3 Here Document Name

A here document name consists of C<a-z>, C<A-Z>, C<_>, C<0-9>.

The length of a here document name is greater than or equal to 0.

A here document name cannot begin with C<0-9>.

A here document name cannot contain C<__>.

Compilation Errors:

If the format of a here document name is invalid, a compilation error occurs.
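The four rules for a here document name can be checked mechanically. The following is a minimal Python sketch of such a check, intended only as an illustration; the function name C<is_valid_here_document_name> is hypothetical and is not part of SPVM.

```python
import re

# Zero or more of a-z, A-Z, _, 0-9, where the first character is not a digit.
_NAME_PATTERN = re.compile(r"(?:[A-Za-z_][A-Za-z0-9_]*)?")


def is_valid_here_document_name(name: str) -> bool:
    """Check a candidate here document name against the rules listed above.

    Hypothetical helper for illustration only.
    """
    if _NAME_PATTERN.fullmatch(name) is None:
        return False
    # Two consecutive underscores are not allowed anywhere in the name.
    return "__" not in name
```

For example, C<EOS> and the empty string pass this check, while C<3EOS> and C<MY__EOS> do not.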
=head1 See Also

=over 2

=item * L<SPVM::Document::Language::SyntaxParsing>

=item * L<SPVM::Document::Language::Statements>

=item * L<SPVM::Document::Language::Operators>

=item * L<SPVM::Document::Language::Class>

=item * L<SPVM::Document::Language>

=item * L<SPVM::Document>

=back

=head1 Copyright & License

Copyright (c) 2023 Yuki Kimoto

MIT License