Name
SPVM::Document::Language::Tokenization - Lexical Tokenization in The SPVM Language
Description
This document describes lexical tokenization in the SPVM language.
Tokenization
This section explains how SPVM source code is tokenized.
Character Encoding of Source Code
The character encoding of SPVM source code is UTF-8.
If a character is ASCII, it must be an ASCII printable character or an ASCII space character other than ASCII CR.
Compilation Errors:
The character encoding of SPVM source code must be UTF-8. Otherwise a compilation error occurs.
If a character in SPVM source code is ASCII, it must be an ASCII printable character or an ASCII space character. Otherwise a compilation error occurs.
The line terminator of SPVM source code must be LF. The source code cannot contain CR or CRLF. Otherwise a compilation error occurs.
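The rules above can be sketched as a byte-level check. This is a minimal illustration in Python, not the SPVM compiler's implementation; the function name `check_source_bytes` is hypothetical, and the accepted space bytes (HT, LF, FF, SP) are taken from the Space Character section of this document.

```python
def check_source_bytes(source: bytes) -> None:
    """Raise ValueError if the bytes break the rules listed above."""
    try:
        source.decode("utf-8")  # the source must be valid UTF-8
    except UnicodeDecodeError as exc:
        raise ValueError("source code must be UTF-8") from exc
    for offset, byte in enumerate(source):
        if byte >= 0x80:
            continue  # byte of a multi-byte UTF-8 sequence, not ASCII
        if 0x20 <= byte <= 0x7E:
            continue  # SP and the ASCII printable characters
        if byte in (0x09, 0x0A, 0x0C):
            continue  # HT, LF, FF
        raise ValueError(f"invalid ASCII byte 0x{byte:02X} at offset {offset}")
```

With this sketch, a source that contains CR (0x0D) is rejected because CR is neither printable nor one of the accepted space bytes.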
Line Terminators
The line terminator is ASCII LF (0x0A).
When a line terminator appears, the current line number is incremented by 1.
Space Character
The space characters are ASCII SP, HT, FF, and the line terminators.
Word Character
The word characters are the ASCII letters (a-zA-Z), digits (0-9), and underscore (_).
Symbol Name
A symbol name is a sequence of characters composed of word characters and ::.
A symbol name cannot contain __ and cannot begin with a digit 0-9.
A symbol name cannot begin with :: and cannot end with ::.
A symbol name cannot contain ::::.
# Symbol names
foo
foo_bar2
Foo::Bar
# Invalid symbol names
2foo
foo__bar
::Foo
Foo::
Foo::::Bar
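As a rough illustration, the symbol name rules can be written as a small validator. This is a sketch in Python under the rules stated in this section only; `is_symbol_name` is a hypothetical helper, and rejecting a lone `:` is an assumption, since the text only names `::` as a valid component.

```python
import re

# Only word characters and colons may appear at all.
_ALLOWED = re.compile(r"[A-Za-z0-9_:]+")

def is_symbol_name(name: str) -> bool:
    if _ALLOWED.fullmatch(name) is None:
        return False
    if name[0] in "0123456789":  # cannot begin with a digit
        return False
    if "__" in name or "::::" in name:
        return False
    if name.startswith("::") or name.endswith("::"):
        return False
    # Assumption: colons are only valid in pairs, so no part may keep a ":".
    return all(":" not in part for part in name.split("::"))
```

Applied to the examples above, the three valid names are accepted and the five invalid names are rejected.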
Class Name
A class name is a symbol name.
Each part name of a class name must begin with an uppercase letter. If the class name is Foo::Bar::Baz, the part names are Foo, Bar, and Baz.
A class name must be the name obtained by replacing every / in the relative class file path with :: and removing the trailing .spvm. For example, if the relative class file path is Foo/Bar/Baz.spvm, the class name must be Foo::Bar::Baz.
# Valid class name in the class file "Foo/Bar/Baz.spvm"
class Foo::Bar::Baz {
}
# Invalid class name in the class file "Foo/Bar/Baz.spvm"
class Foo::Bar::Hello {
}
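The mapping from a relative class file path to the required class name is simple enough to sketch directly. The helper name `class_name_from_path` is hypothetical and the snippet is written in Python purely for illustration.

```python
def class_name_from_path(rel_path: str) -> str:
    """Derive the class name that a relative class file path requires."""
    suffix = ".spvm"
    if not rel_path.endswith(suffix):
        raise ValueError(f"not a class file path: {rel_path!r}")
    # Remove the trailing ".spvm", then replace every "/" with "::".
    return rel_path[: -len(suffix)].replace("/", "::")
```

A class declared in Foo/Bar/Baz.spvm is valid only if its name equals the result of this mapping for that path.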
Compilation Errors:
If class names are invalid, a compilation error occurs.
Examples:
# Class names
Foo
Foo::Bar
Foo::Bar::Baz3
Foo_Bar::Baz_Baz
# Invalid class names
Foo::::Bar
Foo::Bar::
Foo__Bar
Foo::bar
Method Name
A method name is a symbol name that doesn't contain ::.
A zero-length method name is valid. It is used for anon methods.
Compilation Errors:
If method names are invalid, a compilation error occurs.
Examples:
# Valid method names
FOO
FOO_BAR3
foo
foo_bar
_foo
_foo_bar_
# Invalid method names
foo__bar
3foo
A method name that is the same as a keyword is allowed.
# "if" is a valid method name
static method if : void () {
}
Field Name
A field name is a symbol name that doesn't contain ::.
Compilation Errors:
If field names are invalid, a compilation error occurs.
Examples:
# Field names
FOO
FOO_BAR3
foo
foo_bar
_foo
_foo_bar_
# Invalid field names
foo__bar
3foo
Foo::Bar
A field name that is the same as a keyword is allowed.
# "if" is a valid field name
has if : int;
Variable Name
A variable name begins with $ and is followed by a symbol name.
The symbol name can be wrapped in { and }.
Compilation Errors:
If an opening { exists and the closing } doesn't exist, a compilation error occurs.
Examples:
# Variable names
$name
$my_name
${name}
$Foo::name
$Foo::Bar::name
${Foo::name}
# Invalid variable names
$::name
$name::
$Foo::::name
$my__name
${name
Class Variable Name
A class variable name is a variable name.
Compilation Errors:
If class variable names are invalid, a compilation error occurs.
Examples:
# Class variable names
$NAME
$MY_NAME
${NAME}
$FOO::NAME
$FOO::BAR::NAME
${FOO::NAME_BRACE}
$FOO::name
# Invalid class variable names
$::NAME
$NAME::
$FOO::::NAME
$MY__NAME
$3FOO
${NAME
Local Variable Name
A local variable name is a variable name that doesn't contain ::.
Examples:
# Local variable names
$name
$my_name
${name_brace}
$_name
$NAME
# Invalid local variable names
$::name
$name::
$Foo::name
$Foo::::name
$my__name
${name
$3foo
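The local variable name rules can be condensed into one pattern plus a check for `__`. The following Python sketch is illustrative only; `is_local_variable_name` is a hypothetical helper, not part of SPVM.

```python
import re

# A name without "::": word characters, not starting with a digit.
_NAME = r"[A-Za-z_][A-Za-z0-9_]*"
# Either "$name" or "${name}".
_LOCAL_VARIABLE = re.compile(r"\$(?:" + _NAME + r"|\{" + _NAME + r"\})")

def is_local_variable_name(text: str) -> bool:
    # "__" is forbidden in any symbol name, so it is checked separately.
    return _LOCAL_VARIABLE.fullmatch(text) is not None and "__" not in text
```

Because `::` is not part of the pattern, names such as $Foo::name are rejected here even though they are valid variable names in general.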
Current Class
& before a method name means the current class. & is replaced with CURRENT_CLASS_NAME->.
Examples:
class Foo {
static method test : void () {
# This means Foo->sum(1, 2)
my $ret = &sum(1, 2);
}
static method sum : int ($num1 : int, $num2 : int) {
return $num1 + $num2;
}
}
Keyword
The list of keywords:
alias
allow
as
basic_type_id
break
byte
can
case
cmp
class
compile_type_name
copy
default
die
div_uint
div_ulong
double
dump
elsif
else
enum
eq
eval
eval_error_id
extends
for
float
false
gt
ge
has
if
interface
int
interface_t
isa
isa_error
isweak
is_compile_type
is_type
is_error
is_read_only
args_width
last
length
lt
le
long
make_read_only
my
mulnum_t
method
mod_uint
mod_ulong
mutable
native
ne
next
new
new_string_len
of
our
object
print
private
protected
public
precompile
pointer
return
require
required
rw
ro
say
static
switch
string
short
scalar
true
type_name
undef
unless
unweaken
use
version
void
warn
while
weaken
wo
INIT
__END__
__PACKAGE__
__FILE__
__LINE__
Operator for Tokenization
The list of the operators for tokenization:
!
!=
$
%
&
&&
&=
=
==
^
^=
|
||
|=
-
--
-=
~
@
+
++
+=
*
*=
<
<=
>
>=
<=>
%
%=
<<
<<=
>>=
>>
>>>
>>>=
.
.=
/
/=
\
(
)
{
}
[
]
;
:
,
->
=>
Note that the operators for tokenization are different from the operators explained in the operators documentation. The operators for tokenization are used only for tokenization.
Comment
A comment begins with # and ends with a line terminator.
# Comment
Comments have no meaning in source code.
Line directives take precedence over comments.
File directives take precedence over comments.
Line Directive
A line directive must start at the beginning of a line.
A line directive begins with #line, followed by a positive 32-bit integer.
#line 39
It ends with a line terminator.
The current line number of the source code is set to the line number given in the line directive.
Line directives take precedence over comments.
Compilation Errors:
A line directive must start at the beginning of a line. Otherwise a compilation error occurs.
A line directive must end with "\n". Otherwise a compilation error occurs.
A line directive must have a line number. Otherwise a compilation error occurs.
The line number given to a line directive must be a positive 32-bit integer. Otherwise a compilation error occurs.
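A line directive check along these lines might look as follows. This is a Python sketch, not the SPVM tokenizer; `parse_line_directive` is a hypothetical name, the input is assumed to be one complete source line, and accepting one or more spaces or tabs between #line and the number is an assumption (the document only shows a single space).

```python
import re

# "#line", separator (assumed: spaces or tabs), digits, then LF.
_LINE_DIRECTIVE = re.compile(r"#line[ \t]+([0-9]+)\n")
INT32_MAX = 2**31 - 1

def parse_line_directive(line: str) -> int:
    """Return the line number of a line directive, or raise ValueError."""
    match = _LINE_DIRECTIVE.fullmatch(line)
    if match is None:
        raise ValueError("malformed line directive")
    number = int(match.group(1))
    if not 1 <= number <= INT32_MAX:
        raise ValueError("line number must be a positive 32-bit integer")
    return number
```

Since the pattern must match the whole line, a directive that does not start at the beginning of the line or lacks the final "\n" is rejected.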
File Directive
A file directive must start at the beginning of the source code.
A file directive begins with #file ", is followed by a file path, and is closed with ".
#file "/Foo/Bar.spvm"
It ends with a line terminator.
The current file path of the source code is set to the file path given in the file directive.
A file directive takes precedence over comments.
Compilation Errors:
A file directive must start at the beginning of the source code. Otherwise a compilation error occurs.
A file directive must end with "\n". Otherwise a compilation error occurs.
A file directive must have a file path. Otherwise a compilation error occurs.
The file path of a file directive must be closed with ". Otherwise a compilation error occurs.
POD
POD (Plain Old Documentation) is a syntax for writing documentation in source code.
The beginning of POD starts with =, is followed by any string composed of ASCII printable characters, and ends with a line terminator.
The line before the beginning of POD must have a line terminator.
The line after the beginning of POD must have a line terminator.
=pod
=head1
=item * foo
The end of POD starts with =, is followed by cut, and ends with a line terminator.
The line before the end of POD must have a line terminator.
The line after the end of POD must have a line terminator.
=cut
Examples:
=pod
Multi-Line
Comment
=cut
=head1
Multi-Line
Comment
=cut
POD has no meaning in source codes.
Literal
A literal is a way to write a constant value in source code.
Literals are numeric literals, character literals, string literals, and bool literals.
Numeric Literal
A numeric literal is a way to write a constant value whose type is a numeric type.
Numeric literals are integer literals and floating point literals.
Integer Literal
An integer literal is a numeric literal that writes a constant value whose type is an integer type.
Integer Literal Decimal Notation
The integer literal decimal notation is a way to write an integer literal using the decimal digits 0-9.
A minus - can be at the beginning, followed by one or more of 0-9.
_ can be used as a separator at any position after the first 0-9. _ has no meaning.
The suffix L or l can be at the end.
If the suffix L or l exists, the return type is the long type. Otherwise the return type is the int type.
Compilation Errors:
If the return type is the int type and the value is greater than the maximum value of the int type or less than the minimum value of the int type, a compilation error occurs.
If the return type is the long type and the value is greater than the maximum value of the long type or less than the minimum value of the long type, a compilation error occurs.
Examples:
123
-123
123L
123l
123_456_789
-123_456_789L
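The decimal notation and its range checks can be sketched as below. This Python snippet is only an illustration of the rules in this section; `parse_decimal_int` is a hypothetical helper, and it assumes the caller has already decided that the literal is in decimal notation (a literal such as 0755 belongs to the octal notation instead).

```python
import re

# Optional minus, digits with optional "_" separators, optional L/l suffix.
_DEC_INT = re.compile(r"(-?)([0-9][0-9_]*)([Ll]?)")

def parse_decimal_int(literal: str):
    """Return (type name, value) for a decimal integer literal."""
    match = _DEC_INT.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a decimal integer literal: {literal!r}")
    sign, digits, suffix = match.groups()
    value = int(digits.replace("_", ""))  # "_" has no meaning
    if sign:
        value = -value
    bits, type_name = (64, "long") if suffix else (32, "int")
    if not -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1:
        raise ValueError(f"value out of range for the {type_name} type")
    return type_name, value
```

For instance, 2147483648 is rejected as an int literal in this sketch, while 2147483648L is accepted as a long literal.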
Integer Literal Hexadecimal Notation
The integer literal hexadecimal notation is a way to write an integer literal using the hexadecimal digits 0-9a-fA-F.
A minus - can be at the beginning, followed by 0x or 0X, followed by one or more of 0-9a-fA-F.
_ can be used as a separator at any position after 0x or 0X. _ has no meaning.
The suffix L or l can be at the end.
If the suffix L or l exists, the return type is the long type. Otherwise the return type is the int type.
If the return type is the int type, the value without - is interpreted as the unsigned 32-bit integer type uint32_t in the C language, and the following conversion is performed.
uint32_t value_uint32_t;
int32_t value_int32_t = (int32_t)value_uint32_t;
And if - exists, the following conversion is performed.
value_int32_t = -value_int32_t;
For example, 0xFFFFFFFF is the same as -1, and -0xFFFFFFFF is the same as 1.
If the return type is the long type, the value without - is interpreted as the unsigned 64-bit integer type uint64_t in the C language, and the following conversion is performed.
uint64_t value_uint64_t;
int64_t value_int64_t = (int64_t)value_uint64_t;
And if - exists, the following conversion is performed.
value_int64_t = -value_int64_t;
For example, 0xFFFFFFFFFFFFFFFFL is the same as -1L, and -0xFFFFFFFFFFFFFFFFL is the same as 1L.
Compilation Errors:
If the return type is the int type and the value without - is greater than hexadecimal FFFFFFFF, a compilation error occurs.
If the return type is the long type and the value without - is greater than hexadecimal FFFFFFFFFFFFFFFF, a compilation error occurs.
Examples:
0x3b4f
0X3b4f
-0x3F1A
0xDeL
0xFFFFFFFF
0xFF_FF_FF_FF
0xFFFFFFFFFFFFFFFFL
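The unsigned-to-signed conversion described above can be mimicked with plain integer arithmetic. The following Python sketch is illustrative rather than the compiler's code; `parse_hex_int` is a hypothetical name, and wrapping the one case where the negation would overflow (for example -0x80000000) is an assumption, since this document does not describe that case.

```python
import re

# Optional minus, 0x/0X, hex digits with "_" separators, optional L/l suffix.
_HEX_INT = re.compile(r"(-?)0[xX]([0-9a-fA-F_]+)([Ll]?)")

def parse_hex_int(literal: str):
    """Return (type name, value) for a hexadecimal integer literal."""
    match = _HEX_INT.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a hexadecimal integer literal: {literal!r}")
    sign, digits, suffix = match.groups()
    digits = digits.replace("_", "")  # "_" has no meaning
    if not digits:
        raise ValueError("at least one hexadecimal digit is required")
    bits, type_name = (64, "long") if suffix else (32, "int")
    unsigned = int(digits, 16)
    if unsigned >= 1 << bits:
        raise ValueError(f"value out of range for the {type_name} type")
    half = 1 << (bits - 1)
    # Same result as the C cast (int32_t)value or (int64_t)value.
    signed = unsigned - (1 << bits) if unsigned >= half else unsigned
    if sign:
        # Assumption: negation wraps in two's complement on overflow.
        signed = (-signed + half) % (1 << bits) - half
    return type_name, signed
```

The same arithmetic applies to the octal and binary notations if the digits are read with base 8 or base 2.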
Integer Literal Octal Notation
The integer literal octal notation is a way to write an integer literal using the octal digits 0-7.
A minus - can be at the beginning, followed by 0, followed by one or more of 0-7.
_ can be used as a separator at any position after 0. _ has no meaning.
The suffix L or l can be at the end.
If the suffix L or l exists, the return type is the long type. Otherwise the return type is the int type.
If the return type is the int type, the value without - is interpreted as the unsigned 32-bit integer type uint32_t in the C language, and the following conversion is performed.
uint32_t value_uint32_t;
int32_t value_int32_t = (int32_t)value_uint32_t;
And if - exists, the following conversion is performed.
value_int32_t = -value_int32_t;
For example, 037777777777 is the same as -1, and -037777777777 is the same as 1.
If the return type is the long type, the value without - is interpreted as the unsigned 64-bit integer type uint64_t in the C language, and the following conversion is performed.
uint64_t value_uint64_t;
int64_t value_int64_t = (int64_t)value_uint64_t;
And if - exists, the following conversion is performed.
value_int64_t = -value_int64_t;
For example, 01777777777777777777777L is the same as -1L, and -01777777777777777777777L is the same as 1L.
Compilation Errors:
If the return type is the int type and the value without - is greater than octal 37777777777, a compilation error occurs.
If the return type is the long type and the value without - is greater than octal 1777777777777777777777, a compilation error occurs.
Examples:
0755
-0644
0666L
0655_755
Integer Literal Binary Notation
The integer literal binary notation is a way to write an integer literal using the binary digits 0 and 1.
A minus - can be at the beginning, followed by 0b or 0B, followed by one or more of 0 and 1.
_ can be used as a separator at any position after 0b or 0B. _ has no meaning.
The suffix L or l can be at the end.
If the suffix L or l exists, the return type is the long type. Otherwise the return type is the int type.
If the return type is the int type, the value without - is interpreted as the unsigned 32-bit integer type uint32_t in the C language, and the following conversion is performed.
uint32_t value_uint32_t;
int32_t value_int32_t = (int32_t)value_uint32_t;
And if - exists, the following conversion is performed.
value_int32_t = -value_int32_t;
For example, 0b11111111111111111111111111111111 is the same as -1, and -0b11111111111111111111111111111111 is the same as 1.
If the return type is the long type, the value without - is interpreted as the unsigned 64-bit integer type uint64_t in the C language, and the following conversion is performed.
uint64_t value_uint64_t;
int64_t value_int64_t = (int64_t)value_uint64_t;
And if - exists, the following conversion is performed.
value_int64_t = -value_int64_t;
For example, 0b1111111111111111111111111111111111111111111111111111111111111111L is the same as -1L, and -0b1111111111111111111111111111111111111111111111111111111111111111L is the same as 1L.
Compilation Errors:
If the return type is the int type and the value that is except for - is greater than binary 11111111111111111111111111111111, a compilation error occurs.
If the return type is the long type and the value that is except for - is greater than binary 1111111111111111111111111111111111111111111111111111111111111111, a compilation error occurs.
Examples:
0b0101
-0b1010
0b110000L
0b10101010_10101010
Floating Point Literal
A floating point literal is a numeric literal that writes a constant value whose type is a floating point type.
Floating Point Literal Decimal Notation
The floating point literal decimal notation is a way to write a floating point literal using the decimal digits 0-9.
A minus - can be at the beginning, followed by one or more of 0-9.
_ can be used as a separator at any position after the first 0-9.
It can be followed by a floating point part.
A floating point part is . followed by one or more of 0-9.
It can be followed by an exponent part.
An exponent part is e or E, followed by an optional + or -, followed by one or more of 0-9.
It can be followed by a suffix f, F, d, or D.
At least one of a floating point part, an exponent part, or a suffix must exist.
If the suffix f or F exists, the return type is the float type. Otherwise the return type is the double type.
Compilation Errors:
If the return type is the float type, the floating point literal is parsed by the strtof function of the C language. If the parsing fails, a compilation error occurs.
If the return type is the double type, the floating point literal is parsed by the strtod function of the C language. If the parsing fails, a compilation error occurs.
Examples:
1.32
-1.32
1.32f
1.32F
1.32d
1.32D
1.32e3
1.32e-3
1.32E+3
1.32E-3
12e7
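The grammar of the decimal notation can be summarized as one regular expression. This Python sketch only classifies a literal by its return type and does not reproduce the strtof or strtod parsing; `float_literal_type` is a hypothetical helper.

```python
import re

_FLOAT_DEC = re.compile(
    r"-?[0-9][0-9_]*"             # optional minus, digits, "_" separators
    r"(?P<frac>\.[0-9]+)?"        # floating point part
    r"(?P<exp>[eE][+-]?[0-9]+)?"  # exponent part
    r"(?P<suffix>[fFdD])?"        # suffix
)

def float_literal_type(literal: str):
    """Return "float", "double", or None if this is not such a literal."""
    match = _FLOAT_DEC.fullmatch(literal)
    if match is None:
        return None
    # At least one of the three optional parts must exist.
    if not (match["frac"] or match["exp"] or match["suffix"]):
        return None
    return "float" if match["suffix"] in ("f", "F") else "double"
```

A plain 123 yields None here because it has none of the three parts, so it is an integer literal rather than a floating point literal.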
Floating Point Literal Hexadecimal Notation
The floating point literal hexadecimal notation is a way to write a floating point literal using the hexadecimal digits 0-9a-fA-F.
A minus - can be at the beginning, followed by 0x or 0X, followed by one or more of 0-9a-fA-F.
_ can be used as a separator at any position after 0x or 0X.
It can be followed by a floating point part.
A floating point part is . followed by one or more of 0-9a-fA-F.
It can be followed by an exponent part.
An exponent part is p or P, followed by an optional + or -, followed by one or more decimal digits 0-9.
It can be followed by a suffix f, F, d, or D if an exponent part exists.
At least one of a floating point part or an exponent part must exist.
If the suffix f or F exists, the return type is the float type. Otherwise the return type is the double type.
Compilation Errors:
If the return type is the float type, the floating point literal is parsed by the strtof function of the C language. If the parsing fails, a compilation error occurs.
If the return type is the double type, the floating point literal is parsed by the strtod function of the C language. If the parsing fails, a compilation error occurs.
Examples:
0x3d3d.edp0
0x3d3d.edp3
0x3d3d.edP3
0x3d3d.edP+3
0x3d3d.edP-3f
0x3d3d.edP-3F
0x3d3d.edP-3d
0x3d3d.edP-3D
0x3d3dP+3
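To see what value a hexadecimal floating point literal denotes, Python's built-in `float.fromhex` can be used, since it reads the same mantissa and binary exponent syntax. The wrapper `hex_float_value` below is a hypothetical helper: it removes the `_` separators and the type suffix, which `float.fromhex` does not accept, and it always returns a double-precision value (narrowing to the float type is not modeled).

```python
def hex_float_value(literal: str) -> float:
    """Return the value of a hexadecimal floating point literal."""
    text = literal.replace("_", "")  # "_" separators have no meaning
    has_exponent = "p" in text or "P" in text
    # A trailing f, F, d, or D is a suffix only when an exponent part exists;
    # otherwise it is an ordinary hexadecimal digit.
    if has_exponent and text[-1] in "fFdD":
        text = text[:-1]
    return float.fromhex(text)
```

For example, 0x3d3d.edp3 is the hexadecimal fraction 3d3d.ed multiplied by 2 to the power 3, which is 125423.40625.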
Character Literal
A character literal is a literal that writes a constant value whose type is the byte type.
A character literal represents an ASCII character.
A character literal begins with '.
It is followed by a printable ASCII character 0x20-0x7e or a character literal escape character.
It ends with '.
The return type is the byte type.
Compilation Errors:
If the format of the character literal is invalid, a compilation error occurs.
Character Literal Escape Characters
The list of character literal escape characters.
Character literal escape characters | ASCII characters |
---|---|
\0 | 0x00 NUL |
\a | 0x07 BEL |
\t | 0x09 HT |
\n | 0x0A LF |
\f | 0x0C FF |
\r | 0x0D CR |
\" | 0x22 " |
\' | 0x27 ' |
\\ | 0x5C \ |
Octal Escape Character | An ASCII character |
Hexadecimal Escape Character | An ASCII character |
Examples:
# Character literals
'a'
'x'
'\a'
'\t'
'\n'
'\f'
'\r'
'\"'
'\''
'\\'
'\0'
' '
'\xab'
'\xAB'
'\x0D'
'\x0A'
'\xD'
'\xA'
'\xFF'
'\x{A}'
String Literal
A string literal is a literal that writes a constant value whose type is the string type.
The return type is the string type.
A string literal begins with ".
It is followed by zero or more UTF-8 characters, string literal escape characters, or variable expansions.
It ends with ".
Compilation Errors:
If the format of the string literal is invalid, a compilation error occurs.
Examples:
# String literals
"abc";
"あいう"
"hello\tworld\n"
"hello\x0D\x0A"
"hello\xA"
"hello\x{0A}"
"AAA $foo BBB"
"AAA $FOO BBB"
"AAA $$foo BBB"
"AAA $foo->{x} BBB"
"AAA $foo->[3] BBB"
"AAA $foo->{x}[3] BBB"
"AAA $@ BBB"
"\N{U+3042}\N{U+3044}\N{U+3046}"
String Literal Escape Characters
String literal escape characters | Descriptions |
---|---|
\0 | ASCII 0x00 NUL |
\a | ASCII 0x07 BEL |
\t | ASCII 0x09 HT |
\n | ASCII 0x0A LF |
\f | ASCII 0x0C FF |
\r | ASCII 0x0D CR |
\" | ASCII 0x22 " |
\$ | ASCII 0x24 $ |
\' | ASCII 0x27 ' |
\\ | ASCII 0x5C \ |
Octal Escape Character | An ASCII character |
Hexadecimal Escape Character | An ASCII character |
Unicode escape character | A UTF-8 character |
Raw escape character | The value of the raw escape character |
Unicode Escape Character
The Unicode escape character is a way to write a UTF-8 character using a Unicode code point written with the hexadecimal digits 0-9a-fA-F.
The Unicode escape character can be used as an escape character of the string literal.
The Unicode escape character begins with \N{U+.
It is followed by one or more of 0-9a-fA-F.
It ends with }.
Compilation Errors:
If the Unicode code point is not a Unicode scalar value, a compilation error occurs.
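A possible way to expand Unicode escape characters, including the scalar value check, is sketched below in Python. The helper name `expand_unicode_escapes` is hypothetical; the check uses the standard definition of a Unicode scalar value (any code point up to U+10FFFF except the surrogates U+D800 to U+DFFF).

```python
import re

_UNICODE_ESCAPE = re.compile(r"\\N\{U\+([0-9a-fA-F]+)\}")

def expand_unicode_escapes(text: str) -> str:
    """Replace every \\N{U+XXXX} in text with the character it names."""
    def replace(match):
        code_point = int(match.group(1), 16)
        # Surrogates and values above U+10FFFF are not Unicode scalar values.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            raise ValueError(f"U+{code_point:X} is not a Unicode scalar value")
        return chr(code_point)
    return _UNICODE_ESCAPE.sub(replace, text)
```

Encoding the returned string with UTF-8 gives the bytes that the string literal would contain.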
Examples:
# あいう
"\N{U+3042}\N{U+3044}\N{U+3046}"
# くぎが
"\N{U+304F}\N{U+304E}\N{U+304c}"
Raw Escape Character
The raw escape character is an escape character in which \ has no escaping effect and is interpreted as the ASCII character \.
For example, \s is the ASCII characters \s, and \d is the ASCII characters \d.
The raw escape character can be used as an escape character of the string literal.
The raw escape character is designed to be used by regular expression classes such as Regex.
The list of raw escape characters.
# Raw escape characters
\! \# \% \& \( \) \* \+ \, \- \. \/
\: \; \< \= \> \? \@
\A \B \D \G \H \K \N \P \R \S \V \W \X \Z
\[ \] \^ \_ \`
\b \d \g \h \k \p \s \v \w \z
\{ \| \} \~
Octal Escape Character
The octal escape character is a way to write an ASCII code using the octal digits 0-7.
The octal escape character can be used as an escape character of the string literal and the character literal.
The octal escape character begins with \o{, must be followed by one to three of 0-7, and ends with }.
Alternatively, the octal escape character begins with \0, \1, \2, \3, \4, \5, \6, or \7, and must be followed by one or two of 0-7.
# Octal escape characters in character literals
'\0'
'\012'
'\003'
'\001'
'\03'
'\01'
'\077'
'\377'
# Octal escape characters in character literals
'\o{0}'
'\o{12}'
'\o{03}'
'\o{01}'
'\o{3}'
'\o{1}'
'\o{77}'
'\o{377}'
# Octal escape characters in string literals
"Foo \0 Bar"
"Foo \012 Bar"
"Foo \003 Bar"
"Foo \001 Bar"
"Foo \03 Bar"
"Foo \01 Bar"
"Foo \077 Bar"
"Foo \377 Bar"
# Octal escape characters in string literals
"Foo \o{0} Bar"
"Foo \o{12} Bar"
"Foo \o{03} Bar"
"Foo \o{01} Bar"
"Foo \o{3} Bar"
"Foo \o{1} Bar"
"Foo \o{77} Bar"
"Foo \o{377} Bar"
Hexadecimal Escape Character
The hexadecimal escape character is a way to write an ASCII code using the hexadecimal digits 0-9a-fA-F.
The hexadecimal escape character can be used as an escape character of the string literal and the character literal.
The hexadecimal escape character begins with \x.
It is followed by one or two of 0-9a-fA-F.
The hexadecimal digits can be surrounded by { and }.
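The octal and hexadecimal escape forms described in this document can be decoded to a byte value with a single pattern. This Python sketch is illustrative; `escape_value` is a hypothetical helper, the bare octal form is accepted with one to three digits in total so that \0 is covered, and no upper bound on the value is checked because this document does not state one.

```python
import re

_ESCAPE = re.compile(
    r"\\(?:o\{(?P<obrace>[0-7]{1,3})\}"      # \o{12}
    r"|(?P<obare>[0-7]{1,3})"                # \012
    r"|x(?P<hbare>[0-9a-fA-F]{1,2})"         # \x0A
    r"|x\{(?P<hbrace>[0-9a-fA-F]{1,2})\})"   # \x{A}
)

def escape_value(escape: str) -> int:
    """Return the numeric value of one octal or hexadecimal escape."""
    match = _ESCAPE.fullmatch(escape)
    if match is None:
        raise ValueError(f"not an octal or hexadecimal escape: {escape!r}")
    octal = match["obrace"] or match["obare"]
    if octal is not None:
        return int(octal, 8)
    return int(match["hbare"] or match["hbrace"], 16)
```

Both \012 and \x0A decode to 10, the ASCII code of LF.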
# Hexadecimal escape characters in character literals
'\xab'
'\xAB'
'\x0D'
'\x0A'
'\xD'
'\xA'
'\xFF'
'\x{A}'
# Hexadecimal escape characters in string literals
"Foo \xab Bar"
"Foo \xAB Bar"
"Foo \x0D Bar"
"Foo \x0A Bar"
"Foo \xD Bar"
"Foo \xA Bar"
"Foo \xFF Bar"
"Foo \x{A} Bar"
Single-Quoted String Literal
A single-quoted string literal represents a constant string value in source code.
The return type is the string type.
A single-quoted string literal begins with q'.
It is followed by zero or more UTF-8 characters or escape characters.
It ends with '.
Compilation Errors:
A single-quoted string literal must end with '. Otherwise a compilation error occurs.
If the escape character in a single-quoted string literal is invalid, a compilation error occurs.
Examples:
# Single-quoted string literals
q'abc';
q'abc\'\\';
Single-Quoted String Literal Escape Characters
Single-quoted string literal escape characters | Descriptions |
---|---|
\\ | ASCII 0x5C \ |
\' | ASCII 0x27 ' |
Bool Literal
A bool literal is a literal that represents a bool value in source code.
true
true is an alias for the TRUE method of Bool.
true
Examples:
# true
my $is_valid = true;
false
false is an alias for the FALSE method of Bool.
false
Examples:
# false
my $is_valid = false;
Variable Expansion
Variable expansion is a feature that embeds the following operations into a string literal: getting a local variable, getting a class variable, dereference, getting a field, getting an array element, and getting the exception variable.
"AAA $foo BBB"
"AAA $FOO BBB"
"AAA $$foo BBB"
"AAA $foo->{x} BBB"
"AAA $foo->[3] BBB"
"AAA $foo->{x}[3] BBB"
"AAA $foo->{x}->[3] BBB"
"AAA $@ BBB"
"AAA ${foo}BBB"
The above code is converted to the following code.
"AAA " . $foo . " BBB"
"AAA " . $FOO . " BBB"
"AAA " . $$foo . " BBB"
"AAA " . $foo->{x} . " BBB"
"AAA " . $foo->[3] . " BBB"
"AAA " . $foo->{x}[3] . " BBB"
"AAA " . $foo->{x}->[3] . " BBB"
"AAA " . $@ . " BBB"
"AAA " . ${foo} . "BBB"
A field access in a variable expansion cannot contain space characters between { and }.
The index of an array element access must be a constant value. An array element access cannot contain space characters between [ and ].
A $ at the end of a string literal is not interpreted as a variable expansion.
"AAA$"
Fat Comma
The fat comma => is a separator.
=>
The fat comma is an alias for the comma ,.
# Comma
["a", "b", "c", "d"]
# Fat Comma
["a" => "b", "c" => "d"]
If the characters of LEFT_OPERAND of the fat comma are not wrapped in " and they form a symbol name that doesn't contain ::, the characters are treated as a string literal.
# foo_bar2 is treated as "foo_bar2"
[foo_bar2 => "Mark"]
["foo_bar2" => "Mark"]
Here Document
A here document is a syntax for writing a string literal over multiple lines without escapes or variable expansions.
<<'HERE_DOCUMENT_NAME';
line1
line2
line...
HERE_DOCUMENT_NAME
Here document syntax begins with <<'HERE_DOCUMENT_NAME'; followed by a line terminator. HERE_DOCUMENT_NAME is a here document name.
The string begins on the next line.
Here document syntax ends with a line that begins with HERE_DOCUMENT_NAME followed by a line terminator.
Compilation Errors:
<<'HERE_DOCUMENT_NAME' cannot contain spaces. If it does, a compilation error occurs.
Examples:
# Here document
my $string = <<'EOS';
Hello
World
EOS
# No escapes or variable expansions are performed.
my $string = <<'EOS';
$foo
\t
\
EOS
Here Document Name
A here document name is composed of a-z, A-Z, _, and 0-9.
Compilation Errors:
The length of a here document name must be greater than 0. Otherwise a compilation error occurs.
A here document name cannot start with a number. If it does, a compilation error occurs.
A here document name cannot contain __. If it does, a compilation error occurs.
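The here document name rules can be checked with a short helper. This Python sketch uses a hypothetical function name, `is_here_document_name`, and assumes that an empty name is rejected.

```python
import re

# Letters, digits, and "_" only; the first character cannot be a digit.
# Assumption: the name must contain at least one character.
_HERE_DOCUMENT_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_here_document_name(name: str) -> bool:
    return _HERE_DOCUMENT_NAME.fullmatch(name) is not None and "__" not in name
```

Names such as EOS and HERE_DOCUMENT_NAME pass this check, while 1EOS and END__TEXT do not.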
See Also
Copyright & License
Copyright (c) 2023 Yuki Kimoto
MIT License