NAME
Sidef::Parser - Parser for the Sidef programming language
SYNOPSIS
use Sidef::Parser;
my $parser = Sidef::Parser->new(
    file_name   => 'script.sf',
    script_name => 'script.sf',
);
my $code = 'say "Hello, World!"';
my $ast = $parser->parse_script(code => \$code);
DESCRIPTION
Sidef::Parser is the main parser for the Sidef programming language. It performs lexical analysis and syntactic parsing of Sidef source code, generating an Abstract Syntax Tree (AST) that can be executed or compiled.
The parser handles:
Variable declarations and scoping
Function and method definitions
Class and module declarations
Operators and expressions
Control flow structures
String interpolation and special literals
Regex patterns
Block constructs
METHODS
Constructor
new
my $parser = Sidef::Parser->new(%options);
Creates a new parser instance. Accepts the following optional parameters:
line - Starting line number (default: 1)
inc - Array reference of include paths
class - Current namespace (default: 'main')
file_name - Name of file being parsed (default: '-')
script_name - Name of main script (default: '-')
interactive - Boolean flag for interactive mode
eval_mode - Boolean flag for eval mode
Core Parsing Methods
parse_script
my $ast = $parser->parse_script(code => \$code);
Parses a complete Sidef script and returns the Abstract Syntax Tree. This is the main entry point for parsing.
Parameters:
code - Reference to a string containing Sidef code
Returns: AST structure (typically a hash reference)
parse_expr
my $expr = $parser->parse_expr(code => \$code);
Parses a single expression. Handles literals, variables, operators, function calls, and other expression forms.
parse_obj
my $obj = $parser->parse_obj(code => \$code, %options);
Parses an object or value with optional method calls and operators.
Options:
multiline - Allow multiline expressions
parse_block
my $block = $parser->parse_block(code => \$code, %options);
Parses a code block enclosed in braces {...}.
Options:
with_vars - Include variable declarations
topic_var - Create topic variable (_)
is_module - Block is a module definition
prev_class - Previous class context
parse_arg
my $arg = $parser->parse_arg(code => \$code);
Parses arguments enclosed in parentheses (...).
parse_array
my $array = $parser->parse_array(code => \$code);
Parses array literals enclosed in brackets [...].
Variable and Declaration Parsing
parse_init_vars
my $vars = $parser->parse_init_vars(code => \$code, %options);
Parses variable declarations with optional initialization and type annotations.
Options:
type - Declaration type ('var', 'global', 'const', 'static', 'del', 'has')
private - Private declaration (not added to symbol table)
params - Parsing function/method parameters
callback - Callback function for each variable
ignore_delim - Hash of delimiters to ignore
get_init_vars
my $vars = $parser->get_init_vars(code => \$code, %options);
Similar to parse_init_vars but returns string representations instead of objects.
Options:
with_vals - Include values in output
type - Declaration type
find_var
my $var = $parser->find_var($var_name, $class_name);
my ($var, $is_lexical) = $parser->find_var($var_name, $class_name);
Looks up a variable in the symbol table by name and class.
In scalar context, returns the variable hash, or undef if the variable is not found. In list context, returns the variable hash and a boolean indicating whether the variable is lexical.
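The scalar/list distinction described above is the usual Perl `wantarray` pattern. The following is a minimal, self-contained sketch of that pattern only: `lookup`, the `%vars` table, and the rule used to set the lexical flag are all hypothetical stand-ins chosen for illustration, not the actual `find_var` implementation.

```perl
use strict;
use warnings;

# Hypothetical symbol table, shaped like the "vars" hash described under
# SYMBOL TABLE: one array of entries per namespace.
my %vars = (
    main => [
        { name => 'x', count => 0, type => 'var',   line => 1 },
        { name => 'y', count => 2, type => 'const', line => 3 },
    ],
);

# lookup() is an illustrative stand-in, not Sidef::Parser::find_var.
sub lookup {
    my ($name, $class) = @_;
    for my $entry (@{ $vars{$class} // [] }) {
        next if $entry->{name} ne $name;

        # Placeholder rule (an assumption for this sketch only):
        # treat plain 'var' declarations as lexical.
        my $is_lexical = $entry->{type} eq 'var' ? 1 : 0;

        # List context gets both values, scalar context only the entry.
        return wantarray ? ($entry, $is_lexical) : $entry;
    }
    return;    # undef in scalar context, empty list in list context
}

my $entry = lookup('x', 'main');                   # scalar context
my ($found, $is_lexical) = lookup('y', 'main');    # list context
my $missing = lookup('z', 'main');                 # not found: undef
```

Note that `my ($var) = ...` (with parentheses) calls in list context, whereas `my $var = ...` calls in scalar context.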
String and Literal Parsing
get_quoted_string
my $string = $parser->get_quoted_string(code => \$code, %options);
Extracts a quoted string with support for various delimiter pairs.
Options:
no_count_line - Don't count newlines in the string
Supports delimiters: '...', "...", (...), [...], {...}, and many Unicode paired delimiters.
get_quoted_words
my $words = $parser->get_quoted_words(code => \$code);
Parses space-separated quoted words, returning an array reference.
get_method_name
my ($method, $takes_arg, $type) = $parser->get_method_name(code => \$code);
Extracts a method or operator name from the input.
Returns:
1. Method/operator name (or hashref for expression-based names)
2. Boolean indicating if operator requires an argument
3. Operator type (from hyper_ops hash, or 'op', or empty string)
Whitespace and Comment Handling
parse_whitespace
my $found = $parser->parse_whitespace(code => \$code);
Skips whitespace and comments, and handles here-documents. Returns true if whitespace was found.
Handles:
Horizontal and vertical whitespace
Single-line comments (#...)
Multi-line C-style comments (/* ... */)
Embedded comments (#`(...))
Here-documents (<<EOF, <<'EOF', <<-EOF)
Zero-width spaces
backtrack_whitespace
$parser->backtrack_whitespace(code => \$code);
Moves the position backwards past any trailing whitespace that was just parsed.
Helper Methods
parse_delim
my $end_delim = $parser->parse_delim(code => \$code, %options);
Parses a delimiter and returns its corresponding closing delimiter.
Options:
ignore_delim - Hash of delimiters to ignore
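As a rough illustration of mapping an opening delimiter to its closing counterpart, the sketch below uses a small lookup table. `closing_delim` is a hypothetical helper, the table covers only a few of the pairs listed under Delimiters, and the fallback for unpaired delimiters is an assumption based on forms such as %q/.../.

```perl
use strict;
use warnings;
use utf8;

# A few of the paired delimiters listed under "Delimiters"; the real
# parser recognizes many more Unicode pairs.
my %pairs = (
    '(' => ')',
    '[' => ']',
    '{' => '}',
    '<' => '>',
    '«' => '»',
    '‹' => '›',
);

# closing_delim() is a hypothetical helper, not a Sidef::Parser method.
sub closing_delim {
    my ($open) = @_;

    # Assumption: an unpaired delimiter such as '/' closes with itself.
    return $pairs{$open} // $open;
}
```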
get_name_and_class
my ($name, $class) = $parser->get_name_and_class($var_name);
Splits a potentially qualified variable name into name and class components.
Examples:
'foo' => ('foo', 'main')
'Foo::bar' => ('bar', 'Foo')
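The two examples above can be reproduced with a small standalone function. This is only a sketch: `split_name` is a hypothetical name, and it assumes that the split happens at the last `::` and that unqualified names fall back to a default namespace of 'main'.

```perl
use strict;
use warnings;

# split_name() is a hypothetical stand-in for get_name_and_class.
sub split_name {
    my ($full_name, $default_class) = @_;
    $default_class //= 'main';    # assumed default namespace

    # Assumption: split at the LAST '::', so that nested namespaces
    # such as 'A::B::c' yield ('c', 'A::B').
    my $idx = rindex($full_name, '::');
    return ($full_name, $default_class) if $idx == -1;

    return (substr($full_name, $idx + 2), substr($full_name, 0, $idx));
}

my ($name, $class) = split_name('Foo::bar');
```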
check_declarations
$parser->check_declarations($vars_hash);
Checks variable declarations for unused variables and generates warnings (except in interactive/eval mode).
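A check of this kind could look roughly like the sketch below, which walks entries shaped like those described under SYMBOL TABLE. It assumes that a `count` of 0 means the variable was never used; the `unused_warnings` function, the `$suppress` flag, and the message format are hypothetical and not taken from the module.

```perl
use strict;
use warnings;

# unused_warnings() is a hypothetical helper, not check_declarations itself.
# $vars is a hash of arrays of entries, keyed by namespace.
sub unused_warnings {
    my ($vars, $suppress) = @_;

    # Mirrors the documented exception for interactive/eval mode.
    return () if $suppress;

    my @warnings;
    for my $class (sort keys %$vars) {
        for my $entry (@{ $vars->{$class} }) {
            next if $entry->{count} > 0;    # assumption: 0 means unused
            push @warnings,
              sprintf("'%s' declared at line %d but never used",
                      $entry->{name}, $entry->{line});
        }
    }
    return @warnings;
}

my %table = (
    main => [
        { name => 'x', count => 0, type => 'var', line => 1 },
        { name => 'y', count => 3, type => 'var', line => 2 },
    ],
);

my @found_warnings = unused_warnings(\%table);
```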
Error Handling
fatal_error
$parser->fatal_error(
    error  => "Error message",
    reason => "Additional context",
    code   => $code,
    pos    => $position,
    line   => $line_number,
    var    => $var_name,
);
Throws a fatal parsing error with detailed context information including:
File name and line number
Error position with visual indicator
Error message and reason
Suggestions for similar variable names (if var is provided)
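To make the "error position with visual indicator" idea concrete, here is a self-contained sketch that derives the line number, the offending line, and a caret marker from a source string and a character offset. The `error_context` helper and its output layout are hypothetical; the exact format produced by fatal_error is not documented here.

```perl
use strict;
use warnings;

# error_context() is a hypothetical helper, not part of Sidef::Parser.
sub error_context {
    my ($code, $pos) = @_;

    my $before  = substr($code, 0, $pos);
    my $line_no = 1 + ($before =~ tr/\n//);    # newlines before $pos

    # Start of the current line (0 when there is no preceding newline).
    my $line_start = rindex($before, "\n") + 1;

    # End of the current line (end of input when there is no newline).
    my $line_end = index($code, "\n", $pos);
    $line_end = length($code) if $line_end == -1;

    my $line_text = substr($code, $line_start, $line_end - $line_start);
    my $marker    = (' ' x ($pos - $line_start)) . '^';

    return ($line_no, $line_text, $marker);
}

my $code = "var x = 1\nsay foo\n";
my $pos  = index($code, 'foo');    # offset of the undeclared name
my ($line_no, $line_text, $marker) = error_context($code, $pos);

print "line $line_no:\n$line_text\n$marker\n";
```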
PARSER CONFIGURATION
The parser maintains several configuration hashes:
postfix_ops
Hash of postfix operators that can appear after an expression:
'--', '++', '...', '!', '!!'
hyper_ops
Hash of hyper/meta operators that transform other operators:
map => [1, 'map_operator']
pam => [1, 'pam_operator']
zip => [1, 'zip_operator']
wise => [1, 'wise_operator']
scalar => [1, 'scalar_operator']
rscalar => [1, 'rscalar_operator']
cross => [1, 'cross_operator']
unroll => [1, 'unroll_operator']
reduce => [0, 'reduce_operator']
lmap => [0, 'map_operator']
Format: [takes_args, method_name]
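The `[takes_args, method_name]` format can be read as in the sketch below, which copies the table above into a plain hash. `resolve_hyper_op` is a hypothetical helper used only to show how an entry is unpacked.

```perl
use strict;
use warnings;

# Table copied from the listing above.
my %hyper_ops = (
    map     => [1, 'map_operator'],
    pam     => [1, 'pam_operator'],
    zip     => [1, 'zip_operator'],
    wise    => [1, 'wise_operator'],
    scalar  => [1, 'scalar_operator'],
    rscalar => [1, 'rscalar_operator'],
    cross   => [1, 'cross_operator'],
    unroll  => [1, 'unroll_operator'],
    reduce  => [0, 'reduce_operator'],
    lmap    => [0, 'map_operator'],
);

# resolve_hyper_op() is a hypothetical helper, not a parser method.
sub resolve_hyper_op {
    my ($name) = @_;
    my $spec = $hyper_ops{$name} or return;    # unknown name: empty list
    my ($takes_args, $method_name) = @$spec;
    return ($takes_args, $method_name);
}

my ($takes_args, $method_name) = resolve_hyper_op('reduce');
```

In this table, `lmap` maps to the same method name as `map` but does not take an argument.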
built_in_classes
Hash of built-in class names like:
File, Array, String, Number, Hash, Regex, etc.
keywords
Hash of reserved keywords:
if, elsif, else, while, for, foreach, func, class, module,
return, break, next, var, const, static, import, include, etc.
Delimiters
The parser supports extensive delimiter pairs for strings and grouping:
( ) [ ] { } < >
« » ‹ › " " ' '
And many more Unicode paired delimiters
SPECIAL FEATURES
Here-Documents
Support for here-documents with optional indentation:
<<EOF # Basic here-doc
<<'EOF' # Non-interpolating
<<"EOF" # Interpolating (default)
<<-EOF # With indentation stripping
Quote Operators
A variety of quote operators for different literal types:
%q/.../ # String (non-interpolating)
%Q/.../ # String (interpolating)
%w/.../ # Word array
%i/.../ # Integer array
%r/.../ # Regex
%f/.../ # File object
%x/.../ # Backtick command
Magic Variables
Support for Perl-compatible magic variables:
$. $? $$ $! $@ $/ etc.
Number Formats
Support for various number literal formats:
123 # Decimal
0b1010 # Binary
0o755 # Octal
0xFF # Hexadecimal
3.14 # Float
1.5e10 # Scientific notation
42i # Imaginary
1.23f # Explicit float
¹²³ # Superscript (for exponents)
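For readers who want to see how the prefixed formats relate to plain values, the following sketch converts a few of the literal forms above using native Perl numbers. It is not how the parser does it: `literal_value` is a hypothetical helper, it ignores the `i`, `f`, and superscript forms, and native Perl numbers are an assumption of the sketch.

```perl
use strict;
use warnings;

# literal_value() is a hypothetical helper, not part of Sidef::Parser.
sub literal_value {
    my ($lit) = @_;

    # Perl's oct() understands the 0b and 0x prefixes directly.
    return oct($lit) if $lit =~ /\A0b[01]+\z/;
    return oct($lit) if $lit =~ /\A0x[0-9a-fA-F]+\z/;

    # Strip the 0o prefix and let oct() read the bare octal digits.
    return oct($1) if $lit =~ /\A0o([0-7]+)\z/;

    # Decimal, float, and scientific notation.
    return $lit + 0 if $lit =~ /\A\d+(?:\.\d+)?(?:[eE][+-]?\d+)?\z/;

    return;    # unsupported form in this sketch
}

my $bin = literal_value('0b1010');
my $oct = literal_value('0o755');
my $hex = literal_value('0xFF');
```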
REGULAR EXPRESSIONS
The parser uses several compiled regular expressions for efficiency:
static_obj_re - Matches static objects like true, false, nil, built-in types
prefix_obj_re - Matches prefix keywords like if, while, return
quote_operators_re - Matches quote-like operators
operators_re - Matches all operators including symbolic and Unicode
var_name_re - Matches valid variable names
method_name_re - Matches valid method names
match_flags_re - Matches regex modifier flags
SYMBOL TABLE
The parser maintains a hierarchical symbol table with:
vars - Hash of arrays containing variable information per namespace
ref_vars_refs - Referenced variables from outer scopes
class - Current namespace/class context
Each variable entry contains:
{
    obj   => $variable_object,
    name  => $variable_name,
    count => $usage_count,
    type  => $declaration_type,
    line  => $declaration_line,
}
AUTHOR
Daniel "Trizen" Șuteu
LICENSE
This module is free software; you can redistribute it and/or modify it under the same terms as Sidef itself.
SEE ALSO
Sidef - The Sidef programming language
https://github.com/trizen/sidef - Sidef on GitHub