NAME
Sidef::Optimizer - AST optimization and constant folding for Sidef programs
SYNOPSIS
use Sidef::Optimizer;
my $optimizer = Sidef::Optimizer->new(%options);
my %optimized_ast = $optimizer->optimize($ast_struct);
my $optimized_expr = $optimizer->optimize_expr($expression);
DESCRIPTION
The Sidef::Optimizer module provides Abstract Syntax Tree (AST) optimization capabilities for the Sidef programming language. It performs constant folding, expression simplification, and other compile-time optimizations to improve runtime performance.
The optimizer works by traversing the AST and applying transformation rules based on the types of objects and operations involved. When expressions can be evaluated at compile-time (such as 2 + 3 or "hello".uc), the optimizer replaces them with their computed results.
METHODS
new
my $optimizer = Sidef::Optimizer->new(%options);
Creates a new optimizer instance. The %options hash can contain various configuration parameters for the optimization process.
Returns a blessed Sidef::Optimizer object.
optimize
my %optimized = $optimizer->optimize($struct);
my $optimized_ref = $optimizer->optimize($struct); # scalar context
Optimizes an entire AST structure. The input $struct is a hash reference where keys are class names and values are arrays of expressions.
In list context, returns a hash of optimized expressions grouped by class. In scalar context, returns either a hash reference (if multiple expressions) or the last optimized expression (if only one).
Parameters:
$struct- Hash reference containing the AST structure to optimize
Returns: Optimized AST structure (hash in list context, varies in scalar context)
optimize_expr
my $optimized = $optimizer->optimize_expr($expression);
Optimizes a single expression node from the AST. This is the core optimization method that handles different node types and applies appropriate transformations.
The method recognizes and optimizes various node types including:
Primitive objects (strings, numbers, booleans, etc.)
Variables (local, static, const, define)
Control structures (if/else, loops, try/catch)
Method calls with constant arguments
Block structures
Array and hash indexing operations
Parameters:
$expr- Expression hash reference or object to optimize
Returns: Optimized expression (object or hash reference)
OPTIMIZATION FEATURES
Constant Folding
The optimizer can evaluate constant expressions at compile-time:
# Before optimization
var x = 2 + (3 * 4);
# After optimization (conceptually)
var x = 14;
Supported for operations on:
Numbers (arithmetic, bitwise, comparison)
Strings (concatenation, case conversion, methods)
Booleans (logical operations)
Arrays (many methods when arguments are constant)
Complex number types (Gaussian integers, fractions, quaternions, etc.)
Method Call Optimization
When method receivers and arguments are all compile-time constants, the method is invoked during optimization:
# "HELLO" computed at compile-time
var greeting = "hello".uc;
# 120 computed at compile-time
var factorial = 5.factorial;
Unary Operator Optimization
Special handling for unary operators like negation, sqrt, and logical not:
# -42 computed at compile-time
var x = -42;
# 3 computed at compile-time
var y = √9;
SUPPORTED TYPES AND OPERATIONS
The optimizer maintains extensive rule tables for various Sidef types:
String Operations
Optimizable string methods include:
Case conversion:
lc,uc,fc,tc,wc,tclc,lcfirstTrimming:
trim,trim_beg,trim_end,chop,chompInspection:
chars_len,bytes_len,graphs_len,is_emptyTransformation:
reverse,sort,repeat,splitComparison:
eq,ne,lt,gt,le,ge,cmpOperations:
concat,prepend,index,slice,rangeEncoding:
encode_utf8,decode_utf8,encode,decodeAnd many more...
Number Operations
Optimizable numeric methods include:
Arithmetic:
+,-,*,/,%,**Bitwise:
&,|,^,<<,>>Comparison:
==,!=,<,>,<=,>=,<=>Mathematical:
sqrt,abs,sin,cos,tan,log,expNumber theory:
factorial,is_prime,divisors,sigmaConversion:
int,float,rat,complexAnd extensive other mathematical operations...
Array Operations
Optimizable array methods include:
Access:
first,last,item,sliceModification:
sort,reverse,unique,flattenInspection:
len,is_empty,contains,indexReduction:
sum,prod,min,max,reduceGeneration:
permutations,combinations,subsetsAnd more...
Complex Number Types
The optimizer supports operations on:
Gaussian integers (complex numbers with integer components)
Fractions (rational numbers)
Modular numbers (integers modulo n)
Quadratic numbers (a + b√d form)
Quaternions (4D extension of complex numbers)
Each type has specific optimization rules for their operations.
Range Objects
Special handling for:
RangeNumber - numeric ranges with optional step
RangeString - character/string ranges
Methods like first, last, length, reverse are optimizable.
INTERNAL FUNCTIONS
methods
my @method_refs = methods($package, @method_names);
Helper function that loads a package and returns method references for the specified method names. Uses caching to avoid repeated lookups.
Parameters:
$package- Fully qualified package name@method_names- List of method names to retrieve
Returns: List of code references
dtypes
my @method_refs = dtypes($datatype, @method_names);
Similar to methods, but works with data type to type class mappings. Converts data type names to their corresponding type class and retrieves methods.
Parameters:
$datatype- Data type identifier@method_names- List of method names to retrieve
Returns: List of code references
table
my $table_ref = table(@type_constants);
Creates an array reference from type constants, used in building optimization rule tables.
Parameters:
@type_constants- List of type constant values
Returns: Array reference
build_tree
my $tree = build_tree(@data);
Constructs a hierarchical tree structure for optimization rules. The tree enables efficient lookup of optimization opportunities based on object type and method signature.
Parameters:
@data- Array of rule specifications
Returns: Hash reference representing the rule tree
OPTIMIZATION RULES
The module maintains a comprehensive %rules hash that maps type classes to their optimizable operations. Each rule specifies:
The method or operation name
Expected argument type signatures
Conditions for optimization
The rule structure is built hierarchically using nested hashes for efficient lookup during optimization.
TYPE CONSTANTS
The module defines constants for all supported Sidef types:
STRING -
Sidef::Types::String::StringNUMBER -
Sidef::Types::Number::NumberBOOL -
Sidef::Types::Bool::BoolARRAY -
Sidef::Types::Array::ArrayREGEX -
Sidef::Types::Regex::RegexMOD -
Sidef::Types::Number::ModGAUSS -
Sidef::Types::Number::GaussFRACTION -
Sidef::Types::Number::FractionQUADRATIC -
Sidef::Types::Number::QuadraticQUATERNION -
Sidef::Types::Number::QuaternionRANGENUM -
Sidef::Types::Range::RangeNumberRANGESTR -
Sidef::Types::Range::RangeStringAnd corresponding data type variants (
*_DT)
CACHING
The optimizer uses several caching mechanisms:
%cache - Caches module paths and method lookups
%addr - Tracks processed object addresses to prevent infinite loops
These caches are crucial for performance when optimizing large ASTs.
SPECIAL NODE TYPES
The optimizer handles various special AST node types:
Variables
Sidef::Variable::Variable- Regular variablesSidef::Variable::Static- Static variablesSidef::Variable::Const- ConstantsSidef::Variable::Define- Defined valuesSidef::Variable::Init- Initialization expressionsSidef::Variable::NamedParam- Named parameters
Control Structures
Sidef::Types::Block::If- If/elsif/else conditionalsSidef::Types::Block::With- With/else blocksSidef::Types::Block::While- While loopsSidef::Types::Block::ForEach- For-each loopsSidef::Types::Block::ForIn- For-in loopsSidef::Types::Block::CFor- C-style for loopsSidef::Types::Block::Try- Try/catch blocksSidef::Types::Block::Given- Given/when constructsSidef::Types::Block::Loop- Generic loops
Other Special Nodes
Sidef::Types::Bool::Ternary- Ternary conditional operatorSidef::Types::Array::HCArray- Hybrid code/array structureSidef::Meta::Assert- Assertion expressionsSidef::Meta::Module- Module definitionsSidef::Meta::Included- Included codeSidef::Meta::PrefixMethod- Prefix method calls
OPTIMIZATION WORKFLOW
The typical optimization workflow:
- 1. Parse source code into AST
- 2. Create optimizer instance
- 3. Call
optimize()on the AST structure - 4. Optimizer recursively processes all nodes
- 5. Constant expressions are evaluated
- 6. Method calls on constants are executed
- 7. Optimized AST is returned
- 8. Optimized AST is used for code generation
PERFORMANCE CONSIDERATIONS
The optimizer caches method lookups extensively
Address tracking prevents redundant processing of shared nodes
Rule lookup uses hierarchical hashing for O(1) access
Only safe, side-effect-free operations are optimized
Type checking ensures correctness of optimizations
LIMITATIONS
Only pure functions with constant arguments are optimized
Methods with side effects are not optimized
Very large constant arrays/strings may not be optimized
User-defined methods are generally not optimized
Some complex control flow may limit optimization
DEPENDENCIES
utf8 - UTF-8 handling
Scalar::Util -
refaddrfor object trackingAlgorithm::Loops -
NestedLoopsfor rule tree buildingVarious
Sidef::Types::*modules for type implementations
SEE ALSO
Sidef - The Sidef programming language
Sidef::Parser - Sidef parser that generates the AST
Sidef::Deparse - AST to source code converter
Sidef::Types::* - Sidef type implementations
AUTHOR
Daniel Șuteu (trizen)
LICENSE
This module is free software; you can redistribute it and/or modify it under the same terms as Sidef itself.