NAME

Sidef::Optimizer - AST optimization and constant folding for Sidef programs

SYNOPSIS

use Sidef::Optimizer;

my $optimizer = Sidef::Optimizer->new(%options);
my %optimized_ast = $optimizer->optimize($ast_struct);
my $optimized_expr = $optimizer->optimize_expr($expression);

DESCRIPTION

The Sidef::Optimizer module provides Abstract Syntax Tree (AST) optimization capabilities for the Sidef programming language. It performs constant folding, expression simplification, and other compile-time optimizations to improve runtime performance.

The optimizer works by traversing the AST and applying transformation rules based on the types of objects and operations involved. When expressions can be evaluated at compile-time (such as 2 + 3 or "hello".uc), the optimizer replaces them with their computed results.

METHODS

new

my $optimizer = Sidef::Optimizer->new(%options);

Creates a new optimizer instance. The %options hash can contain various configuration parameters for the optimization process.

Returns a blessed Sidef::Optimizer object.

optimize

my %optimized = $optimizer->optimize($struct);
my $optimized_ref = $optimizer->optimize($struct);  # scalar context

Optimizes an entire AST structure. The input $struct is a hash reference where keys are class names and values are arrays of expressions.

In list context, returns a hash of optimized expressions grouped by class. In scalar context, returns either a hash reference (if multiple expressions) or the last optimized expression (if only one).

Parameters:

  • $struct - Hash reference containing the AST structure to optimize

Returns: Optimized AST structure (hash in list context, varies in scalar context)

optimize_expr

my $optimized = $optimizer->optimize_expr($expression);

Optimizes a single expression node from the AST. This is the core optimization method that handles different node types and applies appropriate transformations.

The method recognizes and optimizes various node types including:

  • Primitive objects (strings, numbers, booleans, etc.)

  • Variables (local, static, const, define)

  • Control structures (if/else, loops, try/catch)

  • Method calls with constant arguments

  • Block structures

  • Array and hash indexing operations

Parameters:

  • $expr - Expression hash reference or object to optimize

Returns: Optimized expression (object or hash reference)

OPTIMIZATION FEATURES

Constant Folding

The optimizer can evaluate constant expressions at compile-time:

# Before optimization
var x = 2 + (3 * 4);

# After optimization (conceptually)
var x = 14;

Supported for operations on:

  • Numbers (arithmetic, bitwise, comparison)

  • Strings (concatenation, case conversion, methods)

  • Booleans (logical operations)

  • Arrays (many methods when arguments are constant)

  • Complex number types (Gaussian integers, fractions, quaternions, etc.)

Method Call Optimization

When method receivers and arguments are all compile-time constants, the method is invoked during optimization:

# "HELLO" computed at compile-time
var greeting = "hello".uc;

# 120 computed at compile-time
var factorial = 5.factorial;

Unary Operator Optimization

Special handling for unary operators like negation, sqrt, and logical not:

# -42 computed at compile-time
var x = -42;

# 3 computed at compile-time
var y = √9;

SUPPORTED TYPES AND OPERATIONS

The optimizer maintains extensive rule tables for various Sidef types:

String Operations

Optimizable string methods include:

  • Case conversion: lc, uc, fc, tc, wc, tclc, lcfirst

  • Trimming: trim, trim_beg, trim_end, chop, chomp

  • Inspection: chars_len, bytes_len, graphs_len, is_empty

  • Transformation: reverse, sort, repeat, split

  • Comparison: eq, ne, lt, gt, le, ge, cmp

  • Operations: concat, prepend, index, slice, range

  • Encoding: encode_utf8, decode_utf8, encode, decode

  • And many more...

Number Operations

Optimizable numeric methods include:

  • Arithmetic: +, -, *, /, %, **

  • Bitwise: &, |, ^, <<, >>

  • Comparison: ==, !=, <, >, <=, >=, <=>

  • Mathematical: sqrt, abs, sin, cos, tan, log, exp

  • Number theory: factorial, is_prime, divisors, sigma

  • Conversion: int, float, rat, complex

  • And extensive other mathematical operations...

Array Operations

Optimizable array methods include:

  • Access: first, last, item, slice

  • Modification: sort, reverse, unique, flatten

  • Inspection: len, is_empty, contains, index

  • Reduction: sum, prod, min, max, reduce

  • Generation: permutations, combinations, subsets

  • And more...

Complex Number Types

The optimizer supports operations on:

  • Gaussian integers (complex numbers with integer components)

  • Fractions (rational numbers)

  • Modular numbers (integers modulo n)

  • Quadratic numbers (a + b√d form)

  • Quaternions (4D extension of complex numbers)

Each type has specific optimization rules for their operations.

Range Objects

Special handling for:

  • RangeNumber - numeric ranges with optional step

  • RangeString - character/string ranges

Methods like first, last, length, reverse are optimizable.

INTERNAL FUNCTIONS

methods

my @method_refs = methods($package, @method_names);

Helper function that loads a package and returns method references for the specified method names. Uses caching to avoid repeated lookups.

Parameters:

  • $package - Fully qualified package name

  • @method_names - List of method names to retrieve

Returns: List of code references

dtypes

my @method_refs = dtypes($datatype, @method_names);

Similar to methods, but works with data type to type class mappings. Converts data type names to their corresponding type class and retrieves methods.

Parameters:

  • $datatype - Data type identifier

  • @method_names - List of method names to retrieve

Returns: List of code references

table

my $table_ref = table(@type_constants);

Creates an array reference from type constants, used in building optimization rule tables.

Parameters:

  • @type_constants - List of type constant values

Returns: Array reference

build_tree

my $tree = build_tree(@data);

Constructs a hierarchical tree structure for optimization rules. The tree enables efficient lookup of optimization opportunities based on object type and method signature.

Parameters:

  • @data - Array of rule specifications

Returns: Hash reference representing the rule tree

OPTIMIZATION RULES

The module maintains a comprehensive %rules hash that maps type classes to their optimizable operations. Each rule specifies:

  • The method or operation name

  • Expected argument type signatures

  • Conditions for optimization

The rule structure is built hierarchically using nested hashes for efficient lookup during optimization.

TYPE CONSTANTS

The module defines constants for all supported Sidef types:

  • STRING - Sidef::Types::String::String

  • NUMBER - Sidef::Types::Number::Number

  • BOOL - Sidef::Types::Bool::Bool

  • ARRAY - Sidef::Types::Array::Array

  • REGEX - Sidef::Types::Regex::Regex

  • MOD - Sidef::Types::Number::Mod

  • GAUSS - Sidef::Types::Number::Gauss

  • FRACTION - Sidef::Types::Number::Fraction

  • QUADRATIC - Sidef::Types::Number::Quadratic

  • QUATERNION - Sidef::Types::Number::Quaternion

  • RANGENUM - Sidef::Types::Range::RangeNumber

  • RANGESTR - Sidef::Types::Range::RangeString

  • And corresponding data type variants (*_DT)

CACHING

The optimizer uses several caching mechanisms:

  • %cache - Caches module paths and method lookups

  • %addr - Tracks processed object addresses to prevent infinite loops

These caches are crucial for performance when optimizing large ASTs.

SPECIAL NODE TYPES

The optimizer handles various special AST node types:

Variables

  • Sidef::Variable::Variable - Regular variables

  • Sidef::Variable::Static - Static variables

  • Sidef::Variable::Const - Constants

  • Sidef::Variable::Define - Defined values

  • Sidef::Variable::Init - Initialization expressions

  • Sidef::Variable::NamedParam - Named parameters

Control Structures

  • Sidef::Types::Block::If - If/elsif/else conditionals

  • Sidef::Types::Block::With - With/else blocks

  • Sidef::Types::Block::While - While loops

  • Sidef::Types::Block::ForEach - For-each loops

  • Sidef::Types::Block::ForIn - For-in loops

  • Sidef::Types::Block::CFor - C-style for loops

  • Sidef::Types::Block::Try - Try/catch blocks

  • Sidef::Types::Block::Given - Given/when constructs

  • Sidef::Types::Block::Loop - Generic loops

Other Special Nodes

  • Sidef::Types::Bool::Ternary - Ternary conditional operator

  • Sidef::Types::Array::HCArray - Hybrid code/array structure

  • Sidef::Meta::Assert - Assertion expressions

  • Sidef::Meta::Module - Module definitions

  • Sidef::Meta::Included - Included code

  • Sidef::Meta::PrefixMethod - Prefix method calls

OPTIMIZATION WORKFLOW

The typical optimization workflow:

1. Parse source code into AST
2. Create optimizer instance
3. Call optimize() on the AST structure
4. Optimizer recursively processes all nodes
5. Constant expressions are evaluated
6. Method calls on constants are executed
7. Optimized AST is returned
8. Optimized AST is used for code generation

PERFORMANCE CONSIDERATIONS

  • The optimizer caches method lookups extensively

  • Address tracking prevents redundant processing of shared nodes

  • Rule lookup uses hierarchical hashing for O(1) access

  • Only safe, side-effect-free operations are optimized

  • Type checking ensures correctness of optimizations

LIMITATIONS

  • Only pure functions with constant arguments are optimized

  • Methods with side effects are not optimized

  • Very large constant arrays/strings may not be optimized

  • User-defined methods are generally not optimized

  • Some complex control flow may limit optimization

DEPENDENCIES

  • utf8 - UTF-8 handling

  • Scalar::Util - refaddr for object tracking

  • Algorithm::Loops - NestedLoops for rule tree building

  • Various Sidef::Types::* modules for type implementations

SEE ALSO

AUTHOR

Daniel Șuteu (trizen)

LICENSE

This module is free software; you can redistribute it and/or modify it under the same terms as Sidef itself.