NAME

Lugh::Autograd::Ops - Differentiable operations for automatic differentiation

SYNOPSIS

use Lugh;
use Lugh::Autograd;

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

# Create tensors with gradient tracking
my $a = Lugh::Autograd::Tensor->new($ctx, 'f32', 4, { requires_grad => 1 });
my $b = Lugh::Autograd::Tensor->new($ctx, 'f32', 4, { requires_grad => 1 });

$a->set_data(1.0, 2.0, 3.0, 4.0);
$b->set_data(2.0, 2.0, 2.0, 2.0);

# Element-wise operations
my $sum_result = Lugh::Autograd::Ops->add($ctx, $a, $b);
my $prod_result = Lugh::Autograd::Ops->mul($ctx, $a, $b);

# Reduction operations
my $total = Lugh::Autograd::Ops->sum($ctx, $prod_result);

# Build the computation graph and run the forward pass
my $graph = Lugh::Graph->new($ctx);
my $raw = Lugh::Tensor->from_ptr($total->_raw_tensor_ptr);
$graph->build_forward($raw);
$graph->compute($ctx, 1);

# Backward pass
$total->backward;

# Access gradients
my $grad_a = $a->grad;  # Gradients w.r.t. $a
my $grad_b = $b->grad;  # Gradients w.r.t. $b

DESCRIPTION

Lugh::Autograd::Ops provides differentiable tensor operations that automatically track gradients for backpropagation. Each operation records its inputs in the computation graph, enabling automatic gradient computation via the backward() method.

All operations return Lugh::Autograd::Tensor objects. If any input tensor has requires_grad set to true and gradient tracking is enabled globally, the output tensor will also track gradients.

CLASS METHODS

add

my $c = Lugh::Autograd::Ops->add($ctx, $a, $b);

Performs element-wise addition of two tensors.

Parameters:

  • $ctx - A Lugh::Context object used to allocate the result tensor

  • $a, $b - The input Lugh::Autograd::Tensor objects; both must have the same shape

Returns: A new Lugh::Autograd::Tensor containing $a + $b

Gradient: For z = x + y, the gradients are:

dL/dx = dL/dz
dL/dy = dL/dz

Example:

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });
my $y = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });

$x->set_data(1.0, 2.0, 3.0);
$y->set_data(4.0, 5.0, 6.0);

my $z = Lugh::Autograd::Ops->add($ctx, $x, $y);
$ctx->compute;

# z contains [5.0, 7.0, 9.0]
my @z_data = $z->get_data;

# Backward pass
my $loss = Lugh::Autograd::Ops->sum($ctx, $z);
$ctx->compute;
$loss->backward;

# Both gradients are [1.0, 1.0, 1.0] (gradient of sum flows equally)
my $grad_x = $x->grad;
my $grad_y = $y->grad;

mul

my $c = Lugh::Autograd::Ops->mul($ctx, $a, $b);

Performs element-wise multiplication of two tensors.

Parameters:

  • $ctx - A Lugh::Context object used to allocate the result tensor

  • $a, $b - The input Lugh::Autograd::Tensor objects; both must have the same shape

Returns: A new Lugh::Autograd::Tensor containing $a * $b

Gradient: For z = x * y, the gradients are:

dL/dx = dL/dz * y
dL/dy = dL/dz * x

Example:

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });
my $y = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });

$x->set_data(2.0, 3.0, 4.0);
$y->set_data(5.0, 6.0, 7.0);

my $z = Lugh::Autograd::Ops->mul($ctx, $x, $y);
$ctx->compute;

# z contains [10.0, 18.0, 28.0]
my @z_data = $z->get_data;

# Backward pass
my $loss = Lugh::Autograd::Ops->sum($ctx, $z);
$ctx->compute;
$loss->backward;

# grad_x = y values = [5.0, 6.0, 7.0]
# grad_y = x values = [2.0, 3.0, 4.0]
my $grad_x = $x->grad;
my $grad_y = $y->grad;

sum

my $scalar = Lugh::Autograd::Ops->sum($ctx, $a);

Reduces a tensor to a scalar by summing all elements.

Parameters:

  • $ctx - A Lugh::Context object used to allocate the result tensor

  • $a - The input Lugh::Autograd::Tensor to reduce

Returns: A new Lugh::Autograd::Tensor containing a single scalar value

Gradient: For y = sum(x), the gradient is:

dL/dx_i = dL/dy  (gradient broadcasts to all elements)

Example:

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 4, { requires_grad => 1 });
$x->set_data(1.0, 2.0, 3.0, 4.0);

my $total = Lugh::Autograd::Ops->sum($ctx, $x);
$ctx->compute;

# total contains [10.0] (scalar tensor)
my @total_data = $total->get_data;

# Backward pass
$total->backward;

# All gradients are 1.0 (sum distributes gradient equally)
my $grad = $x->grad;  # [1.0, 1.0, 1.0, 1.0]

sub

my $c = Lugh::Autograd::Ops->sub($ctx, $a, $b);

Performs element-wise subtraction of two tensors.

Gradient: For z = x - y:

dL/dx = dL/dz
dL/dy = -dL/dz
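
Example (a minimal sketch mirroring the add example above; the values in the comments follow from the gradient rule):

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });
my $y = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });

$x->set_data(5.0, 7.0, 9.0);
$y->set_data(1.0, 2.0, 3.0);

my $z = Lugh::Autograd::Ops->sub($ctx, $x, $y);
my $loss = Lugh::Autograd::Ops->sum($ctx, $z);
$ctx->compute;

# z contains [4.0, 5.0, 6.0]
my @z_data = $z->get_data;

$loss->backward;

# grad_x = [1.0, 1.0, 1.0], grad_y = [-1.0, -1.0, -1.0]
my $grad_x = $x->grad;
my $grad_y = $y->grad;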

div

my $c = Lugh::Autograd::Ops->div($ctx, $a, $b);

Performs element-wise division of two tensors.

Gradient: For z = x / y:

dL/dx = dL/dz / y
dL/dy = -dL/dz * x / y^2
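
Example (a minimal sketch mirroring the add example above; the values in the comments follow from the gradient rule):

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });
my $y = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });

$x->set_data(6.0, 8.0, 10.0);
$y->set_data(2.0, 4.0, 5.0);

my $z = Lugh::Autograd::Ops->div($ctx, $x, $y);
my $loss = Lugh::Autograd::Ops->sum($ctx, $z);
$ctx->compute;

# z contains [3.0, 2.0, 2.0]
my @z_data = $z->get_data;

$loss->backward;

# grad_x = 1 / y       = [0.5, 0.25, 0.2]
# grad_y = -x / y^2    = [-1.5, -0.5, -0.4]
my $grad_x = $x->grad;
my $grad_y = $y->grad;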

scale

my $c = Lugh::Autograd::Ops->scale($ctx, $a, $scalar);

Multiplies all elements of a tensor by a scalar value.

Parameters:

  • $ctx - A Lugh::Context object used to allocate the result tensor

  • $a - The input Lugh::Autograd::Tensor

  • $scalar - A plain Perl number; every element of $a is multiplied by it

Gradient: For y = s * x:

dL/dx = s * dL/dy
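
Example (a minimal sketch; $scalar is an ordinary Perl number, here 2.5):

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });
$x->set_data(1.0, 2.0, 3.0);

my $y = Lugh::Autograd::Ops->scale($ctx, $x, 2.5);
my $loss = Lugh::Autograd::Ops->sum($ctx, $y);
$ctx->compute;

# y contains [2.5, 5.0, 7.5]
my @y_data = $y->get_data;

$loss->backward;

# grad_x = s = [2.5, 2.5, 2.5]
my $grad_x = $x->grad;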

matmul

my $c = Lugh::Autograd::Ops->matmul($ctx, $a, $b);

Performs matrix multiplication of two tensors.

Gradient: For C = A @ B:

dL/dA = dL/dC @ B^T
dL/dB = A^T @ dL/dC

mean

my $scalar = Lugh::Autograd::Ops->mean($ctx, $a);

Reduces a tensor to a scalar by computing the mean of all elements.

Gradient: For y = mean(x):

dL/dx_i = dL/dy / n  (where n is the number of elements)
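
Example (a minimal sketch mirroring the sum example above):

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 4, { requires_grad => 1 });
$x->set_data(1.0, 2.0, 3.0, 4.0);

my $avg = Lugh::Autograd::Ops->mean($ctx, $x);
$ctx->compute;

# avg contains [2.5] (scalar tensor)
my @avg_data = $avg->get_data;

$avg->backward;

# Each gradient is 1/n = 0.25
my $grad = $x->grad;  # [0.25, 0.25, 0.25, 0.25]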

relu

my $c = Lugh::Autograd::Ops->relu($ctx, $a);

Applies the Rectified Linear Unit activation function element-wise.

Formula: relu(x) = max(0, x)

Gradient:

dL/dx = dL/dy if x > 0, else 0
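
Example (a minimal sketch; negative inputs show where the gradient is cut off):

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 4, { requires_grad => 1 });
$x->set_data(-2.0, -0.5, 0.5, 2.0);

my $y = Lugh::Autograd::Ops->relu($ctx, $x);
my $loss = Lugh::Autograd::Ops->sum($ctx, $y);
$ctx->compute;

# y contains [0.0, 0.0, 0.5, 2.0]
my @y_data = $y->get_data;

$loss->backward;

# grad_x = [0.0, 0.0, 1.0, 1.0] (gradient passes only where x > 0)
my $grad_x = $x->grad;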

gelu

my $c = Lugh::Autograd::Ops->gelu($ctx, $a);

Applies the Gaussian Error Linear Unit activation function element-wise.

Formula: gelu(x) = 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))

Used in transformer models like BERT and GPT.
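
Example (a forward-pass sketch; the output values are approximate and follow the tanh formula above):

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });
$x->set_data(-1.0, 0.0, 1.0);

my $y = Lugh::Autograd::Ops->gelu($ctx, $x);
$ctx->compute;

# y is approximately [-0.159, 0.0, 0.841]
my @y_data = $y->get_data;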

silu

my $c = Lugh::Autograd::Ops->silu($ctx, $a);

Applies the Sigmoid Linear Unit (Swish) activation function element-wise.

Formula: silu(x) = x * sigmoid(x)

Used in models like LLaMA and other modern architectures.

Gradient:

dL/dx = sigmoid(x) * (1 + x * (1 - sigmoid(x))) * dL/dy
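
Example (a forward-pass sketch; the output values are approximate and follow silu(x) = x * sigmoid(x)):

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });
$x->set_data(-1.0, 0.0, 1.0);

my $y = Lugh::Autograd::Ops->silu($ctx, $x);
$ctx->compute;

# y is approximately [-0.269, 0.0, 0.731]
my @y_data = $y->get_data;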

softmax

my $c = Lugh::Autograd::Ops->softmax($ctx, $a);

Applies the softmax function, converting logits to probabilities.

Formula: softmax(x)_i = exp(x_i) / sum(exp(x_j))

Output values are in range (0, 1) and sum to 1.

Gradient:

dL/dx_i = y_i * (dL/dy_i - sum_j(dL/dy_j * y_j))
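
Example (a forward-pass sketch; the probabilities are approximate and sum to 1):

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $logits = Lugh::Autograd::Tensor->new($ctx, 'f32', 3, { requires_grad => 1 });
$logits->set_data(1.0, 2.0, 3.0);

my $probs = Lugh::Autograd::Ops->softmax($ctx, $logits);
$ctx->compute;

# probs is approximately [0.090, 0.245, 0.665]
my @p = $probs->get_data;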

rms_norm

my $c = Lugh::Autograd::Ops->rms_norm($ctx, $a);
my $c = Lugh::Autograd::Ops->rms_norm($ctx, $a, $eps);  # custom epsilon

Applies Root Mean Square Layer Normalization.

Formula: rms_norm(x) = x / sqrt(mean(x^2) + eps)

Parameters:

  • $ctx - A Lugh::Context object used to allocate the result tensor

  • $a - The input Lugh::Autograd::Tensor

  • $eps - (Optional) Small constant for numerical stability, default 1e-5

Used in transformer models like LLaMA for efficient normalization.
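
Example (a forward-pass sketch using the default epsilon; the output values are approximate):

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 4, { requires_grad => 1 });
$x->set_data(1.0, 2.0, 3.0, 4.0);

my $y = Lugh::Autograd::Ops->rms_norm($ctx, $x);
$ctx->compute;

# rms = sqrt(mean(x^2)) = sqrt(7.5) ~= 2.739
# y is approximately [0.365, 0.730, 1.095, 1.461]
my @y_data = $y->get_data;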

GRADIENT TRACKING

Operations respect the global gradient tracking state controlled by Lugh::Autograd:

use Lugh::Autograd;

# Gradients tracked normally
my $c = Lugh::Autograd::Ops->add($ctx, $a, $b);

# Disable gradient tracking for efficiency
Lugh::Autograd::no_grad {
    my $inference = Lugh::Autograd::Ops->add($ctx, $a, $b);
    # $inference->requires_grad is false
};

When gradient tracking is disabled:

  • Output tensors have requires_grad = 0

  • No computation graph is built

  • Memory usage is reduced

COMPUTATION WORKFLOW

The typical workflow for using autograd operations is:

# 1. Create context and tensors
my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);
my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 10, { requires_grad => 1 });

# 2. Set input data
$x->set_data(1.0, 2.0, 3.0, ...);

# 3. Build computation graph (forward pass)
my $y = Lugh::Autograd::Ops->mul($ctx, $x, $x);  # x^2
my $loss = Lugh::Autograd::Ops->sum($ctx, $y);

# 4. Execute the computation
$ctx->compute;

# 5. Read forward pass results
my @loss_val = $loss->get_data;

# 6. Compute gradients (backward pass)
$loss->backward;

# 7. Read gradients
my $grad = $x->grad;  # Contains 2*x for each element

CHAINING OPERATIONS

Operations can be chained to build complex computation graphs:

my $ctx = Lugh::Context->new(mem_size => 16 * 1024 * 1024);

my $x = Lugh::Autograd::Tensor->new($ctx, 'f32', 4, { requires_grad => 1 });
my $w = Lugh::Autograd::Tensor->new($ctx, 'f32', 4, { requires_grad => 1 });
my $b = Lugh::Autograd::Tensor->new($ctx, 'f32', 4, { requires_grad => 1 });

$x->set_data(1.0, 2.0, 3.0, 4.0);
$w->set_data(0.5, 0.5, 0.5, 0.5);
$b->set_data(0.1, 0.1, 0.1, 0.1);

# Element-wise linear transform: y = w * x + b
my $wx = Lugh::Autograd::Ops->mul($ctx, $w, $x);
my $y = Lugh::Autograd::Ops->add($ctx, $wx, $b);
my $loss = Lugh::Autograd::Ops->sum($ctx, $y);

$ctx->compute;
$loss->backward;

# All leaf tensors now have gradients computed
my $grad_x = $x->grad;
my $grad_w = $w->grad;
my $grad_b = $b->grad;

ERROR HANDLING

Operations will die with an error message if:

  • The context is invalid or has been freed

  • Input tensors are not valid Lugh::Autograd::Tensor objects

  • Input tensors have been freed

  • Tensor shapes are incompatible for the operation

eval {
    my $result = Lugh::Autograd::Ops->add($ctx, $a, $b);
};
if ($@) {
    warn "Operation failed: $@";
}

SEE ALSO

Lugh, Lugh::Autograd, Lugh::Autograd::Tensor, Lugh::Context, Lugh::Graph, Lugh::Tensor

AUTHOR

LNATION <email@lnation.org>

LICENSE

This is free software; you can redistribute it and/or modify it under the same terms as Perl itself.