NAME

Lugh::Ops - Tensor Operations for Neural Network Computation

VERSION

Version 0.01

SYNOPSIS

use Lugh;

my $ctx = Lugh::Context->new(mem_size => 10 * 1024 * 1024);

# Create input tensors
my $a = Lugh::Tensor->new_f32($ctx, 100);
my $b = Lugh::Tensor->new_f32($ctx, 100);
$a->set_f32(@a_data);
$b->set_f32(@b_data);

# Arithmetic operations
my $sum = Lugh::Ops::add($ctx, $a, $b);
my $product = Lugh::Ops::mul($ctx, $a, $b);

# Matrix operations
my $w = Lugh::Tensor->new_f32($ctx, 100, 50);
my $x = Lugh::Tensor->new_f32($ctx, 100, 10);
my $y = Lugh::Ops::mul_mat($ctx, $w, $x);

# Activation functions
my $activated = Lugh::Ops::silu($ctx, $a);
my $probs = Lugh::Ops::soft_max($ctx, $a);

# Normalization
my $normed = Lugh::Ops::rms_norm($ctx, $a, 1e-5);

# Build and compute
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($sum);
$graph->compute($ctx, 4);

my @result = $sum->get_f32();

DESCRIPTION

Lugh::Ops provides tensor operations that form the building blocks of neural network computation. These operations create computation graph nodes that are evaluated lazily when the graph is computed.

All operations are static functions that take a context and input tensor(s), returning a new tensor representing the operation result.
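
Because each operation returns an ordinary tensor, results can be passed directly into further operations to build deeper graphs. A minimal illustration using only the functions documented below:

my $w = Lugh::Tensor->new_f32($ctx, 100, 50);   # weights [100, 50]
my $x = Lugh::Tensor->new_f32($ctx, 100, 10);   # input   [100, 10]
my $h = Lugh::Ops::mul_mat($ctx, $w, $x);       # [50, 10]
my $y = Lugh::Ops::silu($ctx, $h);              # result tensors chain directly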

Lazy Evaluation

Operations don't compute results immediately. Instead, they build a computation graph:

my $a = Lugh::Tensor->new_f32($ctx, 100);
my $b = Lugh::Tensor->new_f32($ctx, 100);
my $c = Lugh::Ops::add($ctx, $a, $b);  # No computation yet!

my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($c);
$graph->compute($ctx, 4);  # Computation happens here

my @result = $c->get_f32();  # Now we can read results

This allows ggml to optimize the computation and use multiple threads.

FUNCTIONS

add

my $c = Lugh::Ops::add($ctx, $a, $b);

Element-wise addition of two tensors.

Parameters:

  • $ctx - A Lugh::Context object

  • $a - First tensor

  • $b - Second tensor (must match shape or be broadcastable)

Returns: A new tensor C = A + B.

Example:

my $a = Lugh::Tensor->new_f32($ctx, 3);
$a->set_f32(1.0, 2.0, 3.0);
my $b = Lugh::Tensor->new_f32($ctx, 3);
$b->set_f32(4.0, 5.0, 6.0);
my $c = Lugh::Ops::add($ctx, $a, $b);
# Result after graph compute: [5.0, 7.0, 9.0]

mul

my $c = Lugh::Ops::mul($ctx, $a, $b);

Element-wise multiplication of two tensors.

Parameters:

  • $ctx - A Lugh::Context object

  • $a - First tensor

  • $b - Second tensor (must match shape or be broadcastable)

Returns: A new tensor C = A * B (element-wise).

Example:

my $a = Lugh::Tensor->new_f32($ctx, 3);
$a->set_f32(1.0, 2.0, 3.0);
my $b = Lugh::Tensor->new_f32($ctx, 3);
$b->set_f32(4.0, 5.0, 6.0);
my $c = Lugh::Ops::mul($ctx, $a, $b);
# Result after graph compute: [4.0, 10.0, 18.0]

mul_mat

my $c = Lugh::Ops::mul_mat($ctx, $a, $b);

Matrix multiplication.

Parameters:

  • $ctx - A Lugh::Context object

  • $a - Left matrix [K, N]

  • $b - Right matrix [K, M]

Returns: A new tensor C = A^T × B with shape [N, M].

Note: ggml's mul_mat has specific dimension semantics. For standard matrix multiply A × B where A is [M, K] and B is [K, N]:

# Transpose A and pass to mul_mat
C = mul_mat(A^T, B)  # C is [M, N]

In practice for neural networks:

# Weight matrix W: [in_dim, out_dim]  (ggml dimension order; logically out_dim × in_dim)
# Input X: [in_dim, batch]
# Output: [out_dim, batch]
my $output = Lugh::Ops::mul_mat($ctx, $weights, $input);

Example:

my $w = Lugh::Tensor->new_f32($ctx, 100, 50);   # [100, 50]
my $x = Lugh::Tensor->new_f32($ctx, 100, 10);   # [100, 10]
my $y = Lugh::Ops::mul_mat($ctx, $w, $x);       # [50, 10]
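
A full linear layer combines mul_mat with add for the bias. This is a sketch only: the bias shape and its broadcasting over the batch dimension are assumptions based on the BROADCASTING section below.

# Hypothetical linear layer y = W·x + b (dimensions in ggml order)
my $w    = Lugh::Tensor->new_f32($ctx, 512, 256);  # weights [in=512, out=256]
my $x    = Lugh::Tensor->new_f32($ctx, 512, 8);    # input   [in=512, batch=8]
my $bias = Lugh::Tensor->new_f32($ctx, 256);       # bias    [out=256]
my $h    = Lugh::Ops::mul_mat($ctx, $w, $x);       # [256, 8]
my $y    = Lugh::Ops::add($ctx, $h, $bias);        # bias broadcast over batch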

soft_max

my $probs = Lugh::Ops::soft_max($ctx, $logits);

Applies softmax to convert logits to probabilities.

Parameters:

  • $ctx - A Lugh::Context object

  • $logits - Input tensor

Returns: A new tensor with softmax applied along the first dimension.

Formula:

softmax(x_i) = exp(x_i) / Σ exp(x_j)

The output sums to 1.0 along the softmax dimension.

Example:

my $logits = Lugh::Tensor->new_f32($ctx, 3);
$logits->set_f32(2.0, 1.0, 0.1);
my $probs = Lugh::Ops::soft_max($ctx, $logits);
# Result after graph compute: [0.659, 0.242, 0.099]  (sums to 1.0)
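
Reading the probabilities back requires running the node through a graph; checking that they sum to 1.0 is a handy sanity test. This reuses the graph pattern from the SYNOPSIS:

use List::Util qw(sum);

my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($probs);
$graph->compute($ctx, 4);

my @p = $probs->get_f32();
printf "sum = %.6f\n", sum(@p);   # approximately 1.000000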

rms_norm

my $normed = Lugh::Ops::rms_norm($ctx, $x, $eps);

Applies Root Mean Square Layer Normalization.

Parameters:

  • $ctx - A Lugh::Context object

  • $x - Input tensor

  • $eps - Epsilon for numerical stability (e.g., 1e-5)

Returns: A new tensor with RMSNorm applied.

Formula:

RMSNorm(x) = x / √(mean(x²) + ε)

Unlike LayerNorm, RMSNorm does not center the values (no mean subtraction).

Example:

my $x = Lugh::Tensor->new_f32($ctx, 100);
$x->set_f32(@x_data);
my $normed = Lugh::Ops::rms_norm($ctx, $x, 1e-5);
# RMS of $normed is approximately 1.0 after graph compute
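
The claim about the output's RMS can be checked in plain Perl after computing the graph:

use List::Util qw(sum);

my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($normed);
$graph->compute($ctx, 4);

my @v   = $normed->get_f32();
my $rms = sqrt( sum(map { $_ * $_ } @v) / scalar @v );
printf "RMS = %.4f\n", $rms;   # approximately 1.0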

silu

my $activated = Lugh::Ops::silu($ctx, $x);

Applies the SiLU (Sigmoid Linear Unit) activation function.

Parameters:

  • $ctx - A Lugh::Context object

  • $x - Input tensor

Returns: A new tensor with SiLU applied element-wise.

Formula:

SiLU(x) = x × σ(x) = x / (1 + exp(-x))

Also known as "Swish" activation. Used in modern LLMs like LLaMA.

Example:

my $x = Lugh::Tensor->new_f32($ctx, 3);
$x->set_f32(-1.0, 0.0, 1.0);
my $y = Lugh::Ops::silu($ctx, $x);
# Result after graph compute: [-0.269, 0.0, 0.731]
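
In LLaMA-style feed-forward blocks, SiLU is typically combined with an element-wise gate (SwiGLU). The following is a sketch built only from the documented operations; the dimensions are illustrative, not taken from any particular model:

# SwiGLU-style gate: silu(W1·x) multiplied element-wise by (W3·x)
my $w1   = Lugh::Tensor->new_f32($ctx, 512, 2048);  # [in, hidden]
my $w3   = Lugh::Tensor->new_f32($ctx, 512, 2048);  # [in, hidden]
my $x    = Lugh::Tensor->new_f32($ctx, 512);        # [in]
my $gate = Lugh::Ops::silu($ctx, Lugh::Ops::mul_mat($ctx, $w1, $x));
my $up   = Lugh::Ops::mul_mat($ctx, $w3, $x);
my $ffn  = Lugh::Ops::mul($ctx, $gate, $up);        # element-wise gate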

ADDITIONAL OPERATIONS

The XS code exposes these operations; additional ggml operations can be added as needed (a usage sketch follows the lists below):

Unary Operations

  • neg - Negate: -x

  • abs - Absolute value: |x|

  • sqr - Square: x²

  • sqrt - Square root: √x

  • exp - Exponential: e^x

  • log - Natural logarithm: ln(x)

  • sin, cos - Trigonometric functions

  • relu - ReLU: max(0, x)

  • gelu - GELU activation

  • tanh - Hyperbolic tangent

Binary Operations

  • sub - Subtraction: a - b

  • div - Division: a / b

  • scale - Scale: a × scalar

Reduction Operations

  • sum - Sum of elements

  • mean - Mean of elements

  • max - Maximum element

  • min - Minimum element

Shape Operations

  • reshape - Change tensor shape

  • permute - Transpose dimensions

  • transpose - Swap two dimensions

  • view - Create a view with different strides

  • cont - Make tensor contiguous in memory

Attention Operations

  • diag_mask_inf - Apply causal mask (upper triangle → -inf)

  • rope_ext - Apply rotary position embeddings

  • flash_attn_ext - Flash attention (optimized)
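
Assuming these are exposed with the same calling convention as the documented functions (only the functions above are documented here, so treat this as an assumption), usage would look like:

# Hypothetical calls, following the (context, tensors...) convention above
my $r = Lugh::Ops::relu($ctx, $x);      # unary
my $d = Lugh::Ops::sub($ctx, $a, $b);   # binary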

BROADCASTING

Operations support NumPy-style broadcasting:

# Scalar × Vector
my $scalar = Lugh::Tensor->new_f32($ctx, 1);
my $vector = Lugh::Tensor->new_f32($ctx, 100);
my $result = Lugh::Ops::mul($ctx, $scalar, $vector);  # [100]

# Row × Matrix
my $row = Lugh::Tensor->new_f32($ctx, 100, 1);     # [100, 1]
my $matrix = Lugh::Tensor->new_f32($ctx, 100, 50); # [100, 50]
my $result = Lugh::Ops::mul($ctx, $row, $matrix);  # [100, 50]

Broadcasting rules:

1. Dimensions are compared from right to left
2. Dimensions match if equal or one of them is 1
3. Missing dimensions are treated as 1
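
Applying these rules, a bias vector can be broadcast across a matrix that shares its leading dimension. This assumes add broadcasts the same way as mul in the examples above:

# [100] vs [100, 50]: the missing dimension is treated as 1, then repeated to 50
my $bias   = Lugh::Tensor->new_f32($ctx, 100);      # [100]
my $matrix = Lugh::Tensor->new_f32($ctx, 100, 50);  # [100, 50]
my $biased = Lugh::Ops::add($ctx, $matrix, $bias);  # [100, 50]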

OPERATION FUSION

ggml automatically fuses compatible operations when building the computation graph, reducing memory traffic and improving performance:

# These may be fused into a single kernel:
my $x = Lugh::Ops::mul($ctx, $a, $b);
my $y = Lugh::Ops::add($ctx, $x, $c);

GPU ACCELERATION

On supported platforms, operations automatically use:

  • Metal - Apple GPU acceleration on macOS

  • CUDA - NVIDIA GPU acceleration

  • Vulkan - Cross-platform GPU

  • BLAS - Accelerate/OpenBLAS for matrix operations

No code changes are needed; ggml selects the best available backend.

THREAD SAFETY

Operation functions are thread-safe; they only create graph nodes. The actual computation happens in $graph->compute(), which handles parallelization internally.
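
The intended pattern is to build nodes from any code path and let a single compute() call do the parallel work (the second argument to compute() is taken here to be the thread count, matching the SYNOPSIS):

my $c = Lugh::Ops::add($ctx, $a, $b);   # cheap: only records a graph node

my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($c);
$graph->compute($ctx, 8);               # parallel evaluation happens here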

SEE ALSO

Lugh, Lugh::Context, Lugh::Tensor, Lugh::Graph

https://github.com/ggerganov/ggml - ggml library

AUTHOR

lnation <email@lnation.org>

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
