NAME
Lugh::Ops - Tensor Operations for Neural Network Computation
VERSION
Version 0.01
SYNOPSIS
use Lugh;
my $ctx = Lugh::Context->new(mem_size => 10 * 1024 * 1024);
# Create input tensors
my $a = Lugh::Tensor->new_f32($ctx, 100);
my $b = Lugh::Tensor->new_f32($ctx, 100);
$a->set_f32(@a_data);
$b->set_f32(@b_data);
# Arithmetic operations
my $sum = Lugh::Ops::add($ctx, $a, $b);
my $product = Lugh::Ops::mul($ctx, $a, $b);
# Matrix operations
my $w = Lugh::Tensor->new_f32($ctx, 100, 50);
my $x = Lugh::Tensor->new_f32($ctx, 100, 10);
my $y = Lugh::Ops::mul_mat($ctx, $w, $x);
# Activation functions
my $activated = Lugh::Ops::silu($ctx, $a);
my $probs = Lugh::Ops::soft_max($ctx, $a);
# Normalization
my $normed = Lugh::Ops::rms_norm($ctx, $a, 1e-5);
# Build and compute
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($sum);
$graph->compute($ctx, 4);
my @result = $sum->get_f32();
DESCRIPTION
Lugh::Ops provides tensor operations that form the building blocks of neural network computation. These operations create computation graph nodes that are evaluated lazily when the graph is computed.
All operations are plain package functions (not methods). Each takes a context and one or more input tensors and returns a new tensor representing the result of the operation.
Lazy Evaluation
Operations don't compute results immediately. Instead, they build a computation graph:
my $a = Lugh::Tensor->new_f32($ctx, 100);
my $b = Lugh::Tensor->new_f32($ctx, 100);
my $c = Lugh::Ops::add($ctx, $a, $b); # No computation yet!
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($c);
$graph->compute($ctx, 4); # Computation happens here
my @result = $c->get_f32(); # Now we can read results
This allows ggml to optimize the computation and use multiple threads.
FUNCTIONS
add
my $c = Lugh::Ops::add($ctx, $a, $b);
Element-wise addition of two tensors.
Parameters:
$ctx - A Lugh::Context object
$a - First tensor
$b - Second tensor (must match shape or be broadcastable)
Returns: A new tensor C = A + B.
Example:
my $a = ...; # [1.0, 2.0, 3.0]
my $b = ...; # [4.0, 5.0, 6.0]
my $c = Lugh::Ops::add($ctx, $a, $b);
# Result: [5.0, 7.0, 9.0]
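The snippet below is a minimal end-to-end sketch assembled from the calls shown in the SYNOPSIS; the sizes and values are illustrative.
use Lugh;
# Allocate a context large enough for the tensors and the graph
my $ctx = Lugh::Context->new(mem_size => 10 * 1024 * 1024);
# Two 3-element f32 tensors
my $a = Lugh::Tensor->new_f32($ctx, 3);
my $b = Lugh::Tensor->new_f32($ctx, 3);
$a->set_f32(1.0, 2.0, 3.0);
$b->set_f32(4.0, 5.0, 6.0);
# Creates a graph node only; nothing is computed yet
my $c = Lugh::Ops::add($ctx, $a, $b);
# Build the graph and evaluate it with 4 threads
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($c);
$graph->compute($ctx, 4);
my @result = $c->get_f32();   # (5.0, 7.0, 9.0)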
mul
my $c = Lugh::Ops::mul($ctx, $a, $b);
Element-wise multiplication of two tensors.
Parameters:
$ctx - A Lugh::Context object
$a - First tensor
$b - Second tensor (must match shape or be broadcastable)
Returns: A new tensor C = A * B (element-wise).
Example:
my $a = ...; # [1.0, 2.0, 3.0]
my $b = ...; # [4.0, 5.0, 6.0]
my $c = Lugh::Ops::mul($ctx, $a, $b);
# Result: [4.0, 10.0, 18.0]
mul_mat
my $c = Lugh::Ops::mul_mat($ctx, $a, $b);
Matrix multiplication.
Parameters:
$ctx - A Lugh::Context object
$a - Left matrix [K, N]
$b - Right matrix [K, M]
Returns: A new tensor C = A^T × B with shape [N, M].
Note: ggml's mul_mat has specific dimension semantics. For standard matrix multiply A × B where A is [M, K] and B is [K, N]:
# Transpose A and pass to mul_mat
C = mul_mat(A^T, B) # C is [M, N]
In practice for neural networks:
# Weight matrix W: [in_dim, out_dim]
# Input X: [in_dim, batch]
# Output: [out_dim, batch]
my $output = Lugh::Ops::mul_mat($ctx, $weights, $input);
Example:
my $w = Lugh::Tensor->new_f32($ctx, 100, 50); # [100, 50]
my $x = Lugh::Tensor->new_f32($ctx, 100, 10); # [100, 10]
my $y = Lugh::Ops::mul_mat($ctx, $w, $x); # [50, 10]
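As a sketch of the linear-layer pattern above, with small illustrative dimensions (in_dim = 4, out_dim = 2, batch = 3) and only calls already shown in this document:
my $ctx = Lugh::Context->new(mem_size => 10 * 1024 * 1024);
my $w = Lugh::Tensor->new_f32($ctx, 4, 2);   # weights: [in_dim, out_dim]
my $x = Lugh::Tensor->new_f32($ctx, 4, 3);   # input:   [in_dim, batch]
$w->set_f32(map { $_ / 10 } 1 .. 8);         # 4 * 2 = 8 values
$x->set_f32((1.0) x 12);                     # 4 * 3 = 12 values
my $y = Lugh::Ops::mul_mat($ctx, $w, $x);    # output:  [out_dim, batch] = [2, 3]
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($y);
$graph->compute($ctx, 4);
my @out = $y->get_f32();                     # 2 * 3 = 6 values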
soft_max
my $probs = Lugh::Ops::soft_max($ctx, $logits);
Applies softmax to convert logits to probabilities.
Parameters:
$ctx - A Lugh::Context object
$logits - Input tensor
Returns: A new tensor with softmax applied along the first dimension.
Formula:
softmax(x_i) = exp(x_i) / Σ exp(x_j)
The output sums to 1.0 along the softmax dimension.
Example:
my $logits = ...; # [2.0, 1.0, 0.1]
my $probs = Lugh::Ops::soft_max($ctx, $logits);
# Result: [0.659, 0.242, 0.099] (sums to 1.0)
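A sketch that checks the normalization property, assuming a context $ctx created as in the SYNOPSIS:
my $logits = Lugh::Tensor->new_f32($ctx, 3);
$logits->set_f32(2.0, 1.0, 0.1);
my $probs = Lugh::Ops::soft_max($ctx, $logits);
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($probs);
$graph->compute($ctx, 1);
my @p = $probs->get_f32();   # approximately (0.659, 0.242, 0.099)
my $total = 0;
$total += $_ for @p;         # approximately 1.0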
rms_norm
my $normed = Lugh::Ops::rms_norm($ctx, $x, $eps);
Applies Root Mean Square Layer Normalization.
Parameters:
$ctx - A Lugh::Context object
$x - Input tensor
$eps - Epsilon for numerical stability (e.g., 1e-5)
Returns: A new tensor with RMSNorm applied.
Formula:
RMSNorm(x) = x / √(mean(x²) + ε)
Unlike LayerNorm, RMSNorm does not center the values (no mean subtraction).
Example:
my $x = ...;
my $normed = Lugh::Ops::rms_norm($ctx, $x, 1e-5);
# RMS of $normed is approximately 1.0
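A sketch that checks this property numerically, assuming a context $ctx created as in the SYNOPSIS:
my $x = Lugh::Tensor->new_f32($ctx, 4);
$x->set_f32(2.0, -4.0, 6.0, -8.0);
my $normed = Lugh::Ops::rms_norm($ctx, $x, 1e-5);
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($normed);
$graph->compute($ctx, 1);
my @n = $normed->get_f32();
my $mean_sq = 0;
$mean_sq += $_ * $_ for @n;
my $rms = sqrt($mean_sq / @n);   # approximately 1.0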
silu
my $activated = Lugh::Ops::silu($ctx, $x);
Applies the SiLU (Sigmoid Linear Unit) activation function.
Parameters:
$ctx - A Lugh::Context object
$x - Input tensor
Returns: A new tensor with SiLU applied element-wise.
Formula:
SiLU(x) = x × σ(x) = x / (1 + exp(-x))
Also known as the "Swish" activation; it is used in modern LLMs such as LLaMA.
Example:
my $x = ...; # [-1.0, 0.0, 1.0]
my $y = Lugh::Ops::silu($ctx, $x);
# Result: [-0.269, 0.0, 0.731]
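A sketch that cross-checks the op against the formula, assuming a context $ctx created as in the SYNOPSIS:
my @values = (-1.0, 0.0, 1.0);
my $x = Lugh::Tensor->new_f32($ctx, scalar @values);
$x->set_f32(@values);
my $y = Lugh::Ops::silu($ctx, $x);
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($y);
$graph->compute($ctx, 1);
my @from_op      = $y->get_f32();                         # (-0.269, 0.0, 0.731)
my @from_formula = map { $_ / (1 + exp(-$_)) } @values;   # same values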
ADDITIONAL OPERATIONS
The XS code exposes these operations. Additional ggml operations can be added as needed:
Unary Operations
neg - Negate: -x
abs - Absolute value: |x|
sqr - Square: x²
sqrt - Square root: √x
exp - Exponential: e^x
log - Natural logarithm: ln(x)
sin, cos - Trigonometric functions
relu - ReLU: max(0, x)
gelu - GELU activation
tanh - Hyperbolic tangent
Binary Operations
sub - Subtraction: a - b
div - Division: a / b
scale - Scale: a × scalar
Reduction Operations
sum - Sum of elements
mean - Mean of elements
max - Maximum element
min - Minimum element
Shape Operations
reshape - Change tensor shape
permute - Transpose dimensions
transpose - Swap two dimensions
view - Create a view with different strides
cont - Make tensor contiguous in memory
Attention Operations
diag_mask_inf - Apply causal mask (upper triangle → -inf)
rope_ext - Apply rotary position embeddings
flash_attn_ext - Flash attention (optimized)
BROADCASTING
Operations support NumPy-style broadcasting:
# Scalar × Vector
my $scalar = Lugh::Tensor->new_f32($ctx, 1);
my $vector = Lugh::Tensor->new_f32($ctx, 100);
my $result = Lugh::Ops::mul($ctx, $scalar, $vector); # [100]
# Row × Matrix
my $row = Lugh::Tensor->new_f32($ctx, 100, 1); # [100, 1]
my $matrix = Lugh::Tensor->new_f32($ctx, 100, 50); # [100, 50]
my $result = Lugh::Ops::mul($ctx, $row, $matrix); # [100, 50]
Broadcasting rules:
1. Dimensions are compared from right to left
2. Dimensions match if equal or one of them is 1
3. Missing dimensions are treated as 1
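Applying these rules to the row-times-matrix example above:
#   $row    : [100,  1]
#   $matrix : [100, 50]
#
# Right to left: 1 vs 50 (one side is 1, so it broadcasts to 50),
# then 100 vs 100 (equal). The result shape is [100, 50].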
OPERATION FUSION
ggml automatically fuses compatible operations when building the computation graph, reducing memory traffic and improving performance:
# These may be fused into a single kernel:
my $x = Lugh::Ops::mul($ctx, $a, $b);
my $y = Lugh::Ops::add($ctx, $x, $c);
GPU ACCELERATION
On supported platforms, operations automatically use:
Metal - Apple GPU acceleration on macOS
CUDA - NVIDIA GPU acceleration
Vulkan - Cross-platform GPU
BLAS - Accelerate/OpenBLAS for matrix operations
No code changes needed - ggml selects the best backend.
THREAD SAFETY
Operation functions are thread-safe - they only create graph nodes. The actual computation happens in $graph->compute() which handles parallelization internally.
SEE ALSO
Lugh, Lugh::Context, Lugh::Tensor, Lugh::Graph
https://github.com/ggerganov/ggml - ggml library
AUTHOR
lnation <email@lnation.org>
LICENSE
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.