NAME
Lugh::Tensor - N-Dimensional Tensor with ggml Backend
VERSION
Version 0.09
SYNOPSIS
use Lugh;
# Create a context
my $ctx = Lugh::Context->new(mem_size => 1024 * 1024);
# Create tensors
my $vector = Lugh::Tensor->new_f32($ctx, 100); # 1D
my $matrix = Lugh::Tensor->new_f32($ctx, 100, 200); # 2D
my $tensor3d = Lugh::Tensor->new_f32($ctx, 10, 20, 30); # 3D
# Set values
$vector->set_f32(1.0, 2.0, 3.0, ...); # Must provide all elements
# Get values
my @values = $vector->get_f32();
# Get tensor properties
my $n = $tensor->nelements(); # Total element count
my $dims = $tensor->n_dims(); # Number of dimensions
my @shape = $tensor->shape(); # Size of each dimension
DESCRIPTION
Lugh::Tensor represents an N-dimensional array of numbers, implemented using ggml's tensor system. Tensors are the fundamental building blocks for neural network computations.
Tensor Properties
Data type - F32 (32-bit float), or quantized types for model weights
Dimensions - 1D to 4D arrays
Shape - Size of each dimension
Strides - Memory layout for traversal
Memory Layout
Tensors use a contiguous layout in which the first dimension you pass (ne0) is the row length and changes fastest in memory:
2D tensor created as new_f32($ctx, 4, 3) - 3 rows of 4 columns:
Memory: [a00, a01, a02, a03, a10, a11, a12, a13, a20, a21, a22, a23]
Logical:
a00 a01 a02 a03
a10 a11 a12 a13
a20 a21 a22 a23
This matches C's row-major storage of a 3×4 array, but note that ggml lists dimensions fastest-first, so this matrix is described by the dimension list (4, 3).
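The flat memory offset of an element follows directly from this layout. A minimal plain-Perl sketch (flat_index is an illustrative helper, not part of the Lugh API):

```perl
use strict;
use warnings;

# Flat offset of element (i0, i1) in a 2D tensor whose first
# dimension ne0 (the row length) is contiguous in memory.
sub flat_index {
    my ($i0, $i1, $ne0) = @_;
    return $i0 + $i1 * $ne0;
}

# For the 3x4 matrix above (ne0 = 4), element a12 is at row 1, column 2:
print flat_index(2, 1, 4), "\n";    # -> 6, matching its position above
```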
CONSTRUCTOR
new_f32
my $tensor = Lugh::Tensor->new_f32($context, @dimensions);
Creates a new tensor with F32 (32-bit float) data type.
Parameters:
$context - A Lugh::Context object
@dimensions - 1 to 4 dimension sizes
Returns: A Lugh::Tensor object.
Throws: Dies if allocation fails or dimensions are invalid.
Examples:
# 1D vector with 100 elements
my $v = Lugh::Tensor->new_f32($ctx, 100);
# 2D matrix with 100 rows, 200 columns
my $m = Lugh::Tensor->new_f32($ctx, 100, 200);
# 3D tensor
my $t = Lugh::Tensor->new_f32($ctx, 10, 20, 30);
# 4D tensor (max dimensions)
my $t4 = Lugh::Tensor->new_f32($ctx, 2, 3, 4, 5);
METHODS
set_f32
$tensor->set_f32(@values);
Sets all tensor elements from a list of values.
Parameters:
@values - Exactly nelements() float values
Throws: Dies if wrong number of values provided.
Example:
my $t = Lugh::Tensor->new_f32($ctx, 3);
$t->set_f32(1.0, 2.0, 3.0);
get_f32
my @values = $tensor->get_f32();
Returns all tensor elements as a list.
Returns: A list of nelements() float values.
Example:
use List::Util qw(sum);
my @data = $tensor->get_f32();
print "First element: $data[0]\n";
print "Sum: ", sum(@data), "\n";
nelements
my $n = $tensor->nelements();
Returns the total number of elements in the tensor.
Example:
my $t = Lugh::Tensor->new_f32($ctx, 10, 20, 30);
print $t->nelements(); # 6000
n_dims
my $dims = $tensor->n_dims();
Returns the number of dimensions (1-4).
Example:
my $t = Lugh::Tensor->new_f32($ctx, 10, 20);
print $t->n_dims(); # 2
shape
my @shape = $tensor->shape();
Returns the size of each dimension.
Example:
my $t = Lugh::Tensor->new_f32($ctx, 10, 20, 30);
my @shape = $t->shape(); # (10, 20, 30)
type
my $type_id = $tensor->type();
Returns the numeric type ID of the tensor (e.g., 0 for F32, 12 for Q4_K).
Example:
my $t = Lugh::Tensor->new_f32($ctx, 100);
print $t->type(); # 0 (F32)
type_name
my $name = $tensor->type_name();
Returns the string name of the tensor's type.
Example:
my $t = Lugh::Tensor->new_f32($ctx, 100);
print $t->type_name(); # "f32"
# From a quantized model tensor
print $weight_tensor->type_name(); # "q4_K"
type_size
my $bytes = $tensor->type_size();
Returns the size in bytes of one block of this type.
blck_size
my $elements = $tensor->blck_size();
Returns the number of elements per block. For quantized types this is typically 32 or 256.
is_quantized
my $bool = $tensor->is_quantized();
Returns true if the tensor uses a quantized data type.
Example:
if ($tensor->is_quantized()) {
print "Tensor uses ", $tensor->type_name(), " quantization\n";
}
nbytes
my $bytes = $tensor->nbytes();
Returns the total number of bytes used by the tensor's data.
Example:
my $t = Lugh::Tensor->new_f32($ctx, 1000);
print $t->nbytes(); # 4000 (1000 × 4 bytes)
quantize
my $quantized = $tensor->quantize($ctx, $dest_type);
Quantizes an F32 tensor to the specified quantized type. Returns a new tensor.
Parameters:
$ctx - A Lugh::Context with enough memory for the result
$dest_type - Target quantization type (from Lugh::Quant)
Returns: A new Lugh::Tensor with the quantized data.
Throws: Dies if source is not F32 or destination is not a quantized type.
Example:
use Lugh::Quant qw(Q4_K);
my $f32 = Lugh::Tensor->new_f32($ctx, 256);
$f32->set_f32(@weights);
my $q4 = $f32->quantize($ctx, Q4_K);
printf "Compressed: %d -> %d bytes\n", $f32->nbytes, $q4->nbytes;
dequantize
my $f32 = $tensor->dequantize($ctx);
Dequantizes a quantized (or F16/BF16) tensor back to F32. Returns a new tensor.
Parameters:
$ctx - A Lugh::Context with enough memory for the result
Returns: A new F32 Lugh::Tensor.
Throws: Dies if tensor is already F32.
Example:
# Round-trip: F32 -> Q4_K -> F32
my $original = Lugh::Tensor->new_f32($ctx, 256);
$original->set_f32(@data);
my $quantized = $original->quantize($ctx, Lugh::Quant::Q4_K);
my $restored = $quantized->dequantize($ctx);
# Compare original vs restored to measure quantization loss
my @orig = $original->get_f32();
my @rest = $restored->get_f32();
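One way to quantify the round-trip loss is a root-mean-square error over the two lists. This is plain Perl with no Lugh calls; the sample values below are made up for illustration:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Root-mean-square error between two equal-length lists of floats.
sub rmse {
    my ($a, $b) = @_;
    die "length mismatch" unless @$a == @$b;
    my $sq = sum map { ($a->[$_] - $b->[$_]) ** 2 } 0 .. $#$a;
    return sqrt($sq / @$a);
}

# Illustrative values standing in for @orig and @rest above
my @orig = (0.10, 0.22, -0.31, 0.47);
my @rest = (0.09, 0.25, -0.30, 0.50);
printf "Quantization RMSE: %.4f\n", rmse(\@orig, \@rest);    # prints 0.0224
```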
TENSOR OPERATIONS
Tensors can be used with Lugh::Ops to build computation graphs:
my $a = Lugh::Tensor->new_f32($ctx, 100);
my $b = Lugh::Tensor->new_f32($ctx, 100);
$a->set_f32(@a_data);
$b->set_f32(@b_data);
# Create operation result tensors
my $c = Lugh::Ops::add($ctx, $a, $b); # Element-wise add
my $d = Lugh::Ops::mul($ctx, $a, $b); # Element-wise multiply
my $e = Lugh::Ops::soft_max($ctx, $a); # Softmax
# Build and compute graph
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($c);
$graph->compute($ctx, 4);
# Get results
my @result = $c->get_f32();
DATA TYPES
ggml supports many tensor data types:
Float Types
GGML_TYPE_F32 (0) - 32-bit float (4 bytes per element)
GGML_TYPE_F16 (1) - 16-bit float (2 bytes per element)
GGML_TYPE_BF16 (30) - Brain float16 (2 bytes per element)
Quantized Types
Used for model weights to reduce memory:
Q4_0, Q4_1, Q4_K - 4-bit quantization (~0.5 bytes/element)
Q5_0, Q5_1, Q5_K - 5-bit quantization (~0.63 bytes/element)
Q8_0, Q8_1, Q8_K - 8-bit quantization (~1 byte/element)
Q2_K, Q3_K, Q6_K - 2-, 3-, and 6-bit K-quantization
(Names like Q4_K_S and Q4_K_M are llama.cpp model-file presets that mix these per-tensor types; they are not themselves ggml tensor types.)
Quantized tensors from model files can be used directly in operations - ggml handles dequantization automatically during computation.
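The effective storage cost of a type is type_size() divided by blck_size(). A quick sketch of that arithmetic; the 144-byte / 256-element figures are Q4_K's block layout in current ggml and are assumptions that may change between versions:

```perl
use strict;
use warnings;

# Effective storage cost of a type: bytes per block divided by
# elements per block (cf. the type_size() and blck_size() methods).
sub bytes_per_element {
    my ($type_size, $blck_size) = @_;
    return $type_size / $blck_size;
}

# Q4_K: assumed 144-byte blocks of 256 elements -> 0.5625 bytes (4.5 bits)
printf "Q4_K: %.4f bytes/element (%.1f bits)\n",
    bytes_per_element(144, 256), 8 * bytes_per_element(144, 256);

# F32 for comparison: one 4-byte element per "block"
printf "F32:  %.4f bytes/element\n", bytes_per_element(4, 1);
```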
BROADCASTING
Many operations support broadcasting (NumPy-style):
# Scalar broadcast: [1] op [n] -> [n]
my $scalar = Lugh::Tensor->new_f32($ctx, 1);
my $vector = Lugh::Tensor->new_f32($ctx, 100);
my $result = Lugh::Ops::mul($ctx, $scalar, $vector);
# Row broadcast: [1, n] op [m, n] -> [m, n]
# Column broadcast: [m, 1] op [m, n] -> [m, n]
The broadcasting rules follow standard tensor semantics.
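Putting the pieces together, a scalar broadcast computed end-to-end might look like the following. This is a sketch using only the Lugh calls shown elsewhere in this document; it assumes Lugh::Ops::mul in this operand order broadcasts as the rules above describe:

```perl
use Lugh;

# Scalar broadcast: multiply every element of a vector by one scalar.
my $ctx    = Lugh::Context->new(mem_size => 1024 * 1024);
my $scalar = Lugh::Tensor->new_f32($ctx, 1);
my $vector = Lugh::Tensor->new_f32($ctx, 4);
$scalar->set_f32(2.0);
$vector->set_f32(1.0, 2.0, 3.0, 4.0);

my $result = Lugh::Ops::mul($ctx, $scalar, $vector);

# Operations are lazy: build and run the graph to fill $result
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($result);
$graph->compute($ctx, 1);

my @out = $result->get_f32();    # element-wise: (2, 4, 6, 8)
```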
MATRIX MULTIPLICATION
Matrix multiplication follows the pattern:
A [k, n] × B [k, m] → C [n, m]
Note: ggml contracts over the first (fastest) dimension k of both operands - each length-k row of A is dotted with each length-k row of B - which reads as column-major relative to the usual mathematical convention.
Example:
my $a = Lugh::Tensor->new_f32($ctx, 4, 3); # 3×4 matrix
my $b = Lugh::Tensor->new_f32($ctx, 4, 2); # 2×4 matrix
my $c = Lugh::Ops::mul_mat($ctx, $a, $b); # 3×2 result
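To make the convention concrete, here is a plain-Perl reference of the contraction (an illustrative model of the semantics, not Lugh code): C[m][n] is the dot product of A's row n with B's row m, where rows have length k.

```perl
use strict;
use warnings;

# Reference semantics of A [k,n] x B [k,m] -> C [n,m]: both operands
# store length-$k rows contiguously; each row of B is dotted with
# each row of A. $a and $b are flat arrays of n*k and m*k floats.
sub mul_mat_ref {
    my ($a, $b, $k) = @_;
    my $n = @$a / $k;
    my $m = @$b / $k;
    my @c;
    for my $mi (0 .. $m - 1) {
        for my $ni (0 .. $n - 1) {
            my $dot = 0;
            $dot += $a->[ $ni * $k + $_ ] * $b->[ $mi * $k + $_ ]
                for 0 .. $k - 1;
            push @c, $dot;    # C stored with ne0 = n fastest
        }
    }
    return \@c;
}

# A [2,2] = rows (1,2),(3,4); B [2,1] = row (5,6)
my $c = mul_mat_ref([1, 2, 3, 4], [5, 6], 2);
print "@$c\n";    # -> 17 39
```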
COMMON TENSOR SHAPES
In transformer models:
Token embeddings - [n_embd, n_vocab]
Hidden state - [n_embd, n_tokens]
Attention Q/K/V - [head_dim, n_heads, n_tokens]
FFN weights - [n_embd, ffn_dim] or [ffn_dim, n_embd]
Logits - [n_vocab, n_tokens]
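The memory cost of these tensors follows directly from shape × bytes per element. A quick sketch; n_embd = 4096 and n_vocab = 32000 are illustrative assumptions, not values read from any particular model:

```perl
use strict;
use warnings;

# Bytes needed for a tensor of the given shape at a given per-element
# cost (4 bytes for F32, ~0.5625 for Q4_K per the block math above).
sub tensor_bytes {
    my ($bytes_per_elem, @shape) = @_;
    my $n = 1;
    $n *= $_ for @shape;
    return $n * $bytes_per_elem;
}

my ($n_embd, $n_vocab) = (4096, 32000);
printf "F32 embeddings:  %.1f MiB\n",
    tensor_bytes(4, $n_embd, $n_vocab) / 2**20;          # 500.0 MiB
printf "Q4_K embeddings: %.1f MiB\n",
    tensor_bytes(144 / 256, $n_embd, $n_vocab) / 2**20;  # 70.3 MiB
```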
VIEWS AND RESHAPING
Tensors can be reshaped without copying data:
# Operations like reshape, permute, transpose
# create views of the same memory
my $flat = Lugh::Tensor->new_f32($ctx, 120);
# Internally, ggml can view this as [2,3,4,5] without copying
Note: View operations are internal to ggml. The Perl API currently focuses on creating new tensors and computing results.
THREAD SAFETY
Tensor objects themselves are not thread-safe. However, ggml's graph computation can use multiple CPU threads for parallel operations:
$graph->compute($ctx, $n_threads);
This uses ggml's internal thread pool (pthreads on POSIX systems), parallelizing matrix operations across the specified number of threads.
MEMORY
Tensors are allocated from their context's memory arena:
Metadata: ~256 bytes per tensor
Data: type-specific (4 bytes per element for F32)
Memory is freed when the context is destroyed, not when individual tensor objects go out of scope.
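A rough sizing helper for choosing mem_size when creating a context can be sketched as follows. The 256-byte metadata figure is the approximation quoted above, and the 50% headroom is an arbitrary safety margin, not a Lugh requirement:

```perl
use strict;
use warnings;

use constant TENSOR_OVERHEAD => 256;    # approximate metadata per tensor

# Estimate the arena size needed for a set of F32 tensor shapes
# (each shape is an array ref of dimension sizes).
sub estimate_mem_size {
    my (@shapes) = @_;
    my $total = 0;
    for my $shape (@shapes) {
        my $n = 1;
        $n *= $_ for @$shape;
        $total += $n * 4 + TENSOR_OVERHEAD;    # F32 data + metadata
    }
    return int($total * 1.5);    # 50% headroom for graph scratch space
}

my $mem = estimate_mem_size([100], [100, 200]);
print "$mem\n";    # -> 121368
# my $ctx = Lugh::Context->new(mem_size => $mem);
```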
SEE ALSO
Lugh, Lugh::Context, Lugh::Ops, Lugh::Graph
https://github.com/ggerganov/ggml - ggml tensor library
AUTHOR
lnation <email@lnation.org>
LICENSE
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.