NAME

Lugh::Graph - Computation Graph for Tensor Operations

VERSION

Version 0.01

SYNOPSIS

use Lugh;

# Create context and tensors
my $ctx = Lugh::Context->new(mem_size => 10 * 1024 * 1024);

my $a = Lugh::Tensor->new_f32($ctx, 1000);
my $b = Lugh::Tensor->new_f32($ctx, 1000);

my @a_data = map { rand() } 1 .. 1000;
my @b_data = map { rand() } 1 .. 1000;
$a->set_f32(@a_data);
$b->set_f32(@b_data);

# Build computation
my $c = Lugh::Ops::add($ctx, $a, $b);
my $d = Lugh::Ops::mul($ctx, $c, $c);
my $e = Lugh::Ops::soft_max($ctx, $d);

# Create graph and add operations
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($e);

# Execute computation
$graph->compute($ctx, 4);  # Use 4 threads

# Read results
my @result = $e->get_f32();

DESCRIPTION

Lugh::Graph represents a computation graph - a directed acyclic graph (DAG) of tensor operations. The graph enables:

  • Lazy evaluation - Operations are not computed until the graph is run (see the sketch after this list)

  • Optimization - ggml can fuse and optimize operations

  • Parallelization - Multiple threads for matrix operations

  • Memory planning - Efficient allocation of intermediate tensors
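
A minimal sketch of what "lazy" means in practice, reusing the tensors from the SYNOPSIS: the Lugh::Ops call only records the operation, and the actual work happens inside compute(). Values read from a result tensor before compute() are undefined.

# add() only records the operation; $c holds no meaningful values yet
my $c = Lugh::Ops::add($ctx, $a, $b);

# the addition actually runs here
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($c);
$graph->compute($ctx, 1);

my @sum = $c->get_f32();  # now populated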

Graph Structure

A computation graph consists of nodes (tensors) and edges (dependencies):

Input A    Input B
   │          │
   └────┬─────┘
        │
     Add(A,B) = C
        │
        ├─────────┐
        │         │
     Mul(C,C) = D │
        │         │
        └────┬────┘
             │
        SoftMax(D) = E
             │
          Output

The graph tracks dependencies so that operations execute in the correct order.

Build Phase vs Compute Phase

1. Build Phase - Create tensors and operations, recording the graph
2. Compute Phase - Execute all operations in dependency order

This separation allows the same graph to be executed multiple times with different input values (see "Reusing a Graph" below).

CONSTRUCTOR

new

my $graph = Lugh::Graph->new($context);

Creates a new empty computation graph.

Parameters:

  • $context - A Lugh::Context object for graph metadata

Returns: A Lugh::Graph object.

Example:

my $ctx = Lugh::Context->new(mem_size => 1024 * 1024);
my $graph = Lugh::Graph->new($ctx);

METHODS

build_forward

$graph->build_forward($output_tensor);

Adds an output tensor and all its dependencies to the graph.

Parameters:

  • $output_tensor - The tensor to compute (a Lugh::Tensor)

Details:

This method traverses backwards from the output tensor, adding all required operations to the graph. Multiple outputs can be added by calling build_forward multiple times.

Example:

my $loss = Lugh::Ops::...;
my $accuracy = Lugh::Ops::...;

my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($loss);
$graph->build_forward($accuracy);

compute

$graph->compute($context, $n_threads);

Executes all operations in the graph.

Parameters:

  • $context - The context for computation

  • $n_threads - Number of CPU threads to use

Thread Usage:

  • 1 thread - Sequential execution, lowest overhead

  • N threads - Parallel matrix operations (recommended: CPU cores)

  • Too many threads - Diminishing returns, overhead increases

Example:

# Single-threaded
$graph->compute($ctx, 1);

# Use all CPU cores (example for 8-core machine)
$graph->compute($ctx, 8);

# Common recommendation
use Sys::Info;
my $info = Sys::Info->new;
my $cpu = $info->device('CPU');
$graph->compute($ctx, $cpu->count);
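
If Sys::Info is not installed, a rough fallback is to count logical CPUs from /proc/cpuinfo on Linux. This snippet is purely illustrative and not part of Lugh:

my $n_threads = 4;                          # sensible default
if (open my $fh, '<', '/proc/cpuinfo') {    # Linux only
    my $count = grep { /^processor\s*:/ } <$fh>;
    close $fh;
    $n_threads = $count if $count;
}
$graph->compute($ctx, $n_threads);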

GRAPH OPERATIONS

Multiple Outputs

my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($output1);
$graph->build_forward($output2);
$graph->compute($ctx, 4);

# Both outputs are now computed
my @result1 = $output1->get_f32();
my @result2 = $output2->get_f32();

Reusing a Graph

# Build once
my $graph = Lugh::Graph->new($ctx);
$graph->build_forward($output);

# Run multiple times with different inputs
for my $input_data (@all_inputs) {
    $input->set_f32(@$input_data);
    $graph->compute($ctx, 4);
    my @result = $output->get_f32();
    push @all_results, \@result;
}

EXECUTION MODEL

Forward Execution

Operations are executed in topological order (dependencies first); a concrete trace of the example graph follows the list:

1. Input tensors (already have data)
2. First layer of operations
3. Second layer of operations
4. ... and so on to outputs
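
Applied to the example under "Graph Structure", the order is (this is a trace of what ggml does internally, not a separate API):

Leaves:  A, B               (inputs, data already set)
Step 1:  C = Add(A, B)      (depends only on leaves)
Step 2:  D = Mul(C, C)      (depends on C)
Step 3:  E = SoftMax(D)     (depends on D)

A node never runs before every tensor it reads from has been computed.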

Memory Allocation

ggml allocates memory for intermediate tensors during computation. The context must have enough memory for the following (a rough sizing sketch follows the list):

  • Input tensors

  • Output tensors

  • All intermediate tensors

  • Graph metadata
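
A back-of-the-envelope estimate for the SYNOPSIS graph, assuming 4 bytes per f32 element plus an assumed per-tensor allowance for ggml's tensor headers and alignment (the exact overhead is an implementation detail):

my $n_tensors  = 5;        # a, b, c, d, e
my $n_elements = 1000;
my $bytes_f32  = 4;
my $overhead   = 1024;     # assumed per-tensor slack for headers/alignment

my $mem_size = $n_tensors * ($n_elements * $bytes_f32 + $overhead)
             + 1024 * 1024;   # extra headroom for graph metadata

my $ctx = Lugh::Context->new(mem_size => $mem_size);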

Thread Pool

When using multiple threads, ggml creates a thread pool:

Main Thread
     │
┌────┴────┬─────────┬─────────┐
│         │         │         │
Worker 0  Worker 1  Worker 2  Worker 3
│         │         │         │
└────┬────┴─────────┴─────────┘
     │
Barrier Sync
     │
Next Operation

Matrix multiplications and other large operations are parallelized across workers.
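
One practical way to pick a thread count is to time the same graph at a few settings and keep the fastest. This sketch assumes the graph is large enough for the timings to be meaningful:

use Time::HiRes qw(time);

for my $n (1, 2, 4, 8) {
    my $start = time();
    $graph->compute($ctx, $n);
    printf "%d thread(s): %.4fs\n", $n, time() - $start;
}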

PERFORMANCE TIPS

Batch Operations

Instead of many small graph executions, batch inputs:

# Slower: many small graph executions
for my $input (@inputs) {
    $input_tensor->set_f32(@$input);
    $graph->compute($ctx, 4);
}

# Faster: One large computation
# (if using batched tensors)
$batched_graph->compute($ctx, 4);

Memory Reuse

The same context can be reused for multiple graph executions, avoiding repeated memory allocation.

Graph Caching

For inference, build the graph once and reuse:

# Build once at startup
my $inference_graph = build_inference_graph($model);

# Reuse for each query
sub infer {
    my ($tokens) = @_;
    $input_tensor->set_data(@$tokens);
    $inference_graph->compute($ctx, 4);
    return $output_tensor->get_f32();
}

BACKEND SELECTION

ggml automatically selects the best compute backend:

  • CPU - Always available, uses SIMD (SSE/AVX/NEON)

  • Metal - Apple Silicon and AMD GPUs on macOS

  • CUDA - NVIDIA GPUs

  • Vulkan - Cross-platform GPU

  • BLAS - Accelerate (macOS) or OpenBLAS for matrix ops

Which backends are compiled in is determined when ggml is built; the backend used for a given graph is then selected at runtime.

DEBUGGING

Graph Size

# After building
my $n_nodes = ...;  # (Not yet exposed, could add)
print "Graph has $n_nodes operations\n";

Operation Timing

For performance analysis, you can time the compute call:

use Time::HiRes qw(time);

my $start = time();
$graph->compute($ctx, 4);
my $elapsed = time() - $start;

print "Compute took ${elapsed}s\n";

ERROR HANDLING

Common Errors

  • Shape mismatch - Operations require compatible tensor shapes

  • Out of memory - Context too small for tensors

  • Null tensor - Operation returned NULL (allocation failure)

Error Recovery

Graph operations die on error. Use eval for error handling:

eval {
    $graph->compute($ctx, 4);
};
if ($@) {
    warn "Computation failed: $@";
    # Handle error...
}

THREAD SAFETY

Graph objects are NOT thread-safe; each Perl thread should create its own contexts and graphs. However, the internal threading that compute() uses is managed by ggml and is safe.
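
A minimal sketch of the per-thread pattern, assuming a Perl built with ithreads and that all Lugh objects are created inside each thread rather than shared or passed between threads (whether this works in practice also depends on Lugh's XS layer):

use threads;

my @threads = map {
    threads->create(sub {
        # each thread owns its own context, tensors, and graph
        my $ctx   = Lugh::Context->new(mem_size => 10 * 1024 * 1024);
        my $t     = Lugh::Tensor->new_f32($ctx, 1000);
        $t->set_f32((0.001) x 1000);
        my $out   = Lugh::Ops::soft_max($ctx, $t);
        my $graph = Lugh::Graph->new($ctx);
        $graph->build_forward($out);
        $graph->compute($ctx, 1);
        my @result = $out->get_f32();
        return scalar @result;    # return something simple to the parent
    });
} 1 .. 2;

$_->join for @threads;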

IMPLEMENTATION NOTES

Internally, Lugh::Graph wraps struct ggml_cgraph*:

struct ggml_cgraph {
    int n_nodes;
    int n_leafs;
    struct ggml_tensor ** nodes;  // Operations
    struct ggml_tensor ** grads;  // Gradients (for training)
    struct ggml_tensor ** leafs;  // Inputs
    ...
};

The graph is computed using ggml_graph_compute_with_ctx().

SEE ALSO

Lugh, Lugh::Context, Lugh::Tensor, Lugh::Ops

https://github.com/ggerganov/ggml - ggml library

AUTHOR

lnation <email@lnation.org>

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
