NAME
Lugh::LoRA - Low-Rank Adaptation (LoRA) adapter support for Lugh
SYNOPSIS
use Lugh;
# Load base model first
my $model = Lugh::Model->new(file => 'base-model.gguf');
# Load a LoRA adapter (GGUF format)
my $lora = Lugh::LoRA->new(
    adapter => 'adapter.gguf',
    model   => $model,          # Required: validates architecture match
);
# Load a LoRA adapter (SafeTensors format)
my $lora = Lugh::LoRA->new(
    adapter => 'adapter.safetensors',
    model   => $model,
);
# Create a trainable LoRA adapter (for fine-tuning)
my $trainable_lora = Lugh::LoRA->create(
    model   => $model,
    rank    => 16,                   # LoRA rank (default: 16)
    alpha   => 32.0,                 # Scaling factor (default: 32.0)
    targets => [qw(attn_q attn_v)],  # Which layers to adapt
);
# Check adapter properties
say "Alpha: ", $lora->alpha;
say "Scale: ", $lora->scale;
say "Weights: ", $lora->n_weights;
say "Format: ", $lora->format;
say "Trainable: ", $lora->trainable ? "yes" : "no";
# Adjust the LoRA scaling factor
$lora->scale(0.5); # Half strength
# Get weight names
my @names = $lora->weight_names;
# Access trainable weight tensors for gradient-based training
my $tensor_a = $trainable_lora->get_weight_tensor('blk.0.attn_q.weight', 'a');
my $tensor_b = $trainable_lora->get_weight_tensor('blk.0.attn_q.weight', 'b');
# Save trained adapter to GGUF format
$trainable_lora->save('my-finetuned-adapter.gguf');
# Use with inference
my $inference = Lugh::Inference->new(model => $model);
my @logits = $inference->forward(tokens => \@tokens, lora => $lora);
DESCRIPTION
Lugh::LoRA provides support for loading, creating, training, and saving Low-Rank Adaptation (LoRA) adapters for base models. LoRA is an efficient fine-tuning technique that adds small rank-decomposition weight matrices to frozen pre-trained models.
The modified output is computed as:
output = original_output + (alpha / rank) * scale * B @ A @ x
Where:
A and B are the low-rank LoRA matrices
alpha is the scaling factor from the adapter metadata
rank is the inner dimension of the decomposition
scale is a user-adjustable multiplier (default 1.0)
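With the documented defaults of alpha 32.0 and rank 16, the LoRA update is applied at twice its raw magnitude. A quick check of the arithmetic:

    # Effective multiplier on B @ A @ x, using the documented defaults
    my $alpha = 32.0;   # from adapter metadata
    my $rank  = 16;     # inner dimension of the decomposition
    my $scale = 1.0;    # user-adjustable multiplier

    my $effective = ($alpha / $rank) * $scale;
    say "LoRA update scaled by $effective";   # 2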
TRAINABLE LORA
Lugh supports creating trainable LoRA adapters from scratch for fine-tuning:
my $lora = Lugh::LoRA->create(
    model   => $model,
    rank    => 16,
    alpha   => 32.0,
    targets => [qw(attn_q attn_v)],
);
Trainable LoRA adapters:
Have requires_grad enabled on all weight tensors
Include pre-allocated gradient tensors for backpropagation
Use standard LoRA initialization (A: Kaiming, B: zeros), so a fresh adapter starts as a no-op (see the sketch below)
Can be saved to GGUF format after training
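Because B starts at zero, the update B @ A @ x vanishes and a freshly created adapter should leave the base model's output unchanged. A minimal sanity check, assuming the $model and @tokens from the SYNOPSIS and comparing only the first logit for brevity:

    my $inference = Lugh::Inference->new(model => $model);

    my @base = $inference->forward(tokens => \@tokens);
    my @with = $inference->forward(tokens => \@tokens, lora => $trainable_lora);

    # B = 0 means the adapter contributes nothing before training
    say abs($base[0] - $with[0]) < 1e-5
        ? "fresh adapter is a no-op"
        : "unexpected difference";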
Target Layers
The targets parameter specifies which layers to add LoRA adapters to. Supported targets include:
attn_q - Query projection (default)
attn_k - Key projection
attn_v - Value projection (default)
attn_output - Output projection
ffn_up - FFN up projection
ffn_down - FFN down projection
ffn_gate - FFN gate projection (SwiGLU models)
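To adapt more of the network than the attn_q/attn_v defaults, list additional targets; each extra target adds one LoRA pair per layer, so coverage trades off against adapter size:

    # Adapt all attention projections plus the FFN (including the SwiGLU gate)
    my $wide_lora = Lugh::LoRA->create(
        model   => $model,
        rank    => 8,
        targets => [qw(attn_q attn_k attn_v attn_output ffn_up ffn_down ffn_gate)],
    );
    say "Adapting ", $wide_lora->n_weights, " weight pairs";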
SUPPORTED FORMATS
GGUF Format
The native format for Lugh. GGUF LoRA files contain:
Metadata indicating this is a LoRA adapter (general.type = "adapter")
The LoRA alpha value (adapter.lora.alpha)
Architecture information for validation
Paired LoRA tensors (*.lora_a, *.lora_b)
GGUF adapters provide architecture validation - the adapter's architecture must match the base model to ensure compatibility.
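Since the check happens at load time, a mismatch surfaces when constructing the object. A sketch, assuming new() dies on an architecture mismatch (the exact error text is not specified here):

    my $lora = eval {
        Lugh::LoRA->new(adapter => 'adapter.gguf', model => $model);
    };
    warn "adapter rejected: $@" unless $lora;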
SafeTensors Format
The HuggingFace format for storing tensor data. SafeTensors files contain:
A JSON header with tensor metadata (names, shapes, offsets)
Raw tensor data in little-endian format
SafeTensors LoRA files typically use HuggingFace naming conventions which are automatically translated to the internal format:
base_model.model.layers.0.self_attn.q_proj.lora_A.weight
-> blk.0.attn_q.weight
Note: SafeTensors format does not include alpha metadata, so you may need to set it manually via the alpha accessor or by checking the adapter's original config.json.
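One way to recover alpha is to read it from the adapter's original HuggingFace config; the adapter_config.json filename and lora_alpha key follow PEFT conventions and are assumptions here, not part of the Lugh API:

    use JSON::PP qw(decode_json);

    my $lora = Lugh::LoRA->new(
        adapter => 'adapter.safetensors',
        model   => $model,
    );

    # PEFT adapters typically ship adapter_config.json alongside the weights
    open my $fh, '<', 'adapter_config.json' or die "cannot read config: $!";
    my $config = decode_json(do { local $/; <$fh> });
    close $fh;

    $lora->alpha($config->{lora_alpha}) if defined $config->{lora_alpha};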
METHODS
new
my $lora = Lugh::LoRA->new(
    adapter => $path,    # Required: path to LoRA file
    model   => $model,   # Required: Lugh::Model for architecture validation
    scale   => $scale,   # Optional: scaling factor (default 1.0)
);
Creates a new LoRA adapter from a GGUF or SafeTensors file. The file format is automatically detected from the file extension.
The model parameter is required because LoRA adapters must be validated against the base model's architecture to ensure compatibility. The adapter's layer names and tensor shapes must match the base model.
Note: The file parameter is also accepted as an alias for adapter.
alpha
my $alpha = $lora->alpha;
$lora->alpha(32.0);
Get or set the alpha scaling factor. This is typically set from the adapter metadata but can be overridden for SafeTensors files or experimentation.
scale
my $scale = $lora->scale;
$lora->scale(0.5);
Get or set the user scale multiplier. The effective scaling is alpha * scale / rank. Use this to adjust LoRA influence without changing alpha.
Common values:
1.0 - Full LoRA effect (default)
0.5 - Half LoRA effect
0.0 - Disable LoRA (base model only)
2.0 - Double LoRA effect (may cause instability)
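Since scale can be changed between calls without reloading the adapter, comparing several strengths on the same input is cheap:

    my $inference = Lugh::Inference->new(model => $model);

    for my $s (0.0, 0.5, 1.0, 2.0) {
        $lora->scale($s);
        my @logits = $inference->forward(tokens => \@tokens, lora => $lora);
        say "scale $s -> first logit $logits[0]";
    }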
n_weights
my $count = $lora->n_weights;
Returns the number of LoRA weight pairs in the adapter. Each weight pair consists of an A matrix and a B matrix.
format
my $fmt = $lora->format;
Returns the source format of the adapter: "gguf", "safetensors", or "trainable".
weight_names
my @names = $lora->weight_names;
Returns the list of tensor names that have LoRA adaptations. Names are in the internal format (e.g., "blk.0.attn_q.weight").
trainable
my $is_trainable = $lora->trainable;
Returns true if this is a trainable LoRA adapter created with create(), false if it was loaded from a file with new().
create
my $lora = Lugh::LoRA->create(
    model   => $model,     # Required: Lugh::Model
    rank    => $rank,      # Optional: LoRA rank (default: 16)
    alpha   => $alpha,     # Optional: scaling factor (default: 32.0)
    scale   => $scale,     # Optional: user scale (default: 1.0)
    targets => \@targets,  # Optional: layers to adapt
    context => $ctx,       # Optional: Lugh::Context for tensors
);
Creates a new trainable LoRA adapter. Unlike new(), this creates fresh weight matrices initialized for training rather than loading from a file.
The rank parameter controls the size of the low-rank decomposition. Common values are 4, 8, 16, 32, or 64. Lower ranks use less memory but have less expressiveness. Valid range: 1-256.
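Adapter size grows linearly with rank: for a base weight of shape d_out x d_in, the pair adds rank * (d_in + d_out) parameters (A is rank x d_in, B is d_out x rank). A quick estimate, using illustrative 4096-dimensional projections rather than values read from a real model:

    my ($d_in, $d_out) = (4096, 4096);   # illustrative, not from a real model
    for my $rank (4, 8, 16, 32, 64) {
        my $params = $rank * ($d_in + $d_out);
        printf "rank %3d -> %d parameters per adapted weight\n", $rank, $params;
    }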
get_weight_tensor
my $tensor_a = $lora->get_weight_tensor($name, 'a');
my $tensor_b = $lora->get_weight_tensor($name, 'b');
Returns the LoRA weight tensor as a Lugh::Autograd::Tensor object. Only available on trainable adapters (created with create()).
The $name parameter is the base weight name (e.g., "blk.0.attn_q.weight"). The second parameter specifies which matrix: 'a' for the down-projection or 'b' for the up-projection.
The returned tensor has requires_grad enabled and gradient storage allocated for use with backpropagation.
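Combining weight_names with get_weight_tensor walks every adapted weight, which is how a training loop would collect the tensors to optimize:

    for my $name ($trainable_lora->weight_names) {
        my $a = $trainable_lora->get_weight_tensor($name, 'a');   # down-projection
        my $b = $trainable_lora->get_weight_tensor($name, 'b');   # up-projection
        # Both tensors have requires_grad set and gradient storage allocated
        say "collected A/B pair for $name";
    }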
save
$lora->save('my-adapter.gguf');
Saves the LoRA adapter to a GGUF file. The path must end with .gguf.
The saved file includes:
Metadata: general.type, adapter.type, adapter.lora.alpha
Architecture information (if available)
All LoRA tensor pairs (*.lora_a, *.lora_b)
The saved adapter can be loaded with new() for inference.
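A save/reload round trip confirms a trained adapter before deployment:

    $trainable_lora->save('my-adapter.gguf');

    # Reload as a regular adapter for inference
    my $reloaded = Lugh::LoRA->new(
        adapter => 'my-adapter.gguf',
        model   => $model,
    );
    say $reloaded->format;                               # "gguf"
    say $reloaded->trainable ? "trainable" : "frozen";   # frozen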
USING LORA WITH INFERENCE
LoRA adapters integrate with all forward methods:
Basic Forward
my @logits = $inference->forward(
    tokens => \@tokens,
    lora   => $lora,
);
With KV Cache
my $cache = $inference->create_kv_cache();
my @logits = $inference->forward_cache(
    cache  => $cache,
    tokens => \@tokens,
    lora   => $lora,
);
With Memory Pool
my $pool = $inference->create_memory_pool();
my @logits = $inference->forward_pool(
    pool   => $pool,
    tokens => \@tokens,
    lora   => $lora,
);
Batch Processing
my $results = $inference->forward_batch(
    sequences => [\@seq1, \@seq2, \@seq3],
    lora      => $lora,
);
Adjusting LoRA Strength
Use the scale property to adjust LoRA influence:
$lora->scale(0.0); # Disable LoRA (base model only)
$lora->scale(0.5); # Half LoRA effect
$lora->scale(1.0); # Full LoRA effect (default)
$lora->scale(2.0); # Double LoRA effect
SEE ALSO
Lugh, Lugh::Inference, Lugh::KVCache, Lugh::Autograd::Tensor
AUTHOR
lnation <email@lnation.org>
LICENSE
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.