Changes for version 0.10 - 2026-01-20

  • Forward Batch Mode with Per-Sequence KV Caches
    • forward_batch() now accepts caches => [$cache1, $cache2, ...] for per-sequence KV caching in batch mode
    • Enables parallel incremental decoding of multiple conversations
    • Each sequence uses its own independent cache
    • The number of caches must match the number of sequences
    • Also supported via forward(sequences => [...], caches => [...]); see the first usage sketch below
    • forward_batch_pool() also supports caches parameter
    • New tests in t/28-unified-forward.t
  • Speculative Decoding (Phase 16)
    • New Lugh::Speculative module for faster inference (see the second usage sketch below)
    • Uses a smaller draft model to generate candidate tokens
    • The main model verifies the drafted tokens in parallel for a 2-3x speedup
    • Dual KV cache management for both models
    • Statistics: acceptance_rate(), tokens_drafted/accepted, total_steps
    • Vocab compatibility validation between models
    • New tests in t/29-speculative.t
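
The first sketch below shows how per-sequence caches might be wired into a batched
forward pass. Only the sequences => [...] and caches => [...] parameters, the
forward_batch()/forward() method names, and the rule that the number of caches must
match the number of sequences come from this changelog; the constructor names
(Lugh::Model, Lugh::Inference, Lugh::KVCache), their parameters, and the token ids
are assumptions used purely for illustration.

    use strict;
    use warnings;
    use Lugh;

    # Hypothetical setup: load a GGUF model and build an inference object.
    # These constructor names and parameters are assumed, not documented here.
    my $model = Lugh::Model->new(file => 'model.gguf');
    my $infer = Lugh::Inference->new(model => $model);

    # One independent KV cache per conversation (class name assumed).
    my @caches = map { Lugh::KVCache->new(model => $model) } 1 .. 2;

    # Two conversations decoded incrementally in a single batch;
    # the token ids are placeholders.
    my @sequences = (
        [ 1, 15043 ],    # conversation 1
        [ 1, 29871 ],    # conversation 2
    );

    # caches => [...] is the per-sequence cache parameter: the number of
    # caches must match the number of sequences, and each sequence keeps
    # reusing its own cache on later calls.
    my $out = $infer->forward_batch(
        sequences => \@sequences,
        caches    => \@caches,
    );

    # The unified entry point accepts the same parameters:
    # $infer->forward(sequences => \@sequences, caches => \@caches);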
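
The second sketch outlines how Lugh::Speculative might be driven. The module name,
the draft-then-verify scheme, and the statistics calls are taken from the changelog;
the constructor parameters (target =>, draft =>), the generate() call, and the exact
names of the drafted/accepted counters (shortened above as "tokens_drafted/accepted")
are assumptions.

    use strict;
    use warnings;
    use Lugh;
    use Lugh::Speculative;

    # Hypothetical model loading (constructor name and parameters assumed).
    my $main_model  = Lugh::Model->new(file => 'big-model.gguf');
    my $draft_model = Lugh::Model->new(file => 'draft-model.gguf');

    # Smaller draft model proposes tokens, the main model verifies them;
    # vocab compatibility between the two models is validated up front.
    my $spec = Lugh::Speculative->new(
        target => $main_model,     # assumed parameter name
        draft  => $draft_model,    # assumed parameter name
    );

    my $text = $spec->generate(   # assumed method name
        prompt     => 'The capital of France is',
        max_tokens => 64,
    );

    # Statistics listed in the changelog:
    printf "acceptance rate: %.2f\n", $spec->acceptance_rate;
    printf "drafted: %d  accepted: %d  steps: %d\n",
        $spec->tokens_drafted, $spec->tokens_accepted, $spec->total_steps;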

Modules

Pure C LLM Inference Engine for Perl (built on ggml)
Memory Context for Tensor Allocation
Computation Graph for Tensor Operations
Transformer Forward Pass and Token Generation
KV Cache for efficient incremental decoding
Low-Rank Adaptation (LoRA) adapter support for Lugh
GGUF Model Loading and Tensor Access
Tensor Operations for Neural Network Computation
Chat Template Formatting for LLM Conversations
Quantization utilities for Lugh tensors
RoPE (Rotary Position Embedding) Scaling Configuration
Speculative decoding for faster LLM inference
N-Dimensional Tensor with ggml Backend
BPE Tokenizer for Text Encoding and Decoding