Changes for version 0.05 - 2026-01-18

  • Performance Optimizations
  • GPU backend activation
    • Multi-backend support: Metal, BLAS, CUDA, Vulkan, CPU
  • New backend discovery API:
    • Lugh::available_backends() - List all available backends
    • Lugh::backend_count() - Get backend count
    • Lugh::backend_info($name) - Get backend metadata
    • Lugh::backend_available($name) - Check availability
    • Lugh::best_backend() - Get recommended backend
    • Lugh::has_metal() / Lugh::metal_available() - Metal support
  • Backend selection via a new constructor parameter: Lugh::Inference->new(backend => $name)
  • Memory pools for efficient repeated inference:
    • create_memory_pool() - Create reusable compute resources
    • forward_with_pool($pool, \@tokens) - Forward pass with pooled resources
    • Lugh::MemoryPool class with reset() and backend() methods
  • Batch processing for multiple sequences:
    • forward_batch(\@sequences) - Process multiple token sequences
  • New comprehensive performance test file: t/09-performance.t
  • Full documentation for the new APIs in Lugh and Lugh::Inference
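
A minimal sketch of how the backend discovery and selection entries above might fit together. The helper pick_backend() is hypothetical and not part of Lugh; the exact name strings accepted by Lugh::backend_available() and its return value are assumed here (names taken from the multi-backend list, result treated as a boolean). Lugh::best_backend() may already cover this use case; the sketch only illustrates an explicit preference order.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lugh): return the first preferred
# backend that the availability check accepts, else the fallback.
# $is_available is a code ref, so the logic can run without Lugh installed.
sub pick_backend {
    my ($preferred, $is_available, $fallback) = @_;
    for my $name (@$preferred) {
        return $name if $is_available->($name);
    }
    return $fallback;
}

# Only call into Lugh when it is installed; otherwise assume a CPU-only host.
my $have_lugh = eval { require Lugh; 1 };
my $check = $have_lugh
    # Assumption: backend_available() returns a true value when usable.
    ? sub { eval { Lugh::backend_available($_[0]) } }
    : sub { $_[0] eq 'CPU' };

# Assumption: these strings match the names Lugh reports for its backends.
my $backend = pick_backend([qw(Metal CUDA Vulkan BLAS CPU)], $check, 'CPU');
print "Selected backend: $backend\n";

# The chosen name would then be passed to the constructor alongside its
# other arguments (not shown), as in the changelog entry:
#   Lugh::Inference->new(backend => $backend)
```

Taking the availability check as a code reference keeps the selection logic testable on machines where Lugh or a GPU is not present.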

Modules

Pure C LLM Inference Engine for Perl (built on ggml)
Memory Context for Tensor Allocation
Computation Graph for Tensor Operations
Transformer Forward Pass and Token Generation
KV Cache for efficient incremental decoding
GGUF Model Loading and Tensor Access
Tensor Operations for Neural Network Computation
N-Dimensional Tensor with ggml Backend
BPE Tokenizer for Text Encoding and Decoding