Changes for version 0.03 - 2026-01-18

  • Added generate() method for multi-token autoregressive generation
  • Added sample_top_k() method for top-k sampling
  • Generation supports greedy decoding, top_p and top_k sampling, temperature scaling, and streaming callbacks
  • Added EOS token stopping and callback-based early stopping
  • New test suite t/07-generate.t with 22 tests including exact output validation
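
For readers unfamiliar with the features listed above, here is a minimal, language-neutral sketch of how top-k sampling, temperature scaling, EOS stopping, and callback-based early stopping usually fit together in an autoregressive loop. It is written in Python for brevity and is not this distribution's API: the function names only echo the changelog, and the signatures, the `next_logits` callable, the toy model, and the convention that a callback returning False stops generation (and that the EOS token is left out of the output) are all assumptions made for illustration.

```python
import math
import random


def sample_top_k(logits, k, temperature=1.0, rng=random):
    """Pick a token id from the k highest-scoring logits (hypothetical helper)."""
    if temperature <= 0.0 or k <= 1:
        # Greedy decoding: take the single highest-scoring token id.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Keep only the ids of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Temperature scaling, then a numerically stable softmax over the survivors.
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


def generate(next_logits, prompt_ids, max_new_tokens, eos_id,
             k=1, temperature=1.0, callback=None, rng=random):
    """Autoregressive loop: sample, check for EOS, stream the token, repeat."""
    tokens = list(prompt_ids)
    produced = []
    for _ in range(max_new_tokens):
        tok = sample_top_k(next_logits(tokens), k, temperature, rng)
        if tok == eos_id:
            break  # EOS stopping (assumed: EOS itself is not returned)
        tokens.append(tok)
        produced.append(tok)
        # Streaming callback; returning False requests an early stop (assumed).
        if callback is not None and callback(tok) is False:
            break
    return produced


def toy_model(tokens):
    """Stand-in for a forward pass: always favors (last token + 1) mod 5."""
    target = (tokens[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]


print(generate(toy_model, [0], max_new_tokens=10, eos_id=4))  # [1, 2, 3]
```

With the defaults (k=1) the loop is greedy and deterministic, which is what makes exact output validation of the kind mentioned for t/07-generate.t possible; sampling runs would need a fixed random seed to be reproducible.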

Modules

Pure C LLM Inference Engine for Perl (built on ggml)
Memory Context for Tensor Allocation
Computation Graph for Tensor Operations
Transformer Forward Pass and Token Generation
GGUF Model Loading and Tensor Access
Tensor Operations for Neural Network Computation
N-Dimensional Tensor with ggml Backend
BPE Tokenizer for Text Encoding and Decoding