Changes for version 0.08 - 2026-01-20

  • New Lugh::Quant module
  • All GGML quantization types exposed as constants:
    • Float types: F32, F16, BF16, F64
    • Integer types: I8, I16, I32, I64
    • Basic quantization: Q4_0, Q4_1, Q5_0, Q5_1, Q8_0, Q8_1
    • K-quant types: Q2_K, Q3_K, Q4_K, Q5_K, Q6_K, Q8_K
    • IQ types: IQ1_S, IQ1_M, IQ2_XXS, IQ2_XS, IQ2_S, IQ3_XXS, IQ3_S, IQ4_NL, IQ4_XS
    • Experimental: TQ1_0, TQ2_0, MXFP4
  • Type introspection functions (see the first sketch after this list):
    • type_name(), type_size(), blck_size(), type_sizef()
    • is_quantized(), requires_imatrix(), row_size()
    • type_count(), all_types(), all_quantized_types()
    • type_from_name(), type_info()
  • Tensor type methods: type(), type_name(), type_size(), blck_size(), is_quantized(), nbytes()
  • OO quantize/dequantize on Lugh::Tensor (round-trip sketch after this list):
    • $tensor->quantize($ctx, $type) - F32 to quantized
    • $tensor->dequantize($ctx) - quantized/F16/BF16 to F32
  • Quantization loss test demonstrating accuracy vs. bit depth
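
  A minimal introspection sketch, assuming the Lugh::Quant routines above
  are callable as fully qualified package subs (the exact export style is
  not documented in this changelog) and that type_sizef() reports possibly
  fractional bytes per element, as ggml's ggml_type_sizef does:

      use strict;
      use warnings;
      use Lugh::Quant;

      # One line per quantized type: name, block size, storage cost,
      # and whether an importance matrix is required.
      for my $type (Lugh::Quant::all_quantized_types()) {
          printf "%-8s block=%-3d bytes/elem=%.4f imatrix=%s\n",
              Lugh::Quant::type_name($type),
              Lugh::Quant::blck_size($type),
              Lugh::Quant::type_sizef($type),
              Lugh::Quant::requires_imatrix($type) ? "yes" : "no";
      }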

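  A round-trip sketch for the new OO quantize/dequantize API. Here $ctx
  and $f32 stand in for an existing Lugh::Context and F32 tensor (tensor
  construction is outside this release's changes), and the "use Lugh"
  line and fully qualified Q4_0 constant are assumptions about module
  loading and export:

      use strict;
      use warnings;
      use Lugh;
      use Lugh::Quant;

      sub roundtrip_q4 {
          my ($ctx, $f32) = @_;    # a Lugh::Context and an F32 Lugh::Tensor

          # F32 -> Q4_0, then back to F32 for comparison.
          my $q4   = $f32->quantize($ctx, Lugh::Quant::Q4_0);
          my $back = $q4->dequantize($ctx);

          printf "type=%s quantized=%s bytes=%d\n",
              $q4->type_name,
              $q4->is_quantized ? "yes" : "no",
              $q4->nbytes;

          return $back;
      }

  Comparing $back against the original F32 data is the idea behind the new
  quantization loss test: higher-bit types such as Q8_0 should round-trip
  with less error than low-bit types such as Q2_K.
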
Modules

  • Pure C LLM Inference Engine for Perl (built on ggml)
  • Memory Context for Tensor Allocation
  • Computation Graph for Tensor Operations
  • Transformer Forward Pass and Token Generation
  • KV Cache for Efficient Incremental Decoding
  • Low-Rank Adaptation (LoRA) Adapter Support for Lugh
  • GGUF Model Loading and Tensor Access
  • Tensor Operations for Neural Network Computation
  • Chat Template Formatting for LLM Conversations
  • Quantization Utilities for Lugh Tensors
  • N-Dimensional Tensor with ggml Backend
  • BPE Tokenizer for Text Encoding and Decoding