Changes for version 0.202 - 2026-02-25

  • Engine base class hierarchy: introduce Engine::Remote (JSON + HTTP +
    required url) and Engine::OpenAIBase (+ OpenAICompatible, OpenAPI,
    Models, Temperature, ResponseSize, SystemPrompt, Streaming, Chat).
    All 15 engines now extend these base classes instead of repeating
    10+ role composition statements. New engines need only 2-3 lines.
  • Migrate non-OpenAI engines to extend Engine::Remote: Anthropic, Gemini, Ollama, AKI
  • Migrate OpenAI-compatible engines to extend Engine::OpenAIBase: OpenAI, DeepSeek, Groq, Perplexity, Mistral, MiniMax, NousResearch, AKIOpenAI, OllamaOpenAI, vLLM (Whisper inherits via OpenAI)
  • New engine: Cerebras — high-speed inference platform (llama-3.3-70b)
  • New engine: OpenRouter — unified gateway for 300+ models
  • New engine: Replicate — thousands of open-source models
  • New engine: LlamaCpp — llama.cpp server with embeddings
  • OpenAICompatible: api_key is now optional (undef = no Authorization header), enabling local engines (vLLM, llama.cpp) without dummy keys
  • OpenAICompatible: model is now optional in requests, enabling single-model servers (vLLM, llama.cpp) without explicit model names
  • Add comprehensive engine hierarchy test (t/10_engine_hierarchy.t) verifying inheritance, role composition, instantiation, and request generation for all 19 engines
  • Raider self-tools: raider_mcp => 1 enables LLM-controlled tools: raider_ask_user, raider_pause, raider_abort, raider_wait, raider_wait_for, raider_session_history, raider_manage_mcps, raider_switch_engine
  • Raider engine_catalog: runtime engine switching via self-tool or API
  • Raider mcp_catalog: dynamic MCP server activation/deactivation
  • Raider inline tools: quick tool definitions without MCP server setup
  • Raider::Result: typed result objects (final, question, pause, abort) with backward-compatible stringification
  • AKI: openai() no longer carries over the native model name (naming
    differs between the native and /v1 APIs); it uses the default model
    and warns
  • Add live embedding test (t/82_live_embedding.t) with semantic similarity verification via Math::Vector::Similarity for OpenAI, Mistral, Ollama, OllamaOpenAI, and LlamaCpp
  • Add live chat test (t/83_live_chat.t) for all 16 engines including Cerebras, OpenRouter, Perplexity, MiniMax, and LlamaCpp
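
The optional api_key and model changes above can be sketched as follows.
This is a minimal, hypothetical example of talking to a local llama.cpp
server through the new LlamaCpp engine; the url value and the use of
simple_chat() follow the style of the existing engine SYNOPSIS sections,
and the endpoint and behaviour of your local server are assumptions:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Langertha::Engine::LlamaCpp;

# With api_key now optional, no dummy key is needed for local engines,
# and a single-model server needs no explicit model name.
my $llamacpp = Langertha::Engine::LlamaCpp->new(
  url => 'http://localhost:8080',   # assumed local llama.cpp server
);

print $llamacpp->simple_chat('Say something nice'), "\n";
```

The same pattern should apply to vLLM via Langertha::Engine::vLLM.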

Documentation

Simple chat with Ollama
Simple chat with OpenAI
Simple script to check the model list on an OpenAI-compatible API
Simple transcription with a Whisper-compatible server or OpenAI
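
For orientation, the "Simple chat with Ollama" example follows roughly
this shape (a sketch in the style of the distribution's SYNOPSIS; the
model name, endpoint, and prompt are placeholder assumptions):

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Langertha::Engine::Ollama;

my $ollama = Langertha::Engine::Ollama->new(
  url           => 'http://localhost:11434',  # default Ollama endpoint
  model         => 'llama3.1',                # any locally pulled model
  system_prompt => 'You are a helpful assistant',
);

print $ollama->simple_chat('Say something nice'), "\n";
```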

Modules

The clan of fierce vikings with 🪓 and 🛡️ to AId your rAId
AKI.IO native API
AKI.IO via OpenAI-compatible API
Cerebras Inference API
Google Gemini API
GroqCloud API
llama.cpp server
Nous Research Inference API
Ollama via OpenAI-compatible API
Base class for OpenAI-compatible engines
Perplexity Sonar API
Base class for all remote engines
Whisper-compatible transcription server
vLLM inference server
Autonomous agent with conversation history and MCP tools
Result object from a Raider raid
An HTTP Request inside of Langertha
LLM response with metadata
Role for APIs with normal chat functionality
Role for an engine where you can specify the context size (in tokens)
Role for APIs with embedding functionality
Role for HTTP APIs
Role for JSON
Langfuse observability integration
Role for APIs with several models
Role for OpenAI-compatible API format
Role for APIs with OpenAPI definition
Role for an engine where you can specify structured output
Role for an engine where you can specify the response size (in tokens)
Role for an engine that can set a seed
Role for streaming support
Role for APIs with system prompt
Role for an engine that can have a temperature setting
Configurable think tag filtering for reasoning models
Role for MCP tool calling support
Role for APIs with transcription functionality
Iterator for streaming responses
Represents a single chunk from a streaming response
Bring your own viking!