Changes for version 0.201 - 2026-02-23
- Add Response.thinking attribute for chain-of-thought reasoning:
- Native extraction: DeepSeek/OpenAI-compatible reasoning_content, Anthropic thinking blocks, Gemini thought parts — automatically populated on Response.thinking, no configuration needed
- Think tag filter: <think> tag stripping is enabled by default on all engines. Handles both closed (<think>...</think>) and unclosed (<think>...) tags. The tag name is configurable via think_tag (default: 'think'); disable with think_tag_filter => 0. Filtering is applied across all text paths: simple_chat, streaming, tool calling, and Raider.
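A minimal sketch of the new thinking support, assuming a local Ollama server running a model that emits <think> blocks. The think_tag, think_tag_filter, and simple_chat names come from the entries above; the url/model constructor arguments, environment variable, and model id are illustrative assumptions.

```perl
use strict;
use warnings;
use Langertha::Engine::Ollama;

my $ollama = Langertha::Engine::Ollama->new(
  url   => $ENV{OLLAMA_URL},      # assumption: attribute and env var names
  model => 'deepseek-r1',         # assumption: any <think>-emitting model
  # think_tag        => 'think',  # default tag name (from this release)
  # think_tag_filter => 0,        # uncomment to disable stripping
);

# simple_chat returns the answer text with the <think>...</think>
# block stripped by default.
print $ollama->simple_chat('Why is the sky blue?'), "\n";
```

When a full Response object is in play, the extracted reasoning is exposed on its thinking attribute.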
- Add NousResearch reasoning attribute — enables chain-of-thought reasoning for Hermes 4 and DeepHermes 3 models by prepending the standard Nous reasoning system prompt
- Langfuse cascading traces — Raider now creates a proper hierarchical Trace → Span (iteration) → Generation (llm-call) / Span (tool) structure instead of a flat trace → generation. Iteration spans group the LLM call and its tool calls. Tool spans capture per-tool timing, input, and output. The trace is updated with the final output at raid end.
- Langfuse: add langfuse_span() for creating span events
- Langfuse: add langfuse_update_trace(), langfuse_update_span(), langfuse_update_generation() for updating observations after creation
- Langfuse: langfuse_trace() now supports tags, user_id, session_id, release, version, public, and environment fields
- Langfuse: langfuse_generation() now supports parent_observation_id, model_parameters, level, status_message, and version fields
- Langfuse: Raider generations now include token usage data and model parameters (temperature, max_tokens) when available
- Raider: add langfuse_trace_name, langfuse_user_id, langfuse_session_id, langfuse_tags, langfuse_release, langfuse_version, langfuse_metadata attributes for customizing Langfuse trace creation
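A sketch of the new Raider Langfuse attributes. The langfuse_* attribute names are taken from the entry above; the Langertha::Raider class name, the engine attribute, and all values are illustrative assumptions.

```perl
use strict;
use warnings;
use Langertha::Raider;              # assumption: agent class name
use Langertha::Engine::OpenAI;

my $engine = Langertha::Engine::OpenAI->new(
  api_key => $ENV{OPENAI_API_KEY},  # assumption: attribute and env var names
);

my $raider = Langertha::Raider->new(
  engine              => $engine,            # assumption: attribute name
  langfuse_trace_name => 'support-bot',
  langfuse_user_id    => 'user-123',
  langfuse_session_id => 'session-abc',
  langfuse_tags       => [ 'prod', 'raider' ],
  langfuse_metadata   => { tenant => 'acme' },
);
```

With these set, the cascading Trace → Span → Generation structure described above is created under the given trace name and session.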
- Refactor all OpenAI-compatible engines to compose Langertha::Role::OpenAICompatible directly instead of extending Langertha::Engine::OpenAI. Each engine now only includes the roles it actually supports (e.g. DeepSeek gets Chat but not Embedding). Removes all "doesn't support X" croak overrides. Affected engines: DeepSeek, Groq, Mistral, MiniMax, NousResearch, Perplexity, vLLM, AKIOpenAI, OllamaOpenAI.
- Add Raider context compression — when prompt token usage exceeds a configurable threshold (max_context_tokens * context_compress_threshold), history is automatically summarized via LLM before the next raid. Supports separate compression_engine for using cheaper models. Manual compression via compress_history/compress_history_f.
- Add Raider session_history — full chronological archive of ALL messages including tool calls and results, persisted across clear_history and reset. Queryable by the LLM via MCP tool registered with register_session_history_tool().
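The two Raider memory features above can be sketched together. The compression_engine, max_context_tokens, context_compress_threshold, compress_history, and register_session_history_tool names come from the entries above; the class name, engine attribute, and threshold value are illustrative assumptions.

```perl
use strict;
use warnings;
use Langertha::Raider;                  # assumption: agent class name

my $raider = Langertha::Raider->new(
  engine                     => $engine,        # assumption: attribute name
  compression_engine         => $cheap_engine,  # cheaper model for summaries
  max_context_tokens         => 32_000,
  context_compress_threshold => 0.8,            # compress past 80% usage
);

# Expose the full session archive (including tool calls and results)
# to the LLM as an MCP tool.
$raider->register_session_history_tool;

# Trigger compression manually; compress_history_f is the Future variant.
$raider->compress_history;
```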
- Add MiniMax to live tool calling test (t/80_live_tool_calling.t) and live raider test (t/82_live_raider.t)
- Add t/83_live_minimax.t: dedicated MiniMax live test covering simple_chat, list_models, and Raider with Coding Plan web search
- Add Raider inject() method for mid-raid context injection — queue messages from async callbacks, timers, or other tasks that get picked up at the next iteration naturally
- Add Raider on_iteration callback — called before each LLM call (iterations 2+) with ($raider, $iteration), returns messages to inject. Injected messages are persisted in history.
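The inject() and on_iteration entries above might look like this in practice. Both names come from the changelog; the class name, engine attribute, and the assumption that a plain string is an acceptable message form are illustrative.

```perl
use strict;
use warnings;
use Langertha::Raider;              # assumption: agent class name

my $raider = Langertha::Raider->new(
  engine       => $engine,          # assumption: attribute name
  on_iteration => sub {
    my ( $raider, $iteration ) = @_;
    # Called before each LLM call from iteration 2 onward; returned
    # messages are injected and persisted in history.
    return "Status: iteration $iteration, keep answers short.";
  },
);

# From an async callback, timer, or other task: queue a message that
# is picked up naturally at the next iteration.
$raider->inject('New data arrived: order #1234 was shipped.');
```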
- Add Langertha::Engine::MiniMax for MiniMax AI API (chat, streaming, tool calling via OpenAI-compatible API)
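A minimal sketch of the new MiniMax engine; simple_chat and the class name come from the entries above, while the constructor arguments, environment variable, and model id are illustrative assumptions.

```perl
use strict;
use warnings;
use Langertha::Engine::MiniMax;

my $minimax = Langertha::Engine::MiniMax->new(
  api_key => $ENV{MINIMAX_API_KEY},   # assumption: attribute and env var names
  model   => 'MiniMax-Text-01',       # assumption: model id
);

print $minimax->simple_chat('Hello from Langertha!'), "\n";
```

Streaming and tool calling work through the same OpenAI-compatible paths as the other engines refactored in this release.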
- Rewrite all POD to inline style across all modules — =attr directly after has, =method directly after sub. Add POD to all previously undocumented modules.
- Improve =seealso cross-links: remove redundant main module links, add meaningful related module references
Documentation
Simple chat with Ollama
Simple chat with OpenAI
Simple script to check the model list on an OpenAI-compatible API
Simple transcription with a Whisper-compatible server or OpenAI
Modules
The clan of fierce vikings with 🪓 and 🛡️ to AId your rAId
AKI.IO native API
AKI.IO via OpenAI-compatible API
Anthropic API
DeepSeek API
Google Gemini API
GroqCloud API
MiniMax API
Mistral API
Nous Research Inference API
Ollama API
Ollama via OpenAI-compatible API
OpenAI API
Perplexity Sonar API
Whisper-compatible transcription server
vLLM inference server
Autonomous agent with conversation history and MCP tools
An HTTP request inside of Langertha
LLM response with metadata
Role for APIs with normal chat functionality
Role for an engine where you can specify the context size (in tokens)
Role for APIs with embedding functionality
Role for HTTP APIs
Role for JSON
Langfuse observability integration
Role for APIs with several models
Role for OpenAI-compatible API format
Role for APIs with OpenAPI definition
Role for an engine where you can specify structured output
Role for an engine where you can specify the response size (in tokens)
Role for an engine that can set a seed
Role for streaming support
Role for APIs with system prompt
Role for an engine that can have a temperature setting
Configurable think tag filtering for reasoning models
Role for MCP tool calling support
Role for APIs with transcription functionality
Iterator for streaming responses
Represents a single chunk from a streaming response
Bring your own viking!
Examples
- ex/async_await.pl
- ex/ctx.pl
- ex/embedding.pl
- ex/hermes_tools.pl
- ex/ircbot.pl
- ex/json_grammar.pl
- ex/langfuse-k8s.yaml
- ex/langfuse.pl
- ex/logic.pl
- ex/mcp_inprocess.pl
- ex/mcp_stdio.pl
- ex/ollama.pl
- ex/ollama_image.pl
- ex/raider.pl
- ex/response.pl
- ex/sample.ogg
- ex/streaming_anthropic.pl
- ex/streaming_callback.pl
- ex/streaming_future.pl
- ex/streaming_gemini.pl
- ex/streaming_iterator.pl
- ex/streaming_mojo.pl
- ex/structured_code.pl
- ex/structured_output.pl
- ex/structured_sentences.pl
- ex/synopsis.pl
- ex/transcription.pl