Revision history for Langertha
0.500 2026-04-26 18:50:51Z
!!! Heads-up for callers upgrading from 0.404 — items marked [BREAKING]
!!! below may need code changes; everything else is additive.
[BREAKING] Langertha::Response->tool_calls is now
ArrayRef[Langertha::ToolCall] (was ArrayRef[HashRef]). Code that
read $r->tool_calls->[0]->{name} / ->{arguments} / ->{id} /
->{synthetic} as hash keys must switch to ->name / ->arguments /
->id / ->synthetic method calls. The Response constructor still
accepts the old HashRef form and upgrades it transparently
(BUILDARGS), so passing tool_calls in is unchanged; only
consumption changed. tool_call_args() is unchanged.
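A migration sketch (variable names illustrative):
    my $call = $response->tool_calls->[0];
    # before 0.500: my $name = $call->{name};
    my $name = $call->name;
    my $args = $call->arguments;   # was $call->{arguments}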
[BREAKING] Langertha::Engine::Whisper no longer extends
Langertha::Engine::OpenAI. It now extends the new
Langertha::Engine::TranscriptionBase, so a Whisper instance no
longer has the chat-side methods (simple_chat, chat_f,
chat_with_tools_f, embedding, simple_image) or the Tools /
ImageGeneration / Embedding roles. Existing code that called only
transcription methods is unaffected. To get a Whisper handle from
an OpenAI engine without restating credentials, use the new
$openai->whisper attribute.
[BREAKING] Langertha::Role::ResponseFormat::decode_loose_json is now
a method on the role, not a free function. Code that called
Langertha::Role::ResponseFormat::decode_loose_json($text) directly
must switch to $engine->decode_loose_json($text). This makes it
overridable per engine for providers that need a custom strategy.
The standalone Langertha::Util that briefly existed has been
removed for the same reason.
- New Langertha::Engine::TranscriptionBase: slim base class for
OpenAI-shape transcription-only engines (composes OpenAICompatible,
OpenAPI, Models, Transcription, Capabilities — no Chat / Tools /
Embedding / ImageGeneration). Whisper now extends it.
- Langertha::Engine::OpenAI gained a `whisper` lazy attribute that
returns a Langertha::Engine::TranscriptionBase configured with the
parent's api_key/url and `whisper-1` as transcription_model.
`$openai->whisper->simple_transcription($file)` is the canonical
way to use OpenAI's hosted Whisper from a chat-side engine.
- New Langertha::Role::Capabilities, composed by Langertha::Role::Chat
(and therefore present on every engine via composition). One central
role-to-flag map drives engine_capabilities; engines override via
`around engine_capabilities` for wire-reality corrections.
Capabilities reported by each role:
Chat -> chat
Streaming -> streaming
Tools -> tools_native + tool_choice_{auto,any,none,named}
HermesTools -> tools_hermes
ResponseFormat -> response_format_json_object/json_schema
Embedding -> embedding
Transcription -> transcription
ImageGeneration -> image_generation
Temperature -> temperature
Seed -> seed
ContextSize -> context_size
ResponseSize -> response_size
SystemPrompt -> system_prompt
ParallelToolUse -> parallel_tool_use
The earlier `does()`-based heuristic in Role::Chat is gone;
`$engine->supports($cap)` is the canonical query.
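For example (capability names as listed above):
    if ($engine->supports('tools_native')) {
        # safe to send native tool definitions
    }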
- Langertha::Tool gained from_mcp (camelCase inputSchema), from_gemini
(flat `parameters`), to_gemini, to_mcp, to_json_schema. from_hash
now auto-detects MCP / Anthropic / Gemini shapes in addition to
OpenAI, replacing the input_schema / inputSchema / parameters /
function.parameters special-casing that used to live in chat_f.
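A minimal sketch ('get_weather' and the schema are illustrative):
    my $tool = Langertha::Tool->from_hash({
        name        => 'get_weather',
        description => 'Fetch the current weather',
        inputSchema => { type => 'object', properties => {} },  # MCP shape, auto-detected
    });
    my $for_openai = $tool->to_openai;
    my $for_gemini = $tool->to_gemini;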
- Langertha::ToolCall gained a `synthetic` boolean attribute (false
by default) and a from_gemini constructor; ToolCall->extract now
pulls Gemini functionCall parts out of candidates[0].content.parts.
- Langertha::Response->tool_calls is now populated by every native
tool-calling engine (OpenAICompatible, AnthropicBase, Gemini,
Ollama) as well as the chat_f synthetic-tool fallback path, giving
one source of truth with the same shape regardless of provider.
Langertha::Response gained tool_call($name), returning the matching
Langertha::ToolCall object (vs. tool_call_args returning args).
- Langertha::Stream::Chunk gained an optional tool_calls attribute
(ArrayRef[Langertha::ToolCall]). Langertha::Role::Chat got
aggregate_tool_calls($chunks) for collecting them after a stream
ends. Per-engine streaming tool-call delta accumulation will land
incrementally; the structures are in place.
- Langertha::Engine::AnthropicBase, Langertha::Engine::Gemini, and
Langertha::Engine::Ollama now compose
Langertha::Role::ResponseFormat. Anthropic emulates response_format
via a synthesized tool plus forced tool_choice (the chat_response
parser lifts the resulting tool_use input back into
Response->content as JSON). Gemini translates response_format into
generationConfig (responseMimeType + responseSchema). Ollama
translates into the `format` parameter (string 'json' for
json_object, schema HashRef for json_schema). The legacy Ollama
json_format attribute still works as a fallback when
response_format isn't set.
- Langertha::Engine::OpenAIBase now composes
Langertha::Role::ResponseFormat, so every OpenAI-compatible engine
(Perplexity, DeepSeek, Groq, Mistral, MiniMax, Cerebras, OpenRouter,
Replicate, HuggingFace, AKIOpenAI, TSystems, Scaleway, OllamaOpenAI,
vLLM, SGLang, LlamaCpp, NousResearch) accepts a response_format
constructor argument. Removed the now-redundant per-engine
ResponseFormat composition from those engines.
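A hedged sketch (constructor-level response_format; json_object
shape assumed, env var name illustrative):
    my $engine = Langertha::Engine::Groq->new(
        api_key         => $ENV{LANGERTHA_GROQ_API_KEY},
        response_format => { type => 'json_object' },
    );
    my $data = $engine->decode_loose_json(
        $engine->simple_chat('Reply with a JSON object: {"ok":true}')
    );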
- Langertha::ToolChoice gained to_perplexity (string-only API:
auto/none/required; named coerces to required) and to_gemini
(toolConfig.functionCallingConfig with mode AUTO/ANY/NONE plus
allowed_function_names for named forcing) serializers.
- Langertha::Engine::Gemini chat_request and chat_stream_request now
translate tool_choice in any input shape (canonical / OpenAI / Anthropic)
into Gemini's toolConfig payload.
- Langertha::Role::Chat got chat_f, a named-arguments async entry point:
$engine->chat_f(messages => [...], tools => [...], tool_choice => ...,
response_format => ...). simple_chat_f delegates to it; existing
@messages-style call sites are unchanged. Forced-named tool calls on
engines that lack native named-tool-forcing but support json_schema
response_format (currently Perplexity) are auto-rewritten through the
response_format path; the response text is loose-parsed (handles
```json fences and prose-wrapped JSON) and a synthetic tool_calls
entry is attached so callers see the same shape regardless of provider.
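A fuller sketch (message content and tool name illustrative):
    my $response = $engine->chat_f(
        messages    => [{ role => 'user', content => 'Weather in Oslo?' }],
        tools       => [$tool],          # e.g. a Langertha::Tool object
        tool_choice => 'auto',
    )->get;
    if (my $call = $response->tool_call('get_weather')) {
        my $args = $call->arguments;
    }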
- Langertha::Response gained a tool_calls attribute and tool_call_args
accessor; clone_with carries tool_calls through.
- Langertha::Role::Chat exposes engine_capabilities (default derived
from role composition) and a supports($cap) helper so callers can
query what the engine can honour before sending parameters.
- Langertha::Role::ResponseFormat gained decode_loose_json($text), a
tolerant decoder for structured-output responses that may be wrapped
in code fences or prose.
- New Langertha::Engine::TSystems for the T-Systems AI Foundation
Services / LLM Hub OpenAI-compatible endpoint
(https://llm-server.llmhub.t-systems.net/v2). Bearer auth via
LANGERTHA_TSYSTEMS_API_KEY, default model gpt-oss-120b (T-Cloud,
Germany; reliable tool calling), supports chat, streaming, tool
calling, embeddings (default text-embedding-bge-m3) and structured
output. GDPR-compliant; T-Cloud models are processed in Germany,
hyperscaler models in the EU.
- New Langertha::Engine::Scaleway for Scaleway Generative APIs
(https://api.scaleway.ai/v1) — EU-hosted, drop-in OpenAI-compatible
replacement. Bearer auth via LANGERTHA_SCALEWAY_API_KEY, default
model llama-3.1-8b-instruct, supports chat, streaming, tool
calling, embeddings and structured output.
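A minimal sketch (prompt illustrative):
    my $scaleway = Langertha::Engine::Scaleway->new(
        api_key => $ENV{LANGERTHA_SCALEWAY_API_KEY},
    );
    print $scaleway->simple_chat('Say hello in French');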
0.404 2026-04-21 14:06:44Z
- New Langertha::Content role and Langertha::Content::Image value object
for provider-agnostic vision input. Mirrors the Langertha::ToolChoice
pattern: one canonical block (from_url / from_file / from_data /
from_base64) serializes to OpenAI image_url, Anthropic image source
(URL or base64), and Gemini inline_data via to_openai / to_anthropic
/ to_gemini. Gemini auto-downloads URL-only images on first call
because it has no URL source equivalent; media_type is sniffed from
the extension or the fetched Content-Type header.
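A minimal sketch (URL illustrative):
    my $image = Langertha::Content::Image->from_url('https://example.com/cat.jpg');
    # one canonical block, three wire formats:
    my $openai_block    = $image->to_openai;
    my $anthropic_block = $image->to_anthropic;
    my $gemini_block    = $image->to_gemini;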
- Langertha::Role::Chat gained content_format ('openai' by default,
'anthropic' on AnthropicBase, 'gemini' on Gemini) and a normalization
pass in chat_messages: a user message whose content is an arrayref
containing Langertha::Content objects is converted to the engine's
native wire format (bare strings in the array are wrapped as text
blocks, and Gemini messages are rebuilt into role/parts with
assistant -> model). Messages without Langertha::Content objects are
passed through untouched, so existing callers are unaffected.
- Fixes the "messages.0.content.1: Input tag 'image_url' ... does not
match 'image'" 400 from Anthropic when the same [text + image] prompt
was reused across engines: the canonical block is what callers
author, each engine produces its own format.
0.403 2026-04-21 12:04:54Z
- Fixed "Wide character in subroutine entry" crash on non-ASCII JSON
responses. Role::JSON's shared instance is configured with utf8=>1
(bytes in/out), but parse_response and execute_streaming_request
were feeding it Perl-Unicode via $response->decoded_content, which
blew up the first time a response body contained a non-ASCII byte
(Umlaut, em-dash, CJK, emoji). Both entry points now use
$response->content (raw bytes), keeping the pipeline consistent
with the outgoing side. The two spots that re-decode JSON
substrings out of an already-decoded tree (OpenAICompatible's
extract_tool_call for tool_call.function.arguments, and
HermesTools' response_tool_calls for <tool_call> XML bodies) now
go through a new Role::JSON::decode_json_text helper that
centralizes the encode_utf8 bridge.
- format_tools in OpenAICompatible, AnthropicBase, Gemini, and Ollama
now accept input_schema, inputSchema, or parameters as the schema
key (snake_case preferred, camelCase for MCP spec compatibility,
parameters as OpenAI-style fallback). Matches the defensive lookup
already done by Langertha::Tool::from_hash and
Raider::tools_as_mcp, which mix both styles internally.
0.402 2026-04-20 22:07:40Z
- [BREAKING] Langertha::Engine::MiniMax now talks to MiniMax's native
OpenAI-compatible endpoint (https://api.minimax.io/v1) instead of
the Anthropic-compatible shim. The previous behavior is preserved
as a new class Langertha::Engine::MiniMaxAnthropic (URL corrected
to /anthropic/v1 as MiniMax actually documents). Background:
MiniMax's /anthropic endpoint does not reliably re-parse
stringified tool-call arguments, causing intermittent tool-calling
failures where the Anthropic SDK sees a wrapper object whose key
rotates between 'result', 'arguments', and the tool name.
MiniMax's native OpenAI endpoint avoids the shim entirely. Users
who need the Anthropic wire format should switch from
Langertha::Engine::MiniMax to Langertha::Engine::MiniMaxAnthropic.
The default model is now MiniMax-M2.7 (was MiniMax-M2.5) on both
classes.
- Automatic tool_choice normalization. chat_request in OpenAICompatible
and AnthropicBase now runs any tool_choice passed via %extra through
Langertha::ToolChoice and emits the target-engine's native format.
Callers can pass Anthropic-style (type+name), OpenAI-style
(type:function + function.name), or string shorthands ('auto',
'none', 'required', 'any') to any engine — no more engine-specific
branching needed.
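A sketch of the accepted input shapes ('get_weather' is a
hypothetical tool name); all normalize through Langertha::ToolChoice:
    my @accepted = (
        'auto', 'none', 'required', 'any',           # string shorthands
        { type => 'tool', name => 'get_weather' },   # Anthropic-style
        { type => 'function',
          function => { name => 'get_weather' } },   # OpenAI-style
    );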
- New Langertha::Role::ParallelToolUse with a canonical
`parallel_tool_use` boolean attribute. Constructor also accepts the
provider-native alias names: `parallel_tool_calls` (OpenAI) and
`disable_parallel_tool_use` (Anthropic, inverted). The attribute is
translated per-engine to the native request parameter — OpenAI sends
`parallel_tool_calls`, Anthropic folds `disable_parallel_tool_use`
into the tool_choice block. Automatically composed by
Langertha::Role::Tools so every tool-capable engine gets it.
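A minimal sketch (engine class and api_key illustrative; alias names
as listed):
    my $engine = Langertha::Engine::OpenAI->new(
        api_key             => '...',
        parallel_tool_calls => 0,   # OpenAI alias, stored as parallel_tool_use => 0
    );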
- Langertha::ToolChoice accepts 'any' as a string shorthand and
{type:'required'} as a hash form (both normalize to canonical type
'any').
- Added MiniMax-M2.7 to the static model list and made it the default.
0.401 2026-04-12 21:24:49Z
- Guard list_models in OpenAICompatible against engines that do not
support the listModels operation; use StaticModels for Perplexity
and NousResearch instead of hitting a 404.
- Fix Moose warning in ToolChoice by importing only enum from
Moose::Util::TypeConstraints.
0.400 2026-04-07 23:01:11Z
- New value object Langertha::Usage for token counting with
from_hash / from_response constructors and to_openai_format /
to_anthropic_format / to_ollama_format serializers.
- New value object Langertha::Cost for the monetary cost of a single
LLM call (input_usd / output_usd / total_usd / currency).
- New Langertha::Pricing — model→rule catalog with cost_for(usage,
model) returning a Cost.
- New Langertha::UsageRecord — Usage + Cost + tagged metadata
(provider, engine, model, route, api_key_id, duration_ms, tool
counts) with to_hash for ledger storage.
- New value object Langertha::Tool for canonical tool definitions
with from_openai / from_anthropic / from_list constructors and
to_openai / to_anthropic / to_ollama / to_hash serializers.
- New value object Langertha::ToolCall for canonical tool
invocations with from_openai / from_anthropic / from_ollama /
extract / extract_hermes_from_text constructors and to_openai /
to_anthropic_block / to_ollama serializers.
- New value object Langertha::ToolChoice with enum-typed canonical
type ('auto' / 'any' / 'none' / 'tool'), auto/any/none/specific
shortcut constructors, and to_openai / to_anthropic conversions.
- Refactor Langertha::Metrics, Langertha::Input, Langertha::Output,
Langertha::Input::Tools, Langertha::Output::Tools into thin
backwards-compatibility facades over the new value objects.
External APIs unchanged; existing callers keep working.
- The five facade modules now emit a one-time Carp::carp at load
time pointing callers at the new value objects.
- dist.ini sets irc = #langertha so PodWeaver injects an IRC
support block in every module's POD.
0.309 2026-04-05 16:37:32Z
- Fix Moose role composition: consolidate all separate `with` calls into
single `with map { 'Langertha::Role::'.$_ } qw(...)` form across all
engines; this exposed a real role conflict between Role::OpenAICompatible
and Role::OpenAPI on `_build_openapi_operations`.
- Fix role conflict: remove `_build_openapi_operations` from
Role::OpenAICompatible (wrong place), define it in Engine::OpenAIBase
(the consuming class) using `use_module` instead of the `require`
hack.
- Apply same `use_module('Langertha::Spec::*')->data` pattern to
Engine::Ollama, Engine::Mistral, Engine::LMStudio.
- Add missing `make_immutable` to Engine::Whisper and Request::HTTP.
- Remove unused `namespace::autoclean` from Stream and Stream::Chunk.
0.308 2026-04-04 15:03:20Z
0.307 2026-03-10 17:42:28Z
- Add new OpenAI-compatible self-hosted engine:
Langertha::Engine::SGLang.
- Add engine-scope module discovery via Module::Pluggable in Langertha:
`available_engine_classes`, `available_engine_ids`,
and generic `discover_modules_in_scope`.
- Update `resolve_engine_class` to use discovered module scope
(`Langertha::Engine::*` + `LangerthaX::Engine::*`) with deterministic
core-first lookup.
- Add `Langertha->new_engine($name_or_class, %args)` helper for
resolve+load+construct in one call.
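A minimal sketch (engine id and arguments illustrative):
    my $engine = Langertha->new_engine('ollama',
        url   => 'http://localhost:11434',
        model => 'llama3.1',
    );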
- Document third-party custom engines under `LangerthaX::Engine::*`
and include resolver behavior in docs.
- Add tests for discovered engine classes/ids and LangerthaX fallback
(`t/99-engine-resolution.t` + `t/lib` fixture module).
- Extend load/hierarchy/readme coverage for the new SGLang engine.
- Add `Module::Pluggable` as a direct runtime dependency.
0.306 2026-03-10 13:37:01Z
- Add new shared core modules for cross-format normalization:
Langertha::Input(+::Tools), Langertha::Output(+::Tools),
and Langertha::Metrics.
- Core modules centralize tool schema conversion (OpenAI/Anthropic/Ollama),
Hermes XML extraction/normalization, and usage/cost metric normalization.
- Add core tests t/97_input_output.t and t/98_metrics.t and extend t/00_load.t.
0.305 2026-03-08 21:51:01Z
- New engine base class: Langertha::Engine::AnthropicBase for
Anthropic-compatible APIs (shared /v1/messages chat/streaming/tool/model
handling and Anthropic rate-limit parsing). Anthropic now extends this
base, and MiniMax + LMStudioAnthropic were migrated to extend it too.
- New engine: Langertha::Engine::LMStudio — native LM Studio local REST
API adapter (POST /api/v1/chat, SSE streaming with message.delta/chat.end,
GET /api/v1/models). Supports optional bearer auth via
LANGERTHA_LMSTUDIO_API_KEY, plus basic auth via URL userinfo.
Includes openai() helper returning a Langertha::Engine::LMStudioOpenAI
instance for LM Studio's /v1 endpoint.
- New engine: Langertha::Engine::LMStudioOpenAI for LM Studio's
OpenAI-compatible /v1 endpoint (defaults api_key to C<lmstudio>).
- New engine: Langertha::Engine::LMStudioAnthropic for LM Studio's
Anthropic-compatible /v1/messages endpoint. Includes LMStudio->anthropic
helper for easy conversion from native engine instances; defaults api_key
to C<lmstudio>.
- New OpenAPI spec: share/lmstudio.yaml with operationIds for LM Studio
native chat and model listing, plus Langertha::Spec::LMStudio for
pre-computed operation lookup.
- Tests: extend t/00_load.t, t/10_engine_hierarchy.t, and
t/11_basic_auth.t to cover LMStudio loading, inheritance/roles,
request mapping, and auth behavior.
Extend t/83_live_chat.t with optional LM Studio live coverage via
TEST_LANGERTHA_LMSTUDIO_URL, TEST_LANGERTHA_LMSTUDIO_MODEL, and
TEST_LANGERTHA_LMSTUDIO_API_KEY.
- Documentation: add POD for LMStudio, LMStudioOpenAI, and
LMStudioAnthropic helpers/attributes and expand README examples
to include explicit LMStudioOpenAI/LMStudioAnthropic class usage.
- Orchestration foundation on top of Raider:
add Langertha::Role::Runnable (run_f contract),
Langertha::RunContext (input/state/artifacts/metadata/trace + branch/merge),
Langertha::Raid base class, and concrete orchestrators
Langertha::Raid::Sequential, Langertha::Raid::Parallel, Langertha::Raid::Loop.
Supports nested composition of Raider and Raid nodes.
- Unified result model:
add Langertha::Result as a common result abstraction
(final/question/pause/abort), and make Langertha::Raider::Result a
backward-compatible subclass so Raider and Raid share the same
result semantics.
- Raider compatibility + interface:
Raider now composes Langertha::Role::Runnable and exposes run_f($ctx)
as an orchestration-friendly wrapper around raid_f while keeping existing
public raid_f/respond_f behavior intact.
- Raider fixes:
_gather_tools_f now uses the active engine (not always the default engine),
and Langfuse model parameters are recalculated after engine/tool dirtiness
refresh during runtime engine switching.
- Raider respond_f consistency:
plugin_after_tool_call hooks are now applied to remaining tool calls during
continuation flow (self-tools and MCP tools), matching main loop behavior.
- Tests:
add t/96_raid_orchestration.t covering Runnable compatibility, sequential/
parallel/loop orchestration, nested Raid trees, context propagation and
parallel isolation/merge semantics, result propagation (final/question/pause/abort),
and error paths for all orchestrator types.
Extend t/00_load.t to include new modules.
- Documentation:
add inline POD for all new orchestration/result/context modules and
refresh Raider::Result POD to reflect shared result inheritance.
Extend README with a new "Raid — Workflow Orchestration" section
(RunContext, Sequential/Parallel/Loop, unified results, nesting),
plus a top-level table of contents, architecture overview, and a
minimal sequential orchestration example.
0.304 2026-03-07 02:05:27Z
- New role: Langertha::Role::HermesTools — extracted Hermes-style XML
tool calling into a dedicated role. Engines compose this role instead
of setting a hermes_tools flag. Cleaner polymorphic dispatch: Role::Tools
provides the tool loop and default native API path, HermesTools overrides
build_tool_chat_request to inject tools into the system prompt.
- Role::Tools cleaned up: removed all hermes branching, private _hermes_*
methods, and hermes_tools attribute. Five polymorphic methods
(format_tools, response_tool_calls, extract_tool_call,
format_tool_results, response_text_content) are now provided by either
the engine (native) or HermesTools (XML).
- AKI.pm (native API): added tool calling support via HermesTools role
with hermes_extract_content override for AKI's response format.
- AKIOpenAI.pm: composes HermesTools role (replaces hermes_tools flag).
- NousResearch.pm: composes HermesTools role (replaces hermes_tools flag).
- Raider and Chat: simplified tool loop — removed all hermes if/else
branching, uses polymorphic build_tool_chat_request.
0.303 2026-03-01 03:24:11Z
0.302 2026-02-27 03:48:44Z
- Fix list_models URL construction: add overridable list_models_path
method to Role::OpenAICompatible (default: /models). Mistral
overrides to /v1/models. Fixes broken URL for engines whose base
URL does not include /v1.
- New Role::StaticModels: provides list_models from a hardcoded model
list without HTTP requests. Used by MiniMax.
- HuggingFace: list_models now queries the Hub API
(huggingface.co/api/models) with search, pipeline_tag, and
inference_provider filters. Only returns models with active
inference providers.
0.301 2026-02-27 01:57:13Z
- Rate limit extraction from HTTP response headers: new
Langertha::RateLimit data class with normalized requests_limit,
requests_remaining, tokens_limit, tokens_remaining, and reset
fields plus raw provider-specific headers. Supported providers:
OpenAI/Groq/Cerebras/OpenRouter/Replicate/HuggingFace
(x-ratelimit-*) and Anthropic (anthropic-ratelimit-*). Engine
stores latest rate_limit, Response carries per-response rate_limit
with requests_remaining/tokens_remaining convenience methods.
- New engine: HuggingFace — HuggingFace Inference Providers
(OpenAI-compatible, org/model format, chat + streaming + tool calling)
0.300 2026-02-26 21:03:33Z
- Plugin system: Langertha::Plugin base class with lifecycle hooks
(plugin_before_raid, plugin_build_conversation, plugin_before_llm_call,
plugin_after_llm_response, plugin_before_tool_call,
plugin_after_tool_call, plugin_after_raid) and self_tools support.
Plugins can be specified by short name (resolved to
Langertha::Plugin::* or LangerthaX::Plugin::*).
- Langertha::Plugin::Langfuse: Langfuse observability as a plugin
(alternative to engine-level Role::Langfuse), with cascading traces,
generations, and tool call spans in the Raider loop.
- Role::PluginHost: shared plugin hosting for engines and Raider,
with plugin resolution, instantiation, and _plugin_instances caching.
- Wrapper classes: Langertha::Chat, Langertha::Embedder,
Langertha::ImageGen for wrapping engines with optional overrides
(model, system_prompt, temperature, etc.) and plugin lifecycle hooks.
- Class sugar: `use Langertha qw( Raider )` and
`use Langertha qw( Plugin )` for quick subclass setup with
auto-import of Moose and Future::AsyncAwait.
- Image generation: Role::ImageGeneration with image_model attribute,
OpenAICompatible image_request/image_response/simple_image methods,
OpenAI now composes ImageGeneration role (default: gpt-image-1).
- Role::KeepAlive: extracted keep_alive attribute from Ollama into
a reusable role with get_keep_alive accessor.
- Ollama: update to current API — use operationIds chat/embed/list/ps
(was generateChat/generateEmbeddings/getModels/getRunningModels),
embedding response uses embeddings[0] (was embedding).
- NousResearch: reasoning_prompt is now a configurable attribute
(was hardcoded string).
- Groq, Mistral, OpenAI: consolidate `with 'Langertha::Role::Tools'`
into the main role composition block.
- Log::Any debug/trace logging in Role::Chat, Role::Embedding,
Role::HTTP, Role::Tools, and Role::OpenAPI for request lifecycle
visibility.
- Add Log::Any to cpanfile runtime dependencies.
- Update OpenAPI specs: openai.yaml, mistral.yaml, ollama.yaml to
latest upstream versions.
- Pre-computed OpenAPI lookup tables: ship Langertha::Spec::OpenAI (148
ops), Langertha::Spec::Mistral (67 ops), and Langertha::Spec::Ollama
(12 ops) as static Perl data instead of parsing YAML + constructing
OpenAPI::Modern at runtime. Startup cost drops from ~16s to <1ms.
- New openapi_operations attribute in Role::OpenAPI with automatic
fallback: engines that override _build_openapi_operations get the
fast path; custom engines using openapi_file still work via the
slow YAML/OpenAPI::Modern path.
- Add maint/generate_spec_data.pl to regenerate Spec modules from
share/*.yaml when specs are updated.
- New tests: t/84_live_imagegen.t, t/87_raider_plugins.t,
t/89_langertha_sugar.t, t/91_plugin_config.t, t/92_embedder.t,
t/93_chat.t, t/94_plugin_langfuse.t, t/95_imagegen.t.
0.202 2026-02-25 03:50:44Z
- Engine base class hierarchy: introduce Engine::Remote (JSON + HTTP
+ url required) and Engine::OpenAIBase (+ OpenAICompatible, OpenAPI,
Models, Temperature, ResponseSize, SystemPrompt, Streaming, Chat).
All 15 engines now extend these base classes instead of repeating
10+ role composition statements. New engines need only 2-3 lines.
- Migrate non-OpenAI engines to extend Engine::Remote:
Anthropic, Gemini, Ollama, AKI
- Migrate OpenAI-compatible engines to extend Engine::OpenAIBase:
OpenAI, DeepSeek, Groq, Perplexity, Mistral, MiniMax, NousResearch,
AKIOpenAI, OllamaOpenAI, vLLM (Whisper inherits via OpenAI)
- New engine: Cerebras — fastest inference platform (llama-3.3-70b)
- New engine: OpenRouter — unified gateway for 300+ models
- New engine: Replicate — thousands of open-source models
- New engine: LlamaCpp — llama.cpp server with embeddings
- OpenAICompatible: api_key is now optional (undef = no Authorization
header), enabling local engines (vLLM, llama.cpp) without dummy keys
- OpenAICompatible: model is now optional in requests, enabling
single-model servers (vLLM, llama.cpp) without explicit model names
- Add comprehensive engine hierarchy test (t/10_engine_hierarchy.t)
verifying inheritance, role composition, instantiation, and request
generation for all 19 engines
- Raider self-tools: raider_mcp => 1 enables LLM-controlled tools:
raider_ask_user, raider_pause, raider_abort, raider_wait,
raider_wait_for, raider_session_history, raider_manage_mcps,
raider_switch_engine
- Raider engine_catalog: runtime engine switching via self-tool or API
- Raider mcp_catalog: dynamic MCP server activation/deactivation
- Raider inline tools: quick tool definitions without MCP server setup
- Raider::Result: typed result objects (final, question, pause, abort)
with backward-compatible stringification
- AKI: openai() no longer carries over the native model name (naming
differs between the native and /v1 APIs); it uses the default model
and warns
- Add live embedding test (t/82_live_embedding.t) with semantic
similarity verification via Math::Vector::Similarity for OpenAI,
Mistral, Ollama, OllamaOpenAI, and LlamaCpp
- Add live chat test (t/83_live_chat.t) for all 16 engines including
Cerebras, OpenRouter, Perplexity, MiniMax, and LlamaCpp
0.201 2026-02-23 03:50:17Z
- Add Response->thinking attribute for chain-of-thought reasoning:
- Native extraction: DeepSeek/OpenAI-compatible reasoning_content,
Anthropic thinking blocks, Gemini thought parts — automatically
populated on Response->thinking, no configuration needed
- Think tag filter: <think> tag stripping enabled by default on
all engines. Handles both closed (<think>...</think>) and
unclosed (<think>...) tags. Configurable tag name via
think_tag (default: 'think'). Disable with
think_tag_filter => 0. Filtering applied across all text
paths: simple_chat, streaming, tool calling, and Raider.
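A minimal sketch (api_key and prompt illustrative):
    my $engine = Langertha::Engine::DeepSeek->new(api_key => '...');
    my $r = $engine->simple_chat('Think step by step: what is 17*23?');
    print $r->thinking // '';   # reasoning_content, extracted automatically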
- Add NousResearch reasoning attribute — enables chain-of-thought
reasoning for Hermes 4 and DeepHermes 3 models by prepending
the standard Nous reasoning system prompt
- Langfuse cascading traces — Raider now creates proper hierarchical
Trace → Span (iteration) → Generation (llm-call) / Span (tool)
structure instead of flat trace → generation. Iteration spans group
the LLM call and its tool calls. Tool spans capture per-tool timing,
input, and output. Trace is updated with final output at raid end.
- Langfuse: add langfuse_span() for creating span events
- Langfuse: add langfuse_update_trace(), langfuse_update_span(),
langfuse_update_generation() for updating observations after creation
- Langfuse: langfuse_trace() now supports tags, user_id, session_id,
release, version, public, and environment fields
- Langfuse: langfuse_generation() now supports parent_observation_id,
model_parameters, level, status_message, and version fields
- Langfuse: Raider generations now include token usage data and
model parameters (temperature, max_tokens) when available
- Raider: add langfuse_trace_name, langfuse_user_id, langfuse_session_id,
langfuse_tags, langfuse_release, langfuse_version, langfuse_metadata
attributes for customizing Langfuse trace creation
- Refactor all OpenAI-compatible engines to compose
Langertha::Role::OpenAICompatible directly instead of extending
Langertha::Engine::OpenAI. Each engine now only includes the roles
it actually supports (e.g. DeepSeek gets Chat but not Embedding).
Removes all "doesn't support X" croak overrides. Affected engines:
DeepSeek, Groq, Mistral, MiniMax, NousResearch, Perplexity, vLLM,
AKIOpenAI, OllamaOpenAI.
- Add Raider context compression — when prompt token usage exceeds
a configurable threshold (max_context_tokens * context_compress_threshold),
history is automatically summarized via LLM before the next raid.
Supports separate compression_engine for using cheaper models.
Manual compression via compress_history/compress_history_f.
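A hedged sketch (constructor shape illustrative; attribute names
from this entry):
    my $raider = Langertha::Raider->new(
        engine                     => $engine,
        max_context_tokens         => 32_000,
        context_compress_threshold => 0.8,   # compress at 80% of the budget
        compression_engine         => $cheap_engine,
    );
    $raider->compress_history;               # or compress_history_f for async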
- Add Raider session_history — full chronological archive of ALL
messages including tool calls and results, persisted across
clear_history and reset. Queryable by the LLM via MCP tool
registered with register_session_history_tool().
- Add MiniMax to live tool calling test (t/80_live_tool_calling.t)
and live raider test (t/82_live_raider.t)
- Add t/83_live_minimax.t: dedicated MiniMax live test covering
simple_chat, list_models, and Raider with Coding Plan web search
- Add Raider inject() method for mid-raid context injection —
queue messages from async callbacks, timers, or other tasks
that get picked up at the next iteration naturally
- Add Raider on_iteration callback — called before each LLM call
(iterations 2+) with ($raider, $iteration), returns messages
to inject. Injected messages are persisted in history.
- Add Langertha::Engine::MiniMax for MiniMax AI API
(chat, streaming, tool calling via OpenAI-compatible API)
- Rewrite all POD to inline style across all modules —
=attr directly after has, =method directly after sub.
Add POD to all previously undocumented modules.
- Improve =seealso cross-links: remove redundant main module
links, add meaningful related module references
0.200 2026-02-22 21:53:36Z
- Add Langertha::Response: metadata container wrapping LLM text content
with id, model, finish_reason, usage (token counts), timing, and created
fields. Uses overload stringification for backward compatibility —
existing code treating responses as strings continues to work.
- All chat_response methods now return Langertha::Response objects:
- Role::OpenAICompatible: extracts id, model, created, finish_reason, usage
- Engine::Anthropic: extracts id, model, stop_reason, input/output_tokens
- Engine::Gemini: extracts modelVersion, finishReason, usageMetadata
(normalized to prompt_tokens/completion_tokens/total_tokens)
- Engine::Ollama: extracts model, done_reason, eval counts, timing fields
- Engine::AKI: extracts model_name, total_duration
- Add Langertha::Raider: autonomous agent with conversation history and
MCP tool calling. Features mission (system prompt), persistent history
across raids, cumulative metrics (raids, iterations, tool_calls, time_ms),
clear_history and reset methods. Supports Hermes tool calling.
Auto-instruments raids with Langfuse traces and per-iteration
generation events when Langfuse is enabled on the engine.
- Add Langertha::Role::Langfuse: observability integration with Langfuse
REST API. Composed into Role::Chat — every engine has Langfuse support
built in. Auto-instruments simple_chat with trace and generation events.
Batched ingestion via POST /api/public/ingestion with Basic Auth.
Disabled by default — active when langfuse_public_key and
langfuse_secret_key are set (via constructor or LANGFUSE_PUBLIC_KEY /
LANGFUSE_SECRET_KEY / LANGFUSE_URL env vars).
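A minimal sketch (key values illustrative):
    my $engine = Langertha::Engine::OpenAI->new(
        api_key             => '...',
        langfuse_public_key => 'pk-lf-...',
        langfuse_secret_key => 'sk-lf-...',
    );
    $engine->simple_chat('Traced call');   # emits trace + generation events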
- Add ex/response.pl: Response metadata showcase (tokens, model, timing)
- Add ex/raider.pl: autonomous file explorer agent example
- Add ex/langfuse.pl: Langfuse observability example
- Add ex/langfuse-k8s.yaml: Kubernetes manifest for self-hosted Langfuse
with pre-configured project and API keys (zero setup)
- Add t/70_response.t: Response unit tests across all engine formats
- Add t/72_langfuse.t: Langfuse integration tests with mock HTTP
- Add t/82_live_raider.t: live Raider integration test
- Add Langertha::Role::OpenAICompatible: extracted OpenAI API format
methods into a reusable role. Engines that use the OpenAI-compatible
API format now compose this role instead of duplicating methods.
Engine::OpenAI and all subclasses continue to work unchanged.
- Add Langertha::Engine::OllamaOpenAI: first-class engine for Ollama's
OpenAI-compatible /v1 endpoint. Ollama's openai() method now returns
this engine instead of a raw Engine::OpenAI instance.
- Add Langertha::Engine::AKI for AKI.IO native API
(chat completions with key-in-body auth, synchronous mode,
dynamic endpoint listing via list_models and endpoint_details)
- Add Langertha::Engine::AKIOpenAI for AKI.IO via OpenAI-compatible API
(chat, streaming, tool calling via Role::OpenAICompatible)
- Add Langertha::Engine::NousResearch for Nous Research Inference API
with Hermes-native tool calling via <tool_call> XML tags
- Add Langertha::Engine::Perplexity for Perplexity Sonar API
(chat and streaming only, no tool calling)
- Add hermes_tools feature flag to Langertha::Role::Tools for
Hermes-native tool calling via <tool_call>/<tool_response> XML tags;
enables MCP tool calling on any model that supports the Hermes
prompt format, even without API-level tool support
- Add hermes_call_tag, hermes_response_tag attributes for custom
XML tag names (default: tool_call, tool_response)
- Add hermes_tool_instructions attribute for customizing the
instruction text without changing the structural XML template
- Add hermes_tool_prompt attribute for full system prompt override
- Add hermes_extract_content() method for engines to override
response content extraction in Hermes mode
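A minimal sketch of the flag form added here (replaced by
Langertha::Role::HermesTools in 0.304; url and model illustrative):
    my $engine = Langertha::Engine::Ollama->new(
        url          => 'http://localhost:11434',
        model        => 'hermes3',
        hermes_tools => 1,   # tools injected via the Hermes prompt format
    );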
- MCP tool calling now supported on ALL engines:
- OpenAI (inherited by Groq, vLLM, Mistral, DeepSeek)
- Anthropic (with Anthropic-native tool format)
- Gemini (with Gemini-native functionDeclarations format)
- Ollama (OpenAI-compatible tool format)
- NousResearch (Hermes-native via <tool_call> XML tags)
- Add extract_tool_call() to Role::Tools for engine-agnostic
tool call parsing across all provider formats
- Fix Gemini tool calling: pass-through native message formats,
convert MCP tool results to Gemini's functionResponse object
- Fix Gemini chat_request to preserve native parts in messages
from tool result round-trips
- Remove hardcoded all_models() lists from all engines; model
discovery is now exclusively dynamic via list_models()
- Update default models:
- Anthropic: claude-sonnet-4-6 (short alias)
- Gemini: gemini-2.5-flash (2.0-flash deprecated for new users)
- Add Hermes tool calling unit test with mock round-trip
(t/66_tool_calling_hermes.t)
- Add vLLM tool calling unit test (t/65_tool_calling_vllm.t)
- Add live integration test for all engines including Ollama, vLLM,
and NousResearch (t/80_live_tool_calling.t) with multi-model support
- Add mock round-trip test for Ollama tool calling
(t/64_tool_calling_ollama_mock.t) using fixture data
- Add shared Test::MockAsyncHTTP test helper (t/lib/)
for mocking async HTTP in engine tests
- Normalize test API key env vars to TEST_LANGERTHA_*_API_KEY
prefix to prevent accidental use of production keys
- Add TEST_LANGERTHA_OLLAMA_URL and TEST_LANGERTHA_OLLAMA_MODELS
env vars for Ollama live testing
- Add TEST_LANGERTHA_VLLM_URL, TEST_LANGERTHA_VLLM_MODEL, and
TEST_LANGERTHA_VLLM_TOOL_CALL_PARSER env vars for vLLM live testing
- Add AKI.IO native API unit test (t/25_aki_requests.t) with mock
response parsing for chat, list_models, and endpoint_details
- Add AKI.IO live integration test (t/81_live_aki.t) for
list_models, endpoint_details, and simple_chat
- Add AKI.IO to live tool calling test (t/80_live_tool_calling.t)
via OpenAI-compatible API
- Add TEST_LANGERTHA_AKI_API_KEY and TEST_LANGERTHA_AKI_MODEL
env vars for AKI.IO live testing
- Use RFC 2606 test.invalid domain for dummy URLs in unit tests
- Add ex/hermes_tools.pl example for Hermes-native tool calling
- Rewrite all POD to inline style across all 37 modules —
=attr directly after has, =method directly after sub.
Add POD to 18 previously undocumented modules.
0.100 2026-02-20 05:33:44Z
- Add MCP (Model Context Protocol) tool calling support
- New Langertha::Role::Tools for engine-agnostic tool calling
- Anthropic engine: full tool calling support (format_tools,
response_tool_calls, format_tool_results, response_text_content)
- Async chat_with_tools_f() method for automatic multi-round
tool-calling loop with configurable max iterations
- Requires Net::Async::MCP for MCP server communication
- Add Future::AsyncAwait support for async/await syntax
- All _f methods (simple_chat_f, simple_chat_stream_f, etc.)
- Streaming with real-time async callbacks
- Add streaming support
- Synchronous callback, iterator, and Future-based APIs
- SSE parsing for OpenAI/Anthropic/Groq/Mistral/DeepSeek
- NDJSON parsing for Ollama
- Add Gemini engine (Google AI Studio)
- Add dynamic model listing via provider APIs with caching
- Add Anthropic extended parameters (effort, inference_geo)
- Improve POD documentation across all modules
0.008 2025-03-30 04:55:38Z
- Add Mistral engine integration
- Adapt Mistral OpenAPI spec for our parser
0.007 2025-01-25 19:29:51Z
- Add DeepSeek engine
0.006 2024-09-30 14:07:25Z
- Add Structured Output support
- Add Groq engine and Groq Whisper support
- Add TEST_WITHOUT_STRUCTURED_OUTPUT env variable
0.005 2024-08-22 13:43:31Z
- Fix data type on keep_alive and remove POSIX round usage
0.004 2024-08-13 23:10:57Z
- Fix interpretation of max_tokens on Anthropic (response size, not context)
0.003 2024-08-11 00:21:01Z
- Add context size and temperature controls
0.002 2024-08-10 02:22:12Z
- Add Whisper Transcription API
- Add more engines
- Fix encoding issues
0.001 2024-08-03 22:47:33Z
- Initial release
- Unified Perl interface for LLM APIs
- Engines: OpenAI, Anthropic, Ollama
- Role-based architecture (Chat, HTTP, Models, JSON, Embedding)
- OpenAPI spec-driven request generation
- Embedding support