NAME

Langertha::Engine::vLLM - vLLM inference server

VERSION

version 0.303

SYNOPSIS

use Langertha::Engine::vLLM;

my $vllm = Langertha::Engine::vLLM->new(
    url           => 'http://localhost:8000/v1',
    system_prompt => 'You are a helpful assistant',
);

print $vllm->simple_chat('Say something nice');

# MCP tool calling (requires server started with tool-call-parser)
use Future::AsyncAwait;

my $vllm = Langertha::Engine::vLLM->new(
    url         => 'http://localhost:8000/v1',
    model       => 'Qwen/Qwen2.5-3B-Instruct',
    mcp_servers => [$mcp],
);

# await must be used inside an async sub
async sub add_numbers {
    my $response = await $vllm->chat_with_tools_f('Add 7 and 15');
    return $response;
}

DESCRIPTION

Provides access to vLLM, a high-throughput inference engine for large language models. Composes Langertha::Role::OpenAICompatible since vLLM exposes an OpenAI-compatible API.

Only url is required. The URL must include the /v1 path prefix (e.g., http://localhost:8000/v1). Since vLLM serves exactly one model (configured at server startup), no model name or API key is needed.

MCP tool calling requires the vLLM server to be started with --enable-auto-tool-choice and --tool-call-parser matching the model (hermes for Qwen2.5/Hermes, llama3 for Llama, mistral for Mistral).
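Putting the pieces above together, a tool-calling round trip might be sketched as follows. This is a sketch under assumptions: it presumes a vLLM server already running with the flags described above, and an MCP server handle ($mcp) constructed elsewhere; the helper name ask_with_tools is illustrative, not part of the API.

    # Start the server first (shell), e.g.:
    #   vllm serve Qwen/Qwen2.5-3B-Instruct \
    #     --enable-auto-tool-choice --tool-call-parser hermes

    use strict;
    use warnings;
    use Future::AsyncAwait;
    use Langertha::Engine::vLLM;

    # Hypothetical helper: await must run inside an async sub
    async sub ask_with_tools {
        my ($vllm, $prompt) = @_;
        my $response = await $vllm->chat_with_tools_f($prompt);
        return $response;
    }

    my $vllm = Langertha::Engine::vLLM->new(
        url         => 'http://localhost:8000/v1',
        model       => 'Qwen/Qwen2.5-3B-Instruct',
        mcp_servers => [$mcp],   # $mcp: an MCP server handle from your setup
    );

    # Block on the returned Future for this simple script
    print ask_with_tools($vllm, 'Add 7 and 15')->get;

The async sub returns a Future, so callers can compose it with other asynchronous work or, as here, block on it with ->get.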

See https://docs.vllm.ai/ for installation and configuration details.

THIS API IS WORK IN PROGRESS

SEE ALSO

Langertha, Langertha::Role::OpenAICompatible

SUPPORT

Issues

Please report bugs and feature requests on GitHub at https://github.com/Getty/langertha/issues.

CONTRIBUTING

Contributions are welcome! Please fork the repository and submit a pull request.

AUTHOR

Torsten Raudssus <torsten@raudssus.de> https://raudss.us/

COPYRIGHT AND LICENSE

This software is copyright (c) 2026 by Torsten Raudssus.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.