NAME
Langertha::Engine::vLLM - vLLM inference server
VERSION
version 0.303
SYNOPSIS
    use Langertha::Engine::vLLM;

    my $vllm = Langertha::Engine::vLLM->new(
      url => 'http://localhost:8000/v1',
      system_prompt => 'You are a helpful assistant',
    );

    print $vllm->simple_chat('Say something nice');
    # MCP tool calling (requires the server to be started with a tool-call parser)
    use Future::AsyncAwait;

    my $vllm = Langertha::Engine::vLLM->new(
      url => 'http://localhost:8000/v1',
      model => 'Qwen/Qwen2.5-3B-Instruct',
      mcp_servers => [$mcp],
    );

    # await must be used inside an async sub; in plain blocking code,
    # call ->get on the returned Future instead
    my $response = await $vllm->chat_with_tools_f('Add 7 and 15');
DESCRIPTION
Provides access to vLLM, a high-throughput inference engine for large language models. Composes Langertha::Role::OpenAICompatible since vLLM exposes an OpenAI-compatible API.
Only url is required. The URL must include the /v1 path prefix (e.g., http://localhost:8000/v1). Since vLLM serves exactly one model (configured at server startup), no model name or API key is needed.
MCP tool calling requires the vLLM server to be started with --enable-auto-tool-choice and --tool-call-parser matching the model (hermes for Qwen2.5/Hermes, llama3 for Llama, mistral for Mistral).
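As a sketch, a server for the Qwen example in the SYNOPSIS might be launched as follows; the model name and port are illustrative and should match your own setup:

```shell
# Launch vLLM with automatic tool choice enabled.
# The --tool-call-parser value must match the model family
# (hermes for Qwen2.5/Hermes, llama3 for Llama, mistral for Mistral).
vllm serve Qwen/Qwen2.5-3B-Instruct \
  --port 8000 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```

The engine's url would then be http://localhost:8000/v1, since vLLM mounts its OpenAI-compatible API under the /v1 prefix.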
See https://docs.vllm.ai/ for installation and configuration details.
THIS API IS WORK IN PROGRESS
SEE ALSO
https://docs.vllm.ai/ - vLLM documentation
Langertha::Role::OpenAICompatible - OpenAI API format role
Langertha::Role::Tools - MCP tool calling interface
Langertha::Engine::OllamaOpenAI - Another self-hosted OpenAI-compatible engine
SUPPORT
Issues
Please report bugs and feature requests on GitHub at https://github.com/Getty/langertha/issues.
CONTRIBUTING
Contributions are welcome! Please fork the repository and submit a pull request.
AUTHOR
Torsten Raudssus <torsten@raudssus.de> https://raudss.us/
COPYRIGHT AND LICENSE
This software is copyright (c) 2026 by Torsten Raudssus.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.