NAME
WebService::Ollama - ollama client
VERSION
Version 0.05
SYNOPSIS
my $ollama = WebService::Ollama->new(
    base_url => 'http://localhost:11434',
    model    => 'llama3.2',
);
$ollama->load_completion_model;
my $string = "";
my $why = $ollama->completion(
    prompt    => 'Why is the sky blue?',
    stream    => 1,
    stream_cb => sub {
        my ($res) = @_;
        $string .= $res->response;
    },
); # returns all chunked responses as an array
$ollama->unload_completion_model;
SUBROUTINES/METHODS
version
Retrieve the Ollama version
$ollama->version;
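For example, printing the reported server version. This is a sketch only: it assumes the returned object exposes the raw /api/version payload through a version accessor, which may be named differently in your installed release.
    my $res = $ollama->version;
    # /api/version returns { "version" => "0.x.y" }; the accessor name is an assumption.
    print "Ollama server version: ", $res->version, "\n";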
create_model
Create a model from another model, a safetensors directory, or a GGUF file.
Parameters
- model
    name of the model to create
- from
    (optional) name of an existing model to create the new model from
- files
    (optional) a dictionary of file names to SHA256 digests of blobs to create the model from
- adapters
    (optional) a dictionary of file names to SHA256 digests of blobs for LoRA adapters
- template
    (optional) the prompt template for the model
- license
    (optional) a string or list of strings containing the license or licenses for the model
- system
    (optional) a string containing the system prompt for the model
- parameters
    (optional) a dictionary of parameters for the model (see the Modelfile documentation for a list of parameters)
- messages
    (optional) a list of message objects used to create a conversation
- stream
    (optional) if false, the response will be returned as a single response object rather than a stream of objects
- stream_cb
    (optional) callback to handle streamed response objects
- quantize
    (optional) quantize a non-quantized (e.g. float16) model
$ollama->create_model(
    model  => 'mario',
    from   => 'llama3.2',
    system => 'You are Mario from Super Mario Bros.',
);

my $mario_story = $ollama->chat(
    model    => 'mario',
    messages => [
        {
            role    => 'user',
            content => 'Hello, Tell me a story.',
        },
    ],
);
copy_model
Copy a model. Creates a model with another name from an existing model.
Parameters
- source
    name of the existing model to copy from
- destination
    name of the new model to copy to
$ollama->copy_model(
    source      => 'llama3.2',
    destination => 'llama3-backup',
);
delete_model
Delete a model and its data.
Parameters
- model
    model name to delete
$ollama->delete_model(
    model => 'mario',
);
available_models
List models that are available locally.
$ollama->available_models;
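A sketch of walking the list of local models. It assumes the response object exposes the raw /api/tags payload via a models accessor returning an array ref of hash refs; check the accessor and key names against your installed version.
    my $local = $ollama->available_models;
    # Each entry carries at least a name and size, per the /api/tags payload;
    # the models accessor and hash keys here are assumptions.
    for my $model (@{ $local->models }) {
        printf "%s (%.1f GB)\n", $model->{name}, $model->{size} / 1e9;
    }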
running_models
List models that are currently loaded into memory.
$ollama->running_models;
load_completion_model
Load a model into memory
$ollama->load_completion_model;
$ollama->load_completion_model(model => 'llava');
unload_completion_model
Unload a model from memory
$ollama->unload_completion_model;
$ollama->unload_completion_model(model => 'llava');
completion
Generate a response for a given prompt with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.
Parameters
- model
    (required) the model name
- prompt
    the prompt to generate a response for
- suffix
    the text after the model response
- images
    (optional) a list of base64-encoded images (for multimodal models such as llava)
- image_files
    (optional) a list of image files
Advanced parameters (optional):
- format
    the format to return a response in; can be json or a JSON schema
- options
    additional model parameters listed in the Modelfile documentation, such as temperature (see the sketch after the examples below)
- system
    system message (overrides what is defined in the Modelfile)
- template
    the prompt template to use (overrides what is defined in the Modelfile)
- stream
    if false, the response will be returned as a single response object rather than a stream of objects
- stream_cb
    (optional) callback to handle streamed response objects
- raw
    if true, no formatting will be applied to the prompt; use this if you are specifying a full templated prompt in your request to the API
- keep_alive
    controls how long the model will stay loaded in memory following the request (default: 5m)
- context (deprecated)
    the context parameter returned from a previous request to /generate; this can be used to keep a short conversational memory
my $image = $ollama->completion(
    model       => 'llava',
    prompt      => 'What is in this image?',
    image_files => [
        "t/pingu.png",
    ],
);

my $json = $ollama->completion(
    prompt => "What color is the sky at different times of the day? Respond using JSON",
    format => "json",
)->json_response;

my $json2 = $ollama->completion(
    prompt => "Ollama is 22 years old and is busy saving the world. Respond using JSON",
    format => {
        type       => "object",
        properties => {
            age => {
                type => "integer",
            },
            available => {
                type => "boolean",
            },
        },
        required => [
            "age",
            "available",
        ],
    },
)->json_response;
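The advanced parameters above can be combined in one call. A sketch using options and system; temperature and num_predict are standard Modelfile parameters, and the values chosen here are illustrative only.
    my $haiku = $ollama->completion(
        model      => 'llama3.2',
        prompt     => 'Write a haiku about the sea.',
        system     => 'You are a terse poet. Answer with the poem only.',
        options    => {
            temperature => 0.2,   # lower temperature for more deterministic output
            num_predict => 64,    # cap the number of generated tokens
        },
        keep_alive => '10m',      # keep the model loaded longer than the 5m default
    );
    print $haiku->response, "\n";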
load_chat_model
Load a model into memory
$ollama->load_chat_model;
$ollama->load_chat_model(model => 'llava');
unload_chat_model
Unload a model from memory
$ollama->unload_chat_model;
$ollama->unload_chat_model(model => 'llava');
chat
Generate the next message in a chat with a provided model.
Parameters
- model
    (required) the model name
- messages
    the messages of the chat; this can be used to keep a chat memory (a multi-turn sketch follows the examples below)
    The message object has the following fields:
    - role
        the role of the message: either system, user, assistant, or tool
    - content
        the content of the message
    - images
        (optional) a list of images to include in the message (for multimodal models such as llava)
    - tool_calls
        (optional) a list of tools in JSON that the model wants to use
- tools
    a list of tools in JSON for the model to use, if supported
- format
    the format to return a response in; can be json or a JSON schema
- options
    additional model parameters listed in the Modelfile documentation, such as temperature
- stream
    if false, the response will be returned as a single response object rather than a stream of objects
- keep_alive
    controls how long the model will stay loaded in memory following the request (default: 5m)
my $completion = $ollama->chat(
    messages => [
        {
            role    => 'user',
            content => 'Why is the sky blue?',
        },
    ],
);

my $image = $ollama->chat(
    model    => 'llava',
    messages => [
        {
            role        => 'user',
            content     => 'What is in this image?',
            image_files => [
                "t/pingu.png",
            ],
        },
    ],
);

my $json = $ollama->chat(
    messages => [
        {
            role    => "user",
            content => "Ollama is 22 years old and is busy saving the world. Respond using JSON",
        },
    ],
    format => {
        type       => "object",
        properties => {
            age => {
                type => "integer",
            },
            available => {
                type => "boolean",
            },
        },
        required => [
            "age",
            "available",
        ],
    },
)->json_response;
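Since messages holds the full conversation, chat memory is kept by appending each assistant reply before the next call. A sketch only: the message accessor used to pull the assistant reply out of the response is an assumption and may be named differently in your installed version.
    my @history = (
        { role => 'user', content => 'Why is the sky blue?' },
    );
    my $first = $ollama->chat(messages => \@history);
    # Append the assistant's reply so the next turn has context;
    # the message accessor and content key are assumptions.
    push @history, {
        role    => 'assistant',
        content => $first->message->{content},
    };
    push @history, { role => 'user', content => 'Now explain it to a five year old.' };
    my $second = $ollama->chat(messages => \@history);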
embed
Generate embeddings from a model
Parameters
- model
    name of the model to generate embeddings from
- input
    text, or a list of texts, to generate embeddings for (a batch sketch follows the example below)
- truncate
    (optional) truncates the end of each input to fit within the context length; returns an error if false and the context length is exceeded (defaults to true)
- options
    (optional) additional model parameters listed in the Modelfile documentation, such as temperature
- keep_alive
    (optional) controls how long the model will stay loaded in memory following the request (default: 5m)
my $embeddings = $ollama->embed(
    model => "nomic-embed-text",
    input => "Why is the sky blue?",
);
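Because input also accepts a list, several texts can be embedded in one request. A sketch, assuming the response exposes the raw embeddings array from the API through an embeddings accessor (one vector per input); adjust the accessor to whatever your installed version provides.
    my $batch = $ollama->embed(
        model => "nomic-embed-text",
        input => [
            "Why is the sky blue?",
            "Why is grass green?",
        ],
    );
    # One array ref of numbers per input string; the embeddings accessor is an assumption.
    my @vectors = @{ $batch->embeddings };
    printf "%d inputs, %d dimensions each\n", scalar @vectors, scalar @{ $vectors[0] };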
AUTHOR
LNATION, <email at lnation.org>
BUGS
Please report any bugs or feature requests to bug-webservice-ollama at rt.cpan.org, or through the web interface at https://rt.cpan.org/NoAuth/ReportBug.html?Queue=WebService-Ollama. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc WebService::Ollama
You can also look for information at:
- RT: CPAN's request tracker (report bugs here)
- CPAN Ratings
- Search CPAN
ACKNOWLEDGEMENTS
LICENSE AND COPYRIGHT
This software is Copyright (c) 2025 by LNATION.
This is free software, licensed under:
The Artistic License 2.0 (GPL Compatible)