NAME
WebService::Ollama - Ollama client
VERSION
Version 0.07
SYNOPSIS
    my $ollama = WebService::Ollama->new(
        model => 'llama3.2'
    );

    $ollama->load_completion_model;

    my $string = "";
    my $why = $ollama->completion(
        prompt    => 'Why is the sky blue?',
        stream    => 1,
        stream_cb => sub {
            my ($res) = @_;
            $string .= $res->response;
        }
    ); # returns all chunked responses as an array

    $ollama->unload_completion_model;
SUBROUTINES/METHODS
version
Retrieve the Ollama version
    $ollama->version;
create_model
Create a model from: another model, a safetensors directory or a GGUF file.
Parameters
- model - name of the model to create
- from - (optional) name of an existing model to create the new model from
- files - (optional) a dictionary of file names to SHA256 digests of blobs to create the model from
- adapters - (optional) a dictionary of file names to SHA256 digests of blobs for LoRA adapters
- template - (optional) the prompt template for the model
- license - (optional) a string or list of strings containing the license or licenses for the model
- system - (optional) a string containing the system prompt for the model
- parameters - (optional) a dictionary of parameters for the model (see Modelfile for a list of parameters)
- messages - (optional) a list of message objects used to create a conversation
- stream - (optional) if false the response will be returned as a single response object, rather than a stream of objects
- stream_cb - (optional) callback to handle streamed data
- quantize - (optional) quantize a non-quantized (e.g. float16) model
    $ollama->create_model(
        model  => 'mario',
        from   => 'llama3.2',
        system => 'You are Mario from Super Mario Bros.'
    );

    my $mario_story = $ollama->chat(
        model    => 'mario',
        messages => [
            {
                role    => 'user',
                content => 'Hello, Tell me a story.',
            }
        ],
    );
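The quantize parameter can be combined with from to produce a smaller copy of an existing model. A minimal sketch, assuming your Ollama server supports quantization during create; the target model name and quantization type below are illustrative values, not part of the original examples:

    # Hedged sketch: create a quantized copy of an existing model.
    # 'llama3.2-q4' is a hypothetical target name; 'q4_K_M' is one of the
    # quantization types Ollama accepts.
    $ollama->create_model(
        model    => 'llama3.2-q4',
        from     => 'llama3.2',
        quantize => 'q4_K_M',
    );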
copy_model
Copy a model. Creates a model with another name from an existing model.
Parameters

- source - name of the existing model
- destination - name for the copied model
    $ollama->copy_model(
        source      => 'llama3.2',
        destination => 'llama3-backup'
    );
delete_model
Delete a model and its data.
Parameters

- model - name of the model to delete
    $ollama->delete_model(
        model => 'mario'
    );
available_models
List models that are available locally.
    $ollama->available_models;
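A minimal sketch of listing the local model names; the json_response accessor and the models/name keys below are assumptions based on the Ollama /api/tags response shape, so verify them against your version of the module:

    # Hedged sketch: print the name of each locally available model.
    # json_response and the payload keys are assumptions taken from the
    # Ollama /api/tags response shape.
    my $available = $ollama->available_models;
    for my $model (@{ $available->json_response->{models} || [] }) {
        print $model->{name}, "\n";
    }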
running_models
List models that are currently loaded into memory.
    $ollama->running_models;
load_completion_model
Load a model into memory
    $ollama->load_completion_model;

    $ollama->load_completion_model(
        model => 'llava'
    );
unload_completion_model
Unload a model from memory
    $ollama->unload_completion_model;

    $ollama->unload_completion_model(
        model => 'llava'
    );
completion
Generate a response for a given prompt with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.
Parameters
- model - (required) the model name
- prompt - the prompt to generate a response for
- suffix - the text after the model response
- images - (optional) a list of base64-encoded images (for multimodal models such as llava)
- image_files - (optional) a list of image files
Advanced parameters (optional):
- format - the format to return a response in. Format can be json or a JSON schema
- options - additional model parameters listed in the documentation for the Modelfile, such as temperature
- system - system message (overrides what is defined in the Modelfile)
- template - the prompt template to use (overrides what is defined in the Modelfile)
- stream - if false the response will be returned as a single response object, rather than a stream of objects
- stream_cb - (optional) callback to handle streamed data
- raw - if true, no formatting will be applied to the prompt. You may choose to use the raw parameter if you are specifying a full templated prompt in your request to the API
- keep_alive - controls how long the model will stay loaded into memory following the request (default: 5m)
- context (deprecated) - the context parameter returned from a previous request to /generate; this can be used to keep a short conversational memory
    my $image = $ollama->completion(
        model       => 'llava',
        prompt      => 'What is in this image?',
        image_files => [
            "t/pingu.png"
        ]
    );

    my $json = $ollama->completion(
        prompt => "What color is the sky at different times of the day? Respond using JSON",
        format => "json",
    )->json_response;

    my $json2 = $ollama->completion(
        prompt => "Ollama is 22 years old and is busy saving the world. Respond using JSON",
        format => {
            type       => "object",
            properties => {
                age       => { "type" => "integer" },
                available => { "type" => "boolean" }
            },
            required => [ "age", "available" ]
        }
    )->json_response;
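The suffix parameter is intended for fill-in-the-middle style completion, where the model generates the text between prompt and suffix. A minimal sketch, assuming a pulled model that supports infill; the model name below is an assumption, not one of the original examples:

    # Hedged sketch: fill-in-the-middle completion via the suffix parameter.
    # 'codellama:code' is an assumed model name; use any pulled model that
    # supports infill.
    my $infill = $ollama->completion(
        model  => 'codellama:code',
        prompt => 'def compute_gcd(a, b):',
        suffix => "    return result",
    );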
load_chat_model
Load a model into memory
    $ollama->load_chat_model;

    $ollama->load_chat_model(
        model => 'llava'
    );
unload_chat_model
Unload a model from memory
    $ollama->unload_chat_model;

    $ollama->unload_chat_model(
        model => 'llava'
    );
chat
Generate the next message in a chat with a provided model.
Parameters
- model - (required) the model name
- messages - the messages of the chat; this can be used to keep a chat memory. The message object has the following fields: role (system, user, assistant or tool), content (the content of the message), and optionally images or image_files (as used in the examples below)
- tools - list of tools in JSON for the model to use, if supported (a hedged tool-calling sketch follows the examples below)
- format - the format to return a response in. Format can be json or a JSON schema.
- options - additional model parameters listed in the documentation for the Modelfile, such as temperature
- stream - if false the response will be returned as a single response object, rather than a stream of objects
- keep_alive - controls how long the model will stay loaded into memory following the request (default: 5m)
    my $completion = $ollama->chat(
        messages => [
            {
                role    => 'user',
                content => 'Why is the sky blue?',
            }
        ],
    );

    my $image = $ollama->chat(
        model    => 'llava',
        messages => [
            {
                role        => 'user',
                content     => 'What is in this image?',
                image_files => [
                    "t/pingu.png"
                ]
            }
        ]
    );

    my $json = $ollama->chat(
        messages => [
            {
                role    => "user",
                content => "Ollama is 22 years old and is busy saving the world. Respond using JSON",
            }
        ],
        format => {
            type       => "object",
            properties => {
                age       => { "type" => "integer" },
                available => { "type" => "boolean" }
            },
            required => [ "age", "available" ]
        }
    )->json_response;
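A minimal sketch of the tools parameter, assuming a model with tool support; the tool definition follows the Ollama /api/chat tools format, and the function name and schema below are hypothetical, not part of the original examples:

    # Hedged sketch: tool calling. 'get_current_weather' is a hypothetical
    # tool; the definition follows the Ollama /api/chat tools format.
    my $weather = $ollama->chat(
        model    => 'llama3.2',
        messages => [
            { role => 'user', content => 'What is the weather in Toronto?' }
        ],
        tools => [
            {
                type     => 'function',
                function => {
                    name        => 'get_current_weather',
                    description => 'Get the current weather for a city',
                    parameters  => {
                        type       => 'object',
                        properties => {
                            city => { type => 'string', description => 'The name of the city' },
                        },
                        required => ['city'],
                    },
                },
            }
        ],
    );

How tool calls are surfaced on the response object is not shown here; inspect the response for tool_calls before acting on it.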
embed
Generate embeddings from a model
Parameters
- model - name of model to generate embeddings from
- input - text or list of text to generate embeddings for
- truncate - (optional) truncates the end of each input to fit within context length. Returns error if false and context length is exceeded. Defaults to true
- options - (optional) additional model parameters listed in the documentation for the Modelfile, such as temperature
- keep_alive - (optional) controls how long the model will stay loaded into memory following the request (default: 5m)
    my $embeddings = $ollama->embed(
        model => "nomic-embed-text",
        input => "Why is the sky blue?"
    );
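Because input also accepts a list of text, several strings can be embedded in one call. A minimal sketch, assuming an array reference is passed straight through to the Ollama /api/embed endpoint:

    # Hedged sketch: embed several inputs in one request. Passing an array
    # reference mirrors the Ollama /api/embed API; verify the shape of the
    # returned embeddings in your version of the module.
    my $batch = $ollama->embed(
        model => "nomic-embed-text",
        input => [
            "Why is the sky blue?",
            "Why is grass green?",
        ],
    );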
AUTHOR
LNATION, <email at lnation.org>
BUGS
Please report any bugs or feature requests to bug-webservice-ollama at rt.cpan.org, or through the web interface at https://rt.cpan.org/NoAuth/ReportBug.html?Queue=WebService-Ollama. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
    perldoc WebService::Ollama
You can also look for information at:
- RT: CPAN's request tracker (report bugs here)
- CPAN Ratings
- Search CPAN
ACKNOWLEDGEMENTS
LICENSE AND COPYRIGHT
This software is Copyright (c) 2025 by LNATION.
This is free software, licensed under:
The Artistic License 2.0 (GPL Compatible)