NAME

WebService::Ollama - ollama client

VERSION

Version 0.05

SYNOPSIS

my $ollama = WebService::Ollama->new(
	base_url => 'http://localhost:11434',
	model => 'llama3.2'
);

$ollama->load_completion_model;

my $string = "";

my $why = $ollama->completion(
	prompt => 'Why is the sky blue?',
	stream => 1,
	stream_cb => sub {
		my ($res) = @_;
		$string .= $res->response;
	}
); # returns all chunked responses as an array

$ollama->unload_completion_model;

SUBROUTINES/METHODS

version

Retrieve the Ollama version

$ollama->version;
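
For example, to print the reported version; the version accessor on the returned object is an assumption for illustration, not something documented here:

# minimal sketch; the version accessor on the response object is assumed
my $res = $ollama->version;
print $res->version, "\n";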

create_model

Create a model from: another model, a safetensors directory or a GGUF file.

Parameters

model

name of the model to create

from

(optional) name of an existing model to create the new model from

files

(optional) a dictionary of file names to SHA256 digests of blobs to create the model from

adapters

(optional) a dictionary of file names to SHA256 digests of blobs for LoRA adapters

template

(optional) the prompt template for the model

license

(optional) a string or list of strings containing the license or licenses for the model

system

(optional) a string containing the system prompt for the model

parameters

(optional) a dictionary of parameters for the model (see Modelfile for a list of parameters)

messages

(optional) a list of message objects used to create a conversation

stream

(optional) if false the response will be returned as a single response object, rather than a stream of objects

stream_cb

(optional) callback to handle stream data

quantize

(optional) quantize a non-quantized (e.g. float16) model

$ollama->create_model(
	model => 'mario',
	from => 'llama3.2',
	system => 'You are Mario from Super Mario Bros.'
);


my $mario_story = $ollama->chat(
	model => 'mario',
	messages => [
		{
			role => 'user',
			content => 'Hello, Tell me a story.',
		}
	],
);
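
The quantize parameter can be combined with from to create a quantized copy of a non-quantized model. A minimal sketch, assuming a float16 tag such as llama3.2:3b-instruct-fp16 has already been pulled:

$ollama->create_model(
	model => 'llama3.2-quantized',
	from => 'llama3.2:3b-instruct-fp16',
	quantize => 'q4_K_M'
);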

copy_model

Copy a model. Creates a model with another name from an existing model.

Parameters

source

name of the existing model to copy from.

destination

name of the new model to copy to.

$ollama->copy_model(
	source => 'llama3.2',
	destination => 'llama3-backup'
);

delete_model

Delete a model and its data.

Parameters

model

model name to delete

$ollama->delete_model(
	model => 'mario'
);

available_models

List models that are available locally.

$ollama->available_models;
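
The response wraps the data returned by Ollama's /api/tags endpoint. A sketch of iterating over the local models; the models accessor and the hash keys are assumptions for illustration:

# list local model names; the models accessor is assumed, not documented here
my $res = $ollama->available_models;
for my $model (@{ $res->models }) {
	print $model->{name}, "\n";
}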

running_models

List models that are currently loaded into memory.

$ollama->running_models;

load_completion_model

Load a model into memory

$ollama->load_completion_model;

$ollama->load_completion_model(model => 'llava');

unload_completion_model

Unload a model from memory

$ollama->unload_completion_model;

$ollama->unload_completion_model(model => 'llava');

completion

Generate a response for a given prompt with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.

Parameters

model

(required) the model name

prompt

the prompt to generate a response for

suffix

the text after the model response

images

(optional) a list of base64-encoded images (for multimodal models such as llava)

image_files

(optional) a list of image files

Advanced parameters (optional):

format

the format to return a response in. Format can be json or a JSON schema

options

additional model parameters listed in the documentation for the Modelfile such as temperature

system

system message (overrides what is defined in the Modelfile)

template

the prompt template to use (overrides what is defined in the Modelfile)

stream

if false the response will be returned as a single response object, rather than a stream of objects

stream_cb

(optional) callback to handle stream data

raw

if true no formatting will be applied to the prompt. You may choose to use the raw parameter if you are specifying a full templated prompt in your request to the API

keep_alive

controls how long the model will stay loaded into memory following the request (default: 5m)

context (deprecated)

the context parameter returned from a previous request to /generate, this can be used to keep a short conversational memory

my $image = $ollama->completion(
	model => 'llava',
	prompt => 'What is in this image?',
	image_files => [
		"t/pingu.png"
	]
); 

my $json = $ollama->completion(
	prompt => "What color is the sky at different times of the day? Respond using JSON",
	format => "json",
)->json_response;

my $json2 = $ollama->completion(
	prompt => "Ollama is 22 years old and is busy saving the world. Respond using JSON",
	format => {
		type => "object",
		properties => {
			age => {
				"type" => "integer"
			},
			available => {
				"type" => "boolean"
			}
		},
		required => [
			"age",
			"available"
		]
	}
)->json_response;
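
The optional system and options parameters described above can be combined in a single call. A short sketch with illustrative values:

my $haiku = $ollama->completion(
	prompt => 'Write a haiku about the sea.',
	system => 'You are a terse poet.',
	options => {
		temperature => 0.2
	}
);

print $haiku->response;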

load_chat_model

Load a model into memory

$ollama->load_chat_model;

$ollama->load_chat_model(model => 'llava');

unload_chat_model

Unload a model from memory

$ollama->unload_chat_model;

$ollama->unload_chat_model(model => 'llava');

chat

Generate the next message in a chat with a provided model.

Parameters

model

(required) the model name

messages

the messages of the chat; this can be used to keep a chat memory

The message object has the following fields:

role

the role of the message, either system, user, assistant, or tool

content

the content of the message

images

(optional) a list of images to include in the message (for multimodal models such as llava)

tool_calls

(optional) a list of tools in JSON that the model wants to use

tools

list of tools in JSON for the model to use if supported

format

the format to return a response in. Format can be json or a JSON schema.

options

additional model parameters listed in the documentation for the Modelfile such as temperature

stream

if false the response will be returned as a single response object, rather than a stream of objects

keep_alive

controls how long the model will stay loaded into memory following the request (default: 5m)

my $completion = $ollama->chat(
	messages => [
		{
			role => 'user',
			content => 'Why is the sky blue?',
		}
	],
);


my $image = $ollama->chat(
	model => 'llava',
	messages => [
		{
			role => 'user',
			content => 'What is in this image?',
			image_files => [
				"t/pingu.png"
			]
		}
	]
);

my $json = $ollama->chat(
	messages => [
		{
			role => "user",
			"content" => "Ollama is 22 years old and is busy saving the world. Respond using JSON",
		}
	],
	format => {
		type => "object",
		properties => {
			age => {
				"type" => "integer"
			},
			available => {
				"type" => "boolean"
			}
		},
		required => [
			"age",
			"available"
		]
	}
)->json_response;
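
The tools parameter takes tool definitions in the JSON function-calling format accepted by Ollama. A sketch; the tool name and schema are illustrative, and how tool_calls come back on the response object is not covered here:

my $weather = $ollama->chat(
	messages => [
		{
			role => 'user',
			content => 'What is the weather today in Paris?',
		}
	],
	tools => [
		{
			type => 'function',
			function => {
				name => 'get_current_weather',
				description => 'Get the current weather for a location',
				parameters => {
					type => 'object',
					properties => {
						location => {
							type => 'string',
							description => 'The location to get the weather for'
						}
					},
					required => ['location']
				}
			}
		}
	]
);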

embed

Generate embeddings from a model

Parameters

model

name of model to generate embeddings from

input

text or list of text to generate embeddings for

truncate

(optional) truncates the end of each input to fit within context length. Returns error if false and context length is exceeded. Defaults to true

options

(optional) additional model parameters listed in the documentation for the Modelfile such as temperature

keep_alive

(optional) controls how long the model will stay loaded into memory following the request (default: 5m)

my $embeddings = $ollama->embed(
	model => "nomic-embed-text",
	input => "Why is the sky blue?"
);
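
Because input also accepts a list, several texts can be embedded in one request. A sketch; the embeddings accessor on the response object is an assumption for illustration:

# embed two texts at once; the embeddings accessor is assumed, not documented here
my $res = $ollama->embed(
	model => 'nomic-embed-text',
	input => [
		'Why is the sky blue?',
		'Why is grass green?'
	]
);

my ($sky, $grass) = @{ $res->embeddings };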

AUTHOR

LNATION, <email at lnation.org>

BUGS

Please report any bugs or feature requests to bug-webservice-ollama at rt.cpan.org, or through the web interface at https://rt.cpan.org/NoAuth/ReportBug.html?Queue=WebService-Ollama. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc WebService::Ollama

ACKNOWLEDGEMENTS

LICENSE AND COPYRIGHT

This software is Copyright (c) 2025 by LNATION.

This is free software, licensed under:

The Artistic License 2.0 (GPL Compatible)