NAME

Lugh::Prompt - Chat Template Formatting for LLM Conversations

VERSION

Version 0.01

SYNOPSIS

use Lugh::Prompt;

# Create prompt formatter for a specific format
my $prompt = Lugh::Prompt->new(format => 'chatml');

# Or auto-detect from model architecture
my $prompt = Lugh::Prompt->new(model => $model);

# Format messages into a prompt string
my $text = $prompt->apply(
    { role => 'system', content => 'You are a helpful assistant.' },
    { role => 'user',   content => 'Hello!' },
);
# Returns: "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n..."

# Shortcut functions
my $text = Lugh::Prompt::chatml(
    { role => 'user', content => 'Hello!' }
);
my $text = Lugh::Prompt::llama3(
    { role => 'user', content => 'Hello!' }
);

DESCRIPTION

Lugh::Prompt provides XS-based chat template formatting for LLM conversations. It converts a list of messages (each with a role and content) into the specific token format expected by different model families.

Supported Formats

  • chatml - ChatML format used by Qwen, Phi, Yi, and many others

  • llama2 - Llama 2 chat format with [INST] tags

  • llama3 - Llama 3 format with special tokens

  • gemma - Google Gemma format

  • mistral - Mistral Instruct format

  • zephyr - Zephyr format with <|role|> tags

  • alpaca - Alpaca instruction format

  • vicuna - Vicuna chat format

  • raw - No formatting, just concatenate content
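
For example, the same conversation produces different wire formats depending on the template. The comments below describe the general shape of each format; exact whitespace and token placement depend on the compiled templates.

use Lugh::Prompt;

my @messages = (
    { role => 'system', content => 'You are terse.' },
    { role => 'user',   content => 'Name a prime number.' },
);

# The same conversation rendered two ways: chatml wraps each turn in
# <|im_start|>role ... <|im_end|> markers, while llama3 delimits turns
# with <|start_header_id|> / <|eot_id|> special tokens.
my $as_chatml = Lugh::Prompt->new(format => 'chatml')->apply(@messages);
my $as_llama3 = Lugh::Prompt->new(format => 'llama3')->apply(@messages);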

CONSTRUCTOR

new

my $prompt = Lugh::Prompt->new(%options);

Creates a new prompt formatter.

Options:

  • format - Format name (chatml, llama2, llama3, mistral, gemma, etc.)

  • model - Lugh::Model object to auto-detect the format from its architecture (see the sketch below)
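
When a model object is supplied, the detected format can be checked afterwards with format_name. A minimal sketch, assuming $model is a Lugh::Model instance loaded elsewhere:

use Lugh::Prompt;

# Explicit format selection
my $chatml = Lugh::Prompt->new(format => 'chatml');

# Auto-detection from the model's architecture
my $auto = Lugh::Prompt->new(model => $model);
print 'Detected format: ', $auto->format_name, "\n";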

METHODS

format_name

my $name = $prompt->format_name;

Returns the name of the format being used.

apply

my $text = $prompt->apply(@messages, %options);

Formats a list of messages into a prompt string.

Messages: Each message is a hashref with:

  • role - 'system', 'user', or 'assistant'

  • content - The message text

Options:

  • add_generation_prompt - Append the tokens that cue the assistant's response (default: 1; see the sketch below)

  • add_bos - Add BOS token at start (default: 1)
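
A sketch of both options in use, for example when the formatted text will be appended to an existing context rather than used to start a fresh generation:

my $prompt = Lugh::Prompt->new(format => 'chatml');

# Defaults: BOS at the start and a trailing assistant cue, so the
# model generates the next turn.
my $for_generation = $prompt->apply(
    { role => 'system', content => 'You are a helpful assistant.' },
    { role => 'user',   content => 'Hello!' },
);

# Same call without the trailing cue and without a BOS token.
my $continuation = $prompt->apply(
    { role => 'user', content => 'Hello!' },
    add_generation_prompt => 0,
    add_bos               => 0,
);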

format_message

my $text = $prompt->format_message($role, $content);

Formats a single message with its role-specific prefix and suffix.
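
A minimal sketch; the exact markers depend on the format chosen at construction time:

my $prompt = Lugh::Prompt->new(format => 'chatml');

# One turn wrapped in the format's role markers, e.g.
# "<|im_start|>user\nHello!<|im_end|>\n" for chatml.
my $turn = $prompt->format_message('user', 'Hello!');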

CLASS METHODS

available_formats

my @formats = Lugh::Prompt->available_formats;

Returns a list of all available format names.
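
For example, to list every format your installation supports:

print "$_\n" for Lugh::Prompt->available_formats;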

format_for_architecture

my $format = Lugh::Prompt->format_for_architecture($arch);

Returns the recommended chat format for a given model architecture.
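
A sketch that falls back to chatml when the architecture is not recognised. The 'llama' architecture name, the fallback choice, and the assumption that an unknown architecture yields undef are illustrative, not part of the API:

my $arch   = 'llama';    # e.g. read from a model's metadata
my $format = Lugh::Prompt->format_for_architecture($arch) // 'chatml';
my $prompt = Lugh::Prompt->new(format => $format);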

has_format

my $bool = Lugh::Prompt->has_format($name);

Returns true if the named format exists.

get_format

my $info = Lugh::Prompt->get_format($name);

Returns a hashref with format details (prefixes, suffixes, tokens).
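
A sketch that guards the lookup with has_format before inspecting the details; the keys present depend on the format:

if (Lugh::Prompt->has_format('zephyr')) {
    my $info = Lugh::Prompt->get_format('zephyr');
    # $info is a hashref of the format's prefixes, suffixes and tokens
    print "$_\n" for sort keys %$info;
}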

SHORTCUT FUNCTIONS

chatml, llama2, llama3, mistral, gemma, zephyr, alpaca, vicuna, raw

my $text = Lugh::Prompt::chatml(@messages, %opts);
my $text = Lugh::Prompt::llama3(@messages, %opts);
# etc.

A shortcut function is provided for each supported format; each formats its messages without constructing a Lugh::Prompt object first.
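
The shortcuts take the same trailing options as apply (shown here with add_generation_prompt, on the assumption that the %opts above mirror apply's options):

# Equivalent to Lugh::Prompt->new(format => 'llama3')->apply(...),
# here without the trailing assistant cue.
my $text = Lugh::Prompt::llama3(
    { role => 'user', content => 'Hello!' },
    add_generation_prompt => 0,
);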

SEE ALSO

Lugh, Lugh::Model, Lugh::Inference

https://huggingface.co/docs/transformers/chat_templating - HuggingFace Chat Templates

AUTHOR

lnation <email@lnation.org>

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.