NAME

Lugh::RoPE - RoPE (Rotary Position Embedding) Scaling Configuration

SYNOPSIS

use Lugh::RoPE;

# Create a default (no scaling) config
my $rope = Lugh::RoPE->new();

# Linear scaling: extend 4K context to 16K
my $rope = Lugh::RoPE->linear(4096, 16384);

# YaRN scaling: extend 4K context to 32K
my $rope = Lugh::RoPE->yarn(4096, 32768);

# Use presets
my $rope = Lugh::RoPE->linear_2x(4096);   # 4K -> 8K
my $rope = Lugh::RoPE->linear_4x(4096);   # 4K -> 16K
my $rope = Lugh::RoPE->yarn_32k(4096);    # 4K -> 32K
my $rope = Lugh::RoPE->yarn_64k(4096);    # 4K -> 64K
my $rope = Lugh::RoPE->yarn_128k(4096);   # 4K -> 128K

# Manual configuration with all parameters
my $rope = Lugh::RoPE->new(
    scaling_type => 'yarn',
    n_ctx_orig   => 4096,
    target_ctx   => 32768,
    freq_base    => 10000.0,
    ext_factor   => -1.0,     # -1 = auto-compute
    attn_factor  => 1.0,
    beta_fast    => 32.0,
    beta_slow    => 1.0,
);

# Use with inference
$inference->forward($model, $tokens, { rope => $rope });

# Query configuration
say $rope->scaling_type_name;  # "yarn"
say $rope->freq_scale;         # 0.125 (4096/32768)

DESCRIPTION

Lugh::RoPE provides configuration for RoPE (Rotary Position Embedding) scaling, enabling models to handle context lengths beyond their training limit.

Scaling Methods

  • none - No scaling, use original context length

  • linear - Simple frequency interpolation. Works well for 2-4x extensions.

  • yarn - YaRN (Yet another RoPE extensioN). Combines NTK-aware interpolation with attention temperature scaling. Better suited to larger extensions (4-16x and beyond); see the selection example after this list.

  • longrope - LongRoPE method (experimental)
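
Which method to pick is a judgment call; as a rough rule of thumb (not something the module enforces), linear interpolation is usually enough up to about 4x, and YaRN beyond that:

my $n_ctx_orig = 4096;
my $target_ctx = 32768;

my $rope = ($target_ctx / $n_ctx_orig) <= 4
    ? Lugh::RoPE->linear($n_ctx_orig, $target_ctx)   # modest extension
    : Lugh::RoPE->yarn($n_ctx_orig, $target_ctx);    # large extension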

CONSTRUCTORS

new

my $rope = Lugh::RoPE->new(%options);

Create a new RoPE configuration with explicit parameters. A minimal example follows the option list.

Options:

scaling_type

Type of scaling: 'none', 'linear', 'yarn', or 'longrope'. Can also use constants: ROPE_SCALING_NONE, ROPE_SCALING_LINEAR, etc.

n_ctx_orig

Original training context length.

target_ctx

Target extended context length.

freq_base

Base frequency for RoPE. Default 0 (use model's value).

freq_scale

Frequency scaling factor. Auto-computed from n_ctx_orig/target_ctx if not set.

ext_factor

YaRN extension factor. -1.0 = auto-compute.

attn_factor

YaRN attention temperature factor. Default 1.0.

beta_fast

YaRN high-frequency boundary. Default 32.0.

beta_slow

YaRN low-frequency boundary. Default 1.0.
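
For instance, a minimal linear configuration that leaves the remaining options at their defaults (the context sizes here are illustrative):

my $rope = Lugh::RoPE->new(
    scaling_type => 'linear',
    n_ctx_orig   => 2048,
    target_ctx   => 8192,
    # freq_scale defaults to n_ctx_orig / target_ctx (0.25 here)
);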

none

my $rope = Lugh::RoPE->none();

Create a no-scaling configuration that uses the model's original context length.

linear

my $rope = Lugh::RoPE->linear($n_ctx_orig, $target_ctx);

Create linear scaling configuration.

yarn

my $rope = Lugh::RoPE->yarn($n_ctx_orig, $target_ctx, %yarn_opts);

Create YaRN scaling configuration. Optional YaRN parameters:

my $rope = Lugh::RoPE->yarn(4096, 32768,
    beta_fast => 16.0,
    beta_slow => 2.0,
);

PRESETS

Convenience constructors for common configurations:

linear_2x

my $rope = Lugh::RoPE->linear_2x($n_ctx_orig);

Linear scaling to 2x original context.

linear_4x

my $rope = Lugh::RoPE->linear_4x($n_ctx_orig);

Linear scaling to 4x original context.

yarn_32k

my $rope = Lugh::RoPE->yarn_32k($n_ctx_orig);

YaRN scaling to 32K context.

yarn_64k

my $rope = Lugh::RoPE->yarn_64k($n_ctx_orig);

YaRN scaling to 64K context.

yarn_128k

my $rope = Lugh::RoPE->yarn_128k($n_ctx_orig);

YaRN scaling to 128K context.

from_model

my $rope = Lugh::RoPE->from_model($model);

Extract RoPE configuration from a model's GGUF metadata. This reads all RoPE-related parameters that were stored when the model was created, including:

  • Scaling type (none, linear, yarn, longrope)

  • Original and target context lengths

  • Frequency base and scale

  • YaRN parameters (ext_factor, attn_factor, beta_fast, beta_slow)

Example:

use Lugh::Model;
use Lugh::RoPE;

my $model = Lugh::Model->new(file => 'model.gguf');
my $rope = Lugh::RoPE->from_model($model);

say "Model uses ", $rope->scaling_type_name, " scaling";
say "Original context: ", $rope->n_ctx_orig;

# Use extracted config (or override with forward())
$inference->forward(tokens => \@tokens, rope => $rope);

ACCESSORS

All accessors are read-only; a short usage example follows the list:

scaling_type - Scaling type as an integer constant
scaling_type_name - Scaling type as a string: 'none', 'linear', 'yarn', or 'longrope'
n_ctx_orig - Original (training) context length
target_ctx - Target (extended) context length
freq_base - Base frequency
freq_scale - Frequency scale factor
ext_factor - YaRN extension factor
attn_factor - YaRN attention temperature factor
beta_fast - YaRN high-frequency boundary
beta_slow - YaRN low-frequency boundary
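
For example, a small helper that summarizes a configuration (the helper name is hypothetical, not part of the module):

sub describe_rope {
    my ($rope) = @_;
    printf "%s scaling: %d -> %d tokens (freq_scale %.4f)\n",
        $rope->scaling_type_name,
        $rope->n_ctx_orig,
        $rope->target_ctx,
        $rope->freq_scale;
}

describe_rope( Lugh::RoPE->yarn(4096, 32768) );
# yarn scaling: 4096 -> 32768 tokens (freq_scale 0.1250)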

CONSTANTS

use Lugh::RoPE;

Lugh::RoPE::ROPE_SCALING_NONE()      # 0
Lugh::RoPE::ROPE_SCALING_LINEAR()    # 1
Lugh::RoPE::ROPE_SCALING_YARN()      # 2
Lugh::RoPE::ROPE_SCALING_LONGROPE()  # 3
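
The constants are mostly useful for comparing against the scaling_type accessor, for example:

my $rope = Lugh::RoPE->yarn(4096, 32768);

if ($rope->scaling_type == Lugh::RoPE::ROPE_SCALING_YARN()) {
    say "beta_fast = ", $rope->beta_fast, ", beta_slow = ", $rope->beta_slow;
}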

TECHNICAL DETAILS

Linear Scaling

Simple frequency interpolation:

freq_scale = n_ctx_orig / target_ctx

Each rotary angle is multiplied by freq_scale (equivalently, positions are compressed by the extension factor), so positions beyond the training length map back into the trained range.
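
A worked example in plain Perl (illustration of the arithmetic only, not the module's internals):

my $n_ctx_orig = 4096;
my $target_ctx = 16384;
my $freq_scale = $n_ctx_orig / $target_ctx;   # 0.25

my $pos           = 12000;                    # beyond the 4K training limit
my $effective_pos = $pos * $freq_scale;       # 3000, inside the trained range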

YaRN Scaling

YaRN (Yet another RoPE extensioN) uses a more sophisticated approach:

1. NTK-aware interpolation that preserves high-frequency components
2. Attention temperature scaling to compensate for entropy changes
3. Smooth blending between scaled and unscaled frequencies

Parameters (a numeric sketch follows the list):

  • ext_factor: Controls interpolation strength. -1 = auto-compute from the scale ratio.

  • attn_factor: Attention temperature scaling. 1.0 = no extra scaling.

  • beta_fast: High-frequency boundary. Frequency bands rotating faster than this over the original context receive little or no interpolation.

  • beta_slow: Low-frequency boundary. Bands rotating slower than this are fully interpolated.
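
For orientation, the key YaRN quantities can be computed by hand. The sketch below uses the formulas from the YaRN paper as they appear in common implementations (e.g. llama.cpp's ggml RoPE); it is illustrative only, not code from this module, and the head dimension of 128 is an assumption:

use POSIX qw(floor ceil);

my ($n_ctx_orig, $target_ctx) = (4096, 32768);
my ($freq_base,  $n_dims)     = (10000.0, 128);
my ($beta_fast,  $beta_slow)  = (32.0, 1.0);

my $scale  = $target_ctx / $n_ctx_orig;              # 8x extension
my $mscale = 1.0 + 0.1 * log($scale);                # attention temperature, ~1.21

# Band index at which a frequency completes $beta rotations over the
# original context (YaRN's "correction dimension").
sub corr_dim {
    my ($beta, $n_dims, $n_ctx_orig, $freq_base) = @_;
    my $pi = 4 * atan2(1, 1);
    return $n_dims * log($n_ctx_orig / ($beta * 2 * $pi)) / (2 * log($freq_base));
}

my $low  = floor(corr_dim($beta_fast, $n_dims, $n_ctx_orig, $freq_base));  # 20
my $high = ceil(corr_dim($beta_slow,  $n_dims, $n_ctx_orig, $freq_base));  # 46

# Bands below $low keep their original frequencies, bands above $high are
# fully interpolated, and YaRN blends smoothly in between.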

SEE ALSO

Lugh, Lugh::Inference, Lugh::Model

Paper references:

  • YaRN: Efficient Context Window Extension of Large Language Models (arXiv:2309.00071)

  • Extending Context Window of Large Language Models via Positional Interpolation (arXiv:2306.15595)

  • LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens (arXiv:2402.13753)

AUTHOR

Lugh Authors

LICENSE

Same as Perl itself.