NAME

OpenAIAsync::Client - IO::Async based client for OpenAI compatible APIs

SYNOPSIS

use IO::Async::Loop;
use OpenAIAsync::Client;

my $loop = IO::Async::Loop->new();

my $client = OpenAIAsync::Client->new();

$loop->add($client);

my $output = $client->chat({
    model => "gpt-3.5-turbo",
    messages => [
      {
        role => "system",
        content => "You are a helpful assistant that tells fanciful stories"
      },
      {
        role => "user",
        content => "Tell me a story of two princesses, Judy and Emmy.  Judy is 8 and Emmy is 2."
      }
    ],
    max_tokens => 1024,
})->get();

# $output is now an OpenAIAsync::Type::Response::ChatCompletion

THEORY OF OPERATION

This module implements the IO::Async::Notifier interface: you create a new client and then call $loop->add($client), which causes all the Futures the client creates to be part of your program's IO::Async::Loop. That way, when you call await on any method, it will properly suspend the execution of your program and do something else concurrently (probably waiting on requests).
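
As a minimal sketch of that flow (this assumes Future::AsyncAwait for the async/await syntax; the ask() wrapper is just a hypothetical helper for illustration):

use IO::Async::Loop;
use Future::AsyncAwait;
use OpenAIAsync::Client;

my $loop   = IO::Async::Loop->new();
my $client = OpenAIAsync::Client->new();
$loop->add($client);    # the client's Futures now run on this loop

# Inside an async sub, await suspends execution and lets the loop
# run other work until the HTTP response arrives.
async sub ask {
    my ($question) = @_;
    my $response = await $client->chat({
        model    => "gpt-3.5-turbo",
        messages => [ { role => "user", content => $question } ],
    });
    return $response;
}

my $answer = ask("Hello!")->get();    # drive the loop until done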

Methods

new()

Create a new OpenAIAsync::Client. You'll need to register the client with $loop->add($client) after creation. A fuller construction sketch follows the parameter list below.

PARAMETERS

  • api_base (optional)

    Base URL of the service to connect to. Defaults to https://api.openai.com/v1. This should point to something that implements the v1 OpenAI API; for Oobabooga's text-generation-webui that might be something like http://localhost:5000/v1.

    It will also be pulled from the environment variable OPENAI_API_BASE in the same fashion that the OpenAI libraries in other languages will do.

  • api_key (required)

    API key that will be passed to the service you call. It's sent as an Authorization header in all of the REST calls. It should be kept secret, as it can be used to make all kinds of calls to paid services.

    It will also be pulled from the environment variable OPENAI_API_KEY in the same fashion that the OpenAI libraries in other languages will do.

  • api_org_name (optional)

    A name for the organization that's making the call. OpenAI can use this to identify which part of your company made a specific request and, I believe, to help itemize billing and similar tasks.

  • http_user_agent (optional)

    Set the user agent used to contact the API service. Defaults to

    __PACKAGE__." Perl/$VERSION (Net::Async::HTTP/".$Net::Async::HTTP::VERSION." IO::Async/".$IO::Async::VERSION." Perl/$])"

    The default makes it easier to debug if we ever see weird issues with the requests being generated, but it does reveal some information about the code environment.

  • http_max_in_flight (optional)

    How many requests to allow in flight at once. Increasing this allows more parallel requests, but it can also let you make too many requests and run up API costs.

    Defaults to 2

  • http_max_connections_per_host (optional)

    TODO: this one will likely get dropped. Since we only ever connect to one server, it effectively functions the same as the parameter above.

    Defaults to 2

  • http_max_redirects (optional)

    How many redirects to allow. The official OpenAI API never sends redirects (for now), but self-hosted or other custom setups might, and that should be handled correctly.

    Defaults to 3

  • http_timeout (optional)

    How long to wait for any given request to start.

    Defaults to 120 seconds.

  • http_stall_timeout (optional)

    How long to wait before deciding a request has stalled. If a request starts responding and then stops part way through, this is how long we'll wait before treating it as stalled and timing it out.

    Defaults to 600s (10 minutes). This is unlikely to trigger except with a malfunctioning inference service, since once generation starts returning it'll almost certainly finish.

  • http_other (optional)

    A hash ref that gets passed as additional parameters to Net::Async::HTTP's constructor. All of its values are overridden by the dedicated parameters above, so prefer those when a parameter is supported.
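
Putting several of these together, construction might look like the following sketch. The values shown are illustrative only, not recommendations; pipeline is one example of a Net::Async::HTTP constructor option passed through http_other:

my $client = OpenAIAsync::Client->new(
    api_base           => "http://localhost:5000/v1",   # e.g. a local text-generation-webui
    api_key            => $ENV{OPENAI_API_KEY},
    http_max_in_flight => 4,
    http_timeout       => 60,
    http_other         => { pipeline => 0 },   # extra Net::Async::HTTP constructor args
);

$loop->add($client);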

completion (deprecated)

Create a completion request: this takes a prompt and returns a response. See OpenAIAsync::Types::Request::Completion for exact details.

This particular API has been deprecated by OpenAI in favor of doing everything through the chat completion API below. However, it is still supported by OpenAI and compatible servers, as it's a very simple interface to use.
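
A minimal (blocking) use might look like this sketch, assuming the usual OpenAI completion fields (model, prompt, max_tokens):

my $response = $client->completion({
    model      => "gpt-3.5-turbo-instruct",
    prompt     => "The capital of France is",
    max_tokens => 16,
})->get();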

chat

Create a request for the chat completion API. This takes a series of messages and returns a new chat response. See OpenAIAsync::Types::Request::ChatCompletion for exact details.

This API takes a series of messages from different agent roles and responds as the assistant agent. A typical interaction starts with a "system" message to set the context for the assistant, followed by a "user" message carrying the user's request. You'll then get the response from the assistant agent to give to the user.

To continue the chat, insert the assistant's new message into the list of messages along with the user's next reply, and make a new request, as in the sketch below. I'll be creating a new module that uses this API and helps manage the chat in an easier manner with a few helper functions.
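
For example, a continuation might look like this. Note that pulling the assistant's reply out via choices/message/content follows the OpenAI wire format; that accessor path is an assumption about OpenAIAsync::Types::Response::ChatCompletion here:

my @messages = (
    { role => "system", content => "You are a helpful assistant" },
    { role => "user",   content => "Tell me a story" },
);

my $response = $client->chat({
    model    => "gpt-3.5-turbo",
    messages => \@messages,
})->get();

# Append the assistant's reply, then the user's next message,
# and ask again with the full history.
push @messages, {
    role    => "assistant",
    content => $response->choices->[0]->message->content,   # assumed accessor path
};
push @messages, { role => "user", content => "Now make it rhyme" };

$response = $client->chat({
    model    => "gpt-3.5-turbo",
    messages => \@messages,
})->get();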

embedding

Create a request for calculating the embedding of an input. This takes a bit of text and returns a gigantic list of numbers; see OpenAIAsync::Types::Request::Embedding for exact details.

It's a bit difficult to explain how these values work, but essentially you get a mathematical object, a vector, that describes the contents of the input as a point in an N-dimensional space (typically 768 or 1536 dimensions). The dimensions themselves don't have any inherent mathematical meaning; they're only meaningful relative to one another, as learned from the training data of the embedding model.

You'll want to take the vector and store it in a database that supports vector operations, like PostgreSQL with the pgvector extension.
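
As a sketch of how the vectors get used, here's a cosine similarity comparison between two embeddings. Extracting the raw vector via data->[0]->embedding follows the OpenAI wire format and is an assumption about OpenAIAsync::Types::Response::Embedding; the cosine() helper is ours:

use List::Util qw(sum);

# Cosine similarity: dot product divided by the product of magnitudes.
# Values near 1 mean the inputs are semantically similar.
sub cosine {
    my ($x, $y) = @_;
    my $dot   = sum(map { $x->[$_] * $y->[$_] } 0 .. $#$x);
    my $mag_x = sqrt(sum(map { $_ ** 2 } @$x));
    my $mag_y = sqrt(sum(map { $_ ** 2 } @$y));
    return $dot / ($mag_x * $mag_y);
}

my $e1 = $client->embedding({ model => "text-embedding-ada-002", input => "A tabby cat" })->get();
my $e2 = $client->embedding({ model => "text-embedding-ada-002", input => "A small housecat" })->get();

my $v1 = $e1->data->[0]->embedding;    # assumed accessor path
my $v2 = $e2->data->[0]->embedding;

printf "similarity: %.3f\n", cosine($v1, $v2);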

image_generate

Unimplemented, but once present it will be used to generate images with DALL-E (or, for self-hosted setups, Stable Diffusion).

text_to_speech

Unimplemented, but once present it will turn text into speech using whatever algorithms/models the service supports.

speech_to_text

Unimplemented. The opposite of the above.

vision

Unimplemented. I've not investigated this one much yet, but I believe it's for getting a description of an image and its contents.

Missing APIs

At a minimum, APIs for getting the list of models and some other meta information are missing; those will be added next, after I get some more documentation written.

See Also

IO::Async, Future::AsyncAwait, Net::Async::HTTP

License

Artistic 2.0

Author

Ryan Voots, ... etc.