NAME

OpenAPI::Client::OpenAI::Path::evals-eval_id-runs - Documentation for the /evals/{eval_id}/runs path.

DESCRIPTION

This document describes the /evals/{eval_id}/runs API endpoint, which supports listing the runs of an evaluation (GET) and creating a new run (POST).

PATHS

GET /evals/{eval_id}/runs

Get eval runs

Get a list of runs for an evaluation.

Operation ID

getEvalRuns

$client->getEvalRuns( ... );

Parameters

  • eval_id (in path) (Required) - The ID of the evaluation to retrieve runs for.

    Type: string

  • after (in query) (Optional) - Identifier for the last run from the previous pagination request.

    Type: string

  • limit (in query) (Optional) - Number of runs to retrieve.

    Type: integer

    Default: 20

  • order (in query) (Optional) - Sort order for runs by timestamp. Use `asc` for ascending order or `desc` for descending order. Defaults to `asc`.

    Type: string

    Allowed values: asc, desc

    Default: asc

  • status (in query) (Optional) - Filter runs by status. One of `queued` | `in_progress` | `failed` | `completed` | `canceled`.

    Type: string

    Allowed values: queued, in_progress, completed, canceled, failed
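
A minimal sketch of how the parameters above might be passed from Perl. It assumes, following the usual OpenAPI::Client convention, that the operation method takes a single hash reference of parameters and returns a Mojolicious-style transaction whose decoded body is available as `$tx->res->json`. The helper name `list_eval_runs` is hypothetical, and the local validation only mirrors the allowed values listed above.

```perl
use strict;
use warnings;

# Allowed values copied from the parameter list above.
my %ALLOWED = (
    order  => { map { $_ => 1 } qw(asc desc) },
    status => { map { $_ => 1 } qw(queued in_progress completed canceled failed) },
);

# Hypothetical helper: validate the optional query parameters locally,
# then call getEvalRuns on the supplied client.
sub list_eval_runs {
    my ($client, $eval_id, %query) = @_;
    die "eval_id is required\n" unless defined $eval_id && length $eval_id;

    for my $name (sort keys %ALLOWED) {
        next unless defined $query{$name};
        die "invalid $name: $query{$name}\n"
            unless $ALLOWED{$name}{ $query{$name} };
    }
    if (defined $query{limit}) {
        die "limit must be an integer\n" unless $query{limit} =~ /\A[0-9]+\z/;
    }

    # Assumption: parameters go in one hash reference and the call returns
    # a transaction object exposing res->json.
    my $tx = $client->getEvalRuns({ eval_id => $eval_id, %query });
    return $tx->res->json;
}

# Usage (needs OpenAPI::Client::OpenAI and an API key; the ID is a placeholder):
#   my $client = OpenAPI::Client::OpenAI->new;
#   my $page   = list_eval_runs($client, 'eval_...', limit => 10, status => 'completed');
#   print "$_->{id}\t$_->{status}\n" for @{ $page->{data} };
```

Validating `order` and `status` before the request is a design choice rather than a requirement; it surfaces typos without a round trip to the server.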

Responses

Status Code: 200

A list of runs for the evaluation

Content Types:

  • application/json

    Example (See the OpenAI spec for more detail):

    {
      "object": "list",
      "data": [
        {
          "object": "eval.run",
          "id": "evalrun_67b7fbdad46c819092f6fe7a14189620",
          "eval_id": "eval_67b7fa9a81a88190ab4aa417e397ea21",
          "report_url": "https://platform.openai.com/evaluations/eval_67b7fa9a81a88190ab4aa417e397ea21?run_id=evalrun_67b7fbdad46c819092f6fe7a14189620",
          "status": "completed",
          "model": "o3-mini",
          "name": "Academic Assistant",
          "created_at": 1740110812,
          "result_counts": {
            "total": 171,
            "errored": 0,
            "failed": 80,
            "passed": 91
          },
          "per_model_usage": null,
          "per_testing_criteria_results": [
            {
              "testing_criteria": "String check grader",
              "passed": 91,
              "failed": 80
            }
          ],
          "run_data_source": {
            "type": "completions",
            "template_messages": [
              {
                "type": "message",
                "role": "system",
                "content": {
                  "type": "input_text",
                  "text": "You are a helpful assistant."
                }
              },
              {
                "type": "message",
                "role": "user",
                "content": {
                  "type": "input_text",
                  "text": "Hello, can you help me with my homework?"
                }
              }
            ],
            "datasource_reference": null,
            "model": "o3-mini",
            "max_completion_tokens": null,
            "seed": null,
            "temperature": null,
            "top_p": null
          },
          "error": null,
          "metadata": {"test": "synthetics"}
        }
      ],
      "first_id": "evalrun_67abd54d60ec8190832b46859da808f7",
      "last_id": "evalrun_67abd54d60ec8190832b46859da808f7",
      "has_more": false
    }
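
The `after` parameter together with the `last_id` and `has_more` fields suggests cursor-style pagination. The sketch below walks every page by feeding each response's `last_id` back as `after`; that pairing is an assumption drawn from the parameter description, and `$fetch_page` is a hypothetical callback standing in for the real `getEvalRuns` call. A second helper computes a pass rate from `result_counts`.

```perl
use strict;
use warnings;

# Collect runs from every page. $fetch_page is a code reference that receives
# a hash of query parameters and returns one decoded list response.
sub all_eval_runs {
    my ($fetch_page, %query) = @_;
    my @runs;
    while (1) {
        my $page = $fetch_page->(%query);
        push @runs, @{ $page->{data} || [] };
        last unless $page->{has_more} && defined $page->{last_id};
        $query{after} = $page->{last_id};    # assumed cursor for the next request
    }
    return \@runs;
}

# Share of passed items for one run, or undef when nothing was evaluated.
sub pass_rate {
    my ($run) = @_;
    my $counts = $run->{result_counts} or return;
    return unless $counts->{total};
    return $counts->{passed} / $counts->{total};
}

# Usage with a real client (assumes the $client->getEvalRuns(\%params) convention):
#   my $runs = all_eval_runs(
#       sub { my %q = @_; $client->getEvalRuns({ eval_id => $eval_id, %q })->res->json },
#       limit => 20,
#   );
#   printf "%s: %.1f%%\n", $_->{id}, 100 * (pass_rate($_) // 0) for @$runs;
```

For the run in the example response above, `pass_rate` evaluates 91 / 171, roughly 0.53.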

POST /evals/{eval_id}/runs

Create eval run

Starts a new run for the given evaluation, specifying the data source and the model configuration to test. The data source is validated against the schema specified in the evaluation's config.

Operation ID

createEvalRun

$client->createEvalRun( ... );

Parameters

  • eval_id (in path) (Required) - The ID of the evaluation to create a run for.

    Type: string

Request Body

Content Type: application/json
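
This document does not list the request body schema, so the following is only a sketch. It assumes the request accepts a `name` plus a `data_source` object shaped like the one echoed in the 201 response example, and that the body is passed under a `body` key as in other OpenAPI::Client calls. The helper `build_run_body` and the run name are hypothetical.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper: build a request body whose shape mirrors the
# "data_source" object shown in the 201 response example.
sub build_run_body {
    my (%args) = @_;
    my @content = map { +{ item => $_ } } @{ $args{items} };
    return {
        name        => $args{name},
        data_source => {
            type           => 'completions',
            model          => $args{model},
            source         => { type => 'file_content', content => \@content },
            input_messages => {
                type     => 'template',
                template => [
                    {
                        type    => 'message',
                        role    => 'user',
                        content => { type => 'input_text', text => '{{item.input}}' },
                    },
                ],
            },
        },
    };
}

my $body = build_run_body(
    name  => 'headline-classifier',
    model => 'gpt-4o-mini',
    items => [
        {
            input        => 'Stock Markets Rally After Positive Economic Data Released',
            ground_truth => 'Markets',
        },
    ],
);

# Print the JSON that would be sent, with keys sorted for readability.
print JSON::PP->new->canonical->pretty->encode($body);

# Assumed calling convention, not confirmed by this document:
#   my $tx = $client->createEvalRun({ eval_id => 'eval_...', body => $body });
```

Consult the OpenAI spec for the authoritative list of required and optional body fields before relying on this shape.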

Responses

Status Code: 201

Successfully created a run for the evaluation

Content Types:

  • application/json

    Example (See the OpenAI spec for more detail):

    {
      "object": "eval.run",
      "id": "evalrun_67e57965b480819094274e3a32235e4c",
      "eval_id": "eval_67e579652b548190aaa83ada4b125f47",
      "report_url": "https://platform.openai.com/evaluations/eval_67e579652b548190aaa83ada4b125f47?run_id=evalrun_67e57965b480819094274e3a32235e4c",
      "status": "queued",
      "model": "gpt-4o-mini",
      "name": "gpt-4o-mini",
      "created_at": 1743092069,
      "result_counts": {
        "total": 0,
        "errored": 0,
        "failed": 0,
        "passed": 0
      },
      "per_model_usage": null,
      "per_testing_criteria_results": null,
      "data_source": {
        "type": "completions",
        "source": {
          "type": "file_content",
          "content": [
            {
              "item": {
                "input": "Tech Company Launches Advanced Artificial Intelligence Platform",
                "ground_truth": "Technology"
              }
            },
            {
              "item": {
                "input": "Central Bank Increases Interest Rates Amid Inflation Concerns",
                "ground_truth": "Markets"
              }
            },
            {
              "item": {
                "input": "International Summit Addresses Climate Change Strategies",
                "ground_truth": "World"
              }
            },
            {
              "item": {
                "input": "Major Retailer Reports Record-Breaking Holiday Sales",
                "ground_truth": "Business"
              }
            },
            {
              "item": {
                "input": "National Team Qualifies for World Championship Finals",
                "ground_truth": "Sports"
              }
            },
            {
              "item": {
                "input": "Stock Markets Rally After Positive Economic Data Released",
                "ground_truth": "Markets"
              }
            },
            {
              "item": {
                "input": "Global Manufacturer Announces Merger with Competitor",
                "ground_truth": "Business"
              }
            },
            {
              "item": {
                "input": "Breakthrough in Renewable Energy Technology Unveiled",
                "ground_truth": "Technology"
              }
            },
            {
              "item": {
                "input": "World Leaders Sign Historic Climate Agreement",
                "ground_truth": "World"
              }
            },
            {
              "item": {
                "input": "Professional Athlete Sets New Record in Championship Event",
                "ground_truth": "Sports"
              }
            },
            {
              "item": {
                "input": "Financial Institutions Adapt to New Regulatory Requirements",
                "ground_truth": "Business"
              }
            },
            {
              "item": {
                "input": "Tech Conference Showcases Advances in Artificial Intelligence",
                "ground_truth": "Technology"
              }
            },
            {
              "item": {
                "input": "Global Markets Respond to Oil Price Fluctuations",
                "ground_truth": "Markets"
              }
            },
            {
              "item": {
                "input": "International Cooperation Strengthened Through New Treaty",
                "ground_truth": "World"
              }
            },
            {
              "item": {
                "input": "Sports League Announces Revised Schedule for Upcoming Season",
                "ground_truth": "Sports"
              }
            }
          ]
        },
        "input_messages": {
          "type": "template",
          "template": [
            {
              "type": "message",
              "role": "developer",
              "content": {
                "type": "input_text",
                "text": "Categorize a given news headline into one of the following topics: Technology, Markets, World, Business, or Sports.\n\n# Steps\n\n1. Analyze the content of the news headline to understand its primary focus.\n2. Extract the subject matter, identifying any key indicators or keywords.\n3. Use the identified indicators to determine the most suitable category out of the five options: Technology, Markets, World, Business, or Sports.\n4. Ensure only one category is selected per headline.\n\n# Output Format\n\nRespond with the chosen category as a single word. For instance: \"Technology\", \"Markets\", \"World\", \"Business\", or \"Sports\".\n\n# Examples\n\n**Input**: \"Apple Unveils New iPhone Model, Featuring Advanced AI Features\"  \n**Output**: \"Technology\"\n\n**Input**: \"Global Stocks Mixed as Investors Await Central Bank Decisions\"  \n**Output**: \"Markets\"\n\n**Input**: \"War in Ukraine: Latest Updates on Negotiation Status\"  \n**Output**: \"World\"\n\n**Input**: \"Microsoft in Talks to Acquire Gaming Company for $2 Billion\"  \n**Output**: \"Business\"\n\n**Input**: \"Manchester United Secures Win in Premier League Football Match\"  \n**Output**: \"Sports\" \n\n# Notes\n\n- If the headline appears to fit into more than one category, choose the most dominant theme.\n- Keywords or phrases such as \"stocks\", \"company acquisition\", \"match\", or technological brands can be good indicators for classification.\n"
              }
            },
            {
              "type": "message",
              "role": "user",
              "content": {
                "type": "input_text",
                "text": "{{item.input}}"
              }
            }
          ]
        },
        "model": "gpt-4o-mini",
        "sampling_params": {
          "seed": 42,
          "temperature": 1.0,
          "top_p": 1.0,
          "max_completion_tokens": 2048
        }
      },
      "error": null,
      "metadata": {}
    }

Status Code: 400

Bad request (for example, missing eval object)
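
One way to separate the 201 and 400 outcomes is sketched below. It assumes a Mojolicious-style transaction exposing `$tx->res->code` and `$tx->res->json`; the helper name `run_or_die` is hypothetical, and the error body is not inspected because its format is not described here.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a createEvalRun transaction into either the
# decoded run (status 201) or an exception (anything else).
sub run_or_die {
    my ($tx) = @_;
    my $res  = $tx->res;
    my $code = $res->code // 0;    # treat a missing status code as 0

    return $res->json if $code == 201;
    die "Bad request (400): check eval_id and the request body\n" if $code == 400;
    die "Unexpected status code: $code\n";
}

# Usage (assumed calling convention; $body is the request body hash reference):
#   my $run = eval { run_or_die($client->createEvalRun({ eval_id => $id, body => $body })) };
#   warn "run not created: $@" unless $run;
```

Raising an exception keeps the success path free of status checks; callers that prefer return values can wrap the call in `eval` as shown in the usage comment.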

SEE ALSO

OpenAPI::Client::OpenAI::Path

COPYRIGHT AND LICENSE

Copyright (C) 2023-2025 by Nelson Ferraz

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.0 or, at your option, any later version of Perl 5 you may have available.