NAME

OpenAPI::Client::OpenAI::Path::evals - Documentation for the /evals path.

OPERATIONS

GET /evals

listEvals

$client->list_evals({
    after    => ...,
    limit    => ...,
    order    => ...,
    order_by => ...,
});

List evaluations for a project.

Path/query parameters

  • after (in query, optional, string) - Identifier for the last eval from the previous pagination request.

  • limit (in query, optional, integer) - Number of evals to retrieve.

    Default: 20

  • order (in query, optional, string) - Sort order for evals by timestamp. Use asc for ascending order or desc for descending order.

    Allowed values: asc, desc

    Default: asc

  • order_by (in query, optional, string) - Evals can be ordered by creation time or last updated time. Use created_at for creation time or updated_at for last updated time.

    Allowed values: created_at, updated_at

    Default: created_at
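The parameters above can be assembled and checked before a request is made. The following is a minimal sketch, not part of OpenAPI::Client::OpenAI: the helper name build_list_evals_params is hypothetical, and the requirement that limit be a positive integer is an assumption (the description above only says it is an integer).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of OpenAPI::Client::OpenAI): checks the
# documented query parameters and fills in the documented defaults.
my %ALLOWED = (
    order    => [qw(asc desc)],
    order_by => [qw(created_at updated_at)],
);
my %DEFAULT = (limit => 20, order => 'asc', order_by => 'created_at');

sub build_list_evals_params {
    my (%args) = @_;
    my %params = (%DEFAULT, %args);

    # Assumption: only positive integers make sense for limit.
    die "limit must be a positive integer\n"
        unless $params{limit} =~ /\A[1-9][0-9]*\z/;

    for my $name (sort keys %ALLOWED) {
        my $value = $params{$name};
        die "$name must be one of: @{ $ALLOWED{$name} }\n"
            unless grep { $_ eq $value } @{ $ALLOWED{$name} };
    }

    # 'after' is optional; drop it when it was not supplied.
    delete $params{after} unless defined $params{after};

    return \%params;
}

my $params = build_list_evals_params(order => 'desc', limit => 5);
print join(', ', map { "$_=$params->{$_}" } sort keys %$params), "\n";
```

The returned hash reference could then be passed as the first argument to list_evals, assuming the client takes query parameters as top-level keys.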

Responses

200 - A list of evals

Content-Type: application/json

Example:

{
  "object": "list",
  "data": [
    {
      "object": "eval",
      "id": "eval_67abd54d9b0081909a86353f6fb9317a",
      "data_source_config": {
        "type": "custom",
        "schema": {
          "type": "object",
          "properties": {
            "item": {
              "type": "object",
              "properties": {
                "input": {
                  "type": "string"
                },
                "ground_truth": {
                  "type": "string"
                }
              },
              "required": [
                "input",
                "ground_truth"
              ]
            }
          },
          "required": [
            "item"
          ]
        }
      },
      "testing_criteria": [
        {
          "name": "String check",
          "id": "String check-2eaf2d8d-d649-4335-8148-9535a7ca73c2",
          "type": "string_check",
          "input": "{{item.input}}",
          "reference": "{{item.ground_truth}}",
          "operation": "eq"
        }
      ],
      "name": "External Data Eval",
      "created_at": 1739314509,
      "metadata": {}
    }
  ],
  "first_id": "eval_67abd54d9b0081909a86353f6fb9317a",
  "last_id": "eval_67abd54d9b0081909a86353f6fb9317a",
  "has_more": true
}

POST /evals

createEval

$client->create_eval({
    body => { ... },
});

Create the structure of an evaluation that can be used to test a model's performance. An evaluation is a set of testing criteria plus the configuration for a data source, which dictates the schema of the data used in the evaluation. After creating an evaluation, you can run it on different models and model parameters. Several types of graders and data sources are supported. For more information, see the Evals guide.

Responses

201 - OK

Content-Type: application/json

Example:

{
  "object": "eval",
  "id": "eval_67abd54d9b0081909a86353f6fb9317a",
  "data_source_config": {
    "type": "custom",
    "item_schema": {
      "type": "object",
      "properties": {
        "label": {"type": "string"}
      },
      "required": ["label"]
    },
    "include_sample_schema": true
  },
  "testing_criteria": [
    {
      "name": "My string check grader",
      "type": "string_check",
      "input": "{{sample.output_text}}",
      "reference": "{{item.label}}",
      "operation": "eq"
    }
  ],
  "name": "External Data Eval",
  "created_at": 1739314509,
  "metadata": {
    "test": "synthetics"
  }
}

SCHEMAS

CreateEvalRequest

Properties:

  • data_source_config (object, required) - The configuration for the data source used for the evaluation runs. Dictates the schema of the data used in the evaluation.

  • metadata (Metadata)

    See "Metadata" below for shape.

  • name (string) - The name of the evaluation.

  • testing_criteria (array of object, required) - A list of graders for all eval runs in this group. Graders can reference variables in the data source using double curly brace notation, like {{item.variable_name}}. To reference the model's output, use the sample namespace (i.e., {{sample.output_text}}).
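As an illustration of the properties above, a request body can be built as a plain Perl hash. This is a sketch only: the field values are copied from the createEval response example, and whether the API accepts exactly this shape for every data source type is an assumption.

```perl
use strict;
use warnings;
use JSON::PP ();

# Request body modelled on the createEval response example; the values are
# illustrative only.
my %body = (
    name               => 'External Data Eval',
    data_source_config => {
        type        => 'custom',
        item_schema => {
            type       => 'object',
            properties => { label => { type => 'string' } },
            required   => ['label'],
        },
        include_sample_schema => JSON::PP::true(),
    },
    testing_criteria => [
        {
            name      => 'My string check grader',
            type      => 'string_check',
            input     => '{{sample.output_text}}',   # the model's output
            reference => '{{item.label}}',           # a data source variable
            operation => 'eq',
        },
    ],
    metadata => { test => 'synthetics' },
);

# Check the two properties the schema marks as required.
for my $field (qw(data_source_config testing_criteria)) {
    die "missing required field: $field\n" unless exists $body{$field};
}

print JSON::PP->new->canonical->pretty->encode(\%body);
```

The hash could then be supplied as body => \%body in the create_eval call shown earlier.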

Eval

Properties:

  • created_at (integer, required) - The Unix timestamp (in seconds) for when the eval was created.

  • data_source_config (object, required) - Configuration of data sources used in runs of the evaluation.

  • id (string, required) - Unique identifier for the evaluation.

  • metadata (Metadata, required)

    See "Metadata" below for shape.

  • name (string, required) - The name of the evaluation.

  • object (string, required) - The object type.

    Allowed values: eval

    Default: eval

  • testing_criteria (array of object, required) - A list of testing criteria.
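Because created_at is a Unix timestamp in seconds, it usually needs converting before display. The sketch below decodes a trimmed-down eval object (values taken from the response examples earlier in this document) with the core JSON::PP module and formats the timestamp in UTC; the choice of output format is an assumption.

```perl
use strict;
use warnings;
use JSON::PP ();
use POSIX qw(strftime);

# Trimmed-down eval object using values from the response examples.
my $json = '{"object":"eval","id":"eval_67abd54d9b0081909a86353f6fb9317a",'
         . '"name":"External Data Eval","created_at":1739314509,"metadata":{}}';

my $eval = JSON::PP->new->decode($json);

# created_at is in seconds since the epoch; format it as UTC ISO 8601.
my $created = strftime('%Y-%m-%dT%H:%M:%SZ', gmtime $eval->{created_at});

print "$eval->{name} ($eval->{id}) created at $created\n";
```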

EvalList

Properties:

  • data (array of Eval, required) - An array of eval objects.

  • first_id (string, required) - The identifier of the first eval in the data array.

  • has_more (boolean, required) - Indicates whether there are more evals available.

  • last_id (string, required) - The identifier of the last eval in the data array.

  • object (string, required) - The type of this object. It is always set to "list".

    Allowed values: list

    Default: list
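The last_id and has_more properties can drive cursor pagination together with the after query parameter. The following sketch uses a stand-in fetch_page function serving fake data so that it runs without network access; fetch_page, all_evals, and the fake ids are hypothetical, and a real implementation would call list_evals in their place.

```perl
use strict;
use warnings;

# Fake data standing in for the API; a real fetcher would call list_evals.
my @FAKE = map { +{ object => 'eval', id => "eval_$_" } } 1 .. 3;

sub fetch_page {
    my (%query) = @_;
    my $limit = $query{limit} // 20;

    # Start just past the eval named by 'after', or at the beginning.
    my $start = 0;
    if (defined $query{after}) {
        for my $i (0 .. $#FAKE) {
            $start = $i + 1 if $FAKE[$i]{id} eq $query{after};
        }
    }

    my $end = $start + $limit - 1;
    $end = $#FAKE if $end > $#FAKE;
    my @data = @FAKE[$start .. $end];

    return {
        object   => 'list',
        data     => \@data,
        first_id => (@data ? $data[0]{id}  : undef),
        last_id  => (@data ? $data[-1]{id} : undef),
        has_more => ($end < $#FAKE ? 1 : 0),
    };
}

# Collect every eval by following last_id until has_more is false.
sub all_evals {
    my ($fetch, %query) = @_;
    my @all;
    while (1) {
        my $page = $fetch->(%query);
        push @all, @{ $page->{data} };
        last unless $page->{has_more};
        $query{after} = $page->{last_id};    # resume after the last eval seen
    }
    return @all;
}

my @evals = all_evals(\&fetch_page, limit => 2);
print join(', ', map { $_->{id} } @evals), "\n";
```

Passing the fetcher as a code reference keeps the loop independent of how a page is obtained, so the same loop can be exercised in tests with fake data.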

Metadata

A set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format and for querying objects via the API or the dashboard.

Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
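These limits are easy to check on the client side before sending a request. The following is a minimal sketch; metadata_problems is a hypothetical helper, not part of the library, and it measures length in characters as Perl's length does.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of problems, empty when the metadata
# satisfies the limits described above.
sub metadata_problems {
    my ($metadata) = @_;
    my @problems;

    my $count = keys %$metadata;
    push @problems, "too many pairs ($count > 16)" if $count > 16;

    for my $key (sort keys %$metadata) {
        my $value = $metadata->{$key};
        push @problems, "key '$key' is longer than 64 characters"
            if length($key) > 64;
        if (!defined $value || ref $value) {
            push @problems, "value for '$key' is not a string";
        }
        elsif (length($value) > 512) {
            push @problems, "value for '$key' is longer than 512 characters";
        }
    }
    return @problems;
}

my @ok  = metadata_problems({ test => 'synthetics' });
my @bad = metadata_problems({ note => 'x' x 513 });
print scalar(@ok), " problem(s); ", scalar(@bad), " problem(s): @bad\n";
```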

SEE ALSO

OpenAPI::Client::OpenAI::Path

COPYRIGHT AND LICENSE

Copyright (C) 2023-2026 by Nelson Ferraz

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.0 or, at your option, any later version of Perl 5 you may have available.