NAME
OpenAPI::Client::OpenAI::Path::evals - Documentation for the /evals path.
OPERATIONS
GET /evals
listEvals
    $client->list_evals({
        body => { ... },
    });
List evaluations for a project.
Path/query parameters
after (in query, optional, string) - Identifier for the last eval from the previous pagination request.
limit (in query, optional, integer) - Number of evals to retrieve.
Default: 20
order (in query, optional, string) - Sort order for evals by timestamp. Use asc for ascending order or desc for descending order.
Allowed values: asc, desc
Default: asc
order_by (in query, optional, string) - Evals can be ordered by creation time or last updated time. Use created_at for creation time or updated_at for last updated time.
Allowed values: created_at, updated_at
Default: created_at
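The defaults and allowed values listed above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the client: build_list_evals_query is a hypothetical helper, and the integer check on limit is an assumption (the list above only says the parameter is an integer).

```perl
use strict;
use warnings;

# Allowed values and defaults, copied from the parameter list above.
my %ALLOWED = (
    order    => [qw(asc desc)],
    order_by => [qw(created_at updated_at)],
);
my %DEFAULT = (limit => 20, order => 'asc', order_by => 'created_at');

# Hypothetical helper (not part of the client): merge caller options with
# the documented defaults and reject values outside the allowed sets.
sub build_list_evals_query {
    my (%opts) = @_;
    my %query = (%DEFAULT, %opts);

    for my $name (sort keys %ALLOWED) {
        my $value = $query{$name};
        die "Invalid $name '$value'\n"
            unless grep { $_ eq $value } @{ $ALLOWED{$name} };
    }

    # Assumption: limit must be written as a plain non-negative integer.
    die "limit must be an integer\n"
        unless $query{limit} =~ /\A[0-9]+\z/;

    # 'after' is optional; drop it when no cursor was given.
    delete $query{after} unless defined $query{after};
    return \%query;
}

my $query = build_list_evals_query(order => 'desc', limit => 5);
print join(', ', map { "$_=$query->{$_}" } sort keys %$query), "\n";
```

Keeping the allowed values in one table means a change to the parameter list only needs to be mirrored in one place.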
Responses
200 - A list of evals
Content-Type: application/json
Example:
"{\n \"object\": \"list\",\n \"data\": [\n {\n \"object\": \"eval\",\n \"id\": \"eval_67abd54d9b0081909a86353f6fb9317a\",\n \"data_source_config\": {\n \"type\": \"custom\",\n \"schema\": {\n \"type\": \"object\",\n \"properties\": {\n \"item\": {\n \"type\": \"object\",\n \"properties\": {\n \"input\": {\n \"type\": \"string\"\n },\n \"ground_truth\": {\n \"type\": \"string\"\n }\n },\n \"required\": [\n \"input\",\n \"ground_truth\"\n ]\n }\n },\n \"required\": [\n \"item\"\n ]\n }\n },\n \"testing_criteria\": [\n {\n \"name\": \"String check\",\n \"id\": \"String check-2eaf2d8d-d649-4335-8148-9535a7ca73c2\",\n \"type\": \"string_check\",\n \"input\": \"{{item.input}}\",\n \"reference\": \"{{item.ground_truth}}\",\n \"operation\": \"eq\"\n }\n ],\n \"name\": \"External Data Eval\",\n \"created_at\": 1739314509,\n \"metadata\": {},\n }\n ],\n \"first_id\": \"eval_67abd54d9b0081909a86353f6fb9317a\",\n \"last_id\": \"eval_67abd54d9b0081909a86353f6fb9317a\",\n \"has_more\": true\n}\n"
POST /evals
createEval
    $client->create_eval({
        body => { ... },
    });
Create the structure of an evaluation that can be used to test a model's performance. An evaluation is a set of testing criteria plus the configuration for a data source, which dictates the schema of the data used in the evaluation. After creating an evaluation, you can run it on different models and model parameters. We support several types of graders and data sources. For more information, see the Evals guide.
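As an illustration of "a set of testing criteria and the config for a data source", a request body might be assembled as in the sketch below. The field names and values are borrowed from this operation's example response; whether the endpoint accepts exactly this shape is an assumption, and the snippet only builds and prints the JSON without sending anything.

```perl
use strict;
use warnings;
use JSON::PP ();

# Field names and values mirror this operation's example response;
# the exact request shape accepted by the API is an assumption.
my %body = (
    name               => 'External Data Eval',
    data_source_config => {
        type        => 'custom',
        item_schema => {
            type       => 'object',
            properties => { label => { type => 'string' } },
            required   => ['label'],
        },
        include_sample_schema => JSON::PP::true(),
    },
    testing_criteria => [
        {
            name      => 'My string check grader',
            type      => 'string_check',
            input     => '{{sample.output_text}}',
            reference => '{{item.label}}',
            operation => 'eq',
        },
    ],
    metadata => { test => 'synthetics' },
);

# canonical() sorts keys so the printed JSON is stable between runs.
my $json = JSON::PP->new->canonical->pretty->encode(\%body);
print $json;

# Following the call shape shown above, the hash would presumably be
# passed as: $client->create_eval({ body => \%body });
```

Building the body as a Perl hash and encoding it once avoids hand-written JSON, which is where stray trailing commas tend to creep in.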
Responses
201 - OK
Content-Type: application/json
Example:
"{\n \"object\": \"eval\",\n \"id\": \"eval_67abd54d9b0081909a86353f6fb9317a\",\n \"data_source_config\": {\n \"type\": \"custom\",\n \"item_schema\": {\n \"type\": \"object\",\n \"properties\": {\n \"label\": {\"type\": \"string\"},\n },\n \"required\": [\"label\"]\n },\n \"include_sample_schema\": true\n },\n \"testing_criteria\": [\n {\n \"name\": \"My string check grader\",\n \"type\": \"string_check\",\n \"input\": \"{{sample.output_text}}\",\n \"reference\": \"{{item.label}}\",\n \"operation\": \"eq\",\n }\n ],\n \"name\": \"External Data Eval\",\n \"created_at\": 1739314509,\n \"metadata\": {\n \"test\": \"synthetics\",\n }\n}\n"
SCHEMAS
CreateEvalRequest
Properties:
data_source_config (object, required) - The configuration for the data source used for the evaluation runs. Dictates the schema of the data used in the evaluation.
metadata (Metadata) - See "Metadata" below for shape.
name (string) - The name of the evaluation.
testing_criteria (array of object, required) - A list of graders for all eval runs in this group. Graders can reference variables in the data source using double curly braces notation, like {{item.variable_name}}. To reference the model's output, use the sample namespace (i.e., {{sample.output_text}}).
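To make the double curly braces notation concrete, the sketch below extracts the references from grader template strings and groups them by namespace. It is an illustration only: template_references and references_by_namespace are hypothetical helpers, and the service's own template handling may differ from this regular expression.

```perl
use strict;
use warnings;

# Illustration only: extract the {{namespace.path}} references from a
# grader template string. The service's own template handling may differ.
sub template_references {
    my ($template) = @_;
    my @refs;
    while ($template =~ /\{\{\s*([A-Za-z_]\w*(?:\.\w+)*)\s*\}\}/g) {
        push @refs, $1;
    }
    return @refs;
}

# Group references by namespace: 'item' refers to the data source,
# 'sample' to the model's output.
sub references_by_namespace {
    my (@templates) = @_;
    my %by_ns;
    for my $ref (map { template_references($_) } @templates) {
        my ($ns, $rest) = split /\./, $ref, 2;
        push @{ $by_ns{$ns} }, defined $rest ? $rest : '';
    }
    return \%by_ns;
}

my $refs = references_by_namespace('{{sample.output_text}}', '{{item.label}}');
print "item: @{ $refs->{item} }\n";
print "sample: @{ $refs->{sample} }\n";
```

A check like this can catch a grader that references an item variable missing from the data source schema before the eval is created.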
Eval
Properties:
created_at (integer, required) - The Unix timestamp (in seconds) for when the eval was created.
data_source_config (object, required) - Configuration of data sources used in runs of the evaluation.
id (string, required) - Unique identifier for the evaluation.
metadata (Metadata, required) - See "Metadata" below for shape.
name (string, required) - The name of the evaluation.
object (string, required) - The object type.
Allowed values: eval
Default: eval
testing_criteria (array of object, required) - A list of testing criteria.
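A decoded Eval object can be checked against the required properties listed above, and its created_at value converted from Unix seconds to a readable timestamp. This is a minimal sketch; missing_eval_fields and created_at_iso are hypothetical helpers, and the sample hash reuses values from the example responses in this document.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Required properties of an Eval object, as listed above.
my @REQUIRED = qw(
    created_at data_source_config id metadata name object testing_criteria
);

# Hypothetical helper: names of required keys missing from a decoded Eval.
sub missing_eval_fields {
    my ($eval) = @_;
    return grep { !exists $eval->{$_} } @REQUIRED;
}

# created_at is in Unix seconds; render it as an ISO 8601 UTC timestamp.
sub created_at_iso {
    my ($eval) = @_;
    return strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($eval->{created_at}));
}

my %eval = (
    object             => 'eval',
    id                 => 'eval_67abd54d9b0081909a86353f6fb9317a',
    name               => 'External Data Eval',
    created_at         => 1739314509,
    metadata           => {},
    data_source_config => { type => 'custom' },
    testing_criteria   => [],
);

my @missing = missing_eval_fields(\%eval);
print @missing ? "missing: @missing\n" : "all required fields present\n";
print created_at_iso(\%eval), "\n";
```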
EvalList
Properties:
data (array of Eval, required) - An array of eval objects.
first_id (string, required) - The identifier of the first eval in the data array.
has_more (boolean, required) - Indicates whether there are more evals available.
last_id (string, required) - The identifier of the last eval in the data array.
object (string, required) - The type of this object. It is always set to "list".
Allowed values: list
Default: list
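The has_more and last_id properties pair with the after query parameter of GET /evals for cursor pagination. The sketch below shows one way the loop could look. fetch_page is a hypothetical stand-in that serves fake data in the EvalList shape, so the example runs without network access; in real use it would wrap the client call and return the decoded response.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real request: returns an EvalList-shaped
# hash for the given query. It serves three fake evals.
my @FAKE_EVALS = map { +{ object => 'eval', id => "eval_$_" } } qw(a b c);

sub fetch_page {
    my ($query) = @_;
    my $start = 0;
    if (defined $query->{after}) {
        my ($idx) = grep { $FAKE_EVALS[$_]{id} eq $query->{after} }
            0 .. $#FAKE_EVALS;
        $start = defined $idx ? $idx + 1 : 0;
    }
    my $end = $start + $query->{limit} - 1;
    $end = $#FAKE_EVALS if $end > $#FAKE_EVALS;
    my @data = @FAKE_EVALS[$start .. $end];
    return {
        object   => 'list',
        data     => \@data,
        first_id => @data ? $data[0]{id}  : undef,
        last_id  => @data ? $data[-1]{id} : undef,
        has_more => $end < $#FAKE_EVALS ? 1 : 0,
    };
}

# Follow the cursor: pass the previous page's last_id as 'after' until
# has_more is false.
sub collect_all_evals {
    my ($fetch, $limit) = @_;
    my (@all, $after);
    while (1) {
        my %query = (limit => $limit);
        $query{after} = $after if defined $after;
        my $page = $fetch->(\%query);
        push @all, @{ $page->{data} };
        last unless $page->{has_more} && defined $page->{last_id};
        $after = $page->{last_id};
    }
    return \@all;
}

my $evals = collect_all_evals(\&fetch_page, 2);
print join(', ', map { $_->{id} } @$evals), "\n";
```

Passing the fetcher as a code reference keeps the loop independent of how the request is made, which also makes it easy to test with canned pages.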
Metadata
A set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and for querying objects via the API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
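These limits are easy to check before sending a request. The sketch below reads the 16 as an upper bound on the number of pairs, which is an assumption about the wording above; metadata_problems is a hypothetical helper, not part of the client.

```perl
use strict;
use warnings;

# Limits stated above: 16 pairs (read here as a maximum), keys up to
# 64 characters, values up to 512 characters.
use constant {
    MAX_PAIRS     => 16,
    MAX_KEY_LEN   => 64,
    MAX_VALUE_LEN => 512,
};

# Hypothetical helper: return a list of problems found in a metadata
# hash (an empty list when it satisfies the limits).
sub metadata_problems {
    my ($metadata) = @_;
    my @problems;

    my $count = keys %$metadata;
    push @problems, "too many pairs ($count > " . MAX_PAIRS . ")"
        if $count > MAX_PAIRS;

    for my $key (sort keys %$metadata) {
        my $value = $metadata->{$key};
        push @problems, "key '$key' exceeds " . MAX_KEY_LEN . " characters"
            if length($key) > MAX_KEY_LEN;
        if (!defined $value || ref $value) {
            push @problems, "value for '$key' is not a string";
        }
        elsif (length($value) > MAX_VALUE_LEN) {
            push @problems,
                "value for '$key' exceeds " . MAX_VALUE_LEN . " characters";
        }
    }
    return @problems;
}

my @problems = metadata_problems({ test => 'synthetics', notes => 'x' x 600 });
print "$_\n" for @problems;
```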
SEE ALSO
COPYRIGHT AND LICENSE
Copyright (C) 2023-2026 by Nelson Ferraz
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.0 or, at your option, any later version of Perl 5 you may have available.