title: Code Evaluations

Code Evaluations in Freeplay

In addition to Human and Model Graded evaluations, Freeplay supports code-driven evaluations that run client-side: functions you write and execute in your own code path, with the results then logged back to Freeplay.

These evaluations are particularly useful for criteria that can be expressed as logical checks, such as JSON schema validation or category assertions on a single answer, and for pairwise comparisons to an expected output using methods like embedding or string distance. Code evals can be added to both:

  • Individual Sessions
  • Test Runs executed with our SDK or API, which can include comparisons to ground truth data
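As an illustration (this is not Freeplay SDK code; the function names and result shape are hypothetical), the two kinds of checks described above might look like this in Python, using only the standard library:

```python
# Sketch of two client-side "code evals": a JSON schema-style check on a
# single answer, and a pairwise string-distance comparison to ground truth.
# Function names and the `results` dict shape are illustrative only.
import difflib
import json


def is_valid_json_object(output: str, required_keys: set[str]) -> bool:
    """Check that the model output parses as a JSON object with the given keys."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and required_keys <= parsed.keys()


def string_similarity(output: str, expected: str) -> float:
    """Pairwise comparison to an expected output via string distance (0.0 to 1.0)."""
    return difflib.SequenceMatcher(None, output, expected).ratio()


# Run the evals in your own code path...
results = {
    "valid_json": is_valid_json_object('{"answer": "42", "source": "doc"}',
                                       {"answer", "source"}),
    "similarity": string_similarity("The capital is Paris.",
                                    "The capital of France is Paris."),
}
# ...then log `results` back to Freeplay via the SDK or API.
```

In a Test Run, the ground-truth string passed to `string_similarity` would come from your test dataset; for embedding distance you would swap in a vector similarity instead of `SequenceMatcher`.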

In either case, any results you log to Freeplay flow through to the UI just like human or model-graded evals. See our SDK documentation for more details.


What’s Next

Now review each evaluation type, then move on to test runs once all your evaluations are configured!
