
Align adk web eval logic with adk eval CLI & pytest #4867

@daniel-odrinski

Is your feature request related to a specific problem?

adk web eval discovery is inconsistent with adk eval and pytest workflows

There are three ways to run evals in ADK, from what I can tell, and only one of them constrains where eval files must live:

  • adk eval accepts eval set file paths as direct positional arguments — files can be anywhere on disk:
    adk eval my_agent path/to/evals/my_eval.evalset.json
  • pytest (via AgentEvaluator.evaluate()) accepts a file path as a parameter — files can be anywhere on disk:
    await AgentEvaluator.evaluate("my_agent", "path/to/evals/my_eval.evalset.json")
  • adk web uses LocalEvalSetsManager, which hardcodes discovery to a flat os.listdir on <agents_dir>/<app_name>/ with no way to override it. The only alternative is --eval_storage_uri=gs://... for GCS — there is no local path equivalent. Files must sit directly at <agents_dir>/<app_name>/*.evalset.json.
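To make the third point concrete, here is a minimal sketch of flat discovery. It does not use ADK itself: `flat_discover` and `build_demo_layout` are hypothetical helpers that only imitate the flat `os.listdir` behaviour described above, run against a throwaway directory with one eval set at the agent root and one under `eval/`.

```python
import os
import tempfile

EVAL_SET_FILE_EXTENSION = ".evalset.json"


def flat_discover(agents_dir: str, app_name: str) -> list[str]:
    """Lists eval set ids found directly in <agents_dir>/<app_name>/ (no recursion)."""
    app_dir = os.path.join(agents_dir, app_name)
    return sorted(
        name.removesuffix(EVAL_SET_FILE_EXTENSION)
        for name in os.listdir(app_dir)
        if name.endswith(EVAL_SET_FILE_EXTENSION)
    )


def build_demo_layout(agents_dir: str) -> None:
    """Creates one eval set at the agent root and one under eval/."""
    app_dir = os.path.join(agents_dir, "my_agent")
    os.makedirs(os.path.join(app_dir, "eval"))
    relative_paths = (
        "root_case.evalset.json",
        os.path.join("eval", "nested_case.evalset.json"),
    )
    for relative_path in relative_paths:
        with open(os.path.join(app_dir, relative_path), "w", encoding="utf-8") as f:
            f.write("{}")


with tempfile.TemporaryDirectory() as agents_dir:
    build_demo_layout(agents_dir)
    discovered = flat_discover(agents_dir, "my_agent")

# Only the root-level eval set is listed; the one under eval/ is not seen.
print(discovered)
```

The file under `eval/` is never listed, which is the gap a project using the subdirectory layout runs into.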

This means a project following the convention from Google's own adk-samples — where eval files live in an <app_name>/eval/ subdirectory and are run via pytest or adk eval — gets no visibility into those evals from the adk web UI. Eval sets created through the adk web UI are also written to the agent root via _get_eval_set_file_path, diverging from where the project's other eval files are stored.

Describe the Solution You'd Like

Allow adk web to be configured (through an environment variable or otherwise) with a path that *.evalset.json files are written to and read from.

Impact on your work

As things stand, I cannot provide a coherent developer experience in which eval set JSON files are created through the web UI and then run in CI/CD, without polluting the agent app directory with evalset.json files.

Willingness to contribute

Are you interested in implementing this feature yourself or submitting a PR?
Maybe

Describe Alternatives You've Considered

Keeping evalset.json files directly in the agent/app/ directory, without putting them into a sub-directory.

Proposed API / Implementation

Add an ADK_EVALS_DIR env var that sets a subdirectory name to use for eval files within the agent directory. When unset, behaviour is identical to today. This gives users explicit control without changing any defaults.

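import os

# EvalSetsManager and NotFoundError are assumed to be imported as they
# already are in the module that defines LocalEvalSetsManager.

_EVAL_SET_FILE_EXTENSION = ".evalset.json"
_EVALS_SUBDIR_ENV_VAR = "ADK_EVALS_DIR"  # e.g. "evals" or "eval"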


class LocalEvalSetsManager(EvalSetsManager):

  def __init__(self, agents_dir: str):
    self._agents_dir = agents_dir
    self._evals_subdir = self._load_evals_subdir()

  def _load_evals_subdir(self) -> str:
    evals_subdir = os.environ.get(_EVALS_SUBDIR_ENV_VAR, "")
    if evals_subdir:
      if os.path.isabs(evals_subdir):
        raise ValueError(
            f"{_EVALS_SUBDIR_ENV_VAR} must be a relative path,"
            f" got: {evals_subdir!r}"
        )
      if ".." in evals_subdir.split(os.sep):
        raise ValueError(
            f"{_EVALS_SUBDIR_ENV_VAR} must not contain '..',"
            f" got: {evals_subdir!r}"
        )
    return evals_subdir

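  def _get_eval_base_dir(self, app_name: str) -> str:
    # With an empty subdir, os.path.join adds only a trailing separator,
    # which realpath strips, so the agent root is used as before.
    base = os.path.join(self._agents_dir, app_name, self._evals_subdir)
    resolved = os.path.realpath(base)
    agents_dir = os.path.realpath(self._agents_dir)
    # Reject anything that resolves outside agents_dir, e.g. via an
    # app_name containing ".." or a symlink.
    if not resolved.startswith(agents_dir + os.sep):
      raise ValueError(f"Invalid app_name: {app_name!r}")
    return resolved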

  def _get_eval_set_file_path(self, app_name: str, eval_set_id: str) -> str:
    return os.path.join(
        self._get_eval_base_dir(app_name),
        eval_set_id + _EVAL_SET_FILE_EXTENSION,
    )

  def list_eval_sets(self, app_name: str) -> list[str]:
    eval_base_dir = self._get_eval_base_dir(app_name)
    try:
      return sorted(
          f.removesuffix(_EVAL_SET_FILE_EXTENSION)
          for f in os.listdir(eval_base_dir)
          if f.endswith(_EVAL_SET_FILE_EXTENSION)
      )
    except FileNotFoundError as e:
      raise NotFoundError(
          f"Eval directory for app `{app_name}` not found."
      ) from e

Usage:

ADK_EVALS_DIR=evals adk web path/to/agents_dir
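The expected behaviour of the proposal can be checked in isolation. The sketch below is not part of ADK: `resolve_eval_dir` and `is_rejected` are hypothetical standalone helpers that fold the proposed validation and containment checks into one function, taking the subdirectory as an argument instead of reading `ADK_EVALS_DIR`, so the cases can be exercised against a temporary directory.

```python
import os
import tempfile


def resolve_eval_dir(agents_dir: str, app_name: str, evals_subdir: str = "") -> str:
    """Hypothetical helper mirroring the proposed checks in a single function."""
    if os.path.isabs(evals_subdir):
        raise ValueError(f"must be a relative path, got: {evals_subdir!r}")
    if ".." in evals_subdir.split(os.sep):
        raise ValueError(f"must not contain '..', got: {evals_subdir!r}")
    resolved = os.path.realpath(os.path.join(agents_dir, app_name, evals_subdir))
    root = os.path.realpath(agents_dir)
    # Anything resolving outside agents_dir is treated as an invalid app_name.
    if not resolved.startswith(root + os.sep):
        raise ValueError(f"Invalid app_name: {app_name!r}")
    return resolved


def is_rejected(agents_dir: str, app_name: str, evals_subdir: str) -> bool:
    """Returns True when resolve_eval_dir raises ValueError for the inputs."""
    try:
        resolve_eval_dir(agents_dir, app_name, evals_subdir)
    except ValueError:
        return True
    return False


with tempfile.TemporaryDirectory() as agents_dir:
    root = os.path.realpath(agents_dir)
    results = {
        # Unset (empty) subdir resolves to the agent root, as today.
        "unset_is_agent_root": resolve_eval_dir(agents_dir, "my_agent")
        == os.path.join(root, "my_agent"),
        # A subdir name resolves to a directory nested under the agent.
        "subdir_is_nested": resolve_eval_dir(agents_dir, "my_agent", "evals")
        == os.path.join(root, "my_agent", "evals"),
        "absolute_rejected": is_rejected(
            agents_dir, "my_agent", os.path.abspath("evals")
        ),
        "dotdot_rejected": is_rejected(
            agents_dir, "my_agent", os.path.join("..", "evals")
        ),
        "escaping_app_name_rejected": is_rejected(
            agents_dir, os.path.join("..", "elsewhere"), ""
        ),
    }

print(results)
```

Note that the `..` check splits on `os.sep` only, so on Windows a value written with forward slashes would fall through to the later containment check, which only guards the `agents_dir` boundary.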

Labels: eval ([Component] This issue is related to evaluation)
