PoC: Profile-based Intermediate validator loop for domain realism #64

@anand-testcompare

Why

Prompt instructions alone are not sufficient to enforce domain realism (e.g., Foundry/Palantir constraints like compute module boundaries, ontology requirements, pipeline semantics). We need deterministic checks that can reject invalid IntermediateFormat outputs and drive the ToolLoopAgent to iterate until the diagram is structurally valid for a chosen domain.

Goal

Add a small, end-to-end proof-of-concept that:

  1. Validates IntermediateFormat using curated, domain-specific rules.
  2. If invalid, returns structured errors and forces the ToolLoopAgent to revise the Intermediate (iteration loop).
  3. Produces a valid diagram with fewer "domain nonsense" outputs.

Non-goals

  • No Convex schema changes for techStacks/components/validationRules yet.
  • No template marketplace or icon substitution work.
  • No UI work beyond what’s required to pass a profileId (already supported).

Proposed approach

1) Add a validator loop to intermediate generation

  • Location: packages/backend/lib/agents/intermediate-generator.ts
  • Behavior:
    • Generate intermediate
    • Run schema validation + profile.validate()
    • If errors: feed back to the agent as explicit "fix these" instructions and re-run generation
    • Cap retries (e.g., 2-4) and fail with a useful error summary if still invalid
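
A minimal sketch of the loop described above, assuming the agent call and the combined schema + profile validation can be passed in as plain functions. The names generateWithValidation, formatFixInstructions, ValidationIssue, and LoopResult are hypothetical and do not come from the existing code in intermediate-generator.ts.

```typescript
// Hypothetical shapes; the real IntermediateFormat and error types live in the repo.
interface ValidationIssue {
  rule: string; // identifier of the rule that failed
  message: string; // human-readable explanation passed back to the agent
}

interface LoopResult<T> {
  intermediate: T;
  attempts: number;
}

// Turn structured errors into explicit "fix these" instructions for the next attempt.
function formatFixInstructions(issues: ValidationIssue[]): string {
  const lines = issues.map((issue, i) => `${i + 1}. [${issue.rule}] ${issue.message}`);
  return ["The previous output was invalid. Fix these issues:", ...lines].join("\n");
}

// generate: stands in for the ToolLoopAgent call; feedback is undefined on the first attempt.
// validate: stands in for schema validation plus profile.validate().
async function generateWithValidation<T>(
  generate: (feedback?: string) => Promise<T>,
  validate: (candidate: T) => ValidationIssue[],
  maxRetries = 3,
): Promise<LoopResult<T>> {
  let feedback: string | undefined;
  let lastIssues: ValidationIssue[] = [];
  // One initial attempt plus up to maxRetries revisions.
  for (let attempt = 1; attempt <= maxRetries + 1; attempt++) {
    const candidate = await generate(feedback);
    lastIssues = validate(candidate);
    if (lastIssues.length === 0) {
      return { intermediate: candidate, attempts: attempt };
    }
    feedback = formatFixInstructions(lastIssues);
  }
  // Retry cap exceeded: fail with a readable summary of the remaining issues.
  throw new Error(
    `Intermediate still invalid after ${maxRetries} retries:\n` +
      lastIssues.map((issue) => `- [${issue.rule}] ${issue.message}`).join("\n"),
  );
}
```

Keeping generate and validate as injected functions is one way to make the loop testable without calling a model; the real integration may need to thread more context (profile, graph options) through both.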

2) Create one concrete domain profile with curated checks

  • Add packages/backend/lib/agents/profiles/palantir-foundry.ts (or a more generic domain-realism-poc.ts)
  • Implement:
    • instructions: concise domain guidance
    • validate(intermediate): curated deterministic rules that can be expressed purely from the Intermediate graph
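
As an illustration of the profile shape, here is a sketch assuming a simplified Intermediate graph. The type names, the id field, and the node kinds "object-type" and "ontology" are invented for the example; only instructions and validate(intermediate) come from the proposal itself.

```typescript
// Hypothetical, minimal stand-ins for the real IntermediateFormat types.
interface IntermediateNode {
  id: string;
  kind: string;
  metadata?: Record<string, unknown>;
}
interface IntermediateEdge {
  from: string;
  to: string;
  metadata?: Record<string, unknown>;
}
interface Intermediate {
  nodes: IntermediateNode[];
  edges: IntermediateEdge[];
}
interface ValidationIssue {
  rule: string;
  message: string;
  nodeIds?: string[];
}

// Assumed profile contract: guidance text plus a deterministic validator.
interface DomainProfile {
  id: string;
  instructions: string;
  validate(intermediate: Intermediate): ValidationIssue[];
}

const palantirFoundryProfile: DomainProfile = {
  id: "palantir-foundry",
  instructions:
    "Respect compute module boundaries, ontology requirements, and pipeline semantics.",
  validate(intermediate) {
    const issues: ValidationIssue[] = [];
    const kinds = new Set(intermediate.nodes.map((n) => n.kind));
    // Illustrative rule with made-up kinds: object types require an ontology node.
    const objectTypes = intermediate.nodes.filter((n) => n.kind === "object-type");
    if (objectTypes.length > 0 && !kinds.has("ontology")) {
      issues.push({
        rule: "ontology-required",
        message: "Object types are present but the diagram has no ontology node.",
        nodeIds: objectTypes.map((n) => n.id),
      });
    }
    return issues;
  },
};
```

Returning an empty array for a valid graph keeps the contract simple: the loop only needs to check the length of the result.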

3) Curated rule types (examples)

Start with 5-10 rules max; focus on high-signal constraints:

  • Connection rules (type A cannot connect directly to type B)
  • Required intermediaries (e.g., Playwright/browser automation must be mediated by a compute module)
  • Required components if certain concepts appear (e.g., ontology presence)
  • Forbidden patterns (e.g., direct writeback without an action/edit node)

Rules should be expressed in terms of:

  • node.kind, node.metadata.*
  • edge.from, edge.to, edge.metadata.*
  • graphOptions.diagramType and selected profileId
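
The connection and required-intermediary rule types could be sketched as small rule factories over node.kind and edge.from / edge.to. Everything here (Graph, Rule, forbidDirectEdge, requireNeighborKind, runRules, and the kind strings in the usage comment) is hypothetical naming for illustration, not an existing API.

```typescript
// Hypothetical simplified graph; the real Intermediate carries more fields.
interface GraphNode {
  id: string;
  kind: string;
}
interface GraphEdge {
  from: string;
  to: string;
}
interface Graph {
  nodes: GraphNode[];
  edges: GraphEdge[];
}
interface RuleViolation {
  rule: string;
  message: string;
}
type Rule = (graph: Graph) => RuleViolation[];

function kindIndex(graph: Graph): Map<string, string> {
  return new Map<string, string>(
    graph.nodes.map((n): [string, string] => [n.id, n.kind]),
  );
}

// Connection rule: nodes of fromKind may not connect directly to nodes of toKind.
function forbidDirectEdge(rule: string, fromKind: string, toKind: string): Rule {
  return (graph) => {
    const kinds = kindIndex(graph);
    return graph.edges
      .filter((e) => kinds.get(e.from) === fromKind && kinds.get(e.to) === toKind)
      .map((e) => ({
        rule,
        message: `${e.from} (${fromKind}) must not connect directly to ${e.to} (${toKind}).`,
      }));
  };
}

// Required intermediary: every edge touching a node of `kind` must have a
// node of `viaKind` at its other end.
function requireNeighborKind(rule: string, kind: string, viaKind: string): Rule {
  return (graph) => {
    const kinds = kindIndex(graph);
    const violations: RuleViolation[] = [];
    for (const e of graph.edges) {
      const fromKind = kinds.get(e.from);
      const toKind = kinds.get(e.to);
      const touches = fromKind === kind || toKind === kind;
      const otherEnd = fromKind === kind ? toKind : fromKind;
      if (touches && otherEnd !== viaKind) {
        violations.push({
          rule,
          message: `Edge ${e.from} -> ${e.to} must go through a ${viaKind} node.`,
        });
      }
    }
    return violations;
  };
}

// Run every rule and collect all violations in rule order.
function runRules(graph: Graph, rules: Rule[]): RuleViolation[] {
  const all: RuleViolation[] = [];
  for (const rule of rules) {
    all.push(...rule(graph));
  }
  return all;
}

// Usage (kind strings are made up):
// runRules(graph, [
//   forbidDirectEdge("no-direct-writeback", "app", "dataset"),
//   requireNeighborKind("automation-via-compute", "browser-automation", "compute-module"),
// ]);
```

Factories like these keep each curated rule to one line in the profile, which may make a later move to data-driven rules easier, though that is only one possible design.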

4) Tests

  • Add a small test suite in packages/backend/convex/ demonstrating:
    • invalid intermediate triggers at least 1 retry
    • final intermediate passes validation
    • failure mode is readable when exceeding retry cap
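
For the retry tests, one option is a scripted stand-in for the agent that returns canned outputs in order and records the feedback it was given. The helper below, scriptedGenerator, is a hypothetical test double and assumes the generator takes optional feedback text; it is not tied to any particular test framework.

```typescript
// Hypothetical test double: replays canned outputs and records feedback per call.
function scriptedGenerator<T>(outputs: T[]) {
  if (outputs.length === 0) {
    throw new Error("scriptedGenerator needs at least one output");
  }
  const feedbackSeen: (string | undefined)[] = [];
  let index = 0;
  const generate = async (feedback?: string): Promise<T> => {
    feedbackSeen.push(feedback);
    // Once the script runs out, keep returning the last output.
    const output = outputs[Math.min(index, outputs.length - 1)];
    index++;
    return output;
  };
  return { generate, feedbackSeen };
}
```

A test can script an invalid intermediate followed by a valid one, run the loop, and then assert that feedbackSeen has two entries with the second containing the validator's error text. Scripting only invalid outputs exercises the retry-cap failure path.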

Deliverables

  • Validator loop integrated into intermediate generation with retry cap
  • One PoC domain profile with curated rules
  • Tests proving the loop works
  • Short docs note describing how to add a new domain profile + rules

Acceptance criteria

  • Passing profileId activates domain validation rules
  • Invalid IntermediateFormat triggers iteration (at least 1 retry) with error-driven instructions
  • Retry cap prevents infinite loops; errors are surfaced clearly
  • Tests cover pass + fail paths
  • No "as any" casts or other type suppression

Notes

This is a PoC intended to validate the architecture. If it works, we can later decide whether to store domain catalogs and rules in Convex (data-driven) vs TS (code-driven).

Labels: spike (PoC, product ideation, or multiple parallel approaches)
