Why
Prompt instructions alone are not sufficient to enforce domain realism (e.g., Foundry/Palantir constraints like compute module boundaries, ontology requirements, pipeline semantics). We need deterministic checks that can reject invalid IntermediateFormat outputs and drive the ToolLoopAgent to iterate until the diagram is structurally valid for a chosen domain.
Goal
Add a small, end-to-end proof-of-concept that:
- Validates IntermediateFormat using curated, domain-specific rules.
- If invalid, returns structured errors and forces the ToolLoopAgent to revise the Intermediate (iteration loop).
- Produces a valid diagram with fewer "domain nonsense" outputs.
Non-goals
- No Convex schema changes for techStacks/components/validationRules yet.
- No template marketplace or icon substitution work.
- No UI work beyond what’s required to pass a profileId (already supported).
Proposed approach
1) Add a validator loop to intermediate generation
- Location: packages/backend/lib/agents/intermediate-generator.ts
- Behavior:
  - Generate intermediate
  - Run schema validation + profile.validate()
  - If errors: feed back to the agent as explicit "fix these" instructions and re-run generation
  - Cap retries (e.g., 2-4) and fail with a useful error summary if still invalid
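The behavior above could be sketched roughly as follows. This is only an illustration under stated assumptions: generateWithValidation, RuleViolation, and the generate/validate callback signatures are hypothetical names, not the existing API in intermediate-generator.ts, and the real generate step would be the ToolLoopAgent call.

```typescript
// Hypothetical shape of one validation failure (not an existing type).
interface RuleViolation {
  rule: string; // id of the rule that failed
  message: string; // human-readable description, reused as agent feedback
}

// Assumed callbacks: `generate` wraps the agent call and receives the
// "fix these" instructions from the previous attempt (empty on the first).
type Generate<T> = (feedback: string[]) => Promise<T>;
type Validate<T> = (candidate: T) => RuleViolation[];

async function generateWithValidation<T>(
  generate: Generate<T>,
  validate: Validate<T>,
  maxRetries = 3,
): Promise<{ intermediate: T; attempts: number }> {
  let feedback: string[] = [];
  let lastErrors: RuleViolation[] = [];

  // One initial attempt plus up to `maxRetries` revisions.
  for (let attempt = 1; attempt <= maxRetries + 1; attempt++) {
    const intermediate = await generate(feedback);
    lastErrors = validate(intermediate);
    if (lastErrors.length === 0) {
      return { intermediate, attempts: attempt };
    }
    // Turn each violation into an explicit instruction for the next run.
    feedback = lastErrors.map((e) => `Fix [${e.rule}]: ${e.message}`);
  }

  // Retry cap exceeded: fail with a readable summary of what is still wrong.
  const summary = lastErrors
    .map((e) => `- [${e.rule}] ${e.message}`)
    .join("\n");
  throw new Error(
    `Intermediate still invalid after ${maxRetries} retries:\n${summary}`,
  );
}
```

In this sketch the validate callback would combine schema validation and profile.validate() into one list of violations, and the default cap of 3 sits inside the 2-4 range suggested above.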
2) Create one concrete domain profile with curated checks
- Add packages/backend/lib/agents/profiles/palantir-foundry.ts (or a more generic domain-realism-poc.ts)
- Implement:
  - instructions: concise domain guidance
  - validate(intermediate): curated deterministic rules that can be expressed purely from the Intermediate graph
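A profile of this kind might look like the sketch below. The DomainProfile interface, the simplified Intermediate shape, and the kind strings "object-type" and "ontology" are all assumptions made for illustration; the real IntermediateFormat and node kinds in the codebase may differ.

```typescript
// Simplified, hypothetical stand-ins for the real IntermediateFormat types.
interface IntermediateNode {
  id: string;
  kind: string;
  metadata?: Record<string, unknown>;
}
interface IntermediateEdge {
  from: string;
  to: string;
  metadata?: Record<string, unknown>;
}
interface Intermediate {
  nodes: IntermediateNode[];
  edges: IntermediateEdge[];
}
interface RuleViolation {
  rule: string;
  message: string;
}

// Assumed profile contract: prompt guidance plus a deterministic validator.
interface DomainProfile {
  id: string;
  instructions: string;
  validate(intermediate: Intermediate): RuleViolation[];
}

const palantirFoundryProfile: DomainProfile = {
  id: "palantir-foundry",
  instructions: "Placeholder: concise domain guidance goes here.",
  validate(intermediate) {
    const errors: RuleViolation[] = [];
    const kinds = new Set(intermediate.nodes.map((n) => n.kind));
    // Example "required component" rule (kind names are hypothetical):
    // object types only make sense when an ontology node is present.
    if (kinds.has("object-type") && !kinds.has("ontology")) {
      errors.push({
        rule: "ontology-required",
        message: "Diagram has object-type nodes but no ontology node.",
      });
    }
    return errors;
  },
};
```

Returning a list of violations rather than a boolean keeps validate() deterministic while giving the retry loop something concrete to feed back to the agent.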
3) Curated rule types (examples)
Start with 5-10 rules max; focus on high-signal constraints:
- Connection rules (type A cannot connect directly to type B)
- Required intermediaries (e.g., Playwright/browser automation must be mediated by a compute module)
- Required components if certain concepts appear (e.g., ontology presence)
- Forbidden patterns (e.g., direct writeback without an action/edit node)
Rules should be expressed in terms of:
- node.kind, node.metadata.*
- edge.from, edge.to, edge.metadata.*
- graphOptions.diagramType and selected profileId
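As a rough illustration of rules written purely against node.kind, edge.from, and edge.to, the sketch below builds two rule types from small factories. Everything here is hypothetical: the Graph shape, the factory names, and kind strings such as "browser-automation", "compute-module", "application", and "action" are placeholders, not the project's actual vocabulary.

```typescript
// Minimal hypothetical graph shape; only the fields the rules read.
interface GraphNode { id: string; kind: string }
interface GraphEdge { from: string; to: string }
interface Graph { nodes: GraphNode[]; edges: GraphEdge[] }
interface RuleViolation { rule: string; message: string }
type Rule = (graph: Graph) => RuleViolation[];

function kindLookup(graph: Graph): Map<string, string> {
  return new Map(graph.nodes.map((n) => [n.id, n.kind] as const));
}

// Connection rule: an edge from `fromKind` directly to `toKind` is rejected.
function forbidDirectEdge(rule: string, fromKind: string, toKind: string): Rule {
  return (graph) => {
    const kindOf = kindLookup(graph);
    return graph.edges
      .filter((e) => kindOf.get(e.from) === fromKind && kindOf.get(e.to) === toKind)
      .map((e) => ({
        rule,
        message: `${e.from} -> ${e.to}: ${fromKind} must not connect directly to ${toKind}.`,
      }));
  };
}

// Required intermediary: every outgoing edge of a `sourceKind` node must
// land on a `viaKind` node (incoming edges are not checked in this sketch).
function requireIntermediary(rule: string, sourceKind: string, viaKind: string): Rule {
  return (graph) => {
    const kindOf = kindLookup(graph);
    return graph.edges
      .filter((e) => kindOf.get(e.from) === sourceKind && kindOf.get(e.to) !== viaKind)
      .map((e) => ({
        rule,
        message: `${e.from} -> ${e.to}: route ${sourceKind} through a ${viaKind} node.`,
      }));
  };
}

const foundryRules: Rule[] = [
  requireIntermediary("browser-via-compute-module", "browser-automation", "compute-module"),
  forbidDirectEdge("writeback-needs-action", "application", "ontology"),
];

// Run every active rule and collect all violations in rule order.
function runRules(graph: Graph, active: Rule[]): RuleViolation[] {
  const violations: RuleViolation[] = [];
  for (const rule of active) {
    violations.push(...rule(graph));
  }
  return violations;
}
```

Keeping each rule a pure function of the graph makes it easy to stay within the 5-10 rule budget and to unit-test rules in isolation; rules that depend on graphOptions.diagramType or profileId could be selected before calling runRules.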
4) Tests
- Add a small test suite in packages/backend/convex/ demonstrating:
  - invalid intermediate triggers at least 1 retry
  - final intermediate passes validation
  - failure mode is readable when exceeding retry cap
Deliverables
- Validator loop integrated into intermediate generation with retry cap
- One PoC domain profile with curated rules
- Tests proving the loop works
- Short docs note describing how to add a new domain profile + rules
Acceptance criteria
- profileId activates domain validation rules
- No as any / type suppression
Notes
This is a PoC intended to validate the architecture. If it works, we can later decide whether to store domain catalogs and rules in Convex (data-driven) vs TS (code-driven).