fix: classify update-step tool misses as agent_skipped #628
Homeboy Results —
|
|
Updated per review feedback:

- Removed mapping for missing required update handler tool calls
- UpdateStep now emits in packet metadata
- ExecuteStepAbility now uses packet when failing job (fallback remains )

This keeps reserved for explicit skip tool usage () and classifies tool-miss protocol errors as failures with a precise reason string in status. |
|
Correction (previous comment had shell interpolation artifacts). Updated per review feedback:
This keeps agent_skipped reserved for explicit skip tool usage (skip_item) and classifies tool-miss protocol errors as failures with a precise reason string in status. |
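For readers skimming the thread, the rule above can be sketched roughly as follows. This is only an illustration in Python, not the branch's code: `StepOutcome`, `classify_update_step`, the `"completed"` status label, and the exact reason string are hypothetical names, and giving the handler tool precedence over `skip_item` when both are called is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Name of the explicit skip tool mentioned in this thread.
SKIP_TOOL = "skip_item"


@dataclass
class StepOutcome:
    """Hypothetical container for a step's terminal state."""
    status: str                   # "completed", "agent_skipped", or "failed"
    reason: Optional[str] = None  # precise reason string for non-success states


def classify_update_step(tool_calls: Sequence[str], handler_tool: str) -> StepOutcome:
    """Classify an update step from the names of the tools the agent called."""
    if handler_tool in tool_calls:
        # The required handler tool was called: normal completion.
        return StepOutcome("completed")
    if SKIP_TOOL in tool_calls:
        # Explicit skip is the only path that yields agent_skipped.
        return StepOutcome("agent_skipped", SKIP_TOOL)
    # Tool miss: a protocol error, reported as a failure with a reason.
    # The reason format here is illustrative, not the branch's actual string.
    return StepOutcome("failed", f"handler_tool_not_called:{handler_tool}")
```

Under these assumptions, a run that calls neither the handler tool nor `skip_item` ends as a failure carrying the reason, so it never lands in the `agent_skipped` bucket.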
|
Superseding PR body summary (can’t edit the body via gh due to a projectCards API error in this repo context):
Final semantics in this branch
Code changes for that
|
|
Added follow-up commit for multi-handler determinism in UpdateStep.
What changed
Why
This removes ambiguous “first handler by accident” behavior for multi-handler steps and makes required tool-call contract explicit. |
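One way to picture the determinism point, as a rough Python sketch rather than the actual commit: walk the configured handler tools in their configured order and report every miss, instead of taking whichever result happens to come first. The function name, the dict shape of `tool_results`, and the assumption that every configured handler tool is required are all hypothetical.

```python
from typing import Dict, List, Tuple


def resolve_handler_results(
    required_tools: List[str],
    tool_results: Dict[str, dict],
) -> Tuple[Dict[str, dict], List[str]]:
    """Return (matched results, missing tools), both in configured order."""
    matched: Dict[str, dict] = {}
    missing: List[str] = []
    # Iterate over the step's configuration, not over the results, so the
    # outcome does not depend on the order in which the agent called tools.
    for tool in required_tools:
        if tool in tool_results:
            matched[tool] = tool_results[tool]
        else:
            missing.append(tool)
    return matched, missing
```

With this shape, a non-empty `missing` list is an explicit contract violation that the caller can report by name.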
|
Added test coverage for the contract discussion:
This gives immediate CI guardrails for the semantics we agreed on, independent of Homeboy audit capabilities. Also opened Homeboy enhancement issue for architecture/layer ownership checks: |
a6350e3 to b1e4df1
Summary
- ToolResultFinder when callers explicitly expect optional tool misses
- UpdateStep to emit a structured packet when the required handler tool is not called
- agent_skipped signal in ExecuteStepAbility and complete the job with agent_skipped - handler_tool_not_called
- ToolResultFinder missing-handler logging behavior

Why
At scale, update-step tool misses were being classified as empty_data_packet_returned failures, creating noisy failure metrics. This change gives a semantically correct terminal state for model/tool-use misses without masking true step failures.

Testing
homeboy test data-machine --skip-lint --path="/var/lib/datamachine/workspace/data-machine" -- --filter "PipelineBatchSchedulerTest|ToolResultFinderTest"