Skip to content

Agentic bootstrap, MCP-only planning, model updates, UX improvements#27

Closed
josephgoksu wants to merge 5 commits intomainfrom
fix/security-quality-freshness-review
Closed

Agentic bootstrap, MCP-only planning, model updates, UX improvements#27
josephgoksu wants to merge 5 commits intomainfrom
fix/security-quality-freshness-review

Conversation

@josephgoksu
Copy link
Owner

@josephgoksu josephgoksu commented Mar 16, 2026

Summary

  • Agentic bootstrap agents: Convert doc/deps/git agents to ReAct mode with deterministic fallback. Two-wave execution (doc+deps first, then code+git with context). Adaptive agent selection based on project state.
  • MCP-only planning: Remove CLI plan/goal commands (-1,982 lines). MCP plan tool unchanged. Dynamic task count scaling based on goal complexity. Tasks enriched with baseline project context.
  • Stop hook fix: Sync TasksCompleted from DB (source of truth) on each invocation. Session save retry logic. Clear stale task references.
  • Model registry update: Add gpt-5.4/5.4-mini/5.4-nano, claude-opus-4-6/sonnet-4-6, gemini-3.1 models. Fix o3/gpt-5 pricing and context windows. Newest-first ordering in selection UI.
  • Knowledge output redesign: Grouped-by-type display with section headers. Smart workspace column hidden in single-workspace projects. Spelled-out stats instead of cryptic abbreviations.
  • MCP ask improvements: Knowledge nodes grouped by type in responses. Content preview increased to 300 chars. RAG prompt produces structured answers.

Greptile Summary

This is a large, multi-faceted PR that delivers five meaningful improvements: a two-wave agentic bootstrap with ReAct-first execution and deterministic fallback; removal of ~1,982 lines of CLI planning commands in favour of MCP-only planning; a stop-hook reliability fix syncing task completion from the DB; a model registry update adding the latest GPT-5.4/Claude-4-6/Gemini-3.1 model families; and a knowledge output redesign with grouped-by-type display.

The architectural direction is sound — the two-wave execution model is well-reasoned, the adaptive agent selection is pragmatic, and the safepath.SafeJoin consolidation in handlers.go was overdue. Most changes are clean and well-commented.

Issues found:

  • P1 — debug log prints empty string after clearing task ID (cmd/hook.go): session.CurrentTaskID is zeroed out before being referenced in the verbose debug log, so the log always prints "[DEBUG] Could not load current task : <err>" — the original ID is silently lost, making this log useless for diagnosing stale references.

  • P1 — fallback loop for unknown knowledge types is dead code (internal/mcp/presenter.go): The loop intended to render nodes whose types are absent from typeOrder checks grouped[node.Type] instead of the typeOrder slice. Because grouped is built from all result.Results, every node type is present in the map, so the condition !ok is always false. Any future knowledge type not in typeOrder (e.g., "risk") will be silently dropped from MCP ask responses.

  • P2 — duplicated dependency manifest list (internal/agents/impl/analysis_deps.go / internal/bootstrap/factory.go): commonDepFiles and dependencyManifests are identical slices defined in two separate packages. These will diverge as new ecosystems are added.

  • P2 — non-portable path join (internal/agents/impl/analysis_deps.go): hasAnyDependencyFile uses basePath + "/" + name instead of filepath.Join, inconsistent with all other file-path handling in the codebase.

  • P2 — wave 1 errors silently dropped (internal/bootstrap/runner.go): The _ discard of runParallel's error return for wave 1 provides zero diagnostic visibility when the doc or deps agents fail; at minimum a log line should surface this in verbose mode.

Confidence Score: 3/5

  • Safe to merge after fixing the two P1 logic bugs; the rest is solid engineering.
  • Two confirmed logic bugs lower the score: (1) the debug log in hook.go always prints an empty task ID, degrading debuggability of the stop-hook fix that is the stated purpose of this PR; (2) the unknown-type fallback in presenter.go is unreachable dead code, meaning any future knowledge type outside the hardcoded typeOrder list will be silently dropped from MCP ask responses. Both are mechanical fixes. The broader architectural changes (two-wave bootstrap, MCP-only planning, model registry) are well-implemented with no blocking correctness issues.
  • cmd/hook.go (debug log bug) and internal/mcp/presenter.go (dead-code fallback loop) need fixes before merge.

Important Files Changed

Filename Overview
cmd/hook.go Stop hook hardened with DB-sourced task completion sync and session save retry; has a debug log bug where session.CurrentTaskID is cleared before being printed (always logs empty string).
internal/mcp/presenter.go Knowledge output refactored to group nodes by type with section headers and extended content preview; fallback loop for types outside typeOrder is dead code due to incorrect map check.
internal/bootstrap/runner.go Two-wave execution introduced (doc+deps first, then code+git); wave 1 errors are silently discarded with no logging, reducing operator visibility on partial failures.
internal/bootstrap/factory.go Adaptive agent selection based on project state (git repo, file count, dependency manifests); dependency manifest list is duplicated in analysis_deps.go risking future divergence.
internal/agents/impl/react_bootstrap.go New shared ReAct helper for bootstrap agents; cleanly wraps Eino wiring with ErrNoToolCalling sentinel for graceful fallback to deterministic mode.
internal/llm/models.go Model registry updated with gpt-5.4/claude-4-6/gemini-3.1 entries, pricing corrections, and registry-order-based UI sorting replacing price-proxy sorting; changes look correct but rely entirely on manually-maintained data with no validation.
internal/agents/impl/analysis_deps.go ReAct-first mode added with deterministic fallback; hasAnyDependencyFile uses non-portable string concatenation and duplicates the manifest list from factory.go.
internal/bootstrap/initializer.go Claude Code slash commands now use embedded content via skills package instead of CLI shell-out; migration path correctly handles legacy .claude/skills/tw-* cleanup.
cmd/plan.go 693-line CLI plan command removed entirely; functionality moved to MCP-only path as intended by the PR.
cmd/goal.go 209-line CLI goal command removed; MCP plan tool replaces this functionality end-to-end.
internal/migration/upgrade.go Migration logic refactored to use positive-flag managed detection; adds legacy .claude/commands/ marker check for Claude users migrating from commands to embedded skills.
internal/ui/list_view.go Knowledge list view redesigned with grouped-by-type sections, readable stat labels, and smart workspace column hiding for single-workspace projects.

Sequence Diagram

sequenceDiagram
    participant CLI as tw bootstrap
    participant Factory as bootstrap.Factory
    participant Wave1 as Wave 1 (doc + deps)
    participant Wave2 as Wave 2 (code + git)
    participant ReAct as ReAct Agent (Eino)
    participant Det as Deterministic Chain

    CLI->>Factory: NewDefaultAgents(cfg, path, snap)
    Factory-->>CLI: adaptive agent list

    CLI->>Wave1: runParallel(ctx, wave1Agents, input)
    Wave1->>ReAct: runReactMode(systemPrompt, userMsg, maxSteps)
    alt ReAct succeeds (≥ threshold findings)
        ReAct-->>Wave1: findings
    else ReAct fails / below threshold
        Wave1->>Det: DeterministicChain.Invoke()
        Det-->>Wave1: findings
    end
    Wave1-->>CLI: wave1Results (errors silently discarded)

    CLI->>CLI: buildWaveContext(wave1Results)
    CLI->>Wave2: runParallel(ctx, wave2Agents, input + wave1Context)
    Wave2->>ReAct: runReactMode(systemPrompt, userMsg, maxSteps)
    alt ReAct succeeds
        ReAct-->>Wave2: findings
    else fallback
        Wave2->>Det: DeterministicChain.Invoke()
        Det-->>Wave2: findings
    end
    Wave2-->>CLI: wave2Results

    CLI-->>CLI: append(wave1Results, wave2Results)
    CLI->>CLI: printPostBootstrapSummary()
Loading

Fix All in Claude Code

Prompt To Fix All With AI
This is a comment left during a code review.
Path: cmd/hook.go
Line: 259-265

Comment:
**Debug log prints empty string after ID is cleared**

`session.CurrentTaskID` is set to `""` on line 261 *before* it is referenced in the debug `fmt.Fprintf` on line 263. The log will always print `"[DEBUG] Could not load current task : <err>"` — the original ID is lost, making the log useless for diagnosing which task caused the failure.

Capture the ID before clearing it:

```suggestion
	currentTask, err = repo.GetTask(session.CurrentTaskID)
	if err != nil {
		// Task not found or DB error -- clear stale reference
		staleID := session.CurrentTaskID
		session.CurrentTaskID = ""
		if viper.GetBool("verbose") {
			fmt.Fprintf(os.Stderr, "[DEBUG] Could not load current task %s: %v\n", staleID, err)
		}
	}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: internal/mcp/presenter.go
Line: 82-89

Comment:
**Fallback loop for unknown types is dead code — condition can never be true**

The `grouped` map is populated from `result.Results`:
```go
for _, node := range result.Results {
    grouped[node.Type] = append(grouped[node.Type], node)
}
```

Because every node's type is inserted into `grouped` during this loop, the subsequent check `if _, ok := grouped[node.Type]; !ok` will **always** be false — every type that appears in `result.Results` is guaranteed to be a key in `grouped`. Nodes whose types are absent from `typeOrder` (e.g., a future `"risk"` type) will be silently dropped from the output rather than rendered in the fallback section.

The check should compare against `typeOrder`, not `grouped`:

```go
// Include any types not in the standard order
inOrder := make(map[string]bool, len(typeOrder))
for _, t := range typeOrder {
    inOrder[t] = true
}
for _, node := range result.Results {
    if !inOrder[node.Type] {
        sb.WriteString(fmt.Sprintf("%d. **%s** (%s)\n", idx, node.Summary, node.Type))
        idx++
    }
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: internal/agents/impl/analysis_deps.go
Line: 279-285

Comment:
**Duplicated dependency manifest list and non-portable path concatenation**

`commonDepFiles` here is identical to `dependencyManifests` in `internal/bootstrap/factory.go`. These two lists will inevitably diverge as new ecosystems are added (e.g., `mix.exs` for Elixir, `flake.nix` for Nix). Centralise the list in one place and reference it from both.

Additionally, `basePath + "/" + name` is non-portable. Prefer `filepath.Join(basePath, name)` which handles OS path separators correctly and is consistent with the rest of the codebase:

```suggestion
// hasAnyDependencyFile checks if at least one common dependency manifest exists.
func hasAnyDependencyFile(basePath string) bool {
	for _, name := range commonDepFiles {
		if _, err := os.Stat(filepath.Join(basePath, name)); err == nil {
			return true
		}
	}
	return false
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: internal/bootstrap/runner.go
Line: 79-82

Comment:
**Wave 1 errors are silently discarded**

`wave1Results, _ := runParallel(...)` discards any error from the first wave entirely. While the comment correctly notes wave 1 failures are non-fatal, completely swallowing the error means operators have no visibility when the `doc` or `deps` agents fail. Even a single `log.Printf` would surface this in verbose mode, which is valuable for debugging unexpected bootstrap behaviour:

```suggestion
	wave1Results, wave1Err := runParallel(ctx, wave1Agents, input)
	if wave1Err != nil {
		log.Printf("[bootstrap] wave 1 partial failure (non-fatal): %v", wave1Err)
	}
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "refactor(plan): remo..."

Security (3 medium-severity):
- Block find -exec/-execdir/-delete/-ok in ExecTool (injection risk)
- Add "--" before grep pattern to prevent flag injection
- Replace ad-hoc path validation in MCP handlers with safepath.SafeJoin
- Use safepath.SafeJoin in policy builtins resolvePath (traversal risk)
- Change crash log permissions from 0644 to 0600

Code quality:
- Add rows.Err() checks after all 8 rows.Next() loops in codeintel/repository.go
- Replace err == io.EOF with errors.Is(err, io.EOF) in 3 files
- Fix out.Close() error handling in memory_export.go (named return)

Freshness review fixes (Greptile PR #26 feedback):
- Fix decay discontinuity: unified formula (1.0 - ratio * 0.8) replaces
  hardcoded 0.2 all-missing + 0.4 formula partial-missing
- Add cache eviction (purge entries >2x TTL when cache exceeds 1000)
- Convert global cache to struct-based statCache (testable, bounded)
- Add TODO(freshness-level2) comments for dead code paths
- Mark freshness plan doc as "implemented with deviations"
- Add TestDecaySmoothCurve test verifying no discontinuities
Comment on lines 52 to 62
if filepath.IsAbs(path) {
return path
// Verify absolute path is within WorkDir
absWorkDir, err := filepath.Abs(bc.WorkDir)
if err != nil {
return "", fmt.Errorf("resolve work dir: %w", err)
}
if !strings.HasPrefix(filepath.Clean(path), absWorkDir) {
return "", fmt.Errorf("path outside work directory: %s", path)
}
return path, nil
}

This comment was marked as resolved.

Comment on lines +215 to +221
c.mu.Lock()
c.entries[path] = cacheEntry{info: info, err: err, checkedAt: time.Now()}
// Evict expired entries when cache grows too large
if len(c.entries) > cacheMaxSize {
c.evictExpired()
}
c.mu.Unlock()

This comment was marked as resolved.

- Updated migration tests to clarify directory structure and improve comments for better understanding.
- Enhanced error handling in the TUI by treating agent completion warnings as non-fatal, allowing for smoother user experience.
- Improved list view rendering by adding workspace checks and refactoring header rendering for better clarity.
- Introduced new skills for TaskWing, including detailed SKILL.md files for various commands to enhance project knowledge and workflow.
- Added functionality to simplify code while preserving behavior, ensuring that optimization does not bypass necessary gates.
- Implemented status command to provide current task progress and acceptance criteria, improving user awareness of task status.
}
}
// Also check for go.mod in subdirectories (monorepo)
if entries, err := os.ReadDir(basePath); err == nil {

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI 4 days ago

General fix: ensure that any path derived from user input is validated before being used in filesystem operations, not only when joining subpaths. In this case, hasDependencyFiles should not blindly trust its basePath argument; it should verify that basePath is a valid, existing directory (and, ideally, normalized) before calling os.ReadDir or os.Stat on paths derived from it.

Best concrete fix here without changing behavior: add a minimal, defensive validation step at the start of hasDependencyFiles that rejects non-existent or non-directory basePath values. This keeps its behavior the same for legitimate calls (where projectPath is an existing project directory) but ensures that even if ValidateAbsPath were misused or bypassed elsewhere in the future, hasDependencyFiles itself will not operate on arbitrary or unexpected paths. The new check should be placed inside hasDependencyFiles in internal/bootstrap/factory.go, before any filesystem operations: use os.Stat(basePath) to confirm that it exists and IsDir() to confirm it’s a directory; if not, return false immediately. This change is self-contained, requires no new imports (we already import os), and preserves all existing functionality for valid directories.

Concretely:

  • Edit internal/bootstrap/factory.go.

  • In hasDependencyFiles, immediately after the function signature, add:

    info, err := os.Stat(basePath)
    if err != nil || !info.IsDir() {
        return false
    }
  • Leave the rest of the logic as is, still using safepath.SafeJoin for combining basePath with file names and still iterating dependency manifests and subdirectories.

No other files (internal/server/handlers.go, internal/bootstrap/runner.go) need changes, because handleBootstrap already validates ProjectPath with safepath.ValidateAbsPath, and the new guard makes hasDependencyFiles robust even if it’s called from other contexts.


Suggested changeset 1
internal/bootstrap/factory.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/internal/bootstrap/factory.go b/internal/bootstrap/factory.go
--- a/internal/bootstrap/factory.go
+++ b/internal/bootstrap/factory.go
@@ -59,6 +59,12 @@
 // hasDependencyFiles checks if any common dependency manifest exists at the project root.
 // Uses safepath.SafeJoin to prevent path traversal in basePath.
 func hasDependencyFiles(basePath string) bool {
+	// Ensure basePath points to an existing directory before inspecting it.
+	info, err := os.Stat(basePath)
+	if err != nil || !info.IsDir() {
+		return false
+	}
+
 	for _, name := range dependencyManifests {
 		p, err := safepath.SafeJoin(basePath, name)
 		if err != nil {
EOF
@@ -59,6 +59,12 @@
// hasDependencyFiles checks if any common dependency manifest exists at the project root.
// Uses safepath.SafeJoin to prevent path traversal in basePath.
func hasDependencyFiles(basePath string) bool {
// Ensure basePath points to an existing directory before inspecting it.
info, err := os.Stat(basePath)
if err != nil || !info.IsDir() {
return false
}

for _, name := range dependencyManifests {
p, err := safepath.SafeJoin(basePath, name)
if err != nil {
Copilot is powered by AI and may make mistakes. Always verify output.
- Remove cmd/plan.go, cmd/goal.go, and plan TUI (1,982 lines deleted)
- MCP plan tool (clarify/generate/decompose/expand/finalize/audit) unchanged
- Planning prompt now scales task count dynamically to goal complexity
- Tasks include self-contained context (files, patterns, constraints, tech stack)
- Task enricher always provides baseline project context (constraints + decisions)
- Fix stop hook reliability: sync TasksCompleted from DB on each invocation
- Add session save retry logic to prevent stale state from failed writes
- Clear stale CurrentTaskID when DB lookup fails
@josephgoksu josephgoksu changed the title fix: security hardening, code quality, and freshness review fixes Agentic bootstrap, MCP-only planning, model updates, UX improvements Mar 18, 2026
@josephgoksu
Copy link
Owner Author

@greptileai

- policy/builtins.go: use safepath.ValidateAbsPath for absolute paths,
  fixes prefix collision bypass (e.g. /project-evil matching /project)
- freshness/freshness.go: add evictOldest fallback when all cache entries
  are fresh but cache exceeds max size during burst scenarios
- mcp/handlers.go: restore absolute path support in validateAndResolvePath
  via safepath.ValidateAbsPath, fixing regression for MCP clients
- agents/impl/analysis_deps.go: use filepath.Join instead of string concat
- bootstrap/factory.go: use safepath.SafeJoin for dependency file checks
// hasAnyDependencyFile checks if at least one common dependency manifest exists.
func hasAnyDependencyFile(basePath string) bool {
for _, name := range commonDepFiles {
if _, err := os.Stat(filepath.Join(basePath, name)); err == nil {

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI 4 days ago

Copilot could not generate an autofix suggestion

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

if err != nil {
continue
}
if _, err := os.Stat(p); err == nil {

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI 4 days ago

Copilot could not generate an autofix suggestion

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

…improvements

- Unified GetProjectContext replaces 3 separate retrieval paths (ContextRetriever, TaskEnricher, buildTaskContext)
- Bootstrap: fix duplicate "Bootstrap complete!" in multi-repo, batch link warnings into single summary, clean verification output
- Bootstrap: per-service project context in workspace mode, progress callbacks, repo count accuracy
- Models: add gpt-5.4/5.4-mini/5.4-nano, claude-opus-4-6/sonnet-4-6, gemini-3.1, remove deprecated models
- Models: newest-first ordering in selection UI
- UI: grouped knowledge output, freshness indicators, consistent header boxes across all commands
- UI: markdown rendering in ask --answer output, adaptive width
- Skills: add /taskwing:context, remove /taskwing:debug and /taskwing:simplify, fix stale file pruning
- Skills: clarify questions presented to user (not auto-answered), auto-start first task after plan creation
- Hooks: stop hook syncs TasksCompleted from DB, session save retry, verbose logs gated behind --debug
- MCP: behavioral CLAUDE.md instructions tell AI tools when to use TaskWing
- Prompts: generic and directive (no hardcoded tech stack examples), dynamic task count scaling
- Fix: read_file tool returns helpful message for directories instead of crashing ReAct agent
- Fix: auto-regenerate slash commands on MCP server start after brew upgrade
- Fix: truncateString consolidated to utils.Truncate, nil guards in context.go, brief.go pluralization
- Remove: CLI plan/goal/slash commands, docs/_partials, sync-docs scripts, stale doc references
if err != nil {
continue
}
if _, err := os.Stat(p); err == nil {

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI 4 days ago

Copilot could not generate an autofix suggestion

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant