Skip to content

feat: local-first repositioning + query-time freshness validation#26

Merged
josephgoksu merged 11 commits intomainfrom
feat/query-time-freshness
Mar 15, 2026
Merged

feat: local-first repositioning + query-time freshness validation#26
josephgoksu merged 11 commits intomainfrom
feat/query-time-freshness

Conversation

@josephgoksu
Copy link
Owner

@josephgoksu josephgoksu commented Mar 15, 2026

Summary

Two major workstreams in one branch:

1. Local-first positioning overhaul

  • Reposition from "AI task manager" to "local-first AI knowledge layer"
  • Update tagline, hero copy, and privacy section across all docs (README, PRODUCT_VISION, TUTORIAL, CLAUDE.md, GEMINI.md, AGENTS.md)
  • Add honest "Private by Architecture" section with ASCII data flow diagram showing what TaskWing controls vs what the AI tool controls
  • Update CLI help text and LLM prompt to match new positioning
  • Add 5-phase positioning research (messaging framework, market validation, copy recommendations, credibility audit, SEO alignment)

2. Query-time freshness validation (Level 1)

  • New internal/freshness package: stat-based validation of evidence files at MCP query time
  • When ask returns knowledge findings, check if evidence files changed since the finding was created
  • Stale findings get confidence decayed (0.7x stale, 0.2x all-missing, floor 0.1)
  • MCP responses annotated: [verified 2h ago], [STALE: auth.go changed], [WARNING: evidence file deleted]
  • 60-second stat cache to avoid repeated syscalls across queries
  • Skip patterns for build artifacts (dist/, node_modules/, vendor/)
  • Schema migration: last_verified_at and original_confidence columns on nodes table
  • 13 tests covering fresh, stale, missing, no-evidence, cache, skip patterns

3. Planning docs

  • Multi-agent bootstrap pipeline plan (4 internal post-processing agents)
  • Tool template system plan (single-file self-registering tool pattern)
  • Freshness validation architecture plan

Test plan

  • go build ./... passes
  • go test ./... -count=1 -short -- 206 tests pass across 33 packages
  • go vet ./... clean
  • Doc marker consistency verified across README, CLAUDE, GEMINI, VISION, TUTORIAL
  • No old positioning language in production code or user-facing docs
  • QA review found and fixed 3 freshness defects (epoch reference time, dead variable, monorepo skip)

Greptile Summary

This PR delivers two parallel workstreams: a comprehensive local-first positioning overhaul across all docs/CLI, and a Level 1 query-time freshness validation system for the knowledge graph.

  • Positioning overhaul is thorough and consistent — tagline, hero copy, CLI help text, LLM prompts, and all 7 doc files updated with matching language. The new "Private by Architecture" section with ASCII data flow diagrams is honest about trust boundaries (what TaskWing controls vs what AI tools control).
  • Freshness validation (internal/freshness/) adds stat-based evidence file checking at MCP query time, with confidence decay for stale/missing files. The core logic is sound and well-tested with 13 tests covering fresh, stale, missing, no-evidence, cache, and skip patterns.
  • Key concern: The partial-missing decay formula has a discontinuity — jumping from 0.7 (1 of 2 files missing) to 0.2 (2 of 2 missing) is a 3.5x confidence cliff that seems unintentional.
  • Dead code: UpdateNodeFreshness(), GetNodeFreshness(), the last_verified_at/original_confidence schema columns, and the [verified Xh ago] format path are all unreachable — shipped for future Level 2 but adding maintenance burden now.
  • Global stat cache grows unboundedly. Works fine for current scale but will leak memory for large projects in long-running MCP server processes.
  • Planning docs have drifted from the implementation (missing LastVerifiedLevel, different function signatures, different decay factors).

Confidence Score: 3/5

  • PR is safe to merge with minor issues — the decay discontinuity should be reviewed but won't cause crashes.
  • Score of 3 reflects: solid test coverage and clean doc consistency, offset by a logic discontinuity in the decay formula, dead code shipped prematurely (schema columns + methods never called), and an unbounded global cache. None are blockers, but the decay behavior may produce surprising confidence scores for users.
  • Pay close attention to internal/freshness/freshness.go (decay discontinuity, unbounded cache) and internal/app/ask.go (dead persistence path, fallback reference time).

Important Files Changed

Filename Overview
internal/freshness/freshness.go New stat-based freshness checker. Global package-level mutable cache is a concurrency concern for long-running server processes and complicates testing. Partial-missing decay has a code path conflict (unreachable 0.2 vs formula 0.4).
internal/freshness/freshness_test.go 13 tests with good coverage across fresh, stale, missing, no-evidence, cache, and skip patterns. Tests properly call ResetCache() to isolate state.
internal/app/ask.go Adds freshness annotation to query results. Evidence JSON is re-serialized from parsed EvidenceRef (lossy round-trip). Freshness check results and last_verified_at are never persisted back to the database despite schema support.
internal/knowledge/response.go Adds freshness metadata fields (FreshnessStatus, FreshnessNote, StaleFiles) and CreatedAt to NodeResponse. Clean additions.
internal/mcp/presenter.go Cleanly adds freshness annotation to MCP response formatting. Minimal, correct change.
internal/memory/models.go Adds LastVerifiedAt and OriginalConfidence fields to Node struct. Plan mentions LastVerifiedLevel (int) but it was not implemented -- minor doc/code divergence.
internal/memory/sqlite.go Adds schema migration for freshness columns and CRUD methods. UpdateNodeFreshness and GetNodeFreshness are defined but never called by any code path in this PR.
cmd/root.go CLI help text updated to reflect local-first positioning. Clean change.
README.md Comprehensive positioning overhaul with new privacy architecture section and ASCII data flow diagram. Well-written and honest about trust boundaries.
docs/positioning/freshness-validation-plan.md Detailed plan for query-time freshness. Some drift from implementation (LastVerifiedLevel not implemented, function name differs, decay factors don't match code).

Sequence Diagram

sequenceDiagram
    participant AI as AI Tool (MCP Client)
    participant MCP as MCP Server (presenter.go)
    participant Ask as AskApp.Query (ask.go)
    participant KS as Knowledge Service
    participant SQLite as SQLite (memory.db)
    participant Fresh as freshness.Check
    participant FS as Filesystem (os.Stat)

    AI->>MCP: ask("how does auth work?")
    MCP->>Ask: Query(ctx, query, opts)
    Ask->>KS: Search(ctx, query, limit)
    KS->>SQLite: SearchFTS / VectorSearch
    SQLite-->>KS: []ScoredNode (with CreatedAt)
    KS-->>Ask: []ScoredNode
    Ask->>Ask: Convert to []NodeResponse

    rect rgb(255, 245, 230)
        Note over Ask,FS: Freshness Validation (new in this PR)
        loop For each NodeResponse
            Ask->>Fresh: Check(basePath, evidenceJSON, createdAt)
            Fresh->>FS: statCached(evidencePath)
            FS-->>Fresh: FileInfo.ModTime
            Fresh-->>Ask: Result{Status, DecayFactor}
            Ask->>Ask: Adjust ConfidenceScore × DecayFactor
            Ask->>Ask: Set FreshnessNote annotation
        end
    end

    Ask-->>MCP: AskResult (annotated)
    MCP->>MCP: FormatAsk (includes [STALE: ...] notes)
    MCP-->>AI: Markdown response with freshness metadata
Loading

Fix All in Claude Code

Prompt To Fix All With AI
This is a comment left during a code review.
Path: internal/freshness/freshness.go
Line: 193-197

Comment:
**Global mutable cache risks stale state across tests and long-running processes**

The stat cache is a package-level global with a `sync.RWMutex`. While `ResetCache()` exists for tests, this design has two concrete problems:

1. **Unbounded growth**: The cache `map[string]cacheEntry` grows forever — entries expire by TTL but are never evicted. For a project with thousands of evidence files, this leaks memory over the lifetime of an MCP server process.

2. **Testability**: Any test that forgets `ResetCache()` inherits stale state from prior tests. This is fragile — a test reorder or parallelism change can cause intermittent failures.

Consider either (a) adding a periodic eviction sweep (e.g., purge entries older than 2x TTL when cache exceeds N entries), or (b) making the cache an instance field on a `Checker` struct so each caller owns its own lifecycle. Option (b) also makes it trivially testable without global mutation.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: internal/freshness/freshness.go
Line: 102-122

Comment:
**Partial-missing decay formula contradicts all-missing constant**

When all evidence files are missing (`len(missing) == len(paths)`), the decay is hardcoded to `0.2` (line 107). But when "some files are missing", the formula on line 115 is:

```
decay = 1.0 - (missingRatio * 0.6)
```

At `missingRatio = 1.0` (which is checked and returned early above), this formula would yield `0.4` — not `0.2`. This means there's a discontinuity: if you have 2 evidence files and both are missing, you get `0.2`. But if you have 2 evidence files and 1 is missing, you get `1.0 - (0.5 * 0.6) = 0.7`. 

The jump from `0.7` (1 of 2 missing) to `0.2` (2 of 2 missing) is a 3.5x confidence cliff. This feels unintentional — if the formula is the intended behavior for partial missing, the all-missing case should probably use the same formula (`0.4`) or the formula constant should be `0.8` so partial approaches `0.2` smoothly.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: internal/app/ask.go
Line: 531-576

Comment:
**Freshness results and `last_verified_at` are never persisted to the database**

The schema migration adds `last_verified_at` and `original_confidence` columns, and `sqlite.go` defines `UpdateNodeFreshness()` / `GetNodeFreshness()` methods — but `annotateResultFreshness` never calls them. Every query re-stats every evidence file from scratch, and the `FormatStatus` call on line 566 always passes `nil` for `lastVerified`, meaning fresh nodes always display `[fresh]` instead of `[verified 2h ago]`.

This means:
1. The `[verified Xh ago]` annotation path in `FormatStatus` is dead code in practice — it can never be reached.
2. The `UpdateNodeFreshness` and `GetNodeFreshness` methods in `sqlite.go` are dead code — never called anywhere.
3. Every MCP query redundantly re-stats all evidence files even when nothing changed.

If this is intentional for Level 1 (persist later in Level 2+), that's fine — but the dead code should be flagged with a `// TODO(freshness-level2)` comment, and the schema columns should arguably be deferred until they're actually written to. Shipping dead schema + dead methods increases maintenance burden for zero current value.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: internal/app/ask.go
Line: 556-561

Comment:
**Fallback reference time biases all nodes without `CreatedAt` toward stale**

When `refTime.IsZero()`, the fallback is `time.Now().Add(-24 * time.Hour)`. This means any file modified in the last 24 hours will make the node appear stale — even though the node could have been created moments ago (if `CreatedAt` wasn't persisted for some reason). 

For the current codebase this may be acceptable since `CreateNode` in `sqlite.go:763` defaults `CreatedAt` to `time.Now().UTC()` when zero. But if nodes are ever imported or migrated without `created_at`, they'll all be marked stale by default. Consider logging a warning here so this edge case is visible during debugging.

```suggestion
		refTime := node.CreatedAt
		if refTime.IsZero() {
			// Fallback: if no creation time, treat as stale to be safe.
			// This can happen with imported or pre-v2.3 nodes missing created_at.
			refTime = time.Now().Add(-24 * time.Hour)
		}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: docs/positioning/freshness-validation-plan.md
Line: 56-60

Comment:
**Plan doc diverged from implementation**

Three discrepancies between this plan and the shipped code:

1. **`LastVerifiedLevel int`** is in the plan but not implemented in `models.go` or the schema migration. The plan's `last_verified_level INTEGER DEFAULT 0` column was also not added.
2. **`FreshnessAge string`** is in the plan's `NodeResponse` spec but the implementation uses `FreshnessNote string` instead.
3. **Integration function**: Plan says `freshness.ValidateNodeResponses(ctx, a.ctx.Repo, a.ctx.BasePath, result.Results)`, but implementation is `annotateResultFreshness(a.ctx.BasePath, results)` — a local function that doesn't take `ctx` or `Repo`, meaning it can't persist results.
4. **Decay factors**: Plan shows single-file-deleted as `0.4` and all-deleted as `0.2`. Code uses `0.7` for stale (not in plan) and a formula `1.0 - (missingRatio * 0.6)` for partial missing (not in plan).

Consider updating this plan to reflect what was actually shipped, or marking it with a "Status: Implemented (with deviations)" header. Stale plan docs become misleading for future contributors.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 46a1be4

Greptile also left 5 inline comments on this PR.

Update hero copy, tagline, and capability table to lead with
local-first architecture and sovereignty angle. Add "Private by
Architecture" section with data flow diagram showing bootstrap ->
local SQLite -> local MCP queries. Update PRODUCT_VISION.md with
matching positioning and sovereignty property table.
Phase 1: messaging framework (dual-track, 5 sovereignty claims)
Phase 2: market validation (competitor audit, terminology testing)
Phase 3: copy recommendations (README hero, landing page, word swaps)
Phase 4: credibility audit (claim-by-claim truth check)
Phase 5: SEO alignment (keyword clusters, metadata, content strategy)
Design for 4 internal post-processing agents (Synthesizer,
CrossChecker, Refiner, CoverageAuditor) that improve bootstrap
quality. Covers AI architecture, software integration, and
shipping plan across v1.22.0-v1.24.0. No backward compat flag --
post-processing is always on.
Replace "Nothing leaves your machine" with honest two-part framing:
what TaskWing controls (local storage + local MCP queries) vs what
the AI tool controls (cloud tools send conversations to their servers).
Add clear guidance: use Ollama + local AI tool for full air-gap.
Show two paths clearly: standard (cloud LLM + cloud AI tool) and
full air-gap (Ollama + local AI tool). Separate what TaskWing
controls from what the AI tool controls. No misleading claims.
Replace "persistent context/memory", "permanent memory", "AI task
management" with "local-first AI knowledge layer" language across:
- cmd/root.go (CLI help text)
- initializer.go (generated doc section constant)
- knowledge/service.go (LLM prompt)
- CLAUDE.md, GEMINI.md, AGENTS.md (developer docs)
- TUTORIAL.md, README.md (user-facing docs)

Skip CHANGELOG.md (historical record).
Design for single-file self-registering tool pattern that reduces
adding a new MCP tool from 6 files to 2. Includes final roster of
9 MCP tools (6 migrated + health, onboard, review) and 18 slash
commands. 10-step migration plan from current system.
- README: "persistent architectural context" -> "local architectural context"
- PRODUCT_VISION: "codebase flowing through someone else's cloud" -> "knowledge
  base leaving your machine" (Phase 4 credibility audit flagged the original
  as overstating -- bootstrap DOES send code to LLMs)
Design for validating knowledge graph freshness at MCP query time.
stat() evidence files on every query, adjust confidence for stale
findings, show freshness metadata in responses. Three levels:
stat check (inline, <1ms), snippet re-validation (stale files only),
and async re-analysis queue. Covers ~95% of staleness scenarios
without hooks, daemons, or cron.
Add query-time freshness validation for MCP knowledge responses.
When any MCP tool returns knowledge findings, stat() evidence files
to detect whether they changed since last analysis.

- New internal/freshness package with Check() and stat cache (60s TTL)
- Schema migration: add last_verified_at, original_confidence to nodes
- NodeResponse gains FreshnessStatus, FreshnessNote, StaleFiles fields
- MCP presenter shows [verified Xh ago] or [STALE: file.go changed]
- Confidence adjusted by decay factor (0.7 stale, 0.2 all-missing)
- Skip patterns for build artifacts (dist/, node_modules/, vendor/)
- 12 tests covering fresh, stale, missing, no-evidence, cache, skip

Performance: <1ms per query (stat syscalls + cache). Zero LLM cost.
1. Remove dead variable `checked` (incremented but never read)
2. Fix reference time: was using epoch (year 0001) which made every
   finding appear stale on every query. Now uses node.CreatedAt --
   files modified after the node was created indicate staleness.
3. Fix shouldSkip for monorepo paths: api/node_modules/ was not
   matched because only prefix-check was done. Now also checks
   for /pattern as a path segment.

Also: add CreatedAt to NodeResponse for freshness reference,
add test for monorepo skip pattern.
Comment on lines +193 to +197
var (
cache = make(map[string]cacheEntry)
cacheMu sync.RWMutex
cacheTTL = 60 * time.Second
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global mutable cache risks stale state across tests and long-running processes

The stat cache is a package-level global with a sync.RWMutex. While ResetCache() exists for tests, this design has two concrete problems:

  1. Unbounded growth: The cache map[string]cacheEntry grows forever — entries expire by TTL but are never evicted. For a project with thousands of evidence files, this leaks memory over the lifetime of an MCP server process.

  2. Testability: Any test that forgets ResetCache() inherits stale state from prior tests. This is fragile — a test reorder or parallelism change can cause intermittent failures.

Consider either (a) adding a periodic eviction sweep (e.g., purge entries older than 2x TTL when cache exceeds N entries), or (b) making the cache an instance field on a Checker struct so each caller owns its own lifecycle. Option (b) also makes it trivially testable without global mutation.

Prompt To Fix With AI
This is a comment left during a code review.
Path: internal/freshness/freshness.go
Line: 193-197

Comment:
**Global mutable cache risks stale state across tests and long-running processes**

The stat cache is a package-level global with a `sync.RWMutex`. While `ResetCache()` exists for tests, this design has two concrete problems:

1. **Unbounded growth**: The cache `map[string]cacheEntry` grows forever — entries expire by TTL but are never evicted. For a project with thousands of evidence files, this leaks memory over the lifetime of an MCP server process.

2. **Testability**: Any test that forgets `ResetCache()` inherits stale state from prior tests. This is fragile — a test reorder or parallelism change can cause intermittent failures.

Consider either (a) adding a periodic eviction sweep (e.g., purge entries older than 2x TTL when cache exceeds N entries), or (b) making the cache an instance field on a `Checker` struct so each caller owns its own lifecycle. Option (b) also makes it trivially testable without global mutation.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

Comment on lines +102 to +122
// All evidence files deleted
if len(missing) == len(paths) {
return Result{
Status: StatusMissing,
MissingFiles: missing,
DecayFactor: 0.2,
CheckedAt: now,
}
}

// Some files missing
if len(missing) > 0 {
missingRatio := float64(len(missing)) / float64(len(paths))
decay := 1.0 - (missingRatio * 0.6) // 0.4 at 100% missing
return Result{
Status: StatusMissing,
StaleFiles: stale,
MissingFiles: missing,
DecayFactor: decay,
CheckedAt: now,
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial-missing decay formula contradicts all-missing constant

When all evidence files are missing (len(missing) == len(paths)), the decay is hardcoded to 0.2 (line 107). But when "some files are missing", the formula on line 115 is:

decay = 1.0 - (missingRatio * 0.6)

At missingRatio = 1.0 (which is checked and returned early above), this formula would yield 0.4 — not 0.2. This means there's a discontinuity: if you have 2 evidence files and both are missing, you get 0.2. But if you have 2 evidence files and 1 is missing, you get 1.0 - (0.5 * 0.6) = 0.7.

The jump from 0.7 (1 of 2 missing) to 0.2 (2 of 2 missing) is a 3.5x confidence cliff. This feels unintentional — if the formula is the intended behavior for partial missing, the all-missing case should probably use the same formula (0.4) or the formula constant should be 0.8 so partial approaches 0.2 smoothly.

Prompt To Fix With AI
This is a comment left during a code review.
Path: internal/freshness/freshness.go
Line: 102-122

Comment:
**Partial-missing decay formula contradicts all-missing constant**

When all evidence files are missing (`len(missing) == len(paths)`), the decay is hardcoded to `0.2` (line 107). But when "some files are missing", the formula on line 115 is:

```
decay = 1.0 - (missingRatio * 0.6)
```

At `missingRatio = 1.0` (which is checked and returned early above), this formula would yield `0.4` — not `0.2`. This means there's a discontinuity: if you have 2 evidence files and both are missing, you get `0.2`. But if you have 2 evidence files and 1 is missing, you get `1.0 - (0.5 * 0.6) = 0.7`. 

The jump from `0.7` (1 of 2 missing) to `0.2` (2 of 2 missing) is a 3.5x confidence cliff. This feels unintentional — if the formula is the intended behavior for partial missing, the all-missing case should probably use the same formula (`0.4`) or the formula constant should be `0.8` so partial approaches `0.2` smoothly.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

Comment on lines +531 to +576
func annotateResultFreshness(basePath string, results []knowledge.NodeResponse) {
for i := range results {
node := &results[i]

// Reconstruct evidence JSON from the parsed EvidenceRef slice
// for the freshness checker
if len(node.Evidence) == 0 {
node.FreshnessStatus = string(freshness.StatusNoEvidence)
continue
}

// Build evidence JSON from EvidenceRef
type ev struct {
FilePath string `json:"file_path"`
}
items := make([]ev, len(node.Evidence))
for j, e := range node.Evidence {
items[j] = ev{FilePath: e.File}
}
evJSON, err := json.Marshal(items)
if err != nil {
continue
}

// Use node creation time as the reference. Files modified after the
// node was created indicate the knowledge may be stale.
refTime := node.CreatedAt
if refTime.IsZero() {
// Fallback: if no creation time, treat as stale to be safe
refTime = time.Now().Add(-24 * time.Hour)
}

result := freshness.Check(basePath, string(evJSON), refTime)

node.FreshnessStatus = string(result.Status)
node.FreshnessNote = freshness.FormatStatus(result, nil)
node.StaleFiles = result.StaleFiles

// Adjust confidence if stale or missing
if result.DecayFactor < 1.0 && node.ConfidenceScore > 0 {
node.ConfidenceScore = node.ConfidenceScore * result.DecayFactor
if node.ConfidenceScore < 0.1 {
node.ConfidenceScore = 0.1
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Freshness results and last_verified_at are never persisted to the database

The schema migration adds last_verified_at and original_confidence columns, and sqlite.go defines UpdateNodeFreshness() / GetNodeFreshness() methods — but annotateResultFreshness never calls them. Every query re-stats every evidence file from scratch, and the FormatStatus call on line 566 always passes nil for lastVerified, meaning fresh nodes always display [fresh] instead of [verified 2h ago].

This means:

  1. The [verified Xh ago] annotation path in FormatStatus is dead code in practice — it can never be reached.
  2. The UpdateNodeFreshness and GetNodeFreshness methods in sqlite.go are dead code — never called anywhere.
  3. Every MCP query redundantly re-stats all evidence files even when nothing changed.

If this is intentional for Level 1 (persist later in Level 2+), that's fine — but the dead code should be flagged with a // TODO(freshness-level2) comment, and the schema columns should arguably be deferred until they're actually written to. Shipping dead schema + dead methods increases maintenance burden for zero current value.

Prompt To Fix With AI
This is a comment left during a code review.
Path: internal/app/ask.go
Line: 531-576

Comment:
**Freshness results and `last_verified_at` are never persisted to the database**

The schema migration adds `last_verified_at` and `original_confidence` columns, and `sqlite.go` defines `UpdateNodeFreshness()` / `GetNodeFreshness()` methods — but `annotateResultFreshness` never calls them. Every query re-stats every evidence file from scratch, and the `FormatStatus` call on line 566 always passes `nil` for `lastVerified`, meaning fresh nodes always display `[fresh]` instead of `[verified 2h ago]`.

This means:
1. The `[verified Xh ago]` annotation path in `FormatStatus` is dead code in practice — it can never be reached.
2. The `UpdateNodeFreshness` and `GetNodeFreshness` methods in `sqlite.go` are dead code — never called anywhere.
3. Every MCP query redundantly re-stats all evidence files even when nothing changed.

If this is intentional for Level 1 (persist later in Level 2+), that's fine — but the dead code should be flagged with a `// TODO(freshness-level2)` comment, and the schema columns should arguably be deferred until they're actually written to. Shipping dead schema + dead methods increases maintenance burden for zero current value.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

Comment on lines +556 to +561
// node was created indicate the knowledge may be stale.
refTime := node.CreatedAt
if refTime.IsZero() {
// Fallback: if no creation time, treat as stale to be safe
refTime = time.Now().Add(-24 * time.Hour)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fallback reference time biases all nodes without CreatedAt toward stale

When refTime.IsZero(), the fallback is time.Now().Add(-24 * time.Hour). This means any file modified in the last 24 hours will make the node appear stale — even though the node could have been created moments ago (if CreatedAt wasn't persisted for some reason).

For the current codebase this may be acceptable since CreateNode in sqlite.go:763 defaults CreatedAt to time.Now().UTC() when zero. But if nodes are ever imported or migrated without created_at, they'll all be marked stale by default. Consider logging a warning here so this edge case is visible during debugging.

Suggested change
// node was created indicate the knowledge may be stale.
refTime := node.CreatedAt
if refTime.IsZero() {
// Fallback: if no creation time, treat as stale to be safe
refTime = time.Now().Add(-24 * time.Hour)
}
refTime := node.CreatedAt
if refTime.IsZero() {
// Fallback: if no creation time, treat as stale to be safe.
// This can happen with imported or pre-v2.3 nodes missing created_at.
refTime = time.Now().Add(-24 * time.Hour)
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: internal/app/ask.go
Line: 556-561

Comment:
**Fallback reference time biases all nodes without `CreatedAt` toward stale**

When `refTime.IsZero()`, the fallback is `time.Now().Add(-24 * time.Hour)`. This means any file modified in the last 24 hours will make the node appear stale — even though the node could have been created moments ago (if `CreatedAt` wasn't persisted for some reason). 

For the current codebase this may be acceptable since `CreateNode` in `sqlite.go:763` defaults `CreatedAt` to `time.Now().UTC()` when zero. But if nodes are ever imported or migrated without `created_at`, they'll all be marked stale by default. Consider logging a warning here so this edge case is visible during debugging.

```suggestion
		refTime := node.CreatedAt
		if refTime.IsZero() {
			// Fallback: if no creation time, treat as stale to be safe.
			// This can happen with imported or pre-v2.3 nodes missing created_at.
			refTime = time.Now().Add(-24 * time.Hour)
		}
```

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

Comment on lines +56 to +60
```go
LastVerifiedAt *time.Time // When last freshness check ran
LastVerifiedLevel int // 0=never, 1=stat, 2=snippet, 3=full
OriginalConfidence *float64 // Confidence at ingest, before decay
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plan doc diverged from implementation

Three discrepancies between this plan and the shipped code:

  1. LastVerifiedLevel int is in the plan but not implemented in models.go or the schema migration. The plan's last_verified_level INTEGER DEFAULT 0 column was also not added.
  2. FreshnessAge string is in the plan's NodeResponse spec but the implementation uses FreshnessNote string instead.
  3. Integration function: Plan says freshness.ValidateNodeResponses(ctx, a.ctx.Repo, a.ctx.BasePath, result.Results), but implementation is annotateResultFreshness(a.ctx.BasePath, results) — a local function that doesn't take ctx or Repo, meaning it can't persist results.
  4. Decay factors: Plan shows single-file-deleted as 0.4 and all-deleted as 0.2. Code uses 0.7 for stale (not in plan) and a formula 1.0 - (missingRatio * 0.6) for partial missing (not in plan).

Consider updating this plan to reflect what was actually shipped, or marking it with a "Status: Implemented (with deviations)" header. Stale plan docs become misleading for future contributors.

Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/positioning/freshness-validation-plan.md
Line: 56-60

Comment:
**Plan doc diverged from implementation**

Three discrepancies between this plan and the shipped code:

1. **`LastVerifiedLevel int`** is in the plan but not implemented in `models.go` or the schema migration. The plan's `last_verified_level INTEGER DEFAULT 0` column was also not added.
2. **`FreshnessAge string`** is in the plan's `NodeResponse` spec but the implementation uses `FreshnessNote string` instead.
3. **Integration function**: Plan says `freshness.ValidateNodeResponses(ctx, a.ctx.Repo, a.ctx.BasePath, result.Results)`, but implementation is `annotateResultFreshness(a.ctx.BasePath, results)` — a local function that doesn't take `ctx` or `Repo`, meaning it can't persist results.
4. **Decay factors**: Plan shows single-file-deleted as `0.4` and all-deleted as `0.2`. Code uses `0.7` for stale (not in plan) and a formula `1.0 - (missingRatio * 0.6)` for partial missing (not in plan).

Consider updating this plan to reflect what was actually shipped, or marking it with a "Status: Implemented (with deviations)" header. Stale plan docs become misleading for future contributors.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

@josephgoksu josephgoksu merged commit 661d172 into main Mar 15, 2026
10 checks passed
@josephgoksu josephgoksu deleted the feat/query-time-freshness branch March 15, 2026 23:54
josephgoksu added a commit that referenced this pull request Mar 16, 2026
Security (3 medium-severity):
- Block find -exec/-execdir/-delete/-ok in ExecTool (injection risk)
- Add "--" before grep pattern to prevent flag injection
- Replace ad-hoc path validation in MCP handlers with safepath.SafeJoin
- Use safepath.SafeJoin in policy builtins resolvePath (traversal risk)
- Change crash log permissions from 0644 to 0600

Code quality:
- Add rows.Err() checks after all 8 rows.Next() loops in codeintel/repository.go
- Replace err == io.EOF with errors.Is(err, io.EOF) in 3 files
- Fix out.Close() error handling in memory_export.go (named return)

Freshness review fixes (Greptile PR #26 feedback):
- Fix decay discontinuity: unified formula (1.0 - ratio * 0.8) replaces
  hardcoded 0.2 all-missing + 0.4 formula partial-missing
- Add cache eviction (purge entries >2x TTL when cache exceeds 1000)
- Convert global cache to struct-based statCache (testable, bounded)
- Add TODO(freshness-level2) comments for dead code paths
- Mark freshness plan doc as "implemented with deviations"
- Add TestDecaySmoothCurve test verifying no discontinuities
josephgoksu added a commit that referenced this pull request Mar 18, 2026
…ence removal (#28)

* fix: security hardening, code quality, and freshness review fixes

Security (3 medium-severity):
- Block find -exec/-execdir/-delete/-ok in ExecTool (injection risk)
- Add "--" before grep pattern to prevent flag injection
- Replace ad-hoc path validation in MCP handlers with safepath.SafeJoin
- Use safepath.SafeJoin in policy builtins resolvePath (traversal risk)
- Change crash log permissions from 0644 to 0600

Code quality:
- Add rows.Err() checks after all 8 rows.Next() loops in codeintel/repository.go
- Replace err == io.EOF with errors.Is(err, io.EOF) in 3 files
- Fix out.Close() error handling in memory_export.go (named return)

Freshness review fixes (Greptile PR #26 feedback):
- Fix decay discontinuity: unified formula (1.0 - ratio * 0.8) replaces
  hardcoded 0.2 all-missing + 0.4 formula partial-missing
- Add cache eviction (purge entries >2x TTL when cache exceeds 1000)
- Convert global cache to struct-based statCache (testable, bounded)
- Add TODO(freshness-level2) comments for dead code paths
- Mark freshness plan doc as "implemented with deviations"
- Add TestDecaySmoothCurve test verifying no discontinuities

* Refactor migration tests and enhance TUI error handling

- Updated migration tests to clarify directory structure and improve comments for better understanding.
- Enhanced error handling in the TUI by treating agent completion warnings as non-fatal, allowing for smoother user experience.
- Improved list view rendering by adding workspace checks and refactoring header rendering for better clarity.
- Introduced new skills for TaskWing, including detailed SKILL.md files for various commands to enhance project knowledge and workflow.
- Added functionality to simplify code while preserving behavior, ensuring that optimization does not bypass necessary gates.
- Implemented status command to provide current task progress and acceptance criteria, improving user awareness of task status.

* refactor(plan): remove CLI commands, keep MCP-only planning

- Remove cmd/plan.go, cmd/goal.go, and plan TUI (1,982 lines deleted)
- MCP plan tool (clarify/generate/decompose/expand/finalize/audit) unchanged
- Planning prompt now scales task count dynamically to goal complexity
- Tasks include self-contained context (files, patterns, constraints, tech stack)
- Task enricher always provides baseline project context (constraints + decisions)
- Fix stop hook reliability: sync TasksCompleted from DB on each invocation
- Add session save retry logic to prevent stale state from failed writes
- Clear stale CurrentTaskID when DB lookup fails

* fix: address PR review comments (security + cache + path handling)

- policy/builtins.go: use safepath.ValidateAbsPath for absolute paths,
  fixes prefix collision bypass (e.g. /project-evil matching /project)
- freshness/freshness.go: add evictOldest fallback when all cache entries
  are fresh but cache exceeds max size during burst scenarios
- mcp/handlers.go: restore absolute path support in validateAndResolvePath
  via safepath.ValidateAbsPath, fixing regression for MCP clients
- agents/impl/analysis_deps.go: use filepath.Join instead of string concat
- bootstrap/factory.go: use safepath.SafeJoin for dependency file checks

* feat: unified context API, bootstrap UX polish, model updates, skill improvements

- Unified GetProjectContext replaces 3 separate retrieval paths (ContextRetriever, TaskEnricher, buildTaskContext)
- Bootstrap: fix duplicate "Bootstrap complete!" in multi-repo, batch link warnings into single summary, clean verification output
- Bootstrap: per-service project context in workspace mode, progress callbacks, repo count accuracy
- Models: add gpt-5.4/5.4-mini/5.4-nano, claude-opus-4-6/sonnet-4-6, gemini-3.1, remove deprecated models
- Models: newest-first ordering in selection UI
- UI: grouped knowledge output, freshness indicators, consistent header boxes across all commands
- UI: markdown rendering in ask --answer output, adaptive width
- Skills: add /taskwing:context, remove /taskwing:debug and /taskwing:simplify, fix stale file pruning
- Skills: clarify questions presented to user (not auto-answered), auto-start first task after plan creation
- Hooks: stop hook syncs TasksCompleted from DB, session save retry, verbose logs gated behind --debug
- MCP: behavioral CLAUDE.md instructions tell AI tools when to use TaskWing
- Prompts: generic and directive (no hardcoded tech stack examples), dynamic task count scaling
- Fix: read_file tool returns helpful message for directories instead of crashing ReAct agent
- Fix: auto-regenerate slash commands on MCP server start after brew upgrade
- Fix: truncateString consolidated to utils.Truncate, nil guards in context.go, brief.go pluralization
- Remove: CLI plan/goal/slash commands, docs/_partials, sync-docs scripts, stale doc references

* fix: remove stale references to removed slash commands

Update CLI output, hints, comments, and docs to reflect the 8->4 skill
restructure. Replace references to /taskwing:ask, /taskwing:status,
/taskwing:explain, and /taskwing:remember with their current equivalents
(MCP tools or /taskwing:context).

* docs: update marketing materials for post-refactor state

Replace removed taskwing goal references, rewrite CHANGELOG unreleased
section to reflect consolidated slash commands and MCP-first planning.

* fix: address remaining PR review comments

- Save task ID before clearing so debug log prints actual value
- Use safepath.SafeJoin for tw-* directory cleanup to prevent traversal
- Log wave 1 agent errors instead of silently discarding them
- Remove dead legacy marker check (duplicate of primary path)
- Fix plan_id -> clarify_session_id reference in plan skill

* fix: address second round of PR review comments

- Propagate critical error from GetProjectContext when constraint DB fetch fails
- Resolve ARCHITECTURE.md from basePath directly instead of raw .. traversal
- Add TOCTOU re-check in stat cache to prevent stale overwrites under concurrency
- Use filepath.Join instead of hardcoded "/" for monorepo subdirectory paths
- Replace legacy log.Printf with slog.Debug in doc analysis agent
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant