diff --git a/.gitignore b/.gitignore deleted file mode 100644 index 52f79ab..0000000 --- a/.gitignore +++ /dev/null @@ -1,8 +0,0 @@ -node_modules/ -browse/dist/ -.gstack/ -.claude/skills/ -/tmp/ -*.log -bun.lock -*.bun-build diff --git a/BROWSER.md b/BROWSER.md deleted file mode 100644 index 640bb65..0000000 --- a/BROWSER.md +++ /dev/null @@ -1,229 +0,0 @@ -# Browser — technical details - -This document covers the command reference and internals of gstack's headless browser. - -## Command reference - -| Category | Commands | What for | -|----------|----------|----------| -| Navigate | `goto`, `back`, `forward`, `reload`, `url` | Get to a page | -| Read | `text`, `html`, `links`, `forms`, `accessibility` | Extract content | -| Snapshot | `snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o] [-C]` | Get refs, diff, annotate | -| Interact | `click`, `fill`, `select`, `hover`, `type`, `press`, `scroll`, `wait`, `viewport`, `upload` | Use the page | -| Inspect | `js`, `eval`, `css`, `attrs`, `is`, `console`, `network`, `dialog`, `cookies`, `storage`, `perf` | Debug and verify | -| Visual | `screenshot`, `pdf`, `responsive` | See what Claude sees | -| Compare | `diff ` | Spot differences between environments | -| Dialogs | `dialog-accept [text]`, `dialog-dismiss` | Control alert/confirm/prompt handling | -| Tabs | `tabs`, `tab`, `newtab`, `closetab` | Multi-page workflows | -| Cookies | `cookie-import`, `cookie-import-browser` | Import cookies from file or real browser | -| Multi-step | `chain` (JSON from stdin) | Batch commands in one call | - -All selector arguments accept CSS selectors, `@e` refs after `snapshot`, or `@c` refs after `snapshot -C`. 50+ commands total plus cookie import. - -## How it works - -gstack's browser is a compiled CLI binary that talks to a persistent local Chromium daemon over HTTP. The CLI is a thin client — it reads a state file, sends a command, and prints the response to stdout. 
The server does the real work via [Playwright](https://playwright.dev/). - -``` -┌─────────────────────────────────────────────────────────────────┐ -│ Claude Code │ -│ │ -│ "browse goto https://staging.myapp.com" │ -│ │ │ -│ ▼ │ -│ ┌──────────┐ HTTP POST ┌──────────────┐ │ -│ │ browse │ ──────────────── │ Bun HTTP │ │ -│ │ CLI │ localhost:rand │ server │ │ -│ │ │ Bearer token │ │ │ -│ │ compiled │ ◄────────────── │ Playwright │──── Chromium │ -│ │ binary │ plain text │ API calls │ (headless) │ -│ └──────────┘ └──────────────┘ │ -│ ~1ms startup persistent daemon │ -│ auto-starts on first call │ -│ auto-stops after 30 min idle │ -└─────────────────────────────────────────────────────────────────┘ -``` - -### Lifecycle - -1. **First call**: CLI checks `.gstack/browse.json` (in the project root) for a running server. None found — it spawns `bun run browse/src/server.ts` in the background. The server launches headless Chromium via Playwright, picks a random port (10000-60000), generates a bearer token, writes the state file, and starts accepting HTTP requests. This takes ~3 seconds. - -2. **Subsequent calls**: CLI reads the state file, sends an HTTP POST with the bearer token, prints the response. ~100-200ms round trip. - -3. **Idle shutdown**: After 30 minutes with no commands, the server shuts down and cleans up the state file. Next call restarts it automatically. - -4. **Crash recovery**: If Chromium crashes, the server exits immediately (no self-healing — don't hide failure). The CLI detects the dead server on the next call and starts a fresh one. 
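The first two lifecycle steps — read the state file, send an authenticated POST — can be sketched in a few lines of TypeScript. This is an illustrative sketch only: the `{ port, token }` state shape and the `/command` route are assumptions, not the actual `.gstack/browse.json` schema.

```typescript
// Sketch of the thin-client handshake. The state-file shape and the
// /command route are illustrative assumptions, not the real schema.
interface BrowseState {
  port: number;  // random port the daemon chose (10000-60000)
  token: string; // per-session bearer token
}

// Build the authenticated request the CLI would send for one command.
function buildCommandRequest(state: BrowseState, argv: string[]): Request {
  return new Request(`http://127.0.0.1:${state.port}/command`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${state.token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ argv }),
  });
}

const req = buildCommandRequest(
  { port: 12345, token: "abc" },
  ["goto", "https://staging.myapp.com"],
);
console.log(req.headers.get("authorization")); // Bearer abc
```

A real client would `await fetch(req)`, print the plain-text body to stdout, and spawn the server first if the POST fails — matching the auto-start and crash-recovery behavior above.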
- -### Key components - -``` -browse/ -├── src/ -│ ├── cli.ts # Thin client — reads state file, sends HTTP, prints response -│ ├── server.ts # Bun.serve HTTP server — routes commands to Playwright -│ ├── browser-manager.ts # Chromium lifecycle — launch, tabs, ref map, crash handling -│ ├── snapshot.ts # Accessibility tree → @ref assignment → Locator map + diff/annotate/-C -│ ├── read-commands.ts # Non-mutating commands (text, html, links, js, css, is, dialog, etc.) -│ ├── write-commands.ts # Mutating commands (click, fill, select, upload, dialog-accept, etc.) -│ ├── meta-commands.ts # Server management, chain, diff, snapshot routing -│ ├── cookie-import-browser.ts # Decrypt + import cookies from real Chromium browsers -│ ├── cookie-picker-routes.ts # HTTP routes for interactive cookie picker UI -│ ├── cookie-picker-ui.ts # Self-contained HTML/CSS/JS for cookie picker -│ └── buffers.ts # CircularBuffer + console/network/dialog capture -├── test/ # Integration tests + HTML fixtures -└── dist/ - └── browse # Compiled binary (~58MB, Bun --compile) -``` - -### The snapshot system - -The browser's key innovation is ref-based element selection, built on Playwright's accessibility tree API: - -1. `page.locator(scope).ariaSnapshot()` returns a YAML-like accessibility tree -2. The snapshot parser assigns refs (`@e1`, `@e2`, ...) to each element -3. For each ref, it builds a Playwright `Locator` (using `getByRole` + nth-child) -4. The ref-to-Locator map is stored on `BrowserManager` -5. Later commands like `click @e3` look up the Locator and call `locator.click()` - -No DOM mutation. No injected scripts. Just Playwright's native accessibility API. - -**Extended snapshot features:** -- `--diff` (`-D`): Stores each snapshot as a baseline. On the next `-D` call, returns a unified diff showing what changed. Use this to verify that an action (click, fill, etc.) actually worked. 
- `--annotate` (`-a`): Injects temporary overlay divs at each ref's bounding box, takes a screenshot with ref labels visible, then removes the overlays. Use `-o <path>` to control the output path.
-- `--cursor-interactive` (`-C`): Scans for non-ARIA interactive elements (divs with `cursor:pointer`, `onclick`, `tabindex>=0`) using `page.evaluate`. Assigns `@c1`, `@c2`... refs with deterministic `nth-child` CSS selectors. These are elements the ARIA tree misses but users can still click.
-
-### Authentication
-
-Each server session generates a random UUID as a bearer token. The token is written to the state file (`.gstack/browse.json`) with chmod 600. Every HTTP request must include `Authorization: Bearer <token>`. This prevents other processes on the machine from controlling the browser.
-
-### Console, network, and dialog capture
-
-The server hooks into Playwright's `page.on('console')`, `page.on('response')`, and `page.on('dialog')` events. All entries are kept in O(1) circular buffers (50,000 capacity each) and flushed to disk asynchronously via `Bun.write()`:
-
-- Console: `.gstack/browse-console.log`
-- Network: `.gstack/browse-network.log`
-- Dialog: `.gstack/browse-dialog.log`
-
-The `console`, `network`, and `dialog` commands read from the in-memory buffers, not disk.
-
-### Dialog handling
-
-Dialogs (alert, confirm, prompt) are auto-accepted by default to prevent browser lockup. The `dialog-accept` and `dialog-dismiss` commands control this behavior. For prompts, `dialog-accept <text>` provides the response text. All dialogs are logged to the dialog buffer with type, message, and action taken.
-
-### Multi-workspace support
-
-Each workspace gets its own isolated browser instance with its own Chromium process, tabs, cookies, and logs. State is stored in `.gstack/` inside the project root (detected via `git rev-parse --show-toplevel`).
- -| Workspace | State file | Port | -|-----------|------------|------| -| `/code/project-a` | `/code/project-a/.gstack/browse.json` | random (10000-60000) | -| `/code/project-b` | `/code/project-b/.gstack/browse.json` | random (10000-60000) | - -No port collisions. No shared state. Each project is fully isolated. - -### Environment variables - -| Variable | Default | Description | -|----------|---------|-------------| -| `BROWSE_PORT` | 0 (random 10000-60000) | Fixed port for the HTTP server (debug override) | -| `BROWSE_IDLE_TIMEOUT` | 1800000 (30 min) | Idle shutdown timeout in ms | -| `BROWSE_STATE_FILE` | `.gstack/browse.json` | Path to state file (CLI passes to server) | -| `BROWSE_SERVER_SCRIPT` | auto-detected | Path to server.ts | - -### Performance - -| Tool | First call | Subsequent calls | Context overhead per call | -|------|-----------|-----------------|--------------------------| -| Chrome MCP | ~5s | ~2-5s | ~2000 tokens (schema + protocol) | -| Playwright MCP | ~3s | ~1-3s | ~1500 tokens (schema + protocol) | -| **gstack browse** | **~3s** | **~100-200ms** | **0 tokens** (plain text stdout) | - -The context overhead difference compounds fast. In a 20-command browser session, MCP tools burn 30,000-40,000 tokens on protocol framing alone. gstack burns zero. - -### Why CLI over MCP? - -MCP (Model Context Protocol) works well for remote services, but for local browser automation it adds pure overhead: - -- **Context bloat**: every MCP call includes full JSON schemas and protocol framing. A simple "get the page text" costs 10x more context tokens than it should. -- **Connection fragility**: persistent WebSocket/stdio connections drop and fail to reconnect. -- **Unnecessary abstraction**: Claude Code already has a Bash tool. A CLI that prints to stdout is the simplest possible interface. - -gstack skips all of this. Compiled binary. Plain text in, plain text out. No protocol. No schema. No connection management. 
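The console/network/dialog capture described above sits on an O(1) ring buffer. A minimal sketch of that structure — the class and method names are illustrative, and the real `browse/src/buffers.ts` API may differ:

```typescript
// Minimal O(1) ring buffer sketch, like the one described for
// console/network/dialog capture. Names are illustrative.
class CircularBuffer<T> {
  private items: (T | undefined)[];
  private head = 0; // next write position
  private size = 0;

  constructor(private capacity: number) {
    this.items = new Array(capacity);
  }

  // Constant-time append; overwrites the oldest entry once full.
  push(item: T): void {
    this.items[this.head] = item;
    this.head = (this.head + 1) % this.capacity;
    if (this.size < this.capacity) this.size++;
  }

  // Oldest-to-newest view of the buffered entries.
  toArray(): T[] {
    const start = this.size === this.capacity ? this.head : 0;
    const out: T[] = [];
    for (let i = 0; i < this.size; i++) {
      out.push(this.items[(start + i) % this.capacity] as T);
    }
    return out;
  }
}

const buf = new CircularBuffer<number>(3);
[1, 2, 3, 4].forEach((n) => buf.push(n));
console.log(buf.toArray()); // [2, 3, 4]
```

Overwriting the oldest slot keeps `push` constant-time at the 50,000-entry cap, so a chatty page can't grow memory without bound.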
- -## Acknowledgments - -The browser automation layer is built on [Playwright](https://playwright.dev/) by Microsoft. Playwright's accessibility tree API, locator system, and headless Chromium management are what make ref-based interaction possible. The snapshot system — assigning `@ref` labels to accessibility tree nodes and mapping them back to Playwright Locators — is built entirely on top of Playwright's primitives. Thank you to the Playwright team for building such a solid foundation. - -## Development - -### Prerequisites - -- [Bun](https://bun.sh/) v1.0+ -- Playwright's Chromium (installed automatically by `bun install`) - -### Quick start - -```bash -bun install # install dependencies + Playwright Chromium -bun test # run integration tests (~3s) -bun run dev # run CLI from source (no compile) -bun run build # compile to browse/dist/browse -``` - -### Dev mode vs compiled binary - -During development, use `bun run dev` instead of the compiled binary. It runs `browse/src/cli.ts` directly with Bun, so you get instant feedback without a compile step: - -```bash -bun run dev goto https://example.com -bun run dev text -bun run dev snapshot -i -bun run dev click @e3 -``` - -The compiled binary (`bun run build`) is only needed for distribution. It produces a single ~58MB executable at `browse/dist/browse` using Bun's `--compile` flag. - -### Running tests - -```bash -bun test # run all tests -bun test browse/test/commands # run command integration tests only -bun test browse/test/snapshot # run snapshot tests only -bun test browse/test/cookie-import-browser # run cookie import unit tests only -``` - -Tests spin up a local HTTP server (`browse/test/test-server.ts`) serving HTML fixtures from `browse/test/fixtures/`, then exercise the CLI commands against those pages. 203 tests across 3 files, ~15 seconds total. - -### Source map - -| File | Role | -|------|------| -| `browse/src/cli.ts` | Entry point. 
Reads `.gstack/browse.json`, sends HTTP to the server, prints response. | -| `browse/src/server.ts` | Bun HTTP server. Routes commands to the right handler. Manages idle timeout. | -| `browse/src/browser-manager.ts` | Chromium lifecycle — launch, tab management, ref map, crash detection. | -| `browse/src/snapshot.ts` | Parses accessibility tree, assigns `@e`/`@c` refs, builds Locator map. Handles `--diff`, `--annotate`, `-C`. | -| `browse/src/read-commands.ts` | Non-mutating commands: `text`, `html`, `links`, `js`, `css`, `is`, `dialog`, `forms`, etc. Exports `getCleanText()`. | -| `browse/src/write-commands.ts` | Mutating commands: `goto`, `click`, `fill`, `upload`, `dialog-accept`, `useragent` (with context recreation), etc. | -| `browse/src/meta-commands.ts` | Server management, chain routing, diff (DRY via `getCleanText`), snapshot delegation. | -| `browse/src/cookie-import-browser.ts` | Decrypt Chromium cookies via macOS Keychain + PBKDF2/AES-128-CBC. Auto-detects installed browsers. | -| `browse/src/cookie-picker-routes.ts` | HTTP routes for `/cookie-picker/*` — browser list, domain search, import, remove. | -| `browse/src/cookie-picker-ui.ts` | Self-contained HTML generator for the interactive cookie picker (dark theme, no frameworks). | -| `browse/src/buffers.ts` | `CircularBuffer` (O(1) ring buffer) + console/network/dialog capture with async disk flush. | - -### Deploying to the active skill - -The active skill lives at `~/.claude/skills/gstack/`. After making changes: - -1. Push your branch -2. Pull in the skill directory: `cd ~/.claude/skills/gstack && git pull` -3. Rebuild: `cd ~/.claude/skills/gstack && bun run build` - -Or copy the binary directly: `cp browse/dist/browse ~/.claude/skills/gstack/browse/dist/browse` - -### Adding a new command - -1. Add the handler in `read-commands.ts` (non-mutating) or `write-commands.ts` (mutating) -2. Register the route in `server.ts` -3. 
Add a test case in `browse/test/commands.test.ts` with an HTML fixture if needed -4. Run `bun test` to verify -5. Run `bun run build` to compile diff --git a/CHANGELOG.md b/CHANGELOG.md index d1c5c97..91a2e43 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,110 +1 @@ -# Changelog - -## 0.3.2 — 2026-03-13 - -### Fixed -- Cookie import picker now returns JSON instead of HTML — `jsonResponse()` referenced `url` out of scope, crashing every API call -- `help` command routed correctly (was unreachable due to META_COMMANDS dispatch ordering) -- Stale servers from global install no longer shadow local changes — removed legacy `~/.claude/skills/gstack` fallback from `resolveServerScript()` -- Crash log path references updated from `/tmp/` to `.gstack/` - -### Added -- **Diff-aware QA mode** — `/qa` on a feature branch auto-analyzes `git diff`, identifies affected pages/routes, detects the running app on localhost, and tests only what changed. No URL needed. -- **Project-local browse state** — state file, logs, and all server state now live in `.gstack/` inside the project root (detected via `git rev-parse --show-toplevel`). No more `/tmp` state files. -- **Shared config module** (`browse/src/config.ts`) — centralizes path resolution for CLI and server, eliminates duplicated port/state logic -- **Random port selection** — server picks a random port 10000-60000 instead of scanning 9400-9409. No more CONDUCTOR_PORT magic offset. No more port collisions across workspaces. 
-- **Binary version tracking** — state file includes `binaryVersion` SHA; CLI auto-restarts the server when the binary is rebuilt -- **Legacy /tmp cleanup** — CLI scans for and removes old `/tmp/browse-server*.json` files, verifying PID ownership before sending signals -- **Greptile integration** — `/review` and `/ship` fetch and triage Greptile bot comments; `/retro` tracks Greptile batting average across weeks -- **Local dev mode** — `bin/dev-setup` symlinks skills from the repo for in-place development; `bin/dev-teardown` restores global install -- `help` command — agents can self-discover all commands and snapshot flags -- Version-aware `find-browse` with META signal protocol — detects stale binaries and prompts agents to update -- `browse/dist/find-browse` compiled binary with git SHA comparison against origin/main (4hr cached) -- `.version` file written at build time for binary version tracking -- Route-level tests for cookie picker (13 tests) and find-browse version check (10 tests) -- Config resolution tests (14 tests) covering git root detection, BROWSE_STATE_FILE override, ensureStateDir, readVersionHash, resolveServerScript, and version mismatch detection -- Browser interaction guidance in CLAUDE.md — prevents Claude from using mcp\_\_claude-in-chrome\_\_\* tools -- CONTRIBUTING.md with quick start, dev mode explanation, and instructions for testing branches in other repos - -### Changed -- State file location: `.gstack/browse.json` (was `/tmp/browse-server.json`) -- Log files location: `.gstack/browse-{console,network,dialog}.log` (was `/tmp/browse-*.log`) -- Atomic state file writes: `.json.tmp` → rename (prevents partial reads) -- CLI passes `BROWSE_STATE_FILE` to spawned server (server derives all paths from it) -- SKILL.md setup checks parse META signals and handle `META:UPDATE_AVAILABLE` -- `/qa` SKILL.md now describes four modes (diff-aware, full, quick, regression) with diff-aware as the default on feature branches -- 
`jsonResponse`/`errorResponse` use options objects to prevent positional parameter confusion -- Build script compiles both `browse` and `find-browse` binaries, cleans up `.bun-build` temp files -- README updated with Greptile setup instructions, diff-aware QA examples, and revised demo transcript - -### Removed -- `CONDUCTOR_PORT` magic offset (`browse_port = CONDUCTOR_PORT - 45600`) -- Port scan range 9400-9409 -- Legacy fallback to `~/.claude/skills/gstack/browse/src/server.ts` -- `DEVELOPING_GSTACK.md` (renamed to CONTRIBUTING.md) - -## 0.3.1 — 2026-03-12 - -### Phase 3.5: Browser cookie import - -- `cookie-import-browser` command — decrypt and import cookies from real Chromium browsers (Comet, Chrome, Arc, Brave, Edge) -- Interactive cookie picker web UI served from the browse server (dark theme, two-panel layout, domain search, import/remove) -- Direct CLI import with `--domain` flag for non-interactive use -- `/setup-browser-cookies` skill for Claude Code integration -- macOS Keychain access with async 10s timeout (no event loop blocking) -- Per-browser AES key caching (one Keychain prompt per browser per session) -- DB lock fallback: copies locked cookie DB to /tmp for safe reads -- 18 unit tests with encrypted cookie fixtures - -## 0.3.0 — 2026-03-12 - -### Phase 3: /qa skill — systematic QA testing - -- New `/qa` skill with 6-phase workflow (Initialize, Authenticate, Orient, Explore, Document, Wrap up) -- Three modes: full (systematic, 5-10 issues), quick (30-second smoke test), regression (compare against baseline) -- Issue taxonomy: 7 categories, 4 severity levels, per-page exploration checklist -- Structured report template with health score (0-100, weighted across 7 categories) -- Framework detection guidance for Next.js, Rails, WordPress, and SPAs -- `browse/bin/find-browse` — DRY binary discovery using `git rev-parse --show-toplevel` - -### Phase 2: Enhanced browser - -- Dialog handling: auto-accept/dismiss, dialog buffer, prompt text support -- File 
upload: `upload <selector> <file1> [file2...]`
-- Element state checks: `is visible|hidden|enabled|disabled|checked|editable|focused <selector>`
-- Annotated screenshots with ref labels overlaid (`snapshot -a`)
-- Snapshot diffing against previous snapshot (`snapshot -D`)
-- Cursor-interactive element scan for non-ARIA clickables (`snapshot -C`)
-- `wait --networkidle` / `--load` / `--domcontentloaded` flags
-- `console --errors` filter (error + warning only)
-- `cookie-import <file>` with auto-fill domain from page URL
-- CircularBuffer O(1) ring buffer for console/network/dialog buffers
-- Async buffer flush with Bun.write()
-- Health check with page.evaluate + 2s timeout
-- Playwright error wrapping — actionable messages for AI agents
-- Context recreation preserves cookies/storage/URLs (useragent fix)
-- SKILL.md rewritten as QA-oriented playbook with 10 workflow patterns
-- 166 integration tests (was ~63)
-
-## 0.0.2 — 2026-03-12
-
-- Fix project-local `/browse` installs — compiled binary now resolves `server.ts` from its own directory instead of assuming a global install exists
-- `setup` rebuilds stale binaries (not just missing ones) and exits non-zero if the build fails
-- Fix `chain` command swallowing real errors from write commands (e.g. navigation timeout reported as "Unknown meta command")
-- Fix unbounded restart loop in CLI when server crashes repeatedly on the same command
-- Cap console/network buffers at 50k entries (ring buffer) instead of growing without bound
-- Fix disk flush stopping silently after buffer hits the 50k cap
-- Fix `ln -snf` in setup to avoid creating nested symlinks on upgrade
-- Use `git fetch && git reset --hard` instead of `git pull` for upgrades (handles force-pushes)
-- Simplify install: global-first with optional project copy (replaces submodule approach)
-- Restructured README: hero, before/after, demo transcript, troubleshooting section
-- Six skills (added `/retro`)
-
-## 0.0.1 — 2026-03-11
-
-Initial release.
- -- Five skills: `/plan-ceo-review`, `/plan-eng-review`, `/review`, `/ship`, `/browse` -- Headless browser CLI with 40+ commands, ref-based interaction, persistent Chromium daemon -- One-command install as Claude Code skills (submodule or global clone) -- `setup` script for binary compilation and skill symlinking +2026-03-13 fixed it! --dash diff --git a/CLAUDE.md b/CLAUDE.md deleted file mode 100644 index 917afed..0000000 --- a/CLAUDE.md +++ /dev/null @@ -1,45 +0,0 @@ -# gstack development - -## Commands - -```bash -bun install # install dependencies -bun test # run integration tests (browse + snapshot) -bun run dev # run CLI in dev mode, e.g. bun run dev goto https://example.com -bun run build # compile binary to browse/dist/browse -``` - -## Project structure - -``` -gstack/ -├── browse/ # Headless browser CLI (Playwright) -│ ├── src/ # CLI + server + commands -│ ├── test/ # Integration tests + fixtures -│ └── dist/ # Compiled binary -├── ship/ # Ship workflow skill -├── review/ # PR review skill -├── plan-ceo-review/ # /plan-ceo-review skill -├── plan-eng-review/ # /plan-eng-review skill -├── retro/ # Retrospective skill -├── setup # One-time setup: build binary + symlink skills -├── SKILL.md # Browse skill (Claude discovers this) -└── package.json # Build scripts for browse -``` - -## Browser interaction - -When you need to interact with a browser (QA, dogfooding, cookie setup), use the -`/browse` skill or run the browse binary directly via `$B `. NEVER use -`mcp__claude-in-chrome__*` tools — they are slow, unreliable, and not what this -project uses. - -## Deploying to the active skill - -The active skill lives at `~/.claude/skills/gstack/`. After making changes: - -1. Push your branch -2. Fetch and reset in the skill directory: `cd ~/.claude/skills/gstack && git fetch origin && git reset --hard origin/main` -3. 
Rebuild: `cd ~/.claude/skills/gstack && bun run build` - -Or copy the binary directly: `cp browse/dist/browse ~/.claude/skills/gstack/browse/dist/browse` diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md deleted file mode 100644 index d696904..0000000 --- a/CONTRIBUTING.md +++ /dev/null @@ -1,153 +0,0 @@ -# Contributing to gstack - -Thanks for wanting to make gstack better. Whether you're fixing a typo in a skill prompt or building an entirely new workflow, this guide will get you up and running fast. - -## Quick start - -gstack skills are Markdown files that Claude Code discovers from a `skills/` directory. Normally they live at `~/.claude/skills/gstack/` (your global install). But when you're developing gstack itself, you want Claude Code to use the skills *in your working tree* — so edits take effect instantly without copying or deploying anything. - -That's what dev mode does. It symlinks your repo into the local `.claude/skills/` directory so Claude Code reads skills straight from your checkout. - -```bash -git clone && cd gstack -bun install # install dependencies -bin/dev-setup # activate dev mode -``` - -Now edit any `SKILL.md`, invoke it in Claude Code (e.g. `/review`), and see your changes live. When you're done developing: - -```bash -bin/dev-teardown # deactivate — back to your global install -``` - -## How dev mode works - -`bin/dev-setup` creates a `.claude/skills/` directory inside the repo (gitignored) and fills it with symlinks pointing back to your working tree. Claude Code sees the local `skills/` first, so your edits win over the global install. - -``` -gstack/ <- your working tree -├── .claude/skills/ <- created by dev-setup (gitignored) -│ ├── gstack -> ../../ <- symlink back to repo root -│ ├── review -> gstack/review -│ ├── ship -> gstack/ship -│ └── ... 
<- one symlink per skill -├── review/ -│ └── SKILL.md <- edit this, test with /review -├── ship/ -│ └── SKILL.md -├── browse/ -│ ├── src/ <- TypeScript source -│ └── dist/ <- compiled binary (gitignored) -└── ... -``` - -## Day-to-day workflow - -```bash -# 1. Enter dev mode -bin/dev-setup - -# 2. Edit a skill -vim review/SKILL.md - -# 3. Test it in Claude Code — changes are live -# > /review - -# 4. Editing browse source? Rebuild the binary -bun run build - -# 5. Done for the day? Tear down -bin/dev-teardown -``` - -## Running tests - -```bash -bun test # all tests (browse integration + snapshot) -bun run dev # run CLI in dev mode, e.g. bun run dev goto https://example.com -bun run build # compile binary to browse/dist/browse -``` - -Tests run against the browse binary directly — they don't require dev mode. - -## Things to know - -- **SKILL.md changes are instant.** They're just Markdown. Edit, save, invoke. -- **Browse source changes need a rebuild.** If you touch `browse/src/*.ts`, run `bun run build`. -- **Dev mode shadows your global install.** Project-local skills take priority over `~/.claude/skills/gstack`. `bin/dev-teardown` restores the global one. -- **Conductor workspaces are independent.** Each workspace is its own clone. Run `bin/dev-setup` in the one you're working in. -- **`.claude/skills/` is gitignored.** The symlinks never get committed. - -## Testing a branch in another repo - -When you're developing gstack in one workspace and want to test your branch in a -different project (e.g. testing browse changes against your real app), there are -two cases depending on how gstack is installed in that project. - -### Global install only (no `.claude/skills/gstack/` in the project) - -Point your global install at the branch: - -```bash -cd ~/.claude/skills/gstack -git fetch origin -git checkout origin/ # e.g. 
origin/v0.3.2
-bun install # in case deps changed
-bun run build # rebuild the binary
-```
-
-Now open Claude Code in the other project — it picks up skills from
-`~/.claude/skills/` automatically. To go back to main when you're done:
-
-```bash
-cd ~/.claude/skills/gstack
-git checkout main && git pull
-bun run build
-```
-
-### Vendored project copy (`.claude/skills/gstack/` checked into the project)
-
-Some projects vendor gstack by copying it into the repo (no `.git` inside the
-copy). Project-local skills take priority over global, so you need to update
-the vendored copy too. This is a three-step process:
-
-1. **Update your global install to the branch** (so you have the source):
-   ```bash
-   cd ~/.claude/skills/gstack
-   git fetch origin
-   git checkout origin/<branch> # e.g. origin/v0.3.2
-   bun install && bun run build
-   ```
-
-2. **Replace the vendored copy** in the other project:
-   ```bash
-   cd /path/to/other-project
-
-   # Remove old skill symlinks and vendored copy
-   for s in browse plan-ceo-review plan-eng-review review ship retro qa setup-browser-cookies; do
-     rm -f .claude/skills/$s
-   done
-   rm -rf .claude/skills/gstack
-
-   # Copy from global install (strips .git so it stays vendored)
-   cp -Rf ~/.claude/skills/gstack .claude/skills/gstack
-   rm -rf .claude/skills/gstack/.git
-
-   # Rebuild binary and re-create skill symlinks
-   cd .claude/skills/gstack && ./setup
-   ```
-
-3. **Test your changes** — open Claude Code in that project and use the skills.
-
-To revert to main later, repeat steps 1-2 with `git checkout main && git pull`
-instead of `git checkout origin/<branch>`.
-
-## Shipping your changes
-
-When you're happy with your skill edits:
-
-```bash
-/ship
-```
-
-This runs tests, reviews the diff, bumps the version, and opens a PR. See `ship/SKILL.md` for the full workflow.
diff --git a/LICENSE b/LICENSE deleted file mode 100644 index 3502951..0000000 --- a/LICENSE +++ /dev/null @@ -1,21 +0,0 @@ -MIT License - -Copyright (c) 2026 Garry Tan - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. - -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. diff --git a/SKILL.md b/SKILL.md deleted file mode 100644 index e561e2c..0000000 --- a/SKILL.md +++ /dev/null @@ -1,350 +0,0 @@ ---- -name: gstack -version: 1.1.0 -description: | - Fast headless browser for QA testing and site dogfooding. Navigate any URL, interact with - elements, verify page state, diff before/after actions, take annotated screenshots, check - responsive layouts, test forms and uploads, handle dialogs, and assert element states. - ~100ms per command. Use when you need to test a feature, verify a deployment, dogfood a - user flow, or file a bug with evidence. -allowed-tools: - - Bash - - Read - ---- - -# gstack browse: QA Testing & Dogfooding - -Persistent headless Chromium. First call auto-starts (~3s), then ~100-200ms per command. 
-Auto-shuts down after 30 min idle. State persists between calls (cookies, tabs, sessions). - -## SETUP (run this check BEFORE any browse command) - -```bash -BROWSE_OUTPUT=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null) -B=$(echo "$BROWSE_OUTPUT" | head -1) -META=$(echo "$BROWSE_OUTPUT" | grep "^META:" || true) -if [ -n "$B" ]; then - echo "READY: $B" - [ -n "$META" ] && echo "$META" -else - echo "NEEDS_SETUP" -fi -``` - -If `NEEDS_SETUP`: -1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait. -2. Run: `cd && ./setup` -3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash` - -If you see `META:UPDATE_AVAILABLE`: -1. Parse the JSON payload to get `current`, `latest`, and `command`. -2. Tell the user: "A gstack update is available (current: X, latest: Y). OK to update?" -3. **STOP and wait for approval.** -4. Run the command from the META payload. -5. Re-run the setup check above to get the updated binary path. - -## IMPORTANT - -- Use the compiled binary via Bash: `$B ` -- NEVER use `mcp__claude-in-chrome__*` tools. They are slow and unreliable. -- Browser persists between calls — cookies, login sessions, and tabs carry over. -- Dialogs (alert/confirm/prompt) are auto-accepted by default — no browser lockup. - -## QA Workflows - -### Test a user flow (login, signup, checkout, etc.) - -```bash -B=~/.claude/skills/gstack/browse/dist/browse - -# 1. Go to the page -$B goto https://app.example.com/login - -# 2. See what's interactive -$B snapshot -i - -# 3. Fill the form using refs -$B fill @e3 "test@example.com" -$B fill @e4 "password123" -$B click @e5 - -# 4. 
Verify it worked -$B snapshot -D # diff shows what changed after clicking -$B is visible ".dashboard" # assert the dashboard appeared -$B screenshot /tmp/after-login.png -``` - -### Verify a deployment / check prod - -```bash -$B goto https://yourapp.com -$B text # read the page — does it load? -$B console # any JS errors? -$B network # any failed requests? -$B js "document.title" # correct title? -$B is visible ".hero-section" # key elements present? -$B screenshot /tmp/prod-check.png -``` - -### Dogfood a feature end-to-end - -```bash -# Navigate to the feature -$B goto https://app.example.com/new-feature - -# Take annotated screenshot — shows every interactive element with labels -$B snapshot -i -a -o /tmp/feature-annotated.png - -# Find ALL clickable things (including divs with cursor:pointer) -$B snapshot -C - -# Walk through the flow -$B snapshot -i # baseline -$B click @e3 # interact -$B snapshot -D # what changed? (unified diff) - -# Check element states -$B is visible ".success-toast" -$B is enabled "#next-step-btn" -$B is checked "#agree-checkbox" - -# Check console for errors after interactions -$B console -``` - -### Test responsive layouts - -```bash -# Quick: 3 screenshots at mobile/tablet/desktop -$B goto https://yourapp.com -$B responsive /tmp/layout - -# Manual: specific viewport -$B viewport 375x812 # iPhone -$B screenshot /tmp/mobile.png -$B viewport 1440x900 # Desktop -$B screenshot /tmp/desktop.png -``` - -### Test file upload - -```bash -$B goto https://app.example.com/upload -$B snapshot -i -$B upload @e3 /path/to/test-file.pdf -$B is visible ".upload-success" -$B screenshot /tmp/upload-result.png -``` - -### Test forms with validation - -```bash -$B goto https://app.example.com/form -$B snapshot -i - -# Submit empty — check validation errors appear -$B click @e10 # submit button -$B snapshot -D # diff shows error messages appeared -$B is visible ".error-message" - -# Fill and resubmit -$B fill @e3 "valid input" -$B click @e10 -$B snapshot -D 
# diff shows errors gone, success state -``` - -### Test dialogs (delete confirmations, prompts) - -```bash -# Set up dialog handling BEFORE triggering -$B dialog-accept # will auto-accept next alert/confirm -$B click "#delete-button" # triggers confirmation dialog -$B dialog # see what dialog appeared -$B snapshot -D # verify the item was deleted - -# For prompts that need input -$B dialog-accept "my answer" # accept with text -$B click "#rename-button" # triggers prompt -``` - -### Test authenticated pages (import real browser cookies) - -```bash -# Import cookies from your real browser (opens interactive picker) -$B cookie-import-browser - -# Or import a specific domain directly -$B cookie-import-browser comet --domain .github.com - -# Now test authenticated pages -$B goto https://github.com/settings/profile -$B snapshot -i -$B screenshot /tmp/github-profile.png -``` - -### Compare two pages / environments - -```bash -$B diff https://staging.app.com https://prod.app.com -``` - -### Multi-step chain (efficient for long flows) - -```bash -echo '[ - ["goto","https://app.example.com"], - ["snapshot","-i"], - ["fill","@e3","test@test.com"], - ["fill","@e4","password"], - ["click","@e5"], - ["snapshot","-D"], - ["screenshot","/tmp/result.png"] -]' | $B chain -``` - -## Quick Assertion Patterns - -```bash -# Element exists and is visible -$B is visible ".modal" - -# Button is enabled/disabled -$B is enabled "#submit-btn" -$B is disabled "#submit-btn" - -# Checkbox state -$B is checked "#agree" - -# Input is editable -$B is editable "#name-field" - -# Element has focus -$B is focused "#search-input" - -# Page contains text -$B js "document.body.textContent.includes('Success')" - -# Element count -$B js "document.querySelectorAll('.list-item').length" - -# Specific attribute value -$B attrs "#logo" # returns all attributes as JSON - -# CSS property -$B css ".button" "background-color" -``` - -## Snapshot System - -The snapshot is your primary tool for understanding and 
interacting with pages. - -```bash -$B snapshot -i # Interactive elements only (buttons, links, inputs) with @e refs -$B snapshot -c # Compact (no empty structural elements) -$B snapshot -d 3 # Limit depth to 3 levels -$B snapshot -s "main" # Scope to CSS selector -$B snapshot -D # Diff against previous snapshot (what changed?) -$B snapshot -a # Annotated screenshot with ref labels -$B snapshot -o /tmp/x.png # Output path for annotated screenshot -$B snapshot -C # Cursor-interactive elements (@c refs — divs with pointer, onclick) -``` - -Combine flags: `$B snapshot -i -a -C -o /tmp/annotated.png` - -After snapshot, use @refs everywhere: -```bash -$B click @e3 -$B fill @e4 "value" -$B hover @e1 -$B html @e2 -$B css @e5 "color" -$B attrs @e6 -$B click @c1 # cursor-interactive ref (from -C) -``` - -Refs are invalidated on navigation — run `snapshot` again after `goto`. - -## Command Reference - -### Navigation -| Command | Description | -|---------|-------------| -| `goto <url>` | Navigate to URL | -| `back` / `forward` | History navigation | -| `reload` | Reload page | -| `url` | Print current URL | - -### Reading -| Command | Description | -|---------|-------------| -| `text` | Cleaned page text | -| `html [selector]` | innerHTML | -| `links` | All links as "text -> href" | -| `forms` | Forms + fields as JSON | -| `accessibility` | Full ARIA tree | - -### Interaction -| Command | Description | -|---------|-------------| -| `click <sel>` | Click element | -| `fill <sel> <value>` | Fill input | -| `select <sel> <value>` | Select dropdown | -| `hover <sel>` | Hover element | -| `type <text>` | Type into focused element | -| `press <key>` | Press key (Enter, Tab, Escape) | -| `scroll [sel]` | Scroll element into view | -| `wait <sel>` | Wait for element (max 10s) | -| `wait --networkidle` | Wait for network to be idle | -| `wait --load` | Wait for page load event | -| `upload <sel> <file...>` | Upload file(s) | -| `cookie-import <file>` | Import cookies from JSON file | -| `cookie-import-browser [browser] [--domain <domain>]` | Import cookies from real
browser (opens picker UI, or direct import with --domain) | -| `dialog-accept [text]` | Auto-accept dialogs | -| `dialog-dismiss` | Auto-dismiss dialogs | -| `viewport <WxH>` | Set viewport size | - -### Inspection -| Command | Description | -|---------|-------------| -| `js <code>` | Run JavaScript | -| `eval <file>` | Run JS file | -| `css <sel> <prop>` | Computed CSS | -| `attrs <sel>` | Element attributes | -| `is <state> <sel>` | State check (visible/hidden/enabled/disabled/checked/editable/focused) | -| `console [--clear\|--errors]` | Console messages (--errors filters to error/warning) | -| `network [--clear]` | Network requests | -| `dialog [--clear]` | Dialog messages | -| `cookies` | All cookies | -| `storage` | localStorage + sessionStorage | -| `perf` | Page load timings | - -### Visual -| Command | Description | -|---------|-------------| -| `screenshot [path]` | Screenshot | -| `pdf [path]` | Save as PDF | -| `responsive [prefix]` | Mobile/tablet/desktop screenshots | -| `diff <url1> <url2>` | Text diff between pages | - -### Tabs -| Command | Description | -|---------|-------------| -| `tabs` | List tabs | -| `tab <id>` | Switch tab | -| `newtab [url]` | Open tab | -| `closetab [id]` | Close tab | - -### Server -| Command | Description | -|---------|-------------| -| `status` | Health check | -| `stop` | Shutdown | -| `restart` | Restart | - -## Tips - -1. **Navigate once, query many times.** `goto` loads the page; then `text`, `js`, `screenshot` all hit the loaded page instantly. -2. **Use `snapshot -i` first.** See all interactive elements, then click/fill by ref. No CSS selector guessing. -3. **Use `snapshot -D` to verify.** Baseline → action → diff. See exactly what changed. -4. **Use `is` for assertions.** `is visible .modal` is faster and more reliable than parsing page text. -5. **Use `snapshot -a` for evidence.** Annotated screenshots are great for bug reports. -6. **Use `snapshot -C` for tricky UIs.** Finds clickable divs that the accessibility tree misses. -7.
**Check `console` after actions.** Catch JS errors that don't surface visually. -8. **Use `chain` for long flows.** Single command, no per-step CLI overhead. diff --git a/TODO.md b/TODO.md index 0148e70..375f4b9 100644 --- a/TODO.md +++ b/TODO.md @@ -1,117 +1 @@ -# TODO — gstack roadmap - -## Phase 1: Foundations (v0.2.0) - - [x] Rename to gstack - - [x] Restructure to monorepo layout - - [x] Setup script for skill symlinks - - [x] Snapshot command with ref-based element selection - - [x] Snapshot tests - -## Phase 2: Enhanced Browser (v0.2.0) ✅ - - [x] Annotated screenshots (--annotate flag, ref labels overlaid on screenshot) - - [x] Snapshot diffing (--diff flag, unified diff against previous snapshot) - - [x] Dialog handling (auto-accept/dismiss, dialog buffer, prevents browser lockup) - - [x] File upload (upload <sel> <file...>) - - [x] Cursor-interactive elements (-C flag, cursor:pointer/onclick/tabindex scan) - - [x] Element state checks (is visible/hidden/enabled/disabled/checked/editable/focused) - - [x] CircularBuffer — O(1) ring buffer for console/network/dialog (was O(n) array+shift) - - [x] Async buffer flush with Bun.write() (was appendFileSync) - - [x] Health check with page.evaluate('1') + 2s timeout - - [x] Playwright error wrapping — actionable messages for AI agents - - [x] Fix useragent — context recreation preserves cookies/storage/URLs - - [x] DRY: getCleanText exported, command sets in chain updated - - [x] 148 integration tests (was ~63) - -## Phase 3: QA Testing Agent (v0.3.0) - - [x] `/qa` SKILL.md — 6-phase workflow: Initialize → Authenticate → Orient → Explore → Document → Wrap up - - [x] Issue taxonomy reference (7 categories: visual, functional, UX, content, performance, console, accessibility) - - [x] Severity classification (critical/high/medium/low) - - [x] Exploration checklist per page - - [x] Report template (structured markdown with per-issue evidence) - - [x] Repro-first philosophy: every issue gets evidence before moving on - - [x] Two
evidence tiers: interactive bugs (multi-step screenshots), static bugs (single annotated screenshot) - - [x] Key guidance: 5-10 well-documented issues per session, depth over breadth, write incrementally - - [x] Three modes: full (systematic), quick (30-second smoke test), regression (compare against baseline) - - [x] Framework detection guidance (Next.js, Rails, WordPress, SPA) - - [x] Health score rubric (7 categories, weighted average) - - [x] `wait --networkidle` / `wait --load` / `wait --domcontentloaded` - - [x] `console --errors` (filter to error/warning only) - - [x] `cookie-import <file>` (bulk cookie import with auto-fill domain) - - [x] `browse/bin/find-browse` (DRY binary discovery across skills) - - [ ] Video recording (deferred to Phase 5 — recreateContext destroys page state) - -## Phase 3.5: Browser Cookie Import (v0.3.x) - - [x] `cookie-import-browser` command (Chromium cookie DB decryption) - - [x] Cookie picker web UI (served from browse server) - - [x] `/setup-browser-cookies` skill - - [x] Unit tests with encrypted cookie fixtures (18 tests) - - [x] Browser registry (Comet, Chrome, Arc, Brave, Edge) - -## Phase 3.6: Visual PR Annotations + S3 Upload - - [ ] `/setup-gstack-upload` skill (configure S3 bucket for image hosting) - - [ ] `browse/bin/gstack-upload` helper (upload file to S3, return public URL) - - [ ] `/ship` Step 7.5: visual verification with screenshots in PR body - - [ ] `/review` Step 4.5: visual review with annotated screenshots in PR - - [ ] WebM → GIF conversion (ffmpeg) for video evidence in PRs - - [ ] README documentation for visual PR annotations - -## Phase 4: Skill + Browser Integration - - [ ] ship + browse: post-deploy verification - - Browse staging/preview URL after push - - Screenshot key pages - - Check console for JS errors - - Compare staging vs prod via snapshot diff - - Include verification screenshots in PR body - - STOP if critical errors found - - [ ] review + browse: visual diff review - - Browse PR's preview
deploy - - Annotated screenshots of changed pages - - Compare against production visually - - Check responsive layouts (mobile/tablet/desktop) - - Verify accessibility tree hasn't regressed - - [ ] deploy-verify skill: lightweight post-deploy smoke test - - Hit key URLs, verify 200s - - Screenshot critical pages - - Console error check - - Compare against baseline snapshots - - Pass/fail with evidence - -## Phase 5: State & Sessions - - [ ] Bundle server.ts into compiled binary (eliminate resolveServerScript() fallback chain entirely) (P2, M) - - [ ] v20 encryption format support (AES-256-GCM) — future Chromium versions may change from v10 - - [ ] Sessions (isolated browser instances with separate cookies/storage/history) - - [ ] State persistence (save/load cookies + localStorage to JSON files) - - [ ] Auth vault (encrypted credential storage, referenced by name, LLM never sees passwords) - - [ ] Video recording (record start/stop — needs sessions for clean context lifecycle) - - [ ] retro + browse: deployment health tracking - - Screenshot production state - - Check perf metrics (page load times) - - Count console errors across key pages - - Track trends over retro window - -## Phase 6: Advanced Browser - - [ ] Iframe support (frame <sel>, frame main) - - [ ] Semantic locators (find role/label/text/placeholder/testid with actions) - - [ ] Device emulation presets (set device "iPhone 16 Pro") - - [ ] Network mocking/routing (intercept, block, mock requests) - - [ ] Download handling (click-to-download with path control) - - [ ] Content safety (--max-output truncation, --allowed-domains) - - [ ] Streaming (WebSocket live preview for pair browsing) - - [ ] CDP mode (connect to already-running Chrome/Electron apps) - -## Future Ideas - - [ ] Linux/Windows cookie decryption (GNOME Keyring / kwallet / DPAPI) - - [ ] Trend tracking across QA runs — compare baseline.json over time, detect regressions (P2, S) - - [ ] CI/CD integration — `/qa` as GitHub Action step, fail PR if
health score drops (P2, M) - - [ ] Accessibility audit mode — `--a11y` flag for focused accessibility testing (P3, S) - - [ ] Greptile training feedback loop — export suppression patterns to Greptile team for model improvement (P3, S) - -## Ideas & Notes - - Browser is the nervous system — every skill should be able to see, interact with, and verify the web - - Skills are the product; the browser enables them - - One repo, one install, entire AI engineering workflow - - Bun compiled binary matches Rust CLI performance for this use case (bottleneck is Chromium, not CLI parsing) - - Accessibility tree snapshots use ~200-400 tokens vs ~3000-5000 for full DOM — critical for AI context efficiency - - Locator map approach for refs: store Map on BrowserManager, no DOM mutation, no CSP issues - - Snapshot scoping (-i, -c, -d, -s flags) is critical for performance on large pages - - All new commands follow existing pattern: add to command set, add switch case, return string +stop vibecoding! full stop diff --git a/VERSION b/VERSION deleted file mode 100644 index d15723f..0000000 --- a/VERSION +++ /dev/null @@ -1 +0,0 @@ -0.3.2 diff --git a/bin/dev-setup b/bin/dev-setup deleted file mode 100755 index 709cca4..0000000 --- a/bin/dev-setup +++ /dev/null @@ -1,36 +0,0 @@ -#!/usr/bin/env bash -# Set up gstack for local development — test skills from within this repo. -# -# Creates .claude/skills/gstack → (symlink to repo root) so Claude Code -# discovers skills from your working tree. Changes take effect immediately. -# -# Usage: bin/dev-setup # set up -# bin/dev-teardown # clean up -set -e - -REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)" - -# 1. Create .claude/skills/ inside the repo -mkdir -p "$REPO_ROOT/.claude/skills" - -# 2. Symlink .claude/skills/gstack → repo root -# This makes setup think it's inside a real .claude/skills/ directory -GSTACK_LINK="$REPO_ROOT/.claude/skills/gstack" -if [ -L "$GSTACK_LINK" ]; then - echo "Updating existing symlink..." 
- rm "$GSTACK_LINK" -elif [ -d "$GSTACK_LINK" ]; then - echo "Error: .claude/skills/gstack is a real directory, not a symlink." >&2 - echo "Remove it manually if you want to use dev mode." >&2 - exit 1 -fi -ln -s "$REPO_ROOT" "$GSTACK_LINK" - -# 3. Run setup via the symlink so it detects .claude/skills/ as its parent -"$GSTACK_LINK/setup" - -echo "" -echo "Dev mode active. Skills resolve from this working tree." -echo "Edit any SKILL.md and test immediately — no copy/deploy needed." -echo "" -echo "To tear down: bin/dev-teardown" diff --git a/bin/dev-teardown b/bin/dev-teardown deleted file mode 100755 index e333a75..0000000 --- a/bin/dev-teardown +++ /dev/null @@ -1,39 +0,0 @@ -#!/usr/bin/env bash -# Remove local dev skill symlinks. Restores global gstack as the active install. -set -e - -REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)" -SKILLS_DIR="$REPO_ROOT/.claude/skills" - -if [ ! -d "$SKILLS_DIR" ]; then - echo "Nothing to tear down — .claude/skills/ doesn't exist." - exit 0 -fi - -# Remove individual skill symlinks -removed=() -for link in "$SKILLS_DIR"/*/; do - name="$(basename "$link")" - [ "$name" = "gstack" ] && continue - if [ -L "${link%/}" ]; then - rm "${link%/}" - removed+=("$name") - fi -done - -# Remove the gstack symlink -if [ -L "$SKILLS_DIR/gstack" ]; then - rm "$SKILLS_DIR/gstack" - removed+=("gstack") -fi - -# Clean up empty dirs -rmdir "$SKILLS_DIR" 2>/dev/null || true -rmdir "$REPO_ROOT/.claude" 2>/dev/null || true - -if [ ${#removed[@]} -gt 0 ]; then - echo "Removed: ${removed[*]}" -else - echo "No symlinks found." -fi -echo "Dev mode deactivated. Global gstack (~/.claude/skills/gstack) is now active." diff --git a/browse/SKILL.md b/browse/SKILL.md deleted file mode 100644 index 99c979c..0000000 --- a/browse/SKILL.md +++ /dev/null @@ -1,128 +0,0 @@ ---- -name: browse -version: 1.1.0 -description: | - Fast headless browser for QA testing and site dogfooding. 
Navigate any URL, interact with - elements, verify page state, diff before/after actions, take annotated screenshots, check - responsive layouts, test forms and uploads, handle dialogs, and assert element states. - ~100ms per command. Use when you need to test a feature, verify a deployment, dogfood a - user flow, or file a bug with evidence. -allowed-tools: - - Bash - - Read - ---- - -# browse: QA Testing & Dogfooding - -Persistent headless Chromium. First call auto-starts (~3s), then ~100ms per command. -State persists between calls (cookies, tabs, login sessions). - -## Core QA Patterns - -### 1. Verify a page loads correctly -```bash -$B goto https://yourapp.com -$B text # content loads? -$B console # JS errors? -$B network # failed requests? -$B is visible ".main-content" # key elements present? -``` - -### 2. Test a user flow -```bash -$B goto https://app.com/login -$B snapshot -i # see all interactive elements -$B fill @e3 "user@test.com" -$B fill @e4 "password" -$B click @e5 # submit -$B snapshot -D # diff: what changed after submit? -$B is visible ".dashboard" # success state present? -``` - -### 3. Verify an action worked -```bash -$B snapshot # baseline -$B click @e3 # do something -$B snapshot -D # unified diff shows exactly what changed -``` - -### 4. Visual evidence for bug reports -```bash -$B snapshot -i -a -o /tmp/annotated.png # labeled screenshot -$B screenshot /tmp/bug.png # plain screenshot -$B console # error log -``` - -### 5. Find all clickable elements (including non-ARIA) -```bash -$B snapshot -C # finds divs with cursor:pointer, onclick, tabindex -$B click @c1 # interact with them -``` - -### 6. Assert element states -```bash -$B is visible ".modal" -$B is enabled "#submit-btn" -$B is disabled "#submit-btn" -$B is checked "#agree-checkbox" -$B is editable "#name-field" -$B is focused "#search-input" -$B js "document.body.textContent.includes('Success')" -``` - -### 7. 
Test responsive layouts -```bash -$B responsive /tmp/layout # mobile + tablet + desktop screenshots -$B viewport 375x812 # or set specific viewport -$B screenshot /tmp/mobile.png -``` - -### 8. Test file uploads -```bash -$B upload "#file-input" /path/to/file.pdf -$B is visible ".upload-success" -``` - -### 9. Test dialogs -```bash -$B dialog-accept "yes" # set up handler -$B click "#delete-button" # trigger dialog -$B dialog # see what appeared -$B snapshot -D # verify deletion happened -``` - -### 10. Compare environments -```bash -$B diff https://staging.app.com https://prod.app.com -``` - -## Snapshot Flags - -``` --i Interactive elements only (buttons, links, inputs) --c Compact (no empty structural nodes) --d Limit depth --s Scope to CSS selector --D Diff against previous snapshot --a Annotated screenshot with ref labels --o Output path for screenshot --C Cursor-interactive elements (@c refs) -``` - -Combine: `$B snapshot -i -a -C -o /tmp/annotated.png` - -Use @refs after snapshot: `$B click @e3`, `$B fill @e4 "value"`, `$B click @c1` - -## Full Command List - -**Navigate:** goto, back, forward, reload, url -**Read:** text, html, links, forms, accessibility -**Snapshot:** snapshot (with flags above) -**Interact:** click, fill, select, hover, type, press, scroll, wait, wait --networkidle, wait --load, viewport, upload, cookie-import, dialog-accept, dialog-dismiss -**Inspect:** js, eval, css, attrs, is, console, console --errors, network, dialog, cookies, storage, perf -**Visual:** screenshot, pdf, responsive -**Compare:** diff -**Multi-step:** chain (pipe JSON array) -**Tabs:** tabs, tab, newtab, closetab -**Server:** status, stop, restart diff --git a/browse/bin/find-browse b/browse/bin/find-browse deleted file mode 100755 index db07f37..0000000 --- a/browse/bin/find-browse +++ /dev/null @@ -1,17 +0,0 @@ -#!/bin/bash -# Shim: delegates to compiled find-browse binary, falls back to basic discovery. 
-# The compiled binary adds version checking and META signal support. -DIR="$(cd "$(dirname "$0")/.." && pwd)/dist" -if test -x "$DIR/find-browse"; then - exec "$DIR/find-browse" "$@" -fi -# Fallback: basic discovery (no version check) -ROOT=$(git rev-parse --show-toplevel 2>/dev/null) -if [ -n "$ROOT" ] && test -x "$ROOT/.claude/skills/gstack/browse/dist/browse"; then - echo "$ROOT/.claude/skills/gstack/browse/dist/browse" -elif test -x "$HOME/.claude/skills/gstack/browse/dist/browse"; then - echo "$HOME/.claude/skills/gstack/browse/dist/browse" -else - echo "ERROR: browse binary not found. Run: cd && ./setup" >&2 - exit 1 -fi diff --git a/browse/src/browser-manager.ts b/browse/src/browser-manager.ts deleted file mode 100644 index f684ce1..0000000 --- a/browse/src/browser-manager.ts +++ /dev/null @@ -1,453 +0,0 @@ -/** - * Browser lifecycle manager - * - * Chromium crash handling: - * browser.on('disconnected') → log error → process.exit(1) - * CLI detects dead server → auto-restarts on next command - * We do NOT try to self-heal — don't hide failure. - * - * Dialog handling: - * page.on('dialog') → auto-accept by default → store in dialog buffer - * Prevents browser lockup from alert/confirm/prompt - * - * Context recreation (useragent): - * recreateContext() saves cookies/storage/URLs, creates new context, - * restores state. Falls back to clean slate on any failure. 
- */ - -import { chromium, type Browser, type BrowserContext, type Page, type Locator } from 'playwright'; -import { addConsoleEntry, addNetworkEntry, addDialogEntry, networkBuffer, type DialogEntry } from './buffers'; - -export class BrowserManager { - private browser: Browser | null = null; - private context: BrowserContext | null = null; - private pages: Map<number, Page> = new Map(); - private activeTabId: number = 0; - private nextTabId: number = 1; - private extraHeaders: Record<string, string> = {}; - private customUserAgent: string | null = null; - - /** Server port — set after server starts, used by cookie-import-browser command */ - public serverPort: number = 0; - - // ─── Ref Map (snapshot → @e1, @e2, @c1, @c2, ...) ──────── - private refMap: Map<string, Locator> = new Map(); - - // ─── Snapshot Diffing ───────────────────────────────────── - // NOT cleared on navigation — it's a text baseline for diffing - private lastSnapshot: string | null = null; - - // ─── Dialog Handling ────────────────────────────────────── - private dialogAutoAccept: boolean = true; - private dialogPromptText: string | null = null; - - async launch() { - this.browser = await chromium.launch({ headless: true }); - - // Chromium crash → exit with clear message - this.browser.on('disconnected', () => { - console.error('[browse] FATAL: Chromium process crashed or was killed.
Server exiting.'); - console.error('[browse] Console/network logs flushed to .gstack/browse-*.log'); - process.exit(1); - }); - - const contextOptions: any = { - viewport: { width: 1280, height: 720 }, - }; - if (this.customUserAgent) { - contextOptions.userAgent = this.customUserAgent; - } - this.context = await this.browser.newContext(contextOptions); - - if (Object.keys(this.extraHeaders).length > 0) { - await this.context.setExtraHTTPHeaders(this.extraHeaders); - } - - // Create first tab - await this.newTab(); - } - - async close() { - if (this.browser) { - // Remove disconnect handler to avoid exit during intentional close - this.browser.removeAllListeners('disconnected'); - await this.browser.close(); - this.browser = null; - } - } - - /** Health check — verifies Chromium is connected AND responsive */ - async isHealthy(): Promise<boolean> { - if (!this.browser || !this.browser.isConnected()) return false; - try { - const page = this.pages.get(this.activeTabId); - if (!page) return true; // connected but no pages — still healthy - await Promise.race([ - page.evaluate('1'), - new Promise((_, reject) => setTimeout(() => reject(new Error('timeout')), 2000)), - ]); - return true; - } catch { - return false; - } - } - - // ─── Tab Management ──────────────────────────────────────── - async newTab(url?: string): Promise<number> { - if (!this.context) throw new Error('Browser not launched'); - - const page = await this.context.newPage(); - const id = this.nextTabId++; - this.pages.set(id, page); - this.activeTabId = id; - - // Wire up console/network/dialog capture - this.wirePageEvents(page); - - if (url) { - await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 15000 }); - } - - return id; - } - - async closeTab(id?: number): Promise<void> { - const tabId = id ??
this.activeTabId; - const page = this.pages.get(tabId); - if (!page) throw new Error(`Tab ${tabId} not found`); - - await page.close(); - this.pages.delete(tabId); - - // Switch to another tab if we closed the active one - if (tabId === this.activeTabId) { - const remaining = [...this.pages.keys()]; - if (remaining.length > 0) { - this.activeTabId = remaining[remaining.length - 1]; - } else { - // No tabs left — create a new blank one - await this.newTab(); - } - } - } - - switchTab(id: number): void { - if (!this.pages.has(id)) throw new Error(`Tab ${id} not found`); - this.activeTabId = id; - } - - getTabCount(): number { - return this.pages.size; - } - - async getTabListWithTitles(): Promise<Array<{ id: number; url: string; title: string; active: boolean }>> { - const tabs: Array<{ id: number; url: string; title: string; active: boolean }> = []; - for (const [id, page] of this.pages) { - tabs.push({ - id, - url: page.url(), - title: await page.title().catch(() => ''), - active: id === this.activeTabId, - }); - } - return tabs; - } - - // ─── Page Access ─────────────────────────────────────────── - getPage(): Page { - const page = this.pages.get(this.activeTabId); - if (!page) throw new Error('No active page. Use "browse goto <url>" first.'); - return page; - } - - getCurrentUrl(): string { - try { - return this.getPage().url(); - } catch { - return 'about:blank'; - } - } - - // ─── Ref Map ────────────────────────────────────────────── - setRefMap(refs: Map<string, Locator>) { - this.refMap = refs; - } - - clearRefs() { - this.refMap.clear(); - } - - /** - * Resolve a selector that may be a @ref (e.g., "@e3", "@c1") or a CSS selector. - * Returns { locator } for refs or { selector } for CSS selectors. - */ - resolveRef(selector: string): { locator: Locator } | { selector: string } { - if (selector.startsWith('@e') || selector.startsWith('@c')) { - const ref = selector.slice(1); // "e3" or "c1" - const locator = this.refMap.get(ref); - if (!locator) { - throw new Error( - `Ref ${selector} not found.
Page may have changed — run 'snapshot' to get fresh refs.` - ); - } - return { locator }; - } - return { selector }; - } - - getRefCount(): number { - return this.refMap.size; - } - - // ─── Snapshot Diffing ───────────────────────────────────── - setLastSnapshot(text: string | null) { - this.lastSnapshot = text; - } - - getLastSnapshot(): string | null { - return this.lastSnapshot; - } - - // ─── Dialog Control ─────────────────────────────────────── - setDialogAutoAccept(accept: boolean) { - this.dialogAutoAccept = accept; - } - - getDialogAutoAccept(): boolean { - return this.dialogAutoAccept; - } - - setDialogPromptText(text: string | null) { - this.dialogPromptText = text; - } - - getDialogPromptText(): string | null { - return this.dialogPromptText; - } - - // ─── Viewport ────────────────────────────────────────────── - async setViewport(width: number, height: number) { - await this.getPage().setViewportSize({ width, height }); - } - - // ─── Extra Headers ───────────────────────────────────────── - async setExtraHeader(name: string, value: string) { - this.extraHeaders[name] = value; - if (this.context) { - await this.context.setExtraHTTPHeaders(this.extraHeaders); - } - } - - // ─── User Agent ──────────────────────────────────────────── - setUserAgent(ua: string) { - this.customUserAgent = ua; - } - - getUserAgent(): string | null { - return this.customUserAgent; - } - - /** - * Recreate the browser context to apply user agent changes. - * Saves and restores cookies, localStorage, sessionStorage, and open pages. - * Falls back to a clean slate on any failure. - */ - async recreateContext(): Promise<string | null> { - if (!this.browser || !this.context) { - throw new Error('Browser not launched'); - } - - try { - // 1.
Save state from current context - const savedCookies = await this.context.cookies(); - const savedPages: Array<{ url: string; isActive: boolean; storage: any }> = []; - - for (const [id, page] of this.pages) { - const url = page.url(); - let storage = null; - try { - storage = await page.evaluate(() => ({ - localStorage: { ...localStorage }, - sessionStorage: { ...sessionStorage }, - })); - } catch {} - savedPages.push({ - url: url === 'about:blank' ? '' : url, - isActive: id === this.activeTabId, - storage, - }); - } - - // 2. Close old pages and context - for (const page of this.pages.values()) { - await page.close().catch(() => {}); - } - this.pages.clear(); - await this.context.close().catch(() => {}); - - // 3. Create new context with updated settings - const contextOptions: any = { - viewport: { width: 1280, height: 720 }, - }; - if (this.customUserAgent) { - contextOptions.userAgent = this.customUserAgent; - } - this.context = await this.browser.newContext(contextOptions); - - if (Object.keys(this.extraHeaders).length > 0) { - await this.context.setExtraHTTPHeaders(this.extraHeaders); - } - - // 4. Restore cookies - if (savedCookies.length > 0) { - await this.context.addCookies(savedCookies); - } - - // 5. Re-create pages - let activeId: number | null = null; - for (const saved of savedPages) { - const page = await this.context.newPage(); - const id = this.nextTabId++; - this.pages.set(id, page); - this.wirePageEvents(page); - - if (saved.url) { - await page.goto(saved.url, { waitUntil: 'domcontentloaded', timeout: 15000 }).catch(() => {}); - } - - // 6. 
Restore storage - if (saved.storage) { - try { - await page.evaluate((s: any) => { - if (s.localStorage) { - for (const [k, v] of Object.entries(s.localStorage)) { - localStorage.setItem(k, v as string); - } - } - if (s.sessionStorage) { - for (const [k, v] of Object.entries(s.sessionStorage)) { - sessionStorage.setItem(k, v as string); - } - } - }, saved.storage); - } catch {} - } - - if (saved.isActive) activeId = id; - } - - // If no pages were saved, create a blank one - if (this.pages.size === 0) { - await this.newTab(); - } else { - this.activeTabId = activeId ?? [...this.pages.keys()][0]; - } - - // Clear refs — pages are new, locators are stale - this.clearRefs(); - - return null; // success - } catch (err: any) { - // Fallback: create a clean context + blank tab - try { - this.pages.clear(); - if (this.context) await this.context.close().catch(() => {}); - - const contextOptions: any = { - viewport: { width: 1280, height: 720 }, - }; - if (this.customUserAgent) { - contextOptions.userAgent = this.customUserAgent; - } - this.context = await this.browser!.newContext(contextOptions); - await this.newTab(); - this.clearRefs(); - } catch { - // If even the fallback fails, we're in trouble — but browser is still alive - } - return `Context recreation failed: ${err.message}. Browser reset to blank tab.`; - } - } - - // ─── Console/Network/Dialog/Ref Wiring ──────────────────── - private wirePageEvents(page: Page) { - // Clear ref map on navigation — refs point to stale elements after page change - // (lastSnapshot is NOT cleared — it's a text baseline for diffing) - page.on('framenavigated', (frame) => { - if (frame === page.mainFrame()) { - this.clearRefs(); - } - }); - - // ─── Dialog auto-handling (prevents browser lockup) ───── - page.on('dialog', async (dialog) => { - const entry: DialogEntry = { - timestamp: Date.now(), - type: dialog.type(), - message: dialog.message(), - defaultValue: dialog.defaultValue() || undefined, - action: this.dialogAutoAccept ? 
'accepted' : 'dismissed', - response: this.dialogAutoAccept ? (this.dialogPromptText ?? undefined) : undefined, - }; - addDialogEntry(entry); - - try { - if (this.dialogAutoAccept) { - await dialog.accept(this.dialogPromptText ?? undefined); - } else { - await dialog.dismiss(); - } - } catch { - // Dialog may have been dismissed by navigation — ignore - } - }); - - page.on('console', (msg) => { - addConsoleEntry({ - timestamp: Date.now(), - level: msg.type(), - text: msg.text(), - }); - }); - - page.on('request', (req) => { - addNetworkEntry({ - timestamp: Date.now(), - method: req.method(), - url: req.url(), - }); - }); - - page.on('response', (res) => { - // Find matching request entry and update it (backward scan) - const url = res.url(); - const status = res.status(); - for (let i = networkBuffer.length - 1; i >= 0; i--) { - const entry = networkBuffer.get(i); - if (entry && entry.url === url && !entry.status) { - networkBuffer.set(i, { ...entry, status, duration: Date.now() - entry.timestamp }); - break; - } - } - }); - - // Capture response sizes via response finished - page.on('requestfinished', async (req) => { - try { - const res = await req.response(); - if (res) { - const url = req.url(); - const body = await res.body().catch(() => null); - const size = body ? body.length : 0; - for (let i = networkBuffer.length - 1; i >= 0; i--) { - const entry = networkBuffer.get(i); - if (entry && entry.url === url && !entry.size) { - networkBuffer.set(i, { ...entry, size }); - break; - } - } - } - } catch {} - }); - } -} diff --git a/browse/src/buffers.ts b/browse/src/buffers.ts deleted file mode 100644 index 27d3796..0000000 --- a/browse/src/buffers.ts +++ /dev/null @@ -1,137 +0,0 @@ -/** - * Shared buffers and types — extracted to break circular dependency - * between server.ts and browser-manager.ts - * - * CircularBuffer: O(1) insert ring buffer with fixed capacity. 
- *
- * ┌───┬───┬───┬───┬───┬───┐
- * │ 3 │ 4 │ 5 │   │ 1 │ 2 │   capacity=6, head=4, size=5
- * └───┴───┴───┴───┴─▲─┴───┘
- *                   │
- *                   head (oldest entry)
- *
- * push() writes at (head+size) % capacity, O(1)
- * toArray() returns entries in insertion order, O(n)
- * totalAdded keeps incrementing past capacity (flush cursor)
- */
-
-// ─── CircularBuffer ─────────────────────────────────────────
-
-export class CircularBuffer<T> {
-  private buffer: (T | undefined)[];
-  private head: number = 0;
-  private _size: number = 0;
-  private _totalAdded: number = 0;
-  readonly capacity: number;
-
-  constructor(capacity: number) {
-    this.capacity = capacity;
-    this.buffer = new Array(capacity);
-  }
-
-  push(entry: T): void {
-    const index = (this.head + this._size) % this.capacity;
-    this.buffer[index] = entry;
-    if (this._size < this.capacity) {
-      this._size++;
-    } else {
-      // Buffer full — advance head (overwrites oldest)
-      this.head = (this.head + 1) % this.capacity;
-    }
-    this._totalAdded++;
-  }
-
-  /** Return entries in insertion order (oldest first) */
-  toArray(): T[] {
-    const result: T[] = [];
-    for (let i = 0; i < this._size; i++) {
-      result.push(this.buffer[(this.head + i) % this.capacity] as T);
-    }
-    return result;
-  }
-
-  /** Return the last N entries, in insertion order (oldest of the window first) */
-  last(n: number): T[] {
-    const count = Math.min(n, this._size);
-    const result: T[] = [];
-    const start = (this.head + this._size - count) % this.capacity;
-    for (let i = 0; i < count; i++) {
-      result.push(this.buffer[(start + i) % this.capacity] as T);
-    }
-    return result;
-  }
-
-  get length(): number {
-    return this._size;
-  }
-
-  get totalAdded(): number {
-    return this._totalAdded;
-  }
-
-  clear(): void {
-    this.head = 0;
-    this._size = 0;
-    // Don't reset totalAdded — flush cursor depends on it
-  }
-
-  /** Get entry by index (0 = oldest) — used by network response matching */
-  get(index: number): T | undefined {
-    if (index < 0 || index >= this._size) return undefined;
-    return this.buffer[(this.head + index) % this.capacity];
-  }
-
-  /** Set entry by index (0 = oldest) — used by network response matching */
-  set(index: number, entry: T): void {
-    if (index < 0 || index >= this._size) return;
-    this.buffer[(this.head + index) % this.capacity] = entry;
-  }
-}
-
-// ─── Entry Types ────────────────────────────────────────────
-
-export interface LogEntry {
-  timestamp: number;
-  level: string;
-  text: string;
-}
-
-export interface NetworkEntry {
-  timestamp: number;
-  method: string;
-  url: string;
-  status?: number;
-  duration?: number;
-  size?: number;
-}
-
-export interface DialogEntry {
-  timestamp: number;
-  type: string; // 'alert' | 'confirm' | 'prompt' | 'beforeunload'
-  message: string;
-  defaultValue?: string;
-  action: string; // 'accepted' | 'dismissed'
-  response?: string; // text provided for prompt
-}
-
-// ─── Buffer Instances ───────────────────────────────────────
-
-const HIGH_WATER_MARK = 50_000;
-
-export const consoleBuffer = new CircularBuffer<LogEntry>(HIGH_WATER_MARK);
-export const networkBuffer = new CircularBuffer<NetworkEntry>(HIGH_WATER_MARK);
-export const dialogBuffer = new CircularBuffer<DialogEntry>(HIGH_WATER_MARK);
-
-// ─── Convenience add functions ──────────────────────────────
-
-export function addConsoleEntry(entry: LogEntry) {
-  consoleBuffer.push(entry);
-}
-
-export function addNetworkEntry(entry: NetworkEntry) {
-  networkBuffer.push(entry);
-}
-
-export function addDialogEntry(entry: DialogEntry) {
-  dialogBuffer.push(entry);
-}
diff --git a/browse/src/cli.ts b/browse/src/cli.ts
deleted file mode 100644
index f8b7902..0000000
--- a/browse/src/cli.ts
+++ /dev/null
@@ -1,325 +0,0 @@
-/**
- * gstack CLI — thin wrapper that talks to the persistent server
- *
- * Flow:
- *   1. Read .gstack/browse.json for port + token
- *   2. If missing or stale PID → start server in background
- *   3. Health check + version mismatch detection
- *   4. Send command via HTTP POST
- *   5.
Print response to stdout (or stderr for errors)
- */
-
-import * as fs from 'fs';
-import * as path from 'path';
-import { resolveConfig, ensureStateDir, readVersionHash } from './config';
-
-const config = resolveConfig();
-const MAX_START_WAIT = 8000; // 8 seconds to start
-
-export function resolveServerScript(
-  env: Record<string, string | undefined> = process.env,
-  metaDir: string = import.meta.dir,
-  execPath: string = process.execPath
-): string {
-  if (env.BROWSE_SERVER_SCRIPT) {
-    return env.BROWSE_SERVER_SCRIPT;
-  }
-
-  // Dev mode: cli.ts runs directly from browse/src
-  if (metaDir.startsWith('/') && !metaDir.includes('$bunfs')) {
-    const direct = path.resolve(metaDir, 'server.ts');
-    if (fs.existsSync(direct)) {
-      return direct;
-    }
-  }
-
-  // Compiled binary: derive the source tree from browse/dist/browse
-  if (execPath) {
-    const adjacent = path.resolve(path.dirname(execPath), '..', 'src', 'server.ts');
-    if (fs.existsSync(adjacent)) {
-      return adjacent;
-    }
-  }
-
-  throw new Error(
-    'Cannot find server.ts. Set BROWSE_SERVER_SCRIPT env or run from the browse source tree.'
-  );
-}
-
-const SERVER_SCRIPT = resolveServerScript();
-
-interface ServerState {
-  pid: number;
-  port: number;
-  token: string;
-  startedAt: string;
-  serverPath: string;
-  binaryVersion?: string;
-}
-
-// ─── State File ──────────────────────────────────────────────
-function readState(): ServerState | null {
-  try {
-    const data = fs.readFileSync(config.stateFile, 'utf-8');
-    return JSON.parse(data);
-  } catch {
-    return null;
-  }
-}
-
-function isProcessAlive(pid: number): boolean {
-  try {
-    process.kill(pid, 0);
-    return true;
-  } catch {
-    return false;
-  }
-}
-
-// ─── Process Management ──────────────────────────────────────
-async function killServer(pid: number): Promise<void> {
-  if (!isProcessAlive(pid)) return;
-
-  try { process.kill(pid, 'SIGTERM'); } catch { return; }
-
-  // Wait up to 2s for graceful shutdown
-  const deadline = Date.now() + 2000;
-  while (Date.now() < deadline && isProcessAlive(pid)) {
-    await Bun.sleep(100);
-  }
-
-  // Force kill if still alive
-  if (isProcessAlive(pid)) {
-    try { process.kill(pid, 'SIGKILL'); } catch {}
-  }
-}
-
-/**
- * Clean up legacy /tmp/browse-server*.json files from before project-local state.
- * Verifies PID ownership before sending signals.
- */ -function cleanupLegacyState(): void { - try { - const files = fs.readdirSync('/tmp').filter(f => f.startsWith('browse-server') && f.endsWith('.json')); - for (const file of files) { - const fullPath = `/tmp/${file}`; - try { - const data = JSON.parse(fs.readFileSync(fullPath, 'utf-8')); - if (data.pid && isProcessAlive(data.pid)) { - // Verify this is actually a browse server before killing - const check = Bun.spawnSync(['ps', '-p', String(data.pid), '-o', 'command='], { - stdout: 'pipe', stderr: 'pipe', timeout: 2000, - }); - const cmd = check.stdout.toString().trim(); - if (cmd.includes('bun') || cmd.includes('server.ts')) { - try { process.kill(data.pid, 'SIGTERM'); } catch {} - } - } - fs.unlinkSync(fullPath); - } catch { - // Best effort — skip files we can't parse or clean up - } - } - // Clean up legacy log files too - const logFiles = fs.readdirSync('/tmp').filter(f => - f.startsWith('browse-console') || f.startsWith('browse-network') || f.startsWith('browse-dialog') - ); - for (const file of logFiles) { - try { fs.unlinkSync(`/tmp/${file}`); } catch {} - } - } catch { - // /tmp read failed — skip legacy cleanup - } -} - -// ─── Server Lifecycle ────────────────────────────────────────── -async function startServer(): Promise { - ensureStateDir(config); - - // Clean up stale state file - try { fs.unlinkSync(config.stateFile); } catch {} - - // Start server as detached background process - const proc = Bun.spawn(['bun', 'run', SERVER_SCRIPT], { - stdio: ['ignore', 'pipe', 'pipe'], - env: { ...process.env, BROWSE_STATE_FILE: config.stateFile }, - }); - - // Don't hold the CLI open - proc.unref(); - - // Wait for state file to appear - const start = Date.now(); - while (Date.now() - start < MAX_START_WAIT) { - const state = readState(); - if (state && isProcessAlive(state.pid)) { - return state; - } - await Bun.sleep(100); - } - - // If we get here, server didn't start in time - // Try to read stderr for error message - const stderr = proc.stderr; - if 
(stderr) { - const reader = stderr.getReader(); - const { value } = await reader.read(); - if (value) { - const errText = new TextDecoder().decode(value); - throw new Error(`Server failed to start:\n${errText}`); - } - } - throw new Error(`Server failed to start within ${MAX_START_WAIT / 1000}s`); -} - -async function ensureServer(): Promise { - const state = readState(); - - if (state && isProcessAlive(state.pid)) { - // Check for binary version mismatch (auto-restart on update) - const currentVersion = readVersionHash(); - if (currentVersion && state.binaryVersion && currentVersion !== state.binaryVersion) { - console.error('[browse] Binary updated, restarting server...'); - await killServer(state.pid); - return startServer(); - } - - // Server appears alive — do a health check - try { - const resp = await fetch(`http://127.0.0.1:${state.port}/health`, { - signal: AbortSignal.timeout(2000), - }); - if (resp.ok) { - const health = await resp.json() as any; - if (health.status === 'healthy') { - return state; - } - } - } catch { - // Health check failed — server is dead or unhealthy - } - } - - // Need to (re)start - console.error('[browse] Starting server...'); - return startServer(); -} - -// ─── Command Dispatch ────────────────────────────────────────── -async function sendCommand(state: ServerState, command: string, args: string[], retries = 0): Promise { - const body = JSON.stringify({ command, args }); - - try { - const resp = await fetch(`http://127.0.0.1:${state.port}/command`, { - method: 'POST', - headers: { - 'Content-Type': 'application/json', - 'Authorization': `Bearer ${state.token}`, - }, - body, - signal: AbortSignal.timeout(30000), - }); - - if (resp.status === 401) { - // Token mismatch — server may have restarted - console.error('[browse] Auth failed — server may have restarted. 
Retrying...'); - const newState = readState(); - if (newState && newState.token !== state.token) { - return sendCommand(newState, command, args); - } - throw new Error('Authentication failed'); - } - - const text = await resp.text(); - - if (resp.ok) { - process.stdout.write(text); - if (!text.endsWith('\n')) process.stdout.write('\n'); - } else { - // Try to parse as JSON error - try { - const err = JSON.parse(text); - console.error(err.error || text); - if (err.hint) console.error(err.hint); - } catch { - console.error(text); - } - process.exit(1); - } - } catch (err: any) { - if (err.name === 'AbortError') { - console.error('[browse] Command timed out after 30s'); - process.exit(1); - } - // Connection error — server may have crashed - if (err.code === 'ECONNREFUSED' || err.code === 'ECONNRESET' || err.message?.includes('fetch failed')) { - if (retries >= 1) throw new Error('[browse] Server crashed twice in a row — aborting'); - console.error('[browse] Server connection lost. Restarting...'); - const newState = await startServer(); - return sendCommand(newState, command, args, retries + 1); - } - throw err; - } -} - -// ─── Main ────────────────────────────────────────────────────── -async function main() { - const args = process.argv.slice(2); - - if (args.length === 0 || args[0] === '--help' || args[0] === '-h') { - console.log(`gstack browse — Fast headless browser for AI coding agents - -Usage: browse [args...] - -Navigation: goto | back | forward | reload | url -Content: text | html [sel] | links | forms | accessibility -Interaction: click | fill | select - hover | type | press - scroll [sel] | wait | viewport - upload [file2...] 
- cookie-import - cookie-import-browser [browser] [--domain ] -Inspection: js | eval | css | attrs - console [--clear|--errors] | network [--clear] | dialog [--clear] - cookies | storage [set ] | perf - is (visible|hidden|enabled|disabled|checked|editable|focused) -Visual: screenshot [path] | pdf [path] | responsive [prefix] -Snapshot: snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o path] [-C] - -D/--diff: diff against previous snapshot - -a/--annotate: annotated screenshot with ref labels - -C/--cursor-interactive: find non-ARIA clickable elements -Compare: diff -Multi-step: chain (reads JSON from stdin) -Tabs: tabs | tab | newtab [url] | closetab [id] -Server: status | cookie = | header : - useragent | stop | restart -Dialogs: dialog-accept [text] | dialog-dismiss - -Refs: After 'snapshot', use @e1, @e2... as selectors: - click @e3 | fill @e4 "value" | hover @e1 - @c refs from -C: click @c1`); - process.exit(0); - } - - // One-time cleanup of legacy /tmp state files - cleanupLegacyState(); - - const command = args[0]; - const commandArgs = args.slice(1); - - // Special case: chain reads from stdin - if (command === 'chain' && commandArgs.length === 0) { - const stdin = await Bun.stdin.text(); - commandArgs.push(stdin.trim()); - } - - const state = await ensureServer(); - await sendCommand(state, command, commandArgs); -} - -if (import.meta.main) { - main().catch((err) => { - console.error(`[browse] ${err.message}`); - process.exit(1); - }); -} diff --git a/browse/src/config.ts b/browse/src/config.ts deleted file mode 100644 index 7689291..0000000 --- a/browse/src/config.ts +++ /dev/null @@ -1,105 +0,0 @@ -/** - * Shared config for browse CLI + server. - * - * Resolution: - * 1. BROWSE_STATE_FILE env → derive stateDir from parent - * 2. git rev-parse --show-toplevel → projectDir/.gstack/ - * 3. process.cwd() fallback (non-git environments) - * - * The CLI computes the config and passes BROWSE_STATE_FILE to the - * spawned server. 
The server derives all paths from that env var.
- */
-
-import * as fs from 'fs';
-import * as path from 'path';
-
-export interface BrowseConfig {
-  projectDir: string;
-  stateDir: string;
-  stateFile: string;
-  consoleLog: string;
-  networkLog: string;
-  dialogLog: string;
-}
-
-/**
- * Detect the git repository root, or null if not in a repo / git unavailable.
- */
-export function getGitRoot(): string | null {
-  try {
-    const proc = Bun.spawnSync(['git', 'rev-parse', '--show-toplevel'], {
-      stdout: 'pipe',
-      stderr: 'pipe',
-      timeout: 2_000, // Don't hang if .git is broken
-    });
-    if (proc.exitCode !== 0) return null;
-    return proc.stdout.toString().trim() || null;
-  } catch {
-    return null;
-  }
-}
-
-/**
- * Resolve all browse config paths.
- *
- * If BROWSE_STATE_FILE is set (e.g. by CLI when spawning server, or by
- * tests for isolation), all paths are derived from it. Otherwise, the
- * project root is detected via git or cwd.
- */
-export function resolveConfig(
-  env: Record<string, string | undefined> = process.env,
-): BrowseConfig {
-  let stateFile: string;
-  let stateDir: string;
-  let projectDir: string;
-
-  if (env.BROWSE_STATE_FILE) {
-    stateFile = env.BROWSE_STATE_FILE;
-    stateDir = path.dirname(stateFile);
-    projectDir = path.dirname(stateDir); // parent of .gstack/
-  } else {
-    projectDir = getGitRoot() || process.cwd();
-    stateDir = path.join(projectDir, '.gstack');
-    stateFile = path.join(stateDir, 'browse.json');
-  }
-
-  return {
-    projectDir,
-    stateDir,
-    stateFile,
-    consoleLog: path.join(stateDir, 'browse-console.log'),
-    networkLog: path.join(stateDir, 'browse-network.log'),
-    dialogLog: path.join(stateDir, 'browse-dialog.log'),
-  };
-}
-
-/**
- * Create the .gstack/ state directory if it doesn't exist.
- * Throws with a clear message on permission errors.
- */
-export function ensureStateDir(config: BrowseConfig): void {
-  try {
-    fs.mkdirSync(config.stateDir, { recursive: true });
-  } catch (err: any) {
-    if (err.code === 'EACCES') {
-      throw new Error(`Cannot create state directory ${config.stateDir}: permission denied`);
-    }
-    if (err.code === 'ENOTDIR') {
-      throw new Error(`Cannot create state directory ${config.stateDir}: a file exists at that path`);
-    }
-    throw err;
-  }
-}
-
-/**
- * Read the binary version (git SHA) from browse/dist/.version.
- * Returns null if the file doesn't exist or can't be read.
- */
-export function readVersionHash(execPath: string = process.execPath): string | null {
-  try {
-    const versionFile = path.resolve(path.dirname(execPath), '.version');
-    return fs.readFileSync(versionFile, 'utf-8').trim() || null;
-  } catch {
-    return null;
-  }
-}
diff --git a/browse/src/cookie-import-browser.ts b/browse/src/cookie-import-browser.ts
deleted file mode 100644
index 29d9db3..0000000
--- a/browse/src/cookie-import-browser.ts
+++ /dev/null
@@ -1,417 +0,0 @@
-/**
- * Chromium browser cookie import — read and decrypt cookies from real browsers
- *
- * Supports macOS Chromium-based browsers: Comet, Chrome, Arc, Brave, Edge.
- * Pure logic module — no Playwright dependency, no HTTP concerns.
- *
- * Decryption pipeline (Chromium macOS "v10" format):
- *
- * ┌──────────────────────────────────────────────────────────────────┐
- * │ 1. Keychain: `security find-generic-password -s "<service>" -w`  │
- * │    → base64 password string                                      │
- * │                                                                  │
- * │ 2. Key derivation:                                               │
- * │    PBKDF2(password, salt="saltysalt", iter=1003, len=16, sha1)   │
- * │    → 16-byte AES key                                             │
- * │                                                                  │
- * │ 3.
For each cookie with encrypted_value starting with "v10": │ - * │ - Ciphertext = encrypted_value[3:] │ - * │ - IV = 16 bytes of 0x20 (space character) │ - * │ - Plaintext = AES-128-CBC-decrypt(key, iv, ciphertext) │ - * │ - Remove PKCS7 padding │ - * │ - Skip first 32 bytes (HMAC-SHA256 authentication tag) │ - * │ - Remaining bytes = cookie value (UTF-8) │ - * │ │ - * │ 4. If encrypted_value is empty but `value` field is set, │ - * │ use value directly (unencrypted cookie) │ - * │ │ - * │ 5. Chromium epoch: microseconds since 1601-01-01 │ - * │ Unix seconds = (epoch - 11644473600000000) / 1000000 │ - * │ │ - * │ 6. sameSite: 0→"None", 1→"Lax", 2→"Strict", else→"Lax" │ - * └──────────────────────────────────────────────────────────────────┘ - */ - -import { Database } from 'bun:sqlite'; -import * as crypto from 'crypto'; -import * as fs from 'fs'; -import * as path from 'path'; -import * as os from 'os'; - -// ─── Types ────────────────────────────────────────────────────── - -export interface BrowserInfo { - name: string; - dataDir: string; // relative to ~/Library/Application Support/ - keychainService: string; - aliases: string[]; -} - -export interface DomainEntry { - domain: string; - count: number; -} - -export interface ImportResult { - cookies: PlaywrightCookie[]; - count: number; - failed: number; - domainCounts: Record; -} - -export interface PlaywrightCookie { - name: string; - value: string; - domain: string; - path: string; - expires: number; - secure: boolean; - httpOnly: boolean; - sameSite: 'Strict' | 'Lax' | 'None'; -} - -export class CookieImportError extends Error { - constructor( - message: string, - public code: string, - public action?: 'retry', - ) { - super(message); - this.name = 'CookieImportError'; - } -} - -// ─── Browser Registry ─────────────────────────────────────────── -// Hardcoded — NEVER interpolate user input into shell commands. 
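The pipeline above can be exercised end to end without touching a real Keychain. The sketch below fabricates a `v10` blob from a made-up Safe Storage password (both the password and the cookie value are stand-ins, not real data) and then decrypts it exactly as steps 2-4 describe:

```typescript
import * as crypto from 'crypto';

// Stand-in for the base64 password that `security find-generic-password`
// would return; real code reads it from the macOS Keychain.
const safeStoragePassword = 'fake-safe-storage-password';

// Step 2: derive the 16-byte AES key.
const key = crypto.pbkdf2Sync(safeStoragePassword, 'saltysalt', 1003, 16, 'sha1');
const iv = Buffer.alloc(16, 0x20); // IV = 16 space characters

// Fabricate a "v10" encrypted_value so the example round-trips:
// 32 filler bytes stand in for the HMAC tag, then the cookie value.
const cipher = crypto.createCipheriv('aes-128-cbc', key, iv);
const plain = Buffer.concat([Buffer.alloc(32), Buffer.from('session=abc123')]);
const encryptedValue = Buffer.concat([
  Buffer.from('v10'),
  cipher.update(plain),
  cipher.final(),
]);

// Steps 3-4: strip the "v10" prefix, AES-128-CBC decrypt, drop the 32-byte tag.
const decipher = crypto.createDecipheriv('aes-128-cbc', key, iv);
const out = Buffer.concat([
  decipher.update(encryptedValue.subarray(3)),
  decipher.final(),
]);
console.log(out.subarray(32).toString('utf-8')); // session=abc123
```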
- -const BROWSER_REGISTRY: BrowserInfo[] = [ - { name: 'Comet', dataDir: 'Comet/', keychainService: 'Comet Safe Storage', aliases: ['comet', 'perplexity'] }, - { name: 'Chrome', dataDir: 'Google/Chrome/', keychainService: 'Chrome Safe Storage', aliases: ['chrome', 'google-chrome'] }, - { name: 'Arc', dataDir: 'Arc/User Data/', keychainService: 'Arc Safe Storage', aliases: ['arc'] }, - { name: 'Brave', dataDir: 'BraveSoftware/Brave-Browser/', keychainService: 'Brave Safe Storage', aliases: ['brave'] }, - { name: 'Edge', dataDir: 'Microsoft Edge/', keychainService: 'Microsoft Edge Safe Storage', aliases: ['edge'] }, -]; - -// ─── Key Cache ────────────────────────────────────────────────── -// Cache derived AES keys per browser. First import per browser does -// Keychain + PBKDF2. Subsequent imports reuse the cached key. - -const keyCache = new Map(); - -// ─── Public API ───────────────────────────────────────────────── - -/** - * Find which browsers are installed (have a cookie DB on disk). - */ -export function findInstalledBrowsers(): BrowserInfo[] { - const appSupport = path.join(os.homedir(), 'Library', 'Application Support'); - return BROWSER_REGISTRY.filter(b => { - const dbPath = path.join(appSupport, b.dataDir, 'Default', 'Cookies'); - try { return fs.existsSync(dbPath); } catch { return false; } - }); -} - -/** - * List unique cookie domains + counts from a browser's DB. No decryption. - */ -export function listDomains(browserName: string, profile = 'Default'): { domains: DomainEntry[]; browser: string } { - const browser = resolveBrowser(browserName); - const dbPath = getCookieDbPath(browser, profile); - const db = openDb(dbPath, browser.name); - try { - const now = chromiumNow(); - const rows = db.query( - `SELECT host_key AS domain, COUNT(*) AS count - FROM cookies - WHERE has_expires = 0 OR expires_utc > ? 
- GROUP BY host_key - ORDER BY count DESC` - ).all(now) as DomainEntry[]; - return { domains: rows, browser: browser.name }; - } finally { - db.close(); - } -} - -/** - * Decrypt and return Playwright-compatible cookies for specific domains. - */ -export async function importCookies( - browserName: string, - domains: string[], - profile = 'Default', -): Promise { - if (domains.length === 0) return { cookies: [], count: 0, failed: 0, domainCounts: {} }; - - const browser = resolveBrowser(browserName); - const derivedKey = await getDerivedKey(browser); - const dbPath = getCookieDbPath(browser, profile); - const db = openDb(dbPath, browser.name); - - try { - const now = chromiumNow(); - // Parameterized query — no SQL injection - const placeholders = domains.map(() => '?').join(','); - const rows = db.query( - `SELECT host_key, name, value, encrypted_value, path, expires_utc, - is_secure, is_httponly, has_expires, samesite - FROM cookies - WHERE host_key IN (${placeholders}) - AND (has_expires = 0 OR expires_utc > ?) 
- ORDER BY host_key, name` - ).all(...domains, now) as RawCookie[]; - - const cookies: PlaywrightCookie[] = []; - let failed = 0; - const domainCounts: Record = {}; - - for (const row of rows) { - try { - const value = decryptCookieValue(row, derivedKey); - const cookie = toPlaywrightCookie(row, value); - cookies.push(cookie); - domainCounts[row.host_key] = (domainCounts[row.host_key] || 0) + 1; - } catch { - failed++; - } - } - - return { cookies, count: cookies.length, failed, domainCounts }; - } finally { - db.close(); - } -} - -// ─── Internal: Browser Resolution ─────────────────────────────── - -function resolveBrowser(nameOrAlias: string): BrowserInfo { - const needle = nameOrAlias.toLowerCase().trim(); - const found = BROWSER_REGISTRY.find(b => - b.aliases.includes(needle) || b.name.toLowerCase() === needle - ); - if (!found) { - const supported = BROWSER_REGISTRY.flatMap(b => b.aliases).join(', '); - throw new CookieImportError( - `Unknown browser '${nameOrAlias}'. Supported: ${supported}`, - 'unknown_browser', - ); - } - return found; -} - -function validateProfile(profile: string): void { - if (/[/\\]|\.\./.test(profile) || /[\x00-\x1f]/.test(profile)) { - throw new CookieImportError( - `Invalid profile name: '${profile}'`, - 'bad_request', - ); - } -} - -function getCookieDbPath(browser: BrowserInfo, profile: string): string { - validateProfile(profile); - const appSupport = path.join(os.homedir(), 'Library', 'Application Support'); - const dbPath = path.join(appSupport, browser.dataDir, profile, 'Cookies'); - if (!fs.existsSync(dbPath)) { - throw new CookieImportError( - `${browser.name} is not installed (no cookie database at ${dbPath})`, - 'not_installed', - ); - } - return dbPath; -} - -// ─── Internal: SQLite Access ──────────────────────────────────── - -function openDb(dbPath: string, browserName: string): Database { - try { - return new Database(dbPath, { readonly: true }); - } catch (err: any) { - if (err.message?.includes('SQLITE_BUSY') || 
err.message?.includes('database is locked')) {
-      return openDbFromCopy(dbPath, browserName);
-    }
-    if (err.message?.includes('SQLITE_CORRUPT') || err.message?.includes('malformed')) {
-      throw new CookieImportError(
-        `Cookie database for ${browserName} is corrupt`,
-        'db_corrupt',
-      );
-    }
-    throw err;
-  }
-}
-
-function openDbFromCopy(dbPath: string, browserName: string): Database {
-  const tmpPath = `/tmp/browse-cookies-${browserName.toLowerCase()}-${crypto.randomUUID()}.db`;
-  try {
-    fs.copyFileSync(dbPath, tmpPath);
-    // Also copy WAL and SHM if they exist (for consistent reads)
-    const walPath = dbPath + '-wal';
-    const shmPath = dbPath + '-shm';
-    if (fs.existsSync(walPath)) fs.copyFileSync(walPath, tmpPath + '-wal');
-    if (fs.existsSync(shmPath)) fs.copyFileSync(shmPath, tmpPath + '-shm');
-
-    const db = new Database(tmpPath, { readonly: true });
-    // Schedule cleanup after the DB is closed
-    const origClose = db.close.bind(db);
-    db.close = () => {
-      origClose();
-      try { fs.unlinkSync(tmpPath); } catch {}
-      try { fs.unlinkSync(tmpPath + '-wal'); } catch {}
-      try { fs.unlinkSync(tmpPath + '-shm'); } catch {}
-    };
-    return db;
-  } catch {
-    // Clean up on failure
-    try { fs.unlinkSync(tmpPath); } catch {}
-    throw new CookieImportError(
-      `Cookie database is locked (${browserName} may be running). Try closing ${browserName} first.`,
-      'db_locked',
-      'retry',
-    );
-  }
-}
-
-// ─── Internal: Keychain Access (async, 10s timeout) ─────────
-
-async function getDerivedKey(browser: BrowserInfo): Promise<Buffer> {
-  const cached = keyCache.get(browser.keychainService);
-  if (cached) return cached;
-
-  const password = await getKeychainPassword(browser.keychainService);
-  const derived = crypto.pbkdf2Sync(password, 'saltysalt', 1003, 16, 'sha1');
-  keyCache.set(browser.keychainService, derived);
-  return derived;
-}
-
-async function getKeychainPassword(service: string): Promise<string> {
-  // Use async Bun.spawn with timeout to avoid blocking the event loop.
-  // macOS may show an Allow/Deny dialog that blocks until the user responds.
-  const proc = Bun.spawn(
-    ['security', 'find-generic-password', '-s', service, '-w'],
-    { stdout: 'pipe', stderr: 'pipe' },
-  );
-
-  const timeout = new Promise<never>((_, reject) =>
-    setTimeout(() => {
-      proc.kill();
-      reject(new CookieImportError(
-        `macOS is waiting for Keychain permission. Look for a dialog asking to allow access to "${service}".`,
-        'keychain_timeout',
-        'retry',
-      ));
-    }, 10_000),
-  );
-
-  try {
-    const exitCode = await Promise.race([proc.exited, timeout]);
-    const stdout = await new Response(proc.stdout).text();
-    const stderr = await new Response(proc.stderr).text();
-
-    if (exitCode !== 0) {
-      // Distinguish denied vs not found vs other
-      const errText = stderr.trim().toLowerCase();
-      if (errText.includes('user canceled') || errText.includes('denied') || errText.includes('interaction not allowed')) {
-        throw new CookieImportError(
-          `Keychain access denied. Click "Allow" in the macOS dialog for "${service}".`,
-          'keychain_denied',
-          'retry',
-        );
-      }
-      if (errText.includes('could not be found') || errText.includes('not found')) {
-        throw new CookieImportError(
-          `No Keychain entry for "${service}".
Is this a Chromium-based browser?`,
-          'keychain_not_found',
-        );
-      }
-      throw new CookieImportError(
-        `Could not read Keychain: ${stderr.trim()}`,
-        'keychain_error',
-        'retry',
-      );
-    }
-
-    return stdout.trim();
-  } catch (err) {
-    if (err instanceof CookieImportError) throw err;
-    throw new CookieImportError(
-      `Could not read Keychain: ${(err as Error).message}`,
-      'keychain_error',
-      'retry',
-    );
-  }
-}
-
-// ─── Internal: Cookie Decryption ────────────────────────────
-
-interface RawCookie {
-  host_key: string;
-  name: string;
-  value: string;
-  encrypted_value: Buffer | Uint8Array;
-  path: string;
-  expires_utc: number | bigint;
-  is_secure: number;
-  is_httponly: number;
-  has_expires: number;
-  samesite: number;
-}
-
-function decryptCookieValue(row: RawCookie, key: Buffer): string {
-  // Prefer unencrypted value if present
-  if (row.value && row.value.length > 0) return row.value;
-
-  const ev = Buffer.from(row.encrypted_value);
-  if (ev.length === 0) return '';
-
-  const prefix = ev.slice(0, 3).toString('utf-8');
-  if (prefix !== 'v10') {
-    throw new Error(`Unknown encryption prefix: ${prefix}`);
-  }
-
-  const ciphertext = ev.slice(3);
-  const iv = Buffer.alloc(16, 0x20); // 16 space characters
-  const decipher = crypto.createDecipheriv('aes-128-cbc', key, iv);
-  const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
-
-  // First 32 bytes are HMAC-SHA256 authentication tag; actual value follows
-  if (plaintext.length <= 32) return '';
-  return plaintext.slice(32).toString('utf-8');
-}
-
-function toPlaywrightCookie(row: RawCookie, value: string): PlaywrightCookie {
-  return {
-    name: row.name,
-    value,
-    domain: row.host_key,
-    path: row.path || '/',
-    expires: chromiumEpochToUnix(row.expires_utc, row.has_expires),
-    secure: row.is_secure === 1,
-    httpOnly: row.is_httponly === 1,
-    sameSite: mapSameSite(row.samesite),
-  };
-}
-
-// ─── Internal: Chromium Epoch Conversion ────────────────────
-
-const CHROMIUM_EPOCH_OFFSET = 11644473600000000n;
-
-function chromiumNow(): bigint {
-  // Current time in Chromium epoch (microseconds since 1601-01-01)
-  return BigInt(Date.now()) * 1000n + CHROMIUM_EPOCH_OFFSET;
-}
-
-function chromiumEpochToUnix(epoch: number | bigint, hasExpires: number): number {
-  if (hasExpires === 0 || epoch === 0 || epoch === 0n) return -1; // session cookie
-  const epochBig = BigInt(epoch);
-  const unixMicro = epochBig - CHROMIUM_EPOCH_OFFSET;
-  return Number(unixMicro / 1000000n);
-}
-
-function mapSameSite(value: number): 'Strict' | 'Lax' | 'None' {
-  switch (value) {
-    case 0: return 'None';
-    case 1: return 'Lax';
-    case 2: return 'Strict';
-    default: return 'Lax';
-  }
-}
diff --git a/browse/src/cookie-picker-routes.ts b/browse/src/cookie-picker-routes.ts
deleted file mode 100644
index 6a4a431..0000000
--- a/browse/src/cookie-picker-routes.ts
+++ /dev/null
@@ -1,207 +0,0 @@
-/**
- * Cookie picker route handler — HTTP + Playwright glue
- *
- * Handles all /cookie-picker/* routes. Imports from cookie-import-browser.ts
- * (decryption) and cookie-picker-ui.ts (HTML generation).
- *
- * Routes (no auth — localhost-only, accepted risk):
- *   GET  /cookie-picker          → serves the picker HTML page
- *   GET  /cookie-picker/browsers → list installed browsers
- *   GET  /cookie-picker/domains  → list domains + counts for a browser
- *   POST /cookie-picker/import   → decrypt + import cookies to Playwright
- *   POST /cookie-picker/remove   → clear cookies for domains
- *   GET  /cookie-picker/imported → currently imported domains + counts
- */
-
-import type { BrowserManager } from './browser-manager';
-import { findInstalledBrowsers, listDomains, importCookies, CookieImportError, type PlaywrightCookie } from './cookie-import-browser';
-import { getCookiePickerHTML } from './cookie-picker-ui';
-
-// ─── State ─────────────────────────────────────────────────────
-// Tracks which domains were imported via the picker.
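To sanity-check the epoch conversion above, here is the offset applied to a sample timestamp (the Chromium value is illustrative, computed for 2021-01-01):

```typescript
// Worked example of the conversion in chromiumEpochToUnix:
// Unix seconds = (Chromium microseconds - offset) / 1,000,000.
const CHROMIUM_EPOCH_OFFSET = 11644473600000000n;
const chromiumMicros = 13253932800000000n; // 2021-01-01T00:00:00Z in Chromium microseconds

const unixSeconds = Number((chromiumMicros - CHROMIUM_EPOCH_OFFSET) / 1000000n);
console.log(unixSeconds); // 1609459200
console.log(new Date(unixSeconds * 1000).toISOString()); // 2021-01-01T00:00:00.000Z
```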
-// /imported only returns cookies for domains in this Set. -// /remove clears from this Set. -const importedDomains = new Set(); -const importedCounts = new Map(); - -// ─── JSON Helpers ─────────────────────────────────────────────── - -function corsOrigin(port: number): string { - return `http://127.0.0.1:${port}`; -} - -function jsonResponse(data: any, opts: { port: number; status?: number }): Response { - return new Response(JSON.stringify(data), { - status: opts.status ?? 200, - headers: { - 'Content-Type': 'application/json', - 'Access-Control-Allow-Origin': corsOrigin(opts.port), - }, - }); -} - -function errorResponse(message: string, code: string, opts: { port: number; status?: number; action?: string }): Response { - return jsonResponse( - { error: message, code, ...(opts.action ? { action: opts.action } : {}) }, - { port: opts.port, status: opts.status ?? 400 }, - ); -} - -// ─── Route Handler ────────────────────────────────────────────── - -export async function handleCookiePickerRoute( - url: URL, - req: Request, - bm: BrowserManager, -): Promise { - const pathname = url.pathname; - const port = parseInt(url.port, 10) || 9400; - - // CORS preflight - if (req.method === 'OPTIONS') { - return new Response(null, { - status: 204, - headers: { - 'Access-Control-Allow-Origin': corsOrigin(port), - 'Access-Control-Allow-Methods': 'GET, POST, OPTIONS', - 'Access-Control-Allow-Headers': 'Content-Type', - }, - }); - } - - try { - // GET /cookie-picker — serve the picker UI - if (pathname === '/cookie-picker' && req.method === 'GET') { - const html = getCookiePickerHTML(port); - return new Response(html, { - status: 200, - headers: { 'Content-Type': 'text/html; charset=utf-8' }, - }); - } - - // GET /cookie-picker/browsers — list installed browsers - if (pathname === '/cookie-picker/browsers' && req.method === 'GET') { - const browsers = findInstalledBrowsers(); - return jsonResponse({ - browsers: browsers.map(b => ({ - name: b.name, - aliases: b.aliases, - })), 
- }, { port }); - } - - // GET /cookie-picker/domains?browser= — list domains + counts - if (pathname === '/cookie-picker/domains' && req.method === 'GET') { - const browserName = url.searchParams.get('browser'); - if (!browserName) { - return errorResponse("Missing 'browser' parameter", 'missing_param', { port }); - } - const result = listDomains(browserName); - return jsonResponse({ - browser: result.browser, - domains: result.domains, - }, { port }); - } - - // POST /cookie-picker/import — decrypt + import to Playwright session - if (pathname === '/cookie-picker/import' && req.method === 'POST') { - let body: any; - try { - body = await req.json(); - } catch { - return errorResponse('Invalid JSON body', 'bad_request', { port }); - } - - const { browser, domains } = body; - if (!browser) return errorResponse("Missing 'browser' field", 'missing_param', { port }); - if (!domains || !Array.isArray(domains) || domains.length === 0) { - return errorResponse("Missing or empty 'domains' array", 'missing_param', { port }); - } - - // Decrypt cookies from the browser DB - const result = await importCookies(browser, domains); - - if (result.cookies.length === 0) { - return jsonResponse({ - imported: 0, - failed: result.failed, - domainCounts: {}, - message: result.failed > 0 - ? 
`All ${result.failed} cookies failed to decrypt` - : 'No cookies found for the specified domains', - }, { port }); - } - - // Add to Playwright context - const page = bm.getPage(); - await page.context().addCookies(result.cookies); - - // Track what was imported - for (const domain of Object.keys(result.domainCounts)) { - importedDomains.add(domain); - importedCounts.set(domain, (importedCounts.get(domain) || 0) + result.domainCounts[domain]); - } - - console.log(`[cookie-picker] Imported ${result.count} cookies for ${Object.keys(result.domainCounts).length} domains`); - - return jsonResponse({ - imported: result.count, - failed: result.failed, - domainCounts: result.domainCounts, - }, { port }); - } - - // POST /cookie-picker/remove — clear cookies for domains - if (pathname === '/cookie-picker/remove' && req.method === 'POST') { - let body: any; - try { - body = await req.json(); - } catch { - return errorResponse('Invalid JSON body', 'bad_request', { port }); - } - - const { domains } = body; - if (!domains || !Array.isArray(domains) || domains.length === 0) { - return errorResponse("Missing or empty 'domains' array", 'missing_param', { port }); - } - - const page = bm.getPage(); - const context = page.context(); - for (const domain of domains) { - await context.clearCookies({ domain }); - importedDomains.delete(domain); - importedCounts.delete(domain); - } - - console.log(`[cookie-picker] Removed cookies for ${domains.length} domains`); - - return jsonResponse({ - removed: domains.length, - domains, - }, { port }); - } - - // GET /cookie-picker/imported — currently imported domains + counts - if (pathname === '/cookie-picker/imported' && req.method === 'GET') { - const entries: Array<{ domain: string; count: number }> = []; - for (const domain of importedDomains) { - entries.push({ domain, count: importedCounts.get(domain) || 0 }); - } - entries.sort((a, b) => b.count - a.count); - - return jsonResponse({ - domains: entries, - totalDomains: entries.length, - 
totalCookies: entries.reduce((sum, e) => sum + e.count, 0),
-      }, { port });
-    }
-
-    return new Response('Not found', { status: 404 });
-  } catch (err: any) {
-    if (err instanceof CookieImportError) {
-      return errorResponse(err.message, err.code, { port, status: 400, action: err.action });
-    }
-    console.error(`[cookie-picker] Error: ${err.message}`);
-    return errorResponse(err.message || 'Internal error', 'internal_error', { port, status: 500 });
-  }
-}
diff --git a/browse/src/cookie-picker-ui.ts b/browse/src/cookie-picker-ui.ts
deleted file mode 100644
index 010c2dd..0000000
--- a/browse/src/cookie-picker-ui.ts
+++ /dev/null
@@ -1,541 +0,0 @@
-/**
- * Cookie picker UI — self-contained HTML page
- *
- * Dark theme, two-panel layout, vanilla HTML/CSS/JS.
- * Left: source browser domains with search + import buttons.
- * Right: imported domains with trash buttons.
- * No cookie values exposed anywhere.
- */
-
-export function getCookiePickerHTML(serverPort: number): string {
-  const baseUrl = `http://127.0.0.1:${serverPort}`;
-
-  return `
-    [template markup lost in extraction; surviving text: title "Cookie Import — gstack browse", header "Cookie Import" with a localhost:<port> badge, left panel "Source Browser" ("Detecting browsers..."), right panel "Imported to Session" ("No cookies imported yet")]
- - - -`; -} diff --git a/browse/src/find-browse.ts b/browse/src/find-browse.ts deleted file mode 100644 index 38b987a..0000000 --- a/browse/src/find-browse.ts +++ /dev/null @@ -1,181 +0,0 @@ -/** - * find-browse — locate the gstack browse binary + check for updates. - * - * Compiled to browse/dist/find-browse (standalone binary, no bun runtime needed). - * - * Output protocol: - * Line 1: /path/to/binary (always present) - * Line 2+: META: (optional, 0 or more) - * - * META types: - * META:UPDATE_AVAILABLE — local binary is behind origin/main - * - * All version checks are best-effort: network failures, missing files, and - * cache errors degrade gracefully to outputting only the binary path. - */ - -import { existsSync } from 'fs'; -import { readFileSync, writeFileSync } from 'fs'; -import { join, dirname } from 'path'; -import { homedir } from 'os'; - -const REPO_URL = 'https://github.com/garrytan/gstack.git'; -const CACHE_PATH = '/tmp/gstack-latest-version'; -const CACHE_TTL = 14400; // 4 hours in seconds - -// ─── Binary Discovery ─────────────────────────────────────────── - -function getGitRoot(): string | null { - try { - const proc = Bun.spawnSync(['git', 'rev-parse', '--show-toplevel'], { - stdout: 'pipe', - stderr: 'pipe', - }); - if (proc.exitCode !== 0) return null; - return proc.stdout.toString().trim(); - } catch { - return null; - } -} - -export function locateBinary(): string | null { - const root = getGitRoot(); - const home = homedir(); - - // Workspace-local takes priority (for development) - if (root) { - const local = join(root, '.claude', 'skills', 'gstack', 'browse', 'dist', 'browse'); - if (existsSync(local)) return local; - } - - // Global fallback - const global = join(home, '.claude', 'skills', 'gstack', 'browse', 'dist', 'browse'); - if (existsSync(global)) return global; - - return null; -} - -// ─── Version Check ────────────────────────────────────────────── - -interface CacheEntry { - sha: string; - timestamp: number; -} - -function 
readCache(): CacheEntry | null { - try { - const content = readFileSync(CACHE_PATH, 'utf-8').trim(); - const parts = content.split(/\s+/); - if (parts.length < 2) return null; - const sha = parts[0]; - const timestamp = parseInt(parts[1], 10); - if (!sha || isNaN(timestamp)) return null; - // Validate SHA is hex - if (!/^[0-9a-f]{40}$/i.test(sha)) return null; - return { sha, timestamp }; - } catch { - return null; - } -} - -function writeCache(sha: string, timestamp: number): void { - try { - writeFileSync(CACHE_PATH, `${sha} ${timestamp}\n`); - } catch { - // Cache write failure is non-fatal - } -} - -function fetchRemoteSHA(): string | null { - try { - const proc = Bun.spawnSync(['git', 'ls-remote', REPO_URL, 'refs/heads/main'], { - stdout: 'pipe', - stderr: 'pipe', - timeout: 10_000, // 10s timeout - }); - if (proc.exitCode !== 0) return null; - const output = proc.stdout.toString().trim(); - const sha = output.split(/\s+/)[0]; - if (!sha || !/^[0-9a-f]{40}$/i.test(sha)) return null; - return sha; - } catch { - return null; - } -} - -function resolveSkillDir(binaryPath: string): string | null { - const home = homedir(); - const globalPrefix = join(home, '.claude', 'skills', 'gstack'); - if (binaryPath.startsWith(globalPrefix)) return globalPrefix; - - // Workspace-local: binary is at $ROOT/.claude/skills/gstack/browse/dist/browse - // Skill dir is $ROOT/.claude/skills/gstack - const parts = binaryPath.split('/.claude/skills/gstack/'); - if (parts.length === 2) return parts[0] + '/.claude/skills/gstack'; - - return null; -} - -export function checkVersion(binaryDir: string): string | null { - // Read local version - const versionFile = join(binaryDir, '.version'); - let localSHA: string; - try { - localSHA = readFileSync(versionFile, 'utf-8').trim(); - } catch { - return null; // No .version file → skip check - } - if (!localSHA) return null; - - const now = Math.floor(Date.now() / 1000); - - // Check cache - let remoteSHA: string | null = null; - const cache = 
readCache(); - if (cache && (now - cache.timestamp) < CACHE_TTL) { - remoteSHA = cache.sha; - } - - // Fetch from remote if cache miss - if (!remoteSHA) { - remoteSHA = fetchRemoteSHA(); - if (remoteSHA) { - writeCache(remoteSHA, now); - } - } - - if (!remoteSHA) return null; // Offline or error → skip check - - // Compare - if (localSHA === remoteSHA) return null; // Up to date - - // Determine skill directory for update command - const binaryPath = join(binaryDir, 'browse'); - const skillDir = resolveSkillDir(binaryPath); - if (!skillDir) return null; - - const payload = JSON.stringify({ - current: localSHA.slice(0, 8), - latest: remoteSHA.slice(0, 8), - command: `cd ${skillDir} && git stash && git fetch origin && git reset --hard origin/main && ./setup`, - }); - - return `META:UPDATE_AVAILABLE ${payload}`; -} - -// ─── Main ─────────────────────────────────────────────────────── - -function main() { - const bin = locateBinary(); - if (!bin) { - process.stderr.write('ERROR: browse binary not found. 
Run: cd <skill-dir> && ./setup\n');
-    process.exit(1);
-  }
-
-  console.log(bin);
-
-  const meta = checkVersion(dirname(bin));
-  if (meta) console.log(meta);
-}
-
-main();
diff --git a/browse/src/meta-commands.ts b/browse/src/meta-commands.ts
deleted file mode 100644
index 8d3f9eb..0000000
--- a/browse/src/meta-commands.ts
+++ /dev/null
@@ -1,221 +0,0 @@
-/**
- * Meta commands — tabs, server control, screenshots, chain, diff, snapshot
- */
-
-import type { BrowserManager } from './browser-manager';
-import { handleSnapshot } from './snapshot';
-import { getCleanText } from './read-commands';
-import * as Diff from 'diff';
-import * as fs from 'fs';
-import * as path from 'path';
-
-// Security: Path validation to prevent path traversal attacks
-const SAFE_DIRECTORIES = ['/tmp', process.cwd()];
-
-function validateOutputPath(filePath: string): void {
-  const resolved = path.resolve(filePath);
-  const isSafe = SAFE_DIRECTORIES.some(dir => resolved === dir || resolved.startsWith(dir + '/'));
-  if (!isSafe) {
-    throw new Error(`Path must be within: ${SAFE_DIRECTORIES.join(', ')}`);
-  }
-}
-
-// Command sets for chain routing (mirrors server.ts — kept local to avoid circular import)
-const CHAIN_READ = new Set([
-  'text', 'html', 'links', 'forms', 'accessibility',
-  'js', 'eval', 'css', 'attrs',
-  'console', 'network', 'cookies', 'storage', 'perf',
-  'dialog', 'is',
-]);
-const CHAIN_WRITE = new Set([
-  'goto', 'back', 'forward', 'reload',
-  'click', 'fill', 'select', 'hover', 'type', 'press', 'scroll', 'wait',
-  'viewport', 'cookie', 'cookie-import', 'header', 'useragent',
-  'upload', 'dialog-accept', 'dialog-dismiss',
-  'cookie-import-browser',
-]);
-const CHAIN_META = new Set([
-  'tabs', 'tab', 'newtab', 'closetab',
-  'status', 'stop', 'restart',
-  'screenshot', 'pdf', 'responsive',
-  'chain', 'diff',
-  'url', 'snapshot',
-]);
-
-export async function handleMetaCommand(
-  command: string,
-  args: string[],
-  bm: BrowserManager,
-  shutdown: () => Promise<void> | void
-): Promise<string> {
switch (command) {
-    // ─── Tabs ──────────────────────────────────────────
-    case 'tabs': {
-      const tabs = await bm.getTabListWithTitles();
-      return tabs.map(t =>
-        `${t.active ? '→ ' : '  '}[${t.id}] ${t.title || '(untitled)'} — ${t.url}`
-      ).join('\n');
-    }
-
-    case 'tab': {
-      const id = parseInt(args[0], 10);
-      if (isNaN(id)) throw new Error('Usage: browse tab <id>');
-      bm.switchTab(id);
-      return `Switched to tab ${id}`;
-    }
-
-    case 'newtab': {
-      const url = args[0];
-      const id = await bm.newTab(url);
-      return `Opened tab ${id}${url ? ` → ${url}` : ''}`;
-    }
-
-    case 'closetab': {
-      const id = args[0] ? parseInt(args[0], 10) : undefined;
-      await bm.closeTab(id);
-      return `Closed tab${id ? ` ${id}` : ''}`;
-    }
-
-    // ─── Server Control ────────────────────────────────
-    case 'status': {
-      const page = bm.getPage();
-      const tabs = bm.getTabCount();
-      return [
-        `Status: healthy`,
-        `URL: ${page.url()}`,
-        `Tabs: ${tabs}`,
-        `PID: ${process.pid}`,
-      ].join('\n');
-    }
-
-    case 'url': {
-      return bm.getCurrentUrl();
-    }
-
-    case 'stop': {
-      await shutdown();
-      return 'Server stopped';
-    }
-
-    case 'restart': {
-      // Signal that we want a restart — the CLI will detect exit and restart
-      console.log('[browse] Restart requested. 
Exiting for CLI to restart.'); - await shutdown(); - return 'Restarting...'; - } - - // ─── Visual ──────────────────────────────────────── - case 'screenshot': { - const page = bm.getPage(); - const screenshotPath = args[0] || '/tmp/browse-screenshot.png'; - validateOutputPath(screenshotPath); - await page.screenshot({ path: screenshotPath, fullPage: true }); - return `Screenshot saved: ${screenshotPath}`; - } - - case 'pdf': { - const page = bm.getPage(); - const pdfPath = args[0] || '/tmp/browse-page.pdf'; - validateOutputPath(pdfPath); - await page.pdf({ path: pdfPath, format: 'A4' }); - return `PDF saved: ${pdfPath}`; - } - - case 'responsive': { - const page = bm.getPage(); - const prefix = args[0] || '/tmp/browse-responsive'; - validateOutputPath(prefix); - const viewports = [ - { name: 'mobile', width: 375, height: 812 }, - { name: 'tablet', width: 768, height: 1024 }, - { name: 'desktop', width: 1280, height: 720 }, - ]; - const originalViewport = page.viewportSize(); - const results: string[] = []; - - for (const vp of viewports) { - await page.setViewportSize({ width: vp.width, height: vp.height }); - const path = `${prefix}-${vp.name}.png`; - await page.screenshot({ path, fullPage: true }); - results.push(`${vp.name} (${vp.width}x${vp.height}): ${path}`); - } - - // Restore original viewport - if (originalViewport) { - await page.setViewportSize(originalViewport); - } - - return results.join('\n'); - } - - // ─── Chain ───────────────────────────────────────── - case 'chain': { - // Read JSON array from args[0] (if provided) or expect it was passed as body - const jsonStr = args[0]; - if (!jsonStr) throw new Error('Usage: echo \'[["goto","url"],["text"]]\' | browse chain'); - - let commands: string[][]; - try { - commands = JSON.parse(jsonStr); - } catch { - throw new Error('Invalid JSON. 
Expected: [["command", "arg1", "arg2"], ...]');
-      }
-
-      if (!Array.isArray(commands)) throw new Error('Expected JSON array of commands');
-
-      const results: string[] = [];
-      const { handleReadCommand } = await import('./read-commands');
-      const { handleWriteCommand } = await import('./write-commands');
-
-      for (const cmd of commands) {
-        const [name, ...cmdArgs] = cmd;
-        try {
-          let result: string;
-          if (CHAIN_WRITE.has(name)) result = await handleWriteCommand(name, cmdArgs, bm);
-          else if (CHAIN_READ.has(name)) result = await handleReadCommand(name, cmdArgs, bm);
-          else if (CHAIN_META.has(name)) result = await handleMetaCommand(name, cmdArgs, bm, shutdown);
-          else throw new Error(`Unknown command: ${name}`);
-          results.push(`[${name}] ${result}`);
-        } catch (err: any) {
-          results.push(`[${name}] ERROR: ${err.message}`);
-        }
-      }
-
-      return results.join('\n\n');
-    }
-
-    // ─── Diff ──────────────────────────────────────────
-    case 'diff': {
-      const [url1, url2] = args;
-      if (!url1 || !url2) throw new Error('Usage: browse diff <url1> <url2>');
-
-      const page = bm.getPage();
-      await page.goto(url1, { waitUntil: 'domcontentloaded', timeout: 15000 });
-      const text1 = await getCleanText(page);
-
-      await page.goto(url2, { waitUntil: 'domcontentloaded', timeout: 15000 });
-      const text2 = await getCleanText(page);
-
-      const changes = Diff.diffLines(text1, text2);
-      const output: string[] = [`--- ${url1}`, `+++ ${url2}`, ''];
-
-      for (const part of changes) {
-        const prefix = part.added ? '+' : part.removed ? 
'-' : ' '; - const lines = part.value.split('\n').filter(l => l.length > 0); - for (const line of lines) { - output.push(`${prefix} ${line}`); - } - } - - return output.join('\n'); - } - - // ─── Snapshot ───────────────────────────────────── - case 'snapshot': { - return await handleSnapshot(args, bm); - } - - default: - throw new Error(`Unknown meta command: ${command}`); - } -} diff --git a/browse/src/read-commands.ts b/browse/src/read-commands.ts deleted file mode 100644 index 31d1018..0000000 --- a/browse/src/read-commands.ts +++ /dev/null @@ -1,294 +0,0 @@ -/** - * Read commands — extract data from pages without side effects - * - * text, html, links, forms, accessibility, js, eval, css, attrs, - * console, network, cookies, storage, perf - */ - -import type { BrowserManager } from './browser-manager'; -import { consoleBuffer, networkBuffer, dialogBuffer } from './buffers'; -import type { Page } from 'playwright'; -import * as fs from 'fs'; -import * as path from 'path'; - -// Security: Path validation to prevent path traversal attacks -const SAFE_DIRECTORIES = ['/tmp', process.cwd()]; - -function validateReadPath(filePath: string): void { - if (path.isAbsolute(filePath)) { - const resolved = path.resolve(filePath); - const isSafe = SAFE_DIRECTORIES.some(dir => resolved === dir || resolved.startsWith(dir + '/')); - if (!isSafe) { - throw new Error(`Absolute path must be within: ${SAFE_DIRECTORIES.join(', ')}`); - } - } - const normalized = path.normalize(filePath); - if (normalized.includes('..')) { - throw new Error('Path traversal sequences (..) are not allowed'); - } -} - -/** - * Extract clean text from a page (strips script/style/noscript/svg). - * Exported for DRY reuse in meta-commands (diff). 
- */
-export async function getCleanText(page: Page): Promise<string> {
-  return await page.evaluate(() => {
-    const body = document.body;
-    if (!body) return '';
-    const clone = body.cloneNode(true) as HTMLElement;
-    clone.querySelectorAll('script, style, noscript, svg').forEach(el => el.remove());
-    return clone.innerText
-      .split('\n')
-      .map(line => line.trim())
-      .filter(line => line.length > 0)
-      .join('\n');
-  });
-}
-
-export async function handleReadCommand(
-  command: string,
-  args: string[],
-  bm: BrowserManager
-): Promise<string> {
-  const page = bm.getPage();
-
-  switch (command) {
-    case 'text': {
-      return await getCleanText(page);
-    }
-
-    case 'html': {
-      const selector = args[0];
-      if (selector) {
-        const resolved = bm.resolveRef(selector);
-        if ('locator' in resolved) {
-          return await resolved.locator.innerHTML({ timeout: 5000 });
-        }
-        return await page.innerHTML(resolved.selector);
-      }
-      return await page.content();
-    }
-
-    case 'links': {
-      const links = await page.evaluate(() =>
-        [...document.querySelectorAll('a[href]')].map(a => ({
-          text: a.textContent?.trim().slice(0, 120) || '',
-          href: (a as HTMLAnchorElement).href,
-        })).filter(l => l.text && l.href)
-      );
-      return links.map(l => `${l.text} → ${l.href}`).join('\n');
-    }
-
-    case 'forms': {
-      const forms = await page.evaluate(() => {
-        return [...document.querySelectorAll('form')].map((form, i) => {
-          const fields = [...form.querySelectorAll('input, select, textarea')].map(el => {
-            const input = el as HTMLInputElement;
-            return {
-              tag: el.tagName.toLowerCase(),
-              type: input.type || undefined,
-              name: input.name || undefined,
-              id: input.id || undefined,
-              placeholder: input.placeholder || undefined,
-              required: input.required || undefined,
-              value: input.type === 'password' ? '[redacted]' : (input.value || undefined),
-              options: el.tagName === 'SELECT'
-                ? 
[...(el as HTMLSelectElement).options].map(o => ({ value: o.value, text: o.text }))
-                : undefined,
-            };
-          });
-          return {
-            index: i,
-            action: form.action || undefined,
-            method: form.method || 'get',
-            id: form.id || undefined,
-            fields,
-          };
-        });
-      });
-      return JSON.stringify(forms, null, 2);
-    }
-
-    case 'accessibility': {
-      const snapshot = await page.locator("body").ariaSnapshot();
-      return snapshot;
-    }
-
-    case 'js': {
-      const expr = args[0];
-      if (!expr) throw new Error('Usage: browse js <expression>');
-      const result = await page.evaluate(expr);
-      return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? '');
-    }
-
-    case 'eval': {
-      const filePath = args[0];
-      if (!filePath) throw new Error('Usage: browse eval <file>');
-      validateReadPath(filePath);
-      if (!fs.existsSync(filePath)) throw new Error(`File not found: ${filePath}`);
-      const code = fs.readFileSync(filePath, 'utf-8');
-      const result = await page.evaluate(code);
-      return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? 
'');
-    }
-
-    case 'css': {
-      const [selector, property] = args;
-      if (!selector || !property) throw new Error('Usage: browse css <selector> <property>');
-      const resolved = bm.resolveRef(selector);
-      if ('locator' in resolved) {
-        const value = await resolved.locator.evaluate(
-          (el, prop) => getComputedStyle(el).getPropertyValue(prop),
-          property
-        );
-        return value;
-      }
-      const value = await page.evaluate(
-        ([sel, prop]) => {
-          const el = document.querySelector(sel);
-          if (!el) return `Element not found: ${sel}`;
-          return getComputedStyle(el).getPropertyValue(prop);
-        },
-        [resolved.selector, property]
-      );
-      return value;
-    }
-
-    case 'attrs': {
-      const selector = args[0];
-      if (!selector) throw new Error('Usage: browse attrs <selector>');
-      const resolved = bm.resolveRef(selector);
-      if ('locator' in resolved) {
-        const attrs = await resolved.locator.evaluate((el) => {
-          const result: Record<string, string> = {};
-          for (const attr of el.attributes) {
-            result[attr.name] = attr.value;
-          }
-          return result;
-        });
-        return JSON.stringify(attrs, null, 2);
-      }
-      const attrs = await page.evaluate((sel) => {
-        const el = document.querySelector(sel);
-        if (!el) return `Element not found: ${sel}`;
-        const result: Record<string, string> = {};
-        for (const attr of el.attributes) {
-          result[attr.name] = attr.value;
-        }
-        return result;
-      }, resolved.selector);
-      return typeof attrs === 'string' ? attrs : JSON.stringify(attrs, null, 2);
-    }
-
-    case 'console': {
-      if (args[0] === '--clear') {
-        consoleBuffer.clear();
-        return 'Console buffer cleared.';
-      }
-      const entries = args[0] === '--errors'
-        ? consoleBuffer.toArray().filter(e => e.level === 'error' || e.level === 'warning')
-        : consoleBuffer.toArray();
-      if (entries.length === 0) return args[0] === '--errors' ? 
'(no console errors)' : '(no console messages)';
-      return entries.map(e =>
-        `[${new Date(e.timestamp).toISOString()}] [${e.level}] ${e.text}`
-      ).join('\n');
-    }
-
-    case 'network': {
-      if (args[0] === '--clear') {
-        networkBuffer.clear();
-        return 'Network buffer cleared.';
-      }
-      if (networkBuffer.length === 0) return '(no network requests)';
-      return networkBuffer.toArray().map(e =>
-        `${e.method} ${e.url} → ${e.status || 'pending'} (${e.duration || '?'}ms, ${e.size || '?'}B)`
-      ).join('\n');
-    }
-
-    case 'dialog': {
-      if (args[0] === '--clear') {
-        dialogBuffer.clear();
-        return 'Dialog buffer cleared.';
-      }
-      if (dialogBuffer.length === 0) return '(no dialogs captured)';
-      return dialogBuffer.toArray().map(e =>
-        `[${new Date(e.timestamp).toISOString()}] [${e.type}] "${e.message}" → ${e.action}${e.response ? ` "${e.response}"` : ''}`
-      ).join('\n');
-    }
-
-    case 'is': {
-      const property = args[0];
-      const selector = args[1];
-      if (!property || !selector) throw new Error('Usage: browse is <property> <selector>\nProperties: visible, hidden, enabled, disabled, checked, editable, focused');
-
-      const resolved = bm.resolveRef(selector);
-      let locator;
-      if ('locator' in resolved) {
-        locator = resolved.locator;
-      } else {
-        locator = page.locator(resolved.selector);
-      }
-
-      switch (property) {
-        case 'visible': return String(await locator.isVisible());
-        case 'hidden': return String(await locator.isHidden());
-        case 'enabled': return String(await locator.isEnabled());
-        case 'disabled': return String(await locator.isDisabled());
-        case 'checked': return String(await locator.isChecked());
-        case 'editable': return String(await locator.isEditable());
-        case 'focused': {
-          const isFocused = await locator.evaluate(
-            (el) => el === document.activeElement
-          );
-          return String(isFocused);
-        }
-        default:
-          throw new Error(`Unknown property: ${property}. 
Use: visible, hidden, enabled, disabled, checked, editable, focused`); - } - } - - case 'cookies': { - const cookies = await page.context().cookies(); - return JSON.stringify(cookies, null, 2); - } - - case 'storage': { - if (args[0] === 'set' && args[1]) { - const key = args[1]; - const value = args[2] || ''; - await page.evaluate(([k, v]) => localStorage.setItem(k, v), [key, value]); - return `Set localStorage["${key}"]`; - } - const storage = await page.evaluate(() => ({ - localStorage: { ...localStorage }, - sessionStorage: { ...sessionStorage }, - })); - return JSON.stringify(storage, null, 2); - } - - case 'perf': { - const timings = await page.evaluate(() => { - const nav = performance.getEntriesByType('navigation')[0] as PerformanceNavigationTiming; - if (!nav) return 'No navigation timing data available.'; - return { - dns: Math.round(nav.domainLookupEnd - nav.domainLookupStart), - tcp: Math.round(nav.connectEnd - nav.connectStart), - ssl: Math.round(nav.secureConnectionStart > 0 ? 
nav.connectEnd - nav.secureConnectionStart : 0), - ttfb: Math.round(nav.responseStart - nav.requestStart), - download: Math.round(nav.responseEnd - nav.responseStart), - domParse: Math.round(nav.domInteractive - nav.responseEnd), - domReady: Math.round(nav.domContentLoadedEventEnd - nav.startTime), - load: Math.round(nav.loadEventEnd - nav.startTime), - total: Math.round(nav.loadEventEnd - nav.startTime), - }; - }); - if (typeof timings === 'string') return timings; - return Object.entries(timings) - .map(([k, v]) => `${k.padEnd(12)} ${v}ms`) - .join('\n'); - } - - default: - throw new Error(`Unknown read command: ${command}`); - } -} diff --git a/browse/src/server.ts b/browse/src/server.ts deleted file mode 100644 index 5886813..0000000 --- a/browse/src/server.ts +++ /dev/null @@ -1,362 +0,0 @@ -/** - * gstack browse server — persistent Chromium daemon - * - * Architecture: - * Bun.serve HTTP on localhost → routes commands to Playwright - * Console/network/dialog buffers: CircularBuffer in-memory + async disk flush - * Chromium crash → server EXITS with clear error (CLI auto-restarts) - * Auto-shutdown after BROWSE_IDLE_TIMEOUT (default 30 min) - * - * State: - * State file: /.gstack/browse.json (set via BROWSE_STATE_FILE env) - * Log files: /.gstack/browse-{console,network,dialog}.log - * Port: random 10000-60000 (or BROWSE_PORT env for debug override) - */ - -import { BrowserManager } from './browser-manager'; -import { handleReadCommand } from './read-commands'; -import { handleWriteCommand } from './write-commands'; -import { handleMetaCommand } from './meta-commands'; -import { handleCookiePickerRoute } from './cookie-picker-routes'; -import { resolveConfig, ensureStateDir, readVersionHash } from './config'; -import * as fs from 'fs'; -import * as path from 'path'; -import * as crypto from 'crypto'; - -// ─── Config ───────────────────────────────────────────────────── -const config = resolveConfig(); -ensureStateDir(config); - -// ─── Auth 
─────────────────────────────────────────────────────── -const AUTH_TOKEN = crypto.randomUUID(); -const BROWSE_PORT = parseInt(process.env.BROWSE_PORT || '0', 10); -const IDLE_TIMEOUT_MS = parseInt(process.env.BROWSE_IDLE_TIMEOUT || '1800000', 10); // 30 min - -function validateAuth(req: Request): boolean { - const header = req.headers.get('authorization'); - return header === `Bearer ${AUTH_TOKEN}`; -} - -// ─── Buffer (from buffers.ts) ──────────────────────────────────── -import { consoleBuffer, networkBuffer, dialogBuffer, addConsoleEntry, addNetworkEntry, addDialogEntry, type LogEntry, type NetworkEntry, type DialogEntry } from './buffers'; -export { consoleBuffer, networkBuffer, dialogBuffer, addConsoleEntry, addNetworkEntry, addDialogEntry, type LogEntry, type NetworkEntry, type DialogEntry }; - -const CONSOLE_LOG_PATH = config.consoleLog; -const NETWORK_LOG_PATH = config.networkLog; -const DIALOG_LOG_PATH = config.dialogLog; -let lastConsoleFlushed = 0; -let lastNetworkFlushed = 0; -let lastDialogFlushed = 0; -let flushInProgress = false; - -async function flushBuffers() { - if (flushInProgress) return; // Guard against concurrent flush - flushInProgress = true; - - try { - // Console buffer - const newConsoleCount = consoleBuffer.totalAdded - lastConsoleFlushed; - if (newConsoleCount > 0) { - const entries = consoleBuffer.last(Math.min(newConsoleCount, consoleBuffer.length)); - const lines = entries.map(e => - `[${new Date(e.timestamp).toISOString()}] [${e.level}] ${e.text}` - ).join('\n') + '\n'; - await Bun.write(CONSOLE_LOG_PATH, (await Bun.file(CONSOLE_LOG_PATH).text().catch(() => '')) + lines); - lastConsoleFlushed = consoleBuffer.totalAdded; - } - - // Network buffer - const newNetworkCount = networkBuffer.totalAdded - lastNetworkFlushed; - if (newNetworkCount > 0) { - const entries = networkBuffer.last(Math.min(newNetworkCount, networkBuffer.length)); - const lines = entries.map(e => - `[${new Date(e.timestamp).toISOString()}] ${e.method} ${e.url} → 
${e.status || 'pending'} (${e.duration || '?'}ms, ${e.size || '?'}B)` - ).join('\n') + '\n'; - await Bun.write(NETWORK_LOG_PATH, (await Bun.file(NETWORK_LOG_PATH).text().catch(() => '')) + lines); - lastNetworkFlushed = networkBuffer.totalAdded; - } - - // Dialog buffer - const newDialogCount = dialogBuffer.totalAdded - lastDialogFlushed; - if (newDialogCount > 0) { - const entries = dialogBuffer.last(Math.min(newDialogCount, dialogBuffer.length)); - const lines = entries.map(e => - `[${new Date(e.timestamp).toISOString()}] [${e.type}] "${e.message}" → ${e.action}${e.response ? ` "${e.response}"` : ''}` - ).join('\n') + '\n'; - await Bun.write(DIALOG_LOG_PATH, (await Bun.file(DIALOG_LOG_PATH).text().catch(() => '')) + lines); - lastDialogFlushed = dialogBuffer.totalAdded; - } - } catch { - // Flush failures are non-fatal — buffers are in memory - } finally { - flushInProgress = false; - } -} - -// Flush every 1 second -const flushInterval = setInterval(flushBuffers, 1000); - -// ─── Idle Timer ──────────────────────────────────────────────── -let lastActivity = Date.now(); - -function resetIdleTimer() { - lastActivity = Date.now(); -} - -const idleCheckInterval = setInterval(() => { - if (Date.now() - lastActivity > IDLE_TIMEOUT_MS) { - console.log(`[browse] Idle for ${IDLE_TIMEOUT_MS / 1000}s, shutting down`); - shutdown(); - } -}, 60_000); - -// ─── Command Sets (exported for chain command) ────────────────── -export const READ_COMMANDS = new Set([ - 'text', 'html', 'links', 'forms', 'accessibility', - 'js', 'eval', 'css', 'attrs', - 'console', 'network', 'cookies', 'storage', 'perf', - 'dialog', 'is', -]); - -export const WRITE_COMMANDS = new Set([ - 'goto', 'back', 'forward', 'reload', - 'click', 'fill', 'select', 'hover', 'type', 'press', 'scroll', 'wait', - 'viewport', 'cookie', 'cookie-import', 'cookie-import-browser', 'header', 'useragent', - 'upload', 'dialog-accept', 'dialog-dismiss', -]); - -export const META_COMMANDS = new Set([ - 'tabs', 'tab', 
'newtab', 'closetab',
-  'status', 'stop', 'restart',
-  'screenshot', 'pdf', 'responsive',
-  'chain', 'diff',
-  'url', 'snapshot',
-]);
-
-// ─── Server ────────────────────────────────────────────────────
-const browserManager = new BrowserManager();
-let isShuttingDown = false;
-
-// Find port: explicit BROWSE_PORT, or random in 10000-60000
-async function findPort(): Promise<number> {
-  // Explicit port override (for debugging)
-  if (BROWSE_PORT) {
-    try {
-      const testServer = Bun.serve({ port: BROWSE_PORT, fetch: () => new Response('ok') });
-      testServer.stop();
-      return BROWSE_PORT;
-    } catch {
-      throw new Error(`[browse] Port ${BROWSE_PORT} (from BROWSE_PORT env) is in use`);
-    }
-  }
-
-  // Random port with retry
-  const MIN_PORT = 10000;
-  const MAX_PORT = 60000;
-  const MAX_RETRIES = 5;
-  for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
-    const port = MIN_PORT + Math.floor(Math.random() * (MAX_PORT - MIN_PORT));
-    try {
-      const testServer = Bun.serve({ port, fetch: () => new Response('ok') });
-      testServer.stop();
-      return port;
-    } catch {
-      continue;
-    }
-  }
-  throw new Error(`[browse] No available port after ${MAX_RETRIES} attempts in range ${MIN_PORT}-${MAX_PORT}`);
-}
-
-/**
- * Translate Playwright errors into actionable messages for AI agents.
- */
-function wrapError(err: any): string {
-  const msg = err.message || String(err);
-  // Timeout errors
-  if (err.name === 'TimeoutError' || msg.includes('Timeout') || msg.includes('timeout')) {
-    if (msg.includes('locator.click') || msg.includes('locator.fill') || msg.includes('locator.hover')) {
-      return `Element not found or not interactable within timeout. Check your selector or run 'snapshot' for fresh refs.`;
-    }
-    if (msg.includes('page.goto') || msg.includes('Navigation')) {
-      return `Page navigation timed out. 
The URL may be unreachable or the page may be loading slowly.`; - } - return `Operation timed out: ${msg.split('\n')[0]}`; - } - // Multiple elements matched - if (msg.includes('resolved to') && msg.includes('elements')) { - return `Selector matched multiple elements. Be more specific or use @refs from 'snapshot'.`; - } - // Pass through other errors - return msg; -} - -async function handleCommand(body: any): Promise { - const { command, args = [] } = body; - - if (!command) { - return new Response(JSON.stringify({ error: 'Missing "command" field' }), { - status: 400, - headers: { 'Content-Type': 'application/json' }, - }); - } - - try { - let result: string; - - if (READ_COMMANDS.has(command)) { - result = await handleReadCommand(command, args, browserManager); - } else if (WRITE_COMMANDS.has(command)) { - result = await handleWriteCommand(command, args, browserManager); - } else if (META_COMMANDS.has(command)) { - result = await handleMetaCommand(command, args, browserManager, shutdown); - } else if (command === 'help') { - const helpText = [ - 'gstack browse — headless browser for AI agents', - '', - 'Commands:', - ' Navigation: goto , back, forward, reload', - ' Interaction: click , fill , select , hover, type, press, scroll, wait', - ' Read: text [sel], html [sel], links, forms, accessibility, cookies, storage, console, network, perf', - ' Evaluate: js , eval , css , attrs , is ', - ' Snapshot: snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o path] [-C]', - ' Screenshot: screenshot [path], pdf [path], responsive ', - ' Tabs: tabs, tab , newtab [url], closetab [id]', - ' State: cookie , cookie-import , cookie-import-browser [browser]', - ' Headers: header [name] [value], useragent [string]', - ' Upload: upload [file2...]', - ' Dialogs: dialog, dialog-accept [text], dialog-dismiss', - ' Meta: status, stop, restart, diff, chain, help', - '', - 'Snapshot flags:', - ' -i interactive only -c compact (remove empty nodes)', - ' -d N limit depth -s sel scope to CSS 
selector', - ' -D diff vs previous -a annotated screenshot with ref labels', - ' -o path output file -C cursor-interactive elements', - ].join('\n'); - return new Response(helpText, { - status: 200, - headers: { 'Content-Type': 'text/plain' }, - }); - } else { - return new Response(JSON.stringify({ - error: `Unknown command: ${command}`, - hint: `Available commands: ${[...READ_COMMANDS, ...WRITE_COMMANDS, ...META_COMMANDS].sort().join(', ')}`, - }), { - status: 400, - headers: { 'Content-Type': 'application/json' }, - }); - } - - return new Response(result, { - status: 200, - headers: { 'Content-Type': 'text/plain' }, - }); - } catch (err: any) { - return new Response(JSON.stringify({ error: wrapError(err) }), { - status: 500, - headers: { 'Content-Type': 'application/json' }, - }); - } -} - -async function shutdown() { - if (isShuttingDown) return; - isShuttingDown = true; - - console.log('[browse] Shutting down...'); - clearInterval(flushInterval); - clearInterval(idleCheckInterval); - await flushBuffers(); // Final flush (async now) - - await browserManager.close(); - - // Clean up state file - try { fs.unlinkSync(config.stateFile); } catch {} - - process.exit(0); -} - -// Handle signals -process.on('SIGTERM', shutdown); -process.on('SIGINT', shutdown); - -// ─── Start ───────────────────────────────────────────────────── -async function start() { - // Clear old log files - try { fs.unlinkSync(CONSOLE_LOG_PATH); } catch {} - try { fs.unlinkSync(NETWORK_LOG_PATH); } catch {} - try { fs.unlinkSync(DIALOG_LOG_PATH); } catch {} - - const port = await findPort(); - - // Launch browser - await browserManager.launch(); - - const startTime = Date.now(); - const server = Bun.serve({ - port, - hostname: '127.0.0.1', - fetch: async (req) => { - resetIdleTimer(); - - const url = new URL(req.url); - - // Cookie picker routes — no auth required (localhost-only) - if (url.pathname.startsWith('/cookie-picker')) { - return handleCookiePickerRoute(url, req, browserManager); - } - 
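The thin-client round trip described in the docs — read the state file, POST the command with the bearer token to `/command`, print the plain-text response — can be sketched as a pure request builder. This is an illustrative sketch, not the shipped compiled CLI: `buildCommandRequest` is a hypothetical helper, and only the `port`/`token` fields it uses match the state file the server writes; the `{command, args}` body shape matches `handleCommand` above.

```typescript
// Fields this sketch needs from .gstack/browse.json (the server also
// writes pid, startedAt, serverPath, binaryVersion).
interface BrowseState {
  port: number;
  token: string;
}

// Build the HTTP request a thin client would send to the daemon's
// /command endpoint. Bearer auth is assumed from the docs' description
// of validateAuth; the body shape matches handleCommand.
function buildCommandRequest(
  state: BrowseState,
  command: string,
  args: string[] = []
): { url: string; method: string; headers: Record<string, string>; body: string } {
  return {
    url: `http://127.0.0.1:${state.port}/command`,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${state.token}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ command, args }),
  };
}
```

A caller would then do `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })` and print the response text verbatim.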
- // Health check — no auth required (now async) - if (url.pathname === '/health') { - const healthy = await browserManager.isHealthy(); - return new Response(JSON.stringify({ - status: healthy ? 'healthy' : 'unhealthy', - uptime: Math.floor((Date.now() - startTime) / 1000), - tabs: browserManager.getTabCount(), - currentUrl: browserManager.getCurrentUrl(), - }), { - status: 200, - headers: { 'Content-Type': 'application/json' }, - }); - } - - // All other endpoints require auth - if (!validateAuth(req)) { - return new Response(JSON.stringify({ error: 'Unauthorized' }), { - status: 401, - headers: { 'Content-Type': 'application/json' }, - }); - } - - if (url.pathname === '/command' && req.method === 'POST') { - const body = await req.json(); - return handleCommand(body); - } - - return new Response('Not found', { status: 404 }); - }, - }); - - // Write state file (atomic: write .tmp then rename) - const state = { - pid: process.pid, - port, - token: AUTH_TOKEN, - startedAt: new Date().toISOString(), - serverPath: path.resolve(import.meta.dir, 'server.ts'), - binaryVersion: readVersionHash() || undefined, - }; - const tmpFile = config.stateFile + '.tmp'; - fs.writeFileSync(tmpFile, JSON.stringify(state, null, 2), { mode: 0o600 }); - fs.renameSync(tmpFile, config.stateFile); - - browserManager.serverPort = port; - console.log(`[browse] Server running on http://127.0.0.1:${port} (PID: ${process.pid})`); - console.log(`[browse] State file: ${config.stateFile}`); - console.log(`[browse] Idle timeout: ${IDLE_TIMEOUT_MS / 1000}s`); -} - -start().catch((err) => { - console.error(`[browse] Failed to start: ${err.message}`); - process.exit(1); -}); diff --git a/browse/src/snapshot.ts b/browse/src/snapshot.ts deleted file mode 100644 index b0c7b80..0000000 --- a/browse/src/snapshot.ts +++ /dev/null @@ -1,397 +0,0 @@ -/** - * Snapshot command — accessibility tree with ref-based element selection - * - * Architecture (Locator map — no DOM mutation): - * 1. 
page.locator(scope).ariaSnapshot() → YAML-like accessibility tree - * 2. Parse tree, assign refs @e1, @e2, ... - * 3. Build Playwright Locator for each ref (getByRole + nth) - * 4. Store Map on BrowserManager - * 5. Return compact text output with refs prepended - * - * Extended features: - * --diff / -D: Compare against last snapshot, return unified diff - * --annotate / -a: Screenshot with overlay boxes at each @ref - * --output / -o: Output path for annotated screenshot - * -C / --cursor-interactive: Scan for cursor:pointer/onclick/tabindex elements - * - * Later: "click @e3" → look up Locator → locator.click() - */ - -import type { Page, Locator } from 'playwright'; -import type { BrowserManager } from './browser-manager'; -import * as Diff from 'diff'; - -// Roles considered "interactive" for the -i flag -const INTERACTIVE_ROLES = new Set([ - 'button', 'link', 'textbox', 'checkbox', 'radio', 'combobox', - 'listbox', 'menuitem', 'menuitemcheckbox', 'menuitemradio', - 'option', 'searchbox', 'slider', 'spinbutton', 'switch', 'tab', - 'treeitem', -]); - -interface SnapshotOptions { - interactive?: boolean; // -i: only interactive elements - compact?: boolean; // -c: remove empty structural elements - depth?: number; // -d N: limit tree depth - selector?: string; // -s SEL: scope to CSS selector - diff?: boolean; // -D / --diff: diff against last snapshot - annotate?: boolean; // -a / --annotate: annotated screenshot - outputPath?: string; // -o / --output: path for annotated screenshot - cursorInteractive?: boolean; // -C / --cursor-interactive: scan cursor:pointer etc. 
-} - -interface ParsedNode { - indent: number; - role: string; - name: string | null; - props: string; // e.g., "[level=1]" - children: string; // inline text content after ":" - rawLine: string; -} - -/** - * Parse CLI args into SnapshotOptions - */ -export function parseSnapshotArgs(args: string[]): SnapshotOptions { - const opts: SnapshotOptions = {}; - for (let i = 0; i < args.length; i++) { - switch (args[i]) { - case '-i': - case '--interactive': - opts.interactive = true; - break; - case '-c': - case '--compact': - opts.compact = true; - break; - case '-d': - case '--depth': - opts.depth = parseInt(args[++i], 10); - if (isNaN(opts.depth!)) throw new Error('Usage: snapshot -d '); - break; - case '-s': - case '--selector': - opts.selector = args[++i]; - if (!opts.selector) throw new Error('Usage: snapshot -s '); - break; - case '-D': - case '--diff': - opts.diff = true; - break; - case '-a': - case '--annotate': - opts.annotate = true; - break; - case '-o': - case '--output': - opts.outputPath = args[++i]; - if (!opts.outputPath) throw new Error('Usage: snapshot -o '); - break; - case '-C': - case '--cursor-interactive': - opts.cursorInteractive = true; - break; - default: - throw new Error(`Unknown snapshot flag: ${args[i]}`); - } - } - return opts; -} - -/** - * Parse one line of ariaSnapshot output. - * - * Format examples: - * - heading "Test" [level=1] - * - link "Link A": - * - /url: /a - * - textbox "Name" - * - paragraph: Some text - * - combobox "Role": - */ -function parseLine(line: string): ParsedNode | null { - // Match: (indent)(- )(role)( "name")?( [props])?(: inline)? - const match = line.match(/^(\s*)-\s+(\w+)(?:\s+"([^"]*)")?(?:\s+(\[.*?\]))?\s*(?::\s*(.*))?$/); - if (!match) { - // Skip metadata lines like "- /url: /a" - return null; - } - return { - indent: match[1].length, - role: match[2], - name: match[3] ?? 
null, - props: match[4] || '', - children: match[5]?.trim() || '', - rawLine: line, - }; -} - -/** - * Take an accessibility snapshot and build the ref map. - */ -export async function handleSnapshot( - args: string[], - bm: BrowserManager -): Promise { - const opts = parseSnapshotArgs(args); - const page = bm.getPage(); - - // Get accessibility tree via ariaSnapshot - let rootLocator: Locator; - if (opts.selector) { - rootLocator = page.locator(opts.selector); - const count = await rootLocator.count(); - if (count === 0) throw new Error(`Selector not found: ${opts.selector}`); - } else { - rootLocator = page.locator('body'); - } - - const ariaText = await rootLocator.ariaSnapshot(); - if (!ariaText || ariaText.trim().length === 0) { - bm.setRefMap(new Map()); - return '(no accessible elements found)'; - } - - // Parse the ariaSnapshot output - const lines = ariaText.split('\n'); - const refMap = new Map(); - const output: string[] = []; - let refCounter = 1; - - // Track role+name occurrences for nth() disambiguation - const roleNameCounts = new Map(); - const roleNameSeen = new Map(); - - // First pass: count role+name pairs for disambiguation - for (const line of lines) { - const node = parseLine(line); - if (!node) continue; - const key = `${node.role}:${node.name || ''}`; - roleNameCounts.set(key, (roleNameCounts.get(key) || 0) + 1); - } - - // Second pass: assign refs and build locators - for (const line of lines) { - const node = parseLine(line); - if (!node) continue; - - const depth = Math.floor(node.indent / 2); - const isInteractive = INTERACTIVE_ROLES.has(node.role); - - // Depth filter - if (opts.depth !== undefined && depth > opts.depth) continue; - - // Interactive filter: skip non-interactive but still count for locator indices - if (opts.interactive && !isInteractive) { - // Still track for nth() counts - const key = `${node.role}:${node.name || ''}`; - roleNameSeen.set(key, (roleNameSeen.get(key) || 0) + 1); - continue; - } - - // Compact filter: 
skip elements with no name and no inline content that aren't interactive - if (opts.compact && !isInteractive && !node.name && !node.children) continue; - - // Assign ref - const ref = `e${refCounter++}`; - const indent = ' '.repeat(depth); - - // Build Playwright locator - const key = `${node.role}:${node.name || ''}`; - const seenIndex = roleNameSeen.get(key) || 0; - roleNameSeen.set(key, seenIndex + 1); - const totalCount = roleNameCounts.get(key) || 1; - - let locator: Locator; - if (opts.selector) { - locator = page.locator(opts.selector).getByRole(node.role as any, { - name: node.name || undefined, - }); - } else { - locator = page.getByRole(node.role as any, { - name: node.name || undefined, - }); - } - - // Disambiguate with nth() if multiple elements share role+name - if (totalCount > 1) { - locator = locator.nth(seenIndex); - } - - refMap.set(ref, locator); - - // Format output line - let outputLine = `${indent}@${ref} [${node.role}]`; - if (node.name) outputLine += ` "${node.name}"`; - if (node.props) outputLine += ` ${node.props}`; - if (node.children) outputLine += `: ${node.children}`; - - output.push(outputLine); - } - - // ─── Cursor-interactive scan (-C) ───────────────────────── - if (opts.cursorInteractive) { - try { - const cursorElements = await page.evaluate(() => { - const STANDARD_INTERACTIVE = new Set([ - 'A', 'BUTTON', 'INPUT', 'SELECT', 'TEXTAREA', 'SUMMARY', 'DETAILS', - ]); - - const results: Array<{ selector: string; text: string; reason: string }> = []; - const allElements = document.querySelectorAll('*'); - - for (const el of allElements) { - // Skip standard interactive elements (already in ARIA tree) - if (STANDARD_INTERACTIVE.has(el.tagName)) continue; - // Skip hidden elements - if (!(el as HTMLElement).offsetParent && el.tagName !== 'BODY') continue; - - const style = getComputedStyle(el); - const hasCursorPointer = style.cursor === 'pointer'; - const hasOnclick = el.hasAttribute('onclick'); - const hasTabindex = 
el.hasAttribute('tabindex') && parseInt(el.getAttribute('tabindex')!, 10) >= 0; - const hasRole = el.hasAttribute('role'); - - if (!hasCursorPointer && !hasOnclick && !hasTabindex) continue; - // Skip if it has an ARIA role (likely already captured) - if (hasRole) continue; - - // Build deterministic nth-child CSS path - const parts: string[] = []; - let current: Element | null = el; - while (current && current !== document.documentElement) { - const parent = current.parentElement; - if (!parent) break; - const siblings = [...parent.children]; - const index = siblings.indexOf(current) + 1; - parts.unshift(`${current.tagName.toLowerCase()}:nth-child(${index})`); - current = parent; - } - const selector = parts.join(' > '); - - const text = (el as HTMLElement).innerText?.trim().slice(0, 80) || el.tagName.toLowerCase(); - const reasons: string[] = []; - if (hasCursorPointer) reasons.push('cursor:pointer'); - if (hasOnclick) reasons.push('onclick'); - if (hasTabindex) reasons.push(`tabindex=${el.getAttribute('tabindex')}`); - - results.push({ selector, text, reason: reasons.join(', ') }); - } - return results; - }); - - if (cursorElements.length > 0) { - output.push(''); - output.push('── cursor-interactive (not in ARIA tree) ──'); - let cRefCounter = 1; - for (const elem of cursorElements) { - const ref = `c${cRefCounter++}`; - const locator = page.locator(elem.selector); - refMap.set(ref, locator); - output.push(`@${ref} [${elem.reason}] "${elem.text}"`); - } - } - } catch { - output.push(''); - output.push('(cursor scan failed — CSP restriction)'); - } - } - - // Store ref map on BrowserManager - bm.setRefMap(refMap); - - if (output.length === 0) { - return '(no interactive elements found)'; - } - - const snapshotText = output.join('\n'); - - // ─── Annotated screenshot (-a) ──────────────────────────── - if (opts.annotate) { - const screenshotPath = opts.outputPath || '/tmp/browse-annotated.png'; - // Validate output path (consistent with screenshot/pdf/responsive) 
- const resolvedPath = require('path').resolve(screenshotPath); - const safeDirs = ['/tmp', process.cwd()]; - if (!safeDirs.some((dir: string) => resolvedPath === dir || resolvedPath.startsWith(dir + '/'))) { - throw new Error(`Path must be within: ${safeDirs.join(', ')}`); - } - try { - // Inject overlay divs at each ref's bounding box - const boxes: Array<{ ref: string; box: { x: number; y: number; width: number; height: number } }> = []; - for (const [ref, locator] of refMap) { - try { - const box = await locator.boundingBox({ timeout: 1000 }); - if (box) { - boxes.push({ ref: `@${ref}`, box }); - } - } catch { - // Element may be offscreen or hidden — skip - } - } - - await page.evaluate((boxes) => { - for (const { ref, box } of boxes) { - const overlay = document.createElement('div'); - overlay.className = '__browse_annotation__'; - overlay.style.cssText = ` - position: absolute; top: ${box.y}px; left: ${box.x}px; - width: ${box.width}px; height: ${box.height}px; - border: 2px solid red; background: rgba(255,0,0,0.1); - pointer-events: none; z-index: 99999; - font-size: 10px; color: red; font-weight: bold; - `; - const label = document.createElement('span'); - label.textContent = ref; - label.style.cssText = 'position: absolute; top: -14px; left: 0; background: red; color: white; padding: 0 3px; font-size: 10px;'; - overlay.appendChild(label); - document.body.appendChild(overlay); - } - }, boxes); - - await page.screenshot({ path: screenshotPath, fullPage: true }); - - // Always remove overlays - await page.evaluate(() => { - document.querySelectorAll('.__browse_annotation__').forEach(el => el.remove()); - }); - - output.push(''); - output.push(`[annotated screenshot: ${screenshotPath}]`); - } catch { - // Remove overlays even on screenshot failure - try { - await page.evaluate(() => { - document.querySelectorAll('.__browse_annotation__').forEach(el => el.remove()); - }); - } catch {} - } - } - - // ─── Diff mode (-D) ─────────────────────────────────────── - 
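Diff mode (`-D`) compares the stored previous snapshot text against the current one with `Diff.diffLines` from the npm `diff` package. As a rough, dependency-free illustration of what that comparison produces, here is a naive set-based stand-in — unlike the real LCS line diff it loses ordering/move information, so it only shows which snapshot lines appeared or disappeared:

```typescript
// Naive line diff: marks lines missing from the current snapshot with
// '-', new lines with '+', and unchanged lines with two spaces.
// Illustrative only — the real implementation uses Diff.diffLines.
function naiveSnapshotDiff(prev: string, curr: string): string {
  const prevLines = prev.split('\n');
  const currLines = curr.split('\n');
  const prevSet = new Set(prevLines);
  const currSet = new Set(currLines);
  const out: string[] = ['--- previous snapshot', '+++ current snapshot'];
  for (const line of prevLines) {
    if (!currSet.has(line)) out.push(`- ${line}`);
  }
  for (const line of currLines) {
    out.push(prevSet.has(line) ? `  ${line}` : `+ ${line}`);
  }
  return out.join('\n');
}
```

For snapshot output this is usually enough signal: a re-labeled button shows up as one `-` line and one `+` line, while stable refs stay as context.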
if (opts.diff) { - const lastSnapshot = bm.getLastSnapshot(); - if (!lastSnapshot) { - bm.setLastSnapshot(snapshotText); - return snapshotText + '\n\n(no previous snapshot to diff against — this snapshot stored as baseline)'; - } - - const changes = Diff.diffLines(lastSnapshot, snapshotText); - const diffOutput: string[] = ['--- previous snapshot', '+++ current snapshot', '']; - - for (const part of changes) { - const prefix = part.added ? '+' : part.removed ? '-' : ' '; - const diffLines = part.value.split('\n').filter(l => l.length > 0); - for (const line of diffLines) { - diffOutput.push(`${prefix} ${line}`); - } - } - - bm.setLastSnapshot(snapshotText); - return diffOutput.join('\n'); - } - - // Store for future diffs - bm.setLastSnapshot(snapshotText); - - return output.join('\n'); -} diff --git a/browse/src/write-commands.ts b/browse/src/write-commands.ts deleted file mode 100644 index 08c9425..0000000 --- a/browse/src/write-commands.ts +++ /dev/null @@ -1,312 +0,0 @@ -/** - * Write commands — navigate and interact with pages (side effects) - * - * goto, back, forward, reload, click, fill, select, hover, type, - * press, scroll, wait, viewport, cookie, header, useragent - */ - -import type { BrowserManager } from './browser-manager'; -import { findInstalledBrowsers, importCookies } from './cookie-import-browser'; -import * as fs from 'fs'; -import * as path from 'path'; - -export async function handleWriteCommand( - command: string, - args: string[], - bm: BrowserManager -): Promise { - const page = bm.getPage(); - - switch (command) { - case 'goto': { - const url = args[0]; - if (!url) throw new Error('Usage: browse goto '); - const response = await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 15000 }); - const status = response?.status() || 'unknown'; - return `Navigated to ${url} (${status})`; - } - - case 'back': { - await page.goBack({ waitUntil: 'domcontentloaded', timeout: 15000 }); - return `Back → ${page.url()}`; - } - - case 'forward': { 
await page.goForward({ waitUntil: 'domcontentloaded', timeout: 15000 });
      return `Forward → ${page.url()}`;
    }

    case 'reload': {
      await page.reload({ waitUntil: 'domcontentloaded', timeout: 15000 });
      return `Reloaded ${page.url()}`;
    }

    case 'click': {
      const selector = args[0];
      if (!selector) throw new Error('Usage: browse click <selector>');
      const resolved = bm.resolveRef(selector);
      if ('locator' in resolved) {
        await resolved.locator.click({ timeout: 5000 });
      } else {
        await page.click(resolved.selector, { timeout: 5000 });
      }
      // Wait briefly for any navigation/DOM update
      await page.waitForLoadState('domcontentloaded').catch(() => {});
      return `Clicked ${selector} → now at ${page.url()}`;
    }

    case 'fill': {
      const [selector, ...valueParts] = args;
      const value = valueParts.join(' ');
      if (!selector || !value) throw new Error('Usage: browse fill <selector> <value>');
      const resolved = bm.resolveRef(selector);
      if ('locator' in resolved) {
        await resolved.locator.fill(value, { timeout: 5000 });
      } else {
        await page.fill(resolved.selector, value, { timeout: 5000 });
      }
      return `Filled ${selector}`;
    }

    case 'select': {
      const [selector, ...valueParts] = args;
      const value = valueParts.join(' ');
      if (!selector || !value) throw new Error('Usage: browse select <selector> <value>');
      const resolved = bm.resolveRef(selector);
      if ('locator' in resolved) {
        await resolved.locator.selectOption(value, { timeout: 5000 });
      } else {
        await page.selectOption(resolved.selector, value, { timeout: 5000 });
      }
      return `Selected "${value}" in ${selector}`;
    }

    case 'hover': {
      const selector = args[0];
      if (!selector) throw new Error('Usage: browse hover <selector>');
      const resolved = bm.resolveRef(selector);
      if ('locator' in resolved) {
        await resolved.locator.hover({ timeout: 5000 });
      } else {
        await page.hover(resolved.selector, { timeout: 5000 });
      }
      return `Hovered ${selector}`;
    }

    case 'type': {
      const text = args.join(' ');
      if (!text) throw new Error('Usage: browse type <text>');
      await page.keyboard.type(text);
      return `Typed ${text.length} characters`;
    }

    case 'press': {
      const key = args[0];
      if (!key) throw new Error('Usage: browse press <key> (e.g., Enter, Tab, Escape)');
      await page.keyboard.press(key);
      return `Pressed ${key}`;
    }

    case 'scroll': {
      const selector = args[0];
      if (selector) {
        const resolved = bm.resolveRef(selector);
        if ('locator' in resolved) {
          await resolved.locator.scrollIntoViewIfNeeded({ timeout: 5000 });
        } else {
          await page.locator(resolved.selector).scrollIntoViewIfNeeded({ timeout: 5000 });
        }
        return `Scrolled ${selector} into view`;
      }
      await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
      return 'Scrolled to bottom';
    }

    case 'wait': {
      const selector = args[0];
      if (!selector) throw new Error('Usage: browse wait <selector>');
      if (selector === '--networkidle') {
        const timeout = args[1] ? parseInt(args[1], 10) : 15000;
        await page.waitForLoadState('networkidle', { timeout });
        return 'Network idle';
      }
      if (selector === '--load') {
        await page.waitForLoadState('load');
        return 'Page loaded';
      }
      if (selector === '--domcontentloaded') {
        await page.waitForLoadState('domcontentloaded');
        return 'DOM content loaded';
      }
      const timeout = args[1] ?
parseInt(args[1], 10) : 15000; - const resolved = bm.resolveRef(selector); - if ('locator' in resolved) { - await resolved.locator.waitFor({ state: 'visible', timeout }); - } else { - await page.waitForSelector(resolved.selector, { timeout }); - } - return `Element ${selector} appeared`; - } - - case 'viewport': { - const size = args[0]; - if (!size || !size.includes('x')) throw new Error('Usage: browse viewport (e.g., 375x812)'); - const [w, h] = size.split('x').map(Number); - await bm.setViewport(w, h); - return `Viewport set to ${w}x${h}`; - } - - case 'cookie': { - const cookieStr = args[0]; - if (!cookieStr || !cookieStr.includes('=')) throw new Error('Usage: browse cookie ='); - const eq = cookieStr.indexOf('='); - const name = cookieStr.slice(0, eq); - const value = cookieStr.slice(eq + 1); - const url = new URL(page.url()); - await page.context().addCookies([{ - name, - value, - domain: url.hostname, - path: '/', - }]); - return `Cookie set: ${name}=****`; - } - - case 'header': { - const headerStr = args[0]; - if (!headerStr || !headerStr.includes(':')) throw new Error('Usage: browse header :'); - const sep = headerStr.indexOf(':'); - const name = headerStr.slice(0, sep).trim(); - const value = headerStr.slice(sep + 1).trim(); - await bm.setExtraHeader(name, value); - const sensitiveHeaders = ['authorization', 'cookie', 'set-cookie', 'x-api-key', 'x-auth-token']; - const redactedValue = sensitiveHeaders.includes(name.toLowerCase()) ? 
'****' : value; - return `Header set: ${name}: ${redactedValue}`; - } - - case 'useragent': { - const ua = args.join(' '); - if (!ua) throw new Error('Usage: browse useragent '); - bm.setUserAgent(ua); - const error = await bm.recreateContext(); - if (error) { - return `User agent set to "${ua}" but: ${error}`; - } - return `User agent set: ${ua}`; - } - - case 'upload': { - const [selector, ...filePaths] = args; - if (!selector || filePaths.length === 0) throw new Error('Usage: browse upload [file2...]'); - - // Validate all files exist before upload - for (const fp of filePaths) { - if (!fs.existsSync(fp)) throw new Error(`File not found: ${fp}`); - } - - const resolved = bm.resolveRef(selector); - if ('locator' in resolved) { - await resolved.locator.setInputFiles(filePaths); - } else { - await page.locator(resolved.selector).setInputFiles(filePaths); - } - - const fileInfo = filePaths.map(fp => { - const stat = fs.statSync(fp); - return `${path.basename(fp)} (${stat.size}B)`; - }).join(', '); - return `Uploaded: ${fileInfo}`; - } - - case 'dialog-accept': { - const text = args.length > 0 ? args.join(' ') : null; - bm.setDialogAutoAccept(true); - bm.setDialogPromptText(text); - return text - ? `Dialogs will be accepted with text: "${text}"` - : 'Dialogs will be accepted'; - } - - case 'dialog-dismiss': { - bm.setDialogAutoAccept(false); - bm.setDialogPromptText(null); - return 'Dialogs will be dismissed'; - } - - case 'cookie-import': { - const filePath = args[0]; - if (!filePath) throw new Error('Usage: browse cookie-import '); - // Path validation — prevent reading arbitrary files - if (path.isAbsolute(filePath)) { - const safeDirs = ['/tmp', process.cwd()]; - const resolved = path.resolve(filePath); - if (!safeDirs.some(dir => resolved === dir || resolved.startsWith(dir + '/'))) { - throw new Error(`Path must be within: ${safeDirs.join(', ')}`); - } - } - if (path.normalize(filePath).includes('..')) { - throw new Error('Path traversal sequences (..) 
are not allowed'); - } - if (!fs.existsSync(filePath)) throw new Error(`File not found: ${filePath}`); - const raw = fs.readFileSync(filePath, 'utf-8'); - let cookies: any[]; - try { cookies = JSON.parse(raw); } catch { throw new Error(`Invalid JSON in ${filePath}`); } - if (!Array.isArray(cookies)) throw new Error('Cookie file must contain a JSON array'); - - // Auto-fill domain from current page URL when missing (consistent with cookie command) - const pageUrl = new URL(page.url()); - const defaultDomain = pageUrl.hostname; - - for (const c of cookies) { - if (!c.name || c.value === undefined) throw new Error('Each cookie must have "name" and "value" fields'); - if (!c.domain) c.domain = defaultDomain; - if (!c.path) c.path = '/'; - } - - await page.context().addCookies(cookies); - return `Loaded ${cookies.length} cookies from ${filePath}`; - } - - case 'cookie-import-browser': { - // Two modes: - // 1. Direct CLI import: cookie-import-browser --domain - // 2. Open picker UI: cookie-import-browser [browser] - const browserArg = args[0]; - const domainIdx = args.indexOf('--domain'); - - if (domainIdx !== -1 && domainIdx + 1 < args.length) { - // Direct import mode — no UI - const domain = args[domainIdx + 1]; - const browser = browserArg || 'comet'; - const result = await importCookies(browser, [domain]); - if (result.cookies.length > 0) { - await page.context().addCookies(result.cookies); - } - const msg = [`Imported ${result.count} cookies for ${domain} from ${browser}`]; - if (result.failed > 0) msg.push(`(${result.failed} failed to decrypt)`); - return msg.join(' '); - } - - // Picker UI mode — open in user's browser - const port = bm.serverPort; - if (!port) throw new Error('Server port not available'); - - const browsers = findInstalledBrowsers(); - if (browsers.length === 0) { - throw new Error('No Chromium browsers found. 
Supported: Comet, Chrome, Arc, Brave, Edge'); - } - - const pickerUrl = `http://127.0.0.1:${port}/cookie-picker`; - try { - Bun.spawn(['open', pickerUrl], { stdout: 'ignore', stderr: 'ignore' }); - } catch { - // open may fail silently — URL is in the message below - } - - return `Cookie picker opened at ${pickerUrl}\nDetected browsers: ${browsers.map(b => b.name).join(', ')}\nSelect domains to import, then close the picker when done.`; - } - - default: - throw new Error(`Unknown write command: ${command}`); - } -} diff --git a/browse/test/commands.test.ts b/browse/test/commands.test.ts deleted file mode 100644 index 1f6ad2f..0000000 --- a/browse/test/commands.test.ts +++ /dev/null @@ -1,1598 +0,0 @@ -/** - * Integration tests for all browse commands - * - * Tests run against a local test server serving fixture HTML files. - * A real browse server is started and commands are sent via the CLI HTTP interface. - */ - -import { describe, test, expect, beforeAll, afterAll } from 'bun:test'; -import { startTestServer } from './test-server'; -import { BrowserManager } from '../src/browser-manager'; -import { resolveServerScript } from '../src/cli'; -import { handleReadCommand } from '../src/read-commands'; -import { handleWriteCommand } from '../src/write-commands'; -import { handleMetaCommand } from '../src/meta-commands'; -import { consoleBuffer, networkBuffer, dialogBuffer, addConsoleEntry, addNetworkEntry, addDialogEntry, CircularBuffer } from '../src/buffers'; -import * as fs from 'fs'; -import { spawn } from 'child_process'; -import * as path from 'path'; - -let testServer: ReturnType; -let bm: BrowserManager; -let baseUrl: string; - -beforeAll(async () => { - testServer = startTestServer(0); - baseUrl = testServer.url; - - bm = new BrowserManager(); - await bm.launch(); -}); - -afterAll(() => { - // Force kill browser instead of graceful close (avoids hang) - try { testServer.server.stop(); } catch {} - // bm.close() can hang — just let process exit handle it - 
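These tests import `CircularBuffer` and the shared console/network/dialog buffers; their `totalAdded` counter never resets, which is what lets the server's flusher compute `newCount = totalAdded - lastFlushed` and append only the delta. A minimal sketch of that pattern — `RingBuffer` here is illustrative, not the project's actual `CircularBuffer`:

```typescript
// Fixed-capacity ring buffer. totalAdded counts every push ever made,
// so a periodic flusher can grab only entries added since its last run.
class RingBuffer<T> {
  private items: T[] = [];
  public totalAdded = 0;

  constructor(private capacity: number) {}

  push(item: T): void {
    this.items.push(item);
    this.totalAdded++;
    // Evict the oldest entry once over capacity.
    if (this.items.length > this.capacity) this.items.shift();
  }

  get length(): number {
    return this.items.length;
  }

  // Last n retained items, oldest first (n clamped to what survives).
  last(n: number): T[] {
    return this.items.slice(Math.max(0, this.items.length - n));
  }
}
```

A flusher following the pattern above would call `buffer.last(Math.min(newCount, buffer.length))` — entries evicted before a flush are simply lost, which is the accepted trade-off of a bounded buffer.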
setTimeout(() => process.exit(0), 500);
});

// ─── Navigation ─────────────────────────────────────────────────

describe('Navigation', () => {
  test('goto navigates to URL', async () => {
    const result = await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
    expect(result).toContain('Navigated to');
    expect(result).toContain('200');
  });

  test('url returns current URL', async () => {
    const result = await handleMetaCommand('url', [], bm, async () => {});
    expect(result).toContain('/basic.html');
  });

  test('back goes back', async () => {
    await handleWriteCommand('goto', [baseUrl + '/forms.html'], bm);
    const result = await handleWriteCommand('back', [], bm);
    expect(result).toContain('Back');
  });

  test('forward goes forward', async () => {
    const result = await handleWriteCommand('forward', [], bm);
    expect(result).toContain('Forward');
  });

  test('reload reloads page', async () => {
    const result = await handleWriteCommand('reload', [], bm);
    expect(result).toContain('Reloaded');
  });
});

// ─── Content Extraction ─────────────────────────────────────────

describe('Content extraction', () => {
  beforeAll(async () => {
    await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
  });

  test('text returns cleaned page text', async () => {
    const result = await handleReadCommand('text', [], bm);
    expect(result).toContain('Hello World');
    expect(result).toContain('Item one');
    expect(result).not.toContain('<script>');
  });

  test('html returns full page HTML', async () => {
    const result = await handleReadCommand('html', [], bm);
    expect(result).toContain('<!DOCTYPE html>');
    expect(result).toContain('<h1 id="title">Hello World</h1>');
  });

  test('html with selector returns element innerHTML', async () => {
    const result = await handleReadCommand('html', ['#content'], bm);
    expect(result).toContain('Some body text here.');
    expect(result).toContain('<li>Item one</li>');
  });

  test('links returns all links', async () => {
    const result = await handleReadCommand('links', [], bm);
    expect(result).toContain('Page 1');
    expect(result).toContain('Page 2');
    expect(result).toContain('External');
    expect(result).toContain('→');
  });

  test('forms discovers form fields', async () => {
    await handleWriteCommand('goto', [baseUrl + '/forms.html'], bm);
    const result = await handleReadCommand('forms', [], bm);
    const forms = JSON.parse(result);
    expect(forms.length).toBe(2);
    expect(forms[0].id).toBe('login-form');
    expect(forms[0].method).toBe('post');
    expect(forms[0].fields.length).toBeGreaterThanOrEqual(2);
    expect(forms[1].id).toBe('profile-form');

    // Check field discovery
    const emailField = forms[0].fields.find((f: any) => f.name === 'email');
    expect(emailField).toBeDefined();
    expect(emailField.type).toBe('email');
    expect(emailField.required).toBe(true);
  });

  test('accessibility returns ARIA tree', async () => {
    await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
    const result = await handleReadCommand('accessibility', [], bm);
    expect(result).toContain('Hello World');
  });
});

// ─── JavaScript / CSS / Attrs ───────────────────────────────────

describe('Inspection', () => {
  beforeAll(async () => {
    await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
  });

  test('js evaluates expression', async () => {
    const result = await handleReadCommand('js', ['document.title'], bm);
    expect(result).toBe('Test Page - Basic');
  });

  test('js returns objects as JSON', async () => {
    const result = await handleReadCommand('js', ['({a: 1, b: 2})'], bm);
    const obj = JSON.parse(result);
    expect(obj.a).toBe(1);
    expect(obj.b).toBe(2);
  });

  test('css returns computed property', async () => {
    const result = await handleReadCommand('css', ['h1', 'color'], bm);
    // Navy color
    expect(result).toContain('0, 0, 128');
  });

  test('css returns
font-family', async () => { - const result = await handleReadCommand('css', ['body', 'font-family'], bm); - expect(result).toContain('Helvetica'); - }); - - test('attrs returns element attributes', async () => { - const result = await handleReadCommand('attrs', ['#content'], bm); - const attrs = JSON.parse(result); - expect(attrs.id).toBe('content'); - expect(attrs['data-testid']).toBe('main-content'); - expect(attrs['data-version']).toBe('1.0'); - }); -}); - -// ─── Interaction ──────────────────────────────────────────────── - -describe('Interaction', () => { - test('fill + click works on form', async () => { - await handleWriteCommand('goto', [baseUrl + '/forms.html'], bm); - - let result = await handleWriteCommand('fill', ['#email', 'test@example.com'], bm); - expect(result).toContain('Filled'); - - result = await handleWriteCommand('fill', ['#password', 'secret123'], bm); - expect(result).toContain('Filled'); - - // Verify values were set - const emailVal = await handleReadCommand('js', ['document.querySelector("#email").value'], bm); - expect(emailVal).toBe('test@example.com'); - - result = await handleWriteCommand('click', ['#login-btn'], bm); - expect(result).toContain('Clicked'); - }); - - test('select works on dropdown', async () => { - await handleWriteCommand('goto', [baseUrl + '/forms.html'], bm); - const result = await handleWriteCommand('select', ['#role', 'admin'], bm); - expect(result).toContain('Selected'); - - const val = await handleReadCommand('js', ['document.querySelector("#role").value'], bm); - expect(val).toBe('admin'); - }); - - test('hover works', async () => { - const result = await handleWriteCommand('hover', ['h1'], bm); - expect(result).toContain('Hovered'); - }); - - test('wait finds existing element', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleWriteCommand('wait', ['#title'], bm); - expect(result).toContain('appeared'); - }); - - test('scroll works', async () => { - 
const result = await handleWriteCommand('scroll', ['footer'], bm); - expect(result).toContain('Scrolled'); - }); - - test('viewport changes size', async () => { - const result = await handleWriteCommand('viewport', ['375x812'], bm); - expect(result).toContain('Viewport set'); - - const size = await handleReadCommand('js', ['`${window.innerWidth}x${window.innerHeight}`'], bm); - expect(size).toBe('375x812'); - - // Reset - await handleWriteCommand('viewport', ['1280x720'], bm); - }); - - test('type and press work', async () => { - await handleWriteCommand('goto', [baseUrl + '/forms.html'], bm); - await handleWriteCommand('click', ['#name'], bm); - - const result = await handleWriteCommand('type', ['John Doe'], bm); - expect(result).toContain('Typed'); - - const val = await handleReadCommand('js', ['document.querySelector("#name").value'], bm); - expect(val).toBe('John Doe'); - }); -}); - -// ─── SPA / Console / Network ─────────────────────────────────── - -describe('SPA and buffers', () => { - test('wait handles delayed rendering', async () => { - await handleWriteCommand('goto', [baseUrl + '/spa.html'], bm); - const result = await handleWriteCommand('wait', ['.loaded'], bm); - expect(result).toContain('appeared'); - - const text = await handleReadCommand('text', [], bm); - expect(text).toContain('SPA Content Loaded'); - }); - - test('console captures messages', async () => { - const result = await handleReadCommand('console', [], bm); - expect(result).toContain('[SPA] Starting render'); - expect(result).toContain('[SPA] Render complete'); - }); - - test('console --clear clears buffer', async () => { - const result = await handleReadCommand('console', ['--clear'], bm); - expect(result).toContain('cleared'); - - const after = await handleReadCommand('console', [], bm); - expect(after).toContain('no console messages'); - }); - - test('network captures requests', async () => { - const result = await handleReadCommand('network', [], bm); - 
expect(result).toContain('GET'); - expect(result).toContain('/spa.html'); - }); - - test('network --clear clears buffer', async () => { - const result = await handleReadCommand('network', ['--clear'], bm); - expect(result).toContain('cleared'); - }); -}); - -// ─── Cookies / Storage ────────────────────────────────────────── - -describe('Cookies and storage', () => { - test('cookies returns array', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleReadCommand('cookies', [], bm); - // Test server doesn't set cookies, so empty array - expect(result).toBe('[]'); - }); - - test('storage set and get works', async () => { - await handleReadCommand('storage', ['set', 'testKey', 'testValue'], bm); - const result = await handleReadCommand('storage', [], bm); - const storage = JSON.parse(result); - expect(storage.localStorage.testKey).toBe('testValue'); - }); -}); - -// ─── Performance ──────────────────────────────────────────────── - -describe('Performance', () => { - test('perf returns timing data', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleReadCommand('perf', [], bm); - expect(result).toContain('dns'); - expect(result).toContain('ttfb'); - expect(result).toContain('load'); - expect(result).toContain('ms'); - }); -}); - -// ─── Visual ───────────────────────────────────────────────────── - -describe('Visual', () => { - test('screenshot saves file', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const screenshotPath = '/tmp/browse-test-screenshot.png'; - const result = await handleMetaCommand('screenshot', [screenshotPath], bm, async () => {}); - expect(result).toContain('Screenshot saved'); - expect(fs.existsSync(screenshotPath)).toBe(true); - const stat = fs.statSync(screenshotPath); - expect(stat.size).toBeGreaterThan(1000); - fs.unlinkSync(screenshotPath); - }); - - test('responsive saves 3 screenshots', 
async () => { - await handleWriteCommand('goto', [baseUrl + '/responsive.html'], bm); - const prefix = '/tmp/browse-test-resp'; - const result = await handleMetaCommand('responsive', [prefix], bm, async () => {}); - expect(result).toContain('mobile'); - expect(result).toContain('tablet'); - expect(result).toContain('desktop'); - - expect(fs.existsSync(`${prefix}-mobile.png`)).toBe(true); - expect(fs.existsSync(`${prefix}-tablet.png`)).toBe(true); - expect(fs.existsSync(`${prefix}-desktop.png`)).toBe(true); - - // Cleanup - fs.unlinkSync(`${prefix}-mobile.png`); - fs.unlinkSync(`${prefix}-tablet.png`); - fs.unlinkSync(`${prefix}-desktop.png`); - }); -}); - -// ─── Tabs ─────────────────────────────────────────────────────── - -describe('Tabs', () => { - test('tabs lists all tabs', async () => { - const result = await handleMetaCommand('tabs', [], bm, async () => {}); - expect(result).toContain('['); - expect(result).toContain(']'); - }); - - test('newtab opens new tab', async () => { - const result = await handleMetaCommand('newtab', [baseUrl + '/forms.html'], bm, async () => {}); - expect(result).toContain('Opened tab'); - - const tabCount = bm.getTabCount(); - expect(tabCount).toBeGreaterThanOrEqual(2); - }); - - test('tab switches to specific tab', async () => { - const result = await handleMetaCommand('tab', ['1'], bm, async () => {}); - expect(result).toContain('Switched to tab 1'); - }); - - test('closetab closes a tab', async () => { - const before = bm.getTabCount(); - // Close the last opened tab - const tabs = await bm.getTabListWithTitles(); - const lastTab = tabs[tabs.length - 1]; - const result = await handleMetaCommand('closetab', [String(lastTab.id)], bm, async () => {}); - expect(result).toContain('Closed tab'); - expect(bm.getTabCount()).toBe(before - 1); - }); -}); - -// ─── Diff ─────────────────────────────────────────────────────── - -describe('Diff', () => { - test('diff shows differences between pages', async () => { - const result = await 
handleMetaCommand( - 'diff', - [baseUrl + '/basic.html', baseUrl + '/forms.html'], - bm, - async () => {} - ); - expect(result).toContain('---'); - expect(result).toContain('+++'); - // basic.html has "Hello World", forms.html has "Form Test Page" - expect(result).toContain('Hello World'); - expect(result).toContain('Form Test Page'); - }); -}); - -// ─── Chain ────────────────────────────────────────────────────── - -describe('Chain', () => { - test('chain executes sequence of commands', async () => { - const commands = JSON.stringify([ - ['goto', baseUrl + '/basic.html'], - ['js', 'document.title'], - ['css', 'h1', 'color'], - ]); - const result = await handleMetaCommand('chain', [commands], bm, async () => {}); - expect(result).toContain('[goto]'); - expect(result).toContain('Test Page - Basic'); - expect(result).toContain('[css]'); - }); - - test('chain reports real error when write command fails', async () => { - const commands = JSON.stringify([ - ['goto', 'http://localhost:1/unreachable'], - ]); - const result = await handleMetaCommand('chain', [commands], bm, async () => {}); - expect(result).toContain('[goto] ERROR:'); - expect(result).not.toContain('Unknown meta command'); - expect(result).not.toContain('Unknown read command'); - }); -}); - -// ─── Status ───────────────────────────────────────────────────── - -describe('Status', () => { - test('status reports health', async () => { - const result = await handleMetaCommand('status', [], bm, async () => {}); - expect(result).toContain('Status: healthy'); - expect(result).toContain('Tabs:'); - }); -}); - -// ─── CLI server script resolution ─────────────────────────────── - -describe('CLI server script resolution', () => { - test('prefers adjacent browse/src/server.ts for compiled project installs', () => { - const root = fs.mkdtempSync('/tmp/gstack-cli-'); - const execPath = path.join(root, '.claude/skills/gstack/browse/dist/browse'); - const serverPath = path.join(root, 
'.claude/skills/gstack/browse/src/server.ts'); - - fs.mkdirSync(path.dirname(execPath), { recursive: true }); - fs.mkdirSync(path.dirname(serverPath), { recursive: true }); - fs.writeFileSync(serverPath, '// test server\n'); - - const resolved = resolveServerScript( - { HOME: path.join(root, 'empty-home') }, - '$bunfs/root', - execPath - ); - - expect(resolved).toBe(serverPath); - - fs.rmSync(root, { recursive: true, force: true }); - }); -}); - -// ─── CLI lifecycle ────────────────────────────────────────────── - -describe('CLI lifecycle', () => { - test('dead state file triggers a clean restart', async () => { - const stateFile = `/tmp/browse-test-state-${Date.now()}.json`; - fs.writeFileSync(stateFile, JSON.stringify({ - port: 1, - token: 'fake', - pid: 999999, - })); - - const cliPath = path.resolve(__dirname, '../src/cli.ts'); - const cliEnv: Record<string, string> = {}; - for (const [k, v] of Object.entries(process.env)) { - if (v !== undefined) cliEnv[k] = v; - } - cliEnv.BROWSE_STATE_FILE = stateFile; - const result = await new Promise<{ code: number; stdout: string; stderr: string }>((resolve) => { - const proc = spawn('bun', ['run', cliPath, 'status'], { - timeout: 15000, - env: cliEnv, - }); - let stdout = ''; - let stderr = ''; - proc.stdout.on('data', (d) => stdout += d.toString()); - proc.stderr.on('data', (d) => stderr += d.toString()); - proc.on('close', (code) => resolve({ code: code ?? 
1, stdout, stderr })); - }); - - let restartedPid: number | null = null; - if (fs.existsSync(stateFile)) { - restartedPid = JSON.parse(fs.readFileSync(stateFile, 'utf-8')).pid; - fs.unlinkSync(stateFile); - } - if (restartedPid) { - try { process.kill(restartedPid, 'SIGTERM'); } catch {} - } - - expect(result.code).toBe(0); - expect(result.stdout).toContain('Status: healthy'); - expect(result.stderr).toContain('Starting server'); - }, 20000); -}); - -// ─── Buffer bounds ────────────────────────────────────────────── - -describe('Buffer bounds', () => { - test('console buffer caps at 50000 entries', () => { - consoleBuffer.clear(); - for (let i = 0; i < 50_010; i++) { - addConsoleEntry({ timestamp: i, level: 'log', text: `msg-${i}` }); - } - expect(consoleBuffer.length).toBe(50_000); - const entries = consoleBuffer.toArray(); - expect(entries[0].text).toBe('msg-10'); - expect(entries[entries.length - 1].text).toBe('msg-50009'); - consoleBuffer.clear(); - }); - - test('network buffer caps at 50000 entries', () => { - networkBuffer.clear(); - for (let i = 0; i < 50_010; i++) { - addNetworkEntry({ timestamp: i, method: 'GET', url: `http://x/${i}` }); - } - expect(networkBuffer.length).toBe(50_000); - const entries = networkBuffer.toArray(); - expect(entries[0].url).toBe('http://x/10'); - expect(entries[entries.length - 1].url).toBe('http://x/50009'); - networkBuffer.clear(); - }); - - test('totalAdded counters keep incrementing past buffer cap', () => { - const startConsole = consoleBuffer.totalAdded; - const startNetwork = networkBuffer.totalAdded; - for (let i = 0; i < 100; i++) { - addConsoleEntry({ timestamp: i, level: 'log', text: `t-${i}` }); - addNetworkEntry({ timestamp: i, method: 'GET', url: `http://t/${i}` }); - } - expect(consoleBuffer.totalAdded).toBe(startConsole + 100); - expect(networkBuffer.totalAdded).toBe(startNetwork + 100); - consoleBuffer.clear(); - networkBuffer.clear(); - }); -}); - -// ─── CircularBuffer Unit Tests 
───────────────────────────────── - -describe('CircularBuffer', () => { - test('push and toArray return items in insertion order', () => { - const buf = new CircularBuffer(5); - buf.push(1); buf.push(2); buf.push(3); - expect(buf.toArray()).toEqual([1, 2, 3]); - expect(buf.length).toBe(3); - }); - - test('overwrites oldest when full', () => { - const buf = new CircularBuffer(3); - buf.push(1); buf.push(2); buf.push(3); buf.push(4); - expect(buf.toArray()).toEqual([2, 3, 4]); - expect(buf.length).toBe(3); - }); - - test('totalAdded increments past capacity', () => { - const buf = new CircularBuffer(2); - buf.push(1); buf.push(2); buf.push(3); buf.push(4); buf.push(5); - expect(buf.totalAdded).toBe(5); - expect(buf.length).toBe(2); - expect(buf.toArray()).toEqual([4, 5]); - }); - - test('last(n) returns most recent entries', () => { - const buf = new CircularBuffer(5); - for (let i = 1; i <= 5; i++) buf.push(i); - expect(buf.last(3)).toEqual([3, 4, 5]); - expect(buf.last(10)).toEqual([1, 2, 3, 4, 5]); // clamped - expect(buf.last(1)).toEqual([5]); - }); - - test('get and set work by index', () => { - const buf = new CircularBuffer(3); - buf.push('a'); buf.push('b'); buf.push('c'); - expect(buf.get(0)).toBe('a'); - expect(buf.get(2)).toBe('c'); - buf.set(1, 'B'); - expect(buf.get(1)).toBe('B'); - expect(buf.get(-1)).toBeUndefined(); - expect(buf.get(5)).toBeUndefined(); - }); - - test('clear resets size but not totalAdded', () => { - const buf = new CircularBuffer(5); - buf.push(1); buf.push(2); buf.push(3); - buf.clear(); - expect(buf.length).toBe(0); - expect(buf.totalAdded).toBe(3); - expect(buf.toArray()).toEqual([]); - }); - - test('works with capacity=1', () => { - const buf = new CircularBuffer(1); - buf.push(10); - expect(buf.toArray()).toEqual([10]); - buf.push(20); - expect(buf.toArray()).toEqual([20]); - expect(buf.totalAdded).toBe(2); - }); -}); - -// ─── Dialog Handling ───────────────────────────────────────── - -describe('Dialog handling', () => { - 
test('alert does not hang — auto-accepted', async () => { - await handleWriteCommand('goto', [baseUrl + '/dialog.html'], bm); - await handleWriteCommand('click', ['#alert-btn'], bm); - // If we get here, dialog was handled (no hang) - const result = await handleReadCommand('dialog', [], bm); - expect(result).toContain('alert'); - expect(result).toContain('Hello from alert'); - expect(result).toContain('accepted'); - }); - - test('confirm is auto-accepted by default', async () => { - await handleWriteCommand('goto', [baseUrl + '/dialog.html'], bm); - await handleWriteCommand('click', ['#confirm-btn'], bm); - // Wait for DOM update - await new Promise(r => setTimeout(r, 100)); - const result = await handleReadCommand('js', ['document.querySelector("#confirm-result").textContent'], bm); - expect(result).toBe('confirmed'); - }); - - test('dialog-dismiss changes behavior', async () => { - const setResult = await handleWriteCommand('dialog-dismiss', [], bm); - expect(setResult).toContain('dismissed'); - - await handleWriteCommand('goto', [baseUrl + '/dialog.html'], bm); - await handleWriteCommand('click', ['#confirm-btn'], bm); - await new Promise(r => setTimeout(r, 100)); - const result = await handleReadCommand('js', ['document.querySelector("#confirm-result").textContent'], bm); - expect(result).toBe('cancelled'); - - // Reset to accept - await handleWriteCommand('dialog-accept', [], bm); - }); - - test('dialog-accept with text provides prompt response', async () => { - const setResult = await handleWriteCommand('dialog-accept', ['TestUser'], bm); - expect(setResult).toContain('TestUser'); - - await handleWriteCommand('goto', [baseUrl + '/dialog.html'], bm); - await handleWriteCommand('click', ['#prompt-btn'], bm); - await new Promise(r => setTimeout(r, 100)); - const result = await handleReadCommand('js', ['document.querySelector("#prompt-result").textContent'], bm); - expect(result).toBe('TestUser'); - - // Reset - await handleWriteCommand('dialog-accept', [], bm); 
- }); - - test('dialog --clear clears buffer', async () => { - const cleared = await handleReadCommand('dialog', ['--clear'], bm); - expect(cleared).toContain('cleared'); - const after = await handleReadCommand('dialog', [], bm); - expect(after).toContain('no dialogs'); - }); -}); - -// ─── Element State Checks (is) ───────────────────────────────── - -describe('Element state checks', () => { - beforeAll(async () => { - await handleWriteCommand('goto', [baseUrl + '/states.html'], bm); - }); - - test('is visible returns true for visible element', async () => { - const result = await handleReadCommand('is', ['visible', '#visible-div'], bm); - expect(result).toBe('true'); - }); - - test('is hidden returns true for hidden element', async () => { - const result = await handleReadCommand('is', ['hidden', '#hidden-div'], bm); - expect(result).toBe('true'); - }); - - test('is visible returns false for hidden element', async () => { - const result = await handleReadCommand('is', ['visible', '#hidden-div'], bm); - expect(result).toBe('false'); - }); - - test('is enabled returns true for enabled input', async () => { - const result = await handleReadCommand('is', ['enabled', '#enabled-input'], bm); - expect(result).toBe('true'); - }); - - test('is disabled returns true for disabled input', async () => { - const result = await handleReadCommand('is', ['disabled', '#disabled-input'], bm); - expect(result).toBe('true'); - }); - - test('is checked returns true for checked checkbox', async () => { - const result = await handleReadCommand('is', ['checked', '#checked-box'], bm); - expect(result).toBe('true'); - }); - - test('is checked returns false for unchecked checkbox', async () => { - const result = await handleReadCommand('is', ['checked', '#unchecked-box'], bm); - expect(result).toBe('false'); - }); - - test('is editable returns true for normal input', async () => { - const result = await handleReadCommand('is', ['editable', '#enabled-input'], bm); - 
expect(result).toBe('true'); - }); - - test('is editable returns false for readonly input', async () => { - const result = await handleReadCommand('is', ['editable', '#readonly-input'], bm); - expect(result).toBe('false'); - }); - - test('is focused after click', async () => { - await handleWriteCommand('click', ['#enabled-input'], bm); - const result = await handleReadCommand('is', ['focused', '#enabled-input'], bm); - expect(result).toBe('true'); - }); - - test('is with @ref works', async () => { - await handleMetaCommand('snapshot', ['-i'], bm, async () => {}); - // Find a ref for the enabled input - const snap = await handleMetaCommand('snapshot', ['-i'], bm, async () => {}); - const textboxLine = snap.split('\n').find(l => l.includes('[textbox]')); - if (textboxLine) { - const refMatch = textboxLine.match(/@(e\d+)/); - if (refMatch) { - const ref = `@${refMatch[1]}`; - const result = await handleReadCommand('is', ['visible', ref], bm); - expect(result).toBe('true'); - } - } - }); - - test('is with unknown property throws', async () => { - try { - await handleReadCommand('is', ['bogus', '#enabled-input'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Unknown property'); - } - }); - - test('is with missing args throws', async () => { - try { - await handleReadCommand('is', ['visible'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); -}); - -// ─── File Upload ───────────────────────────────────────────────── - -describe('File upload', () => { - test('upload single file', async () => { - await handleWriteCommand('goto', [baseUrl + '/upload.html'], bm); - // Create a temp file to upload - const tempFile = '/tmp/browse-test-upload.txt'; - fs.writeFileSync(tempFile, 'test content'); - const result = await handleWriteCommand('upload', ['#file-input', tempFile], bm); - expect(result).toContain('Uploaded'); - expect(result).toContain('browse-test-upload.txt'); - - // 
Verify upload handler fired - await new Promise(r => setTimeout(r, 100)); - const text = await handleReadCommand('js', ['document.querySelector("#upload-result").textContent'], bm); - expect(text).toContain('browse-test-upload.txt'); - fs.unlinkSync(tempFile); - }); - - test('upload with @ref works', async () => { - await handleWriteCommand('goto', [baseUrl + '/upload.html'], bm); - const tempFile = '/tmp/browse-test-upload2.txt'; - fs.writeFileSync(tempFile, 'ref upload test'); - const snap = await handleMetaCommand('snapshot', ['-i'], bm, async () => {}); - // Find the file input ref (it won't appear as "file input" in aria — use CSS selector instead) - const result = await handleWriteCommand('upload', ['#file-input', tempFile], bm); - expect(result).toContain('Uploaded'); - fs.unlinkSync(tempFile); - }); - - test('upload nonexistent file throws', async () => { - await handleWriteCommand('goto', [baseUrl + '/upload.html'], bm); - try { - await handleWriteCommand('upload', ['#file-input', '/tmp/nonexistent-file-12345.txt'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('File not found'); - } - }); - - test('upload missing args throws', async () => { - try { - await handleWriteCommand('upload', ['#file-input'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); -}); - -// ─── Eval command ─────────────────────────────────────────────── - -describe('Eval', () => { - test('eval runs JS file', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const tempFile = '/tmp/browse-test-eval.js'; - fs.writeFileSync(tempFile, 'document.title + " — evaluated"'); - const result = await handleReadCommand('eval', [tempFile], bm); - expect(result).toBe('Test Page - Basic — evaluated'); - fs.unlinkSync(tempFile); - }); - - test('eval returns object as JSON', async () => { - const tempFile = '/tmp/browse-test-eval-obj.js'; - fs.writeFileSync(tempFile, 
'({title: document.title, keys: Object.keys(document.body.dataset)})'); - const result = await handleReadCommand('eval', [tempFile], bm); - const obj = JSON.parse(result); - expect(obj.title).toBe('Test Page - Basic'); - expect(Array.isArray(obj.keys)).toBe(true); - fs.unlinkSync(tempFile); - }); - - test('eval file not found throws', async () => { - try { - await handleReadCommand('eval', ['/tmp/nonexistent-eval.js'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('File not found'); - } - }); - - test('eval no arg throws', async () => { - try { - await handleReadCommand('eval', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); -}); - -// ─── Press command ────────────────────────────────────────────── - -describe('Press', () => { - test('press Tab moves focus', async () => { - await handleWriteCommand('goto', [baseUrl + '/forms.html'], bm); - await handleWriteCommand('click', ['#email'], bm); - const result = await handleWriteCommand('press', ['Tab'], bm); - expect(result).toContain('Pressed Tab'); - }); - - test('press no arg throws', async () => { - try { - await handleWriteCommand('press', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); -}); - -// ─── Cookie command ───────────────────────────────────────────── - -describe('Cookie command', () => { - test('cookie sets value', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleWriteCommand('cookie', ['testcookie=testvalue'], bm); - expect(result).toContain('Cookie set'); - - const cookies = await handleReadCommand('cookies', [], bm); - expect(cookies).toContain('testcookie'); - expect(cookies).toContain('testvalue'); - }); - - test('cookie no arg throws', async () => { - try { - await handleWriteCommand('cookie', [], bm); - expect(true).toBe(false); - } catch (err: any) { - 
expect(err.message).toContain('Usage'); - } - }); - - test('cookie no = throws', async () => { - try { - await handleWriteCommand('cookie', ['invalid'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); -}); - -// ─── Header command ───────────────────────────────────────────── - -describe('Header command', () => { - test('header sets value and is sent', async () => { - const result = await handleWriteCommand('header', ['X-Test:test-value'], bm); - expect(result).toContain('Header set'); - - await handleWriteCommand('goto', [baseUrl + '/echo'], bm); - const echoText = await handleReadCommand('text', [], bm); - expect(echoText).toContain('x-test'); - expect(echoText).toContain('test-value'); - }); - - test('header no arg throws', async () => { - try { - await handleWriteCommand('header', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('header no colon throws', async () => { - try { - await handleWriteCommand('header', ['invalid'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); -}); - -// ─── PDF command ──────────────────────────────────────────────── - -describe('PDF', () => { - test('pdf saves file with size', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const pdfPath = '/tmp/browse-test.pdf'; - const result = await handleMetaCommand('pdf', [pdfPath], bm, async () => {}); - expect(result).toContain('PDF saved'); - expect(fs.existsSync(pdfPath)).toBe(true); - const stat = fs.statSync(pdfPath); - expect(stat.size).toBeGreaterThan(100); - fs.unlinkSync(pdfPath); - }); -}); - -// ─── Empty page edge cases ────────────────────────────────────── - -describe('Empty page', () => { - test('text returns empty on empty page', async () => { - await handleWriteCommand('goto', [baseUrl + '/empty.html'], bm); - const result = await 
handleReadCommand('text', [], bm); - expect(result).toBe(''); - }); - - test('links returns empty on empty page', async () => { - const result = await handleReadCommand('links', [], bm); - expect(result).toBe(''); - }); - - test('forms returns empty array on empty page', async () => { - const result = await handleReadCommand('forms', [], bm); - expect(JSON.parse(result)).toEqual([]); - }); -}); - -// ─── Error paths ──────────────────────────────────────────────── - -describe('Errors', () => { - // Write command errors - test('goto with no arg throws', async () => { - try { - await handleWriteCommand('goto', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('click with no arg throws', async () => { - try { - await handleWriteCommand('click', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('fill with no value throws', async () => { - try { - await handleWriteCommand('fill', ['#input'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('select with no value throws', async () => { - try { - await handleWriteCommand('select', ['#sel'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('hover with no arg throws', async () => { - try { - await handleWriteCommand('hover', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('type with no arg throws', async () => { - try { - await handleWriteCommand('type', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('wait with no arg throws', async () => { - try { - await handleWriteCommand('wait', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - 
test('viewport with bad format throws', async () => { - try { - await handleWriteCommand('viewport', ['badformat'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('useragent with no arg throws', async () => { - try { - await handleWriteCommand('useragent', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - // Read command errors - test('js with no expression throws', async () => { - try { - await handleReadCommand('js', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('css with missing property throws', async () => { - try { - await handleReadCommand('css', ['h1'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('attrs with no selector throws', async () => { - try { - await handleReadCommand('attrs', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - // Meta command errors - test('tab with non-numeric id throws', async () => { - try { - await handleMetaCommand('tab', ['abc'], bm, async () => {}); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('diff with missing urls throws', async () => { - try { - await handleMetaCommand('diff', [baseUrl + '/basic.html'], bm, async () => {}); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('chain with invalid JSON throws', async () => { - try { - await handleMetaCommand('chain', ['not json'], bm, async () => {}); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Invalid JSON'); - } - }); - - test('chain with no arg throws', async () => { - try { - await handleMetaCommand('chain', [], bm, async () => {}); - expect(true).toBe(false); - } 
catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('unknown read command throws', async () => { - try { - await handleReadCommand('bogus' as any, [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Unknown'); - } - }); - - test('unknown write command throws', async () => { - try { - await handleWriteCommand('bogus' as any, [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Unknown'); - } - }); - - test('unknown meta command throws', async () => { - try { - await handleMetaCommand('bogus' as any, [], bm, async () => {}); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Unknown'); - } - }); -}); - -// ─── Workflow: Navigation + Snapshot + Interaction ─────────────── - -describe('Workflows', () => { - test('navigation → snapshot → click @ref → verify URL', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const snap = await handleMetaCommand('snapshot', ['-i'], bm, async () => {}); - // Find a link ref - const linkLine = snap.split('\n').find(l => l.includes('[link]')); - expect(linkLine).toBeDefined(); - const refMatch = linkLine!.match(/@(e\d+)/); - expect(refMatch).toBeDefined(); - // Click the link - await handleWriteCommand('click', [`@${refMatch![1]}`], bm); - // URL should have changed - const url = await handleMetaCommand('url', [], bm, async () => {}); - expect(url).toBeTruthy(); - }); - - test('form: goto → snapshot → fill @ref → click @ref', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const snap = await handleMetaCommand('snapshot', ['-i'], bm, async () => {}); - // Find textbox and button - const textboxLine = snap.split('\n').find(l => l.includes('[textbox]')); - const buttonLine = snap.split('\n').find(l => l.includes('[button]') && l.includes('"Submit"')); - if (textboxLine && buttonLine) { - const textRef = 
textboxLine.match(/@(e\d+)/)![1]; - const btnRef = buttonLine.match(/@(e\d+)/)![1]; - await handleWriteCommand('fill', [`@${textRef}`, 'testuser'], bm); - await handleWriteCommand('click', [`@${btnRef}`], bm); - } - }); - - test('tabs: newtab → goto → switch → verify isolation', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const tabsBefore = bm.getTabCount(); - await handleMetaCommand('newtab', [baseUrl + '/forms.html'], bm, async () => {}); - expect(bm.getTabCount()).toBe(tabsBefore + 1); - - const url = await handleMetaCommand('url', [], bm, async () => {}); - expect(url).toContain('/forms.html'); - - // Switch back to previous tab - const tabs = await bm.getTabListWithTitles(); - const prevTab = tabs.find(t => t.url.includes('/basic.html')); - if (prevTab) { - bm.switchTab(prevTab.id); - const url2 = await handleMetaCommand('url', [], bm, async () => {}); - expect(url2).toContain('/basic.html'); - } - - // Clean up extra tab - const allTabs = await bm.getTabListWithTitles(); - const formTab = allTabs.find(t => t.url.includes('/forms.html')); - if (formTab) await bm.closeTab(formTab.id); - }); - - test('cookies: set → read → reload → verify persistence', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - await handleWriteCommand('cookie', ['workflow-test=persisted'], bm); - await handleWriteCommand('reload', [], bm); - const cookies = await handleReadCommand('cookies', [], bm); - expect(cookies).toContain('workflow-test'); - expect(cookies).toContain('persisted'); - }); -}); - -// ─── Wait load states ────────────────────────────────────────── - -describe('Wait load states', () => { - test('wait --networkidle succeeds after page load', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleWriteCommand('wait', ['--networkidle'], bm); - expect(result).toBe('Network idle'); - }); - - test('wait --load succeeds', async () => { - await 
handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleWriteCommand('wait', ['--load'], bm); - expect(result).toBe('Page loaded'); - }); - - test('wait --domcontentloaded succeeds', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleWriteCommand('wait', ['--domcontentloaded'], bm); - expect(result).toBe('DOM content loaded'); - }); - - test('wait --networkidle with custom timeout', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleWriteCommand('wait', ['--networkidle', '5000'], bm); - expect(result).toBe('Network idle'); - }); - - test('wait with selector still works', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleWriteCommand('wait', ['#title'], bm); - expect(result).toContain('appeared'); - }); -}); - -// ─── Console --errors ────────────────────────────────────────── - -describe('Console --errors', () => { - test('console --errors filters to error and warning only', async () => { - // Clear existing entries - await handleReadCommand('console', ['--clear'], bm); - - // Add mixed entries - addConsoleEntry({ timestamp: Date.now(), level: 'log', text: 'info message' }); - addConsoleEntry({ timestamp: Date.now(), level: 'warning', text: 'warn message' }); - addConsoleEntry({ timestamp: Date.now(), level: 'error', text: 'error message' }); - - const result = await handleReadCommand('console', ['--errors'], bm); - expect(result).toContain('warn message'); - expect(result).toContain('error message'); - expect(result).not.toContain('info message'); - - // Cleanup - consoleBuffer.clear(); - }); - - test('console --errors returns empty message when no errors', async () => { - consoleBuffer.clear(); - addConsoleEntry({ timestamp: Date.now(), level: 'log', text: 'just a log' }); - - const result = await handleReadCommand('console', ['--errors'], bm); - 
expect(result).toBe('(no console errors)'); - - consoleBuffer.clear(); - }); - - test('console --errors on empty buffer', async () => { - consoleBuffer.clear(); - const result = await handleReadCommand('console', ['--errors'], bm); - expect(result).toBe('(no console errors)'); - }); - - test('console without flag still returns all messages', async () => { - consoleBuffer.clear(); - addConsoleEntry({ timestamp: Date.now(), level: 'log', text: 'all messages test' }); - - const result = await handleReadCommand('console', [], bm); - expect(result).toContain('all messages test'); - - consoleBuffer.clear(); - }); -}); - -// ─── Cookie Import ───────────────────────────────────────────── - -describe('Cookie import', () => { - test('cookie-import loads valid JSON cookies', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const tempFile = '/tmp/browse-test-cookies.json'; - const cookies = [ - { name: 'test-cookie', value: 'test-value' }, - { name: 'another', value: '123' }, - ]; - fs.writeFileSync(tempFile, JSON.stringify(cookies)); - - const result = await handleWriteCommand('cookie-import', [tempFile], bm); - expect(result).toBe('Loaded 2 cookies from /tmp/browse-test-cookies.json'); - - // Verify cookies were set - const cookieList = await handleReadCommand('cookies', [], bm); - expect(cookieList).toContain('test-cookie'); - expect(cookieList).toContain('test-value'); - expect(cookieList).toContain('another'); - - fs.unlinkSync(tempFile); - }); - - test('cookie-import auto-fills domain from page URL', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const tempFile = '/tmp/browse-test-cookies-nodomain.json'; - // Cookies without domain — should auto-fill from page URL - const cookies = [{ name: 'autofill-test', value: 'works' }]; - fs.writeFileSync(tempFile, JSON.stringify(cookies)); - - const result = await handleWriteCommand('cookie-import', [tempFile], bm); - expect(result).toContain('Loaded 1'); - - 
const cookieList = await handleReadCommand('cookies', [], bm); - expect(cookieList).toContain('autofill-test'); - - fs.unlinkSync(tempFile); - }); - - test('cookie-import preserves explicit domain', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const tempFile = '/tmp/browse-test-cookies-domain.json'; - const cookies = [{ name: 'explicit', value: 'domain', domain: 'example.com', path: '/foo' }]; - fs.writeFileSync(tempFile, JSON.stringify(cookies)); - - const result = await handleWriteCommand('cookie-import', [tempFile], bm); - expect(result).toContain('Loaded 1'); - - fs.unlinkSync(tempFile); - }); - - test('cookie-import with empty array succeeds', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const tempFile = '/tmp/browse-test-cookies-empty.json'; - fs.writeFileSync(tempFile, '[]'); - - const result = await handleWriteCommand('cookie-import', [tempFile], bm); - expect(result).toBe('Loaded 0 cookies from /tmp/browse-test-cookies-empty.json'); - - fs.unlinkSync(tempFile); - }); - - test('cookie-import throws on file not found', async () => { - try { - await handleWriteCommand('cookie-import', ['/tmp/nonexistent-cookies.json'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('File not found'); - } - }); - - test('cookie-import throws on invalid JSON', async () => { - const tempFile = '/tmp/browse-test-cookies-bad.json'; - fs.writeFileSync(tempFile, 'not json {{{'); - - try { - await handleWriteCommand('cookie-import', [tempFile], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Invalid JSON'); - } - - fs.unlinkSync(tempFile); - }); - - test('cookie-import throws on non-array JSON', async () => { - const tempFile = '/tmp/browse-test-cookies-obj.json'; - fs.writeFileSync(tempFile, '{"name": "not-an-array"}'); - - try { - await handleWriteCommand('cookie-import', [tempFile], bm); - expect(true).toBe(false); - } catch 
(err: any) { - expect(err.message).toContain('JSON array'); - } - - fs.unlinkSync(tempFile); - }); - - test('cookie-import throws on cookie missing name', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const tempFile = '/tmp/browse-test-cookies-noname.json'; - fs.writeFileSync(tempFile, JSON.stringify([{ value: 'no-name' }])); - - try { - await handleWriteCommand('cookie-import', [tempFile], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('name'); - } - - fs.unlinkSync(tempFile); - }); - - test('cookie-import no arg throws', async () => { - try { - await handleWriteCommand('cookie-import', [], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); -}); - -// ─── Security: Redact sensitive values (PR #21) ───────────────── - -describe('Sensitive value redaction', () => { - test('type command does not echo typed text', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleWriteCommand('type', ['my-secret-password'], bm); - expect(result).not.toContain('my-secret-password'); - expect(result).toContain('18 characters'); - }); - - test('cookie command redacts value', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleWriteCommand('cookie', ['session=secret123'], bm); - expect(result).toContain('session'); - expect(result).toContain('****'); - expect(result).not.toContain('secret123'); - }); - - test('header command redacts Authorization value', async () => { - const result = await handleWriteCommand('header', ['Authorization:Bearer token-xyz'], bm); - expect(result).toContain('Authorization'); - expect(result).toContain('****'); - expect(result).not.toContain('token-xyz'); - }); - - test('header command shows non-sensitive values', async () => { - const result = await handleWriteCommand('header', ['Content-Type:application/json'], 
bm); - expect(result).toContain('Content-Type'); - expect(result).toContain('application/json'); - expect(result).not.toContain('****'); - }); - - test('header command redacts X-API-Key', async () => { - const result = await handleWriteCommand('header', ['X-API-Key:sk-12345'], bm); - expect(result).toContain('X-API-Key'); - expect(result).toContain('****'); - expect(result).not.toContain('sk-12345'); - }); - - test('storage set does not echo value', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleReadCommand('storage', ['set', 'apiKey', 'secret-api-key-value'], bm); - expect(result).toContain('apiKey'); - expect(result).not.toContain('secret-api-key-value'); - }); - - test('forms redacts password field values', async () => { - await handleWriteCommand('goto', [baseUrl + '/forms.html'], bm); - const formsResult = await handleReadCommand('forms', [], bm); - const forms = JSON.parse(formsResult); - // Find password fields and verify they're redacted - for (const form of forms) { - for (const field of form.fields) { - if (field.type === 'password') { - expect(field.value === undefined || field.value === '[redacted]').toBe(true); - } - } - } - }); -}); - -// ─── Security: Path traversal prevention (PR #26) ─────────────── - -describe('Path traversal prevention', () => { - test('screenshot rejects path outside safe dirs', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - try { - await handleMetaCommand('screenshot', ['/etc/evil.png'], bm, () => {}); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Path must be within'); - } - }); - - test('screenshot allows /tmp path', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleMetaCommand('screenshot', ['/tmp/test-safe.png'], bm, () => {}); - expect(result).toContain('Screenshot saved'); - try { fs.unlinkSync('/tmp/test-safe.png'); } catch {} 
- }); - - test('pdf rejects path outside safe dirs', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - try { - await handleMetaCommand('pdf', ['/home/evil.pdf'], bm, () => {}); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Path must be within'); - } - }); - - test('responsive rejects path outside safe dirs', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - try { - await handleMetaCommand('responsive', ['/var/evil'], bm, () => {}); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Path must be within'); - } - }); - - test('eval rejects path traversal with ..', async () => { - try { - await handleReadCommand('eval', ['../../etc/passwd'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Path traversal'); - } - }); - - test('eval rejects absolute path outside safe dirs', async () => { - try { - await handleReadCommand('eval', ['/etc/passwd'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Absolute path must be within'); - } - }); - - test('eval allows /tmp path', async () => { - const tmpFile = '/tmp/test-eval-safe.js'; - fs.writeFileSync(tmpFile, 'document.title'); - try { - const result = await handleReadCommand('eval', [tmpFile], bm); - expect(typeof result).toBe('string'); - } finally { - try { fs.unlinkSync(tmpFile); } catch {} - } - }); - - test('screenshot rejects /tmpevil prefix collision', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - try { - await handleMetaCommand('screenshot', ['/tmpevil/steal.png'], bm, () => {}); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Path must be within'); - } - }); - - test('cookie-import rejects path traversal', async () => { - try { - await handleWriteCommand('cookie-import', ['../../etc/shadow'], bm); - expect(true).toBe(false); - } 
catch (err: any) { - expect(err.message).toContain('Path traversal'); - } - }); - - test('cookie-import rejects absolute path outside safe dirs', async () => { - try { - await handleWriteCommand('cookie-import', ['/etc/passwd'], bm); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Path must be within'); - } - }); - - test('snapshot -a -o rejects path outside safe dirs', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - // First get a snapshot so refs exist - await handleMetaCommand('snapshot', ['-i'], bm, () => {}); - try { - await handleMetaCommand('snapshot', ['-a', '-o', '/etc/evil.png'], bm, () => {}); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Path must be within'); - } - }); -}); - -// ─── Chain command: cookie-import in chain ────────────────────── - -describe('Chain with cookie-import', () => { - test('cookie-import works inside chain', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const tmpCookies = '/tmp/test-chain-cookies.json'; - fs.writeFileSync(tmpCookies, JSON.stringify([ - { name: 'chain_test', value: 'chain_value', domain: 'localhost', path: '/' } - ])); - try { - const commands = JSON.stringify([ - ['cookie-import', tmpCookies], - ]); - const result = await handleMetaCommand('chain', [commands], bm, async () => {}); - expect(result).toContain('[cookie-import]'); - expect(result).toContain('Loaded 1 cookie'); - } finally { - try { fs.unlinkSync(tmpCookies); } catch {} - } - }); -}); diff --git a/browse/test/config.test.ts b/browse/test/config.test.ts deleted file mode 100644 index 780385f..0000000 --- a/browse/test/config.test.ts +++ /dev/null @@ -1,125 +0,0 @@ -import { describe, test, expect } from 'bun:test'; -import { resolveConfig, ensureStateDir, readVersionHash, getGitRoot } from '../src/config'; -import * as fs from 'fs'; -import * as path from 'path'; -import * as os from 'os'; - 
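The resolveConfig tests below pin down how paths derive from `BROWSE_STATE_FILE`: the derivation amounts to two `path.dirname` calls. A minimal standalone sketch (`derivePaths` is a hypothetical name for illustration; the real logic lives in the deleted `src/config.ts`):

```typescript
import * as path from 'path';

// Sketch of the derivation the resolveConfig tests exercise: the state
// file's directory is the state dir, and its parent is the project dir.
function derivePaths(stateFile: string) {
  const stateDir = path.dirname(stateFile);
  const projectDir = path.dirname(stateDir);
  return { stateFile, stateDir, projectDir };
}

const p = derivePaths('/tmp/test-config/.gstack/browse.json');
console.log(p.stateDir, p.projectDir);
// → /tmp/test-config/.gstack /tmp/test-config
```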
-describe('config', () => { - describe('getGitRoot', () => { - test('returns a path when in a git repo', () => { - const root = getGitRoot(); - expect(root).not.toBeNull(); - expect(fs.existsSync(path.join(root!, '.git'))).toBe(true); - }); - }); - - describe('resolveConfig', () => { - test('uses git root by default', () => { - const config = resolveConfig({}); - const gitRoot = getGitRoot(); - expect(gitRoot).not.toBeNull(); - expect(config.projectDir).toBe(gitRoot); - expect(config.stateDir).toBe(path.join(gitRoot!, '.gstack')); - expect(config.stateFile).toBe(path.join(gitRoot!, '.gstack', 'browse.json')); - }); - - test('derives paths from BROWSE_STATE_FILE when set', () => { - const stateFile = '/tmp/test-config/.gstack/browse.json'; - const config = resolveConfig({ BROWSE_STATE_FILE: stateFile }); - expect(config.stateFile).toBe(stateFile); - expect(config.stateDir).toBe('/tmp/test-config/.gstack'); - expect(config.projectDir).toBe('/tmp/test-config'); - }); - - test('log paths are in stateDir', () => { - const config = resolveConfig({}); - expect(config.consoleLog).toBe(path.join(config.stateDir, 'browse-console.log')); - expect(config.networkLog).toBe(path.join(config.stateDir, 'browse-network.log')); - expect(config.dialogLog).toBe(path.join(config.stateDir, 'browse-dialog.log')); - }); - }); - - describe('ensureStateDir', () => { - test('creates directory if it does not exist', () => { - const tmpDir = path.join(os.tmpdir(), `browse-config-test-${Date.now()}`); - const config = resolveConfig({ BROWSE_STATE_FILE: path.join(tmpDir, '.gstack', 'browse.json') }); - expect(fs.existsSync(config.stateDir)).toBe(false); - ensureStateDir(config); - expect(fs.existsSync(config.stateDir)).toBe(true); - // Cleanup - fs.rmSync(tmpDir, { recursive: true, force: true }); - }); - - test('is a no-op if directory already exists', () => { - const tmpDir = path.join(os.tmpdir(), `browse-config-test-${Date.now()}`); - const stateDir = path.join(tmpDir, '.gstack'); - 
fs.mkdirSync(stateDir, { recursive: true }); - const config = resolveConfig({ BROWSE_STATE_FILE: path.join(stateDir, 'browse.json') }); - ensureStateDir(config); // should not throw - expect(fs.existsSync(config.stateDir)).toBe(true); - // Cleanup - fs.rmSync(tmpDir, { recursive: true, force: true }); - }); - }); - - describe('readVersionHash', () => { - test('returns null when .version file does not exist', () => { - const result = readVersionHash('/nonexistent/path/browse'); - expect(result).toBeNull(); - }); - - test('reads version from .version file adjacent to execPath', () => { - const tmpDir = path.join(os.tmpdir(), `browse-version-test-${Date.now()}`); - fs.mkdirSync(tmpDir, { recursive: true }); - const versionFile = path.join(tmpDir, '.version'); - fs.writeFileSync(versionFile, 'abc123def\n'); - const result = readVersionHash(path.join(tmpDir, 'browse')); - expect(result).toBe('abc123def'); - // Cleanup - fs.rmSync(tmpDir, { recursive: true, force: true }); - }); - }); -}); - -describe('resolveServerScript', () => { - // Import the function from cli.ts - const { resolveServerScript } = require('../src/cli'); - - test('uses BROWSE_SERVER_SCRIPT env when set', () => { - const result = resolveServerScript({ BROWSE_SERVER_SCRIPT: '/custom/server.ts' }, '', ''); - expect(result).toBe('/custom/server.ts'); - }); - - test('finds server.ts adjacent to cli.ts in dev mode', () => { - const srcDir = path.resolve(__dirname, '../src'); - const result = resolveServerScript({}, srcDir, ''); - expect(result).toBe(path.join(srcDir, 'server.ts')); - }); - - test('throws when server.ts cannot be found', () => { - expect(() => resolveServerScript({}, '/nonexistent/$bunfs', '/nonexistent/browse')) - .toThrow('Cannot find server.ts'); - }); -}); - -describe('version mismatch detection', () => { - test('detects when versions differ', () => { - const stateVersion = 'abc123'; - const currentVersion = 'def456'; - expect(stateVersion !== currentVersion).toBe(true); - }); - - 
test('no mismatch when versions match', () => { - const stateVersion = 'abc123'; - const currentVersion = 'abc123'; - expect(stateVersion !== currentVersion).toBe(false); - }); - - test('no mismatch when either version is null', () => { - const currentVersion: string | null = null; - const stateVersion: string | undefined = 'abc123'; - // Version mismatch only triggers when both are present - const shouldRestart = currentVersion !== null && stateVersion !== undefined && currentVersion !== stateVersion; - expect(shouldRestart).toBe(false); - }); -}); diff --git a/browse/test/cookie-import-browser.test.ts b/browse/test/cookie-import-browser.test.ts deleted file mode 100644 index 1e91cf1..0000000 --- a/browse/test/cookie-import-browser.test.ts +++ /dev/null @@ -1,397 +0,0 @@ -/** - * Unit tests for cookie-import-browser.ts - * - * Uses a fixture SQLite database with cookies encrypted using a known test key. - * Mocks Keychain access to return the test password. - * - * Test key derivation (matches real Chromium pipeline): - * password = "test-keychain-password" - * key = PBKDF2(password, "saltysalt", 1003, 16, sha1) - * - * Encryption: AES-128-CBC with IV = 16 × 0x20, prefix "v10" - * First 32 bytes of plaintext = HMAC-SHA256 tag (random for tests) - * Remaining bytes = actual cookie value - */ - -import { describe, test, expect, beforeAll, afterAll, mock } from 'bun:test'; -import { Database } from 'bun:sqlite'; -import * as crypto from 'crypto'; -import * as fs from 'fs'; -import * as path from 'path'; -import * as os from 'os'; - -// ─── Test Constants ───────────────────────────────────────────── - -const TEST_PASSWORD = 'test-keychain-password'; -const TEST_KEY = crypto.pbkdf2Sync(TEST_PASSWORD, 'saltysalt', 1003, 16, 'sha1'); -const IV = Buffer.alloc(16, 0x20); -const CHROMIUM_EPOCH_OFFSET = 11644473600000000n; - -// Fixture DB path -const FIXTURE_DIR = path.join(import.meta.dir, 'fixtures'); -const FIXTURE_DB = path.join(FIXTURE_DIR, 'test-cookies.db'); - -// 
─── Encryption Helper ────────────────────────────────────────── - -function encryptCookieValue(value: string): Buffer { - // 32-byte HMAC tag (random for test) + actual value - const hmacTag = crypto.randomBytes(32); - const plaintext = Buffer.concat([hmacTag, Buffer.from(value, 'utf-8')]); - - // PKCS7 pad to AES block size (16 bytes) - const blockSize = 16; - const padLen = blockSize - (plaintext.length % blockSize); - const padded = Buffer.concat([plaintext, Buffer.alloc(padLen, padLen)]); - - const cipher = crypto.createCipheriv('aes-128-cbc', TEST_KEY, IV); - cipher.setAutoPadding(false); // We padded manually - const encrypted = Buffer.concat([cipher.update(padded), cipher.final()]); - - // Prefix with "v10" - return Buffer.concat([Buffer.from('v10'), encrypted]); -} - -function chromiumEpoch(unixSeconds: number): bigint { - return BigInt(unixSeconds) * 1000000n + CHROMIUM_EPOCH_OFFSET; -} - -// ─── Create Fixture Database ──────────────────────────────────── - -function createFixtureDb() { - fs.mkdirSync(FIXTURE_DIR, { recursive: true }); - if (fs.existsSync(FIXTURE_DB)) fs.unlinkSync(FIXTURE_DB); - - const db = new Database(FIXTURE_DB); - db.run(`CREATE TABLE cookies ( - host_key TEXT NOT NULL, - name TEXT NOT NULL, - value TEXT NOT NULL DEFAULT '', - encrypted_value BLOB NOT NULL DEFAULT x'', - path TEXT NOT NULL DEFAULT '/', - expires_utc INTEGER NOT NULL DEFAULT 0, - is_secure INTEGER NOT NULL DEFAULT 0, - is_httponly INTEGER NOT NULL DEFAULT 0, - has_expires INTEGER NOT NULL DEFAULT 0, - samesite INTEGER NOT NULL DEFAULT 1 - )`); - - const insert = db.prepare(`INSERT INTO cookies - (host_key, name, value, encrypted_value, path, expires_utc, is_secure, is_httponly, has_expires, samesite) - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`); - - const futureExpiry = Number(chromiumEpoch(Math.floor(Date.now() / 1000) + 86400 * 365)); - const pastExpiry = Number(chromiumEpoch(Math.floor(Date.now() / 1000) - 86400)); - - // Domain 1: .github.com — 3 encrypted cookies 
- insert.run('.github.com', 'session_id', '', encryptCookieValue('abc123'), '/', futureExpiry, 1, 1, 1, 1); - insert.run('.github.com', 'user_token', '', encryptCookieValue('token-xyz'), '/', futureExpiry, 1, 0, 1, 0); - insert.run('.github.com', 'theme', '', encryptCookieValue('dark'), '/', futureExpiry, 0, 0, 1, 2); - - // Domain 2: .google.com — 2 cookies - insert.run('.google.com', 'NID', '', encryptCookieValue('google-nid-value'), '/', futureExpiry, 1, 1, 1, 0); - insert.run('.google.com', 'SID', '', encryptCookieValue('google-sid-value'), '/', futureExpiry, 1, 1, 1, 1); - - // Domain 3: .example.com — 1 unencrypted cookie (value field set, no encrypted_value) - insert.run('.example.com', 'plain_cookie', 'hello-world', Buffer.alloc(0), '/', futureExpiry, 0, 0, 1, 1); - - // Domain 4: .expired.com — 1 expired cookie (should be filtered out) - insert.run('.expired.com', 'old', '', encryptCookieValue('expired-value'), '/', pastExpiry, 0, 0, 1, 1); - - // Domain 5: .session.com — session cookie (has_expires=0) - insert.run('.session.com', 'sess', '', encryptCookieValue('session-value'), '/', 0, 1, 1, 0, 1); - - // Domain 6: .corrupt.com — cookie with garbage encrypted_value - insert.run('.corrupt.com', 'bad', '', Buffer.from('v10' + 'not-valid-ciphertext-at-all'), '/', futureExpiry, 0, 0, 1, 1); - - // Domain 7: .mixed.com — one good, one corrupt - insert.run('.mixed.com', 'good', '', encryptCookieValue('mixed-good'), '/', futureExpiry, 0, 0, 1, 1); - insert.run('.mixed.com', 'bad', '', Buffer.from('v10' + 'garbage-data-here!!!'), '/', futureExpiry, 0, 0, 1, 1); - - db.close(); -} - -// ─── Mock Setup ───────────────────────────────────────────────── -// We need to mock: -// 1. The Keychain access (getKeychainPassword) to return TEST_PASSWORD -// 2. 
The cookie DB path resolution to use our fixture DB - -// We'll import the module after setting up the mocks -let findInstalledBrowsers: any; -let listDomains: any; -let importCookies: any; -let CookieImportError: any; - -beforeAll(async () => { - createFixtureDb(); - - // Mock Bun.spawn to return test password for keychain access - const origSpawn = Bun.spawn; - // @ts-ignore - monkey-patching for test - Bun.spawn = function(cmd: any, opts: any) { - // Intercept security find-generic-password calls - if (Array.isArray(cmd) && cmd[0] === 'security' && cmd[1] === 'find-generic-password') { - const service = cmd[3]; // -s - // Return test password for any known test service - return { - stdout: new ReadableStream({ - start(controller) { - controller.enqueue(new TextEncoder().encode(TEST_PASSWORD + '\n')); - controller.close(); - } - }), - stderr: new ReadableStream({ - start(controller) { controller.close(); } - }), - exited: Promise.resolve(0), - kill: () => {}, - }; - } - // Pass through other spawn calls - return origSpawn(cmd, opts); - }; - - // Import the module (uses our mocked Bun.spawn) - const mod = await import('../src/cookie-import-browser'); - findInstalledBrowsers = mod.findInstalledBrowsers; - listDomains = mod.listDomains; - importCookies = mod.importCookies; - CookieImportError = mod.CookieImportError; -}); - -afterAll(() => { - // Clean up fixture DB - try { fs.unlinkSync(FIXTURE_DB); } catch {} - try { fs.rmdirSync(FIXTURE_DIR); } catch {} -}); - -// ─── Helper: Override DB path for tests ───────────────────────── -// The real code resolves paths via ~/Library/Application Support//Default/Cookies -// We need to test against our fixture DB directly. We'll test the pure decryption functions -// by calling importCookies with a browser that points to our fixture. 
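The tests in this file repeat the decrypt side of the pipeline inline (strip prefix, AES-decrypt, skip tag). As a reference, the inverse of `encryptCookieValue` can be sketched as a standalone helper (`decryptCookieValue` is a hypothetical name for illustration, not an export of the module under test):

```typescript
import * as crypto from 'crypto';

// Same key derivation and IV as the fixture setup above.
const KEY = crypto.pbkdf2Sync('test-keychain-password', 'saltysalt', 1003, 16, 'sha1');
const IV = Buffer.alloc(16, 0x20);

// Inverse of encryptCookieValue: strip the "v10" version prefix,
// AES-128-CBC-decrypt (auto-unpadding removes the PKCS7 padding),
// then drop the leading 32-byte HMAC tag.
function decryptCookieValue(encrypted: Buffer): string {
  if (encrypted.subarray(0, 3).toString() !== 'v10') {
    throw new Error('unsupported cookie encryption version');
  }
  const decipher = crypto.createDecipheriv('aes-128-cbc', KEY, IV);
  const plaintext = Buffer.concat([
    decipher.update(encrypted.subarray(3)),
    decipher.final(),
  ]);
  return plaintext.subarray(32).toString('utf-8');
}
```

With such a helper, the inline round-trips below reduce to `decryptCookieValue(encryptCookieValue(v)) === v`.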
-// Since the module uses a hardcoded registry, we test the decryption logic via a different approach: -// We'll directly call the internal decryption by setting up the DB in the expected location. - -// For the unit tests below, we test the decryption pipeline by: -// 1. Creating encrypted cookies with known values -// 2. Decrypting them with the module's decryption logic -// The actual DB path resolution is tested separately. - -// ─── Tests ────────────────────────────────────────────────────── - -describe('Cookie Import Browser', () => { - - describe('Decryption Pipeline', () => { - test('encrypts and decrypts round-trip correctly', () => { - // Verify our test helper produces valid ciphertext - const encrypted = encryptCookieValue('hello-world'); - expect(encrypted.slice(0, 3).toString()).toBe('v10'); - - // Decrypt manually to verify - const ciphertext = encrypted.slice(3); - const decipher = crypto.createDecipheriv('aes-128-cbc', TEST_KEY, IV); - const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]); - // Skip 32-byte HMAC tag - const value = plaintext.slice(32).toString('utf-8'); - expect(value).toBe('hello-world'); - }); - - test('handles empty encrypted_value', () => { - const encrypted = encryptCookieValue(''); - const ciphertext = encrypted.slice(3); - const decipher = crypto.createDecipheriv('aes-128-cbc', TEST_KEY, IV); - const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]); - // 32-byte tag + empty value → slice(32) = empty - expect(plaintext.length).toBe(32); // just the HMAC tag, padded to block boundary? 
Yes: padded to 48 ciphertext bytes; decryption strips the PKCS7 padding, leaving exactly the 32-byte tag - }); - - test('handles special characters in cookie values', () => { - const specialValue = 'a=b&c=d; path=/; expires=Thu, 01 Jan 2099'; - const encrypted = encryptCookieValue(specialValue); - const ciphertext = encrypted.slice(3); - const decipher = crypto.createDecipheriv('aes-128-cbc', TEST_KEY, IV); - const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]); - expect(plaintext.slice(32).toString('utf-8')).toBe(specialValue); - }); - }); - - describe('Fixture DB Structure', () => { - test('fixture DB has correct domain counts', () => { - const db = new Database(FIXTURE_DB, { readonly: true }); - const rows = db.query( - `SELECT host_key, COUNT(*) as count FROM cookies GROUP BY host_key ORDER BY count DESC` - ).all() as any[]; - db.close(); - - const counts = Object.fromEntries(rows.map((r: any) => [r.host_key, r.count])); - expect(counts['.github.com']).toBe(3); - expect(counts['.google.com']).toBe(2); - expect(counts['.example.com']).toBe(1); - expect(counts['.expired.com']).toBe(1); - expect(counts['.session.com']).toBe(1); - expect(counts['.corrupt.com']).toBe(1); - expect(counts['.mixed.com']).toBe(2); - }); - - test('encrypted cookies in fixture have v10 prefix', () => { - const db = new Database(FIXTURE_DB, { readonly: true }); - const rows = db.query( - `SELECT name, encrypted_value FROM cookies WHERE host_key = '.github.com'` - ).all() as any[]; - db.close(); - - for (const row of rows) { - const ev = Buffer.from(row.encrypted_value); - expect(ev.slice(0, 3).toString()).toBe('v10'); - } - }); - - test('decrypts all github.com cookies from fixture DB', () => { - const db = new Database(FIXTURE_DB, { readonly: true }); - const rows = db.query( - `SELECT name, value, encrypted_value FROM cookies WHERE host_key = '.github.com'` - ).all() as any[]; - db.close(); - - const expected: 
Record<string, string> = {
-        'session_id': 'abc123',
-        'user_token': 'token-xyz',
-        'theme': 'dark',
-      };
-
-      for (const row of rows) {
-        const ev = Buffer.from(row.encrypted_value);
-        if (ev.length === 0) continue;
-        const ciphertext = ev.slice(3);
-        const decipher = crypto.createDecipheriv('aes-128-cbc', TEST_KEY, IV);
-        const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
-        const value = plaintext.slice(32).toString('utf-8');
-        expect(value).toBe(expected[row.name]);
-      }
-    });
-
-    test('unencrypted cookie uses value field directly', () => {
-      const db = new Database(FIXTURE_DB, { readonly: true });
-      const row = db.query(
-        `SELECT value, encrypted_value FROM cookies WHERE host_key = '.example.com'`
-      ).get() as any;
-      db.close();
-
-      expect(row.value).toBe('hello-world');
-      expect(Buffer.from(row.encrypted_value).length).toBe(0);
-    });
-  });
-
-  describe('sameSite Mapping', () => {
-    test('maps sameSite values correctly', () => {
-      // Read from fixture DB and verify mapping
-      const db = new Database(FIXTURE_DB, { readonly: true });
-
-      // samesite=0 → None
-      const none = db.query(`SELECT samesite FROM cookies WHERE name = 'user_token'`).get() as any;
-      expect(none.samesite).toBe(0);
-
-      // samesite=1 → Lax
-      const lax = db.query(`SELECT samesite FROM cookies WHERE name = 'session_id'`).get() as any;
-      expect(lax.samesite).toBe(1);
-
-      // samesite=2 → Strict
-      const strict = db.query(`SELECT samesite FROM cookies WHERE name = 'theme'`).get() as any;
-      expect(strict.samesite).toBe(2);
-
-      db.close();
-    });
-  });
-
-  describe('Chromium Epoch Conversion', () => {
-    test('converts Chromium epoch to Unix timestamp correctly', () => {
-      // Round-trip: pick a known Unix timestamp, convert to Chromium, convert back
-      const knownUnix = 1704067200; // 2024-01-01T00:00:00Z
-      const chromiumTs = BigInt(knownUnix) * 1000000n + CHROMIUM_EPOCH_OFFSET;
-      const unixTs = Number((chromiumTs - CHROMIUM_EPOCH_OFFSET) / 1000000n);
-      expect(unixTs).toBe(knownUnix);
-    });
-
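The round-trip test above encodes the one fact the conversion needs: Chromium stores cookie times as microseconds since 1601-01-01 UTC, and the 1601 and 1970 epochs are 11,644,473,600 seconds apart. A minimal standalone sketch of that conversion (the helper names here are illustrative, not the module's actual API):

```typescript
// Chromium cookie timestamps are microseconds since 1601-01-01 UTC;
// Unix time is seconds since 1970-01-01 UTC. The gap between the two
// epochs is 11,644,473,600 seconds, expressed here in microseconds.
const EPOCH_OFFSET_US = 11_644_473_600n * 1_000_000n;

// Illustrative helpers (not part of the module under test):
function chromiumToUnix(chromiumUs: bigint): number {
  // Shift to the Unix epoch, then drop sub-second precision.
  return Number((chromiumUs - EPOCH_OFFSET_US) / 1_000_000n);
}

function unixToChromium(unixSeconds: number): bigint {
  return BigInt(unixSeconds) * 1_000_000n + EPOCH_OFFSET_US;
}

// 2024-01-01T00:00:00Z round-trips exactly:
console.log(chromiumToUnix(unixToChromium(1704067200))); // 1704067200
```

BigInt arithmetic avoids the precision loss a `number` would suffer: Chromium timestamps exceed 2^53 microseconds, so doing this math in doubles would silently round.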
- test('session cookies (has_expires=0) get expires=-1', () => { - const db = new Database(FIXTURE_DB, { readonly: true }); - const row = db.query( - `SELECT has_expires, expires_utc FROM cookies WHERE host_key = '.session.com'` - ).get() as any; - db.close(); - expect(row.has_expires).toBe(0); - // When has_expires=0, the module should return expires=-1 - }); - }); - - describe('Error Handling', () => { - test('CookieImportError has correct properties', () => { - const err = new CookieImportError('test message', 'test_code', 'retry'); - expect(err.message).toBe('test message'); - expect(err.code).toBe('test_code'); - expect(err.action).toBe('retry'); - expect(err.name).toBe('CookieImportError'); - expect(err instanceof Error).toBe(true); - }); - - test('CookieImportError without action', () => { - const err = new CookieImportError('no action', 'some_code'); - expect(err.action).toBeUndefined(); - }); - }); - - describe('Browser Registry', () => { - test('findInstalledBrowsers returns array', () => { - const browsers = findInstalledBrowsers(); - expect(Array.isArray(browsers)).toBe(true); - // Each entry should have the right shape - for (const b of browsers) { - expect(b).toHaveProperty('name'); - expect(b).toHaveProperty('dataDir'); - expect(b).toHaveProperty('keychainService'); - expect(b).toHaveProperty('aliases'); - } - }); - }); - - describe('Corrupt Data Handling', () => { - test('garbage ciphertext produces decryption error', () => { - const garbage = Buffer.from('v10' + 'this-is-not-valid-ciphertext!!'); - const ciphertext = garbage.slice(3); - expect(() => { - const decipher = crypto.createDecipheriv('aes-128-cbc', TEST_KEY, IV); - Buffer.concat([decipher.update(ciphertext), decipher.final()]); - }).toThrow(); - }); - }); - - describe('Profile Validation', () => { - test('rejects path traversal in profile names', () => { - // The validateProfile function should reject profiles with / or .. 
- // We can't call it directly (internal), but we can test via listDomains - // which calls validateProfile - expect(() => listDomains('chrome', '../etc')).toThrow(/Invalid profile/); - expect(() => listDomains('chrome', 'Default/../../etc')).toThrow(/Invalid profile/); - }); - - test('rejects control characters in profile names', () => { - expect(() => listDomains('chrome', 'Default\x00evil')).toThrow(/Invalid profile/); - }); - }); - - describe('Unknown Browser', () => { - test('throws for unknown browser name', () => { - expect(() => listDomains('firefox')).toThrow(/Unknown browser.*firefox/i); - }); - - test('error includes list of supported browsers', () => { - try { - listDomains('firefox'); - throw new Error('Should have thrown'); - } catch (err: any) { - expect(err.code).toBe('unknown_browser'); - expect(err.message).toContain('comet'); - expect(err.message).toContain('chrome'); - } - }); - }); -}); diff --git a/browse/test/cookie-picker-routes.test.ts b/browse/test/cookie-picker-routes.test.ts deleted file mode 100644 index ca55c47..0000000 --- a/browse/test/cookie-picker-routes.test.ts +++ /dev/null @@ -1,205 +0,0 @@ -/** - * Tests for cookie-picker route handler - * - * Tests the HTTP glue layer directly with mock BrowserManager objects. - * Verifies that all routes return valid JSON (not HTML) with correct CORS headers. 
- */ - -import { describe, test, expect } from 'bun:test'; -import { handleCookiePickerRoute } from '../src/cookie-picker-routes'; - -// ─── Mock BrowserManager ────────────────────────────────────── - -function mockBrowserManager() { - const addedCookies: any[] = []; - const clearedDomains: string[] = []; - return { - bm: { - getPage: () => ({ - context: () => ({ - addCookies: (cookies: any[]) => { addedCookies.push(...cookies); }, - clearCookies: (opts: { domain: string }) => { clearedDomains.push(opts.domain); }, - }), - }), - } as any, - addedCookies, - clearedDomains, - }; -} - -function makeUrl(path: string, port = 9470): URL { - return new URL(`http://127.0.0.1:${port}${path}`); -} - -function makeReq(method: string, body?: any): Request { - const opts: RequestInit = { method }; - if (body) { - opts.body = JSON.stringify(body); - opts.headers = { 'Content-Type': 'application/json' }; - } - return new Request('http://127.0.0.1:9470', opts); -} - -// ─── Tests ────────────────────────────────────────────────────── - -describe('cookie-picker-routes', () => { - describe('CORS', () => { - test('OPTIONS returns 204 with correct CORS headers', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/browsers'); - const req = new Request('http://127.0.0.1:9470', { method: 'OPTIONS' }); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(204); - expect(res.headers.get('Access-Control-Allow-Origin')).toBe('http://127.0.0.1:9470'); - expect(res.headers.get('Access-Control-Allow-Methods')).toContain('POST'); - }); - - test('JSON responses include correct CORS origin with port', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/browsers', 9450); - const req = new Request('http://127.0.0.1:9450', { method: 'GET' }); - - const res = await handleCookiePickerRoute(url, req, bm); - - 
expect(res.headers.get('Access-Control-Allow-Origin')).toBe('http://127.0.0.1:9450'); - }); - }); - - describe('JSON responses (not HTML)', () => { - test('GET /cookie-picker/browsers returns JSON', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/browsers'); - const req = new Request('http://127.0.0.1:9470', { method: 'GET' }); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(200); - expect(res.headers.get('Content-Type')).toBe('application/json'); - const body = await res.json(); - expect(body).toHaveProperty('browsers'); - expect(Array.isArray(body.browsers)).toBe(true); - }); - - test('GET /cookie-picker/domains without browser param returns JSON error', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/domains'); - const req = new Request('http://127.0.0.1:9470', { method: 'GET' }); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(400); - expect(res.headers.get('Content-Type')).toBe('application/json'); - const body = await res.json(); - expect(body).toHaveProperty('error'); - expect(body).toHaveProperty('code', 'missing_param'); - }); - - test('POST /cookie-picker/import with invalid JSON returns JSON error', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/import'); - const req = new Request('http://127.0.0.1:9470', { - method: 'POST', - body: 'not json', - headers: { 'Content-Type': 'application/json' }, - }); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(400); - expect(res.headers.get('Content-Type')).toBe('application/json'); - const body = await res.json(); - expect(body.code).toBe('bad_request'); - }); - - test('POST /cookie-picker/import missing browser field returns JSON error', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/import'); - const req = makeReq('POST', 
{ domains: ['.example.com'] }); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(400); - const body = await res.json(); - expect(body.code).toBe('missing_param'); - }); - - test('POST /cookie-picker/import missing domains returns JSON error', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/import'); - const req = makeReq('POST', { browser: 'Chrome' }); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(400); - const body = await res.json(); - expect(body.code).toBe('missing_param'); - }); - - test('POST /cookie-picker/remove with invalid JSON returns JSON error', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/remove'); - const req = new Request('http://127.0.0.1:9470', { - method: 'POST', - body: '{bad', - headers: { 'Content-Type': 'application/json' }, - }); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(400); - expect(res.headers.get('Content-Type')).toBe('application/json'); - }); - - test('POST /cookie-picker/remove missing domains returns JSON error', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/remove'); - const req = makeReq('POST', {}); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(400); - const body = await res.json(); - expect(body.code).toBe('missing_param'); - }); - - test('GET /cookie-picker/imported returns JSON with domain list', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/imported'); - const req = new Request('http://127.0.0.1:9470', { method: 'GET' }); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(200); - expect(res.headers.get('Content-Type')).toBe('application/json'); - const body = await res.json(); - expect(body).toHaveProperty('domains'); - 
expect(body).toHaveProperty('totalDomains'); - expect(body).toHaveProperty('totalCookies'); - }); - }); - - describe('routing', () => { - test('GET /cookie-picker returns HTML', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker'); - const req = new Request('http://127.0.0.1:9470', { method: 'GET' }); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(200); - expect(res.headers.get('Content-Type')).toContain('text/html'); - }); - - test('unknown path returns 404', async () => { - const { bm } = mockBrowserManager(); - const url = makeUrl('/cookie-picker/nonexistent'); - const req = new Request('http://127.0.0.1:9470', { method: 'GET' }); - - const res = await handleCookiePickerRoute(url, req, bm); - - expect(res.status).toBe(404); - }); - }); -}); diff --git a/browse/test/find-browse.test.ts b/browse/test/find-browse.test.ts deleted file mode 100644 index 43e1300..0000000 --- a/browse/test/find-browse.test.ts +++ /dev/null @@ -1,144 +0,0 @@ -/** - * Tests for find-browse version check logic - * - * Tests the checkVersion() and locateBinary() functions directly. - * Uses temp directories with mock .version files and cache files. 
- */ - -import { describe, test, expect, beforeEach, afterEach } from 'bun:test'; -import { checkVersion, locateBinary } from '../src/find-browse'; -import { mkdtempSync, writeFileSync, rmSync, existsSync, mkdirSync } from 'fs'; -import { join } from 'path'; -import { tmpdir } from 'os'; - -let tempDir: string; - -beforeEach(() => { - tempDir = mkdtempSync(join(tmpdir(), 'find-browse-test-')); -}); - -afterEach(() => { - rmSync(tempDir, { recursive: true, force: true }); - // Clean up test cache - try { rmSync('/tmp/gstack-latest-version'); } catch {} -}); - -describe('checkVersion', () => { - test('returns null when .version file is missing', () => { - const result = checkVersion(tempDir); - expect(result).toBeNull(); - }); - - test('returns null when .version file is empty', () => { - writeFileSync(join(tempDir, '.version'), ''); - const result = checkVersion(tempDir); - expect(result).toBeNull(); - }); - - test('returns null when .version has only whitespace', () => { - writeFileSync(join(tempDir, '.version'), ' \n'); - const result = checkVersion(tempDir); - expect(result).toBeNull(); - }); - - test('returns null when local SHA matches remote (cache hit)', () => { - const sha = 'a'.repeat(40); - writeFileSync(join(tempDir, '.version'), sha); - // Write cache with same SHA, recent timestamp - const now = Math.floor(Date.now() / 1000); - writeFileSync('/tmp/gstack-latest-version', `${sha} ${now}\n`); - - const result = checkVersion(tempDir); - expect(result).toBeNull(); - }); - - test('returns META:UPDATE_AVAILABLE when SHAs differ (cache hit)', () => { - const localSha = 'a'.repeat(40); - const remoteSha = 'b'.repeat(40); - writeFileSync(join(tempDir, '.version'), localSha); - // Create a fake browse binary path so resolveSkillDir works - const browsePath = join(tempDir, 'browse'); - writeFileSync(browsePath, ''); - // Write cache with different SHA, recent timestamp - const now = Math.floor(Date.now() / 1000); - writeFileSync('/tmp/gstack-latest-version', 
`${remoteSha} ${now}\n`); - - const result = checkVersion(tempDir); - // Result may be null if resolveSkillDir can't determine skill dir from temp path - // That's expected — the META signal requires a known skill dir path - if (result !== null) { - expect(result).toStartWith('META:UPDATE_AVAILABLE'); - const jsonStr = result.replace('META:UPDATE_AVAILABLE ', ''); - const payload = JSON.parse(jsonStr); - expect(payload.current).toBe('a'.repeat(8)); - expect(payload.latest).toBe('b'.repeat(8)); - expect(payload.command).toContain('git stash'); - expect(payload.command).toContain('git reset --hard origin/main'); - expect(payload.command).toContain('./setup'); - } - }); - - test('uses cached SHA when cache is fresh (< 4hr)', () => { - const localSha = 'a'.repeat(40); - const remoteSha = 'a'.repeat(40); - writeFileSync(join(tempDir, '.version'), localSha); - // Cache is 1 hour old — should still be valid - const oneHourAgo = Math.floor(Date.now() / 1000) - 3600; - writeFileSync('/tmp/gstack-latest-version', `${remoteSha} ${oneHourAgo}\n`); - - const result = checkVersion(tempDir); - expect(result).toBeNull(); // SHAs match - }); - - test('treats expired cache as stale', () => { - const localSha = 'a'.repeat(40); - writeFileSync(join(tempDir, '.version'), localSha); - // Cache is 5 hours old — should be stale - const fiveHoursAgo = Math.floor(Date.now() / 1000) - 18000; - writeFileSync('/tmp/gstack-latest-version', `${'b'.repeat(40)} ${fiveHoursAgo}\n`); - - // This will try git ls-remote which may fail in test env — that's OK - // The important thing is it doesn't use the stale cache value - const result = checkVersion(tempDir); - // Result depends on whether git ls-remote succeeds in test environment - // If offline, returns null (graceful degradation) - expect(result === null || typeof result === 'string').toBe(true); - }); - - test('handles corrupt cache file gracefully', () => { - const localSha = 'a'.repeat(40); - writeFileSync(join(tempDir, '.version'), 
localSha); - writeFileSync('/tmp/gstack-latest-version', 'garbage data here'); - - // Should not throw, should treat as stale - const result = checkVersion(tempDir); - expect(result === null || typeof result === 'string').toBe(true); - }); - - test('handles cache with invalid SHA gracefully', () => { - const localSha = 'a'.repeat(40); - writeFileSync(join(tempDir, '.version'), localSha); - writeFileSync('/tmp/gstack-latest-version', `not-a-sha ${Math.floor(Date.now() / 1000)}\n`); - - // Invalid SHA should be treated as no cache - const result = checkVersion(tempDir); - expect(result === null || typeof result === 'string').toBe(true); - }); -}); - -describe('locateBinary', () => { - test('returns null when no binary exists at known paths', () => { - // This test depends on the test environment — if a real binary exists at - // ~/.claude/skills/gstack/browse/dist/browse, it will find it. - // We mainly test that the function doesn't throw. - const result = locateBinary(); - expect(result === null || typeof result === 'string').toBe(true); - }); - - test('returns string path when binary exists', () => { - const result = locateBinary(); - if (result !== null) { - expect(existsSync(result)).toBe(true); - } - }); -}); diff --git a/browse/test/fixtures/basic.html b/browse/test/fixtures/basic.html deleted file mode 100644 index 21904c8..0000000 --- a/browse/test/fixtures/basic.html +++ /dev/null @@ -1,33 +0,0 @@ - - - - - Test Page - Basic - - - - -

    Hello World

    -

    This is a highlighted paragraph.

    - -
    -

    Some body text here.

    -
      -
    • Item one
    • -
    • Item two
    • -
    • Item three
    • -
    -
    -
    Footer text
    - - diff --git a/browse/test/fixtures/cursor-interactive.html b/browse/test/fixtures/cursor-interactive.html deleted file mode 100644 index 0259081..0000000 --- a/browse/test/fixtures/cursor-interactive.html +++ /dev/null @@ -1,22 +0,0 @@ - - - - - Test Page - Cursor Interactive - - - -

    Cursor Interactive Test

    - -
    Click me (div)
    - Hover card (span) -
    Focusable div
    -
    Onclick div
    - - - Normal Link - - diff --git a/browse/test/fixtures/dialog.html b/browse/test/fixtures/dialog.html deleted file mode 100644 index bfc588a..0000000 --- a/browse/test/fixtures/dialog.html +++ /dev/null @@ -1,15 +0,0 @@ - - - - - Test Page - Dialog - - -

    Dialog Test

    - - - -

    -

    - - diff --git a/browse/test/fixtures/empty.html b/browse/test/fixtures/empty.html deleted file mode 100644 index 8ba582f..0000000 --- a/browse/test/fixtures/empty.html +++ /dev/null @@ -1,2 +0,0 @@ - - diff --git a/browse/test/fixtures/forms.html b/browse/test/fixtures/forms.html deleted file mode 100644 index 8a6b730..0000000 --- a/browse/test/fixtures/forms.html +++ /dev/null @@ -1,55 +0,0 @@ - - - - - Test Page - Forms - - - -

    Form Test Page

    - -
    - - - - - -
    - -
    - - - - - - - - -
    - -
    Form submitted!
    - - - - diff --git a/browse/test/fixtures/responsive.html b/browse/test/fixtures/responsive.html deleted file mode 100644 index 3c7c89d..0000000 --- a/browse/test/fixtures/responsive.html +++ /dev/null @@ -1,49 +0,0 @@ - - - - - - Test Page - Responsive - - - -
    -

    Responsive Layout Test

    -

    You are on mobile

    -

    You are on desktop

    -
    -
    Card 1
    -
    Card 2
    -
    Card 3
    -
    Card 4
    -
    Card 5
    -
    Card 6
    -
    -
    - - diff --git a/browse/test/fixtures/snapshot.html b/browse/test/fixtures/snapshot.html deleted file mode 100644 index 3753202..0000000 --- a/browse/test/fixtures/snapshot.html +++ /dev/null @@ -1,55 +0,0 @@ - - - - - Snapshot Test Page - - - -

    Snapshot Test

    -

    Subheading

    - - - -
    -

    Form Section

    -
    - - - - - - - -
    -
    - -
    -
    - -
    -
    - -

    Some paragraph text that is not interactive.

    - - - - diff --git a/browse/test/fixtures/spa.html b/browse/test/fixtures/spa.html deleted file mode 100644 index 2ea176d..0000000 --- a/browse/test/fixtures/spa.html +++ /dev/null @@ -1,24 +0,0 @@ - - - - - Test Page - SPA - - - -
    Loading...
    - - - diff --git a/browse/test/fixtures/states.html b/browse/test/fixtures/states.html deleted file mode 100644 index 67debbf..0000000 --- a/browse/test/fixtures/states.html +++ /dev/null @@ -1,17 +0,0 @@ - - - - - Test Page - Element States - - -

    Element States Test

    - - - - -
    Visible
    - - - - diff --git a/browse/test/fixtures/upload.html b/browse/test/fixtures/upload.html deleted file mode 100644 index bb8aca6..0000000 --- a/browse/test/fixtures/upload.html +++ /dev/null @@ -1,25 +0,0 @@ - - - - - Test Page - Upload - - -

    Upload Test

    - - -

    - - - diff --git a/browse/test/snapshot.test.ts b/browse/test/snapshot.test.ts deleted file mode 100644 index bc45f6a..0000000 --- a/browse/test/snapshot.test.ts +++ /dev/null @@ -1,418 +0,0 @@ -/** - * Snapshot command tests - * - * Tests: accessibility tree snapshots, ref-based element selection, - * ref invalidation on navigation, and ref resolution in commands. - */ - -import { describe, test, expect, beforeAll, afterAll } from 'bun:test'; -import { startTestServer } from './test-server'; -import { BrowserManager } from '../src/browser-manager'; -import { handleReadCommand } from '../src/read-commands'; -import { handleWriteCommand } from '../src/write-commands'; -import { handleMetaCommand } from '../src/meta-commands'; -import * as fs from 'fs'; - -let testServer: ReturnType; -let bm: BrowserManager; -let baseUrl: string; -const shutdown = async () => {}; - -beforeAll(async () => { - testServer = startTestServer(0); - baseUrl = testServer.url; - - bm = new BrowserManager(); - await bm.launch(); -}); - -afterAll(() => { - try { testServer.server.stop(); } catch {} - setTimeout(() => process.exit(0), 500); -}); - -// ─── Snapshot Output ──────────────────────────────────────────── - -describe('Snapshot', () => { - test('snapshot returns accessibility tree with refs', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const result = await handleMetaCommand('snapshot', [], bm, shutdown); - expect(result).toContain('@e'); - expect(result).toContain('[heading]'); - expect(result).toContain('"Snapshot Test"'); - expect(result).toContain('[button]'); - expect(result).toContain('[link]'); - }); - - test('snapshot -i returns only interactive elements', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const result = await handleMetaCommand('snapshot', ['-i'], bm, shutdown); - expect(result).toContain('[button]'); - expect(result).toContain('[link]'); - expect(result).toContain('[textbox]'); - // 
Should NOT contain non-interactive roles like heading or paragraph - expect(result).not.toContain('[heading]'); - }); - - test('snapshot -c returns compact output', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const full = await handleMetaCommand('snapshot', [], bm, shutdown); - const compact = await handleMetaCommand('snapshot', ['-c'], bm, shutdown); - // Compact should have fewer lines (empty structural elements removed) - const fullLines = full.split('\n').length; - const compactLines = compact.split('\n').length; - expect(compactLines).toBeLessThanOrEqual(fullLines); - }); - - test('snapshot -d 2 limits depth', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const shallow = await handleMetaCommand('snapshot', ['-d', '2'], bm, shutdown); - const deep = await handleMetaCommand('snapshot', [], bm, shutdown); - // Shallow should have fewer or equal lines - expect(shallow.split('\n').length).toBeLessThanOrEqual(deep.split('\n').length); - }); - - test('snapshot -s "#main" scopes to selector', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const scoped = await handleMetaCommand('snapshot', ['-s', '#main'], bm, shutdown); - // Should contain elements inside #main - expect(scoped).toContain('[button]'); - expect(scoped).toContain('"Submit"'); - // Should NOT contain elements outside #main (like nav links) - expect(scoped).not.toContain('"Internal Link"'); - }); - - test('snapshot on page with no interactive elements', async () => { - // Navigate to about:blank which has minimal content - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - const result = await handleMetaCommand('snapshot', ['-i'], bm, shutdown); - // basic.html has links, so this should find those - expect(result).toContain('[link]'); - }); - - test('second snapshot generates fresh refs', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); 
- const snap1 = await handleMetaCommand('snapshot', [], bm, shutdown); - const snap2 = await handleMetaCommand('snapshot', [], bm, shutdown); - // Both should have @e1 (refs restart from 1) - expect(snap1).toContain('@e1'); - expect(snap2).toContain('@e1'); - }); -}); - -// ─── Ref-Based Interaction ────────────────────────────────────── - -describe('Ref resolution', () => { - test('click @ref works after snapshot', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const snap = await handleMetaCommand('snapshot', ['-i'], bm, shutdown); - // Find a button ref - const buttonLine = snap.split('\n').find(l => l.includes('[button]') && l.includes('"Submit"')); - expect(buttonLine).toBeDefined(); - const refMatch = buttonLine!.match(/@(e\d+)/); - expect(refMatch).toBeDefined(); - const ref = `@${refMatch![1]}`; - const result = await handleWriteCommand('click', [ref], bm); - expect(result).toContain('Clicked'); - }); - - test('fill @ref works after snapshot', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const snap = await handleMetaCommand('snapshot', ['-i'], bm, shutdown); - // Find a textbox ref (Username) - const textboxLine = snap.split('\n').find(l => l.includes('[textbox]') && l.includes('"Username"')); - expect(textboxLine).toBeDefined(); - const refMatch = textboxLine!.match(/@(e\d+)/); - expect(refMatch).toBeDefined(); - const ref = `@${refMatch![1]}`; - const result = await handleWriteCommand('fill', [ref, 'testuser'], bm); - expect(result).toContain('Filled'); - }); - - test('hover @ref works after snapshot', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const snap = await handleMetaCommand('snapshot', ['-i'], bm, shutdown); - const linkLine = snap.split('\n').find(l => l.includes('[link]')); - expect(linkLine).toBeDefined(); - const refMatch = linkLine!.match(/@(e\d+)/); - const ref = `@${refMatch![1]}`; - const result = await 
handleWriteCommand('hover', [ref], bm); - expect(result).toContain('Hovered'); - }); - - test('html @ref returns innerHTML', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const snap = await handleMetaCommand('snapshot', [], bm, shutdown); - // Find a heading ref - const headingLine = snap.split('\n').find(l => l.includes('[heading]') && l.includes('"Snapshot Test"')); - expect(headingLine).toBeDefined(); - const refMatch = headingLine!.match(/@(e\d+)/); - const ref = `@${refMatch![1]}`; - const result = await handleReadCommand('html', [ref], bm); - expect(result).toContain('Snapshot Test'); - }); - - test('css @ref returns computed CSS', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const snap = await handleMetaCommand('snapshot', [], bm, shutdown); - const headingLine = snap.split('\n').find(l => l.includes('[heading]') && l.includes('"Snapshot Test"')); - const refMatch = headingLine!.match(/@(e\d+)/); - const ref = `@${refMatch![1]}`; - const result = await handleReadCommand('css', [ref, 'font-family'], bm); - expect(result).toBeTruthy(); - }); - - test('attrs @ref returns element attributes', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const snap = await handleMetaCommand('snapshot', ['-i'], bm, shutdown); - const textboxLine = snap.split('\n').find(l => l.includes('[textbox]') && l.includes('"Username"')); - const refMatch = textboxLine!.match(/@(e\d+)/); - const ref = `@${refMatch![1]}`; - const result = await handleReadCommand('attrs', [ref], bm); - expect(result).toContain('id'); - }); -}); - -// ─── Ref Invalidation ─────────────────────────────────────────── - -describe('Ref invalidation', () => { - test('stale ref after goto returns clear error', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - await handleMetaCommand('snapshot', ['-i'], bm, shutdown); - // Navigate away — should invalidate refs - 
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - // Try to use old ref - try { - await handleWriteCommand('click', ['@e1'], bm); - expect(true).toBe(false); // Should not reach here - } catch (err: any) { - expect(err.message).toContain('not found'); - expect(err.message).toContain('snapshot'); - } - }); - - test('refs cleared on page navigation', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - await handleMetaCommand('snapshot', ['-i'], bm, shutdown); - expect(bm.getRefCount()).toBeGreaterThan(0); - // Navigate - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - expect(bm.getRefCount()).toBe(0); - }); -}); - -// ─── Snapshot Diffing ────────────────────────────────────────── - -describe('Snapshot diff', () => { - test('first snapshot -D stores baseline', async () => { - // Clear any previous snapshot - bm.setLastSnapshot(null); - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const result = await handleMetaCommand('snapshot', ['-D'], bm, shutdown); - expect(result).toContain('no previous snapshot'); - expect(result).toContain('baseline'); - }); - - test('snapshot -D shows diff after change', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - // Take first snapshot - await handleMetaCommand('snapshot', [], bm, shutdown); - // Modify DOM - await handleReadCommand('js', ['document.querySelector("h1").textContent = "Changed Title"'], bm); - // Take diff - const diff = await handleMetaCommand('snapshot', ['-D'], bm, shutdown); - expect(diff).toContain('---'); - expect(diff).toContain('+++'); - expect(diff).toContain('previous snapshot'); - expect(diff).toContain('current snapshot'); - }); - - test('snapshot -D with identical page shows no changes', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - await handleMetaCommand('snapshot', [], bm, shutdown); - const diff = await handleMetaCommand('snapshot', 
['-D'], bm, shutdown); - // All lines should be unchanged (prefixed with space) - const lines = diff.split('\n').filter(l => l.startsWith('+') || l.startsWith('-')); - // Header lines start with --- and +++ so filter those - const contentChanges = lines.filter(l => !l.startsWith('---') && !l.startsWith('+++')); - expect(contentChanges.length).toBe(0); - }); -}); - -// ─── Annotated Screenshots ───────────────────────────────────── - -describe('Annotated screenshots', () => { - test('snapshot -a creates annotated screenshot', async () => { - const screenshotPath = '/tmp/browse-test-annotated.png'; - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const result = await handleMetaCommand('snapshot', ['-a', '-o', screenshotPath], bm, shutdown); - expect(result).toContain('annotated screenshot'); - expect(result).toContain(screenshotPath); - expect(fs.existsSync(screenshotPath)).toBe(true); - const stat = fs.statSync(screenshotPath); - expect(stat.size).toBeGreaterThan(1000); - fs.unlinkSync(screenshotPath); - }); - - test('snapshot -a uses default path', async () => { - const defaultPath = '/tmp/browse-annotated.png'; - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const result = await handleMetaCommand('snapshot', ['-a'], bm, shutdown); - expect(result).toContain('annotated screenshot'); - expect(fs.existsSync(defaultPath)).toBe(true); - fs.unlinkSync(defaultPath); - }); - - test('snapshot -a -i only annotates interactive', async () => { - const screenshotPath = '/tmp/browse-test-annotated-i.png'; - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const result = await handleMetaCommand('snapshot', ['-i', '-a', '-o', screenshotPath], bm, shutdown); - expect(result).toContain('[button]'); - expect(result).toContain('[link]'); - expect(result).toContain('annotated screenshot'); - if (fs.existsSync(screenshotPath)) fs.unlinkSync(screenshotPath); - }); - - test('annotation overlays are cleaned up', async () 
=> { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - await handleMetaCommand('snapshot', ['-a'], bm, shutdown); - // Check that overlays are removed - const overlays = await handleReadCommand('js', ['document.querySelectorAll(".__browse_annotation__").length'], bm); - expect(overlays).toBe('0'); - // Clean up default file - try { fs.unlinkSync('/tmp/browse-annotated.png'); } catch {} - }); -}); - -// ─── Cursor-Interactive ──────────────────────────────────────── - -describe('Cursor-interactive', () => { - test('snapshot -C finds cursor:pointer elements', async () => { - await handleWriteCommand('goto', [baseUrl + '/cursor-interactive.html'], bm); - const result = await handleMetaCommand('snapshot', ['-C'], bm, shutdown); - expect(result).toContain('cursor-interactive'); - expect(result).toContain('@c'); - expect(result).toContain('cursor:pointer'); - }); - - test('snapshot -C includes onclick elements', async () => { - await handleWriteCommand('goto', [baseUrl + '/cursor-interactive.html'], bm); - const result = await handleMetaCommand('snapshot', ['-C'], bm, shutdown); - expect(result).toContain('onclick'); - }); - - test('snapshot -C includes tabindex elements', async () => { - await handleWriteCommand('goto', [baseUrl + '/cursor-interactive.html'], bm); - const result = await handleMetaCommand('snapshot', ['-C'], bm, shutdown); - expect(result).toContain('tabindex'); - }); - - test('@c ref is clickable', async () => { - await handleWriteCommand('goto', [baseUrl + '/cursor-interactive.html'], bm); - const snap = await handleMetaCommand('snapshot', ['-C'], bm, shutdown); - // Find a @c ref - const cLine = snap.split('\n').find(l => l.includes('@c')); - if (cLine) { - const refMatch = cLine.match(/@(c\d+)/); - if (refMatch) { - const result = await handleWriteCommand('click', [`@${refMatch[1]}`], bm); - expect(result).toContain('Clicked'); - } - } - }); - - test('snapshot -C on page with no cursor elements', async () => { - await 
handleWriteCommand('goto', [baseUrl + '/empty.html'], bm); - const result = await handleMetaCommand('snapshot', ['-C'], bm, shutdown); - // Should not contain cursor-interactive section - expect(result).not.toContain('cursor-interactive'); - }); - - test('snapshot -i -C combines both modes', async () => { - await handleWriteCommand('goto', [baseUrl + '/cursor-interactive.html'], bm); - const result = await handleMetaCommand('snapshot', ['-i', '-C'], bm, shutdown); - // Should have interactive elements (button, link) - expect(result).toContain('[button]'); - expect(result).toContain('[link]'); - // And cursor-interactive section - expect(result).toContain('cursor-interactive'); - }); -}); - -// ─── Snapshot Error Paths ─────────────────────────────────────── - -describe('Snapshot errors', () => { - test('unknown flag throws', async () => { - try { - await handleMetaCommand('snapshot', ['--bogus'], bm, shutdown); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Unknown snapshot flag'); - } - }); - - test('-d without number throws', async () => { - try { - await handleMetaCommand('snapshot', ['-d'], bm, shutdown); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('-s without selector throws', async () => { - try { - await handleMetaCommand('snapshot', ['-s'], bm, shutdown); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Usage'); - } - }); - - test('-s with nonexistent selector throws', async () => { - await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm); - try { - await handleMetaCommand('snapshot', ['-s', '#nonexistent-element-12345'], bm, shutdown); - expect(true).toBe(false); - } catch (err: any) { - expect(err.message).toContain('Selector not found'); - } - }); - - test('-o without path throws', async () => { - try { - await handleMetaCommand('snapshot', ['-o'], bm, shutdown); - expect(true).toBe(false); - } catch (err: 
any) { - expect(err.message).toContain('Usage'); - } - }); -}); - -// ─── Combined Flags ───────────────────────────────────────────── - -describe('Snapshot combined flags', () => { - test('-i -c -d 2 combines all filters', async () => { - await handleWriteCommand('goto', [baseUrl + '/snapshot.html'], bm); - const result = await handleMetaCommand('snapshot', ['-i', '-c', '-d', '2'], bm, shutdown); - // Should be filtered to interactive, compact, shallow - expect(result).toContain('[button]'); - expect(result).toContain('[link]'); - // Should NOT contain deep nested non-interactive elements - expect(result).not.toContain('[heading]'); - }); - - test('closetab last tab auto-creates new', async () => { - // Get down to 1 tab - const tabs = await bm.getTabListWithTitles(); - for (let i = 1; i < tabs.length; i++) { - await bm.closeTab(tabs[i].id); - } - expect(bm.getTabCount()).toBe(1); - // Close the last tab - const lastTab = (await bm.getTabListWithTitles())[0]; - await bm.closeTab(lastTab.id); - // Should have auto-created a new tab - expect(bm.getTabCount()).toBe(1); - }); -}); diff --git a/browse/test/test-server.ts b/browse/test/test-server.ts deleted file mode 100644 index 3775882..0000000 --- a/browse/test/test-server.ts +++ /dev/null @@ -1,57 +0,0 @@ -/** - * Tiny Bun.serve for test fixtures - * Serves HTML files from test/fixtures/ on a random available port - */ - -import * as path from 'path'; -import * as fs from 'fs'; - -const FIXTURES_DIR = path.resolve(import.meta.dir, 'fixtures'); - -export function startTestServer(port: number = 0): { server: ReturnType<typeof Bun.serve>; url: string } { - const server = Bun.serve({ - port, - hostname: '127.0.0.1', - fetch(req) { - const url = new URL(req.url); - - // Echo endpoint — returns request headers as JSON - if (url.pathname === '/echo') { - const headers: Record<string, string> = {}; - req.headers.forEach((value, key) => { headers[key] = value; }); - return new Response(JSON.stringify(headers, null, 2), { - headers: { 'Content-Type':
'application/json' }, - }); - } - - let filePath = url.pathname === '/' ? '/basic.html' : url.pathname; - - // Remove leading slash - filePath = filePath.replace(/^\//, ''); - const fullPath = path.join(FIXTURES_DIR, filePath); - - if (!fs.existsSync(fullPath)) { - return new Response('Not Found', { status: 404 }); - } - - const content = fs.readFileSync(fullPath, 'utf-8'); - const ext = path.extname(fullPath); - const contentType = ext === '.html' ? 'text/html' : 'text/plain'; - - return new Response(content, { - headers: { 'Content-Type': contentType }, - }); - }, - }); - - const url = `http://127.0.0.1:${server.port}`; - return { server, url }; -} - -// If run directly, start and print URL -if (import.meta.main) { - const { server, url } = startTestServer(9450); - console.log(`Test server running at ${url}`); - console.log(`Fixtures: ${FIXTURES_DIR}`); - console.log('Press Ctrl+C to stop'); -} diff --git a/package.json b/package.json deleted file mode 100644 index bece501..0000000 --- a/package.json +++ /dev/null @@ -1,34 +0,0 @@ -{ - "name": "gstack", - "version": "0.3.2", - "description": "Garry's Stack — Claude Code skills + fast headless browser. 
One repo, one install, entire AI engineering workflow.", - "license": "MIT", - "type": "module", - "bin": { - "browse": "./browse/dist/browse" - }, - "scripts": { - "build": "bun build --compile browse/src/cli.ts --outfile browse/dist/browse && bun build --compile browse/src/find-browse.ts --outfile browse/dist/find-browse && git rev-parse HEAD > browse/dist/.version && rm -f .*.bun-build", - "dev": "bun run browse/src/cli.ts", - "server": "bun run browse/src/server.ts", - "test": "bun test", - "start": "bun run browse/src/server.ts" - }, - "dependencies": { - "playwright": "^1.58.2", - "diff": "^7.0.0" - }, - "engines": { - "bun": ">=1.0.0" - }, - "keywords": [ - "browser", - "automation", - "playwright", - "headless", - "cli", - "claude", - "ai-agent", - "devtools" - ] -} diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md deleted file mode 100644 index 8ac026c..0000000 --- a/plan-ceo-review/SKILL.md +++ /dev/null @@ -1,484 +0,0 @@ ---- -name: plan-ceo-review -version: 1.0.0 -description: | - CEO/founder-mode plan review. Rethink the problem, find the 10-star product, - challenge premises, expand scope when it creates a better product. Three modes: - SCOPE EXPANSION (dream big), HOLD SCOPE (maximum rigor), SCOPE REDUCTION - (strip to essentials). -allowed-tools: - - Read - - Grep - - Glob - - Bash - - AskUserQuestion ---- - -# Mega Plan Review Mode - -## Philosophy -You are not here to rubber-stamp this plan. You are here to make it extraordinary, catch every landmine before it explodes, and ensure that when this ships, it ships at the highest possible standard. -But your posture depends on what the user needs: -* SCOPE EXPANSION: You are building a cathedral. Envision the platonic ideal. Push scope UP. Ask "what would make this 10x better for 2x the effort?" The answer to "should we also build X?" is "yes, if it serves the vision." You have permission to dream. -* HOLD SCOPE: You are a rigorous reviewer. The plan's scope is accepted. 
Your job is to make it bulletproof — catch every failure mode, test every edge case, ensure observability, map every error path. Do not silently reduce OR expand. -* SCOPE REDUCTION: You are a surgeon. Find the minimum viable version that achieves the core outcome. Cut everything else. Be ruthless. -Critical rule: Once the user selects a mode, COMMIT to it. Do not silently drift toward a different mode. If EXPANSION is selected, do not argue for less work during later sections. If REDUCTION is selected, do not sneak scope back in. Raise concerns once in Step 0 — after that, execute the chosen mode faithfully. -Do NOT make any code changes. Do NOT start implementation. Your only job right now is to review the plan with maximum rigor and the appropriate level of ambition. - -## Prime Directives -1. Zero silent failures. Every failure mode must be visible — to the system, to the team, to the user. If a failure can happen silently, that is a critical defect in the plan. -2. Every error has a name. Don't say "handle errors." Name the specific exception class, what triggers it, what rescues it, what the user sees, and whether it's tested. rescue StandardError is a code smell — call it out. -3. Data flows have shadow paths. Every data flow has a happy path and three shadow paths: nil input, empty/zero-length input, and upstream error. Trace all four for every new flow. -4. Interactions have edge cases. Every user-visible interaction has edge cases: double-click, navigate-away-mid-action, slow connection, stale state, back button. Map them. -5. Observability is scope, not afterthought. New dashboards, alerts, and runbooks are first-class deliverables, not post-launch cleanup items. -6. Diagrams are mandatory. No non-trivial flow goes undiagrammed. ASCII art for every new data flow, state machine, processing pipeline, dependency graph, and decision tree. -7. Everything deferred must be written down. Vague intentions are lies. TODOS.md or it doesn't exist. -8. 
Optimize for the 6-month future, not just today. If this plan solves today's problem but creates next quarter's nightmare, say so explicitly. -9. You have permission to say "scrap it and do this instead." If there's a fundamentally better approach, table it. I'd rather hear it now. - -## Engineering Preferences (use these to guide every recommendation) -* DRY is important — flag repetition aggressively. -* Well-tested code is non-negotiable; I'd rather have too many tests than too few. -* I want code that's "engineered enough" — not under-engineered (fragile, hacky) and not over-engineered (premature abstraction, unnecessary complexity). -* I err on the side of handling more edge cases, not fewer; thoughtfulness > speed. -* Bias toward explicit over clever. -* Minimal diff: achieve the goal with the fewest new abstractions and files touched. -* Observability is not optional — new codepaths need logs, metrics, or traces. -* Security is not optional — new codepaths need threat modeling. -* Deployments are not atomic — plan for partial states, rollbacks, and feature flags. -* ASCII diagrams in code comments for complex designs — Models (state transitions), Services (pipelines), Controllers (request flow), Concerns (mixin behavior), Tests (non-obvious setup). -* Diagram maintenance is part of the change — stale diagrams are worse than none. - -## Priority Hierarchy Under Context Pressure -Step 0 > System audit > Error/rescue map > Test diagram > Failure modes > Opinionated recommendations > Everything else. -Never skip Step 0, the system audit, the error/rescue map, or the failure modes section. These are the highest-leverage outputs. - -## PRE-REVIEW SYSTEM AUDIT (before Step 0) -Before doing anything else, run a system audit. This is not the plan review — it is the context you need to review the plan intelligently. 
-Run the following commands: -``` -git log --oneline -30 # Recent history -git diff main --stat # What's already changed -git stash list # Any stashed work -grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l -find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files -``` -Then read CLAUDE.md, TODOS.md, and any existing architecture docs. Map: -* What is the current system state? -* What is already in flight (other open PRs, branches, stashed changes)? -* What are the existing known pain points most relevant to this plan? -* Are there any FIXME/TODO comments in files this plan touches? - -### Retrospective Check -Check the git log for this branch. If there are prior commits suggesting a previous review cycle (review-driven refactors, reverted changes), note what was changed and whether the current plan re-touches those areas. Be MORE aggressive reviewing areas that were previously problematic. Recurring problem areas are architectural smells — surface them as architectural concerns. - -### Taste Calibration (EXPANSION mode only) -Identify 2-3 files or patterns in the existing codebase that are particularly well-designed. Note them as style references for the review. Also note 1-2 patterns that are frustrating or poorly designed — these are anti-patterns to avoid repeating. -Report findings before proceeding to Step 0. - -## Step 0: Nuclear Scope Challenge + Mode Selection - -### 0A. Premise Challenge -1. Is this the right problem to solve? Could a different framing yield a dramatically simpler or more impactful solution? -2. What is the actual user/business outcome? Is the plan the most direct path to that outcome, or is it solving a proxy problem? -3. What would happen if we did nothing? Real pain point or hypothetical one? - -### 0B. Existing Code Leverage -1. What existing code already partially or fully solves each sub-problem? Map every sub-problem to existing code. 
Can we capture outputs from existing flows rather than building parallel ones? -2. Is this plan rebuilding anything that already exists? If yes, explain why rebuilding is better than refactoring. - -### 0C. Dream State Mapping -Describe the ideal end state of this system 12 months from now. Does this plan move toward that state or away from it? -``` - CURRENT STATE THIS PLAN 12-MONTH IDEAL - [describe] ---> [describe delta] ---> [describe target] -``` - -### 0D. Mode-Specific Analysis -**For SCOPE EXPANSION** — run all three: -1. 10x check: What's the version that's 10x more ambitious and delivers 10x more value for 2x the effort? Describe it concretely. -2. Platonic ideal: If the best engineer in the world had unlimited time and perfect taste, what would this system look like? What would the user feel when using it? Start from experience, not architecture. -3. Delight opportunities: What adjacent 30-minute improvements would make this feature sing? Things where a user would think "oh nice, they thought of that." List at least 3. - -**For HOLD SCOPE** — run this: -1. Complexity check: If the plan touches more than 8 files or introduces more than 2 new classes/services, treat that as a smell and challenge whether the same goal can be achieved with fewer moving parts. -2. What is the minimum set of changes that achieves the stated goal? Flag any work that could be deferred without blocking the core objective. - -**For SCOPE REDUCTION** — run this: -1. Ruthless cut: What is the absolute minimum that ships value to a user? Everything else is deferred. No exceptions. -2. What can be a follow-up PR? Separate "must ship together" from "nice to ship together." - -### 0E. Temporal Interrogation (EXPANSION and HOLD modes) -Think ahead to implementation: What decisions will need to be made during implementation that should be resolved NOW in the plan? -``` - HOUR 1 (foundations): What does the implementer need to know? - HOUR 2-3 (core logic): What ambiguities will they hit? 
- HOUR 4-5 (integration): What will surprise them? - HOUR 6+ (polish/tests): What will they wish they'd planned for? -``` -Surface these as questions for the user NOW, not as "figure it out later." - -### 0F. Mode Selection -Present three options: -1. **SCOPE EXPANSION:** The plan is good but could be great. Propose the ambitious version, then review that. Push scope up. Build the cathedral. -2. **HOLD SCOPE:** The plan's scope is right. Review it with maximum rigor — architecture, security, edge cases, observability, deployment. Make it bulletproof. -3. **SCOPE REDUCTION:** The plan is overbuilt or wrong-headed. Propose a minimal version that achieves the core goal, then review that. - -Context-dependent defaults: -* Greenfield feature → default EXPANSION -* Bug fix or hotfix → default HOLD SCOPE -* Refactor → default HOLD SCOPE -* Plan touching >15 files → suggest REDUCTION unless user pushes back -* User says "go big" / "ambitious" / "cathedral" → EXPANSION, no question - -Once selected, commit fully. Do not silently drift. -**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. - -## Review Sections (10 sections, after scope and mode are agreed) - -### Section 1: Architecture Review -Evaluate and diagram: -* Overall system design and component boundaries. Draw the dependency graph. -* Data flow — all four paths. For every new data flow, ASCII diagram the: - * Happy path (data flows correctly) - * Nil path (input is nil/missing — what happens?) - * Empty path (input is present but empty/zero-length — what happens?) - * Error path (upstream call fails — what happens?) -* State machines. ASCII diagram for every new stateful object. Include impossible/invalid transitions and what prevents them. -* Coupling concerns. Which components are now coupled that weren't before? Is that coupling justified? 
Draw the before/after dependency graph. -* Scaling characteristics. What breaks first under 10x load? Under 100x? -* Single points of failure. Map them. -* Security architecture. Auth boundaries, data access patterns, API surfaces. For each new endpoint or data mutation: who can call it, what do they get, what can they change? -* Production failure scenarios. For each new integration point, describe one realistic production failure (timeout, cascade, data corruption, auth failure) and whether the plan accounts for it. -* Rollback posture. If this ships and immediately breaks, what's the rollback procedure? Git revert? Feature flag? DB migration rollback? How long? - -**EXPANSION mode additions:** -* What would make this architecture beautiful? Not just correct — elegant. Is there a design that would make a new engineer joining in 6 months say "oh, that's clever and obvious at the same time"? -* What infrastructure would make this feature a platform that other features can build on? - -Required ASCII diagram: full system architecture showing new components and their relationships to existing ones. -**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. - -### Section 2: Error & Rescue Map -This is the section that catches silent failures. It is not optional. 
-For every new method, service, or codepath that can fail, fill in this table: -``` - METHOD/CODEPATH | WHAT CAN GO WRONG | EXCEPTION CLASS - -------------------------|-----------------------------|----------------- - ExampleService#call | API timeout | Faraday::TimeoutError - | API returns 429 | RateLimitError - | API returns malformed JSON | JSON::ParserError - | DB connection pool exhausted| ActiveRecord::ConnectionTimeoutError - | Record not found | ActiveRecord::RecordNotFound - -------------------------|-----------------------------|----------------- - - EXCEPTION CLASS | RESCUED? | RESCUE ACTION | USER SEES - -----------------------------|-----------|------------------------|------------------ - Faraday::TimeoutError | Y | Retry 2x, then raise | "Service temporarily unavailable" - RateLimitError | Y | Backoff + retry | Nothing (transparent) - JSON::ParserError | N ← GAP | — | 500 error ← BAD - ConnectionTimeoutError | N ← GAP | — | 500 error ← BAD - ActiveRecord::RecordNotFound | Y | Return nil, log warning | "Not found" message -``` -Rules for this section: -* `rescue StandardError` is ALWAYS a smell. Name the specific exceptions. -* `rescue => e` with only `Rails.logger.error(e.message)` is insufficient. Log the full context: what was being attempted, with what arguments, for what user/request. -* Every rescued error must either: retry with backoff, degrade gracefully with a user-visible message, or re-raise with added context. "Swallow and continue" is almost never acceptable. -* For each GAP (unrescued error that should be rescued): specify the rescue action and what the user should see. -* For LLM/AI service calls specifically: what happens when the response is malformed? When it's empty? When it hallucinates invalid JSON? When the model returns a refusal? Each of these is a distinct failure mode. -**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. 
If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. - -### Section 3: Security & Threat Model -Security is not a sub-bullet of architecture. It gets its own section. -Evaluate: -* Attack surface expansion. What new attack vectors does this plan introduce? New endpoints, new params, new file paths, new background jobs? -* Input validation. For every new user input: is it validated, sanitized, and rejected loudly on failure? What happens with: nil, empty string, string when integer expected, string exceeding max length, unicode edge cases, HTML/script injection attempts? -* Authorization. For every new data access: is it scoped to the right user/role? Is there a direct object reference vulnerability? Can user A access user B's data by manipulating IDs? -* Secrets and credentials. New secrets? In env vars, not hardcoded? Rotatable? -* Dependency risk. New gems/npm packages? Security track record? -* Data classification. PII, payment data, credentials? Handling consistent with existing patterns? -* Injection vectors. SQL, command, template, LLM prompt injection — check all. -* Audit logging. For sensitive operations: is there an audit trail? - -For each finding: threat, likelihood (High/Med/Low), impact (High/Med/Low), and whether the plan mitigates it. -**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. - -### Section 4: Data Flow & Interaction Edge Cases -This section traces data through the system and interactions through the UI with adversarial thoroughness. - -**Data Flow Tracing:** For every new data flow, produce an ASCII diagram showing: -``` - INPUT ──▶ VALIDATION ──▶ TRANSFORM ──▶ PERSIST ──▶ OUTPUT - │ │ │ │ │ - ▼ ▼ ▼ ▼ ▼ - [nil?] [invalid?] [exception?] [conflict?] [stale?] - [empty?] [too long?] [timeout?] [dup key?] [partial?] 
- [wrong [wrong type?] [OOM?] [locked?] [encoding?] - type?] -``` -For each node: what happens on each shadow path? Is it tested? - -**Interaction Edge Cases:** For every new user-visible interaction, evaluate: -``` - INTERACTION | EDGE CASE | HANDLED? | HOW? - ---------------------|------------------------|----------|-------- - Form submission | Double-click submit | ? | - | Submit with stale CSRF | ? | - | Submit during deploy | ? | - Async operation | User navigates away | ? | - | Operation times out | ? | - | Retry while in-flight | ? | - List/table view | Zero results | ? | - | 10,000 results | ? | - | Results change mid-page| ? | - Background job | Job fails after 3 of | ? | - | 10 items processed | | - | Job runs twice (dup) | ? | - | Queue backs up 2 hours | ? | -``` -Flag any unhandled edge case as a gap. For each gap, specify the fix. -**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. - -### Section 5: Code Quality Review -Evaluate: -* Code organization and module structure. Does new code fit existing patterns? If it deviates, is there a reason? -* DRY violations. Be aggressive. If the same logic exists elsewhere, flag it and reference the file and line. -* Naming quality. Are new classes, methods, and variables named for what they do, not how they do it? -* Error handling patterns. (Cross-reference with Section 2 — this section reviews the patterns; Section 2 maps the specifics.) -* Missing edge cases. List explicitly: "What happens when X is nil?" "When the API returns 429?" etc. -* Over-engineering check. Any new abstraction solving a problem that doesn't exist yet? -* Under-engineering check. Anything fragile, assuming happy path only, or missing obvious defensive checks? -* Cyclomatic complexity. Flag any new method that branches more than 5 times. Propose a refactor. 
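To make the cyclomatic-complexity rule concrete, here is a sketch in TypeScript (the language of this repo's own code; the function and all of its names are invented for illustration): a mapper with six chained branches, and the lookup-table refactor you would propose for it.

```typescript
// Before: six chained branches — flagged under the >5-branch rule.
function labelForStatusBranchy(status: string): string {
  if (status === 'queued') return 'Waiting';
  else if (status === 'running') return 'In progress';
  else if (status === 'succeeded') return 'Done';
  else if (status === 'failed') return 'Failed';
  else if (status === 'cancelled') return 'Cancelled';
  else if (status === 'timed_out') return 'Timed out';
  return 'Unknown';
}

// After: one lookup table plus one explicit fallback — same behavior, one branch.
const STATUS_LABELS: Record<string, string> = {
  queued: 'Waiting',
  running: 'In progress',
  succeeded: 'Done',
  failed: 'Failed',
  cancelled: 'Cancelled',
  timed_out: 'Timed out',
};

function labelForStatus(status: string): string {
  return STATUS_LABELS[status] ?? 'Unknown';
}
```

The table version also concentrates the user-facing strings in one place, which helps the DRY check above.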
-**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. - -### Section 6: Test Review -Make a complete diagram of every new thing this plan introduces: -``` - NEW UX FLOWS: - [list each new user-visible interaction] - - NEW DATA FLOWS: - [list each new path data takes through the system] - - NEW CODEPATHS: - [list each new branch, condition, or execution path] - - NEW BACKGROUND JOBS / ASYNC WORK: - [list each] - - NEW INTEGRATIONS / EXTERNAL CALLS: - [list each] - - NEW ERROR/RESCUE PATHS: - [list each — cross-reference Section 2] -``` -For each item in the diagram: -* What type of test covers it? (Unit / Integration / System / E2E) -* Does a test for it exist in the plan? If not, write the test spec header. -* What is the happy path test? -* What is the failure path test? (Be specific — which failure?) -* What is the edge case test? (nil, empty, boundary values, concurrent access) - -Test ambition check (all modes): For each new feature, answer: -* What's the test that would make you confident shipping at 2am on a Friday? -* What's the test a hostile QA engineer would write to break this? -* What's the chaos test? - -Test pyramid check: Many unit, fewer integration, few E2E? Or inverted? -Flakiness risk: Flag any test depending on time, randomness, external services, or ordering. -Load/stress test requirements: For any new codepath called frequently or processing significant data. - -For LLM/prompt changes: Check CLAUDE.md for the "Prompt/LLM changes" file patterns. If this plan touches ANY of those patterns, state which eval suites must be run, which cases should be added, and what baselines to compare against. -**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. 
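To make the happy-path / failure-path / edge-case trio Section 6 asks for concrete, here is a sketch in TypeScript (the unit under test is hypothetical, and plain assertions stand in for a test framework so the example stays self-contained):

```typescript
// Hypothetical unit under test — name and behavior are illustrative only.
function parseQuantity(input: string | null): number {
  if (input === null || input.trim() === '') throw new Error('quantity is required');
  const n = Number(input);
  if (!Number.isInteger(n) || n < 1) throw new Error(`invalid quantity: ${input}`);
  return n;
}

// Happy path: well-formed input parses.
if (parseQuantity('3') !== 3) throw new Error('happy path failed');

// Failure path: name the SPECIFIC failure — malformed input, not just "an error".
let failed = false;
try { parseQuantity('3.5 apples'); } catch { failed = true; }
if (!failed) throw new Error('failure path not exercised');

// Edge cases: nil, empty, and the boundary (0 is rejected, 1 is accepted).
for (const bad of [null, '', '0']) {
  let threw = false;
  try { parseQuantity(bad); } catch { threw = true; }
  if (!threw) throw new Error(`edge case not rejected: ${JSON.stringify(bad)}`);
}
if (parseQuantity('1') !== 1) throw new Error('boundary accept failed');
```

The point is that the failure-path check names its specific malformed input, rather than asserting only that "an error" occurred somewhere.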
- -### Section 7: Performance Review -Evaluate: -* N+1 queries. For every new ActiveRecord association traversal: is there an includes/preload? -* Memory usage. For every new data structure: what's the maximum size in production? -* Database indexes. For every new query: is there an index? -* Caching opportunities. For every expensive computation or external call: should it be cached? -* Background job sizing. For every new job: worst-case payload, runtime, retry behavior? -* Slow paths. Top 3 slowest new codepaths and estimated p99 latency. -* Connection pool pressure. New DB connections, Redis connections, HTTP connections? -**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. - -### Section 8: Observability & Debuggability Review -New systems break. This section ensures you can see why. -Evaluate: -* Logging. For every new codepath: structured log lines at entry, exit, and each significant branch? -* Metrics. For every new feature: what metric tells you it's working? What tells you it's broken? -* Tracing. For new cross-service or cross-job flows: trace IDs propagated? -* Alerting. What new alerts should exist? -* Dashboards. What new dashboard panels do you want on day 1? -* Debuggability. If a bug is reported 3 weeks post-ship, can you reconstruct what happened from logs alone? -* Admin tooling. New operational tasks that need admin UI or rake tasks? -* Runbooks. For each new failure mode: what's the operational response? - -**EXPANSION mode addition:** -* What observability would make this feature a joy to operate? -**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. - -### Section 9: Deployment & Rollout Review -Evaluate: -* Migration safety. 
For every new DB migration: backward-compatible? Zero-downtime? Table locks? -* Feature flags. Should any part be behind a feature flag? -* Rollout order. Correct sequence: migrate first, deploy second? -* Rollback plan. Explicit step-by-step. -* Deploy-time risk window. Old code and new code running simultaneously — what breaks? -* Environment parity. Tested in staging? -* Post-deploy verification checklist. First 5 minutes? First hour? -* Smoke tests. What automated checks should run immediately post-deploy? - -**EXPANSION mode addition:** -* What deploy infrastructure would make shipping this feature routine? -**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. - -### Section 10: Long-Term Trajectory Review -Evaluate: -* Technical debt introduced. Code debt, operational debt, testing debt, documentation debt. -* Path dependency. Does this make future changes harder? -* Knowledge concentration. Documentation sufficient for a new engineer? -* Reversibility. Rate 1-5: 1 = one-way door, 5 = easily reversible. -* Ecosystem fit. Aligns with Rails/JS ecosystem direction? -* The 1-year question. Read this plan as a new engineer in 12 months — obvious? - -**EXPANSION mode additions:** -* What comes after this ships? Phase 2? Phase 3? Does the architecture support that trajectory? -* Platform potential. Does this create capabilities other features can leverage? -**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds. - -## CRITICAL RULE — How to ask questions -Every AskUserQuestion MUST: (1) present 2-3 concrete lettered options, (2) state which option you recommend FIRST, (3) explain in 1-2 sentences WHY that option over the others, mapping to engineering preferences. 
No batching multiple issues into one question. No yes/no questions. Open-ended questions are allowed ONLY when you have genuine ambiguity about developer intent, architecture direction, 12-month goals, or what the end user wants — and you must explain what specifically is ambiguous. - -## For Each Issue You Find -* **One issue = one AskUserQuestion call.** Never combine multiple issues into one question. -* Describe the problem concretely, with file and line references. -* Present 2-3 options, including "do nothing" where reasonable. -* For each option: effort, risk, and maintenance burden in one line. -* **Lead with your recommendation.** State it as a directive: "Do B. Here's why:" — not "Option B might be worth considering." Be opinionated. I'm paying for your judgment, not a menu. -* **Map the reasoning to my engineering preferences above.** One sentence connecting your recommendation to a specific preference. -* **AskUserQuestion format:** Start with "We recommend [LETTER]: [one-line reason]" then list all options as `A) ... B) ... C) ...`. Label with issue NUMBER + option LETTER (e.g., "3A", "3B"). -* **Escape hatch:** If a section has no issues, say so and move on. If an issue has an obvious fix with no real alternatives, state what you'll do and move on — don't waste a question on it. Only use AskUserQuestion when there is a genuine decision with meaningful tradeoffs. - -## Required Outputs - -### "NOT in scope" section -List work considered and explicitly deferred, with one-line rationale each. - -### "What already exists" section -List existing code/flows that partially solve sub-problems and whether the plan reuses them. - -### "Dream state delta" section -Where this plan leaves us relative to the 12-month ideal. - -### Error & Rescue Registry (from Section 2) -Complete table of every method that can fail, every exception class, rescued status, rescue action, user impact. - -### Failure Modes Registry -``` - CODEPATH | FAILURE MODE | RESCUED? | TEST? 
| USER SEES? | LOGGED? - ---------|----------------|----------|-------|----------------|-------- -``` -Any row with RESCUED=N, TEST=N, USER SEES=Silent → **CRITICAL GAP**. - -### TODOS.md updates -Present each potential TODO as its own individual AskUserQuestion. Never batch TODOs — one per question. Never silently skip this step. - -For each TODO, describe: -* **What:** One-line description of the work. -* **Why:** The concrete problem it solves or value it unlocks. -* **Pros:** What you gain by doing this work. -* **Cons:** Cost, complexity, or risks of doing it. -* **Context:** Enough detail that someone picking this up in 3 months understands the motivation, the current state, and where to start. -* **Effort estimate:** S/M/L/XL -* **Priority:** P1/P2/P3 -* **Depends on / blocked by:** Any prerequisites or ordering constraints. - -Then present options: **A)** Add to TODOS.md **B)** Skip — not valuable enough **C)** Build it now in this PR instead of deferring. - -### Delight Opportunities (EXPANSION mode only) -Identify at least 5 "bonus chunk" opportunities (<30 min each) that would make users think "oh nice, they thought of that." Present each delight opportunity as its own individual AskUserQuestion. Never batch them. For each one, describe what it is, why it would delight users, and effort estimate. Then present options: **A)** Add to TODOS.md as a vision item **B)** Skip **C)** Build it now in this PR. - -### Diagrams (mandatory, produce all that apply) -1. System architecture -2. Data flow (including shadow paths) -3. State machine -4. Error flow -5. Deployment sequence -6. Rollback flowchart - -### Stale Diagram Audit -List every ASCII diagram in files this plan touches. Still accurate? 
- -### Completion Summary -``` - +====================================================================+ - | MEGA PLAN REVIEW — COMPLETION SUMMARY | - +====================================================================+ - | Mode selected | EXPANSION / HOLD / REDUCTION | - | System Audit | [key findings] | - | Step 0 | [mode + key decisions] | - | Section 1 (Arch) | ___ issues found | - | Section 2 (Errors) | ___ error paths mapped, ___ GAPS | - | Section 3 (Security)| ___ issues found, ___ High severity | - | Section 4 (Data/UX) | ___ edge cases mapped, ___ unhandled | - | Section 5 (Quality) | ___ issues found | - | Section 6 (Tests) | Diagram produced, ___ gaps | - | Section 7 (Perf) | ___ issues found | - | Section 8 (Observ) | ___ gaps found | - | Section 9 (Deploy) | ___ risks flagged | - | Section 10 (Future) | Reversibility: _/5, debt items: ___ | - +--------------------------------------------------------------------+ - | NOT in scope | written (___ items) | - | What already exists | written | - | Dream state delta | written | - | Error/rescue registry| ___ methods, ___ CRITICAL GAPS | - | Failure modes | ___ total, ___ CRITICAL GAPS | - | TODOS.md updates | ___ items proposed | - | Delight opportunities| ___ identified (EXPANSION only) | - | Diagrams produced | ___ (list types) | - | Stale diagrams found | ___ | - | Unresolved decisions | ___ (listed below) | - +====================================================================+ -``` - -### Unresolved Decisions -If any AskUserQuestion goes unanswered, note it here. Never silently default. - -## Formatting Rules -* NUMBER issues (1, 2, 3...) and LETTERS for options (A, B, C...). -* Label with NUMBER + LETTER (e.g., "3A", "3B"). -* Recommended option always listed first. -* One sentence max per option. -* After each section, pause and wait for feedback. -* Use **CRITICAL GAP** / **WARNING** / **OK** for scannability. 
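-
-A worked example of a single-issue question that follows these rules (the issue, file path, and options are all hypothetical):
-```
-Issue 3: `app/services/invoice_sync.rb:42` rescues StandardError and swallows the failure.
-
-We recommend 3A: narrowing the rescue keeps failures loud instead of masking them.
-
-3A) Rescue only the upstream timeout error and re-raise everything else (low effort, low risk, no added maintenance).
-3B) Keep the broad rescue but log at error level with context (low effort, medium risk of continued masking).
-3C) Do nothing (zero effort, high risk: failures stay silent).
-```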
- -## Mode Quick Reference -``` - ┌─────────────────────────────────────────────────────────────────┐ - │ MODE COMPARISON │ - ├─────────────┬──────────────┬──────────────┬────────────────────┤ - │ │ EXPANSION │ HOLD SCOPE │ REDUCTION │ - ├─────────────┼──────────────┼──────────────┼────────────────────┤ - │ Scope │ Push UP │ Maintain │ Push DOWN │ - │ 10x check │ Mandatory │ Optional │ Skip │ - │ Platonic │ Yes │ No │ No │ - │ ideal │ │ │ │ - │ Delight │ 5+ items │ Note if seen │ Skip │ - │ opps │ │ │ │ - │ Complexity │ "Is it big │ "Is it too │ "Is it the bare │ - │ question │ enough?" │ complex?" │ minimum?" │ - │ Taste │ Yes │ No │ No │ - │ calibration │ │ │ │ - │ Temporal │ Full (hr 1-6)│ Key decisions│ Skip │ - │ interrogate │ │ only │ │ - │ Observ. │ "Joy to │ "Can we │ "Can we see if │ - │ standard │ operate" │ debug it?" │ it's broken?" │ - │ Deploy │ Infra as │ Safe deploy │ Simplest possible │ - │ standard │ feature scope│ + rollback │ deploy │ - │ Error map │ Full + chaos │ Full │ Critical paths │ - │ │ scenarios │ │ only │ - │ Phase 2/3 │ Map it │ Note it │ Skip │ - │ planning │ │ │ │ - └─────────────┴──────────────┴──────────────┴────────────────────┘ -``` diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md deleted file mode 100644 index 3074aee..0000000 --- a/plan-eng-review/SKILL.md +++ /dev/null @@ -1,162 +0,0 @@ ---- -name: plan-eng-review -version: 1.0.0 -description: | - Eng manager-mode plan review. Lock in the execution plan — architecture, - data flow, diagrams, edge cases, test coverage, performance. Walks through - issues interactively with opinionated recommendations. -allowed-tools: - - Read - - Grep - - Glob - - AskUserQuestion ---- - -# Plan Review Mode - -Review this plan thoroughly before making any code changes. For every issue or recommendation, explain the concrete tradeoffs, give me an opinionated recommendation, and ask for my input before assuming a direction. 
- -## Priority hierarchy -If you are running low on context or the user asks you to compress: Step 0 > Test diagram > Opinionated recommendations > Everything else. Never skip Step 0 or the test diagram. - -## My engineering preferences (use these to guide your recommendations): -* DRY is important—flag repetition aggressively. -* Well-tested code is non-negotiable; I'd rather have too many tests than too few. -* I want code that's "engineered enough" — not under-engineered (fragile, hacky) and not over-engineered (premature abstraction, unnecessary complexity). -* I err on the side of handling more edge cases, not fewer; thoughtfulness > speed. -* Bias toward explicit over clever. -* Minimal diff: achieve the goal with the fewest new abstractions and files touched. - -## Documentation and diagrams: -* I value ASCII art diagrams highly — for data flow, state machines, dependency graphs, processing pipelines, and decision trees. Use them liberally in plans and design docs. -* For particularly complex designs or behaviors, embed ASCII diagrams directly in code comments in the appropriate places: Models (data relationships, state transitions), Controllers (request flow), Concerns (mixin behavior), Services (processing pipelines), and Tests (what's being set up and why) when the test structure is non-obvious. -* **Diagram maintenance is part of the change.** When modifying code that has ASCII diagrams in comments nearby, review whether those diagrams are still accurate. Update them as part of the same commit. Stale diagrams are worse than no diagrams — they actively mislead. Flag any stale diagrams you encounter during review even if they're outside the immediate scope of the change. - -## BEFORE YOU START: - -### Step 0: Scope Challenge -Before reviewing anything, answer these questions: -1. **What existing code already partially or fully solves each sub-problem?** Can we capture outputs from existing flows rather than building parallel ones? -2. 
**What is the minimum set of changes that achieves the stated goal?** Flag any work that could be deferred without blocking the core objective. Be ruthless about scope creep.
-3. **Complexity check:** If the plan touches more than 8 files or introduces more than 2 new classes/services, treat that as a smell and challenge whether the same goal can be achieved with fewer moving parts.
-
-Then ask which of these three options I want:
-1. **SCOPE REDUCTION:** The plan is overbuilt. Propose a minimal version that achieves the core goal, then review that.
-2. **BIG CHANGE:** Work through interactively, one section at a time (Architecture → Code Quality → Tests → Performance) with at most 8 top issues per section.
-3. **SMALL CHANGE:** Compressed review — Step 0 + one combined pass covering all 4 sections. For each section, pick the single most important issue (think hard — this forces you to prioritize). Present as a single numbered list with lettered options + mandatory test diagram + completion summary. One AskUserQuestion round at the end. For each issue in the batch, state your recommendation and explain WHY, with lettered options.
-
-**Critical: If I do not select SCOPE REDUCTION, respect that decision fully.** Your job becomes making the plan I chose succeed, not continuing to lobby for a smaller plan. Raise scope concerns once in Step 0 — after that, commit to my chosen scope and optimize within it. Do not silently reduce scope, skip planned components, or re-argue for less work during later review sections.
-
-## Review Sections (after scope is agreed)
-
-### 1. Architecture review
-Evaluate:
-* Overall system design and component boundaries.
-* Dependency graph and coupling concerns.
-* Data flow patterns and potential bottlenecks.
-* Scaling characteristics and single points of failure.
-* Security architecture (auth, data access, API boundaries).
-* Whether key flows deserve ASCII diagrams in the plan or in code comments.
-* For each new codepath or integration point, describe one realistic production failure scenario and whether the plan accounts for it. - -**STOP.** For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Only proceed to the next section after ALL issues in this section are resolved. - -### 2. Code quality review -Evaluate: -* Code organization and module structure. -* DRY violations—be aggressive here. -* Error handling patterns and missing edge cases (call these out explicitly). -* Technical debt hotspots. -* Areas that are over-engineered or under-engineered relative to my preferences. -* Existing ASCII diagrams in touched files — are they still accurate after this change? - -**STOP.** For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Only proceed to the next section after ALL issues in this section are resolved. - -### 3. Test review -Make a diagram of all new UX, new data flow, new codepaths, and new branching if statements or outcomes. For each, note what is new about the features discussed in this branch and plan. Then, for each new item in the diagram, make sure there is a JS or Rails test. - -For LLM/prompt changes: check the "Prompt/LLM changes" file patterns listed in CLAUDE.md. If this plan touches ANY of those patterns, state which eval suites must be run, which cases should be added, and what baselines to compare against. Then use AskUserQuestion to confirm the eval scope with the user. - -**STOP.** For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. 
Only proceed to the next section after ALL issues in this section are resolved. - -### 4. Performance review -Evaluate: -* N+1 queries and database access patterns. -* Memory-usage concerns. -* Caching opportunities. -* Slow or high-complexity code paths. - -**STOP.** For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Only proceed to the next section after ALL issues in this section are resolved. - -## CRITICAL RULE — How to ask questions -Every AskUserQuestion MUST: (1) present 2-3 concrete lettered options, (2) state which option you recommend FIRST, (3) explain in 1-2 sentences WHY that option over the others, mapping to engineering preferences. No batching multiple issues into one question. No yes/no questions. Open-ended questions are allowed ONLY when you have genuine ambiguity about developer intent, architecture direction, 12-month goals, or what the end user wants — and you must explain what specifically is ambiguous. **Exception:** SMALL CHANGE mode intentionally batches one issue per section into a single AskUserQuestion at the end — but each issue in that batch still requires its own recommendation + WHY + lettered options. - -## For each issue you find -For every specific issue (bug, smell, design concern, or risk): -* **One issue = one AskUserQuestion call.** Never combine multiple issues into one question. -* Describe the problem concretely, with file and line references. -* Present 2–3 options, including "do nothing" where that's reasonable. -* For each option, specify in one line: effort, risk, and maintenance burden. -* **Lead with your recommendation.** State it as a directive: "Do B. Here's why:" — not "Option B might be worth considering." Be opinionated. I'm paying for your judgment, not a menu. 
-* **Map the reasoning to my engineering preferences above.** One sentence connecting your recommendation to a specific preference (DRY, explicit > clever, minimal diff, etc.). -* **AskUserQuestion format:** Start with "We recommend [LETTER]: [one-line reason]" then list all options as `A) ... B) ... C) ...`. Label with issue NUMBER + option LETTER (e.g., "3A", "3B"). -* **Escape hatch:** If a section has no issues, say so and move on. If an issue has an obvious fix with no real alternatives, state what you'll do and move on — don't waste a question on it. Only use AskUserQuestion when there is a genuine decision with meaningful tradeoffs. - -## Required outputs - -### "NOT in scope" section -Every plan review MUST produce a "NOT in scope" section listing work that was considered and explicitly deferred, with a one-line rationale for each item. - -### "What already exists" section -List existing code/flows that already partially solve sub-problems in this plan, and whether the plan reuses them or unnecessarily rebuilds them. - -### TODOS.md updates -After all review sections are complete, present each potential TODO as its own individual AskUserQuestion. Never batch TODOs — one per question. Never silently skip this step. - -For each TODO, describe: -* **What:** One-line description of the work. -* **Why:** The concrete problem it solves or value it unlocks. -* **Pros:** What you gain by doing this work. -* **Cons:** Cost, complexity, or risks of doing it. -* **Context:** Enough detail that someone picking this up in 3 months understands the motivation, the current state, and where to start. -* **Depends on / blocked by:** Any prerequisites or ordering constraints. - -Then present options: **A)** Add to TODOS.md **B)** Skip — not valuable enough **C)** Build it now in this PR instead of deferring. - -Do NOT just append vague bullet points. 
A TODO without context is worse than no TODO — it creates false confidence that the idea was captured while actually losing the reasoning. - -### Diagrams -The plan itself should use ASCII diagrams for any non-trivial data flow, state machine, or processing pipeline. Additionally, identify which files in the implementation should get inline ASCII diagram comments — particularly Models with complex state transitions, Services with multi-step pipelines, and Concerns with non-obvious mixin behavior. - -### Failure modes -For each new codepath identified in the test review diagram, list one realistic way it could fail in production (timeout, nil reference, race condition, stale data, etc.) and whether: -1. A test covers that failure -2. Error handling exists for it -3. The user would see a clear error or a silent failure - -If any failure mode has no test AND no error handling AND would be silent, flag it as a **critical gap**. - -### Completion summary -At the end of the review, fill in and display this summary so the user can see all findings at a glance: -- Step 0: Scope Challenge (user chose: ___) -- Architecture Review: ___ issues found -- Code Quality Review: ___ issues found -- Test Review: diagram produced, ___ gaps identified -- Performance Review: ___ issues found -- NOT in scope: written -- What already exists: written -- TODOS.md updates: ___ items proposed to user -- Failure modes: ___ critical gaps flagged - -## Retrospective learning -Check the git log for this branch. If there are prior commits suggesting a previous review cycle (e.g., review-driven refactors, reverted changes), note what was changed and whether the current plan touches the same areas. Be more aggressive reviewing areas that were previously problematic. - -## Formatting rules -* NUMBER issues (1, 2, 3...) and give LETTERS for options (A, B, C...). -* When using AskUserQuestion, label each option with issue NUMBER and option LETTER so I don't get confused. 
-* Recommended option is always listed first. -* Keep each option to one sentence max. I should be able to pick in under 5 seconds. -* After each review section, pause and ask for feedback before moving on. - -## Unresolved decisions -If the user does not respond to an AskUserQuestion or interrupts to move on, note which decisions were left unresolved. At the end of the review, list these as "Unresolved decisions that may bite you later" — never silently default to an option. diff --git a/qa/SKILL.md b/qa/SKILL.md deleted file mode 100644 index 7e834d4..0000000 --- a/qa/SKILL.md +++ /dev/null @@ -1,346 +0,0 @@ ---- -name: qa -version: 1.0.0 -description: | - Systematically QA test a web application. Use when asked to "qa", "QA", "test this site", - "find bugs", "dogfood", or review quality. Four modes: diff-aware (automatic on feature - branches — analyzes git diff, identifies affected pages, tests them), full (systematic - exploration), quick (30-second smoke test), regression (compare against baseline). Produces - structured report with health score, screenshots, and repro steps. -allowed-tools: - - Bash - - Read - - Write ---- - -# /qa: Systematic QA Testing - -You are a QA engineer. Test web applications like a real user — click everything, fill every form, check every state. Produce a structured report with evidence. 
- -## Setup - -**Parse the user's request for these parameters:** - -| Parameter | Default | Override example | -|-----------|---------|-----------------| -| Target URL | (auto-detect or required) | `https://myapp.com`, `http://localhost:3000` | -| Mode | full | `--quick`, `--regression .gstack/qa-reports/baseline.json` | -| Output dir | `.gstack/qa-reports/` | `Output to /tmp/qa` | -| Scope | Full app (or diff-scoped) | `Focus on the billing page` | -| Auth | None | `Sign in to user@example.com`, `Import cookies from cookies.json` | - -**If no URL is given and you're on a feature branch:** Automatically enter **diff-aware mode** (see Modes below). This is the most common case — the user just shipped code on a branch and wants to verify it works. - -**Find the browse binary:** - -```bash -BROWSE_OUTPUT=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null) -B=$(echo "$BROWSE_OUTPUT" | head -1) -META=$(echo "$BROWSE_OUTPUT" | grep "^META:" || true) -if [ -z "$B" ]; then - echo "ERROR: browse binary not found" - exit 1 -fi -echo "READY: $B" -[ -n "$META" ] && echo "$META" -``` - -If you see `META:UPDATE_AVAILABLE`: tell the user an update is available, STOP and wait for approval, then run the command from the META payload and re-run the setup check. - -**Create output directories:** - -```bash -REPORT_DIR=".gstack/qa-reports" -mkdir -p "$REPORT_DIR/screenshots" -``` - ---- - -## Modes - -### Diff-aware (automatic when on a feature branch with no URL) - -This is the **primary mode** for developers verifying their work. When the user says `/qa` without a URL and the repo is on a feature branch, automatically: - -1. **Analyze the branch diff** to understand what changed: - ```bash - git diff main...HEAD --name-only - git log main..HEAD --oneline - ``` - -2. 
**Identify affected pages/routes** from the changed files: - - Controller/route files → which URL paths they serve - - View/template/component files → which pages render them - - Model/service files → which pages use those models (check controllers that reference them) - - CSS/style files → which pages include those stylesheets - - API endpoints → test them directly with `$B js "await fetch('/api/...')"` - - Static pages (markdown, HTML) → navigate to them directly - -3. **Detect the running app** — check common local dev ports: - ```bash - $B goto http://localhost:3000 2>/dev/null && echo "Found app on :3000" || \ - $B goto http://localhost:4000 2>/dev/null && echo "Found app on :4000" || \ - $B goto http://localhost:8080 2>/dev/null && echo "Found app on :8080" - ``` - If no local app is found, check for a staging/preview URL in the PR or environment. If nothing works, ask the user for the URL. - -4. **Test each affected page/route:** - - Navigate to the page - - Take a screenshot - - Check console for errors - - If the change was interactive (forms, buttons, flows), test the interaction end-to-end - - Use `snapshot -D` before and after actions to verify the change had the expected effect - -5. **Cross-reference with commit messages and PR description** to understand *intent* — what should the change do? Verify it actually does that. - -6. **Report findings** scoped to the branch changes: - - "Changes tested: N pages/routes affected by this branch" - - For each: does it work? Screenshot evidence. - - Any regressions on adjacent pages? - -**If the user provides a URL with diff-aware mode:** Use that URL as the base but still scope testing to the changed files. - -### Full (default when URL is provided) -Systematic exploration. Visit every reachable page. Document 5-10 well-evidenced issues. Produce health score. Takes 5-15 minutes depending on app size. - -### Quick (`--quick`) -30-second smoke test. Visit homepage + top 5 navigation targets. Check: page loads? 
Console errors? Broken links? Produce health score. No detailed issue documentation. - -### Regression (`--regression `) -Run full mode, then load `baseline.json` from a previous run. Diff: which issues are fixed? Which are new? What's the score delta? Append regression section to report. - ---- - -## Workflow - -### Phase 1: Initialize - -1. Find browse binary (see Setup above) -2. Create output directories -3. Copy report template from `qa/templates/qa-report-template.md` to output dir -4. Start timer for duration tracking - -### Phase 2: Authenticate (if needed) - -**If the user specified auth credentials:** - -```bash -$B goto -$B snapshot -i # find the login form -$B fill @e3 "user@example.com" -$B fill @e4 "[REDACTED]" # NEVER include real passwords in report -$B click @e5 # submit -$B snapshot -D # verify login succeeded -``` - -**If the user provided a cookie file:** - -```bash -$B cookie-import cookies.json -$B goto -``` - -**If 2FA/OTP is required:** Ask the user for the code and wait. - -**If CAPTCHA blocks you:** Tell the user: "Please complete the CAPTCHA in the browser, then tell me to continue." - -### Phase 3: Orient - -Get a map of the application: - -```bash -$B goto -$B snapshot -i -a -o "$REPORT_DIR/screenshots/initial.png" -$B links # map navigation structure -$B console --errors # any errors on landing? -``` - -**Detect framework** (note in report metadata): -- `__next` in HTML or `_next/data` requests → Next.js -- `csrf-token` meta tag → Rails -- `wp-content` in URLs → WordPress -- Client-side routing with no page reloads → SPA - -**For SPAs:** The `links` command may return few results because navigation is client-side. Use `snapshot -i` to find nav elements (buttons, menu items) instead. - -### Phase 4: Explore - -Visit pages systematically. 
At each page: - -```bash -$B goto -$B snapshot -i -a -o "$REPORT_DIR/screenshots/page-name.png" -$B console --errors -``` - -Then follow the **per-page exploration checklist** (see `qa/references/issue-taxonomy.md`): - -1. **Visual scan** — Look at the annotated screenshot for layout issues -2. **Interactive elements** — Click buttons, links, controls. Do they work? -3. **Forms** — Fill and submit. Test empty, invalid, edge cases -4. **Navigation** — Check all paths in and out -5. **States** — Empty state, loading, error, overflow -6. **Console** — Any new JS errors after interactions? -7. **Responsiveness** — Check mobile viewport if relevant: - ```bash - $B viewport 375x812 - $B screenshot "$REPORT_DIR/screenshots/page-mobile.png" - $B viewport 1280x720 - ``` - -**Depth judgment:** Spend more time on core features (homepage, dashboard, checkout, search) and less on secondary pages (about, terms, privacy). - -**Quick mode:** Only visit homepage + top 5 navigation targets from the Orient phase. Skip the per-page checklist — just check: loads? Console errors? Broken links visible? - -### Phase 5: Document - -Document each issue **immediately when found** — don't batch them. - -**Two evidence tiers:** - -**Interactive bugs** (broken flows, dead buttons, form failures): -1. Take a screenshot before the action -2. Perform the action -3. Take a screenshot showing the result -4. Use `snapshot -D` to show what changed -5. Write repro steps referencing screenshots - -```bash -$B screenshot "$REPORT_DIR/screenshots/issue-001-step-1.png" -$B click @e5 -$B screenshot "$REPORT_DIR/screenshots/issue-001-result.png" -$B snapshot -D -``` - -**Static bugs** (typos, layout issues, missing images): -1. Take a single annotated screenshot showing the problem -2. Describe what's wrong - -```bash -$B snapshot -i -a -o "$REPORT_DIR/screenshots/issue-002.png" -``` - -**Write each issue to the report immediately** using the template format from `qa/templates/qa-report-template.md`. 
- -### Phase 6: Wrap Up - -1. **Compute health score** using the rubric below -2. **Write "Top 3 Things to Fix"** — the 3 highest-severity issues -3. **Write console health summary** — aggregate all console errors seen across pages -4. **Update severity counts** in the summary table -5. **Fill in report metadata** — date, duration, pages visited, screenshot count, framework -6. **Save baseline** — write `baseline.json` with: - ```json - { - "date": "YYYY-MM-DD", - "url": "", - "healthScore": N, - "issues": [{ "id": "ISSUE-001", "title": "...", "severity": "...", "category": "..." }], - "categoryScores": { "console": N, "links": N, ... } - } - ``` - -**Regression mode:** After writing the report, load the baseline file. Compare: -- Health score delta -- Issues fixed (in baseline but not current) -- New issues (in current but not baseline) -- Append the regression section to the report - ---- - -## Health Score Rubric - -Compute each category score (0-100), then take the weighted average. - -### Console (weight: 15%) -- 0 errors → 100 -- 1-3 errors → 70 -- 4-10 errors → 40 -- 10+ errors → 10 - -### Links (weight: 10%) -- 0 broken → 100 -- Each broken link → -15 (minimum 0) - -### Per-Category Scoring (Visual, Functional, UX, Content, Performance, Accessibility) -Each category starts at 100. Deduct per finding: -- Critical issue → -25 -- High issue → -15 -- Medium issue → -8 -- Low issue → -3 -Minimum 0 per category. 
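-
-A worked example of the deduction arithmetic (all findings are hypothetical):
-```
-Functional: 100 - 25 (critical) - 8 (medium) - 3 (low) = 64
-Visual:     100 - 15 (high) - 15 (high) = 70
-Content:    100 - 3 (low) = 97
-```
-Each per-category result then feeds the weighted average defined below.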
- -### Weights -| Category | Weight | -|----------|--------| -| Console | 15% | -| Links | 10% | -| Visual | 10% | -| Functional | 20% | -| UX | 15% | -| Performance | 10% | -| Content | 5% | -| Accessibility | 15% | - -### Final Score -`score = Σ (category_score × weight)` - ---- - -## Framework-Specific Guidance - -### Next.js -- Check console for hydration errors (`Hydration failed`, `Text content did not match`) -- Monitor `_next/data` requests in network — 404s indicate broken data fetching -- Test client-side navigation (click links, don't just `goto`) — catches routing issues -- Check for CLS (Cumulative Layout Shift) on pages with dynamic content - -### Rails -- Check for N+1 query warnings in console (if development mode) -- Verify CSRF token presence in forms -- Test Turbo/Stimulus integration — do page transitions work smoothly? -- Check for flash messages appearing and dismissing correctly - -### WordPress -- Check for plugin conflicts (JS errors from different plugins) -- Verify admin bar visibility for logged-in users -- Test REST API endpoints (`/wp-json/`) -- Check for mixed content warnings (common with WP) - -### General SPA (React, Vue, Angular) -- Use `snapshot -i` for navigation — `links` command misses client-side routes -- Check for stale state (navigate away and back — does data refresh?) -- Test browser back/forward — does the app handle history correctly? -- Check for memory leaks (monitor console after extended use) - ---- - -## Important Rules - -1. **Repro is everything.** Every issue needs at least one screenshot. No exceptions. -2. **Verify before documenting.** Retry the issue once to confirm it's reproducible, not a fluke. -3. **Never include credentials.** Write `[REDACTED]` for passwords in repro steps. -4. **Write incrementally.** Append each issue to the report as you find it. Don't batch. -5. **Never read source code.** Test as a user, not a developer. -6. 
**Check console after every interaction.** JS errors that don't surface visually are still bugs. -7. **Test like a user.** Use realistic data. Walk through complete workflows end-to-end. -8. **Depth over breadth.** 5-10 well-documented issues with evidence > 20 vague descriptions. -9. **Never delete output files.** Screenshots and reports accumulate — that's intentional. -10. **Use `snapshot -C` for tricky UIs.** Finds clickable divs that the accessibility tree misses. - ---- - -## Output Structure - -``` -.gstack/qa-reports/ -├── qa-report-{domain}-{YYYY-MM-DD}.md # Structured report -├── screenshots/ -│ ├── initial.png # Landing page annotated screenshot -│ ├── issue-001-step-1.png # Per-issue evidence -│ ├── issue-001-result.png -│ └── ... -└── baseline.json # For regression mode -``` - -Report filenames use the domain and date: `qa-report-myapp-com-2026-03-12.md` diff --git a/qa/references/issue-taxonomy.md b/qa/references/issue-taxonomy.md deleted file mode 100644 index 05c5741..0000000 --- a/qa/references/issue-taxonomy.md +++ /dev/null @@ -1,85 +0,0 @@ -# QA Issue Taxonomy - -## Severity Levels - -| Severity | Definition | Examples | -|----------|------------|----------| -| **critical** | Blocks a core workflow, causes data loss, or crashes the app | Form submit causes error page, checkout flow broken, data deleted without confirmation | -| **high** | Major feature broken or unusable, no workaround | Search returns wrong results, file upload silently fails, auth redirect loop | -| **medium** | Feature works but with noticeable problems, workaround exists | Slow page load (>5s), form validation missing but submit still works, layout broken on mobile only | -| **low** | Minor cosmetic or polish issue | Typo in footer, 1px alignment issue, hover state inconsistent | - -## Categories - -### 1. 
Visual/UI -- Layout breaks (overlapping elements, clipped text, horizontal scrollbar) -- Broken or missing images -- Incorrect z-index (elements appearing behind others) -- Font/color inconsistencies -- Animation glitches (jank, incomplete transitions) -- Alignment issues (off-grid, uneven spacing) -- Dark mode / theme issues - -### 2. Functional -- Broken links (404, wrong destination) -- Dead buttons (click does nothing) -- Form validation (missing, wrong, bypassed) -- Incorrect redirects -- State not persisting (data lost on refresh, back button) -- Race conditions (double-submit, stale data) -- Search returning wrong or no results - -### 3. UX -- Confusing navigation (no breadcrumbs, dead ends) -- Missing loading indicators (user doesn't know something is happening) -- Slow interactions (>500ms with no feedback) -- Unclear error messages ("Something went wrong" with no detail) -- No confirmation before destructive actions -- Inconsistent interaction patterns across pages -- Dead ends (no way back, no next action) - -### 4. Content -- Typos and grammar errors -- Outdated or incorrect text -- Placeholder / lorem ipsum text left in -- Truncated text (cut off without ellipsis or "more") -- Wrong labels on buttons or form fields -- Missing or unhelpful empty states - -### 5. Performance -- Slow page loads (>3 seconds) -- Janky scrolling (dropped frames) -- Layout shifts (content jumping after load) -- Excessive network requests (>50 on a single page) -- Large unoptimized images -- Blocking JavaScript (page unresponsive during load) - -### 6. Console/Errors -- JavaScript exceptions (uncaught errors) -- Failed network requests (4xx, 5xx) -- Deprecation warnings (upcoming breakage) -- CORS errors -- Mixed content warnings (HTTP resources on HTTPS) -- CSP violations - -### 7. 
Accessibility
- Missing alt text on images
- Unlabeled form inputs
- Keyboard navigation broken (can't tab to elements)
- Focus traps (can't escape a modal or dropdown)
- Missing or incorrect ARIA attributes
- Insufficient color contrast
- Content not reachable by screen reader

## Per-Page Exploration Checklist

For each page visited during a QA session:

1. **Visual scan** — Take annotated screenshot (`snapshot -i -a -o`). Look for layout issues, broken images, alignment.
2. **Interactive elements** — Click every button, link, and control. Does each do what it says?
3. **Forms** — Fill and submit. Test empty submission, invalid data, edge cases (long text, special characters).
4. **Navigation** — Check all paths in/out. Breadcrumbs, back button, deep links, mobile menu.
5. **States** — Check empty state, loading state, error state, full/overflow state.
6. **Console** — Run `console --errors` after interactions. Any new JS errors or failed requests?
7. **Responsiveness** — If relevant, check mobile and tablet viewports.
8. **Auth boundaries** — What happens when logged out? Different user roles?
diff --git a/qa/templates/qa-report-template.md b/qa/templates/qa-report-template.md
deleted file mode 100644
index d118ab8..0000000
--- a/qa/templates/qa-report-template.md
+++ /dev/null
@@ -1,79 +0,0 @@
# QA Report: {APP_NAME}

| Field | Value |
|-------|-------|
| **Date** | {DATE} |
| **URL** | {URL} |
| **Scope** | {SCOPE or "Full app"} |
| **Mode** | {full / quick / regression} |
| **Duration** | {DURATION} |
| **Pages visited** | {COUNT} |
| **Screenshots** | {COUNT} |
| **Framework** | {DETECTED or "Unknown"} |

## Health Score: {SCORE}/100

| Category | Score |
|----------|-------|
| Console | {0-100} |
| Links | {0-100} |
| Visual | {0-100} |
| Functional | {0-100} |
| UX | {0-100} |
| Performance | {0-100} |
| Content | {0-100} |
| Accessibility | {0-100} |

## Top 3 Things to Fix

1. 
**{ISSUE-NNN}: {title}** — {one-line description} -2. **{ISSUE-NNN}: {title}** — {one-line description} -3. **{ISSUE-NNN}: {title}** — {one-line description} - -## Console Health - -| Error | Count | First seen | -|-------|-------|------------| -| {error message} | {N} | {URL} | - -## Summary - -| Severity | Count | -|----------|-------| -| Critical | 0 | -| High | 0 | -| Medium | 0 | -| Low | 0 | -| **Total** | **0** | - -## Issues - -### ISSUE-001: {Short title} - -| Field | Value | -|-------|-------| -| **Severity** | critical / high / medium / low | -| **Category** | visual / functional / ux / content / performance / console / accessibility | -| **URL** | {page URL} | - -**Description:** {What is wrong, expected vs actual.} - -**Repro Steps:** - -1. Navigate to {URL} - ![Step 1](screenshots/issue-001-step-1.png) -2. {Action} - ![Step 2](screenshots/issue-001-step-2.png) -3. **Observe:** {what goes wrong} - ![Result](screenshots/issue-001-result.png) - ---- - -## Regression (if applicable) - -| Metric | Baseline | Current | Delta | -|--------|----------|---------|-------| -| Health score | {N} | {N} | {+/-N} | -| Issues | {N} | {N} | {+/-N} | - -**Fixed since baseline:** {list} -**New since baseline:** {list} diff --git a/retro/SKILL.md b/retro/SKILL.md deleted file mode 100644 index 3469c92..0000000 --- a/retro/SKILL.md +++ /dev/null @@ -1,444 +0,0 @@ ---- -name: retro -version: 2.0.0 -description: | - Weekly engineering retrospective. Analyzes commit history, work patterns, - and code quality metrics with persistent history and trend tracking. - Team-aware: breaks down per-person contributions with praise and growth areas. -allowed-tools: - - Bash - - Read - - Write - - Glob ---- - -# /retro — Weekly Engineering Retrospective - -Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. 
Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier. - -## User-invocable -When the user types `/retro`, run this skill. - -## Arguments -- `/retro` — default: last 7 days -- `/retro 24h` — last 24 hours -- `/retro 14d` — last 14 days -- `/retro 30d` — last 30 days -- `/retro compare` — compare current window vs prior same-length window -- `/retro compare 14d` — compare with explicit window - -## Instructions - -Parse the argument to determine the time window. Default to 7 days if no argument given. Use `--since="N days ago"`, `--since="N hours ago"`, or `--since="N weeks ago"` (for `w` units) for git log queries. All times should be reported in **Pacific time** (use `TZ=America/Los_Angeles` when converting timestamps). - -**Argument validation:** If the argument doesn't match a number followed by `d`, `h`, or `w`, the word `compare`, or `compare` followed by a number and `d`/`h`/`w`, show this usage and stop: -``` -Usage: /retro [window] - /retro — last 7 days (default) - /retro 24h — last 24 hours - /retro 14d — last 14 days - /retro 30d — last 30 days - /retro compare — compare this period vs prior period - /retro compare 14d — compare with explicit window -``` - -### Step 1: Gather Raw Data - -First, fetch origin and identify the current user: -```bash -git fetch origin main --quiet -# Identify who is running the retro -git config user.name -git config user.email -``` - -The name returned by `git config user.name` is **"you"** — the person reading this retro. All other authors are teammates. Use this to orient the narrative: "your" commits vs teammate contributions. - -Run ALL of these git commands in parallel (they are independent): - -```bash -# 1. 
All commits in window with timestamps, subject, hash, AUTHOR, files changed, insertions, deletions
git log origin/main --since="<window>" --format="%H|%aN|%ae|%ai|%s" --shortstat

# 2. Per-commit test vs total LOC breakdown with author
# Each commit block starts with COMMIT:<hash>|<author>, followed by numstat lines.
# Separate test files (matching test/|spec/|__tests__/) from production files.
git log origin/main --since="<window>" --format="COMMIT:%H|%aN" --numstat

# 3. Commit timestamps for session detection and hourly distribution (with author)
# Use TZ=America/Los_Angeles for Pacific time conversion
TZ=America/Los_Angeles git log origin/main --since="<window>" --format="%at|%aN|%ai|%s" | sort -n

# 4. Files most frequently changed (hotspot analysis)
git log origin/main --since="<window>" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn

# 5. PR numbers from commit messages (extract #NNN patterns)
git log origin/main --since="<window>" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/'

# 6. Per-author file hotspots (who touches what)
git log origin/main --since="<window>" --format="AUTHOR:%aN" --name-only

# 7. Per-author commit counts (quick summary)
git shortlog origin/main --since="<window>" -sn --no-merges

# 8. 
Greptile triage history (if available) -cat ~/.gstack/greptile-history.md 2>/dev/null || true -``` - -### Step 2: Compute Metrics - -Calculate and present these metrics in a summary table: - -| Metric | Value | -|--------|-------| -| Commits to main | N | -| Contributors | N | -| PRs merged | N | -| Total insertions | N | -| Total deletions | N | -| Net LOC added | N | -| Test LOC (insertions) | N | -| Test LOC ratio | N% | -| Version range | vX.Y.Z.W → vX.Y.Z.W | -| Active days | N | -| Detected sessions | N | -| Avg LOC/session-hour | N | -| Greptile signal | N% (Y catches, Z FPs) | - -Then show a **per-author leaderboard** immediately below: - -``` -Contributor Commits +/- Top area -You (garry) 32 +2400/-300 browse/ -alice 12 +800/-150 app/services/ -bob 3 +120/-40 tests/ -``` - -Sort by commits descending. The current user (from `git config user.name`) always appears first, labeled "You (name)". - -**Greptile signal (if history exists):** Read `~/.gstack/greptile-history.md` (fetched in Step 1, command 8). Filter entries within the retro time window by date. Count entries by type: `fix`, `fp`, `already-fixed`. Compute signal ratio: `(fix + already-fixed) / (fix + already-fixed + fp)`. If no entries exist in the window or the file doesn't exist, skip the Greptile metric row. Skip unparseable lines silently. - -### Step 3: Commit Time Distribution - -Show hourly histogram in Pacific time using bar chart: - -``` -Hour Commits ████████████████ - 00: 4 ████ - 07: 5 █████ - ... -``` - -Identify and call out: -- Peak hours -- Dead zones -- Whether pattern is bimodal (morning/evening) or continuous -- Late-night coding clusters (after 10pm) - -### Step 4: Work Session Detection - -Detect sessions using **45-minute gap** threshold between consecutive commits. 
For each session report: -- Start/end time (Pacific) -- Number of commits -- Duration in minutes - -Classify sessions: -- **Deep sessions** (50+ min) -- **Medium sessions** (20-50 min) -- **Micro sessions** (<20 min, typically single-commit fire-and-forget) - -Calculate: -- Total active coding time (sum of session durations) -- Average session length -- LOC per hour of active time - -### Step 5: Commit Type Breakdown - -Categorize by conventional commit prefix (feat/fix/refactor/test/chore/docs). Show as percentage bar: - -``` -feat: 20 (40%) ████████████████████ -fix: 27 (54%) ███████████████████████████ -refactor: 2 ( 4%) ██ -``` - -Flag if fix ratio exceeds 50% — this signals a "ship fast, fix fast" pattern that may indicate review gaps. - -### Step 6: Hotspot Analysis - -Show top 10 most-changed files. Flag: -- Files changed 5+ times (churn hotspots) -- Test files vs production files in the hotspot list -- VERSION/CHANGELOG frequency (version discipline indicator) - -### Step 7: PR Size Distribution - -From commit diffs, estimate PR sizes and bucket them: -- **Small** (<100 LOC) -- **Medium** (100-500 LOC) -- **Large** (500-1500 LOC) -- **XL** (1500+ LOC) — flag these with file counts - -### Step 8: Focus Score + Ship of the Week - -**Focus score:** Calculate the percentage of commits touching the single most-changed top-level directory (e.g., `app/services/`, `app/views/`). Higher score = deeper focused work. Lower score = scattered context-switching. Report as: "Focus score: 62% (app/services/)" - -**Ship of the week:** Auto-identify the single highest-LOC PR in the window. Highlight it: -- PR number and title -- LOC changed -- Why it matters (infer from commit messages and files touched) - -### Step 9: Team Member Analysis - -For each contributor (including the current user), compute: - -1. **Commits and LOC** — total commits, insertions, deletions, net LOC -2. **Areas of focus** — which directories/files they touched most (top 3) -3. 
**Commit type mix** — their personal feat/fix/refactor/test breakdown -4. **Session patterns** — when they code (their peak hours), session count -5. **Test discipline** — their personal test LOC ratio -6. **Biggest ship** — their single highest-impact commit or PR in the window - -**For the current user ("You"):** This section gets the deepest treatment. Include all the detail from the solo retro — session analysis, time patterns, focus score. Frame it in first person: "Your peak hours...", "Your biggest ship..." - -**For each teammate:** Write 2-3 sentences covering what they worked on and their pattern. Then: - -- **Praise** (1-2 specific things): Anchor in actual commits. Not "great work" — say exactly what was good. Examples: "Shipped the entire auth middleware rewrite in 3 focused sessions with 45% test coverage", "Every PR under 200 LOC — disciplined decomposition." -- **Opportunity for growth** (1 specific thing): Frame as a leveling-up suggestion, not criticism. Anchor in actual data. Examples: "Test ratio was 12% this week — adding test coverage to the payment module before it gets more complex would pay off", "5 fix commits on the same file suggest the original PR could have used a review pass." - -**If only one contributor (solo repo):** Skip the team breakdown and proceed as before — the retro is personal. - -**If there are Co-Authored-By trailers:** Parse `Co-Authored-By:` lines in commit messages. Credit those authors for the commit alongside the primary author. Note AI co-authors (e.g., `noreply@anthropic.com`) but do not include them as team members — instead, track "AI-assisted commits" as a separate metric. 
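The Co-Authored-By accounting above can be sketched in a few lines of shell. This is a minimal sketch, not the skill's actual implementation: the `%(trailers:...)` pretty-format and the `noreply@anthropic.com` address are carried over from the examples in this step, and the here-string stands in for real `git log` output so the sketch runs anywhere.

```bash
# Sample of what the real pipeline might emit, one commit per line:
#   git log origin/main --since="7 days ago" \
#     --format='%h %(trailers:key=Co-Authored-By,valueonly,separator=;)'
log='abc123 Claude <noreply@anthropic.com>
def456
789abc Alice <alice@example.com>;Claude <noreply@anthropic.com>'

# Commits with an AI co-author vs. total commits in the window
ai=$(printf '%s\n' "$log" | grep -c 'noreply@anthropic\.com')
total=$(printf '%s\n' "$log" | wc -l | tr -d ' ')
echo "AI-assisted commits: ${ai}/${total}"   # → AI-assisted commits: 2/3
```

Note that the commit with only a human co-author counts toward `total` but not `ai`, matching the rule above: human co-authors are credited as teammates, AI co-authors become a separate metric.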
### Step 10: Week-over-Week Trends (if window >= 14d)

If the time window is 14 days or more, split into weekly buckets and show trends:
- Commits per week (total and per-author)
- LOC per week
- Test ratio per week
- Fix ratio per week
- Session count per week

### Step 11: Streak Tracking

Count consecutive days with at least 1 commit to origin/main, going back from today. Track both team streak and personal streak:

```bash
# Team streak: all unique commit dates (Pacific time) — no hard cutoff
TZ=America/Los_Angeles git log origin/main --format="%ad" --date=format:"%Y-%m-%d" | sort -u

# Personal streak: only the current user's commits
TZ=America/Los_Angeles git log origin/main --author="<user name>" --format="%ad" --date=format:"%Y-%m-%d" | sort -u
```

Count backward from today — how many consecutive days have at least one commit? This queries the full history so streaks of any length are reported accurately. Display both:
- "Team shipping streak: 47 consecutive days"
- "Your shipping streak: 32 consecutive days"

### Step 12: Load History & Compare

Before saving the new snapshot, check for prior retro history:

```bash
ls -t .context/retros/*.json 2>/dev/null
```

**If prior retros exist:** Load the most recent one using the Read tool. Calculate deltas for key metrics and include a **Trends vs Last Retro** section:
```
                 Last     Now      Delta
Test ratio:      22%   →  41%      ↑19pp
Sessions:        10    →  14       ↑4
LOC/hour:        200   →  350      ↑75%
Fix ratio:       54%   →  30%      ↓24pp (improving)
Commits:         32    →  47       ↑47%
Deep sessions:   3     →  5        ↑2
```

**If no prior retros exist:** Skip the comparison section and append: "First retro recorded — run again next week to see trends." 
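Step 11's backward walk can be sketched as follows. The date list is pinned sample data standing in for the `sort -u` output, and `today` is pinned so the result is reproducible; the GNU-then-BSD `date` fallback is an assumption about which platforms the skill runs on.

```bash
# Unique commit dates, as produced by the git log | sort -u pipeline above
dates='2026-03-05
2026-03-06
2026-03-07
2026-03-08'
today='2026-03-08'   # real runs: TZ=America/Los_Angeles date +%Y-%m-%d

streak=0
day="$today"
# Walk backward one day at a time until a day with no commits appears
while printf '%s\n' "$dates" | grep -qx "$day"; do
  streak=$((streak + 1))
  day=$(date -u -d "$day -1 day" +%Y-%m-%d 2>/dev/null ||
        date -u -j -f %Y-%m-%d -v-1d "$day" +%Y-%m-%d)   # GNU date, then BSD
done
echo "Shipping streak: ${streak} days"
```

Because the loop starts at today, a repo with no commit today reports a streak of 0, which is the intended "count backward from today" semantics.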
- -### Step 13: Save Retro History - -After computing all metrics (including streak) and loading any prior history for comparison, save a JSON snapshot: - -```bash -mkdir -p .context/retros -``` - -Determine the next sequence number for today (substitute the actual date for `$(date +%Y-%m-%d)`): -```bash -# Count existing retros for today to get next sequence number -today=$(TZ=America/Los_Angeles date +%Y-%m-%d) -existing=$(ls .context/retros/${today}-*.json 2>/dev/null | wc -l | tr -d ' ') -next=$((existing + 1)) -# Save as .context/retros/${today}-${next}.json -``` - -Use the Write tool to save the JSON file with this schema: -```json -{ - "date": "2026-03-08", - "window": "7d", - "metrics": { - "commits": 47, - "contributors": 3, - "prs_merged": 12, - "insertions": 3200, - "deletions": 800, - "net_loc": 2400, - "test_loc": 1300, - "test_ratio": 0.41, - "active_days": 6, - "sessions": 14, - "deep_sessions": 5, - "avg_session_minutes": 42, - "loc_per_session_hour": 350, - "feat_pct": 0.40, - "fix_pct": 0.30, - "peak_hour": 22, - "ai_assisted_commits": 32 - }, - "authors": { - "Garry Tan": { "commits": 32, "insertions": 2400, "deletions": 300, "test_ratio": 0.41, "top_area": "browse/" }, - "Alice": { "commits": 12, "insertions": 800, "deletions": 150, "test_ratio": 0.35, "top_area": "app/services/" } - }, - "version_range": ["1.16.0.0", "1.16.1.0"], - "streak_days": 47, - "tweetable": "Week of Mar 1: 47 commits (3 contributors), 3.2k LOC, 38% tests, 12 PRs, peak: 10pm", - "greptile": { - "fixes": 3, - "fps": 1, - "already_fixed": 2, - "signal_pct": 83 - } -} -``` - -**Note:** Only include the `greptile` field if `~/.gstack/greptile-history.md` exists and has entries within the time window. If no history data is available, omit the field entirely. 
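The per-day sequence numbering in Step 13 is worth a runnable sketch, since it is what keeps a second retro on the same day from overwriting the first. A temp directory stands in for `.context/retros/` and the date is pinned for reproducibility; everything else follows the snippet above.

```bash
dir=$(mktemp -d)     # stand-in for .context/retros/
today='2026-03-08'   # real runs: TZ=America/Los_Angeles date +%Y-%m-%d

# Pretend two retros were already saved today
touch "$dir/${today}-1.json" "$dir/${today}-2.json"

existing=$(ls "$dir/${today}"-*.json 2>/dev/null | wc -l | tr -d ' ')
next=$((existing + 1))
echo "Next snapshot: ${today}-${next}.json"   # → Next snapshot: 2026-03-08-3.json
rm -rf "$dir"
```

Since the count restarts with each date key, filenames stay stable and sortable no matter how many retros land on one day.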
### Step 14: Write the Narrative

Structure the output as:

---

**Tweetable summary** (first line, before everything else):
```
Week of Mar 1: 47 commits (3 contributors), 3.2k LOC, 38% tests, 12 PRs, peak: 10pm | Streak: 47d
```

## Engineering Retro: [date range]

### Summary Table
(from Step 2)

### Trends vs Last Retro
(from Step 12, loaded before save — skip if first retro)

### Time & Session Patterns
(from Steps 3-4)

Narrative interpreting what the team-wide patterns mean:
- When the most productive hours are and what drives them
- Whether sessions are getting longer or shorter over time
- Estimated hours per day of active coding (team aggregate)
- Notable patterns: do team members code at the same time or in shifts?

### Shipping Velocity
(from Steps 5-7)

Narrative covering:
- Commit type mix and what it reveals
- PR size discipline (are PRs staying small?)
- Fix-chain detection (sequences of fix commits on the same subsystem)
- Version bump discipline

### Code Quality Signals
- Test LOC ratio trend
- Hotspot analysis (are the same files churning?)
- Any XL PRs that should have been split
- Greptile signal ratio and trend (if history exists): "Greptile: X% signal (Y valid catches, Z false positives)"

### Focus & Highlights
(from Step 8)
- Focus score with interpretation
- Ship of the week callout

### Your Week (personal deep-dive)
(from Step 9, for the current user only)

This is the section the user cares most about. 
Include: -- Their personal commit count, LOC, test ratio -- Their session patterns and peak hours -- Their focus areas -- Their biggest ship -- **What you did well** (2-3 specific things anchored in commits) -- **Where to level up** (1-2 specific, actionable suggestions) - -### Team Breakdown -(from Step 9, for each teammate — skip if solo repo) - -For each teammate (sorted by commits descending), write a section: - -#### [Name] -- **What they shipped**: 2-3 sentences on their contributions, areas of focus, and commit patterns -- **Praise**: 1-2 specific things they did well, anchored in actual commits. Be genuine — what would you actually say in a 1:1? Examples: - - "Cleaned up the entire auth module in 3 small, reviewable PRs — textbook decomposition" - - "Added integration tests for every new endpoint, not just happy paths" - - "Fixed the N+1 query that was causing 2s load times on the dashboard" -- **Opportunity for growth**: 1 specific, constructive suggestion. Frame as investment, not criticism. Examples: - - "Test coverage on the payment module is at 8% — worth investing in before the next feature lands on top of it" - - "3 of the 5 PRs were 800+ LOC — breaking these up would catch issues earlier and make review easier" - - "All commits land between 1-4am — sustainable pace matters for code quality long-term" - -**AI collaboration note:** If many commits have `Co-Authored-By` AI trailers (e.g., Claude, Copilot), note the AI-assisted commit percentage as a team metric. Frame it neutrally — "N% of commits were AI-assisted" — without judgment. - -### Top 3 Team Wins -Identify the 3 highest-impact things shipped in the window across the whole team. For each: -- What it was -- Who shipped it -- Why it matters (product/architecture impact) - -### 3 Things to Improve -Specific, actionable, anchored in actual commits. Mix personal and team-level suggestions. Phrase as "to get even better, the team could..." - -### 3 Habits for Next Week -Small, practical, realistic. 
Each must be something that takes <5 minutes to adopt. At least one should be team-oriented (e.g., "review each other's PRs same-day"). - -### Week-over-Week Trends -(if applicable, from Step 10) - ---- - -## Compare Mode - -When the user runs `/retro compare` (or `/retro compare 14d`): - -1. Compute metrics for the current window (default 7d) using `--since="7 days ago"` -2. Compute metrics for the immediately prior same-length window using both `--since` and `--until` to avoid overlap (e.g., `--since="14 days ago" --until="7 days ago"` for a 7d window) -3. Show a side-by-side comparison table with deltas and arrows -4. Write a brief narrative highlighting the biggest improvements and regressions -5. Save only the current-window snapshot to `.context/retros/` (same as a normal retro run); do **not** persist the prior-window metrics. - -## Tone - -- Encouraging but candid, no coddling -- Specific and concrete — always anchor in actual commits/code -- Skip generic praise ("great job!") — say exactly what was good and why -- Frame improvements as leveling up, not criticism -- **Praise should feel like something you'd actually say in a 1:1** — specific, earned, genuine -- **Growth suggestions should feel like investment advice** — "this is worth your time because..." not "you failed at..." -- Never compare teammates against each other negatively. Each person's section stands on its own. -- Keep total output around 3000-4500 words (slightly longer to accommodate team sections) -- Use markdown tables and code blocks for data, prose for narrative -- Output directly to the conversation — do NOT write to filesystem (except the `.context/retros/` JSON snapshot) - -## Important Rules - -- ALL narrative output goes directly to the user in the conversation. The ONLY file written is the `.context/retros/` JSON snapshot. 
-- Use `origin/main` for all git queries (not local main which may be stale) -- Convert all timestamps to Pacific time for display (use `TZ=America/Los_Angeles`) -- If the window has zero commits, say so and suggest a different window -- Round LOC/hour to nearest 50 -- Treat merge commits as PR boundaries -- Do not read CLAUDE.md or other docs — this skill is self-contained -- On first run (no prior retros), skip comparison sections gracefully diff --git a/review/SKILL.md b/review/SKILL.md deleted file mode 100644 index 35075d1..0000000 --- a/review/SKILL.md +++ /dev/null @@ -1,112 +0,0 @@ ---- -name: review -version: 1.0.0 -description: | - Pre-landing PR review. Analyzes diff against main for SQL safety, LLM trust - boundary violations, conditional side effects, and other structural issues. -allowed-tools: - - Bash - - Read - - Edit - - Write - - Grep - - Glob - - AskUserQuestion ---- - -# Pre-Landing PR Review - -You are running the `/review` workflow. Analyze the current branch's diff against main for structural issues that tests don't catch. - ---- - -## Step 1: Check branch - -1. Run `git branch --show-current` to get the current branch. -2. If on `main`, output: **"Nothing to review — you're on main or have no changes against main."** and stop. -3. Run `git fetch origin main --quiet && git diff origin/main --stat` to check if there's a diff. If no diff, output the same message and stop. - ---- - -## Step 2: Read the checklist - -Read `.claude/skills/review/checklist.md`. - -**If the file cannot be read, STOP and report the error.** Do not proceed without the checklist. - ---- - -## Step 2.5: Check for Greptile review comments - -Read `.claude/skills/review/greptile-triage.md` and follow the fetch, filter, and classify steps. - -**If no PR exists, `gh` fails, API returns an error, or there are zero Greptile comments:** Skip this step silently. Greptile integration is additive — the review works without it. 
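One way to read the "skip silently" rule: treat the Greptile fetch as an optional dependency and branch on its exit status. The `fetch_greptile_comments` function below is a hypothetical stand-in for whatever fetch `greptile-triage.md` actually specifies (for example a `gh` API call); here it simulates the failure path so the sketch runs anywhere.

```bash
# Hypothetical stand-in for the real fetch; returning non-zero covers
# "no PR", "gh fails", and "API error" alike.
fetch_greptile_comments() { return 1; }

if comments=$(fetch_greptile_comments); then
  echo "Triaging Greptile comments"
else
  # Greptile is additive: degrade to a normal review, surface no error
  echo "No Greptile comments, continuing review without triage"
fi
```

The point of the structure is that every failure mode collapses into the same quiet else-branch, so the review never blocks on the integration.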
**If Greptile comments are found:** Store the classifications (VALID & ACTIONABLE, VALID BUT ALREADY FIXED, FALSE POSITIVE, SUPPRESSED) — you will need them in Step 5.

---

## Step 3: Get the diff

Fetch the latest main to avoid false positives from a stale local main:

```bash
git fetch origin main --quiet
```

Run `git diff origin/main` to get the full diff. This includes both committed and uncommitted changes against the latest main.

---

## Step 4: Two-pass review

Apply the checklist against the diff in two passes:

1. **Pass 1 (CRITICAL):** SQL & Data Safety, Race Conditions & Concurrency, LLM Output Trust Boundary
2. **Pass 2 (INFORMATIONAL):** Conditional Side Effects, Magic Numbers & String Coupling, Dead Code & Consistency, LLM Prompt Issues, Test Gaps, Crypto & Entropy, Time Window Safety, Type Coercion at Boundaries, View/Frontend

Follow the output format specified in the checklist. Respect the suppressions — do NOT flag items listed in the "DO NOT flag" section.

---

## Step 5: Output findings

**Always output ALL findings** — both critical and informational. The user must see every issue.

- If CRITICAL issues found: output all findings, then for EACH critical issue use a separate AskUserQuestion with the problem, your recommended fix, and options (A: Fix it now, B: Acknowledge, C: False positive — skip). After all critical questions are answered, output a summary of what the user chose for each issue. If the user chose A (fix) on any issue, apply the recommended fixes. If only B/C were chosen, no action needed.
- If only non-critical issues found: output findings. No further action needed.
- If no issues found: output `Pre-Landing Review: No issues found.`

### Greptile comment resolution

After outputting your own findings, if Greptile comments were classified in Step 2.5:

**Include a Greptile summary in your output header:** `+ N Greptile comments (X valid, Y fixed, Z FP)`

1. 
**VALID & ACTIONABLE comments:** These are already included in your CRITICAL findings — they follow the same AskUserQuestion flow (A: Fix it now, B: Acknowledge, C: False positive). If the user chooses C (false positive), post a reply using the appropriate API from the triage doc and save the pattern to `~/.gstack/greptile-history.md` (type: fp).

2. **FALSE POSITIVE comments:** Present each one via AskUserQuestion:
   - Show the Greptile comment: file:line (or [top-level]) + body summary + permalink URL
   - Explain concisely why it's a false positive
   - Options:
     - A) Reply to Greptile explaining why this is incorrect (recommended if clearly wrong)
     - B) Fix it anyway (if low-effort and harmless)
     - C) Ignore — don't reply, don't fix
   - If the user chooses A, post a reply using the appropriate API from the triage doc and save the pattern to `~/.gstack/greptile-history.md` (type: fp).

3. **VALID BUT ALREADY FIXED comments:** Reply acknowledging the catch — no AskUserQuestion needed:
   - Post reply: `"Good catch — already fixed in <commit>."`
   - Save to `~/.gstack/greptile-history.md` (type: already-fixed)

4. **SUPPRESSED comments:** Skip silently — these are known false positives from previous triage.

---

## Important Rules

- **Read the FULL diff before commenting.** Do not flag issues already addressed in the diff.
- **Read-only by default.** Only modify files if the user explicitly chooses "Fix it now" on a critical issue. Never commit, push, or create PRs.
- **Be terse.** One line problem, one line fix. No preamble.
- **Only flag real problems.** Skip anything that's fine.
diff --git a/review/checklist.md b/review/checklist.md
deleted file mode 100644
index e321890..0000000
--- a/review/checklist.md
+++ /dev/null
@@ -1,125 +0,0 @@
# Pre-Landing Review Checklist

## Instructions

Review the `git diff origin/main` output for the issues listed below. Be specific — cite `file:line` and suggest fixes. Skip anything that's fine. 
Only flag real problems. - -**Two-pass review:** -- **Pass 1 (CRITICAL):** Run SQL & Data Safety and LLM Output Trust Boundary first. These can block `/ship`. -- **Pass 2 (INFORMATIONAL):** Run all remaining categories. These are included in the PR body but do not block. - -**Output format:** - -``` -Pre-Landing Review: N issues (X critical, Y informational) - -**CRITICAL** (blocking /ship): -- [file:line] Problem description - Fix: suggested fix - -**Issues** (non-blocking): -- [file:line] Problem description - Fix: suggested fix -``` - -If no issues found: `Pre-Landing Review: No issues found.` - -Be terse. For each issue: one line describing the problem, one line with the fix. No preamble, no summaries, no "looks good overall." - ---- - -## Review Categories - -### Pass 1 — CRITICAL - -#### SQL & Data Safety -- String interpolation in SQL (even if values are `.to_i`/`.to_f` — use `sanitize_sql_array` or Arel) -- TOCTOU races: check-then-set patterns that should be atomic `WHERE` + `update_all` -- `update_column`/`update_columns` bypassing validations on fields that have or should have constraints -- N+1 queries: `.includes()` missing for associations used in loops/views (especially avatar, attachments) - -#### Race Conditions & Concurrency -- Read-check-write without uniqueness constraint or `rescue RecordNotUnique; retry` (e.g., `where(hash:).first` then `save!` without handling concurrent insert) -- `find_or_create_by` on columns without unique DB index — concurrent calls can create duplicates -- Status transitions that don't use atomic `WHERE old_status = ? UPDATE SET new_status` — concurrent updates can skip or double-apply transitions -- `html_safe` on user-controlled data (XSS) — check any `.html_safe`, `raw()`, or string interpolation into `html_safe` output - -#### LLM Output Trust Boundary -- LLM-generated values (emails, URLs, names) written to DB or passed to mailers without format validation. 
Add lightweight guards (`EMAIL_REGEXP`, `URI.parse`, `.strip`) before persisting. -- Structured tool output (arrays, hashes) accepted without type/shape checks before database writes. - -### Pass 2 — INFORMATIONAL - -#### Conditional Side Effects -- Code paths that branch on a condition but forget to apply a side effect on one branch. Example: item promoted to verified but URL only attached when a secondary condition is true — the other branch promotes without the URL, creating an inconsistent record. -- Log messages that claim an action happened but the action was conditionally skipped. The log should reflect what actually occurred. - -#### Magic Numbers & String Coupling -- Bare numeric literals used in multiple files — should be named constants documented together -- Error message strings used as query filters elsewhere (grep for the string — is anything matching on it?) - -#### Dead Code & Consistency -- Variables assigned but never read -- Version mismatch between PR title and VERSION/CHANGELOG files -- CHANGELOG entries that describe changes inaccurately (e.g., "changed from X to Y" when X never existed) -- Comments/docstrings that describe old behavior after the code changed - -#### LLM Prompt Issues -- 0-indexed lists in prompts (LLMs reliably return 1-indexed) -- Prompt text listing available tools/capabilities that don't match what's actually wired up in the `tool_classes`/`tools` array -- Word/token limits stated in multiple places that could drift - -#### Test Gaps -- Negative-path tests that assert type/status but not the side effects (URL attached? field populated? callback fired?) 
-- Assertions on string content without checking format (e.g., asserting title present but not URL format) -- `.expects(:something).never` missing when a code path should explicitly NOT call an external service -- Security enforcement features (blocking, rate limiting, auth) without integration tests verifying the enforcement path works end-to-end - -#### Crypto & Entropy -- Truncation of data instead of hashing (last N chars instead of SHA-256) — less entropy, easier collisions -- `rand()` / `Random.rand` for security-sensitive values — use `SecureRandom` instead -- Non-constant-time comparisons (`==`) on secrets or tokens — vulnerable to timing attacks - -#### Time Window Safety -- Date-key lookups that assume "today" covers 24h — report at 8am PT only sees midnight→8am under today's key -- Mismatched time windows between related features — one uses hourly buckets, another uses daily keys for the same data - -#### Type Coercion at Boundaries -- Values crossing Ruby→JSON→JS boundaries where type could change (numeric vs string) — hash/digest inputs must normalize types -- Hash/digest inputs that don't call `.to_s` or equivalent before serialization — `{ cores: 8 }` vs `{ cores: "8" }` produce different hashes - -#### View/Frontend -- Inline `