Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -11,3 +11,5 @@ bun.lock
.env.local
.env.*
!.env.example
.DS_Store
gemini-port/generated/
20 changes: 20 additions & 0 deletions TODOS.md
Original file line number Diff line number Diff line change
Expand Up @@ -388,3 +388,23 @@
### Auto-upgrade mode + smart update check
- Config CLI (`bin/gstack-config`), auto-upgrade via `~/.gstack/config.yaml`, 12h cache TTL, exponential snooze backoff (24h→48h→1wk), "never ask again" option, vendored copy sync on upgrade
**Completed:** v0.3.8

### Gemini CLI port
- `gemini-port/` with native Gemini skill system integration
- `gstack-browse`: stateful Playwright engine with 14 commands, `@eN` ref selector, annotated screenshots, diff mode, console/network capture
- `gstack-setup-browser-cookies`: macOS Keychain cookie extraction with full attribute normalization
- `install.sh`: generates Gemini-adapted versions of 6 main skills at install time (single source of truth), links all 8 skills via `gemini skills link`
- Output truncation for `text`/`html` commands (50k char limit)
**Completed:** feat/gemini-port

## Gemini CLI Port

### Linux/Windows support for gstack-setup-browser-cookies

**What:** The `setup-cookies.js` script relies on `chrome-cookies-secure` which uses the macOS Keychain. Add support for GNOME Keyring (Linux) and DPAPI (Windows).

**Why:** Cross-platform cookie import. Currently macOS-only.

**Effort:** M
**Priority:** P3
**Depends on:** Linux/Windows cookie decryption (Browse section)
94 changes: 94 additions & 0 deletions gemini-port/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
# Gemini gstack: Team of Specialists for Gemini CLI

This directory ports the [gstack](https://github.com/garrytan/gstack) toolkit to Gemini CLI.

gstack transforms Gemini CLI from a general-purpose assistant into a team of opinionated specialists.

## Architecture

Only two components live permanently in this directory — the ones that are genuinely Gemini-specific:

| Directory | What it is |
| :--- | :--- |
| `gstack-browse/` | Stateful Playwright browser engine for Gemini CLI |
| `gstack-setup-browser-cookies/` | macOS Keychain cookie extractor for authenticated testing |

The 6 "content" skills (ship, reviewer, qa, retro, ceo, eng-lead) are **generated at install time** from the main gstack repo. `install.sh` strips Claude-specific frontmatter and tooling from each SKILL.md and links the result into Gemini CLI. There is no duplication — the main repo is the single source of truth.

## Installation

```bash
# From the gstack repo root:
bash gemini-port/install.sh

# Or with workspace scope (installs for a single Gemini workspace):
bash gemini-port/install.sh --scope workspace
```

Then reload in Gemini:
```
/skills reload
```

That's it. The script handles npm deps, skill generation, and `gemini skills link` for all 8 skills.

## Updating

When the main gstack skills evolve, regenerate the Gemini versions by re-running:

```bash
bash gemini-port/install.sh
```

## Skills

| Command | Role | Focus |
| :--- | :--- | :--- |
| `gstack-ceo` | **CEO / Founder** | Product vision, 10-star experience, problem scoping. |
| `gstack-eng-lead` | **Tech Lead** | Architecture, state machines, test matrices, Mermaid diagrams. |
| `gstack-reviewer` | **Paranoid Reviewer** | Structural audit, race conditions, N+1 queries, security. |
| `gstack-ship` | **Release Engineer** | Git sync, test execution, PR creation. |
| `gstack-browse` | **QA Engineer** | Stateful browser automation via Playwright. |
| `gstack-qa` | **QA Lead** | Automated regression testing based on git diffs. |
| `gstack-setup-browser-cookies` | **Session Manager** | macOS Keychain cookie extraction for authenticated testing. |
| `gstack-retro` | **EM Retro** | Data-driven weekly engineering retrospectives. |

## Technical Notes

### Stateful Browsing

Unlike the original Claude gstack (which requires a persistent Bun daemon), this port uses a **chained command architecture** in Playwright. A sequence of actions (`goto -> click -> fill -> screenshot`) runs in a single Node.js execution context, preserving DOM state without a background process.

### Cookie Extraction

`gstack-setup-browser-cookies` uses the macOS Keychain to decrypt local Chrome cookies, normalizing `SameSite` and `Expires` attributes for Playwright ingestion.

### Skill Generation

`install.sh` applies the following transforms to each main SKILL.md:

1. Strip Claude frontmatter (`allowed-tools`, `version`)
2. Add minimal Gemini frontmatter (`name`, `description`)
3. Remove the update-check block (Claude-specific self-upgrade logic)
4. Replace the `$B` binary discovery block with `B="node $HOME/.gemini/skills/gstack-browse/scripts/browse.js"`
5. Remove HTML generator comments

Generated files land in `generated/` (gitignored — they are build artifacts).

## Rebuilding Skill Bundles

The `.skill` files are pre-packaged archives for `gemini skills install` (offline use). After modifying source files, rebuild with:

```bash
# Rebuild gstack-browse
node /path/to/skill-creator/scripts/package_skill.cjs gstack-browse

# Rebuild gstack-setup-browser-cookies
node /path/to/skill-creator/scripts/package_skill.cjs gstack-setup-browser-cookies
```

The `SKILL.md` files in each skill subdirectory are the authoritative source. The `.skill` bundles are distribution artifacts.

---
*Ported by Rémi Al Ajroudi from [gstack](https://github.com/garrytan/gstack).*

Binary file added gemini-port/gstack-browse.skill
Binary file not shown.
190 changes: 190 additions & 0 deletions gemini-port/gstack-browse/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,190 @@
---
name: gstack-browse
description: QA Engineer Mode (Browsing). Use when asked to test a UI, navigate a site, verify rendering, fill forms, or perform automated browser tasks. Gives the agent "eyes" and hands to interact with live applications.
---

# QA Engineer Mode (Browsing)

You are acting as a QA Engineer with a live Playwright browser. Your goal is to interact with web applications, verify UI state, document bugs, and catch regressions.

The browser uses a **persistent Chromium profile** at `~/.gstack/gemini-browser-data` — cookies, login sessions, and storage persist across runs. SSL certificate errors are ignored by default (supports local/staging servers).

**All commands run in the same browser context** — chain them in a single `run_shell_command` call to preserve state (DOM, cookies, navigation history).

---

## Setup

Before using any browse command, ensure Node.js and the Playwright dependency are installed:

```bash
cd <skill_dir>/gstack-browse && node -e "require('playwright')" 2>/dev/null || npm install
```

Then run commands via:
```bash
node <skill_dir>/gstack-browse/scripts/browse.js <command1> [args...] <command2> [args...]
```

For brevity in this document, `$B` means `node <skill_dir>/gstack-browse/scripts/browse.js`.

---

## Core QA Patterns

### 1. Verify a page loads correctly
```bash
$B goto https://yourapp.com
$B text # content loads?
$B console --errors # JS errors?
$B network --errors # failed requests?
$B is visible ".main-content" # key elements present?
```

### 2. Test a login flow
```bash
$B goto https://app.com/login
$B snapshot -i # see all interactive elements → get @eN refs
$B fill @e2 "user@test.com"
$B fill @e3 "password"
$B click @e4 # submit
$B snapshot -D # diff: what changed after submit?
$B is visible ".dashboard" # success state?
```

### 3. Verify an action worked
```bash
$B snapshot # baseline DOM snapshot
$B click @e3 # do something
$B snapshot -D # unified diff — shows exactly what changed
```

### 4. Visual evidence for bug reports
```bash
$B snapshot -i -a -o /tmp/annotated.png # annotated screenshot with @eN labels
$B screenshot /tmp/bug.png # plain full-page screenshot
$B console --errors # error log
```

### 5. Find all clickable elements (including non-ARIA divs)
```bash
$B snapshot -C # includes cursor:pointer, onclick elements
$B click @c1
```

### 6. Assert element states
```bash
$B is visible ".modal"
$B is enabled "#submit-btn"
$B is disabled "#submit-btn"
$B is checked "#agree-checkbox"
$B is editable "#name-field"
```

### 7. Test responsive layouts
```bash
$B viewport 375x812
$B screenshot /tmp/mobile.png
$B viewport 1280x720
```

### 8. Fill forms and select dropdowns
```bash
$B fill "#username" "alice"
$B select "#country" "United States"
$B press Tab
$B press Enter
```

### 9. Test page with authentication (cookies already imported)
```bash
$B goto https://app.com/dashboard # session cookies loaded from persistent profile
$B snapshot -i -a -o /tmp/dash.png
```

---

## Available Commands

### Navigation
| Command | Args | Description |
|---------|------|-------------|
| `goto` | `<url>` | Navigate to a URL. Waits for network idle. |
| `wait` | `<selector\|Nms>` | Wait for CSS selector to appear, or pause N milliseconds. |
| `scroll` | `<top\|bottom\|up\|down\|selector>` | Scroll page or bring element into view. |
| `viewport` | `<WxH>` | Set browser viewport. Example: `375x812`, `1280x720`. |

### Interaction
| Command | Args | Description |
|---------|------|-------------|
| `click` | `<selector>` | Click an element. Accepts CSS selectors or `@eN` refs. |
| `fill` | `<selector> <text>` | Fill an input field. Wrap multi-word text in quotes. |
| `hover` | `<selector>` | Mouse over an element (reveals tooltips, dropdowns). |
| `press` | `<key>` | Keyboard press. Examples: `Enter`, `Tab`, `Escape`, `ArrowDown`. |
| `select` | `<selector> <value>` | Choose an option in a `<select>` dropdown by label or value. |

### Observation
| Command | Args | Description |
|---------|------|-------------|
| `text` | — | Output page visible text. |
| `html` | — | Output full page HTML. |
| `links` | — | List all links (`href` + text). |
| `console` | `[--errors]` | Show browser console messages. `--errors` filters to errors/warnings only. |
| `network` | `[--errors]` | Show network requests. `--errors` shows only 4xx/5xx/failed. |
| `count` | `<selector>` | Count elements matching a CSS selector. |
| `is` | `<state> <selector>` | Assert element state. Exits non-zero on failure. |
| `js` | `<code>` | Execute JavaScript; print result as JSON. |

### Screenshots & Snapshots
| Command | Flags | Description |
|---------|-------|-------------|
| `screenshot` | `<path>` | Full-page screenshot saved to path. |
| `snapshot` | — | Full accessibility tree as text. |
| `snapshot` | `-i` | Interactive elements only (inputs, buttons, links, selects). |
| `snapshot` | `-C` | Like `-i` plus non-ARIA clickables (cursor:pointer, onclick). |
| `snapshot` | `-D` | Diff mode: show changes since last snapshot call. |
| `snapshot` | `-a -o <path>` | Annotated screenshot with `@eN` labels saved to path. |
| `snapshot` | `-i -a -o <path>` | Annotated screenshot of interactive elements only. |

### Session & Cookies
| Command | Args | Description |
|---------|------|-------------|
| `cookie-import` | `<path>` | Import cookies from a JSON file (Playwright cookie format). |

---

## @eN Selectors

`snapshot -i` outputs an element table like:

```
3 interactive elements:
@e1 input text "Email" #email
@e2 input password "Password" #password
@e3 button "Sign In" .btn-primary
```

These `@eN` refs can be used directly as selectors in `click`, `fill`, `hover`, `is`, and `scroll`. They resolve to the underlying CSS selector. The refs are valid for the current page — they reset when you navigate.

---

## `is` States

| State | Assertion |
|-------|-----------|
| `visible` | Element exists and is visible |
| `hidden` | Element is absent or not visible |
| `enabled` | Element is not disabled |
| `disabled` | Element has disabled attribute |
| `checked` | Checkbox/radio is checked |
| `editable` | Input/textarea is editable (not readonly) |

---

## Important Notes

- **Chaining is how state is preserved.** All commands in a single `node browse.js ...` call share the same page and session.
- **`@eN` refs are per-snapshot.** Re-run `snapshot -i` after navigation to get fresh refs.
- **Session data persists at `~/.gstack/gemini-browser-data`** across separate `node browse.js` invocations. Cookies from `gstack-setup-browser-cookies` are stored there.
- **`snapshot -D`** compares to the last snapshot call in any previous run (stored in `/tmp/gstack-snapshot-state.json`).
- **On SSL errors:** Ignored by default — fine for localhost and staging. Do not test production credential flows on untrusted cert sites.
60 changes: 60 additions & 0 deletions gemini-port/gstack-browse/package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

16 changes: 16 additions & 0 deletions gemini-port/gstack-browse/package.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
{
"name": "gstack-browse",
"version": "1.0.0",
"description": "",
"main": "index.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"keywords": [],
"author": "",
"license": "ISC",
"type": "commonjs",
"dependencies": {
"playwright": "^1.58.2"
}
}
Loading