Skip to content

feat: add doctor diagnostics and layered GraphQL error hints#31

Open
ShigureNyako wants to merge 5 commits intoShigureLab:mainfrom
ShigureNyako:fix/doctor-graphql-diagnostics
Open

feat: add doctor diagnostics and layered GraphQL error hints#31
ShigureNyako wants to merge 5 commits intoShigureLab:mainfrom
ShigureNyako:fix/doctor-graphql-diagnostics

Conversation

@ShigureNyako
Copy link
Member

@ShigureNyako ShigureNyako commented Mar 21, 2026

Summary

Fixes #23.

Motivation

  • gh-llm currently surfaces transport failures such as Post "https://api.github.com/graphql": EOF as a raw gh error.
  • Users have to manually inspect entrypoint differences, auth state, network variables, REST reachability, and GraphQL reachability to understand whether the failure is local, auth-related, or transport-related.

Solution

  • add top-level gh-llm doctor diagnostics for entrypoint, executable paths, versions, active-host auth status, REST/GraphQL probes, and proxy-related environment variables
  • report both common lowercase and uppercase proxy variables so doctor does not hide active all_proxy / no_proxy setups
  • introduce structured GhCommandError handling so retried gh failures can surface retry counts and layered follow-up hints instead of only raw stderr
  • classify GraphQL transport failures consistently across gh api graphql, gh pr view, and gh issue view
  • share target-host resolution so both doctor and stderr follow-up hints use gh auth status --active --hostname <target_host>
  • document doctor troubleshooting guidance in the GitHub conversation skill

Validation

  • uv run ruff check .
  • uv run ruff format --check .
  • uv run pyright
  • uv run pytest -q
  • all_proxy='socks5://proxy.example.test:1080' no_proxy='localhost,127.0.0.1' uv run gh-llm doctor

@SigureMo Please review this PR when you have time. Thanks!

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7139807c46

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds built-in diagnostics and improved error reporting so users can quickly troubleshoot gh-llm failures (especially GraphQL transport issues like EOF) without manually running many separate gh commands. It implements a new doctor/env workflow and introduces structured gh command failures to surface retry counts plus layered “try next” hints.

Changes:

  • Add gh-llm doctor (and gh-llm env alias) to report entrypoint info, versions, auth status, REST/GraphQL probes, and relevant environment variables.
  • Introduce GhCommandError + formatting utilities to classify common failure categories and print actionable follow-up commands, including retry attempt counts.
  • Add/extend CLI tests to validate diagnostics output and layered GraphQL EOF failure messaging.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tests/test_cli.py Adds coverage for the new doctor/env commands and layered GraphQL transport error output.
src/gh_llm/github_api.py Switches retryable gh command failures to raise GhCommandError with attempt metadata.
src/gh_llm/diagnostics.py New module to classify gh failures and render layered diagnostics + suggested next commands.
src/gh_llm/commands/doctor.py New diagnostics command implementation (entrypoint/version/auth/probes/env reporting).
src/gh_llm/cli.py Registers the new parser and adds a top-level GhCommandError handler to print formatted diagnostics.
README.md Documents the new diagnostics workflow and improved transport error guidance.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Member Author

@ShigureNyako ShigureNyako left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

结论:REQUEST_CHANGES

说明:受当前 GitHub 账号限制影响,我无法对自己的 PR 提交 REQUEST_CHANGES 状态,所以这里用 COMMENT 形式提交正式 review;但从评审结论上看,这个 PR 还需要先改再合。

我本地复跑了 uv run ruff check src/gh_llm tests/test_cli.pyuv run pyrightuv run pytest -quv run gh-llm doctor,命令本身都能通过;不过仍有两个阻塞问题:

  1. src/gh_llm/diagnostics.py:76-114 只把 gh api graphql 的 transport error 归类为 GraphQL/network 问题,但 src/gh_llm/github_api.py:688-814 里的 resolve_pull_request() / resolve_issue() 入口首先走的是 gh pr view --json ... / gh issue view --json ...。我本地把 gh pr view mock 成返回 Post "https://api.github.com/graphql": EOF 后,CLI 现在会落到 generic gh failure,而不是这个 PR 期望提供的 GraphQL 分层诊断。这意味着最常见的 PR/Issue 读取入口仍然拿不到新的排障提示。
  2. src/gh_llm/commands/doctor.py:170-179 直接调用裸的 gh auth statusgh auth status --help 明确写了它会检查每个已知 host,并且只要任一 host/account 有问题就返回 1。对 doctor 这种“诊断当前可用性”的命令来说,这会把与当前 host 无关的过期登录也算成 unhealthy;即使当前活跃 host 的 REST / GraphQL probe 都正常,结果也可能被误报。这里至少应当按 active account / target host 收敛检查范围。

另外,新增测试目前只覆盖了 gh api graphql 直接失败的路径,没有覆盖 gh pr view / gh issue view 这两个高频入口的同类 transport error,也没有覆盖 doctor 在多 host / 非 active account 场景下的行为。

@ShigureNyako
Copy link
Member Author

已按 review 跟进并推送 ff14ee7

  • gh pr view / gh issue view 统一纳入 GraphQL transport error 分类,并补上重试/backoff
  • doctor 改为按目标 host 执行 gh auth status --active --hostname <target_host>,避免无关 host 的过期登录误报
  • 新增对应回归测试,覆盖 PR / Issue 入口 EOF 路径与多 host auth 诊断路径

已逐条回复并 resolve 两个 review thread。

@SigureMo
Copy link
Contributor

lint and fmt 都没修……

@ShigureNyako
Copy link
Member Author

lint and fmt 都没修……

已补上。上一版漏掉了 tests/test_cli.py 的格式化空行,我已经在 359775f 里按 ruff format 修正并推送。

本地已重新验证:

  • uv run ruff check
  • uv run pyright
  • uv run pytest -q

现在等这次 push 触发的 CI 结果。

Co-authored-by: Nyakku Shigure <38436475+SigureMo@users.noreply.github.com>
@ShigureNyako
Copy link
Member Author

已按最新 review 再跟进并推送 504ecc7

  • 移除 env 别名,只保留 doctor 命令入口
  • README 删除 gh-llm env 示例
  • skills/github-conversation/SKILL.md 补充 doctor 的 preflight / troubleshooting 指引

已逐条回复并 resolve 本轮 2 个 review thread。

Copy link
Member Author

@ShigureNyako ShigureNyako left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

结论:REQUEST_CHANGES

说明:受当前 GitHub 账号限制影响,我无法对自己的 PR 提交 REQUEST_CHANGES 状态,所以这里用 COMMENT 形式提交正式 review;但从评审结论上看,这个 PR 还需要先改再合。

我本地在 504ecc7 上复跑了:

  • uv run ruff check
  • uv run pyright
  • uv run pytest -q
  • uv run gh-llm doctor

这些命令目前都能通过;不过围绕这个 PR 的核心目标(环境 / 网络排障),我这里仍看到两个需要先修的点:

  1. src/gh_llm/commands/doctor.py 当前只输出大写 ALL_PROXY / NO_PROXY,漏掉了常见的小写 all_proxy / no_proxy。这会让用户明明配置了代理或 bypass,doctor 里却仍显示 (unset),直接丢失关键诊断信息。
  2. src/gh_llm/diagnostics.py 里的 follow-up 仍建议运行裸 gh auth status,而不是本 PR 在 doctor 中已经采用的 active-host scoped 检查。多 host / 过期账号场景下,这会重新引入与当前 host 无关的误报,和 doctor 的诊断语义不一致。

两个点都直接影响新增诊断流程的可信度,我倾向于修完再合。

@ShigureNyako
Copy link
Member Author

已按最新独立 review 跟进并推送 2384501

  • doctor 现在同时输出小写/大写代理变量,避免遗漏 all_proxy / no_proxy
  • 分层错误提示与 doctor 统一复用 target-host scoped gh auth status --active --hostname <target_host>
  • 新增共享环境 helper 与对应回归测试(包含 GH_HOST 场景)
  • 也顺手把 PR 描述更新为当前实际实现,移除了已经删除的 env 表述

已逐条回复并 resolve 本轮 2 个 review thread。

Copy link
Member Author

@ShigureNyako ShigureNyako left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

结论:APPROVE

说明:受当前 GitHub 账号限制影响,我无法对自己的 PR 提交 APPROVE 状态,所以这里用 COMMENT 形式提交正式 review;但从评审结论上看,这个 PR 目前可以合并。

我基于最新提交 2384501 重新独立复核了上轮指出的两个问题,并本地验证了:

  • uv run ruff check .
  • uv run ruff format --check .
  • uv run pyright
  • uv run pytest -q
  • all_proxy='socks5://proxy.example.test:1080' no_proxy='localhost,127.0.0.1' uv run gh-llm doctor

复核结果:

  1. 代理变量展示完整性:已修复。doctor 现在同时输出小写/大写 all_proxy / ALL_PROXYno_proxy / NO_PROXY,不会再把常见的小写配置误显示为 (unset);测试也已补上对应断言。
  2. host-scoped auth-status 提示一致性:已修复。doctor 与分层错误提示现在统一复用 target-host 解析,输出 gh auth status --active --hostname <target_host>GH_HOST 场景也已有回归测试覆盖。

本轮未发现新的阻塞问题,可以合并。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

增加 gh-llm doctor / env 预检,并将 GraphQL EOF 等错误做分层提示

3 participants