fix(agents): return actionable auth error when Graph token is missing#7546
fix(agents): return actionable auth error when Graph token is missing#7546therealjohn wants to merge 1 commit intoAzure:mainfrom
Conversation
jongio
left a comment
There was a problem hiding this comment.
Issues to address:
- agent_identity_rbac.go -
isCredentialAuthErrorrelies on string matching; core azd uses typed error detection (errors.AsType[*azidentity.AuthenticationFailedError]) for the same purpose - agent_identity_rbac.go - structured error message embeds raw err text, which can surface verbose SDK internals to users
cli/azd/extensions/azure.ai.agents/internal/project/agent_identity_rbac.go
Outdated
Show resolved
Hide resolved
cli/azd/extensions/azure.ai.agents/internal/project/agent_identity_rbac.go
Outdated
Show resolved
Hide resolved
…Azure#7514) When azd auth login caches only an ARM token, the postdeploy RBAC hook fails with a cryptic 'AzureDeveloperCLICredential: please run azd auth login' message because the Graph API scope was never consented/cached. Changes: - Detect credential auth errors in ensureAgentIdentityRBACWithCred and return an exterrors.Auth with a suggestion to re-login with the Graph scope - Preserve structured errors in postdeployHandler so the message, code, and suggestion survive gRPC serialization - Add isCredentialAuthError helper with table-driven tests - Add CodeRbacAuthFailed error code Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
0c0fe7f to
19c6b88
Compare
There was a problem hiding this comment.
Pull request overview
Improves the azure.ai.agents extension’s post-deploy RBAC hook error reporting when Microsoft Graph authentication fails, returning a structured/auth error with actionable re-login guidance and preserving structured errors across gRPC boundaries.
Changes:
- Detect Graph token acquisition/auth failures during agent identity discovery and return a structured auth error with a Graph-scope re-login suggestion.
- Preserve structured errors (LocalError/ServiceError) in the
postdeployhandler to avoid losing code/suggestion through gRPC serialization. - Add a helper (
isCredentialAuthError) with table-driven tests and introduce a new error code for RBAC auth failures.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
cli/azd/extensions/azure.ai.agents/internal/project/agent_identity_rbac.go |
Adds auth-error detection during Graph identity lookup and returns a structured auth error with guidance. |
cli/azd/extensions/azure.ai.agents/internal/project/agent_identity_rbac_test.go |
Adds table-driven tests for isCredentialAuthError. |
cli/azd/extensions/azure.ai.agents/internal/exterrors/codes.go |
Introduces CodeRbacAuthFailed for RBAC-specific auth failures. |
cli/azd/extensions/azure.ai.agents/internal/cmd/listen.go |
Ensures structured errors are returned unchanged from the postdeploy hook to preserve metadata across gRPC. |
cli/azd/extensions/azure.ai.agents/cspell.yaml |
Adds aadsts to spelling allowlist for new auth error matching/tests. |
| var localErr *azdext.LocalError | ||
| if errors.As(err, &localErr) { | ||
| return err | ||
| } | ||
| var svcErr *azdext.ServiceError | ||
| if errors.As(err, &svcErr) { |
There was a problem hiding this comment.
In this handler, repo guidance prefers errors.AsType[T](err) over errors.As(...) for type checks (see cli/azd/AGENTS.md). Switching to errors.AsType[*azdext.LocalError] / errors.AsType[*azdext.ServiceError] will keep style consistent and avoid errorlint complaints.
| var localErr *azdext.LocalError | |
| if errors.As(err, &localErr) { | |
| return err | |
| } | |
| var svcErr *azdext.ServiceError | |
| if errors.As(err, &svcErr) { | |
| if _, ok := errors.AsType[*azdext.LocalError](err); ok { | |
| return err | |
| } | |
| if _, ok := errors.AsType[*azdext.ServiceError](err); ok { |
| "run 'azd auth login --scope https://graph.microsoft.com/.default' "+ | ||
| "and re-run 'azd deploy'", |
There was a problem hiding this comment.
The suggestion string uses single quotes around commands. Elsewhere in this extension structured auth suggestions use backticks (e.g., exterrors/authFromGrpcMessage) which renders better in terminals and is more consistent. Consider switching these command quotes to backticks for consistent UX/copy-paste.
| "run 'azd auth login --scope https://graph.microsoft.com/.default' "+ | |
| "and re-run 'azd deploy'", | |
| "run `azd auth login --scope https://graph.microsoft.com/.default` "+ | |
| "and re-run `azd deploy`", |
jongio
left a comment
There was a problem hiding this comment.
The credential auth detection with AADSTS fallback gives users a clear re-login suggestion instead of a cryptic message. The approach is correct, but this logic should live in core rather than be reimplemented per-extension.
Two things to move to core:
-
isCredentialAuthErrorreimplements detection that core azd already has atinternal/cmd/errors.go:192. A shared helper inpkg/azdextwould prevent per-extension reimplementation and keep the detection logic consistent as the SDK evolves. -
The structured error passthrough in
listen.gois technically redundant -azdext.WrapErroralready traverses the error chain witherrors.AsTypeto findLocalError/ServiceError(extension_error.go:75-96), so wrapping withfmt.Errorfdoesn't lose structured metadata. A coreWrapIfUnstructured(err, msg)helper would be cleaner than per-callsite passthrough checks.
Minor items:
- +1 on switching
listen.gofromerrors.Astoerrors.AsTypefor consistency withagent_identity_rbac.goin this same PR - The single-quote convention in the suggestion string is correct (20+ instances across the extension vs 3 backtick uses in
authFromGrpcMessage) - don't follow the Copilot bot's backtick suggestion
wbreza
left a comment
There was a problem hiding this comment.
Code Review — PR #7546
fix(agents): return actionable auth error when Graph token is missing by @therealjohn
Summary
The credential auth detection with AADSTS fallback gives users a clear re-login suggestion instead of a cryptic message. The approach is correct — errors.AsType[*azidentity.AuthenticationFailedError] as primary check with AADSTS string fallback, structured error return that short-circuits the retry loop, and clean user-facing message with specific Graph scope guidance. Three supplementary findings below — not duplicating @jongio's or @copilot-bot's existing feedback.
Prior Review Status
| # | Prior Finding | Author | Status |
|---|---|---|---|
| 1 | Move isCredentialAuthError to core azd |
@jongio | ⌛ Pending |
| 2 | Use errors.AsType in listen.go |
@jongio + @copilot-bot | ⌛ Pending |
| 3 | Structured error passthrough redundancy | @jongio | ⌛ Pending |
New Findings
| Priority | Count |
|---|---|
| Medium | 2 |
| Low | 1 |
| Total | 3 |
🟡 Medium (2 findings)
- AADSTS string fallback breadth —
strings.Contains(err.Error(), "AADSTS")matches any error mentioning AADSTS codes, not just token acquisition failures. The test suite covers uppercase "AADSTS70043" but doesn't document the case-sensitivity assumption. Add a test case for lowercaseaadsts(expectedfalse) to lock in the behavior contract. - No test for the structured error return path —
isCredentialAuthErroris well-tested in isolation, but no test verifies thatensureAgentIdentityRBACWithCredreturnsexterrors.Authwith codeCodeRbacAuthFailedand the Graph scope suggestion when an auth error occurs. A mock ofdiscoverAgentIdentityreturningAuthenticationFailedErrorwould close this gap.
🟢 Low (1 finding)
- Original error discarded without debug logging — When
isCredentialAuthErrormatches, the original error is completely discarded (not wrapped, not logged). The clean user message is correct per @jongio's resolved feedback, but the original error (correlation IDs, specific failure reason) is lost for post-incident debugging.
✅ What Looks Good
- Correct
errors.AsType[*azidentity.AuthenticationFailedError]usage — Go 1.26 convention - Auth error correctly short-circuits the retry loop — credential failures won't resolve with retries
- Well-structured table-driven tests with good edge case coverage
- Single-quote convention in suggestion string matches the extension's dominant pattern (20+ instances)
- Import order follows extension's established convention
Overall Assessment: Comment — supplementary findings only. Primary blockers are @jongio's existing CHANGES_REQUESTED items.
Review performed with GitHub Copilot CLI
| } | ||
| if _, ok := errors.AsType[*azidentity.AuthenticationFailedError](err); ok { | ||
| return true | ||
| } |
There was a problem hiding this comment.
[Medium] AADSTS string fallback breadth
strings.Contains(err.Error(), "AADSTS") matches any error containing "AADSTS", not just token acquisition failures. While Azure consistently returns uppercase "AADSTS", add a test case for lowercase aadsts (expected false) to document the case-sensitivity assumption and lock in the behavior contract.
| assert.Equal(t, tt.want, got) | ||
| }) | ||
| } | ||
| } |
There was a problem hiding this comment.
[Medium] Missing test for structured error return path
isCredentialAuthError is well-tested here, but no test verifies that ensureAgentIdentityRBACWithCred returns exterrors.Auth with code CodeRbacAuthFailed and the Graph scope suggestion when an auth error occurs. A mock of discoverAgentIdentity returning AuthenticationFailedError would close this gap.
| exterrors.CodeRbacAuthFailed, | ||
| "agent identity RBAC setup failed: "+ | ||
| "could not acquire a Microsoft Graph token", | ||
| "run 'azd auth login --scope https://graph.microsoft.com/.default' "+ |
There was a problem hiding this comment.
[Low] Original error discarded without debug logging
The structured error provides a clean user message (correctly per @jongio's resolved feedback), but the original error is completely discarded — not wrapped, not logged. The correlation IDs and specific failure reason from the Azure SDK are lost for post-incident debugging. Consider logging the original error at debug level before returning.
Fixes #7514
When azd auth login caches only an ARM token, the postdeploy RBAC hook fails with a cryptic 'AzureDeveloperCLICredential: please run azd auth login' message because the Graph API scope was never consented/cached.
Changes: