Skip to content

fix(agents): return actionable auth error when Graph token is missing#7546

Open
therealjohn wants to merge 1 commit intoAzure:mainfrom
therealjohn:fix/rbac-preflight-auth-check
Open

fix(agents): return actionable auth error when Graph token is missing#7546
therealjohn wants to merge 1 commit intoAzure:mainfrom
therealjohn:fix/rbac-preflight-auth-check

Conversation

@therealjohn
Copy link
Copy Markdown
Contributor

Fixes #7514

When azd auth login caches only an ARM token, the postdeploy RBAC hook fails with a cryptic 'AzureDeveloperCLICredential: please run azd auth login' message because the Graph API scope was never consented/cached.

Changes:

  • Detect credential auth errors in ensureAgentIdentityRBACWithCred and return an exterrors.Auth with a suggestion to re-login with the Graph scope
  • Preserve structured errors in postdeployHandler so the message, code, and suggestion survive gRPC serialization
  • Add isCredentialAuthError helper with table-driven tests
  • Add CodeRbacAuthFailed error code

Copy link
Copy Markdown
Member

@jongio jongio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issues to address:

  • agent_identity_rbac.go - isCredentialAuthError relies on string matching; core azd uses typed error detection (errors.AsType[*azidentity.AuthenticationFailedError]) for the same purpose
  • agent_identity_rbac.go - structured error message embeds raw err text, which can surface verbose SDK internals to users

…Azure#7514)

When azd auth login caches only an ARM token, the postdeploy RBAC hook
fails with a cryptic 'AzureDeveloperCLICredential: please run azd auth
login' message because the Graph API scope was never consented/cached.

Changes:
- Detect credential auth errors in ensureAgentIdentityRBACWithCred and
  return an exterrors.Auth with a suggestion to re-login with the Graph
  scope
- Preserve structured errors in postdeployHandler so the message, code,
  and suggestion survive gRPC serialization
- Add isCredentialAuthError helper with table-driven tests
- Add CodeRbacAuthFailed error code

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@therealjohn therealjohn force-pushed the fix/rbac-preflight-auth-check branch from 0c0fe7f to 19c6b88 Compare April 7, 2026 13:59
@therealjohn therealjohn marked this pull request as ready for review April 7, 2026 14:14
Copilot AI review requested due to automatic review settings April 7, 2026 14:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves the azure.ai.agents extension’s post-deploy RBAC hook error reporting when Microsoft Graph authentication fails, returning a structured/auth error with actionable re-login guidance and preserving structured errors across gRPC boundaries.

Changes:

  • Detect Graph token acquisition/auth failures during agent identity discovery and return a structured auth error with a Graph-scope re-login suggestion.
  • Preserve structured errors (LocalError/ServiceError) in the postdeploy handler to avoid losing code/suggestion through gRPC serialization.
  • Add a helper (isCredentialAuthError) with table-driven tests and introduce a new error code for RBAC auth failures.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
cli/azd/extensions/azure.ai.agents/internal/project/agent_identity_rbac.go Adds auth-error detection during Graph identity lookup and returns a structured auth error with guidance.
cli/azd/extensions/azure.ai.agents/internal/project/agent_identity_rbac_test.go Adds table-driven tests for isCredentialAuthError.
cli/azd/extensions/azure.ai.agents/internal/exterrors/codes.go Introduces CodeRbacAuthFailed for RBAC-specific auth failures.
cli/azd/extensions/azure.ai.agents/internal/cmd/listen.go Ensures structured errors are returned unchanged from the postdeploy hook to preserve metadata across gRPC.
cli/azd/extensions/azure.ai.agents/cspell.yaml Adds aadsts to spelling allowlist for new auth error matching/tests.

Comment on lines +118 to +123
var localErr *azdext.LocalError
if errors.As(err, &localErr) {
return err
}
var svcErr *azdext.ServiceError
if errors.As(err, &svcErr) {
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this handler, repo guidance prefers errors.AsType[T](err) over errors.As(...) for type checks (see cli/azd/AGENTS.md). Switching to errors.AsType[*azdext.LocalError] / errors.AsType[*azdext.ServiceError] will keep style consistent and avoid errorlint complaints.

Suggested change
var localErr *azdext.LocalError
if errors.As(err, &localErr) {
return err
}
var svcErr *azdext.ServiceError
if errors.As(err, &svcErr) {
if _, ok := errors.AsType[*azdext.LocalError](err); ok {
return err
}
if _, ok := errors.AsType[*azdext.ServiceError](err); ok {

Copilot uses AI. Check for mistakes.
Comment on lines +194 to +195
"run 'azd auth login --scope https://graph.microsoft.com/.default' "+
"and re-run 'azd deploy'",
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggestion string uses single quotes around commands. Elsewhere in this extension structured auth suggestions use backticks (e.g., exterrors/authFromGrpcMessage) which renders better in terminals and is more consistent. Consider switching these command quotes to backticks for consistent UX/copy-paste.

Suggested change
"run 'azd auth login --scope https://graph.microsoft.com/.default' "+
"and re-run 'azd deploy'",
"run `azd auth login --scope https://graph.microsoft.com/.default` "+
"and re-run `azd deploy`",

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

@jongio jongio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The credential auth detection with AADSTS fallback gives users a clear re-login suggestion instead of a cryptic message. The approach is correct, but this logic should live in core rather than be reimplemented per-extension.

Two things to move to core:

  1. isCredentialAuthError reimplements detection that core azd already has at internal/cmd/errors.go:192. A shared helper in pkg/azdext would prevent per-extension reimplementation and keep the detection logic consistent as the SDK evolves.

  2. The structured error passthrough in listen.go is technically redundant - azdext.WrapError already traverses the error chain with errors.AsType to find LocalError/ServiceError (extension_error.go:75-96), so wrapping with fmt.Errorf doesn't lose structured metadata. A core WrapIfUnstructured(err, msg) helper would be cleaner than per-callsite passthrough checks.

Minor items:

  • +1 on switching listen.go from errors.As to errors.AsType for consistency with agent_identity_rbac.go in this same PR
  • The single-quote convention in the suggestion string is correct (20+ instances across the extension vs 3 backtick uses in authFromGrpcMessage) - don't follow the Copilot bot's backtick suggestion

Copy link
Copy Markdown
Contributor

@wbreza wbreza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #7546

fix(agents): return actionable auth error when Graph token is missing by @therealjohn

Summary

The credential auth detection with AADSTS fallback gives users a clear re-login suggestion instead of a cryptic message. The approach is correct — errors.AsType[*azidentity.AuthenticationFailedError] as primary check with AADSTS string fallback, structured error return that short-circuits the retry loop, and clean user-facing message with specific Graph scope guidance. Three supplementary findings below — not duplicating @jongio's or @copilot-bot's existing feedback.

Prior Review Status

# Prior Finding Author Status
1 Move isCredentialAuthError to core azd @jongio ⌛ Pending
2 Use errors.AsType in listen.go @jongio + @copilot-bot ⌛ Pending
3 Structured error passthrough redundancy @jongio ⌛ Pending

New Findings

Priority Count
Medium 2
Low 1
Total 3

🟡 Medium (2 findings)

  1. AADSTS string fallback breadthstrings.Contains(err.Error(), "AADSTS") matches any error mentioning AADSTS codes, not just token acquisition failures. The test suite covers uppercase "AADSTS70043" but doesn't document the case-sensitivity assumption. Add a test case for lowercase aadsts (expected false) to lock in the behavior contract.
  2. No test for the structured error return pathisCredentialAuthError is well-tested in isolation, but no test verifies that ensureAgentIdentityRBACWithCred returns exterrors.Auth with code CodeRbacAuthFailed and the Graph scope suggestion when an auth error occurs. A mock of discoverAgentIdentity returning AuthenticationFailedError would close this gap.

🟢 Low (1 finding)

  1. Original error discarded without debug logging — When isCredentialAuthError matches, the original error is completely discarded (not wrapped, not logged). The clean user message is correct per @jongio's resolved feedback, but the original error (correlation IDs, specific failure reason) is lost for post-incident debugging.

✅ What Looks Good

  • Correct errors.AsType[*azidentity.AuthenticationFailedError] usage — Go 1.26 convention
  • Auth error correctly short-circuits the retry loop — credential failures won't resolve with retries
  • Well-structured table-driven tests with good edge case coverage
  • Single-quote convention in suggestion string matches the extension's dominant pattern (20+ instances)
  • Import order follows extension's established convention

Overall Assessment: Comment — supplementary findings only. Primary blockers are @jongio's existing CHANGES_REQUESTED items.

Review performed with GitHub Copilot CLI

}
if _, ok := errors.AsType[*azidentity.AuthenticationFailedError](err); ok {
return true
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Medium] AADSTS string fallback breadth

strings.Contains(err.Error(), "AADSTS") matches any error containing "AADSTS", not just token acquisition failures. While Azure consistently returns uppercase "AADSTS", add a test case for lowercase aadsts (expected false) to document the case-sensitivity assumption and lock in the behavior contract.

assert.Equal(t, tt.want, got)
})
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Medium] Missing test for structured error return path

isCredentialAuthError is well-tested here, but no test verifies that ensureAgentIdentityRBACWithCred returns exterrors.Auth with code CodeRbacAuthFailed and the Graph scope suggestion when an auth error occurs. A mock of discoverAgentIdentity returning AuthenticationFailedError would close this gap.

exterrors.CodeRbacAuthFailed,
"agent identity RBAC setup failed: "+
"could not acquire a Microsoft Graph token",
"run 'azd auth login --scope https://graph.microsoft.com/.default' "+
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Low] Original error discarded without debug logging

The structured error provides a clean user message (correctly per @jongio's resolved feedback), but the original error is completely discarded — not wrapped, not logged. The correlation IDs and specific failure reason from the Azure SDK are lost for post-incident debugging. Consider logging the original error at debug level before returning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Post-Deploy RBAC Hook Fails, No Pre-Flight Auth Check

4 participants