Skip to content

Fix Grafana log links in osdctl promote by discovering namespace and SaaS file names#853

Open
xiaoyu74 wants to merge 1 commit intoopenshift:masterfrom
xiaoyu74:improve_e2e_test_log_links
Open

Fix Grafana log links in osdctl promote by discovering namespace and SaaS file names#853
xiaoyu74 wants to merge 1 commit intoopenshift:masterfrom
xiaoyu74:improve_e2e_test_log_links

Conversation

@xiaoyu74
Copy link
Contributor

@xiaoyu74 xiaoyu74 commented Feb 22, 2026

Problem

When osdctl promote generates app-interface promotion(e.g MR-174930) the Grafana dashboard links frequently show "No data" even when pipelines exist and completed successfully.

Root Cause

Incorrect namespace mapping - The code used a hardcoded pattern {operator-name}-pipelines which fails for
operators that use:

  • Abbreviated namespaces: managed-cluster-validating-webhooks - > mcvw-pipelines (not
    managed-cluster-validating-webhooks-pipelines)
  • Shared namespaces: backplane-api, backplane-dashboards - > backplane-pipelines
  • Consolidated namespaces: compliance-monkey - > osd-operators-pipelines

This affects both INT and STAGE Grafana links.

What's the PR trying to resolve?

Discover the actual namespace from each operator's SaaS YAML file by reading the pipelinesProvider.$ref field
instead of using a hardcoded pattern.

Example for MCVW:

# saas-managed-cluster-validating-webhooks.yaml
pipelinesProvider:
  $ref:
/services/osd-operators/managed-cluster-validating-webhooks/pipelines/tekton-mcvw-pipelines.appsrep09ue1.yaml

Parse the filename to extract the namespace: mcvw-pipelines

Changes

  • Added discoverE2ETestSaasName() function - Discovers E2E test SaaS file name from YAML (restores functionality removed in earlier PR)
  • Modified generateGrafanaLogsURL() function - Uses discoverPipelineNamespace() to fix namespace bug
  • Updated tests - Validates namespace discovery for multiple operator patterns

Test with a dummy MCVW promotion MR

Summary by CodeRabbit

  • Refactor

    • Improved Grafana logs URL generation used during service promotions: automatically resolves the target namespace from service configuration when available, falls back gracefully when not, and adds more diagnostic output for easier troubleshooting. INT and STAGE promotion flows now use the unified logs URL behavior.
  • Tests

    • Added comprehensive tests covering namespace discovery, logs URL composition across environments, shared filename cases, and error/fallback scenarios.

@xiaoyu74 xiaoyu74 force-pushed the improve_e2e_test_log_links branch from 3f00684 to 9df57a5 Compare February 23, 2026 02:55
@dustman9000
Copy link
Member

/hold

AppSRE's cloudwatch and Grafana should be the main logging entrypoint, not Tekton. Not everyone has access to tekton and the retention is very short for those jobs.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 23, 2026
@xiaoyu74
Copy link
Contributor Author

/hold

AppSRE's cloudwatch and Grafana should be the main logging entrypoint, not Tekton. Not everyone has access to tekton and the retention is very short for those jobs.

Thanks Dustin, as discussed in the thread, I will update the PR to keep using the Grafana links and fixing the incorrect namespace mapping.

@xiaoyu74 xiaoyu74 changed the title Replace Grafana links with Tekton console links for pipeline logs Fix Grafana log links in osdctl promote by discovering namespace and SaaS file names Mar 9, 2026
@xiaoyu74 xiaoyu74 force-pushed the improve_e2e_test_log_links branch from 9df57a5 to 900c23a Compare March 10, 2026 03:22
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 10, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: xiaoyu74
Once this PR has been reviewed and has the lgtm label, please assign joshbranham for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link

coderabbitai bot commented Mar 10, 2026

Walkthrough

Adds pipeline-namespace discovery by parsing pipelinesProvider.$ref in SaaS YAMLs and reworks Grafana logs URL generation to use the discovered namespace (with fallbacks). Replaces generateTestLogsURL with generateGrafanaLogsURL and updates promotion flow to use the new generator.

Changes

Cohort / File(s) Summary
Namespace discovery & Grafana URL logic
cmd/promote/saas/utils.go
Adds discoverPipelineNamespace(appInterface, serviceName) (string, error) which reads SaaS YAML from ServicesFilesMap, parses pipelinesProvider.$ref, derives namespace from referenced pipeline filename (strips tekton-/tekton. and cluster suffix). Replaces generateTestLogsURL(...) with generateGrafanaLogsURL(appInterface, serviceName, operatorName, gitHash, env) string. generateGrafanaLogsURL tries discovered namespace, falls back to operator-based namespace, discovers e2e SaaS name, and builds Grafana URL with vars: var-namespace, var-targetref, var-env, var-saasfilename plus fixed query params. Adds warnings, error handling, and diagnostic prints; updates servicePromotion calls for INT/STAGE to use the new generator.
Tests: discovery & URL generation
cmd/promote/saas/utils_test.go
Adds extensive tests: TestDiscoverPipelineNamespace (various SaaS YAML scenarios), TestDiscoverPipelineNamespace_ServiceNotFound, TestGenerateGrafanaLogsURL_NamespaceDiscovery, TestGenerateGrafanaLogsURL_SameSaasfilenameBothEnvs, and TestGenerateGrafanaLogsURL_URLFormat. Validates namespace extraction, fallback behavior, URL parameter composition, and missing-file errors. Adds fmt import for formatting in tests.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as "promote CLI"
    participant App as "git.AppInterface"
    participant FS as "Repository files (SaaS YAML)"
    participant Builder as "Grafana URL builder"

    CLI->>App: request ServicesFilesMap for serviceName
    App->>FS: read `services/<service>.saas.yaml`
    FS-->>App: SaaS YAML content
    App-->>CLI: return file content
    CLI->>Builder: discoverPipelineNamespace(content)
    Builder-->>Builder: parse pipelinesProvider.$ref -> pipeline filename
    Builder-->>Builder: strip prefixes/suffixes -> namespace
    Builder-->>CLI: namespace (or error)
    CLI->>Builder: generateGrafanaLogsURL(serviceName, operator, gitHash, env, namespace?)
    Builder-->>Builder: discover e2e SaaS filename (if needed)
    Builder-->>CLI: Grafana URL with var-namespace, var-targetref, var-env, var-saasfilename, etc.
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: fixing Grafana log links by discovering namespace and SaaS file names, which directly addresses the two bugs identified in the PR objectives.
Stable And Deterministic Test Names ✅ Passed The test file uses standard Go testing framework with static, descriptive test names. The custom check targets Ginkgo BDD-style test definitions, which are not present here.
Test Structure And Quality ✅ Passed The custom check targets Ginkgo test code, but this PR contains only standard Go unit tests using the testing package with testify assertions, not Ginkgo patterns.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@xiaoyu74 xiaoyu74 force-pushed the improve_e2e_test_log_links branch from 900c23a to 98274a2 Compare March 10, 2026 03:34
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/promote/saas/utils.go (1)

171-180: ⚠️ Potential issue | 🔴 Critical

Add environment-specific logic to use correct SaaS file naming convention for STAGE.

The code currently attempts to discover E2E test SaaS files for both INT and STAGE environments. However, based on AppSRE conventions, STAGE deployment pipelines use regular deployment SaaS files (saas-<service>.yaml), while INT uses E2E test files.

This explains why STAGE returns "No data" in Grafana when looking for non-existent E2E test pipelines. The env parameter should control whether to discover E2E test SaaS names (for INT) or regular deployment SaaS names (for STAGE).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/promote/saas/utils.go` around lines 171 - 180, The code always tries to
discover an E2E test SaaS name via discoverE2ETestSaasName and falls back to an
E2E-style name; change this to use the env parameter to pick the correct naming
convention: if env indicates INT, call discoverE2ETestSaasName(appInterface,
operatorName) and fall back to fmt.Sprintf("saas-%s-e2e-test", operatorName); if
env indicates STAGE, either call a regular discovery function (e.g.,
discoverRegularSaasName(appInterface, operatorName)) or directly set the name to
the regular deployment filename fmt.Sprintf("saas-%s.yaml", operatorName) and
log accordingly; update references to e2eTestSaasName and the fmt.Printf
messages to reflect the branch chosen.
🧹 Nitpick comments (1)
cmd/promote/saas/utils.go (1)

96-101: Unused parameter appInterface in function signature.

The discoverPipelineNamespace function accepts appInterface git.AppInterface but never uses it. The function only accesses the global ServicesFilesMap. This contrasts with discoverE2ETestSaasName which uses appInterface.GitDirectory.

Consider either removing the unused parameter or refactoring to use appInterface consistently (e.g., deriving the SaaS file path from appInterface.GitDirectory instead of relying on the global map).

♻️ Option 1: Remove unused parameter
-func discoverPipelineNamespace(appInterface git.AppInterface, serviceName string) (string, error) {
+func discoverPipelineNamespace(serviceName string) (string, error) {

This would require updating all call sites accordingly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/promote/saas/utils.go` around lines 96 - 101, The function
discoverPipelineNamespace has an unused parameter appInterface
(git.AppInterface) while it only reads ServicesFilesMap; either remove the
appInterface parameter from discoverPipelineNamespace and update all call sites
to match, or change the implementation to derive the saasFilePath using
appInterface.GitDirectory (like discoverE2ETestSaasName) instead of
ServicesFilesMap; update references to the function and any callers accordingly
so the signature and usage are consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@cmd/promote/saas/utils.go`:
- Around line 171-180: The code always tries to discover an E2E test SaaS name
via discoverE2ETestSaasName and falls back to an E2E-style name; change this to
use the env parameter to pick the correct naming convention: if env indicates
INT, call discoverE2ETestSaasName(appInterface, operatorName) and fall back to
fmt.Sprintf("saas-%s-e2e-test", operatorName); if env indicates STAGE, either
call a regular discovery function (e.g., discoverRegularSaasName(appInterface,
operatorName)) or directly set the name to the regular deployment filename
fmt.Sprintf("saas-%s.yaml", operatorName) and log accordingly; update references
to e2eTestSaasName and the fmt.Printf messages to reflect the branch chosen.

---

Nitpick comments:
In `@cmd/promote/saas/utils.go`:
- Around line 96-101: The function discoverPipelineNamespace has an unused
parameter appInterface (git.AppInterface) while it only reads ServicesFilesMap;
either remove the appInterface parameter from discoverPipelineNamespace and
update all call sites to match, or change the implementation to derive the
saasFilePath using appInterface.GitDirectory (like discoverE2ETestSaasName)
instead of ServicesFilesMap; update references to the function and any callers
accordingly so the signature and usage are consistent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c2a208d6-d77c-4d5a-9923-eb4621baa5f6

📥 Commits

Reviewing files that changed from the base of the PR and between 8b9046f and 900c23a.

📒 Files selected for processing (2)
  • cmd/promote/saas/utils.go
  • cmd/promote/saas/utils_test.go

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 10, 2026

@xiaoyu74: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/promote/saas/utils.go (1)

157-191: ⚠️ Potential issue | 🟠 Major

Use the deployment SaaS name for STAGE.

Lines 172-176 still force var-saasfilename to saas-<operator>-e2e-test for every env. That preserves the STAGE bug called out in the PR: most operators only run the regular deploy pipeline in STAGE, so the generated STAGE link will keep filtering to a non-existent e2e PipelineRun and return “No data”.

💡 Suggested direction
- e2eTestSaasName, err := discoverE2ETestSaasName(appInterface, operatorName)
+ saasName, err := discoverGrafanaSaasName(appInterface, serviceName, operatorName, env)
  ...
- url += fmt.Sprintf("&var-saasfilename=%s", e2eTestSaasName)
+ url += fmt.Sprintf("&var-saasfilename=%s", saasName)
// INT   -> resolve the e2e test SaaS YAML name
// STAGE -> resolve the promoted/deploy SaaS YAML name
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/promote/saas/utils.go`:
- Around line 96-106: discoverPipelineNamespace currently reads the raw
ServicesFilesMap value which may be a directory (causing EISDIR); ensure the
function operates on the resolved SaaS YAML path by either (a) changing callers
to call GetSaasDir(servicePath) and pass that resulting YAML path into
discoverPipelineNamespace, or (b) update discoverPipelineNamespace to call
GetSaasDir on the received service path before os.ReadFile; reference the
discoverPipelineNamespace function and the GetSaasDir helper when making the
change so directory-backed services are converted to their deploy.yaml /
hypershift-deploy.yaml file before reading.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5d47cc62-484b-4ebb-aa5d-9a31d47fa10f

📥 Commits

Reviewing files that changed from the base of the PR and between 900c23a and 98274a2.

📒 Files selected for processing (2)
  • cmd/promote/saas/utils.go
  • cmd/promote/saas/utils_test.go

Comment on lines +96 to +106
func discoverPipelineNamespace(appInterface git.AppInterface, serviceName string) (string, error) {
// Get the SaaS file path from the services map
saasFilePath, ok := ServicesFilesMap[serviceName]
if !ok {
return "", fmt.Errorf("saas file not found for service %s", serviceName)
}

// Read the SaaS YAML file
fileContent, err := os.ReadFile(saasFilePath)
if err != nil {
return "", fmt.Errorf("failed to read SaaS file: %w", err)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Resolve the actual SaaS YAML before reading it.

Line 104 reads the raw ServicesFilesMap entry, but that map also contains saas-* directories; GetSaasDir only converts those into deploy.yaml / hypershift-deploy.yaml later in this file. For directory-backed services this helper will hit EISDIR, fall back to <operator>-pipelines, and keep the Grafana link wrong for the exact operators this PR is trying to fix.

🐛 Suggested direction
-func discoverPipelineNamespace(appInterface git.AppInterface, serviceName string) (string, error) {
-	saasFilePath, ok := ServicesFilesMap[serviceName]
-	if !ok {
-		return "", fmt.Errorf("saas file not found for service %s", serviceName)
-	}
+func discoverPipelineNamespace(saasFilePath string) (string, error) {
// Call this with the already-resolved YAML path returned by GetSaasDir.
namespace, err := discoverPipelineNamespace(saasDir)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/promote/saas/utils.go` around lines 96 - 106, discoverPipelineNamespace
currently reads the raw ServicesFilesMap value which may be a directory (causing
EISDIR); ensure the function operates on the resolved SaaS YAML path by either
(a) changing callers to call GetSaasDir(servicePath) and pass that resulting
YAML path into discoverPipelineNamespace, or (b) update
discoverPipelineNamespace to call GetSaasDir on the received service path before
os.ReadFile; reference the discoverPipelineNamespace function and the GetSaasDir
helper when making the change so directory-backed services are converted to
their deploy.yaml / hypershift-deploy.yaml file before reading.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants