fix(release): merge develop into main #221

Merged
bedatty merged 13 commits into main from develop on Apr 15, 2026

Conversation

@bedatty (Contributor) commented Apr 15, 2026

Lerian

GitHub Actions Shared Workflows


Description

Type of Change

  • feat: New workflow or new input/output/step in an existing workflow
  • fix: Bug fix in a workflow (incorrect behavior, broken step, wrong condition)
  • perf: Performance improvement (e.g. caching, parallelism, reduced steps)
  • refactor: Internal restructuring with no behavior change
  • docs: Documentation only (README, docs/, inline comments)
  • ci: Changes to self-CI (workflows under .github/workflows/ that run on this repo)
  • chore: Dependency bumps, config updates, maintenance
  • test: Adding or updating tests
  • BREAKING CHANGE: Callers must update their configuration after this PR

Breaking Changes

None.

Testing

  • YAML syntax validated locally
  • Triggered a real workflow run on a caller repository using @develop or the beta tag
  • Verified all existing inputs still work with default values
  • Confirmed no secrets or tokens are printed in logs
  • Checked that unrelated workflows are not affected

Caller repo / workflow run:

Related Issues

Closes #

Summary by CodeRabbit

  • New Features

    • Introduced a deployment matrix configuration to centrally manage app-to-cluster mappings, replacing hardcoded deployment flags.
    • Added support for a third deployment cluster (anacleto).
    • Implemented force-off override behavior for deployment targets via boolean inputs.
  • Documentation

    • Updated GitOps workflow documentation to reflect matrix-driven deployment topology and new cluster support.
  • Chores

    • Added validation tooling for the deployment matrix configuration file.
    • Updated workflow automation to enforce deployment matrix schema integrity.

bedatty and others added 13 commits April 13, 2026 11:52
…to support

Introduce config/deployment-matrix.yaml as the single source of truth for
which apps deploy to which clusters. The workflow now reads the manifest
at the same pinned ref as itself (sparse checkout) and resolves the
cluster set from app_name. Adds anacleto as the third deployment target.

deploy_in_<cluster> inputs become force-off overrides — they subtract
clusters from the manifest-resolved set but cannot add a cluster the
manifest does not list. This prevents accidental cross-cluster spillover
while still allowing emergency containment.
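
The subtract-only semantics described above can be sketched in Python. This is a minimal illustration, not the workflow's actual implementation (which runs as shell steps): the function name `resolve_clusters`, the app names, and the dictionary layout are assumptions, with the layout modeled on the `clusters.<name>.apps` lists mentioned in the walkthrough.

```python
def resolve_clusters(manifest, app_name, overrides=None):
    """Return the clusters app_name deploys to, after force-off overrides."""
    overrides = overrides or {}
    resolved = []
    for cluster, spec in manifest.get("clusters", {}).items():
        if app_name not in spec.get("apps", []):
            continue  # the manifest does not list this app for this cluster
        # deploy_in_<cluster> defaults to true; false subtracts the cluster.
        if overrides.get(f"deploy_in_{cluster}", True):
            resolved.append(cluster)
    return resolved


# Hypothetical manifest; app names are placeholders.
manifest = {
    "clusters": {
        "firmino": {"apps": ["example-app"]},
        "clotilde": {"apps": ["example-app", "other-app"]},
        "anacleto": {"apps": ["other-app"]},
    }
}

print(resolve_clusters(manifest, "example-app"))
# ['firmino', 'clotilde']
print(resolve_clusters(manifest, "example-app", {"deploy_in_clotilde": False}))
# ['firmino']
print(resolve_clusters(manifest, "example-app", {"deploy_in_anacleto": True}))
# ['firmino', 'clotilde']  (an override cannot add an unlisted cluster)
```

Because the override is only consulted for clusters the manifest already lists, setting `deploy_in_anacleto: true` has no effect for an app that anacleto does not list.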

Adds src/lint/deployment-matrix composite (Python embedded, follows the
composite-schema pattern) that validates schema, app/cluster integrity,
duplicates, and orphan apps. Wired into self-pr-validation as a gated
job that only runs when config/deployment-matrix.yaml changes.
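
The kinds of checks listed above might look roughly like the following. This is a sketch under assumptions, not the composite's real code: the function name, the error wording, and the sample document are invented, and the field names (`version`, `apps.registry`, `clusters.<name>.apps`) are taken from the walkthrough summary later in this conversation.

```python
def validate_matrix(doc):
    """Return (errors, warnings) for a parsed deployment-matrix document."""
    errors, warnings = [], []

    if doc.get("version") != 1:
        errors.append("version must be 1")

    apps_section = doc.get("apps")
    registry = apps_section.get("registry") if isinstance(apps_section, dict) else None
    if not isinstance(registry, list) or not all(isinstance(a, str) and a for a in registry):
        errors.append("apps.registry must be a list of non-empty strings")
        registry = []
    elif len(set(registry)) != len(registry):
        errors.append("apps.registry contains duplicates")

    clusters = doc.get("clusters")
    if not isinstance(clusters, dict):
        errors.append("clusters must be a mapping")
        clusters = {}

    deployed = set()
    for name, spec in clusters.items():
        apps = spec.get("apps") if isinstance(spec, dict) else None
        if not isinstance(apps, list) or not all(isinstance(a, str) for a in apps):
            errors.append(f"clusters.{name}.apps must be a list of strings")
            continue
        if len(set(apps)) != len(apps):
            errors.append(f"clusters.{name}.apps contains duplicates")
        # Integrity: every app a cluster references must be registered.
        for app in sorted(set(apps) - set(registry)):
            errors.append(f"clusters.{name}.apps references unregistered app {app!r}")
        deployed.update(apps)

    # Hygiene: registered apps that no cluster lists are orphans (warning only).
    for app in registry:
        if app not in deployed:
            warnings.append(f"{app!r} is registered but listed in no cluster")
    return errors, warnings


# Hypothetical document: "ghost-app" is unregistered, "idle-app" is an orphan.
doc = {
    "version": 1,
    "apps": {"registry": ["example-app", "idle-app"]},
    "clusters": {"firmino": {"apps": ["example-app", "ghost-app"]}},
}
errors, warnings = validate_matrix(doc)
print(errors)
print(warnings)
```

Keeping orphans as warnings rather than errors mirrors the split between integrity rules and hygiene checks described for the lint.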

The manifest topology was inferred empirically from the GitOps repo by
cross-referencing folder presence with CI commit history — only apps
that are real callers of this workflow are included (excludes apps
managed manually like underwriter, jd-mock-api, mock-btg-server,
control-plane, platform-console, ledger, dockerhub-secret).
…tution

Address PR #212 lint failures:

1. Pin all `actions/checkout@v6` occurrences in self-pr-validation.yml to
   the SHA already used in gitops-update.yml. Required by pinned-actions
   lint for external (non-LerianStudio) actions. Also clears pre-existing
   tech debt in this file that surfaced because the new deployment-matrix
   job touched it.

2. Replace `echo "$RESOLVED" | sed 's/^/  - /'` with a bash `while read`
   loop in the resolve_clusters step. Fixes shellcheck SC2001
   (prefer bash parameter expansion over sed for simple substitutions).

Resolves 6 medium-severity findings from github-advanced-security:

CODE INJECTION (4 findings — actions/code-injection/medium):
- Move `${{ github.workflow_ref }}` to step env: WORKFLOW_REF
  - Bonus: replace `echo | sed -E 's|.*@||'` with bash `${VAR##*@}`
  - Eliminates injection vectors at lines 106 + 108
- Move resolve_clusters outputs (has_clusters, clusters) to step env:
  HAS_CLUSTERS + RESOLVED_SERVERS in apply_tags step
- Move inputs.yaml_key_mappings + inputs.configmap_updates to step env:
  MAPPINGS + CONFIGMAP_MAPPINGS
- Replace `${{ env.IS_BETA/RC/PRODUCTION/SANDBOX }}` with direct
  `$IS_BETA/...` (already in job-level env, no need to re-interpolate)
- Replace `${{ github.ref }}` with `${GITHUB_REF}` (auto-set by runner)

UNTRUSTED CHECKOUT (2 findings — actions/untrusted-checkout/medium):
- Add `persist-credentials: false` to manifest sparse checkout (read-only,
  no credentials needed, never executes code from this checkout)
- Document trust model inline for the GitOps repo checkout (workflow_call
  is not triggered by untrusted PRs; inputs.gitops_repository comes from
  trusted internal callers; MANAGE_TOKEN is required for the subsequent
  commit/push step, so we cannot drop persist-credentials there)

1. [CRITICAL] Replace `github.workflow_ref` with `github.job_workflow_sha`
   for manifest checkout. In reusable workflows, github.workflow_ref points
   to the CALLER's workflow file/ref, not the called reusable workflow —
   my previous design would have failed for every external caller.
   `job_workflow_sha` is the commit SHA of the running reusable workflow,
   which is exactly what we need. Bonus: SHA is more secure than textual
   ref, and removes the need for the `Resolve shared-workflows ref` step
   entirely (−18 lines).

2. [HIGH] Remove `|| true` from the RESOLVED pipeline. Silenced yq/jq
   failures would collapse into the "app not registered" warning path,
   hiding real manifest/query errors. Now fails fast on parse errors;
   empty RESOLVED from a successful query remains the legitimate
   "no matching clusters" case (handled explicitly below).
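
The reasoning in item 2 can be illustrated in Python. This is only an analogy for the shell pipeline: JSON stands in for the YAML manifest so the sketch needs nothing beyond the standard library, and both function names and the sample inputs are hypothetical.

```python
import json


def resolve_strict(text, app):
    """Fail fast: malformed input raises, a successful query may return []."""
    doc = json.loads(text)  # raises json.JSONDecodeError on malformed input
    return [c for c, spec in doc["clusters"].items() if app in spec["apps"]]


def resolve_silenced(text, app):
    """Anti-pattern, analogous to `|| true`: every failure becomes []."""
    try:
        return resolve_strict(text, app)
    except Exception:
        return []


good = '{"clusters": {"firmino": {"apps": ["example-app"]}}}'
broken = '{"clusters": '

print(resolve_strict(good, "unknown-app"))      # [] is a legitimate result
print(resolve_silenced(broken, "example-app"))  # [] here hides a parse error
try:
    resolve_strict(broken, "example-app")
except json.JSONDecodeError as exc:
    print("parse error surfaced:", type(exc).__name__)
```

With the silenced variant, a broken manifest and an unregistered app are indistinguishable to the caller, which is the failure mode the commit removes.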

3. [MEDIUM] Rename config/deployment-matrix.yaml → .yml to match the
   repo convention (77 .yml files vs 2 .yaml). Updated all references:
   workflow input default, self-pr-validation gate, composite default,
   README docs, and the workflow doc.

4. [LOW] Add prominent migration callout to docs about deploy_in_*
   semantic change — apps must be in the manifest; inputs only subtract.

Declined: per-cluster warning when deploy_in_<cluster>: true but app
is absent from that cluster's manifest list. Inputs default to true, so
this would fire for every app missing from any cluster on every run —
noise without signal. Existing "app in zero clusters" warning already
covers the actionable case.

actionlint v1.7.x (pinned via raven-actions/actionlint@v2.1.2) does not
yet include `github.job_workflow_sha` in its GitHub context schema,
triggering a false-positive "property not defined" error on the previous
direct reference.

Replace the inline `${{ github.job_workflow_sha }}` expression with an
intermediate step that reads the equivalent auto-set env var
GITHUB_JOB_WORKFLOW_SHA and exports it as a step output. Functionally
identical (the runner populates both from the same source) but the
`steps.X.outputs.Y` expression is recognized by actionlint.

Also adds a defensive guard that fails fast if GITHUB_JOB_WORKFLOW_SHA
is empty — which would mean the workflow is being called outside a
reusable-workflow context, catching that misconfiguration loudly.
…ssuming auto env var

GITHUB_JOB_WORKFLOW_SHA is not exposed automatically by the runner. The
github.job_workflow_sha context must be mapped explicitly through the
step's env: block like any other context value. Prior implementation
relied on a nonexistent auto env var and failed with 'is this job really
running as part of a reusable workflow?' on every execution.

Validated against real run: https://github.com/LerianStudio/plugin-br-pix-indirect-btg/actions/runs/24458387402/job/71466177318

Drops the 'Resolve reusable workflow SHA' step entirely — github.job_workflow_sha
is empty when evaluated inside a job of a reusable workflow invoked via
jobs.X.uses (empirically confirmed on run 24461037331). Three prior attempts to
source that SHA all failed for different reasons:

- parsing github.workflow_ref: points to the caller, not the reusable
- GITHUB_JOB_WORKFLOW_SHA env var: does not exist
- github.job_workflow_sha context: empty in this evaluation context

This commit is a TEMP workaround for end-to-end validation: manifest checkout
is hardcoded to the feature branch. Before merging #212 this will be replaced
with a proper 'deployment_matrix_ref' input (default 'main').
… external action

The LerianStudio/github-actions-argocd-sync action suppresses stderr via
'> /dev/null 2>&1' on every CLI invocation. Any failure (auth, permission,
network, malformed URL, expired token) is rendered indistinguishable from
'app does not exist' and skipped silently when skip-if-not-exists=true.

Replaces the external action with inline argocd CLI calls that surface the
real error output. Preserves the skip-if-not-exists semantics (warn + exit 0
on app get failure), but syncs fail the job loudly.
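
A rough Python rendering of that control flow follows, assuming only the `argocd app get` and `argocd app sync` subcommands. The workflow itself uses inline shell steps, and the walkthrough also mentions a wait step with retries, which this sketch omits. The `sync_app` and `fake_run` names, the app names, and the error text are hypothetical; `fake_run` exists only so the example runs without the CLI installed.

```python
import subprocess
from types import SimpleNamespace


def sync_app(app, skip_if_not_exists=True, run=subprocess.run):
    """Return "skipped" or "synced"; raise when the sync itself fails."""
    get = run(["argocd", "app", "get", app], capture_output=True, text=True)
    if get.returncode != 0:
        # Print the real CLI error instead of sending it to /dev/null.
        print(f"warning: 'argocd app get {app}' failed: {get.stderr.strip()}")
        if skip_if_not_exists:
            return "skipped"  # warn and continue, like exit 0 in the workflow
        raise RuntimeError(f"argocd app get {app} failed")
    sync = run(["argocd", "app", "sync", app], capture_output=True, text=True)
    if sync.returncode != 0:
        # A failed sync is never skipped: surface stderr and fail loudly.
        raise RuntimeError(f"argocd app sync {app} failed: {sync.stderr.strip()}")
    return "synced"


def fake_run(cmd, **kwargs):
    """Hypothetical stand-in for the CLI so the sketch runs without argocd."""
    if cmd[2] == "get" and cmd[3] == "missing-app":
        return SimpleNamespace(returncode=1, stdout="", stderr="permission denied")
    return SimpleNamespace(returncode=0, stdout="ok", stderr="")


print(sync_app("missing-app", run=fake_run))
print(sync_app("example-app", run=fake_run))
```

The warning line is what distinguishes an auth or network failure from a genuinely missing app, even when both end up skipped.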
…to validate resolution

Temporary change for end-to-end testing of the manifest-driven gitops
pipeline on PR #212. Expected behavior on next beta of
plugin-br-pix-indirect-btg:

- resolve_clusters: {firmino, anacleto} (clotilde dropped)
- values.yaml updated only in firmino/dev and anacleto/dev
- argocd_sync fan-out: 2 jobs (firmino-*-dev, anacleto-*-dev)

Revert this commit before merging #212.
…d restore matrix

- Adds deployment_matrix_ref input (default 'main'). Callers on pinned tags
  get the latest manifest automatically; test runs can override via the
  input without editing the workflow.
- Drops the temporary hardcoded ref to the feature branch.
- Restores plugin-br-pix-indirect-btg in the clotilde cluster (removed
  temporarily during exclusion-validation test).

End-to-end validation completed against plugin-br-pix-indirect-btg:
- v1.5.2-beta.9: full fan-out to firmino + clotilde + anacleto, sync OK
- v1.5.2-beta.10: manifest exclusion respected (firmino + anacleto only)
…anges

- New 'deployment-matrix' label auto-applied by the labeler on PRs that
  touch config/deployment-matrix.yml.
- config/deployment-matrix.yml added to self-release.yml paths-ignore:
  since callers resolve the manifest from main at runtime (via the
  deployment_matrix_ref input with default 'main'), matrix-only changes
  propagate to all callers without requiring a new release tag.
- Mixed commits that touch the matrix plus workflow/action code still
  trigger a release as usual.
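
The paths-ignore behavior described above can be modeled in a few lines of Python. This is a simplified sketch: GitHub's own glob matching is richer than `fnmatch`, though the two agree for a literal path like the one used here, and the `triggers_release` name and sample file lists are assumptions.

```python
from fnmatch import fnmatch

PATHS_IGNORE = ["config/deployment-matrix.yml"]


def triggers_release(changed_files, paths_ignore=PATHS_IGNORE):
    """A push is skipped only when every changed file matches an ignore pattern."""
    return any(
        not any(fnmatch(path, pattern) for pattern in paths_ignore)
        for path in changed_files
    )


print(triggers_release(["config/deployment-matrix.yml"]))
# False: matrix-only change, no release
print(triggers_release(["config/deployment-matrix.yml", ".github/workflows/gitops-update.yml"]))
# True: mixed commit still releases
```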
…ix-anacleto

feat(gitops-update): manifest-driven topology + anacleto cluster
@bedatty requested a review from a team as a code owner on April 15, 2026 18:48
coderabbitai bot commented Apr 15, 2026

Walkthrough

This PR introduces a manifest-driven deployment matrix system to replace hard-coded cluster boolean flags in GitOps workflows. It adds configuration infrastructure (YAML schema, labels, linting, reporting) and updates the gitops-update workflow to resolve target clusters from config/deployment-matrix.yml based on app_name, with deploy_in_* inputs converted to force-off overrides.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| GitHub labeling infrastructure: .github/labeler.yml, .github/labels.yml | Added deployment-matrix labeler rule and label definition (color 5319e7) to tag PRs modifying config/deployment-matrix.yml. |
| Workflow CI/CD: .github/workflows/gitops-update.yml | Significant refactor: replaced deploy_in_firmino/deploy_in_clotilde boolean logic with manifest-driven cluster resolution; added deploy_in_anacleto support; introduced sparse checkout of deployment matrix file; replaced ArgoCD sync action with inline CLI steps (get/sync/wait with retries); updated inputs to deployment_matrix_file and deployment_matrix_ref; changed tag interpolation from direct input substitution to env-based passing. |
| Workflow validation: .github/workflows/self-pr-validation.yml, .github/workflows/self-release.yml | Added deployment-matrix lint job in validation workflow and its integration with lint reporter; added config/deployment-matrix.yml to release workflow's paths-ignore to prevent spurious releases on matrix-only changes. |
| Deployment matrix schema: config/deployment-matrix.yml | New configuration file defining cluster topology: version, apps.registry (allowed app names), and clusters.<name>.apps lists mapping apps to clusters; includes documentation of runtime semantics (manifest resolution + force-off override behavior). |
| Linting implementation: src/lint/deployment-matrix/action.yml, src/lint/deployment-matrix/README.md | New composite action validating deployment matrix YAML: checks version=1, apps.registry as list of non-empty strings (no duplicates), clusters as mapping of cluster specs with apps lists, integrity rules (referenced apps must exist in registry, no duplicates within clusters), and hygiene warnings for orphan registry apps. |
| Lint reporting integration: src/notify/pr-lint-reporter/action.yml, src/notify/pr-lint-reporter/README.md | Added deployment-matrix-result and deployment-matrix-files inputs; integrated deployment matrix check into PR comment checks array with result/file-list parsing. |
| Documentation: docs/gitops-update-workflow.md | Updated workflow documentation: clarified manifest-driven cluster topology, redefined deploy_in_* inputs as force-off overrides (subtract only, never add), documented missing app_name and empty cluster set behaviors, added anacleto cluster, updated examples to use <cluster> placeholder. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

The primary driver is the substantial logic refactor in gitops-update.yml (197 lines added/47 removed) including cluster resolution, environment variable handling, and ArgoCD CLI replacement. The addition of a new linting system (deployment-matrix action + reporter integration) introduces multiple moving parts. While individual file changes like labeler updates and schema definitions are straightforward, the heterogeneous nature (workflow logic, validation schema, CLI integration, and documentation updates) and density of logic changes across the system demand careful cross-file verification.

Possibly related PRs

Suggested labels

workflow, deployment-matrix, github-config, documentation, size/L

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | Title claims a release merge but describes a major feature (deployment matrix system) across 9 files with workflow, validation, and configuration changes. | Retitle to reflect the actual primary change: e.g., 'feat: add deployment matrix manifest and validation' or 'feat: implement manifest-driven cluster topology'. |
| Description check | ❓ Inconclusive | Description correctly marks the change as 'feat' with testing validation, but leaves the Description section empty (summary of changes not provided). | Fill the Description section with a summary of what this PR does, which workflows are affected, and what behavior changes (deployment matrix system, new cluster support, validation linting). |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@lerian-studio added the following labels on Apr 15, 2026:

  • size/L: PR changes 500–999 lines
  • documentation: Improvements or additions to documentation
  • workflow: Changes to one or more reusable workflow files
  • github-config: Changes to repository configuration (templates, CODEOWNERS, labeler, etc.)
  • deployment-matrix: Changes to the canonical deployment matrix (config/deployment-matrix.yml)
@lerian-studio

🔍 Lint Analysis

| Check | Files Scanned | Status |
| --- | --- | --- |
| YAML Lint | 8 file(s) | ✅ success |
| Action Lint | 3 file(s) | ✅ success |
| Pinned Actions | 5 file(s) | ✅ success |
| Markdown Link Check | 3 file(s) | ✅ success |
| Spelling Check | 11 file(s) | ✅ success |
| Shell Check | 5 file(s) | ✅ success |
| README Check | 5 file(s) | ✅ success |
| Composite Schema | 2 file(s) | ✅ success |
| Deployment Matrix | 1 file(s) | ✅ success |

🔍 View full scan logs

@lerian-studio

🛡️ CodeQL Analysis Results

Languages analyzed: actions

Found 2 issue(s): 2 Medium

| Severity | Rule | File | Message |
| --- | --- | --- | --- |
| 🟡 Medium | actions/untrusted-checkout/medium | .github/workflows/gitops-update.yml:100 | Potential unsafe checkout of untrusted pull request on privileged workflow. |
| 🟡 Medium | actions/untrusted-checkout/medium | .github/workflows/gitops-update.yml:108 | Potential unsafe checkout of untrusted pull request on privileged workflow. |

🔍 View full scan logs | 🛡️ Security tab

@coderabbitai coderabbitai bot left a comment

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/gitops-update-workflow.md`:
- Line 37: Update the docs to document the inputs.deployment_matrix_ref
behavior: add deployment_matrix_ref to the "optional inputs" table, state its
default (when omitted the workflow will checkout the shared-workflows repo at
the same pinned ref as the workflow) and explain that passing
deployment_matrix_ref overrides that default to read the manifest from a
different ref; also clarify how this interacts with manifest-ref resolution in
.github/workflows/gitops-update.yml so callers understand when they must supply
deploy_in_* or deployment_matrix_ref.

In `@src/lint/deployment-matrix/action.yml`:
- Around line 12-19: The "Install dependencies" step currently falls back to
apt-get installing "python3-yaml" if python3 lacks yaml; update this step to
also detect pip3 (or pip) and use pip install pyyaml as a fallback when apt-get
isn't available: check for python3 -c "import yaml", then if not present try
apt-get install -y --no-install-recommends python3-yaml, and if that fails or
apt-get is not found use pip3 (or pip) to install pyyaml; adjust the shell block
under the step named "Install dependencies" to probe for pip executables and run
pip install pyyaml when appropriate.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7b8bab11-ef25-4088-a814-549b1a1ce796

📥 Commits

Reviewing files that changed from the base of the PR and between 05d3850 and 02ce0a9.

📒 Files selected for processing (11)
  • .github/labeler.yml
  • .github/labels.yml
  • .github/workflows/gitops-update.yml
  • .github/workflows/self-pr-validation.yml
  • .github/workflows/self-release.yml
  • config/deployment-matrix.yml
  • docs/gitops-update-workflow.md
  • src/lint/deployment-matrix/README.md
  • src/lint/deployment-matrix/action.yml
  • src/notify/pr-lint-reporter/README.md
  • src/notify/pr-lint-reporter/action.yml

Comment thread docs/gitops-update-workflow.md
Comment thread src/lint/deployment-matrix/action.yml
bedatty added a commit that referenced this pull request Apr 15, 2026
…fault resolution

Addresses CodeRabbit feedback on PR #221. The workflow no longer checks out
the manifest at the same ref as itself — it defaults to 'main' (via the
deployment_matrix_ref input) so manifest updates propagate to every caller
without bumping the pinned workflow tag.

- Lead paragraph: replace 'same pinned ref' description.
- Optional inputs table: add deployment_matrix_ref row.
- 'How it works' step 2: rewrite to reflect the new behavior and rationale.
@bedatty merged commit 7982514 into main on Apr 15, 2026
21 checks passed

2 participants