Skip to content

Add S3 upload support for changelogs#43

Draft
cotti wants to merge 4 commits intomainfrom
feature/changelog-upload
Draft

Add S3 upload support for changelogs#43
cotti wants to merge 4 commits intomainfrom
feature/changelog-upload

Conversation

@cotti
Copy link
Copy Markdown
Contributor

@cotti cotti commented Mar 26, 2026

This pull request introduces a new automated workflow for uploading merged changelog files to an S3 bucket. The implementation includes a reusable GitHub Actions workflow, supporting scripts, and documentation updates to guide users on setup and usage.

Changelog Upload Automation

  • Adds a reusable GitHub Actions workflow for uploading changelog files to the elastic-docs-v3-changelog-bundles S3 bucket when a pull request is merged. The workflow determines which changelog files were added or modified, extracts the relevant products, and uploads each file under the correct S3 path. (.github/workflows/changelog-upload.yml)
  • Implements a composite GitHub Action (changelog/upload/action.yml) that checks out the merged commit, finds relevant changelog files, authenticates with AWS using OIDC, and uploads files to S3, handling cases where no changelog file is present gracefully.
  • Adds a Node.js script (changelog/upload/scripts/prepare-upload.js) used by the workflow to parse the repository’s changelog configuration, detect changed changelog files, and extract product information for upload targeting.

Documentation

  • Updates changelog/README.md with clear instructions for enabling and configuring the changelog upload workflow, including workflow YAML examples, OIDC setup requirements, and a step-by-step explanation of how the automation works.

@cotti cotti self-assigned this Mar 26, 2026
@cotti cotti added the enhancement New feature or request label Mar 26, 2026
@cotti cotti marked this pull request as draft March 26, 2026 17:30
run: |
while IFS=' ' read -r FRAGMENT PRODUCT; do
S3_KEY="${PRODUCT}/changelogs/$(basename "${FRAGMENT}")"
echo "Uploading ${FRAGMENT} → s3://elastic-docs-v3-changelog-bundles/${S3_KEY}"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a risk here since PRODUCT comes from the YAML file content and gets used directly in the S3 key. Would it make sense to validate it in prepare-upload.js before it reaches this point? Something like checking it matches [a-zA-Z0-9_-]+ to avoid any unexpected path traversal if a product value ever contains ../ or similar characters.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, doing it now via 1b90a61

owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
per_page: 100,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick thought, listFiles caps at 100 results per page here. Would it make sense to use github.paginate() instead, just in case a PR touches more than 100 files? Probably rare for changelog PRs but larger refactor PRs that also include a changelog entry could hit this silently.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, and we used paginate for changelog additions too. Added in 1b90a61

for (const line of content.split('\n')) {
if (/^products:\s*$/.test(line)) { inProducts = true; continue; }
if (inProducts) {
const m = line.match(/^\s+product:\s*(\S+)/);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious about this, does the changelog YAML use product: as a mapping key directly under products:, or does it use YAML list syntax like - product: kibana? If it's the latter, this regex wouldn't match because of the leading - . Just want to make sure this lines up with what docs-builder changelog add actually produces!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right, confirmed it here. I believe using js-yaml fixed it.

let content;
try {
content = fs.readFileSync(configFile, 'utf8');
} catch (_) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to use a YAML parsing library here (like js-yaml) instead of parsing line by line? The regex approach could miss things like inline values, quoted strings, or comments after values. Since actions/github-script runs in Node, js-yaml should be available or easy to add. That would cover both this function and readProducts below.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done at 1b90a61.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are your thoughts on using https://github.com/vercel/ncc, I think this is a common pattern to bundle the JS, including its npm dependencies.

@cotti cotti requested a review from reakaleek March 26, 2026 23:40
Copy link
Copy Markdown
Member

@Mpdreamz Mpdreamz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this to docs-builder use the same incremental logic we do for uploading docs with content hash checks again S3?

I don't want to rely on changed files detection logic, I also want to have only one source of truth reading the config not duplicate it in an action outside of the main repository that would be harder to validate we are not breaking.

Copy link
Copy Markdown
Member

@Mpdreamz Mpdreamz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security & logic review

Compared against changelog-validate.yml, changelog-submit.yml, changelog/submit/action.yml, and docs-deploy.yml.

Six findings — one high severity that blocks merge, the rest medium/logic/best-practice gaps.


🔴 High

1. aws s3 cp missing --no-follow-symlinkschangelog/upload/action.yml line 62

aws s3 cp "${FRAGMENT}" "s3://elastic-docs-v3-changelog-bundles/${S3_KEY}" \
  --checksum-algorithm SHA256

A merged PR can include a symlink at a path inside the changelog directory (e.g. docs/changelog/v9.2.0.yaml -> /home/runner/.config/...). The upload step would follow that symlink and upload the target file contents to S3 with no indication anything went wrong.

docs-deploy.yml guards against exactly this with --no-follow-symlinks on all S3 copy operations. This should do the same.

Fix: add --no-follow-symlinks to the aws s3 cp invocation.


🟡 Medium

2. Action pinned to @main instead of @v1.github/workflows/changelog-upload.yml line 23

uses: elastic/docs-actions/changelog/upload@main

Every other reusable workflow in this repo references actions at @v1 (a versioned, immutable tag). @main is a mutable branch tip — any future commit to main will silently change the behaviour of every consumer the next time they run. This is a supply-chain hygiene issue regardless of who controls the repo.

Fix: change to @v1 (consistent with all sibling workflows).


3. Missing top-level permissions: {}.github/workflows/changelog-upload.yml

Both changelog-validate.yml (line 11) and changelog-submit.yml (line 15) deny all permissions at the workflow scope and grant only what each job needs. This file skips that and inherits whatever the repository default token permissions are.

Fix: add permissions: {} between the on: block and concurrency:.


4. No guard when merge_commit_sha is emptychangelog/upload/action.yml line 25

ref: ${{ github.event.pull_request.merge_commit_sha }}

merge_commit_sha is only present when the triggering event is pull_request (type closed, merged). If the action is invoked in a different context, this resolves to an empty string and actions/checkout silently falls back to the default branch HEAD — a different commit from what was merged, with no error.

The pattern in changelog/submit/action.yml (lines 57, 93–103) resolves the ref explicitly and then verifies the checkout SHA matches the expected value before proceeding.

Fix: fail fast if merge_commit_sha is empty, similar to the SHA-verification step in changelog/submit/action.yml.


🟠 Logic

5. Space-separated upload pairs break if a filename contains a spaceprepare-upload.js line 50 / action.yml line 59

pairs.push(`${fragmentPath} ${product}`);
while IFS=' ' read -r FRAGMENT PRODUCT; do

A changelog filename with a space (unusual but not impossible) would split incorrectly: FRAGMENT gets everything up to the first space, PRODUCT gets the remainder. The product regex validation would then silently skip the entry rather than failing clearly.

Fix: use a delimiter that cannot appear in either field (e.g. :: or tab), or encode pairs as JSON and parse them in bash.


6. Concurrency group does not deduplicate in workflow_call context.github/workflows/changelog-upload.yml line 12

group: changelog-upload-${{ github.event.pull_request.number || github.run_id }}

In a workflow_call invocation the caller's github.event.pull_request.number is not propagated into the called workflow's own event context — it will always be empty. The || github.run_id fallback makes every run have a unique concurrency key, so the group provides no deduplication (two concurrent uploads for the same PR would both proceed).

The sibling changelog-validate.yml uses the same pattern but is triggered directly by pull_request, so the PR number is always populated there. Here it isn't.

Fix: expose a pr-number workflow input and use it in the concurrency key, or document that deduplication is the caller's responsibility.


Summary

File Line(s) Severity Finding
changelog/upload/action.yml 62 🔴 aws s3 cp missing --no-follow-symlinks
changelog-upload.yml 23 🟡 Action pinned to @main, not @v1
changelog-upload.yml (top) 🟡 Missing top-level permissions: {}
changelog/upload/action.yml 25 🟡 No guard on empty merge_commit_sha
prepare-upload.js:50 + action.yml:59 🟠 Space separator breaks on filenames with spaces
changelog-upload.yml 12 🟠 Concurrency group always falls back to run_id in workflow_call

while IFS=' ' read -r FRAGMENT PRODUCT; do
S3_KEY="${PRODUCT}/changelogs/$(basename "${FRAGMENT}")"
echo "Uploading ${FRAGMENT} → s3://elastic-docs-v3-changelog-bundles/${S3_KEY}"
aws s3 cp "${FRAGMENT}" "s3://elastic-docs-v3-changelog-bundles/${S3_KEY}" \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing --no-follow-symlinks. A merged PR can include a symlink inside the changelog directory (e.g. docs/changelog/v9.2.0.yaml -> /home/runner/.config/...) and aws s3 cp will follow it and silently upload the target file's contents to S3.

docs-deploy.yml uses --no-follow-symlinks on all S3 copy operations for exactly this reason. Apply the same guard:

aws s3 cp "${FRAGMENT}" "s3://elastic-docs-v3-changelog-bundles/${S3_KEY}" \
  --no-follow-symlinks \
  --checksum-algorithm SHA256

id-token: write
steps:
- name: Upload changelog
uses: elastic/docs-actions/changelog/upload@main
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pinned to @main (a mutable branch tip). Every other workflow in this repo uses @v1. With @main, any commit to the main branch silently changes behaviour for all consumers on their next run — this is the definition of a supply-chain risk.

Change to:

uses: elastic/docs-actions/changelog/upload@v1

group: changelog-upload-${{ github.event.pull_request.number || github.run_id }}
cancel-in-progress: true

jobs:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing top-level permissions: {}. Both changelog-validate.yml (line 11) and changelog-submit.yml (line 15) deny all permissions at the workflow scope and then grant only what each job needs. Without this, the workflow inherits the repository's default token permissions.

Add before jobs::

permissions: {}

- name: Checkout
uses: actions/checkout@v6
with:
ref: ${{ github.event.pull_request.merge_commit_sha }}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge_commit_sha is only populated when the triggering event is a merged pull_request. If empty, actions/checkout silently falls back to the default branch HEAD — a different commit than what was merged, with no error or warning.

Compare: changelog/submit/action.yml lines 93–103 explicitly verifies the checked-out SHA matches the expected value before proceeding.

Add a guard step before the checkout:

- name: Verify event context
  shell: bash
  env:
    MERGE_SHA: ${{ github.event.pull_request.merge_commit_sha }}
  run: |
    if [ -z "$MERGE_SHA" ]; then
      echo "::error::merge_commit_sha is empty — must be triggered from a merged pull_request event"
      exit 1
    fi

);
continue;
}
pairs.push(`${fragmentPath} ${product}`);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a space as the field separator between fragmentPath and product is fragile. A changelog filename with a space would cause the bash while IFS=' ' read -r FRAGMENT PRODUCT to split incorrectly — FRAGMENT gets the path up to the first space, PRODUCT gets the remainder, and the product regex silently rejects it.

Consider using a tab (\t) or a sentinel like :: as separator, and adjust the bash IFS accordingly. Or encode the pairs as JSON and parse in bash with jq.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants