diff --git a/.gitattributes b/.gitattributes new file mode 100644 index 0000000..e025bf3 --- /dev/null +++ b/.gitattributes @@ -0,0 +1,2 @@ +.github/workflows/*.lock.yml linguist-generated=true merge=ours +.github/workflows/*.campaign.g.md linguist-generated=true merge=ours diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS new file mode 100644 index 0000000..9363b1b --- /dev/null +++ b/.github/CODEOWNERS @@ -0,0 +1,5 @@ +# TODO: Replace with your team's GitHub usernames or team names +# See https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners + +# Default owners for everything in the repo +# * @llm-d/your-team diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml new file mode 100644 index 0000000..971e0e8 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.yml @@ -0,0 +1,72 @@ +name: Bug Report +description: File a bug report. +title: "[Bug]: " +labels: ["bug", "triage"] +body: + - type: markdown + attributes: + value: | + Thanks for taking the time to fill out this bug report! + + - type: input + id: contact + attributes: + label: Contact Details + description: How can we get in touch with you if we need more info? + placeholder: ex. email@example.com + validations: + required: false + + - type: textarea + id: what-happened + attributes: + label: What happened? + description: Also tell us, what did you expect to happen? + placeholder: Describe the bug and expected behavior + validations: + required: true + + - type: input + id: version + attributes: + label: Version + description: What version are you running? (e.g., v0.1.0, commit SHA, or "main") + placeholder: v0.1.0 + validations: + required: true + + - type: textarea + id: reproduce + attributes: + label: Steps to Reproduce + description: How can we reproduce this issue? + placeholder: | + 1. Deploy with config... + 2. Send request to... + 3. Observe error... 
+ validations: + required: true + + - type: textarea + id: environment + attributes: + label: Environment + description: | + Please provide relevant environment details. + placeholder: | + - Kubernetes version: + - Cloud provider: + - GPU type: + - OS: + render: markdown + validations: + required: false + + - type: textarea + id: logs + attributes: + label: Relevant log output + description: Please copy and paste any relevant log output. This will be automatically formatted into code. + render: shell + validations: + required: false diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 0000000..da03f04 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1,8 @@ +blank_issues_enabled: false +contact_links: + - name: Questions & Discussion + url: https://github.com/llm-d/llm-d/discussions + about: Ask questions and discuss ideas in the llm-d community + - name: Slack + url: https://cloud-native.slack.com/archives/llm-d + about: Chat with the community on Slack diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml new file mode 100644 index 0000000..0bfde2c --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature_request.yml @@ -0,0 +1,65 @@ +name: Feature Request +description: Suggest a new feature or improvement. +title: "[Feature]: " +labels: ["enhancement", "triage"] +body: + - type: markdown + attributes: + value: | + ## Feature Request + + Thank you for suggesting an improvement! Please fill in the details below. + + Before opening a new feature request, please check + [existing issues](https://github.com/llm-d/llm-d-prism/issues) + to see if a similar request already exists. + + - type: textarea + id: problem + attributes: + label: Problem Statement + description: | + What problem does this feature solve? Describe the use case or pain point. + placeholder: "I'm always frustrated when..." 
+ validations: + required: true + + - type: textarea + id: solution + attributes: + label: Proposed Solution + description: | + Describe the solution you'd like. Be as specific as possible about + expected behavior and user experience. + placeholder: "Ideally, it would..." + validations: + required: true + + - type: textarea + id: alternatives + attributes: + label: Alternatives Considered + description: Describe any alternative solutions or workarounds you've considered. + validations: + required: false + + - type: dropdown + id: contribution + attributes: + label: Willingness to Contribute + description: Would you be willing to help implement this feature? + options: + - "Yes, I can submit a PR" + - "Yes, with guidance" + - "No, but I can help test" + - "No" + validations: + required: true + + - type: textarea + id: additional-context + attributes: + label: Additional Context + description: Add any other context, screenshots, links, or references. + validations: + required: false diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md new file mode 100644 index 0000000..42516cf --- /dev/null +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -0,0 +1,26 @@ +## What does this PR do? + + + +## Why is this change needed? + + + +## How was this tested? 
+ + +- [ ] Unit tests added/updated +- [ ] Integration/e2e tests added/updated +- [ ] Manual testing performed + +## Checklist + +- [ ] Commits are signed off (`git commit -s`) per [DCO](PR_SIGNOFF.md) +- [ ] Code follows project [contributing guidelines](CONTRIBUTING.md) +- [ ] Tests pass locally (`make test`) +- [ ] Linters pass (`make lint`) +- [ ] Documentation updated (if applicable) + +## Related Issues + + diff --git a/.github/actions/docker-build-and-push/action.yml b/.github/actions/docker-build-and-push/action.yml new file mode 100644 index 0000000..d2d083a --- /dev/null +++ b/.github/actions/docker-build-and-push/action.yml @@ -0,0 +1,54 @@ +name: Docker Build and Push +description: Build and push multi-arch container image to ghcr.io + +inputs: + image-name: + required: true + description: Image name (e.g., my-project) + tag: + required: true + description: Image tag (e.g., v0.1.0) + github-token: + required: true + description: GitHub token for ghcr.io login + registry: + required: false + description: Container registry + default: ghcr.io/llm-d + platforms: + required: false + description: Target platforms + default: linux/amd64,linux/arm64 + prerelease: + required: false + description: If true, skip tagging as 'latest' + default: "false" + +runs: + using: "composite" + steps: + - name: Set up QEMU + uses: docker/setup-qemu-action@v3 + + - name: Set up Docker Buildx + uses: docker/setup-buildx-action@v3 + + - name: Log in to GitHub Container Registry + run: echo "${{ inputs.github-token }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin + shell: bash + + - name: Build and push image + run: | + if [[ "${{ inputs.prerelease }}" == "true" ]]; then + LATEST_TAG="" + else + LATEST_TAG="-t ${{ inputs.registry }}/${{ inputs.image-name }}:latest" + fi + docker buildx build \ + --platform ${{ inputs.platforms }} \ + --push \ + --annotation "index:org.opencontainers.image.source=${{ github.server_url }}/${{ github.repository }}" \ + --annotation 
"index:org.opencontainers.image.licenses=Apache-2.0" \ + -t ${{ inputs.registry }}/${{ inputs.image-name }}:${{ inputs.tag }} \ + ${LATEST_TAG} . + shell: bash diff --git a/.github/actions/trivy-scan/action.yml b/.github/actions/trivy-scan/action.yml new file mode 100644 index 0000000..c493717 --- /dev/null +++ b/.github/actions/trivy-scan/action.yml @@ -0,0 +1,26 @@ +name: Trivy Security Scan +description: Scan container image for HIGH and CRITICAL vulnerabilities + +inputs: + image: + required: true + description: Container image to scan (e.g., ghcr.io/llm-d/my-project:v0.1.0) + severity: + required: false + description: Severity levels to report + default: HIGH,CRITICAL + exit-code: + required: false + description: Exit code when vulnerabilities are found (0 = warn only, 1 = fail) + default: "0" + +runs: + using: "composite" + steps: + - name: Run Trivy vulnerability scanner + uses: aquasecurity/trivy-action@master + with: + image-ref: ${{ inputs.image }} + format: table + severity: ${{ inputs.severity }} + exit-code: ${{ inputs.exit-code }} diff --git a/.github/dependabot.yml b/.github/dependabot.yml new file mode 100644 index 0000000..257fdd5 --- /dev/null +++ b/.github/dependabot.yml @@ -0,0 +1,71 @@ +# Canonical Dependabot configuration for llm-d repos +# Copy this file to .github/dependabot.yml in your repo +# +# Covers: Go modules, GitHub Actions, Docker base images +# Remove sections that don't apply to your repo + +version: 2 +updates: + + # Go module updates + - package-ecosystem: "gomod" + directory: "/" + schedule: + interval: "weekly" + open-pull-requests-limit: 10 + commit-message: + prefix: "deps(go)" + labels: + - "dependencies" + - "release-note-none" + ignore: + # Ignore major and minor updates to Go toolchain + - dependency-name: "go" + update-types: ["version-update:semver-major", "version-update:semver-minor"] + # Ignore major and minor version updates to k8s packages + - dependency-name: "k8s.io/*" + update-types: 
["version-update:semver-major", "version-update:semver-minor"] + - dependency-name: "sigs.k8s.io/*" + update-types: ["version-update:semver-major", "version-update:semver-minor"] + # Ignore major updates for all packages + - dependency-name: "*" + update-types: ["version-update:semver-major"] + groups: + kubernetes: + patterns: + - "k8s.io/*" + - "sigs.k8s.io/*" + go-dependencies: + patterns: + - "*" + + # GitHub Actions dependencies + - package-ecosystem: "github-actions" + directory: "/" + schedule: + interval: "weekly" + labels: + - "dependencies" + - "release-note-none" + commit-message: + prefix: "deps(actions)" + + # Docker base image updates + - package-ecosystem: "docker" + directory: "/" + schedule: + interval: "weekly" + labels: + - "dependencies" + commit-message: + prefix: "deps(docker)" + + # Python dependencies (uncomment if repo uses pip/requirements.txt) + # - package-ecosystem: "pip" + # directory: "/" + # schedule: + # interval: "weekly" + # labels: + # - "dependencies" + # commit-message: + # prefix: "deps(pip)" diff --git a/.github/workflows/ci-pr-checks.yaml b/.github/workflows/ci-pr-checks.yaml new file mode 100644 index 0000000..2903c2d --- /dev/null +++ b/.github/workflows/ci-pr-checks.yaml @@ -0,0 +1,117 @@ +name: CI - PR Checks + +on: + pull_request: + branches: + - main + +jobs: + # Detect if PR contains code changes (skip expensive checks for docs-only PRs) + check-code-changes: + runs-on: ubuntu-latest + permissions: + contents: read + outputs: + has_code_changes: ${{ steps.filter.outputs.code }} + steps: + - name: Checkout source + uses: actions/checkout@v6 + + - name: Check for code changes + uses: dorny/paths-filter@v4 + id: filter + with: + filters: | + code: + - '!docs/**' + - '!**/*.md' + - '!LICENSE' + - '!OWNERS' + + # Go: lint, build, and test + lint-and-test: + runs-on: ubuntu-latest + needs: check-code-changes + if: needs.check-code-changes.outputs.has_code_changes == 'true' + steps: + - name: Checkout source + uses: 
actions/checkout@v6 + + - name: Extract Go version from go.mod + run: sed -En 's/^go (.*)$/GO_VERSION=\1/p' go.mod >> $GITHUB_ENV + + - name: Set up Go + uses: actions/setup-go@v6 + with: + go-version: "${{ env.GO_VERSION }}" + cache-dependency-path: ./go.sum + + - name: Download dependencies + run: go mod download + + - name: Run golangci-lint + uses: golangci/golangci-lint-action@v9 + with: + version: v2.8.0 + args: "" + + - name: Build + run: make build + + - name: Test + run: make test + + # Python: lint (only if Python files exist) + python-lint: + runs-on: ubuntu-latest + needs: check-code-changes + if: needs.check-code-changes.outputs.has_code_changes == 'true' + steps: + - name: Checkout source + uses: actions/checkout@v6 + + - name: Check for Python files + id: check-python + run: | + if find . -name "*.py" -not -path "./.git/*" | head -1 | grep -q .; then + echo "has_python=true" >> "$GITHUB_OUTPUT" + else + echo "has_python=false" >> "$GITHUB_OUTPUT" + fi + + - name: Set up Python + if: steps.check-python.outputs.has_python == 'true' + uses: actions/setup-python@v6 + with: + python-version: "3.11" + + - name: Install ruff + if: steps.check-python.outputs.has_python == 'true' + run: pip install ruff + + - name: Run ruff check + if: steps.check-python.outputs.has_python == 'true' + run: ruff check . + + - name: Run ruff format check + if: steps.check-python.outputs.has_python == 'true' + run: ruff format --check . + + # Container: build (no push) to validate Dockerfile + container-build: + runs-on: ubuntu-latest + needs: check-code-changes + if: needs.check-code-changes.outputs.has_code_changes == 'true' + steps: + - name: Checkout source + uses: actions/checkout@v6 + + - name: Set up Docker Buildx + uses: docker/setup-buildx-action@v4 + + - name: Build container (no push) + run: | + docker buildx build \ + --platform linux/amd64 \ + --tag test-build:pr-${{ github.event.pull_request.number }} \ + . 
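Reviewer note on the `Extract Go version from go.mod` step in `ci-pr-checks.yaml`: the step relies on a one-line `sed` filter. The sketch below reproduces that filter so it can be checked locally. The function name `extract_go_version` and the sample `go.mod` contents are illustrative assumptions and are not part of this change.

```shell
# Hypothetical helper (not part of this change): print GO_VERSION=<version>
# for the `go` directive of the go.mod file passed as $1.
extract_go_version() {
  # -n suppresses default output; the trailing `p` prints only the lines on
  # which the substitution matched, i.e. the `go <version>` directive.
  sed -En 's/^go (.*)$/GO_VERSION=\1/p' "$1"
}

# Demo against a throwaway go.mod with made-up contents.
demo_dir="$(mktemp -d)"
printf 'module example.com/demo\n\ngo 1.24.0\n\ntoolchain go1.24.2\n' > "$demo_dir/go.mod"
extract_go_version "$demo_dir/go.mod"
rm -rf "$demo_dir"
```

The `toolchain` line is not picked up because the pattern requires `go` followed by a space at the start of the line. In the workflow the same output is appended to `$GITHUB_ENV`, which is how `env.GO_VERSION` becomes available to the `actions/setup-go` step.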
diff --git a/.github/workflows/ci-release.yaml b/.github/workflows/ci-release.yaml new file mode 100644 index 0000000..0397ba9 --- /dev/null +++ b/.github/workflows/ci-release.yaml @@ -0,0 +1,53 @@ +name: CI - Release + +on: + push: + tags: + - 'v*' + release: + types: [published] + +permissions: + contents: read + packages: write + +jobs: + docker-build-and-push: + runs-on: ubuntu-latest + steps: + - name: Checkout source + uses: actions/checkout@v6 + + - name: Set project name from repository + id: meta + run: | + repo="${GITHUB_REPOSITORY##*/}" + echo "project_name=$repo" >> "$GITHUB_OUTPUT" + + - name: Determine tag + id: tag + run: | + if [[ "${GITHUB_EVENT_NAME}" == "release" ]]; then + echo "tag=${GITHUB_REF##refs/tags/}" >> "$GITHUB_OUTPUT" + echo "prerelease=${{ github.event.release.prerelease }}" >> "$GITHUB_OUTPUT" + elif [[ "${GITHUB_REF}" == refs/tags/* ]]; then + echo "tag=${GITHUB_REF##refs/tags/}" >> "$GITHUB_OUTPUT" + echo "prerelease=false" >> "$GITHUB_OUTPUT" + else + echo "tag=latest" >> "$GITHUB_OUTPUT" + echo "prerelease=false" >> "$GITHUB_OUTPUT" + fi + shell: bash + + - name: Build and push + uses: ./.github/actions/docker-build-and-push + with: + image-name: ${{ steps.meta.outputs.project_name }} + tag: ${{ steps.tag.outputs.tag }} + github-token: ${{ secrets.GITHUB_TOKEN }} + prerelease: ${{ steps.tag.outputs.prerelease }} + + - name: Trivy security scan + uses: ./.github/actions/trivy-scan + with: + image: ghcr.io/llm-d/${{ steps.meta.outputs.project_name }}:${{ steps.tag.outputs.tag }} diff --git a/.github/workflows/ci-signed-commits.yaml b/.github/workflows/ci-signed-commits.yaml new file mode 100644 index 0000000..2b374bf --- /dev/null +++ b/.github/workflows/ci-signed-commits.yaml @@ -0,0 +1,9 @@ +name: Check Signed Commits +on: pull_request_target + +jobs: + signed-commits: + uses: llm-d/llm-d-infra/.github/workflows/reusable-signed-commits.yml@main + permissions: + contents: read + pull-requests: write diff --git 
a/.github/workflows/copilot-setup-steps.yml b/.github/workflows/copilot-setup-steps.yml new file mode 100644 index 0000000..230707a --- /dev/null +++ b/.github/workflows/copilot-setup-steps.yml @@ -0,0 +1,20 @@ +name: "Copilot Setup Steps" + +on: + workflow_dispatch: + push: + paths: + - .github/workflows/copilot-setup-steps.yml + +jobs: + copilot-setup-steps: + runs-on: ubuntu-latest + permissions: + contents: read + steps: + - name: Install gh-aw extension + run: | + curl -fsSL https://raw.githubusercontent.com/githubnext/gh-aw/refs/heads/main/install-gh-aw.sh | bash + + - name: Verify gh-aw installation + run: gh aw version diff --git a/.github/workflows/non-main-gatekeeper.yml b/.github/workflows/non-main-gatekeeper.yml new file mode 100644 index 0000000..03bd4d3 --- /dev/null +++ b/.github/workflows/non-main-gatekeeper.yml @@ -0,0 +1,8 @@ +name: Non-Main Gatekeeper +on: + pull_request: + types: [opened, edited, synchronize, reopened] + +jobs: + gatekeeper: + uses: llm-d/llm-d-infra/.github/workflows/reusable-non-main-gatekeeper.yml@main diff --git a/.github/workflows/prow-github.yml b/.github/workflows/prow-github.yml new file mode 100644 index 0000000..6fe0b23 --- /dev/null +++ b/.github/workflows/prow-github.yml @@ -0,0 +1,12 @@ +name: Prow Commands +on: + issue_comment: + types: [created] + +permissions: + issues: write + pull-requests: write + +jobs: + prow: + uses: llm-d/llm-d-infra/.github/workflows/reusable-prow-commands.yml@main diff --git a/.github/workflows/prow-pr-automerge.yml b/.github/workflows/prow-pr-automerge.yml new file mode 100644 index 0000000..b837684 --- /dev/null +++ b/.github/workflows/prow-pr-automerge.yml @@ -0,0 +1,8 @@ +name: Prow Auto-merge +on: + schedule: + - cron: "*/5 * * * *" + +jobs: + auto-merge: + uses: llm-d/llm-d-infra/.github/workflows/reusable-prow-automerge.yml@main diff --git a/.github/workflows/prow-pr-remove-lgtm.yml b/.github/workflows/prow-pr-remove-lgtm.yml new file mode 100644 index 0000000..7625ee2 --- 
/dev/null +++ b/.github/workflows/prow-pr-remove-lgtm.yml @@ -0,0 +1,6 @@ +name: Prow Remove LGTM +on: pull_request + +jobs: + remove-lgtm: + uses: llm-d/llm-d-infra/.github/workflows/reusable-prow-remove-lgtm.yml@main diff --git a/.github/workflows/stale.yaml b/.github/workflows/stale.yaml new file mode 100644 index 0000000..af558db --- /dev/null +++ b/.github/workflows/stale.yaml @@ -0,0 +1,11 @@ +name: Mark Stale Issues +on: + schedule: + - cron: '0 1 * * *' + +jobs: + stale: + uses: llm-d/llm-d-infra/.github/workflows/reusable-stale.yml@main + permissions: + issues: write + pull-requests: write diff --git a/.github/workflows/unstale.yaml b/.github/workflows/unstale.yaml new file mode 100644 index 0000000..5f85771 --- /dev/null +++ b/.github/workflows/unstale.yaml @@ -0,0 +1,12 @@ +name: Unstale Issues +on: + issues: + types: [reopened] + issue_comment: + types: [created] + +jobs: + unstale: + uses: llm-d/llm-d-infra/.github/workflows/reusable-unstale.yml@main + permissions: + issues: write diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..12af59e --- /dev/null +++ b/.gitignore @@ -0,0 +1,35 @@ +# Binaries +*.exe +*.dll +*.so +*.dylib +bin/ + +# Test +*.test +*.out +coverage.txt +coverage.html + +# Go workspace +go.work +go.work.sum + +# Build output +/build + +# IDE +.idea/ +.vscode/ +*.swp +*.swo + +# OS +.DS_Store +Thumbs.db + +# Environment +.env +.envrc +.secrets.env +vendor/ diff --git a/.golangci.yml b/.golangci.yml new file mode 100644 index 0000000..70ad8aa --- /dev/null +++ b/.golangci.yml @@ -0,0 +1,122 @@ +version: "2" +run: + concurrency: 4 + go: "" + modules-download-mode: readonly + issues-exit-code: 1 + tests: true +output: + formats: + text: + path: stdout + print-linter-name: true + print-issued-lines: false + colors: false +linters: + default: none + enable: + - asasalint + - asciicheck + - bidichk + - bodyclose + - contextcheck + - durationcheck + - errcheck + - errname + - errorlint + - ginkgolinter + - gocritic + - 
godot + - gosec + - govet + - grouper + - importas + - ineffassign + - lll + - loggercheck + - makezero + - misspell + - nakedret + - nestif + - nilerr + - nilnil + - noctx + - nolintlint + - nonamedreturns + - nosprintfhostport + - prealloc + - predeclared + - promlinter + - reassign + - revive + - staticcheck + # - tagliatelle + - testableexamples + - testpackage + - thelper + - tparallel + - unconvert + - unparam + - usestdlibvars + - varnamelen + - whitespace + settings: + errcheck: + check-type-assertions: true + check-blank: true + gocritic: + enabled-tags: + - diagnostic + - experimental + - opinionated + - performance + - style + lll: + line-length: 130 + nakedret: + max-func-lines: 1 + revive: + rules: + - name: dot-imports + disabled: true + staticcheck: + checks: + - all + varnamelen: + max-distance: 20 + min-name-length: 2 + check-type-param: true + ignore-type-assert-ok: true + ignore-map-index-ok: true + ignore-chan-recv-ok: true + ignore-decls: + - c echo.Context + - t *testing.T + - w http.ResponseWriter + - r *http.Request + exclusions: + generated: lax + presets: + - comments + - common-false-positives + - legacy + - std-error-handling + paths: + - third_party$ + - builtin$ + - examples$ +issues: + max-issues-per-linter: 0 + max-same-issues: 0 + uniq-by-line: false + fix: false +formatters: + enable: + - gofumpt + - goimports + exclusions: + generated: lax + paths: + - third_party$ + - builtin$ + - examples$ \ No newline at end of file diff --git a/.hadolint.yaml b/.hadolint.yaml new file mode 100644 index 0000000..f528647 --- /dev/null +++ b/.hadolint.yaml @@ -0,0 +1,24 @@ +--- +failure-threshold: warning # only fail on warnings and errors, not info + +ignored: + # already ignored in pre-commit + - DL3008 # pin versions for apt-get + - DL3009 # delete apt-get cache + + # workflow/build patterns + - DL3003 # use WORKDIR instead of cd + - DL3059 # multiple consecutive RUN instructions + - DL4001 # using both wget and curl + + # package manager 
preferences + - DL3015 # avoid additional packages (--no-install-recommends) + - DL3041 # specify version with dnf install + - DL3042 # avoid cache directory with pip + - DL3047 # wget without progress bar + + # shellcheck rules embedded in hadolint + - SC1091 # not following sourced files + - SC2034 # variable appears unused + - SC2046 # quote to prevent word splitting + - SC2086 # double quote to prevent globbing diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml new file mode 100644 index 0000000..8dbc2ec --- /dev/null +++ b/.pre-commit-config.yaml @@ -0,0 +1,60 @@ +# Canonical pre-commit configuration for llm-d repos +# Copy this file to the root of your repo +# +# Install: pip install pre-commit && pre-commit install +# Run all: pre-commit run --all-files + +repos: + # General file hygiene hooks + - repo: https://github.com/pre-commit/pre-commit-hooks + rev: v6.0.0 + hooks: + - id: trailing-whitespace + - id: end-of-file-fixer + - id: check-yaml + args: [--unsafe] # allows custom YAML tags used in k8s + - id: check-json + - id: check-added-large-files + args: [--maxkb=1000] + - id: check-merge-conflict + - id: mixed-line-ending + - id: check-case-conflict + + # Shell script linting (requires shellcheck installed) + - repo: local + hooks: + - id: shellcheck + name: shellcheck + language: system + entry: shellcheck + args: [-x, --severity=warning] + types: [shell] + + # Dockerfile linting (requires hadolint installed) + - repo: local + hooks: + - id: hadolint-docker + name: hadolint + language: system + entry: hadolint + args: [--failure-threshold, error] + files: Dockerfile(\..*)?$ + types: [file] + + # Markdown linting + - repo: https://github.com/igorshubovych/markdownlint-cli + rev: v0.47.0 + hooks: + - id: markdownlint + args: [--fix] + + # YAML linting + - repo: https://github.com/adrienverge/yamllint + rev: v1.37.1 + hooks: + - id: yamllint + args: + - -d + - >- + {extends: default, rules: {line-length: {max: 250}, + document-start: disable, 
truthy: {check-keys: false}}} diff --git a/.prowlabels.yaml b/.prowlabels.yaml new file mode 100644 index 0000000..e1cace9 --- /dev/null +++ b/.prowlabels.yaml @@ -0,0 +1,11 @@ +# Labels for labeling issues and pull requests using GitHub prow action. +kind: + - 'bug' + - 'security' + - 'feature' + - 'docs' + +priority: + - 'p0' + - 'p1' + - 'p2' diff --git a/.yamllint.yml b/.yamllint.yml new file mode 100644 index 0000000..54e476d --- /dev/null +++ b/.yamllint.yml @@ -0,0 +1,8 @@ +# YAML lint configuration for Kubernetes-heavy repos +extends: default +rules: + line-length: + max: 250 + document-start: disable + truthy: + check-keys: false diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index 6070ae1..576122c 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -1,95 +1,34 @@ -# Code of Conduct +## Code of Conduct and Covenant ## Our Pledge -In the interest of fostering an open and welcoming environment, we as -contributors and maintainers pledge to making participation in our project and -our community a harassment-free experience for everyone, regardless of age, body -size, disability, ethnicity, gender identity and expression, level of -experience, education, socio-economic status, nationality, personal appearance, -race, religion, or sexual identity and orientation. +In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. 
## Our Standards -Examples of behavior that contributes to creating a positive environment -include: +Examples of behavior that contributes to creating a positive environment include: -* Using welcoming and inclusive language -* Being respectful of differing viewpoints and experiences -* Gracefully accepting constructive criticism -* Focusing on what is best for the community -* Showing empathy towards other community members +* Using welcoming and inclusive language +* Being respectful of differing viewpoints and experiences +* Gracefully accepting constructive criticism +* Focusing on what is best for the community +* Showing empathy towards other community members Examples of unacceptable behavior by participants include: -* The use of sexualized language or imagery and unwelcome sexual attention or - advances -* Trolling, insulting/derogatory comments, and personal or political attacks -* Public or private harassment -* Publishing others' private information, such as a physical or electronic - address, without explicit permission -* Disrespecting the community's time by sending spam or other unsolicited - commercial messages -* Other conduct which could reasonably be considered inappropriate in a - professional setting +* The use of sexualized language or imagery and unwelcome sexual attention or advances +* Trolling, insulting/derogatory comments, and personal or political attacks +* Public or private harassment +* Publishing others' private information, such as a physical or electronic address, without explicit permission +* Other conduct which could reasonably be considered inappropriate in a professional setting ## Our Responsibilities -Project maintainers are responsible for clarifying the standards of acceptable -behavior and are expected to take appropriate and fair corrective action in -response to any instances of unacceptable behavior. 
- -Project maintainers have the right and responsibility to remove, edit, or reject -comments, commits, code, wiki edits, issues, and other contributions that are -not aligned to this Code of Conduct, or to ban temporarily or permanently any -contributor for other behaviors that they deem inappropriate, threatening, -offensive, or harmful. +Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful. ## Scope -This Code of Conduct applies both within project spaces and in public spaces -when an individual is representing the project or its community. Examples of -representing a project or community include using an official project e-mail -address, posting via an official social media account, or acting as an appointed -representative at an online or offline event. Representation of a project may be -further defined and clarified by project maintainers. - -This Code of Conduct also applies outside the project spaces when the Project -Steward has a reasonable belief that an individual's behavior may have a -negative impact on the project or its community. - -## Conflict Resolution - -We do not believe that all conflict is bad; healthy debate and disagreement -often yield positive results. However, it is never okay to be disrespectful or -to engage in behavior that violates the project’s code of conduct. - -If you see someone violating the code of conduct, you are encouraged to address -the behavior directly with those involved. Many issues can be resolved quickly -and easily, and this gives people more control over the outcome of their -dispute. If you are unable to resolve the matter for any reason, or if the -behavior is threatening or harassing, report it. 
We are dedicated to providing -an environment where participants feel welcome and safe. - -Reports should be directed to *[PROJECT STEWARD NAME(s) AND EMAIL(s)]*, the -Project Steward(s) for *[PROJECT NAME]*. It is the Project Steward’s duty to -receive and address reported violations of the code of conduct. They will then -work with a committee consisting of representatives from the Open Source -Programs Office and the Google Open Source Strategy team. If for any reason you -are uncomfortable reaching out to the Project Steward, please email -opensource@google.com. - -We will investigate every complaint, but you may not receive a direct response. -We will use our discretion in determining when and how to follow up on reported -incidents, which may range from not taking action to permanent expulsion from -the project and project-sponsored spaces. We will notify the accused of the -report and provide them an opportunity to discuss it before any action is taken. -The identity of the reporter will be omitted from the details of the report -supplied to the accused. In potentially harmful situations, such as ongoing -harassment or threats to anyone's safety, we may take action without notice. - -## Attribution +This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. 
Representation of a project may be further defined and clarified by project maintainers. + +## Attribution + +This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at -This Code of Conduct is adapted from the Contributor Covenant, version 1.4, -available at -https://www.contributor-covenant.org/version/1/4/code-of-conduct/ +For answers to common questions about this code of conduct, see diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index b16bd94..9774c91 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,33 +1,99 @@ -# How to contribute +## Contributing Guidelines -We'd love to accept your patches and contributions to this project. +Thank you for your interest in contributing to this project. Community involvement is highly valued and crucial for the project's growth and success. This project accepts contributions via GitHub pull requests. This document outlines the process to help get your contribution accepted. -## Before you begin +To ensure a clear direction and cohesive vision for the project, the project leads have the final decision on all contributions. However, these guidelines outline how you can contribute effectively. -### Sign our Contributor License Agreement +## How You Can Contribute -Contributions to this project must be accompanied by a -[Contributor License Agreement](https://cla.developers.google.com/about) (CLA). -You (or your employer) retain the copyright to your contribution; this simply -gives us permission to use and redistribute your contributions as part of the -project. +There are several ways you can contribute: -If you or your current employer have already signed the Google CLA (even if it -was for a different project), you probably don't need to do it again. +* **Reporting Issues:** Help us identify and fix bugs by reporting them clearly and concisely. +* **Suggesting Features:** Share your ideas for new features or improvements. 
+* **Improving Documentation:** Help make the project more accessible by enhancing the documentation. +* **Submitting Code Contributions:** Code contributions that align with the project's vision are always welcome. -Visit to see your current agreements or to -sign a new one. +## Code of Conduct -### Review our community guidelines +This project adheres to the [Code of Conduct and Covenant](CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. -This project follows -[Google's Open Source Community Guidelines](https://opensource.google/conduct/). +## Community and Communication -## Contribution process +* **Developer Slack:** [Join our developer Slack workspace](https://llm-d.ai/slack) to connect with the core maintainers and other contributors, ask questions, and participate in discussions. +* **Weekly Meetings:** Project updates, ongoing work discussions, and Q&A will be covered in our weekly project meetings. Please join by [adding the shared calendar](https://red.ht/llm-d-public-calendar). You can also [join our Google Group](https://groups.google.com/g/llm-d-contributors) for access to shared content. +* **Code:** Hosted in the [llm-d](https://github.com/llm-d) GitHub organization +* **Issues:** Project-scoped bugs or issues should be reported in this repo or in [llm-d/llm-d](https://github.com/llm-d/llm-d) +* **Mailing List:** [llm-d-contributors@googlegroups.com](mailto:llm-d-contributors@googlegroups.com) -### Code reviews +## Contributing Process -All submissions, including submissions by project members, require review. We -use GitHub pull requests for this purpose. Consult -[GitHub Help](https://help.github.com/articles/about-pull-requests/) for more -information on using pull requests. +We follow a **lazy consensus** approach: changes proposed by people with responsibility for a problem, without disagreement from others, within a bounded time window of review by their peers, should be accepted. 
+ +### Types of Contributions + +#### 1. Features with Public APIs or New Components + +All features involving public APIs, behavior between core components, or new core subsystems must be accompanied by an **approved project proposal**. + +**Process:** + +1. Create a pull request adding a proposal document under `./docs/proposals/` with a descriptive name +2. Include these sections: Summary, Motivation (Goals/Non-Goals), Proposal, Design Details, Alternatives +3. Get review from impacted component maintainers +4. Get approval from project maintainers + +#### 2. Fixes, Issues, and Bugs + +For changes that fix broken code or add small changes within a component: + +* All bugs and commits must have a clear description of the bug, how to reproduce, and how the change is made +* Any other changes can be proposed in a pull request — a maintainer must approve the change +* For moderate size changes, create an RFC issue in GitHub, then engage in Slack + +## Code Review Requirements + +* **All code changes** must be submitted as pull requests (no direct pushes) +* **All changes** must be reviewed and approved by a maintainer other than the author +* **All repositories** must gate merges on compilation and passing tests +* **All experimental features** must be off by default and require explicit opt-in + +## Commit and Pull Request Style + +* **Pull requests** should describe the problem succinctly +* **Rebase and squash** before merging +* **Use minimal commits** and break large changes into distinct commits +* **Commit messages** should have: + * Short, descriptive titles + * Description of why the change was needed + * Enough detail for someone reviewing git history to understand the scope +* **DCO Sign-off**: All commits must include a valid DCO sign-off line (`Signed-off-by: Name `) + * Add automatically with `git commit -s` + * See [PR_SIGNOFF.md](PR_SIGNOFF.md) for configuration details + * Required for all contributions per [Developer Certificate of 
Origin](https://developercertificate.org/) + +## Code Organization and Ownership + +* **Components** are the primary unit of code organization +* **Maintainers** own components and approve changes +* **Contributors** can become maintainers through sufficient evidence of contribution +* Code ownership is reflected in [OWNERS files](https://go.k8s.io/owners) consistent with Kubernetes project conventions + +## Testing Requirements + +We use three tiers of testing: + +1. **Unit tests**: Fast verification of code parts, testing different arguments +2. **Integration tests**: Testing protocols between components and built artifacts +3. **End-to-end (e2e) tests**: Whole system testing including benchmarking + +Strong e2e coverage is required for deployed systems to prevent performance regression. Appropriate test coverage is an important part of code review. + +## Security + +See [SECURITY.md](SECURITY.md) for our vulnerability disclosure process. + +## API Changes and Deprecation + +* **No breaking changes**: Once an API/protocol is in GA release, it cannot be removed or behavior changed +* **Versioning**: All protocols and APIs should be versionable with clear compatibility requirements +* **Documentation**: All APIs must have documented specs describing expected behavior diff --git a/LICENSE b/LICENSE index d9a10c0..2bb9ad2 100644 --- a/LICENSE +++ b/LICENSE @@ -173,4 +173,4 @@ incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. 
- END OF TERMS AND CONDITIONS + END OF TERMS AND CONDITIONS \ No newline at end of file diff --git a/Makefile b/Makefile new file mode 100644 index 0000000..6eb49c6 --- /dev/null +++ b/Makefile @@ -0,0 +1,107 @@ +# Project configuration +# TODO: Replace {{PROJECT_NAME}} with your project name +PROJECT_NAME ?= {{PROJECT_NAME}} +REGISTRY ?= ghcr.io/llm-d +IMAGE ?= $(REGISTRY)/$(PROJECT_NAME) +VERSION ?= $(shell git describe --tags --always --dirty 2>/dev/null || echo "dev") +PLATFORMS ?= linux/amd64,linux/arm64 + +# Go configuration +GOFLAGS ?= +LDFLAGS ?= -s -w -X main.version=$(VERSION) + +# Tools +GOLANGCI_LINT_VERSION ?= v2.8.0 + +.DEFAULT_GOAL := help + +##@ General + +.PHONY: help +help: ## Show this help message + @awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m\033[0m\n"} /^[a-zA-Z_0-9-]+:.*?##/ { printf " \033[36m%-20s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST) + +##@ Development + +.PHONY: build +build: ## Build the Go binary + go build $(GOFLAGS) -ldflags "$(LDFLAGS)" -o bin/$(PROJECT_NAME) . + +.PHONY: test +test: ## Run tests with race detection + go test -race -count=1 ./... + +.PHONY: test-coverage +test-coverage: ## Run tests with coverage report + go test -race -coverprofile=coverage.out -covermode=atomic ./... + go tool cover -html=coverage.out -o coverage.html + +.PHONY: lint +lint: lint-go lint-python ## Run all linters + +.PHONY: lint-go +lint-go: ## Run Go linter (golangci-lint v2) + golangci-lint run + +.PHONY: lint-python +lint-python: ## Run Python linter (ruff) — skipped if no Python files found + @if ls *.py **/*.py 2>/dev/null | head -1 > /dev/null 2>&1; then \ + ruff check . && ruff format --check .; \ + else \ + echo "No Python files found, skipping Python lint"; \ + fi + +.PHONY: fmt +fmt: ## Format Go and Python code + gofmt -w . 
+ @if ls *.py **/*.py 2>/dev/null | head -1 > /dev/null 2>&1; then \ + ruff format .; \ + fi + +.PHONY: generate +generate: ## Run go generate + go generate ./... + +.PHONY: vet +vet: ## Run go vet + go vet ./... + +.PHONY: tidy +tidy: ## Run go mod tidy + go mod tidy + +.PHONY: pre-commit +pre-commit: ## Run pre-commit hooks on all files + pre-commit run --all-files + +##@ Container + +.PHONY: image-build +image-build: ## Build multi-arch container image (local only) + docker buildx build \ + --platform $(PLATFORMS) \ + --tag $(IMAGE):$(VERSION) \ + --tag $(IMAGE):latest \ + . + +.PHONY: image-push +image-push: ## Build and push multi-arch container image + docker buildx build \ + --platform $(PLATFORMS) \ + --push \ + --annotation "index:org.opencontainers.image.source=https://github.com/llm-d/$(PROJECT_NAME)" \ + --annotation "index:org.opencontainers.image.licenses=Apache-2.0" \ + --tag $(IMAGE):$(VERSION) \ + --tag $(IMAGE):latest \ + . + +##@ CI Helpers + +.PHONY: ci-lint +ci-lint: ## CI: install and run golangci-lint + @which golangci-lint > /dev/null 2>&1 || go install github.com/golangci/golangci-lint/v2/cmd/golangci-lint@$(GOLANGCI_LINT_VERSION) + golangci-lint run + +.PHONY: clean +clean: ## Remove build artifacts + rm -rf bin/ coverage.out coverage.html diff --git a/OWNERS b/OWNERS new file mode 100644 index 0000000..8aa0a15 --- /dev/null +++ b/OWNERS @@ -0,0 +1,7 @@ +# See https://go.k8s.io/owners for format documentation + +reviewers: + - seanhorgan + +approvers: + - seanhorgan diff --git a/PR_SIGNOFF.md b/PR_SIGNOFF.md new file mode 100644 index 0000000..cfc0d4c --- /dev/null +++ b/PR_SIGNOFF.md @@ -0,0 +1,203 @@ +# Git Commit Signoff and Signing + +**NOTE**: "sign-off" is different from "signing" a commit. The former +indicates your assent to the repository's terms for contributors, the +latter adds a cryptographic signature that is rarely displayed. See +[the git +book](https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work) +about signing. 
For commit sign-off, see the `--signoff` +(`-s`) option of `git commit`. GitHub has a concept of [a commit being +"verified"](https://docs.github.com/en/authentication/managing-commit-signature-verification) +that extends the Git concept of signing. + +In order to get a pull request approved, you must first complete a DCO +sign-off for each commit that the request is asking to add to the +repository. This process is defined by the CNCF, and there are two +cases: individual contributors and contributors that work for a +corporate CNCF member. Both indicate consent to the terms stated in [the +`DCO` file at the root of this Git +repository](https://github.com/llm-d/llm-d/blob/main/DCO). In +the case of an individual, DCO sign-off is accomplished by doing a Git +"sign-off" on the commit. + +We prefer that commits contributed to this repository be signed and +GitHub verified, but this is not strictly necessary or enforced. + +## Commit Sign-off + +Your submitted PR must pass the automated checks in order to be merged. One of these checks verifies that each commit that you propose to contribute is signed off. If you use the `git` shell command, this involves passing the `-s` flag on the command line. For example, the following command will create a signed-off commit but _not_ sign it. + +```shell +git commit -s +``` + +Alternatively, the following command will create a commit that is both signed-off and signed. + +```shell +git commit -s -S +``` + +For other tools, consult their documentation. + +## Signing Commits + +Before signing any commits, you must have a GPG or SSH key. Basic setup instructions can be found below (for more detailed instructions, refer to the GitHub [GPG](https://docs.github.com/en/authentication/managing-commit-signature-verification/generating-a-new-gpg-key) and [SSH](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent#generating-a-new-ssh-key) setup pages). 
+ +To sign a particular commit, you must either include `-S` on the `git commit` command line (see the command shown above for an example) or have configured automatic signing (see ["Everyone Must Sign" in the Git Book](https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work#_everyone_must_sign) for a hint about that). + +Before starting, make sure that your user email is verified on GitHub. To check for this: + +1. Log in to GitHub and navigate to your GitHub **Settings** page +2. In the sidebar, open the **Emails** tab +3. Emails associated with GitHub should be listed at the top of the page under the "Emails" label +4. An unverified email would have an "Unverified" label under it in orange text +5. To verify, click **Resend verification email** and follow its prompts +6. Navigate back to your **Emails** page. If the "Unverified" label is no longer there, then you're good to go! + +
+ +For Windows users, **Git Bash** is also highly recommended. + +
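Before setting up a key, it can help to see what sign-off alone produces, since the DCO check looks only for this trailer and not for a cryptographic signature. The following is a minimal sketch run in a scratch repository; the name and email address are placeholders, not values from this project.

```shell
# Sketch only: placeholder identity, throwaway repository.
repo="$(mktemp -d)"                 # scratch repository so nothing real is touched
git init -q "$repo"
cd "$repo"
git config user.name "Example Contributor"
git config user.email "contributor@example.com"
git config commit.gpgsign false     # keep the demo independent of any global signing setup

# -s adds the sign-off trailer; no -S, so the commit is not signed.
git commit -q -s --allow-empty -m "Demonstrate DCO sign-off"

# Print the full commit message, including the trailer added by -s.
git log -1 --format=%B
```

The printed message ends with a `Signed-off-by:` trailer built from the configured name and email, which is why that email should match the one verified on GitHub.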
+ +## Setting up the GPG Key + +1. Install GnuPG (the GPG command line tool). + - Binary releases for your specific OS can be found + [on the GnuPG download page](https://www.gnupg.org/download/) after scrolling down to the Binary Releases + section (e.g. Gpg4win on Windows, Mac GPG for macOS, etc). + - After downloading the installer, follow the prompts to set up GnuPG. + +2. Open Git Bash (or your CLI of choice) and use the following command to generate your GPG key pair: + + ```shell + gpg --full-generate-key + ``` + +3. If prompted to specify the size, type, and duration of the key that you want, press `Enter` to select the default option. +4. Once prompted, enter your user info and a passphrase: + - Make sure the email you enter is the same one that is verified by GitHub +5. Use the following command to list the long form of your generated GPG keys: + + ```shell + gpg --list-secret-keys --keyid-format=long + ``` + + - Your GPG key ID should be the characters on the output line starting with `sec`, beginning directly after the `/` and ending before the listed date. + - For example, in the output below (from the GitHub [GPG](https://docs.github.com/en/authentication/managing-commit-signature-verification/generating-a-new-gpg-key) setup page), the GPG key ID would be `3AA5C34371567BD2` + + ```shell + $ gpg --list-secret-keys --keyid-format=long + /Users/hubot/.gnupg/secring.gpg + ------------------------------------ + sec 4096R/3AA5C34371567BD2 2016-03-10 [expires: 2017-03-10] + uid Hubot + ssb 4096R/4BB6D45482678BE3 2016-03-10 + ``` + +6. Copy your GPG key ID and run the command below, replacing `[your_GPG_key_ID]` with the key ID you just copied: + + ```shell + gpg --armor --export [your_GPG_key_ID] + ``` + +7. This should generate an output with your GPG key. Copy the characters starting from `-----BEGIN PGP PUBLIC KEY BLOCK-----` and ending at `-----END PGP PUBLIC KEY BLOCK-----` (inclusive) to your clipboard. +8. 
After copying or saving your GPG key, navigate to **Settings** in your GitHub account +9. Navigate to the **SSH and GPG keys** page under the Access section in the sidebar +10. Under GPG keys, select **New GPG key** + - Enter a suitable name for your key under "Title" and paste the GPG key that you copied/saved in **Step 7** under "Key". + - Once done, click **Add GPG key** +11. Your new GPG key should now be displayed under GPG keys. + +
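Registering the key with GitHub does not by itself tell Git which key to sign with. A minimal sketch of that last piece follows, assuming the example key ID `3AA5C34371567BD2` from step 5 (substitute your own) and using a scratch repository so that no existing settings are changed.

```shell
# Sketch only: the key ID is the example from step 5, not a real key.
repo="$(mktemp -d)"                 # scratch repository for trying the settings
git init -q "$repo"
cd "$repo"

git config user.signingkey 3AA5C34371567BD2   # key Git hands to gpg when signing
git config commit.gpgsign true                # sign every commit without typing -S

# Read the settings back to confirm they were stored.
git config --get user.signingkey
git config --get commit.gpgsign
```

Adding `--global` to the two `git config` lines that set values would apply them to every repository for your user rather than to this one repository.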
+ +## Setting up the SSH Key + +1. Open Git Bash (or your CLI of choice) and use the following command to generate your new SSH key (make sure to replace `your_email` with your GitHub-verified email address): + + ```shell + ssh-keygen -t ed25519 -C "your_email" + ``` + +2. Press `Enter` to select the default option if prompted to set a save-file or passphrase for the key (you may choose to enter a passphrase if desired; you will then be asked for the passphrase whenever the key is used, for example when signing a commit). + - The output should include a `randomart` image +3. Use the following command to copy the **public** part of the new SSH key to your clipboard: + + ```shell + clip < ~/.ssh/id_ed25519.pub + ``` + + Note: If you are in a WSL shell, use instead + + ```shell + clip.exe < ~/.ssh/id_ed25519.pub + ``` + +4. After copying or saving your SSH key, navigate to **Settings** in your GitHub account. +5. Navigate to the **SSH and GPG keys** page under the Access section in the sidebar. +6. Under SSH keys, select **New SSH key**. + - Enter a suitable name for your key under "Title" (it'll pick up the email address if left empty) + - Open the dropdown menu under "Key type" and select **Signing Key** + - Paste your SSH public key that you copied/saved in **Step 3** under "Key" +7. Your new SSH key should now be displayed under SSH keys, in the **Signing Key** section. +8. **Optional**: If you want to use the same SSH or GPG key for authentication as well, repeat the steps above, selecting **Authentication** as the "Key type". +9. 
**Optional**: To test if your SSH key is connecting properly or not, run the following command in your CLI (more specific instructions can be found in the [Github documentation](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/testing-your-ssh-connection)): + + ```shell + ssh -T git@github.com + ``` + + - If given a warning saying something like `The authenticity of the host '[host IP]' can't be established` along + with a key fingerprint and a prompt to continue, verify if the provided key fingerprint matches any of those + listed [in GitHub's SSH key fingerprints documentation](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/githubs-ssh-key-fingerprints) + - Once you've verified the match, type `yes` + - If the resulting message says something along the lines of `Hi [User]! You've successfully authenticated, but GitHub does not provide shell access.`, then it means your SSH key is up and ready. + - If you get an error saying something like `Error: Permission denied (publickey)` repeat the procedure in step 6 _with the same key_ only select **Authentication Key**. Then try the test command again. + +
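Git signs with GPG unless told otherwise, so an SSH signing key needs a little extra configuration on the Git side. The sketch below assumes Git 2.34 or newer (the first release with SSH signing) and the default key path from step 1; it runs in a scratch repository so existing settings stay untouched.

```shell
# Sketch only: assumes the key from step 1 lives at the default path.
repo="$(mktemp -d)"                 # scratch repository for trying the settings
git init -q "$repo"
cd "$repo"

git config gpg.format ssh                                  # sign with SSH instead of GPG
git config user.signingkey "$HOME/.ssh/id_ed25519.pub"     # public key that identifies the signing key
git config commit.gpgsign true                             # sign every commit without typing -S

# Read the settings back to confirm they were stored.
git config --get gpg.format
git config --get user.signingkey
```

As with the GPG case, adding `--global` to the lines that set values would make this the default for all of your repositories.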
+ +## Creating Pull Requests Using the GitHub Website + +This is not recommended for individual contributors, because the commits that it produces are not "signed-off" (as defined by Git) and thus do not carry assent to the DCO; see [Repairing commits](#repairing-commits) below for a way to recover if you have inadvertently made such a PR. For corporate contributors the DCO assent is indicated differently. + +Whether you are editing files from llm-d.ai or directly on the llm-d GitHub site, a couple of steps streamline the workflow of your PR: + +1. Changes made to any file are automatically committed to a new branch in your fork. + - After clicking **Commit changes...**, write your commit message summary line and any extended description that you want. Then click **Propose changes**, review your changes, and then create the PR. + - When making the PR, make sure to specify the type of PR at the beginning of the PR's title (e.g. :bug: if it addresses a bug-type issue) + +1. If the PR addresses a specific issue that has already been opened in GitHub, make sure to include the open issue number in **Related Issue(s)** (e.g. `Fixes #NNNN`); this will cause GitHub to automatically close the issue once the PR is merged. If you have finished addressing an open issue without getting it automatically closed, then close it explicitly. + +## Repairing commits + +If you have already created a PR that proposes to merge a branch that +adds commits that are not signed-off then you can repair this (and +lack of signing, if you choose) by adding the sign-off to each commit with +`git commit -s --amend`. If you also want those +commits signed then you would use `git commit -s -S --amend` or +configure automatic signing. The following is an outline of how to do it +for a branch that adds **exactly one** commit. 
If your branch adds +more than one commit then you can extrapolate using `git cherry-pick -s -S` +(or `git rebase -i HEAD~NN` and setting commits to `edit`) to build up a +revised series of commits one-by-one. + +The following instructions provide a basic walk-through if you have already created your own fork of the repository but have not yet made a clone on your workstation. + +1. Navigate to the **Code** page of the llm-d repository on GitHub. + +2. Click the **Fork** dropdown in the top right corner of the page. + - Under "Existing Forks" click your fork (should look something like "your_username/llm-d") +3. Once in your fork, click the **Code** dropdown. + - Under the "Local" tab at the top of the dropdown, select the SSH tab + - Copy the SSH repo URL to your clipboard +4. Open Git Bash (or your CLI of choice), and create or change to a different directory if desired. +5. Clone the repository using `git clone` followed by pasting the URL you just copied. +6. Change your directory to the llm-d repo using `cd llm-d`. +7. Use `git checkout` to switch to the branch in your fork where the changes were committed. + - The branch name should be written at the top of your submitted PR page and looks something like "patch-_X_" (where "X" should be the number of PRs made on your fork to date) +8. Once in your branch, type `git commit -s --amend` to sign off the commit. + - The commit will also be signed if you have either set up automatic signing, or both included the `-S` flag on that command and set up your SSH or GPG key. + - You may extend that command with `-m` followed by a quoted commit message if you desire. Otherwise `git` will pop up an editor for you to use in making any desired adjustment to the commit message. After making any desired changes, save and exit the editor. FYI: in `vi` (which Git Bash uses), when it is in Command mode (which is the normal mode, and contrasts with Insert mode) the keystrokes `:wq!` will attempt to save and then will exit no matter what. +9. 
Type `git push -f origin [branch_name]`, replacing `[branch_name]` with the actual name of your branch. +10. Navigate back to your PR github page. + - A green `dco-signoff: yes` label indicates that your PR is successfully signed diff --git a/README.md b/README.md index 908a742..526af4a 100644 --- a/README.md +++ b/README.md @@ -1,18 +1,25 @@ -# Prism: Performance analysis for distributed inference systems +# Prism -## Project Overview +[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) -Currently, AI Platform Engineers and ML Engineers face significant challenges assembling the full end-to-end inference serving stack for their applications, leading to lengthy, manual evaluation cycles, suboptimal performance, and unnecessarily high friction and costs. While many benchmarks and tools exist, the data is often scattered across disconnected docs, spreadsheets, or vendor-specific sites. +> Performance analysis for distributed inference systems -Prism helps you choose, configure and optimize the right AI inference infrastructure by unifying benchmark data from disparate sources—cloud APIs, public repositories, and local experiments—into a single interactive analysis experience. Prism makes it easier to navigate the complex trade-offs between throughput, latency cost, and quality with data grounded in validated benchmarks. +_Note: The contents of this repo are currently being populated from the original Prism repo. We expect this to be completed by the end of March 2026._ -# Documentation +## Overview -- [Product Requirements Document (PRD)](./specs/PRD.md): Detailed requirements and specifications. +AI Platform Engineers and ML Engineers face significant challenges assembling the full end-to-end inference serving stack for their applications, leading to lengthy, manual evaluation cycles, suboptimal performance, and unnecessarily high friction and costs. 
While many benchmarks and tools exist, the data is often scattered across disconnected docs, spreadsheets, or vendor-specific sites. -> **Note to Agents:** This file serves as the primary source of context and instructions. Please read this file carefully for project context, rules, and deployment procedures. +Prism helps you choose, configure and optimize the right AI inference infrastructure by unifying benchmark data from disparate sources—cloud APIs, public repositories, and local experiments—into a single interactive analysis experience. Prism makes it easier to navigate the complex trade-offs between throughput, latency cost, and quality with data grounded in validated benchmarks. -# Technology Stack +

+ + + Prism Overview + +

+ +## Technology Stack - **Framework:** React 19 (via Vite) - **Styling:** Vanilla CSS (Tailwind CSS v4 for utility-first components) @@ -21,65 +28,13 @@ Prism helps you choose, configure and optimize the right AI inference infrastruc - **Language:** JavaScript (ESNext) - **Cloud:** Google Cloud (GCS, GIQ), AWS (S3) -# Directory Structure - -The application is located at the repository root. - -``` -├── deploy.sh # Deployment script -├── specs/ # Specifications, PRDs -├── public/ # Local benchmark sample data -├── server/ -│ └── server.js # Backend API & Static File Server (Express) -├── tools/ # Utility scripts for data processing -├── src/ -│ ├── components/ -│ │ ├── Dashboard/ # Modular Dashboard Components -│ │ │ ├── FilterPanel.jsx -│ │ │ ├── ThroughputCostChart.jsx -│ │ │ └── UnifiedDataTable.jsx -│ │ ├── Dashboard.jsx # Dashboard Orchestrator -│ │ ├── DataConnectionsPanel.jsx # Data Source Management -│ │ └── DataInspector.jsx -│ ├── hooks/ -│ │ ├── useAWS.js # AWS S3 Hook -│ │ ├── useGCS.js # Cloud Storage Hook -│ │ ├── useGIQ.js # API Data Hook -│ │ └── useLLMD.js # Benchmarking Hook -│ ├── utils/ -│ │ ├── dashboardHelpers.js # Shared formatting & utility functions -│ │ ├── dataParser.js # Central data normalization logic -│ │ └── cacheManager.js # Client-side caching for API results -│ ├── App.jsx # React App Root -│ └── main.jsx # Entry Point -└── package.json -``` +## Development -# Setup & Deployment +See [CONTRIBUTING.md](CONTRIBUTING.md) for development guidelines, coding standards, and how to submit changes. -## Google Cloud +## Setup & Deployment -To deploy the application and authenticate with the Google Cloud APIs, follow these steps: - -1. **Install Google Cloud SDK**: Follow the instructions [here](https://cloud.google.com/sdk/docs/install) to install the `gcloud` CLI. -2. **Authenticate**: - ```bash - gcloud auth login - ``` -3. **Set Project**: - ```bash - gcloud config set project - ``` -4. 
**Set Quota Project** (Critical for ADC to work correctly): - ```bash - gcloud config set billing/quota_project - ``` -5. **Generate Application Default Credentials**: - ```bash - gcloud auth application-default login - ``` - -## Local Development +### Local Development 1. **Install Dependencies**: ```bash @@ -100,17 +55,17 @@ To deploy the application and authenticate with the Google Cloud APIs, follow th DEFAULT_PROJECTS="my-google-project-id" npm start ``` -## Cloud Deployment (Google Cloud Run) +### Cloud Deployment (Google Cloud Run) The application is deployed to Google Cloud Run using the `deploy.sh` script. This script handles API enablement, configuration persistence, and the deployment command itself. -### Usage +#### Usage ```bash ./deploy.sh [options] ``` -### Options +#### Options | Alternative | Option | Description | | :---------- | :---------------------- | :--------------------------------------------------------------------------------------------- | @@ -121,11 +76,11 @@ The application is deployed to Google Cloud Run using the `deploy.sh` script. Th | `-g` | `--ga-id ` | **Google Analytics Tracking ID** (e.g., `G-XXXX`). | | `-c` | `--contact ` | **Contact Us Link**. Supports URLs or email addresses (automatically prefixed with `mailto:`). | -### Configuration Persistence +#### Configuration Persistence The script saves the most recent deployment configuration to a `.deploy_config` file in the root directory. Subsequent runs will use these values as defaults unless overridden by command-line arguments. 
-### Example +#### Example To deploy using a specific configuration file (e.g., public release): @@ -139,7 +94,7 @@ To deploy to a specific project with a custom name and contact email: ./deploy.sh --project my-project-id --name "Prototype" --contact "support@example.com" ``` -### Shared Defaults (Environment Variables) +#### Shared Defaults (Environment Variables) The following environment variables can be set via `--set-env-vars` in the `gcloud run deploy` command (the script handles this via the arguments above): @@ -151,15 +106,16 @@ The following environment variables can be set via `--set-env-vars` in the `gclo - **`GOOGLE_API_KEY`**: API Key for Google Drive/Sheets (auto-detected from `.env.local` if present). ## Multi-Cloud Deployment (AWS, Azure, On-Prem) +Note: Deployment on other clouds is a work in progress and requires testing. This application can be deployed to any container platform (AWS App Runner, Azure Container Apps, ECS, Kubernetes). 1. **Build Docker Image**: ```bash - docker build -t prism-viz . + docker build -t prism . ``` 2. **Authentication**: - The application requires Google Cloud credentials to interface with GCS/GIQ. + The application requires Google Cloud credentials to interface with GCS/GIQ -- which are optional. - **Create a Service Account Key**: Generate a JSON key for a Service Account with `roles/storage.objectViewer` and `roles/serviceusage.serviceUsageConsumer`. - **Mount Key**: Mount this JSON file into the container (e.g., at `/app/credentials.json`). - **Set Env Var**: Set `GOOGLE_APPLICATION_CREDENTIALS=/app/credentials.json`. @@ -170,10 +126,44 @@ This application can be deployed to any container platform (AWS App Runner, Azur -e PORT=8080 \ -e GOOGLE_APPLICATION_CREDENTIALS=/app/credentials.json \ -v $(pwd)/credentials.json:/app/credentials.json \ - prism-viz + prism + ``` + +### Google Cloud APIs + +To deploy the application and authenticate with the Google Cloud APIs, follow these steps: + +1. 
**Install Google Cloud SDK**: Follow the instructions [here](https://cloud.google.com/sdk/docs/install) to install the `gcloud` CLI. +2. **Authenticate**: + ```bash + gcloud auth login + ``` +3. **Set Project**: + ```bash + gcloud config set project + ``` +4. **Set Quota Project** (Critical for ADC to work correctly): + ```bash + gcloud config set billing/quota_project + ``` +5. **Generate Application Default Credentials**: + ```bash + gcloud auth application-default login ``` -# Dependency Management +## Architecture + +The application uses a **Backend-for-Frontend (BFF)** architecture to simplify security and configuration: + +- **Frontend (React)**: Fetches shared configuration from `/api/config` on startup. +- **Backend (Node.js/Express)**: + - Serves the static React application. + - Proxies requests to Google Cloud APIs, automatically injecting **Application Default Credentials (ADC)**. + - Enforces Rate Limiting to prevent abuse. + +## Configuration + +### Dependency Management The project uses a standard `.npmrc` to enforce the public npm registry (`https://registry.npmjs.org/`) to ensure consistency for all developers. @@ -202,24 +192,16 @@ The project uses a standard `.npmrc` to enforce the public npm registry (`https: - **Source Selection**: `Dashboard.jsx` implements strict filtering for data sources. Only sources explicitly permitted by `defaultState` (or user interaction) are enabled. - **Active Connection Sorting**: When a data connection is enabled, it is automatically sorted to the bottom of the active list in the Data Connections panel for better UX navigation. -# Common Tasks +## Contributing -- **Fixing Data:** If `public/data.json` seems broken or missing fields (like `workload` or `time_per_output_token`), run `node tools/patch_data.js`. -- **Adding Filters:** Update `facetCounts`, `filterOptions`, and `filteredBySource` memos in `Dashboard.jsx`. +We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines. 
+ -# Architecture +All commits must be signed off (DCO). See [PR_SIGNOFF.md](PR_SIGNOFF.md) for instructions. -The application uses a **Backend-for-Frontend (BFF)** architecture to simplify security and configuration: +## Security -- **Frontend (React)**: Fetches shared configuration from `/api/config` on startup. -- **Backend (Node.js/Express)**: - - Serves the static React application. - - Proxies requests to Google Cloud APIs, automatically injecting **Application Default Credentials (ADC)**. - - Enforces Rate Limiting to prevent abuse. +To report a security vulnerability, please see [SECURITY.md](SECURITY.md). -# Constraints & Rules +## License -- **Absolute Paths Only:** Always use full absolute paths when referencing files. -- **No Placeholders:** Generate real code or functional mocks. -- **Aesthetics:** Do not create basic HTML/CSS. Use the existing Tailwind dark theme. -- **Package Lock**: Do not commit changes to `package-lock.json` that introduce private registry URLs. +This project is licensed under the Apache License 2.0 - see [LICENSE](LICENSE) for details. diff --git a/SECURITY.md b/SECURITY.md new file mode 100644 index 0000000..4de70ea --- /dev/null +++ b/SECURITY.md @@ -0,0 +1,39 @@ +## Security Announcements + +Join the [llm-d-security-announce](https://groups.google.com/u/1/g/llm-d-security-announce) group for emails about security and major API announcements. + +## Report a Vulnerability + +We're extremely grateful for security researchers and users who report vulnerabilities to the llm-d Open Source Community. All reports are thoroughly investigated by a set of community volunteers. + +You can email the private [llm-d-security-reporting@googlegroups.com](mailto:llm-d-security-reporting@googlegroups.com) list with the security details and the details expected for [all llm-d bug reports](.github/ISSUE_TEMPLATE/bug_report.yml). + +### When Should I Report a Vulnerability? 
+ +- You think you discovered a potential security vulnerability in llm-d +- You are unsure how a vulnerability affects llm-d +- You think you discovered a vulnerability in another project that llm-d depends on + - For projects with their own vulnerability reporting and disclosure process, please report it directly there + +### When Should I NOT Report a Vulnerability? + +- You need help tuning llm-d components for security +- You need help applying security-related updates +- Your issue is not security-related + +## Security Vulnerability Response + +Each report is acknowledged and analyzed by the maintainers of llm-d within 3 working days. + +Any vulnerability information shared with the Security Response Committee stays within the llm-d project and will not be disseminated to other projects unless it is necessary to get the issue fixed. + +As the security issue moves from triage, to identified fix, to release planning, we will keep the reporter updated. + +## Public Disclosure Timing + +A public disclosure date is negotiated by the llm-d Security Response Committee and the bug submitter. +We prefer to fully disclose the bug as soon as possible once a user mitigation is available. +It is reasonable to delay disclosure when the bug or the fix is not yet fully understood, the solution is not well-tested, or for vendor coordination. +The timeframe for disclosure is from immediate (especially if it's already publicly known) to a few weeks. +For a vulnerability with a straightforward mitigation, we expect the time from report date to disclosure date to be on the order of 7 days. +The llm-d maintainers hold the final say when setting a disclosure date. 
diff --git a/_typos.toml b/_typos.toml new file mode 100644 index 0000000..d41e1d2 --- /dev/null +++ b/_typos.toml @@ -0,0 +1,14 @@ +# Typos configuration +# https://github.com/crate-ci/typos + +[default.extend-words] +# Azure Kubernetes Service abbreviation +AKS = "AKS" +aks = "aks" +# Indian Standard Time / Istio abbreviation +IST = "IST" +# Abbreviation (e.g., 2nd -> ND) +ND = "ND" + +# Add repo-specific false positives here: +# word = "word" diff --git a/cmd/main.go b/cmd/main.go new file mode 100644 index 0000000..d35e01e --- /dev/null +++ b/cmd/main.go @@ -0,0 +1,21 @@ +// Copyright 2025 The llm-d Authors. +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package main + +import "fmt" + +func main() { + fmt.Println("TODO: implement your service here") +} diff --git a/deploy/.gitkeep b/deploy/.gitkeep new file mode 100644 index 0000000..8b13789 --- /dev/null +++ b/deploy/.gitkeep @@ -0,0 +1 @@ + diff --git a/docs/images/prism-overview.png b/docs/images/prism-overview.png new file mode 100644 index 0000000..164991d Binary files /dev/null and b/docs/images/prism-overview.png differ diff --git a/docs/upstream-versions.md b/docs/upstream-versions.md new file mode 100644 index 0000000..1604e3a --- /dev/null +++ b/docs/upstream-versions.md @@ -0,0 +1,12 @@ +# Upstream Dependency Version Tracking + +> This file is the source of truth for the [upstream dependency monitor](.github/workflows/upstream-monitor.md) workflow. 
+> Add your project's key upstream dependencies below. The monitor runs daily and creates GitHub issues when breaking changes are detected. + +## Dependencies + + + +| Dependency | Current Pin | Pin Type | File Location | Upstream Repo | +|-----------|-------------|----------|---------------|---------------| + diff --git a/go.mod b/go.mod new file mode 100644 index 0000000..1a9786c --- /dev/null +++ b/go.mod @@ -0,0 +1,3 @@ +module github.com/llm-d/{{PROJECT_NAME}} + +go 1.24.0 diff --git a/hooks/pre-commit b/hooks/pre-commit new file mode 100755 index 0000000..e5ab0ce --- /dev/null +++ b/hooks/pre-commit @@ -0,0 +1,10 @@ +#!/usr/bin/env bash +set -e + +echo "Running lint..." +make lint + +echo "Running tests..." +make test + +echo "All checks passed!" diff --git a/internal/.gitkeep b/internal/.gitkeep new file mode 100644 index 0000000..8b13789 --- /dev/null +++ b/internal/.gitkeep @@ -0,0 +1 @@ + diff --git a/pyproject.toml b/pyproject.toml new file mode 100644 index 0000000..a82a581 --- /dev/null +++ b/pyproject.toml @@ -0,0 +1,34 @@ +# Python tooling configuration +# Only needed if this repo contains Python code + +[tool.ruff] +line-length = 120 +target-version = "py311" + +[tool.ruff.lint] +select = [ + "E", # pycodestyle errors + "W", # pycodestyle warnings + "F", # pyflakes + "I", # isort + "N", # pep8-naming + "UP", # pyupgrade + "B", # flake8-bugbear + "S", # flake8-bandit (security) + "A", # flake8-builtins + "C4", # flake8-comprehensions + "RET", # flake8-return + "SIM", # flake8-simplify +] +ignore = [ + "S101", # assert used (OK in tests) + "S603", # subprocess call with shell=false (OK) + "S607", # start process with partial path (OK) +] + +[tool.ruff.lint.per-file-ignores] +"tests/**/*.py" = ["S101"] + +[tool.ruff.format] +quote-style = "double" +indent-style = "space" diff --git a/users/README.md b/users/README.md new file mode 100644 index 0000000..d2dcb2a --- /dev/null +++ b/users/README.md @@ -0,0 +1,31 @@ +# User Profiles and Skills + +This 
directory contains specialized skills tailored for different user profiles involved in the llm-d project. These skills are designed to assist agents in developing features, tuning configurations, and maintaining production stability. We will build on these user profiles over time. + +![User Roles Diagram](./roles_diagram.png) + +## User Roles + +### 1. [Feature Developer](llm-d-feature-developer/SKILL.md) +**Focus:** Ensures new features hit price/performance goals and can be shared easily. +- Automates baseline A/B test validations. +- Separates unit vs. system benchmarks. +- Produces reproducible and shareable artifacts. + +### 2. [Config Tuner](config-tuner/SKILL.md) (Solutions Architect / Customer Engineer) +**Focus:** Tailored for defining and sweeping optimal end-to-end component stacks. +- Orchestrates complex configuration sweeps (Parallelism, vLLM config, etc.). +- Isolates test environments. +- Generates verifiable reference architecture matrices. + +### 3. [Stack Operator](llm-d-stack-operator/SKILL.md) +**Focus:** Prioritizes production stability and regression tracking. +- Runs standard "well-lit paths". +- Sets up deep stress testing telemetry. +- Compares footprints vs. historical production data. + +### 4. [Benchmark Developer](benchmark-developer/SKILL.md) (Analyst / PR) +**Focus:** Tailored towards community-facing output. +- Generates clear, presentation-ready graphs and visualizations. +- Compares llm-d against competitors. +- Focuses on user-facing metrics without requiring deep codebase knowledge. diff --git a/users/benchmark-developer/SKILL.md b/users/benchmark-developer/SKILL.md new file mode 100644 index 0000000..ad0a5e1 --- /dev/null +++ b/users/benchmark-developer/SKILL.md @@ -0,0 +1,22 @@ +--- +name: benchmark-developer +description: Skill for Analyst / PR roles to compare cross-stack performance and publish reproducible, community-facing benchmarks. 
+--- +# Benchmark Developer (Analyst / PR) Skill + +## User Profile Goal +The primary goal of this profile is to compare the performance of llm-d optimizations against alternative inference stacks (e.g., Dynamo) and share easy-to-understand benchmarks with the broader community. + +## Agent Responsibilities +When designing and developing features for this user profile, you must: + +1. **Prioritize Presentation:** + - Create high-quality, easy-to-understand visualizations and tables that are ready to be embedded into public reports, social media posts, or PR materials. + - Clearly highlight price/performance ratios of various components. + +2. **Lower the Technical Ceiling:** + - Assume this user is a 3rd-party (3P) to the core llm-d community. Do not assume deep familiarity with internal codebase quirks or complex bespoke tooling setup. + - Create one-click reproducible scripts where possible. + +3. **Enable Cross-Stack Comparisons:** + - Guide the generation of benchmarking tools that can smoothly test not just llm-d, but competitor stacks simultaneously for fairness and clear, objective reporting. diff --git a/users/config-tuner/SKILL.md b/users/config-tuner/SKILL.md new file mode 100644 index 0000000..809ec17 --- /dev/null +++ b/users/config-tuner/SKILL.md @@ -0,0 +1,24 @@ +--- +name: config-tuner +description: Skill for helping Solutions Architects and Customer Engineers sweep configurations and find optimal end-to-end stack setups. +--- +# Config Tuner / Solutions Architect Skill + +## User Profile Goal +The primary goal of this profile is to recommend optimal component-level and full end-to-end stack configurations (hardware infrastructure, model serving, orchestration) to customers based on rigorous testing. + +## Agent Responsibilities +When designing and developing features for this user profile, you must: + +1. **Orchestrate Complex Sweeps:** + - Generate and run automated benchmarks across complex sweeps of configurations. 
+ - Key parameters to sweep over: `#P` (Prompt size), `#D` (Decode size), `Parallelism` (TP, EP, DP), `vLLM` parameters, and generalized infrastructure configs (IGW, scheduler, KV cache). + +2. **Isolate Test Environments:** + - Ensure each benchmark is run against a freshly deployed stack so that the configuration changes are completely isolated and not affected by previous state. + +3. **Visualize Relative Performance:** + - Create tables, heatmaps, or charts that allow the user to immediately understand the relative performance impact of different configuration combinations. + +4. **Produce Reference Architectures:** + - Synthesize the optimal benchmark results into clear, experimentally verifiable reference architectures that guarantee specific scalability Service Level Objectives (SLOs). diff --git a/users/llm-d-feature-developer/SKILL.md b/users/llm-d-feature-developer/SKILL.md new file mode 100644 index 0000000..29041be --- /dev/null +++ b/users/llm-d-feature-developer/SKILL.md @@ -0,0 +1,26 @@ +--- +name: llm-d-feature-developer +description: Skill for helping llm-d Feature Developers run, compare, and validate benchmarks for new optimization features. +--- +# llm-d Feature Developer Skill + +## User Profile Goal +The primary goal of the llm-d Feature Developer is to ensure that new features (e.g., optimizations like Medusa) hit their price/performance goals and can be shared in an easily reproducible way. + +## Agent Responsibilities +When designing and developing features for this user profile, you must: + +1. **Clarify Benchmark Scope (Unit vs. System):** + - Ensure a clear separation between unit benchmarks (isolated component tests) and system benchmarks (end-to-end integration tests). + - If the user is evaluating a specific component optimization, default to unit benchmarking before scaling up. + +2. **Automate Baseline Comparisons:** + - Always run the user's new feature against a pre-deployed, stable baseline stack. 
+ - Design commands and scripts that make it trivial to perform A/B testing with constant small tweaks. + +3. **Facilitate Reproducibility:** + - Format results cleanly so they can easily be published in blog posts or PR descriptions. + - Record and output the exact commands, environment variables, and commit hashes used for every run to ensure anyone else can reproduce the benchmark. + +4. **Address Common Pitfalls:** + - Detect and warn the user when baseline validations are missing or when the standard llm-d benchmarking tool is misconfigured for their specific edge case. diff --git a/users/llm-d-stack-operator/SKILL.md b/users/llm-d-stack-operator/SKILL.md new file mode 100644 index 0000000..da44396 --- /dev/null +++ b/users/llm-d-stack-operator/SKILL.md @@ -0,0 +1,23 @@ +--- +name: llm-d-stack-operator +description: Skill for helping Stack Operators monitor production stacks, perform regular stress tests, and detect regressions. +--- +# llm-d Stack Operator Skill + +## User Profile Goal +The primary goal of this profile is to ensure their production stacks (on cloud environments like GCP or IBM Cloud) are configured with the best stable optimizations and actively monitored for performance regressions. + +## Agent Responsibilities +When designing and developing features for this user profile, you must: + +1. **Focus on "Well-Lit Paths":** + - Execute regular benchmarking runs using pre-selected, standardized workloads. + - Run consistent stress tests and deep profiling on established architectures. + +2. **Robust Regression Testing:** + - Automate regression runs specifically catered to the relevant cloud providers the user operates on. + - When a new optimization is deployed, immediately compare its performance against historical baselines to flag unexpected regressions. + +3. **Enhance Observability:** + - Assist in setting up and interpreting production observability tools. 
+ - Synthesize complex deployment telemetry into straightforward operational reports for the operator. diff --git a/users/roles_diagram.png b/users/roles_diagram.png new file mode 100644 index 0000000..1e5c668 Binary files /dev/null and b/users/roles_diagram.png differ
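The Stack Operator skill in this patch asks for new runs to be compared against historical baselines so that unexpected regressions are flagged. One possible shape for that check is sketched below in Python. This is only an illustration under stated assumptions: the `find_regressions` helper, the metric names, the 10% tolerance, and the sample numbers are all hypothetical and not part of this repository, and the sketch assumes lower values are better for every metric.

```python
from statistics import median


# Hypothetical helper, not part of llm-d: flags metrics whose new value is
# worse than the historical median by more than `tolerance` (a fraction).
# Assumes lower is better for every metric (e.g., latencies).
def find_regressions(history, current, tolerance=0.10):
    regressions = {}
    for name, value in current.items():
        past = history.get(name)
        if not past:
            continue  # no baseline recorded for this metric; nothing to compare
        baseline = median(past)
        if baseline > 0 and (value - baseline) / baseline > tolerance:
            regressions[name] = (baseline, value)
    return regressions


# Illustrative numbers only; these are not measured results.
history = {
    "ttft_ms": [100.0, 104.0, 98.0],
    "time_per_output_token_ms": [20.0, 21.0, 19.0],
}
current = {"ttft_ms": 125.0, "time_per_output_token_ms": 20.5}

for metric, (baseline, value) in find_regressions(history, current).items():
    print(f"{metric}: baseline {baseline:.1f} -> current {value:.1f}")
```

Using the median of past runs rather than the most recent run is a deliberate choice in this sketch: it keeps a single noisy historical run from shifting the baseline. A real setup would likely need per-metric direction (throughput is higher-is-better) and per-provider baselines.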