feat: Add CLI metadata extraction infrastructure and enrich-cli-help skill #574

Open
llewellyn-sl wants to merge 2 commits into master from ll-cli-docs-automation-infrastructure

llewellyn-sl commented Jan 22, 2026

Summary

Replace brittle Python regex parsing with deterministic Java reflection using picocli's CommandSpec API. Add Claude Code skill for improving CLI help text. Integrate metadata updates into PR workflow.

What Changed

Added

  • Java metadata extractor (CliMetadataExtractor.java)

    • Uses picocli CommandSpec API for deterministic extraction
    • Automatically resolves all @Mixin annotations
    • Captures 1011 options (118% more than the Python approach)
    • Outputs complete type information
  • Gradle task: extractCliMetadata

    • Run: ./gradlew extractCliMetadata
    • Outputs to docs/cli-metadata.json
  • PR template with checklist

    • Enforces running extractor before merge for CLI changes
    • Makes metadata updates final step in PR process
  • Claude Code skill: enrich-cli-help

    • Guides contributors on improving CLI help text
    • Documents OpenAPI-quality description standards
    • Provides architecture patterns and examples
  • GitHub Actions workflow

    • Triggers docs repo on release
    • Verifies metadata exists

Updated

  • .gitignore - Removed command-spec.json entry
  • docs/README.md - Documented Java reflection approach
  • .claude/README.md - Added contributor guide

Removed

  • Python scripts (replaced by Java extractor)

Benefits

  • Deterministic - Same input always produces same output
  • Complete - Captures all options including platform/provider mixins
  • Maintainable - Type-safe Java code, no regex brittleness
  • Integrated - Part of build system, no Python dependency
  • Enforced - PR template checklist ensures metadata stays current

Testing

./gradlew extractCliMetadata

Output:

  • Total commands: 164
  • Total options: 1011
  • Total parameters: 12

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Replace brittle Python regex parsing with deterministic Java reflection using picocli's CommandSpec API.
Integrate metadata updates into PR workflow via checklist.

## Added
- Java metadata extractor (CliMetadataExtractor.java)
  - Uses picocli CommandSpec API for reflection-based extraction
  - Automatically resolves all @Mixin annotations
  - Captures 1011 options deterministically
  - Outputs complete type information
- Gradle task: extractCliMetadata
  - Runs the Java extractor
  - Outputs to docs/cli-metadata.json
- PR template with metadata update checklist
  - Enforces running extractor before merge
  - Final step before merging CLI changes
- GitHub Actions workflow for release automation
  - Triggers docs repo on release
  - Verifies metadata exists
- Claude Code configuration and enrich-cli-help skill
  - Provides guidance for improving CLI help text
  - Documents metadata extraction workflow

## Benefits
- Deterministic: Same input always produces same output
- Complete: Captures all options including platform/provider mixins
- Maintainable: Type-safe Java code, no regex brittleness
- Integrated: Part of build system, no Python dependency
- Enforced: PR template checklist ensures metadata stays current

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

llewellyn-sl force-pushed the ll-cli-docs-automation-infrastructure branch from 73a473b to 51473a8 on January 23, 2026 13:20
llewellyn-sl changed the title from "feat: Replace Python metadata extractor with Java reflection approach" to "feat: Add CLI metadata extraction infrastructure and enrich-cli-help skill" on Jan 23, 2026

Add comprehensive guidelines for handling default values in CLI option descriptions,
based on learnings from PR 569 review feedback (2026-01-29).

Key additions:
1. Six rules for different default value scenarios:
   - CLI-enforced defaults (use "Default: X")
   - Platform-enforced defaults (use "If absent, Platform defaults to X")
   - Boolean flags (omit default mention)
   - Required fields (omit default mention)
   - Using ${COMPLETION-CANDIDATES} for enums

2. Verification checklist for default value handling

3. Real-world examples in Quality Standards Reference:
   - CLI-enforced: --type with defaultValue="stdout"
   - Platform-enforced: --port, --boot-disk-size
   - Enum with placeholder: --type with ${COMPLETION-CANDIDATES}

Prevents future confusion about "Default: X" vs "If absent, Platform defaults to X"
patterns. Ensures consistency across all future CLI help enrichment work.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Comment on lines +49 to +50
- Repository: `/Users/llewelyn-van-der-berg/Documents/GitHub/tower-cli`
- Branch: `ll-metadata-extractor-and-docs-automation`

Hard-coded local path and branch name.
These will be wrong for every other contributor the moment this merges. We can either remove this section or make it generic.

import java.io.PrintWriter;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;

nitpick: the Arrays import is unused, please remove it.

Comment on lines +284 to +292
    private boolean isBuiltInOption(OptionSpec opt) {
        for (String name : opt.names()) {
            if (name.equals("-h") || name.equals("--help") ||
                name.equals("-V") || name.equals("--version")) {
                return true;
            }
        }
        return false;
    }

This is fragile.
If the CLI ever changes help options or picocli changes defaults, this silently breaks.
Consider checking opt.inherited() or comparing against the root spec's standard help mixin options instead.
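
One possible reading of this suggestion, sketched under assumptions: picocli's `OptionSpec` exposes `usageHelp()` and `versionHelp()` flags, which are set on the standard help and version options, so the check could key off those flags rather than option names. To keep the snippet self-contained, `OptionView` below is a hypothetical stand-in interface that mirrors those accessors; in the extractor the method would take `OptionSpec` directly. Whether this covers every option the extractor wants to skip is not confirmed here.

```java
import java.util.List;

/** Hypothetical stand-in mirroring three accessors of picocli's OptionSpec. */
interface OptionView {
    String[] names();
    boolean usageHelp();
    boolean versionHelp();
}

final class BuiltInOptionSketch {
    /** Flag-based check: no option names are hard-coded. */
    static boolean isBuiltInOption(OptionView opt) {
        return opt.usageHelp() || opt.versionHelp();
    }

    /** Small factory used only for this illustration. */
    static OptionView option(boolean usageHelp, boolean versionHelp, String... names) {
        return new OptionView() {
            public String[] names() { return names; }
            public boolean usageHelp() { return usageHelp; }
            public boolean versionHelp() { return versionHelp; }
        };
    }

    public static void main(String[] args) {
        List<OptionView> options = List.of(
            option(true, false, "-?", "--usage"),    // a renamed help option is still detected
            option(false, true, "-V", "--version"),
            option(false, false, "--workspace"));    // an ordinary option
        for (OptionView opt : options) {
            System.out.println(String.join(",", opt.names()) + " builtIn=" + isBuiltInOption(opt));
        }
    }
}
```

With this shape, renaming `--help` or adding an alias would not require touching the extractor, which is the failure mode the comment above is worried about.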

Comment on lines +316 to +393
     * Simple JSON serialization without external dependencies.
     * Produces well-formatted JSON output.
     */
    private String toJson(Object obj, int indent) {
        StringBuilder sb = new StringBuilder();
        toJson(obj, indent, sb);
        return sb.toString();
    }

    @SuppressWarnings("unchecked")
    private void toJson(Object obj, int indent, StringBuilder sb) {
        String indentStr = " ".repeat(indent);
        String childIndent = " ".repeat(indent + 1);

        if (obj == null) {
            sb.append("null");
        } else if (obj instanceof Map) {
            Map<String, Object> map = (Map<String, Object>) obj;
            sb.append("{\n");
            int i = 0;
            for (Map.Entry<String, Object> entry : map.entrySet()) {
                sb.append(childIndent);
                sb.append("\"").append(escapeJson(entry.getKey())).append("\": ");
                toJson(entry.getValue(), indent + 1, sb);
                if (i < map.size() - 1) {
                    sb.append(",");
                }
                sb.append("\n");
                i++;
            }
            sb.append(indentStr).append("}");
        } else if (obj instanceof List) {
            List<?> list = (List<?>) obj;
            if (list.isEmpty()) {
                sb.append("[]");
            } else if (list.get(0) instanceof String) {
                // Compact array for simple strings
                sb.append("[");
                for (int i = 0; i < list.size(); i++) {
                    sb.append("\"").append(escapeJson((String) list.get(i))).append("\"");
                    if (i < list.size() - 1) {
                        sb.append(", ");
                    }
                }
                sb.append("]");
            } else {
                sb.append("[\n");
                for (int i = 0; i < list.size(); i++) {
                    sb.append(childIndent);
                    toJson(list.get(i), indent + 1, sb);
                    if (i < list.size() - 1) {
                        sb.append(",");
                    }
                    sb.append("\n");
                }
                sb.append(indentStr).append("]");
            }
        } else if (obj instanceof String) {
            sb.append("\"").append(escapeJson((String) obj)).append("\"");
        } else if (obj instanceof Boolean || obj instanceof Number) {
            sb.append(obj.toString());
        } else {
            sb.append("\"").append(escapeJson(obj.toString())).append("\"");
        }
    }

    /**
     * Escape special characters in JSON strings.
     */
    private String escapeJson(String str) {
        if (str == null) return "";
        return str
            .replace("\\", "\\\\")
            .replace("\"", "\\\"")
            .replace("\n", "\\n")
            .replace("\r", "\\r")
            .replace("\t", "\\t");
    }

This hand-rolled JSON serializer is fragile.
It works, but it is a risk surface: edge cases in string escaping or type handling could produce invalid JSON that silently breaks downstream consumers.
Please consider using Jackson (already on the classpath as a dependency) and replacing this code with an ObjectMapper configured with SerializationFeature.INDENT_OUTPUT. That would be safer and shorter, and it would handle all the edge cases.
If you want to keep the compact string-list formatting, Jackson supports custom serializers for that.
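
To make the escaping concern concrete: the `escapeJson` method above handles the backslash, the double quote, and `\n`, `\r`, `\t`, while RFC 8259 requires every control character from U+0000 through U+001F to be escaped. The sketch below is an illustration of that gap only, using nothing beyond the JDK; `JsonEscapeSketch` is a hypothetical name, and it is not meant as a substitute for the Jackson recommendation.

```java
final class JsonEscapeSketch {
    /** Escapes a string for use inside a JSON string literal (RFC 8259, section 7). */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                case '\b': sb.append("\\b"); break;
                case '\f': sb.append("\\f"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become a six-character hex escape.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ESC (U+001B) can appear when help text carries ANSI styling.
        System.out.println(escape("bold\u001B[1m"));
    }
}
```

A description string containing a form feed or an ANSI escape would pass through the five-replacement version unescaped and yield output that strict JSON parsers reject.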

Claude has an interesting suggestion for this file: add it to .gitattributes with linguist-generated=true so GitHub collapses it in diffs.

This is effectively a point-in-time snapshot of the Seqera Platform OpenAPI spec.
Committing it creates a coupling: it will go stale as the API evolves.
Is there a reason this can't be fetched on-demand during the enrichment workflow instead of living in the repo?
If it needs to stay, how will it be kept in sync?

As a general comment: there are no tests.
Since this is a utility that generates documentation consumed by downstream tooling, even a basic test that verifies the output is valid JSON with the expected top-level structure (metadata, hierarchy, commands keys) would catch regressions.
Please consider adding a test under src/test/ that runs the extractor against a real CLI command and validates the output shape.
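
As a rough idea of what such a check could look like, here is a dependency-free smoke check that collects the top-level keys of the generated file and verifies that `metadata`, `hierarchy`, and `commands` are present. `TopLevelKeysSketch` and its `SAMPLE` string are hypothetical, the scanner does not validate JSON syntax, and a real test under src/test/ would more likely parse the file with a JSON library such as Jackson.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Set;

final class TopLevelKeysSketch {
    // Hypothetical sample with the expected top-level shape.
    static final String SAMPLE =
        "{\"metadata\": {}, \"hierarchy\": {}, \"commands\": [{\"name\": \"x\"}]}";

    /** Collects the keys of the outermost JSON object by tracking nesting depth. */
    static Set<String> topLevelKeys(String json) {
        Set<String> keys = new LinkedHashSet<>();
        int depth = 0;
        int i = 0;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (c == '{' || c == '[') {
                depth++;
                i++;
            } else if (c == '}' || c == ']') {
                depth--;
                i++;
            } else if (c == '"') {
                StringBuilder literal = new StringBuilder();
                i++;
                while (i < json.length() && json.charAt(i) != '"') {
                    // Step over a backslash so an escaped quote does not end the literal.
                    if (json.charAt(i) == '\\') i++;
                    if (i < json.length()) literal.append(json.charAt(i));
                    i++;
                }
                i++; // move past the closing quote
                int j = i;
                while (j < json.length() && Character.isWhitespace(json.charAt(j))) j++;
                // A string at depth 1 that is followed by ':' is a top-level key.
                if (depth == 1 && j < json.length() && json.charAt(j) == ':') {
                    keys.add(literal.toString());
                }
            } else {
                i++;
            }
        }
        return keys;
    }

    public static void main(String[] args) throws Exception {
        String json = args.length > 0 ? Files.readString(Path.of(args[0])) : SAMPLE;
        Set<String> keys = topLevelKeys(json);
        for (String required : new String[] {"metadata", "hierarchy", "commands"}) {
            if (!keys.contains(required)) {
                throw new AssertionError("missing top-level key: " + required);
            }
        }
        System.out.println("top-level keys: " + keys);
    }
}
```

Passing the path to docs/cli-metadata.json as the first argument runs the check against the real output; without arguments it runs against the built-in sample.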

Comment on lines +113 to +121
task extractCliMetadata(type: JavaExec) {
    group = 'documentation'
    description = 'Extract CLI metadata using Java reflection (deterministic, includes resolved mixins)'
    classpath = sourceSets.main.runtimeClasspath
    mainClass = 'io.seqera.tower.cli.utils.metadata.CliMetadataExtractor'
    args = [file('docs/cli-metadata.json').absolutePath]
    dependsOn classes
}

Please consider creating the output directory if it doesn't exist.
Adding a

doFirst {  file('docs').mkdirs() }

at the beginning of the task would make it more robust.
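
The same guard could also live on the Java side, so the extractor behaves the same when it is run outside Gradle. The following is a minimal sketch under that assumption; `OutputWriterSketch` is a hypothetical helper, and this PR page does not show how `CliMetadataExtractor` actually writes its output.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class OutputWriterSketch {
    /** Writes content to target, creating any missing parent directories first. */
    static void write(Path target, String content) {
        try {
            Path parent = target.toAbsolutePath().getParent();
            if (parent != null) {
                // Does nothing when the directory already exists.
                Files.createDirectories(parent);
            }
            Files.writeString(target, content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("cli-metadata-sketch");
        Path target = base.resolve("docs").resolve("cli-metadata.json");
        write(target, "{}");
        System.out.println("exists=" + Files.exists(target));
    }
}
```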

@JaimeSeqLabs

The generated file strategy deserves a longer conversation with the team about whether it should live in the repo or be published as a release artifact.
For now, I'm ok with the current setup.
