
rego.metadata.* and rego.parse_module builtins #212

@matajoh

Summary

Implement the rego.metadata.chain, rego.metadata.rule, and rego.parse_module built-in functions. All three are currently registered as placeholders in src/builtins/rego.cc, returning "Rego metadata not supported" / "Rego module parsing not supported" at runtime.

OPA reference

  • rego.metadata.chain: returns the chain of metadata annotation objects from the current rule up through the package hierarchy. Each entry represents an ancestor node that has declared # METADATA annotations.
  • rego.metadata.rule: returns the metadata annotation object for the active rule, or an empty object if no annotations exist.
  • rego.parse_module: rego.parse_module(filename, rego) — parses a Rego module string and returns its AST as a JSON object.

OPA implementation reference

Current state

  • src/builtins/rego.cc: Placeholder declarations for all three functions exist with full type metadata. All return BuiltInDef::placeholder(...).
  • Dispatcher routing is fully wired. Only implementation functions and the required evaluation-context plumbing need to be added.

Work items

rego.metadata.chain and rego.metadata.rule

These two functions are the most architecturally invasive, as they require the evaluator to provide rule-level context to the built-in at call time.

  1. Annotation parsing — Extend the Rego parser to recognize # METADATA comment blocks (YAML) preceding rules and packages. Parse and store annotation objects (title, description, authors, organizations, related_resources, schemas, scope, custom, etc.) per rule/package node.
  2. Annotation storage — Attach parsed annotation metadata to rule and package AST nodes so it survives through compilation.
  3. Evaluation context threading — The evaluator must make the "current rule" and its ancestry available to built-in functions. This likely requires extending the built-in call interface to pass evaluation context (or adding a context-aware variant of BuiltInBehavior).
  4. rego.metadata.rule implementation — Return the annotation object for the currently evaluating rule, or an empty object {} if none exists.
  5. rego.metadata.chain implementation — Walk up from the current rule through parent rules/packages, collecting all nodes that have annotations, and return as an array.
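Steps 3 to 5 can be sketched as follows. This is a minimal illustration, not the rego-cpp design: `Node`, `EvalContext`, `ChainEntry`, `metadata_rule`, and `metadata_chain` are hypothetical names, annotations are reduced to a flat string map, and the path-plus-annotations entry shape is an assumption to check against OPA's output.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed annotation object (title, scope, ...).
using Annotations = std::map<std::string, std::string>;

// Hypothetical AST node: a rule or package with an optional annotation
// block and a link to its enclosing node (nullptr at the root).
struct Node
{
  std::string path;
  std::optional<Annotations> annotations;
  const Node* parent = nullptr;
};

// Hypothetical evaluation context handed to context-aware built-ins.
struct EvalContext
{
  const Node* current_rule = nullptr;
};

// One chain entry: the annotated node's path plus its annotations.
struct ChainEntry
{
  std::string path;
  Annotations annotations;
};

// rego.metadata.rule: annotations of the active rule, or an empty object.
Annotations metadata_rule(const EvalContext& ctx)
{
  if (ctx.current_rule != nullptr && ctx.current_rule->annotations)
  {
    return *ctx.current_rule->annotations;
  }
  return {};
}

// rego.metadata.chain: walk from the active rule up through its parents,
// keeping only the nodes that declare annotations.
std::vector<ChainEntry> metadata_chain(const EvalContext& ctx)
{
  std::vector<ChainEntry> chain;
  for (const Node* n = ctx.current_rule; n != nullptr; n = n->parent)
  {
    if (n->annotations)
    {
      chain.push_back({n->path, *n->annotations});
    }
  }
  return chain;
}
```

With an annotated root, an unannotated package, and an annotated rule, `metadata_chain` returns two entries, starting with the rule itself. The key design point is that the built-in reads the current rule from the context rather than from its arguments.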

rego.parse_module

  1. rego.parse_module implementation — Invoke the existing Rego parser on the input string and convert the resulting AST into a JSON object matching OPA's AST format. The main challenge is ensuring the output JSON structure matches OPA's expected format (package, imports, rules, comments, annotations, etc.).
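As a rough illustration of the AST-to-JSON step, the sketch below serialises only a package path. `Module`, `json_escape`, `term_json`, and `module_to_json` are hypothetical names, and the `{"type": ..., "value": ...}` term shape (first segment a var, the rest strings) is an assumption that should be verified against real OPA output before relying on it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily reduced module AST: only the package path.
struct Module
{
  std::vector<std::string> package_path; // e.g. {"data", "example"}
};

// Escape quotes, backslashes and newlines. A full implementation would
// need to cover every control character.
std::string json_escape(const std::string& s)
{
  std::string out;
  for (char c : s)
  {
    switch (c)
    {
      case '"': out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n"; break;
      default: out += c;
    }
  }
  return out;
}

// Serialise one path segment as a {"type": ..., "value": ...} term.
std::string term_json(const std::string& value, bool is_var)
{
  return std::string("{\"type\":\"") + (is_var ? "var" : "string") +
    "\",\"value\":\"" + json_escape(value) + "\"}";
}

// Serialise the module. Assumption: the first path segment is a var
// and every later segment is a string.
std::string module_to_json(const Module& m)
{
  std::string out = "{\"package\":{\"path\":[";
  for (std::size_t i = 0; i < m.package_path.size(); ++i)
  {
    if (i > 0)
    {
      out += ",";
    }
    out += term_json(m.package_path[i], i == 0);
  }
  out += "]}}";
  return out;
}
```

A real implementation would extend the same pattern to imports, rules, comments, and annotations, most likely on top of the project's existing JSON facilities rather than hand-built strings.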

Finalization

  1. Swap BuiltInDef::placeholder → BuiltInDef::create in all three factory functions.
  2. Update README.md — Remove rego.metadata.chain/rego.metadata.rule/rego.parse_module from the unsupported builtins list (line 157).
  3. Tests — Add test cases covering:
    • Rule with # METADATA block → rego.metadata.rule returns annotation object
    • Rule without annotations → returns {}
    • Nested package/rule chain → rego.metadata.chain returns correct ancestry
    • Multiple annotations at different scopes
    • rego.parse_module with a simple module → correct AST structure
    • rego.parse_module with imports, multiple rules, and annotations
    • Error cases: invalid module string, empty input
    • Conformance: behavior matches OPA on the relevant OPA compliance tests

Notes

  • This is the highest-complexity built-in work item in the remaining backlog. The metadata functions require changes to the parser, AST representation, and evaluator — not just the built-in system. The rego.parse_module function is somewhat independent and could be delivered separately.
  • The annotation format is YAML embedded in comments. OPA uses a specific prefix (# METADATA) and expects the YAML block to be contiguous. Consider whether to add a YAML parsing dependency or implement a minimal subset parser.
  • The OPA AST JSON format for rego.parse_module is complex and version-dependent. Careful alignment with OPA's output will be needed for compliance.
  • Consider splitting this into sub-issues if the scope is too large for a single PR: (a) annotation parsing + metadata builtins, (b) rego.parse_module.
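Whichever YAML option is chosen, the contiguous comment block has to be isolated first. The sketch below shows one possible way to do that. `extract_metadata_yaml` is a hypothetical helper, it assumes the input lines are already trimmed of surrounding whitespace, and it stops at the first non-comment line.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Returns the YAML text of the first "# METADATA" block in `lines`, or
// std::nullopt if there is none. The block is the run of contiguous
// comment lines that directly follows the marker line.
// Assumption: each line has already been trimmed of surrounding whitespace.
std::optional<std::string> extract_metadata_yaml(
  const std::vector<std::string>& lines)
{
  const std::string marker = "# METADATA";
  for (std::size_t i = 0; i < lines.size(); ++i)
  {
    if (lines[i] != marker)
    {
      continue;
    }
    std::string yaml;
    for (std::size_t j = i + 1; j < lines.size(); ++j)
    {
      const std::string& line = lines[j];
      if (line.empty() || line[0] != '#')
      {
        break; // the first non-comment line ends the block
      }
      // Drop the leading '#' and at most one following space, so that
      // any deeper YAML indentation is preserved.
      const std::size_t start = (line.size() > 1 && line[1] == ' ') ? 2 : 1;
      yaml += line.substr(start);
      yaml += '\n';
    }
    return yaml;
  }
  return std::nullopt;
}
```

The returned string can then be handed to a YAML library or to a minimal subset parser, which keeps the dependency decision separate from the comment scanning.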


Labels: built-ins (Adding built-in functions), enhancement (New feature or request), opa-compat (Increasing compatibility with the upstream OPA implementation)
