Merged
30 changes: 13 additions & 17 deletions .github/workflows/build-publish.yml
@@ -15,17 +15,15 @@ jobs:
steps:
- uses: actions/checkout@v6

- name: install uv
uses: astral-sh/setup-uv@v6

# We use Python 3.12 here because it's the minimum Python version supported by this library.
- name: Setup Python 3.12
uses: actions/setup-python@v6
with:
python-version: '3.12'

- name: Install dependencies
run: pip install --upgrade pip build
run: uv python install 3.12

- name: Build package
run: python -m build
run: uv build

- name: Upload build artifacts
uses: actions/upload-artifact@v7
@@ -34,7 +32,7 @@ jobs:
path: dist/

test:
# This job tests the built package by installing it via pip and running unit tests (without tox).
# This job tests the built package by installing it via pip and running unit tests.
name: Test package
needs: build
runs-on: ubuntu-latest
@@ -47,25 +45,23 @@ jobs:
with:
timezoneLinux: "Europe/Berlin"

- name: install uv
uses: astral-sh/setup-uv@v6

- name: Setup Python 3.12
uses: actions/setup-python@v6
with:
python-version: '3.12'
run: uv python install 3.12

- name: Download build artifacts
uses: actions/download-artifact@v8
with:
name: dist_packages
path: dist/

- name: Install test dependencies
run: pip install -r requirements.txt -r requirements-dev.txt

- name: Install built package
run: pip install dist/schema2validataclass-*.whl
- name: Install built package and test dependencies
run: uv sync --group dev && uv pip install --python .venv dist/schema2validataclass-*.whl --force-reinstall --no-deps

- name: Run unit tests
run: python -m pytest
run: uv run pytest

publish:
name: Publish package
35 changes: 11 additions & 24 deletions .github/workflows/lint-test.yml
@@ -15,27 +15,17 @@ jobs:
- name: checkout
uses: actions/checkout@v6

- name: setup Python v3.12
uses: actions/setup-python@v6
with:
python-version: '3.12'
cache: 'pip'
- name: install uv
uses: astral-sh/setup-uv@v6

- name: pip install
run: pip install -r requirements.txt -r requirements-dev.txt
- name: setup Python v3.12
run: uv python install 3.12

- name: lint using ruff
# We could also use the official GitHub Actions integration.
# https://beta.ruff.rs/docs/usage/#github-action
# uses: chartboost/ruff-action@v1
run: ruff check --output-format github ./src ./tests
run: uv run ruff check --output-format github ./src ./tests

- name: format check using ruff
# We could also use the official GitHub Actions integration.
# https://beta.ruff.rs/docs/usage/#github-action
# uses: chartboost/ruff-action@v1
run: |
ruff format --check ./src ./tests
run: uv run ruff format --check ./src ./tests

test:
runs-on: ubuntu-latest
@@ -50,14 +40,11 @@ jobs:
- name: checkout
uses: actions/checkout@v6

- name: setup Python v3.12
uses: actions/setup-python@v6
with:
python-version: '3.12'
cache: 'pip'
- name: install uv
uses: astral-sh/setup-uv@v6

- name: pip install
run: pip install -r requirements.txt -r requirements-dev.txt
- name: setup Python v3.12
run: uv python install 3.12

- name: run pytest
run: python -m pytest tests
run: uv run --group dev pytest tests
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -7,7 +7,7 @@ repos:
- id: check-yaml

- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.15.6
rev: v0.15.9
hooks:
- id: ruff-format
- id: ruff
154 changes: 101 additions & 53 deletions README.md
@@ -1,16 +1,18 @@
# schema2validataclass

A Python code generator that transforms [JSON Schema](https://json-schema.org/) definitions into Python [`@validataclass`](https://github.com/binary-butterfly/validataclass)-decorated dataclasses or plain `@dataclass` classes, along with Enum classes.
A Python code generator that transforms [JSON Schema](https://json-schema.org/) definitions into Python [`@validataclass`](https://github.com/binary-butterfly/validataclass)-decorated dataclasses, plain `@dataclass` classes, or [Pydantic](https://docs.pydantic.dev/) `BaseModel` classes, along with Enum classes.


## Features

- Parses JSON Schema files, including `$ref` references across multiple files and remote schemas (HTTP)
- Generates `@validataclass`-decorated dataclasses with typed validator fields, or plain `@dataclass` classes
- Three output formats: `@validataclass` (with validators), plain `@dataclass`, or Pydantic `BaseModel`
- Generates Python `Enum` classes for JSON Schema `enum` types
- Supports nested objects, arrays, references with property overrides, and schema inheritance
- Handles required vs. optional fields with configurable `UnsetValue` / `None` defaults
- Resolves schema dependency graphs automatically by following `$ref` chains
- Detects and breaks circular `$ref` references to prevent import cycles
- Excludes unwanted types from the output by ignoring specific references or schema paths
- Configurable via YAML configuration file

## Requirements
@@ -19,14 +21,23 @@ A Python code generator that transforms [JSON Schema](https://json-schema.org/)
- [Jinja2](https://jinja.palletsprojects.com/) (template rendering)
- [PyYAML](https://pyyaml.org/) (configuration file parsing)
- [Ruff](https://docs.astral.sh/ruff/) (post-processing of generated files)
- [validataclass](https://github.com/binary-butterfly/validataclass) (used in generated code, not needed for `@dataclass` output)
- [validataclass](https://github.com/binary-butterfly/validataclass) (used in generated code, only needed for `validataclass` output)
- [Pydantic](https://docs.pydantic.dev/) (used in generated code, only needed for `pydantic` output)


## Installation

```bash
uv add schema2validataclass
```

Or with pip:

```bash
pip install schema2validataclass
```


## Usage

```bash
@@ -52,9 +63,24 @@ This reads the schema, recursively resolves all `$ref` references to other schem
- One Python file per `enum` type (e.g. `day_enum.py`)
- One Python file per `object` type (e.g. `closure_information_input.py`)

### Generated output example

Given a JSON Schema object with optional boolean and string fields, the generator produces:
### Generated output examples

Given a JSON Schema object with optional boolean and string fields, the generator produces different output depending on the configured `output_format`.

For enum schemas, the output is the same across all formats:

```python
from enum import Enum

class DayEnum(Enum):
MONDAY = "monday"
TUESDAY = "tuesday"
WEDNESDAY = "wednesday"
```


#### `validataclass` output (default)

```python
from validataclass.validators import StringValidator, BooleanValidator
@@ -69,29 +95,17 @@ class ClosureInformationInput(ValidataclassMixin):
closedFrom: str | UnsetValueType = StringValidator(), Default(UnsetValue)
```

For enum schemas:

```python
from enum import Enum

class DayEnum(Enum):
MONDAY = "monday"
TUESDAY = "tuesday"
WEDNESDAY = "wednesday"
```
This is the default output format. It uses the [`validataclass`](https://github.com/binary-butterfly/validataclass) library for runtime validation. Optional fields use `UnsetValue` by default (configurable via `unset_value_output`). Classes optionally inherit from `ValidataclassMixin` (configurable via `set_validataclass_mixin`).

### `@dataclass` output

Instead of generating `@validataclass`-decorated classes, the generator can produce plain Python `@dataclass` classes. This is useful when you don't need runtime validation and want lightweight data containers with no external dependencies beyond the standard library.
#### `dataclass` output

Set `output_format: dataclass` in your config file (see [Configuration](#configuration) below):

```yaml
output_format: dataclass
```

The same schema that produces the validataclass example above generates:

```python
from dataclasses import dataclass

@@ -102,14 +116,27 @@ class ClosureInformationInput:
closedFrom: str | None = None
```

Key differences from `@validataclass` output:
This produces plain Python `@dataclass` classes with no external dependencies. Optional fields default to `None`; required fields are bare type annotations without a default. The `set_validataclass_mixin` and `unset_value_output` config options have no effect.


#### `pydantic` output

Set `output_format: pydantic` in your config file:

```yaml
output_format: pydantic
```

- Uses Python's built-in `@dataclass(kw_only=True)` decorator
- Fields are plain type annotations without validators
- Optional fields default to `None` instead of `UnsetValue`
- Required fields are bare type annotations (e.g. `name: str`)
- No dependency on the `validataclass` package
- `set_validataclass_mixin` and `unset_value_output` config options have no effect
```python
from pydantic import BaseModel

class ClosureInformationInput(BaseModel):
permananentlyClosed: bool | None = None
temporarilyClosed: bool | None = None
closedFrom: str | None = None
```

This produces [Pydantic V2](https://docs.pydantic.dev/) `BaseModel` classes. Constraints like `minimum`, `maximum`, `minLength`, `maxLength`, and `pattern` are rendered using `Annotated[type, Field(...)]`. Properties that conflict with Python reserved words are renamed with a trailing underscore, and a `@model_validator(mode='before')` is generated to map the original names onto the renamed fields. The `set_validataclass_mixin` and `unset_value_output` config options have no effect.

### Loop detection

@@ -127,6 +154,7 @@ Loop detection is enabled by default. To disable it:
detect_looping_references: false
```


## Configuration

The generator can be configured via a YAML file passed with the `-c` / `--config` flag:
@@ -146,67 +174,87 @@ detect_looping_references: true
post_processing:
- ruff-format
- ruff-check
ignored_uris: []
ignore_references: []
ignore_paths: []
renamed_properties: []
header: |
"""
Custom copyright header.
"""
```


### Options

| Option | Default | Description |
|-----------------------------|-----------------------------|--------------------------------------------------------------------------------------------------------|
| `output_format` | `validataclass` | Output style: `validataclass` (with validators) or `dataclass` (plain Python dataclasses) |
| `unset_value_output` | `UNSET_VALUE` | How optional fields are represented: `UNSET_VALUE` (uses `UnsetValue`) or `NONE` (uses `None`) |
| `object_postfix` | `'Input'` | Suffix appended to generated class names (e.g. `ClosureInformation` becomes `ClosureInformationInput`) |
| `set_validataclass_mixin` | `true` | Whether generated validataclass classes inherit from `ValidataclassMixin` |
| `detect_looping_references` | `true` | Detect and remove circular `$ref` chains to prevent import cycles |
| `post_processing` | `[ruff-format, ruff-check]` | Post-processing steps to run on generated files |
| `ignored_uris` | `[]` | List of field URI paths to skip during generation |
| `header` | Copyright header | Python file header prepended to every generated file |
| Option | Default | Description |
|-----------------------------|-----------------------------|---------------------------------------------------------------------------------------------------------------------|
| `output_format` | `validataclass` | Output style: `validataclass`, `dataclass`, or `pydantic` |
| `unset_value_output` | `UNSET_VALUE` | How optional fields are represented: `UNSET_VALUE` (uses `UnsetValue`) or `NONE` (uses `None`). Validataclass only. |
| `object_postfix` | `'Input'` | Suffix appended to generated class names (e.g. `ClosureInformation` becomes `ClosureInformationInput`) |
| `set_validataclass_mixin` | `true` | Whether generated validataclass classes inherit from `ValidataclassMixin`. Validataclass only. |
| `detect_looping_references` | `true` | Detect and remove circular `$ref` chains to prevent import cycles |
| `post_processing` | `[ruff-format, ruff-check]` | Post-processing steps to run on generated files |
| `ignore_references` | `[]` | List of `$ref` target URIs to ignore (suffix match). Properties referencing these are removed from their parent. |
| `ignore_paths` | `[]` | List of schema paths to ignore (suffix match). The property at the given path is removed during loading. |
| `renamed_properties` | Python keywords | List of property names that get a trailing `_` to avoid conflicts. Defaults to all Python reserved keywords. |
| `header` | Copyright header | Python file header prepended to every generated file |
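
The `renamed_properties` default can be pictured with the standard library's keyword list. This is only a sketch of the renaming rule described in the table; `DEFAULT_RENAMED_PROPERTIES` and `safe_property_name` are hypothetical names, not part of the package's API:

```python
import keyword

# Assumed default: every Python reserved keyword, as the table describes.
DEFAULT_RENAMED_PROPERTIES = frozenset(keyword.kwlist)


def safe_property_name(
    name: str,
    renamed: frozenset[str] = DEFAULT_RENAMED_PROPERTIES,
) -> str:
    """Return `name` with a trailing underscore if it is on the rename list."""
    return f"{name}_" if name in renamed else name
```

For instance, a schema property called `class` or `from` would become `class_` or `from_`, while `closedFrom` is left unchanged.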


### `ignore_references` vs `ignore_paths`

Both options remove properties from the generated output, but they match differently:

- **`ignore_references`** matches the **target** of a `$ref`. For example, `third_schema.json#/definitions/IgnoredObject` removes every property that references that definition, regardless of where the property appears.
- **`ignore_paths`** matches the **location** of a property in the schema. For example, `second_schema.json#/definitions/SecondObject/properties/IgnoredObject` removes only that specific property from `SecondObject`, even if other objects also reference the same definition.

Both use suffix matching, so you can omit leading path components.


## Supported JSON Schema types

| JSON Schema type | Generated validator |
|-------------------------|-----------------------------------------------------------------|
| `boolean` | `BooleanValidator()` |
| `integer` | `IntegerValidator(min_value=..., max_value=...)` |
| `number` | `FloatValidator(min_value=..., max_value=...)` |
| `string` | `StringValidator(min_length=..., max_length=...)` |
| `string` with `pattern` | `RegexValidator(pattern=r'...')` |
| `enum` | `EnumValidator(EnumClassName)` |
| `array` | `ListValidator(inner_validator)` |
| `object` | `DataclassValidator(ClassName)` |
| JSON Schema type | validataclass | dataclass / pydantic |
|-------------------------|-----------------------------------------------------------------|------------------------------------------------------|
| `boolean` | `BooleanValidator()` | `bool` |
| `integer` | `IntegerValidator(min_value=..., ...)` | `int` / `Annotated[int, Field(ge=..., ...)]` |
| `number` | `FloatValidator(min_value=..., ...)` | `float` / `Annotated[float, Field(ge=..., ...)]` |
| `string` | `StringValidator(min_length=..., ...)` | `str` / `Annotated[str, Field(min_length=..., ...)]` |
| `string` with `pattern` | `RegexValidator(pattern=r'...')` | `str` / `Annotated[str, Field(pattern=r'...')]` |
| `enum` | `EnumValidator(EnumClassName)` | `EnumClassName` |
| `array` | `ListValidator(inner_validator)` | `list[inner_type]` |
| `object` | `DataclassValidator(ClassName)` | `ClassName` |
| `$ref`                  | Resolved to the referenced type with property overrides applied |                                                      |
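
To make the plain `dataclass` column of the table concrete, the mapping can be approximated as a small recursive function. This is a simplified sketch, not the generator's code: `annotation_for` is a hypothetical helper, it ignores `$ref` resolution and constraints, and it takes the class name as a parameter instead of deriving it from the schema:

```python
from typing import Any

# JSON Schema primitive types and their Python annotations, per the table.
_PRIMITIVES = {"boolean": "bool", "integer": "int", "number": "float", "string": "str"}


def annotation_for(schema: dict[str, Any], class_name: str = "ClassName") -> str:
    """Return the type annotation (as source text) for a schema fragment."""
    if "enum" in schema:
        return class_name  # enums are annotated with the generated Enum class
    schema_type = schema.get("type")
    if schema_type in _PRIMITIVES:
        return _PRIMITIVES[schema_type]
    if schema_type == "array":
        inner = annotation_for(schema.get("items", {}), class_name)
        return f"list[{inner}]"
    if schema_type == "object":
        return class_name  # nested objects are annotated with their class
    raise ValueError(f"unsupported schema fragment: {schema!r}")
```

A call such as `annotation_for({"type": "array", "items": {"type": "string"}})` yields the text `list[str]`.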


## Development

This project uses [uv](https://docs.astral.sh/uv/) for dependency management.

```bash
# Install with development dependencies
pip install -e ".[testing]"
# Sync project with dev dependencies
uv sync --group dev
```

For running without installing, use the development script which adds `src/` to the Python path:

```bash
python dev/run.py <schema_path> <output_path>
uv run python dev/run.py <schema_path> <output_path>
```

```bash
# Lint
ruff check .
uv run ruff check .

# Format
ruff format .
uv run ruff format .

# Run pre-commit hooks
pre-commit run --all-files

# Run tests
pytest
uv run pytest
```


## License

MIT - see LICENSE.txt for details.