Closed
13 changes: 7 additions & 6 deletions docs/tutorials/command_line_client.md
@@ -552,9 +552,10 @@ Generate JSON Schema(s) from a data model
synapse generate-json-schema [-h] [--data-types data_type1, data_type2] [--output dir_name] [--data-model-labels class_label] data_model_path
```

| Name | Type | Description |
|--------------------------|------------|---------------------------------------------------------------------|
| `data_model_path` | Positional | Data model path or URL |
| `--data-types` | Named | Optional list of data types to create JSON Schema for |
| `--output` | Named | Optional. Either a file path ending in '.json', or a directory path |
| `--data-model-labels` | Named | Either 'class_label', or 'display_label' |
Member

I am technically OK with this backwards-incompatible change, but then should this upcoming package release be 5.0.0? Or should this deprecation be done in a different way?
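
One possible way to soften the break, sketched here as an assumption and not something this PR implements, is to keep accepting `data_model_labels` for a release and translate it into the new flag with a `DeprecationWarning`. The helper name `resolve_property_display_names` is hypothetical; the mapping mirrors the line this PR removes, `use_property_display_names=(data_model_labels == "display_label")`.

```python
import warnings
from typing import Optional


def resolve_property_display_names(
    use_property_display_names: bool = False,
    data_model_labels: Optional[str] = None,
) -> bool:
    """Hypothetical shim: map the removed `data_model_labels` onto the new flag."""
    if data_model_labels is None:
        # The caller used only the new API.
        return use_property_display_names
    if data_model_labels not in ("class_label", "display_label"):
        raise ValueError(f"Unknown data_model_labels value: {data_model_labels!r}")
    warnings.warn(
        "`data_model_labels` is deprecated; use `use_property_display_names`.",
        DeprecationWarning,
        stacklevel=2,
    )
    # Same mapping the old code applied; an explicit True on the new flag also wins.
    return use_property_display_names or data_model_labels == "display_label"
```

With a shim like this, existing callers would keep working and see a warning, and the removal could wait for the next major version.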

| Name | Type | Description |
|-----------------------------------|------------|-----------------------------------------------------------------------------------|
| `data_model_path` | Positional | Data model path or URL |
| `--data-types` | Named | Optional list of data types to create JSON Schema for |
| `--output` | Named | Optional. Either a file path ending in '.json', or a directory path |
| `--use-property-display-names` | Named | Optional. Defaults to False. Formats the property name strings in the JSON Schema |
Member

@thomasyu888, Feb 10, 2026

From a product perspective: we may want to add a note here about how we do the formatting, because Synapse does not accept annotation keys that contain special characters or spaces.

| `--use-valid-value-display-names` | Named | Optional. Defaults to False. Formats the valid value strings in the JSON Schema |
Member

@thomasyu888, Feb 10, 2026

From a product perspective, it is a bit unclear why we have this flag. I'm not sure there are use cases for updating the valid value strings themselves. I think the complexity is going to be in the conditionals.

If the property name is "DataType" after `--use-property-display-names` but the valid value is "Data Type", that could cause a mismatch.

The other note: we definitely want DCCs to continue providing valid values like this:

enum:
- "ab cd"
- "bc dd"
- "cd ff"

Contributor Author

@thomasyu888 This is a great question. Is there any reason to format valid values? It's only there because the original method did this; I just added the ability to turn it off.

10 changes: 10 additions & 0 deletions docs/tutorials/python/schema_operations.md
@@ -89,6 +89,16 @@ Create a JSON Schema

If you don't set the `output` parameter, the JSON Schema file will be created in the current working directory.

## 8. Create a JSON Schema using display names

Create a JSON Schema

```python
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=56-63}
```

You can have Curator format the property names and/or valid values in the JSON Schema. This will remove whitespace and special characters.
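
To make "remove whitespace and special characters" concrete, here is a minimal sketch of that kind of formatting. The function `to_annotation_key` is hypothetical and is not Curator's actual implementation; it only assumes the result should contain letters and digits, with the first letter of each word capitalized.

```python
import re


def to_annotation_key(display_name: str) -> str:
    """Hypothetical formatter: drop whitespace/special characters, capitalize each word."""
    # Split on any run of characters that is not an ASCII letter or digit.
    words = [w for w in re.split(r"[^A-Za-z0-9]+", display_name) if w]
    # Upper-case only the first character so existing inner capitals survive.
    return "".join(w[0].upper() + w[1:] for w in words)


print(to_annotation_key("Data Type"))         # DataType
print(to_annotation_key("file-format (v2)"))  # FileFormatV2
```

Check the generated schema for the exact output of the real formatter, since its rules may differ from this sketch.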

## Source Code for this Tutorial

<details class="quote">
9 changes: 9 additions & 0 deletions docs/tutorials/python/tutorial_scripts/schema_operations.py
@@ -52,3 +52,12 @@
data_types=DATA_TYPE,
synapse_client=syn,
)

# Create JSON Schema using display names for both property names and valid values
schemas, file_paths = generate_jsonschema(
data_model_source=DATA_MODEL_SOURCE,
data_types=DATA_TYPE,
use_property_display_names=True,
use_valid_value_display_names=True,
synapse_client=syn,
)
22 changes: 12 additions & 10 deletions synapseclient/__main__.py
@@ -808,7 +808,8 @@ def generate_json_schema(args, syn):
data_model_source=args.data_model_path,
output=args.output,
data_types=args.data_types,
data_model_labels=args.data_model_labels,
use_property_display_names=args.use_property_display_names,
use_valid_value_display_names=args.use_valid_value_display_names,
synapse_client=syn,
)
logging.info(f"Created JSON Schema files: [{paths}]")
@@ -1833,15 +1834,16 @@ def build_parser():
),
)
parser_generate_json_schema.add_argument(
"--data-model-labels",
type=str,
default="class_label",
choices=["class_label", "display_label"],
help=(
"Optional Label format for properties in the generated schema. "
"'class_label' uses standard attribute names (default). "
"'display_label' uses display names when valid"
),
"--use-property-display-names",
action="store_true",
default=False,
help="Use display names for properties in the generated JSON Schema",
)
parser_generate_json_schema.add_argument(
"--use-valid-value-display-names",
action="store_true",
default=False,
help="Use display names for valid values in the generated JSON Schema",
)
parser_generate_json_schema.set_defaults(func=generate_json_schema)
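
The behavior of the two `store_true` flags can be checked in isolation. The sketch below rebuilds only this part of the parser with the standard `argparse` module; the standalone `parser` object and the sample path `model.csv` are stand-ins for illustration, not the real `synapse` parser.

```python
import argparse

# Stand-in parser that reproduces only the positional argument and the two new flags.
parser = argparse.ArgumentParser(prog="generate-json-schema")
parser.add_argument("data_model_path")
parser.add_argument("--use-property-display-names", action="store_true", default=False)
parser.add_argument("--use-valid-value-display-names", action="store_true", default=False)

# Omitting a flag leaves it False; passing it sets it True.
args = parser.parse_args(["model.csv", "--use-property-display-names"])
print(args.use_property_display_names)     # True
print(args.use_valid_value_display_names)  # False
```

Because neither flag takes a value, a parser built this way rejects the old `--data-model-labels display_label` form as an unrecognized argument.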

46 changes: 31 additions & 15 deletions synapseclient/extensions/curator/schema_generation.py
@@ -5137,7 +5137,7 @@ def update_property(self, property_dict: dict[str, Property]) -> None:
def _set_conditional_dependencies(
json_schema: JSONSchema,
graph_state: GraphTraversalState,
use_property_display_names: bool = True,
use_property_display_names: bool = False,
) -> None:
"""
This sets conditional requirements in the "allOf" keyword.
@@ -5204,7 +5204,7 @@ def _set_conditional_dependencies(


def _create_enum_array_property(
node: TraversalNode, use_valid_value_display_names: bool = True
node: TraversalNode, use_valid_value_display_names: bool = False
) -> Property:
"""
Creates a JSON Schema property array with enum items
@@ -5270,7 +5270,7 @@ def _create_array_property(node: TraversalNode) -> Property:


def _create_enum_property(
node: TraversalNode, use_valid_value_display_names: bool = True
node: TraversalNode, use_valid_value_display_names: bool = False
) -> Property:
"""
Creates a JSON Schema property enum
@@ -5346,8 +5346,8 @@ def _set_type_specific_keywords(schema: dict[str, Any], node: TraversalNode) ->
def _set_property(
json_schema: JSONSchema,
node: TraversalNode,
use_property_display_names: bool = True,
use_valid_value_display_names: bool = True,
use_property_display_names: bool = False,
use_valid_value_display_names: bool = False,
) -> None:
"""
Sets a property in the JSON schema. that is required by the schema
@@ -5393,8 +5393,8 @@ def _process_node(
json_schema: JSONSchema,
graph_state: GraphTraversalState,
logger: Logger,
use_property_display_names: bool = True,
use_valid_value_display_names: bool = True,
use_property_display_names: bool = False,
use_valid_value_display_names: bool = False,
) -> None:
"""
Processes a node in the data model graph.
@@ -5473,8 +5473,8 @@ def create_json_schema( # pylint: disable=too-many-arguments
write_schema: bool = True,
schema_path: Optional[str] = None,
jsonld_path: Optional[str] = None,
use_property_display_names: bool = True,
use_valid_value_display_names: bool = True,
use_property_display_names: bool = False,
use_valid_value_display_names: bool = False,
) -> dict[str, Any]:
"""
Creates a JSONSchema dict for the datatype in the data model.
@@ -5594,7 +5594,8 @@ def generate_jsonschema(
synapse_client: Synapse,
data_types: Optional[list[str]] = None,
output: Optional[str] = None,
data_model_labels: DisplayLabelType = "class_label",
use_property_display_names: bool = False,
use_valid_value_display_names: bool = False,
) -> tuple[list[dict[str, Any]], list[str]]:
"""
Generate JSON Schema files from a data model.
@@ -5612,9 +5613,11 @@ def generate_jsonschema(
- If None, schemas will be written to the current working directory, with filenames formatted as `<DataType>.json`.
- If a directory path, schemas will be written to that directory, with filenames formatted as `<Output>/<DataType>.json`.
- If a file path (must end with `.json`) and a single data type is specified, the schema for that data type will be written to that file.
data_model_labels: Label format for properties in the generated schema:
- `"class_label"` (default): Uses standard attribute names as property keys
- `"display_label"`: Uses display names if valid (no blacklisted characters),.
use_property_display_names: If True, the properties in the JSONSchema
will be written using node display names
use_valid_value_display_names: If True, the valid_values in the JSONSchema
will be written using node display names


Returns:
A tuple containing:
@@ -5670,10 +5673,22 @@ def generate_jsonschema(
data_model_source="https://raw.githubusercontent.com/org/repo/main/model.csv",
output="./schemas",
data_types=None,
data_model_labels="class_label",
synapse_client=syn
)
```

Generate JSON Schema using labels instead of display names:

```python
schemas, file_paths = generate_jsonschema(
data_model_source="https://raw.githubusercontent.com/org/repo/main/model.csv",
output="./schemas",
data_types=None,
synapse_client=syn,
use_property_display_names=False,
use_valid_value_display_names=False,
)
```
"""
check_curator_imports()
data_model_parser = DataModelParser(
@@ -5728,7 +5743,8 @@ def generate_jsonschema(
logger=synapse_client.logger,
write_schema=True,
schema_path=schema_path,
use_property_display_names=(data_model_labels == "display_label"),
use_property_display_names=use_property_display_names,
use_valid_value_display_names=use_valid_value_display_names,
)
for data_type, schema_path in zip(data_types, schema_paths)
]
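
The `output` rules described in the `generate_jsonschema` docstring can be sketched as a small path resolver. The helper `resolve_schema_paths` is hypothetical and is not the function this module uses; raising an error when a `.json` path is combined with several data types is an assumption, since the docstring only describes the single-data-type case.

```python
import os
from typing import Optional


def resolve_schema_paths(data_types: list[str], output: Optional[str] = None) -> list[str]:
    """Hypothetical helper mirroring the documented `output` rules."""
    if output is None:
        # No output given: write <DataType>.json into the current working directory.
        return [os.path.join(os.getcwd(), f"{dt}.json") for dt in data_types]
    if output.endswith(".json"):
        # Assumption: a file path is only unambiguous for a single data type.
        if len(data_types) != 1:
            raise ValueError("A '.json' output path requires exactly one data type")
        return [output]
    # Otherwise treat output as a directory: <Output>/<DataType>.json.
    return [os.path.join(output, f"{dt}.json") for dt in data_types]
```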
@@ -19,20 +19,20 @@
"Enum": {
"description": "TBD",
"enum": [
"ab",
"cd",
"ef",
"gh"
"Ab",
"Cd",
"Ef",
"Gh"
],
"title": "Enum"
},
"EnumNotRequired": {
"description": "TBD",
"enum": [
"ab",
"cd",
"ef",
"gh"
"Ab",
"Cd",
"Ef",
"Gh"
],
"title": "Enum Not Required"
},
@@ -67,10 +67,10 @@
"description": "TBD",
"items": {
"enum": [
"ab",
"cd",
"ef",
"gh"
"Ab",
"Cd",
"Ef",
"Gh"
],
"type": "string"
},
@@ -81,10 +81,10 @@
"description": "TBD",
"items": {
"enum": [
"ab",
"cd",
"ef",
"gh"
"Ab",
"Cd",
"Ef",
"Gh"
],
"type": "string"
},
@@ -37,10 +37,10 @@
"description": "TBD",
"items": {
"enum": [
"ab",
"cd",
"ef",
"gh"
"Ab",
"Cd",
"Ef",
"Gh"
],
"type": "string"
},
@@ -51,10 +51,10 @@
"description": "TBD",
"items": {
"enum": [
"ab",
"cd",
"ef",
"gh"
"Ab",
"Cd",
"Ef",
"Gh"
],
"type": "string"
},
@@ -70,10 +70,10 @@
"description": "TBD",
"items": {
"enum": [
"ab",
"cd",
"ef",
"gh"
"Ab",
"Cd",
"Ef",
"Gh"
],
"type": "string"
},