-
Notifications
You must be signed in to change notification settings - Fork 75
[SYNPY-1726] Add bind-jsonschema CLI command #1317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
- Create schema_management.py with register_jsonschema and bind_jsonschema - Export functions from curator extension __init__.py
- Update register_json_schema to use register_jsonschema wrapper - Update bind_json_schema to use bind_jsonschema wrapper - Add register-json-schema CLI command with argparse configuration
- Add register-json-schema to table of contents - Add complete command documentation with usage and parameters - Document schema registration workflow
- Update tutorial to cover complete JSON Schema workflow - Add section 9: Register JSON Schema to Synapse - Add section 10: Bind JSON Schema to entities - Update tutorial script with register_jsonschema and bind_jsonschema examples - Update imports and line references
- Add test for argument parsing - Add test for --enable-derived-annotations flag - Add test for API call invocation
Update test to mock bind_jsonschema wrapper instead of asyncio.run, and verify correct arguments are passed to the wrapper function.
|
Thanks for doing this @aditigopalan ! Could you add an integration test as well as the unit testing you did? |
| ) | ||
|
|
||
| # Build the schema URI | ||
| schema_uri = f"{organization_name}-{schema_name}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The JSONSchema class has a uri attribute that does this for you.
| if schema_version: | ||
| schema_uri += f"-{schema_version}" | ||
|
|
||
| message = f"Successfully registered schema '{schema_name}' to organization '{organization_name}'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of returning this message, you should log it instead.
|
|
||
| message = f"Successfully registered schema '{schema_name}' to organization '{organization_name}'" | ||
|
|
||
| return schema_uri, message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just return the JSONSchema object to the user.
| synapse_client=syn, | ||
| ) | ||
| else: | ||
| # Fallback to direct API call for old-style entities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I'm not sure this is what we want to be doing here. The entity should really just be a Project, Folder or File correct? Those definitely have the bind_schema method.
- Update register_jsonschema to log message and return only schema_uri - Update CLI function to match new signature - Update docstring and examples
- Update return type from str to JSONSchema - Update docstring and examples to show accessing json_schema.uri - Update tutorial script accordingly
Add comprehensive integration tests for register_jsonschema and bind_jsonschema wrapper functions. Tests cover: - Registering schemas with and without version - Binding schemas to folders - Enabling derived annotations - Complete register + bind workflow
linglp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your hard work @aditigopalan ! I left some comments about documentation and code style changes.
| with open(schema_path, "r") as f: | ||
| schema_body = json.load(f) | ||
|
|
||
| # Create JSONSchema instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I don't think we need comments like this since the code is pretty self-explanatory
| json_schema_uri: str, | ||
| enable_derived_annotations: bool = False, | ||
| synapse_client: Optional["Synapse"] = None, | ||
| ) -> dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: What does entity.bind_schema() actually return?
When I checked out the bind_schema function, it returns a JSONSchemaBinding dataclass. Should this return the same?
|
|
||
| syn = Synapse.get_client(synapse_client=synapse_client) | ||
|
|
||
| # Get the entity (File, Folder, or Project) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Like above, I think comments like these can be removed
| Arguments: | ||
| schema_path: Path to the JSON schema file to register | ||
| organization_name: Name of the organization to register the schema under | ||
| schema_name: The name of the JSON schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Name of the JSON Schema
This is to be consistent with organization_name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just meant that you have:
organization_name: Name of the organization to register the schema under
And similarly, we can do:
schema_name: Name of the JSON schema
synapseclient/__main__.py
Outdated
| syn.logger.info( | ||
| f"Successfully bound schema '{args.json_schema_uri}' to entity '{args.id}'" | ||
| ) | ||
| syn.logger.info(f"Binding details: {result}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrewelamb Should we print the binding details here? Will binding details be useful or distracting? When I tested it out, it seems like a bit distracting to read all the binding details.
(synapsePythonClient) lpeng@w290 synapsePythonClient % synapse bind-json-schema syn73698041 sage.schemas.v2571-el.BiospecimenHumanTemplate.schema-0.0.0 --enable-derived-annotations
Welcome, Lingling Peng! You are using the 'default' profile.
Successfully bound schema 'sage.schemas.v2571-el.BiospecimenHumanTemplate.schema-0.0.0' to entity 'syn73698041'
Binding details: JSONSchemaBinding(json_schema_version_info=JSONSchemaVersionInfo(organization_id='607', organization_name='sage.schemas.v2571', schema_id='7066', id='sage.schemas.v2571-el.BiospecimenHumanTemplate.schema-0.0.0', schema_name='el.BiospecimenHumanTemplate.schema', version_id='43041', semantic_version='0.0.0', json_sha256_hex='17339e10ff2e6d363b3cd96a80426faab4133c2ea6180d0fb6d201ee3d71598d', created_on='2025-07-31T16:33:39.171Z', created_by='3360851'), object_id=73698041, object_type='entity', created_on='2026-02-10T22:06:04.080Z', created_by='3443707', enable_derived_annotations=True)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, let's return the result here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you meant to not return the result, and just Successfully bound schema '{args.json_schema_uri}' to entity '{args.id}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like:
result = bind_jsonschema(
entity_id=args.id,
json_schema_uri=args.json_schema_uri,
enable_derived_annotations=args.enable_derived_annotations,
synapse_client=syn,
)
syn.logger.info(
f"Successfully bound schema '{args.json_schema_uri}' to entity '{args.id}'"
)
return result
|
|
||
| This tutorial uses the Python client as a library. To use the CLI tool, see the [command line documentation](../command_line_client.md). | ||
|
|
||
| ## Prerequisites |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here users also need to have organizations that they can work with?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean we should list the organizations here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is referring to the fact that a Jsonschema organization is required, I don't think we need to list out all of the organizations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think you can have something like: "The user has a schema organization that they can work with" as a prerequisite.
| {!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=1-6} | ||
| ``` | ||
|
|
||
| You'll need to import: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't think in other tutorials, we explain the imports. Login, imports, and set up variables can just be one step called "Initial set up". See example here: https://synapsepythonclient--1317.org.readthedocs.build/en/1317/tutorials/python/wiki/
thomasyu888
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution @aditigopalan !
@andrewelamb or @linglp once all the changes are made, can one of you make sure the all the tests contributed here runs, since contributions via forks don't have CI tests.
Thanks all!
|
@linglp @andrewelamb thank you both for reviewing! Please let me know if I understood the documentation changes correctly. Regarding the organizations section, I’m not entirely sure what would make sense to include in the documentation. Since new organizations can be created, I’m not sure what it means for the JSON schemas to be “consistent with organization name.” For example, for HTAN2Organization, we generate schemas like these: I’m also not sure what other DCCs are doing in this space, so I’ll defer to you both (and @thomasyu888) on what makes sense to describe in the documentation. |
| assert result is not None | ||
|
|
||
| # Verify the schema is actually bound by retrieving it | ||
| retrieved_folder = syn.get(folder.id, downloadFile=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get is deprecated here make sure to use the newer version as before.
|
|
||
| # Verify the schema is actually bound by retrieving it | ||
| retrieved_folder = syn.get(folder.id, downloadFile=False) | ||
| bound_schema = retrieved_folder.get_schema(synapse_client=syn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work because the object you are getting from the deprecated get method doesn't have the method.
| if TYPE_CHECKING: | ||
| from synapseclient import Synapse | ||
| from synapseclient.models.schema_organization import JSONSchema, JSONSchemaBinding | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is something wrong with the JSONSchemaBinding import, use:
from synapseclient.models.mixins.json_schema import JSONSchemaBinding
|
|
||
| # Verify the binding | ||
| assert result is not None | ||
| assert "entityId" in result or "entity_id" in result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result here is a JSONSchemaBinding, this assert statement won't work. What are you trying to test here?
|
|
||
| # Verify the binding | ||
| assert result is not None | ||
| assert "entityId" in result or "entity_id" in result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
saem as below
| assert json_schema is not None | ||
| assert json_schema.uri is not None | ||
| assert json_schema.name == schema_name | ||
| assert json_schema.version is not None # Should have auto-generated version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSONSchemas don't have an innate version, it can contain many versions.
| assert json_schema is not None | ||
| assert json_schema.uri is not None | ||
| assert json_schema.name == schema_name | ||
| assert json_schema.version == version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as bellow
| def delete_org(): | ||
| # Delete all schemas in the organization | ||
| for schema in org.get_json_schemas(synapse_client=syn): | ||
| schema.delete(synapse_client=syn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is failing. One of the tests below is doing this cleanup, but the schema is still bound to an folder. Either you need to unbind the schema, or delete the folder first.
|
@aditigopalan Thank you for adding the integration tests! However, they are not passing. These all need to be resolved: |
- previously tried to test if that the binding result contains entity information (like entityId or entity_id) - result is a JSONSchemaBinding object (a Python class instance), not a dictionary - just checking that the result is not none should be sufficient

Problem:
The CLI didn't have a workflow for registering and binding JSON Schemas.
Specifically:
Highlighted in SYNPY-1762
Solution:
schema_management.pymodule that provides thin wrapper functions (register_jsonschema() and bind_jsonschema()), keeping CLI logic minimal and consistent with previous workregister-json-schemaCLI command that registers schemas using JSONSchema.store().bind-json-schemaCLI command that binds schemas to entities via Entity.bind_schema().Testing:
register-json-schemaandbind-json-schemaCLI commands.