…used by model package workflow
Top-level manifest.json should define the overall inputs/outputs so a user of the package knows what it does. They shouldn't have to trawl through the information to find the first and last things that will be run to infer this info.
{
  "variant_name": "variant_1",
  "file": "mul_1.onnx",
  "constraints": {
    "ep": "example_ep",
    "device": "cpu",
    "architecture": "arch1"
  }
},
This seems like lower level per-variant info I would have expected to be in the component model's metadata.json not the top level manifest.
I think where to store this lower-level, per-variant metadata is still open for discussion. It can either reside in the top-level manifest or within each component model’s metadata.json.
Option 1: Top-level manifest:
Pros:
- Provides a single source of truth for the entire model package.
- Simplifies parsing for ORT, as it only needs to read one manifest file to obtain a complete view.
Cons:
- The manifest may become overly detailed, containing extensive information about each precompiled variant.
- Requires synchronization with the component model directories. If component model directories are added or removed, the manifest must be updated accordingly.
Option 2: Component model's metadata.json:
Pros:
- Better modularity and separation of concerns.
- Changes to component model directories typically do not require updates to the top-level manifest.
Cons:
- ORT must scan and parse all component model directories to collect per-variant metadata.
- May introduce additional runtime overhead during model package loading.
Either way the same amount of info needs to be parsed. Should be trivial to scan the component model directories as the metadata should be in the top-level directory for each component model.
I'm wary about the top-level manifest getting overwhelmingly large, especially if we have multiple component models all with multiple variants. Harder to see issues/inconsistencies and easier to get mega-config files wrong. But maybe humans will never read this stuff and it doesn't matter anymore.
Option 2 slightly simplifies add/remove variant from package as you should only need to update the component model metadata file when doing so. For Foundry Local we will need to add/remove frequently as we will have to publish per-variant packages due to catalog being immutable and to keep the versioning specific to the variant, and merge those on-device post-download. e.g. user downloads TRT-RTX variant and CPU variant separately as they are different entries in the catalog.
I don't quite understand how Option 2 adds runtime overhead. I would have expected we have a general model package helper class. Create it by pointing it at the package directory and it parses everything into in-memory info at that point and checks the package is valid. Via that class instance I should be able to easily get things like the ordered list of component models, the available variants for each component model and things like the EP they require, and a way to get the directory of a variant to load it.
While we are still discussing this internally, I made ModelPackageManifestParser::ParseManifest able to parse manifest.json and metadata.json for all component models, as well as their associated model variants.
If a variant appears in both, it chooses metadata.json as the source of truth, but falls back to manifest.json if metadata.json is missing required fields.
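A minimal sketch of that precedence rule, assuming the fallback is applied field by field (the comment does not say whether the real parser falls back per field or per variant). VariantFields, Prefer, and MergeVariant are hypothetical names used only for illustration; the field names follow the "constraints" example earlier in this thread.

```cpp
#include <optional>
#include <string>

// Hypothetical per-variant record. Every field is optional because either
// file may omit it.
struct VariantFields {
  std::optional<std::string> file;
  std::optional<std::string> ep;
  std::optional<std::string> device;
  std::optional<std::string> architecture;
};

// Prefer the value from metadata.json; fall back to manifest.json when absent.
inline std::optional<std::string> Prefer(const std::optional<std::string>& from_metadata,
                                         const std::optional<std::string>& from_manifest) {
  return from_metadata.has_value() ? from_metadata : from_manifest;
}

// Merge the two views of one variant, with metadata.json as the source of truth.
inline VariantFields MergeVariant(const VariantFields& metadata,
                                  const VariantFields& manifest) {
  VariantFields merged;
  merged.file = Prefer(metadata.file, manifest.file);
  merged.ep = Prefer(metadata.ep, manifest.ep);
  merged.device = Prefer(metadata.device, manifest.device);
  merged.architecture = Prefer(metadata.architecture, manifest.architecture);
  return merged;
}
```

A field that is missing from both files stays empty after the merge, so a later validation step can still report it as a missing required field.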
Would be good to add some details about the 'how' things are done as the PR description says what has changed but doesn't cover things like how selection is being implemented.
Status RegisterExecutionProviders(InferenceSession& sess,
                                  std::vector<std::unique_ptr<IExecutionProvider>>& providers);
Does anything call these?
Good catch. That's no longer necessary, so I will remove it.
};

class ModelPackageManifestParser {
 public:
Would be good to keep the model package handling (parsing manifest and metadata files, iterating directories etc. to create a user-friendly in-memory representation) standalone as it will be needed in other places like Foundry Local
onnxruntime/core/session/utils.cc
Outdated
// Parse manifest and gather components.
ModelPackageManifestParser parser(logging::LoggingManager::DefaultLogger());
std::vector<EpContextVariantInfo> components;
ORT_API_RETURN_IF_STATUS_NOT_OK(parser.ParseManifest(package_root, components));
nit: would suggest having a ModelPackage class that owns all this info instead of doing things piecemeal in other places. Construct the ModelPackage from the package_root. It parses and validates the info and provides getters for the code to read that. That way all the parsing and processing of the model package is in one class so if there are any issues there's one place to fix them.
This also feels like it's missing a layer. The package has one or more component models (if multiple there's a specific order they're executed in). A component model has one or more variants. But this code is reading a collection of EpContextVariantInfo so the required grouping by component model seems to be lost.
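The package, component model, and variant layering described here could be represented in memory roughly as follows. This is only a sketch: all type and member names are hypothetical, and the constructor takes already-parsed data rather than parsing and validating a package_root as the suggestion proposes.

```cpp
#include <filesystem>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory types; names and fields are illustrative only.
struct ModelVariant {
  std::string name;            // e.g. "variant_1"
  std::filesystem::path file;  // e.g. "mul_1.onnx"
  std::string ep;              // EP this variant requires
};

struct ComponentModel {
  std::string name;
  std::vector<ModelVariant> variants;  // one or more variants per component
};

class ModelPackage {
 public:
  // Components are expected to be passed in execution order.
  explicit ModelPackage(std::vector<ComponentModel> components)
      : components_(std::move(components)) {}

  // Ordered list of component models.
  const std::vector<ComponentModel>& Components() const { return components_; }

  // Variants of the named component that require the given EP; empty if none.
  std::vector<const ModelVariant*> VariantsForEp(const std::string& component,
                                                 const std::string& ep) const {
    std::vector<const ModelVariant*> result;
    for (const auto& c : components_) {
      if (c.name != component) continue;
      for (const auto& v : c.variants) {
        if (v.ep == ep) result.push_back(&v);
      }
    }
    return result;
  }

 private:
  std::vector<ComponentModel> components_;
};
```

Keeping variants nested under their component preserves the grouping that a flat collection of EpContextVariantInfo loses.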
A ModelPackage class that owns all the info is a good suggestion. I will change the code here accordingly.
Also, I did miss the "component model" layer and will add it back.
Status SelectComponent(gsl::span<EpContextVariantInfo> components,
                       gsl::span<SelectionEpInfo> ep_infos,
                       std::optional<std::filesystem::path>& selected_component_path) const;
nit: A 'context' owning the selection logic feels slightly off. Maybe that's just a naming thing as this seems more like it's implementing a selection policy for a model variant (which != component model).
I was mixing up the terms "component model" and "model variant".
You are right: this code implements a selection policy for a model variant. I will change the naming.
Description
To support the model package design, one of the goals for ORT is to automatically select the most suitable compiled EPContext binary from a collection of precompiled variants based on the EP, provider options, metadata, and available devices.
This PR adds first-phase model package support to ORT. Follow-up PRs may come later.
A model package is a collection of models, binaries, and metadata files organized in a hierarchically structured directory.
The directory structure is not yet finalized, so the following is just a simple example of a model package directory:
Definitions:
Model Package
Component Model
Model Variant
manifest.json and metadata.json
Read the spec here
A manifest.json may look like:
A metadata.json for a component model may look like:
Model Selection
The selection logic is implemented in MatchesVariant(), which evaluates the following constraints:
(Note: A constraint refers to a value under the "constraints" field in either manifest.json or metadata.json.)
OrtEpFactory::GetSupportedDevices, therefore ORT won't have the supported device information for those EPs. In that case, ORT will skip the device constraint validation for those EPs.
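The skip rule for the device constraint can be illustrated with a small sketch. It is not the PR's MatchesVariant(): the types, the function name MatchesVariantSketch, and the restriction to the "ep" and "device" constraints are all assumptions made for illustration, and other constraints such as "architecture" are left out.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// Hypothetical constraint record for one variant.
struct VariantConstraints {
  std::string ep;                     // required EP name
  std::optional<std::string> device;  // e.g. "cpu"; empty means no device constraint
};

// Hypothetical description of a selected EP.
struct EpInfo {
  std::string ep_name;
  // Empty optional models an EP for which no supported-device information exists.
  std::optional<std::vector<std::string>> supported_devices;
};

inline bool MatchesVariantSketch(const VariantConstraints& constraints, const EpInfo& ep) {
  if (constraints.ep != ep.ep_name) return false;

  // Validate the device constraint only when device information is available;
  // otherwise the check is skipped, as described above.
  if (constraints.device.has_value() && ep.supported_devices.has_value()) {
    const auto& devices = *ep.supported_devices;
    if (std::find(devices.begin(), devices.end(), *constraints.device) == devices.end()) {
      return false;
    }
  }
  return true;
}
```

With this rule, a variant constrained to "cpu" still matches an EP that reports no device list at all, while it is rejected by an EP that reports a device list without "cpu".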
Note
Check the unit test here to better understand how to use a model package.
Code Change
This pull request introduces significant enhancements to the execution provider (EP) selection and management infrastructure in ONNX Runtime. The main focus is on supporting more sophisticated device selection and manifest-based model packaging, as well as refactoring provider selection logic for modularity and future extensibility.
The most important changes are:
Model Package Context and Manifest Support
- model_package_context.h and model_package_context.cc implement manifest parsing, device/EP constraint matching, and component selection logic for model packages. This enables ONNX Runtime to select the most appropriate model variant based on available hardware and EP configuration. [1] [2]
Execution Provider Interface Enhancements
- The IExecutionProvider class supports construction with a list of OrtEpDevice pointers, and a GetEpDevices() method was added to retrieve the supported devices. This allows plugin and bridge EPs to expose multiple devices. [1] [2]
Provider Policy Context Refactoring
- SelectEpsForSession is split into smaller methods: OrderDevices, SelectEpDevices, LogTelemetry, CreateExecutionProviders, RegisterExecutionProviders, and a new flow for model package-based EP selection. [1] [2] [3] [4]
These changes collectively lay the groundwork for more flexible, robust, and extensible device and EP selection in ONNX Runtime, especially in scenarios involving packaged models with multiple variants and complex hardware environments.
Motivation and Context