omnibioai-workflow-bundles is the canonical repository for engine-agnostic, versioned bioinformatics workflow bundles used by the OmniBioAI Workflow Registry Service.
This repository is used for authoring, versioning, and testing workflows, and is not required in deployed OmniBioAI runtime environments.
All workflows in this repository are:
- Authored and version-controlled in Git
- Packaged as immutable workflow bundles
- Uploaded via CLI into OmniBioAI
- Stored as objects in OmniObjectService
- Indexed in the Workflow Registry Service
Important: This repository is not accessed at runtime by OmniBioAI services or plugins. Runtime execution always resolves workflows via the Workflow Registry + Object Storage layer, never directly from Git.
A workflow bundle is a versioned, self-contained artifact that defines everything required to execute a computational pipeline using a specific workflow engine.
Each bundle may include:
- Workflow definition files:
  - WDL (Cromwell)
  - Nextflow
  - Snakemake
  - CWL
- Engine-specific configuration files
- Container definitions (Docker / Conda / Apptainer)
- Reference datasets or helper scripts (optional)
- A strict `manifest.json` describing metadata and entrypoints
Bundles in Git are mutable during development, but every upload to OmniBioAI produces an immutable runtime artifact.
Once registered:
- Bundles are never modified
- Each update creates a new version
- Each version receives a unique `object_id`
This repository supports multiple workflow engines:
- WDL (Cromwell-compatible)
- Nextflow
- Snakemake
- CWL
Each workflow bundle targets exactly one engine.
Equivalent workflows implemented in different engines are stored as separate bundles.
This repository is organized by biological domain, with each subfolder containing multiple workflow bundles.

    omnibioai-workflow-bundles/
    ├── wes/
    │   └── omnibioai_wes_snakemake_v1/
    │       ├── workflow/
    │       │   └── Snakefile
    │       ├── config/
    │       │   └── inputs.json
    │       ├── envs/
    │       ├── docker/
    │       ├── test/
    │       └── manifest.json
    │
    ├── rnaseq/
    ├── chipseq/
    ├── atacseq/
    ├── wgs/
    ├── sv/
    ├── spatial/
    ├── cellranger/
    └── README.md

- Each domain directory (e.g., `wes/`, `rnaseq/`) contains multiple workflow bundles
- Each bundle is self-contained and versioned
- Directory names are human-readable only
- Canonical identity is defined by `manifest.json`, not filesystem paths
Each workflow bundle is uniquely identified by the tuple:

    (category, engine, name, version)

For example:

    category: wes
    engine: snakemake
    name: omnibioai_wes_snakemake_v1
    version: 1.0.0

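The identity tuple above can be sketched as an immutable, hashable value built from manifest fields. This is a minimal illustration only: `BundleKey` and `from_manifest` are hypothetical names, not part of OmniBioAI.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BundleKey:
    """Hypothetical identity tuple for a workflow bundle."""

    category: str
    engine: str
    name: str
    version: str

    @classmethod
    def from_manifest(cls, manifest: dict) -> "BundleKey":
        # Identity is read from manifest fields, never from a directory path.
        return cls(
            category=manifest["category"],
            engine=manifest["engine"],
            name=manifest["name"],
            version=manifest["version"],
        )


manifest = json.loads("""
{
  "name": "omnibioai_wes_snakemake_v1",
  "category": "wes",
  "engine": "snakemake",
  "version": "1.0.0"
}
""")

key = BundleKey.from_manifest(manifest)
print(key)
```

Because the dataclass is frozen, a key can be used as a dictionary key and cannot be changed after creation, which mirrors the rule that identity does not depend on where the bundle sits on disk.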
When a new version is uploaded:
- A new immutable bundle is created
- A new `object_id` is generated
- A new registry entry is inserted
- Previous versions remain fully accessible and unchanged
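The versioning rules above amount to an append-only index. The following sketch shows that behavior with an in-memory stand-in; `VersionRegistry` is a hypothetical class, and the UUID-based `object_id` is an assumption, since the real ID scheme is not described here.

```python
import uuid


class VersionRegistry:
    """Hypothetical in-memory stand-in for an append-only workflow registry."""

    def __init__(self):
        # Maps (category, engine, name, version) -> object_id.
        self._entries = {}

    def register(self, category, engine, name, version):
        key = (category, engine, name, version)
        if key in self._entries:
            # A registered version is never modified or replaced.
            raise ValueError(f"version already registered: {key}")
        object_id = uuid.uuid4().hex  # assumption: any unique ID scheme would do
        self._entries[key] = object_id
        return object_id

    def lookup(self, category, engine, name, version):
        return self._entries[(category, engine, name, version)]

    def versions(self, category, engine, name):
        # Returned in registration order (dicts preserve insertion order).
        return [v for (c, e, n, v) in self._entries if (c, e, n) == (category, engine, name)]


registry = VersionRegistry()
id_v1 = registry.register("wes", "snakemake", "omnibioai_wes_snakemake_v1", "1.0.0")
id_v2 = registry.register("wes", "snakemake", "omnibioai_wes_snakemake_v1", "1.1.0")
print(registry.versions("wes", "snakemake", "omnibioai_wes_snakemake_v1"))
```

Registering `1.1.0` leaves the `1.0.0` entry and its `object_id` untouched, and a second attempt to register `1.0.0` fails instead of overwriting.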
Each bundle MUST include a `manifest.json` file describing its canonical metadata.

    {
      "name": "omnibioai_wes_snakemake_v1",
      "display_name": "Whole Exome Sequencing Pipeline (Snakemake)",
      "category": "wes",
      "engine": "snakemake",
      "version": "1.0.0",
      "entrypoint": "workflow/Snakefile",
      "configs": ["config/inputs.json"],
      "description": "End-to-end WES pipeline including QC, trimming, alignment, and variant calling",
      "container_support": {
        "docker": true,
        "conda": true,
        "apptainer": false
      },
      "tools": [
        "trimmomatic",
        "bwa",
        "samtools",
        "gatk"
      ]
    }

- Manifest is the single source of truth
- Registry never infers metadata from file paths
- Entry points must be explicit
- Tool dependencies should be declared
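A pre-upload check of these manifest rules might look like the sketch below. The required-field set and the engine identifiers are assumptions inferred from the example manifest, and `validate_manifest` is a hypothetical helper, not the validator the OmniBioAI CLI actually runs.

```python
import json
import tempfile
from pathlib import Path

# Assumed minimum set of fields; the real schema may require more.
REQUIRED_FIELDS = ("name", "category", "engine", "version", "entrypoint")
# Assumed spellings of the engine identifiers.
KNOWN_ENGINES = {"wdl", "nextflow", "snakemake", "cwl"}


def validate_manifest(bundle_dir):
    """Return a list of problems found in a bundle; an empty list means none."""
    bundle_dir = Path(bundle_dir)
    manifest_path = bundle_dir / "manifest.json"
    if not manifest_path.is_file():
        return ["manifest.json is missing"]

    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in manifest]

    engine = manifest.get("engine")
    if engine is not None and engine not in KNOWN_ENGINES:
        problems.append(f"unknown engine: {engine}")

    # The entrypoint and every config must exist inside the bundle itself.
    entrypoint = manifest.get("entrypoint")
    if entrypoint and not (bundle_dir / entrypoint).is_file():
        problems.append(f"entrypoint not found: {entrypoint}")
    for cfg in manifest.get("configs", []):
        if not (bundle_dir / cfg).is_file():
            problems.append(f"config not found: {cfg}")
    return problems


# Demo: a bundle whose manifest lists a config file that was never created.
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp)
    (bundle / "workflow").mkdir()
    (bundle / "workflow" / "Snakefile").write_text("rule all:\n    input: []\n")
    (bundle / "manifest.json").write_text(json.dumps({
        "name": "omnibioai_wes_snakemake_v1",
        "category": "wes",
        "engine": "snakemake",
        "version": "1.0.0",
        "entrypoint": "workflow/Snakefile",
        "configs": ["config/inputs.json"],
    }))
    print(validate_manifest(bundle))  # ['config not found: config/inputs.json']
```

Returning a list of problems rather than raising on the first one lets an author fix everything in a single pass.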
Modern workflow execution supports multiple isolation strategies:
- Conda environments (recommended for Snakemake)
- Docker containers (optional per-rule or engine-level)
- Apptainer/Singularity (HPC environments)
Workflows must NOT assume globally installed bioinformatics tools.
Each rule should declare its runtime environment explicitly when possible.
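For Snakemake bundles, the "declare the runtime environment per rule" guideline can be spot-checked with a small text scan. The sketch below is a rough heuristic, not a Snakemake parser: it looks only for the `conda:` and `container:` directives, and `rules_without_environment` is a hypothetical helper. The rule bodies in the example are illustrative.

```python
import re

# Matches top-level rule headers such as "rule align:".
RULE_HEADER = re.compile(r"^rule\s+(\w+)\s*:", re.MULTILINE)
# Matches an indented "conda:" or "container:" directive inside a rule body.
ENV_DIRECTIVE = re.compile(r"^\s+(conda|container)\s*:", re.MULTILINE)


def rules_without_environment(snakefile_text):
    """Return names of rules that declare neither `conda:` nor `container:`."""
    headers = list(RULE_HEADER.finditer(snakefile_text))
    missing = []
    for i, header in enumerate(headers):
        # A rule body runs from its header to the next header (or end of file).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(snakefile_text)
        body = snakefile_text[header.end():end]
        if not ENV_DIRECTIVE.search(body):
            missing.append(header.group(1))
    return missing


EXAMPLE_SNAKEFILE = '''
rule trim:
    input: "raw/{sample}.fastq.gz"
    output: "trimmed/{sample}.fastq.gz"
    conda: "../envs/trimmomatic.yaml"
    shell: "trimmomatic SE {input} {output} MINLEN:36"

rule align:
    input: "trimmed/{sample}.fastq.gz"
    output: "aligned/{sample}.sam"
    shell: "bwa mem ref.fa {input} > {output}"
'''

print(rules_without_environment(EXAMPLE_SNAKEFILE))  # ['align']
```

Aggregation-only rules (for example a `rule all` with just `input:`) would also be flagged, so the result is a list of candidates to review rather than a list of errors.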
The Workflow Registry Service is the authoritative metadata index for all OmniBioAI workflows.
| Component | Responsibility |
|---|---|
| Workflow Bundles Repo | Authoring & version control |
| CLI Upload Tool | Validation & packaging |
| Workflow Registry | Metadata indexing & discovery |
| OmniObjectService | Immutable bundle storage |
| Execution Engine | Workflow materialization & runtime |
Registry = metadata; Object Store = immutable artifacts.
The registry does not store files or paths, only `object_id` references.
Bundles are uploaded via the OmniBioAI CLI.
Given a bundle directory such as:

    input/
      omnibioai_wes_snakemake_v1/
        manifest.json
        workflow/
        config/
        envs/

run:

    python manage.py workflow_upload \
      --bundle input/omnibioai_wes_snakemake_v1 \
      --created-by manish \
      --enable

The upload performs these steps:

- Validate bundle structure
- Parse `manifest.json`
- Validate entrypoint & configs
- Package bundle into archive
- Upload to OmniObjectService
- Receive `object_id`
- Create immutable registry entry
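The packaging step in the list above can be illustrated as follows. This is a sketch under stated assumptions: the archive format (`.tar.gz`) and the use of a SHA-256 digest as a stand-in `object_id` are guesses made for illustration, and `package_bundle` and `hypothetical_object_id` are not OmniBioAI functions.

```python
import hashlib
import io
import tarfile
import tempfile
from pathlib import Path


def package_bundle(bundle_dir):
    """Pack every file in a bundle directory into an in-memory .tar.gz archive."""
    bundle_dir = Path(bundle_dir)
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path in sorted(bundle_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the bundle root, not the host filesystem.
                archive.add(path, arcname=path.relative_to(bundle_dir).as_posix())
    return buffer.getvalue()


def hypothetical_object_id(archive_bytes):
    # Assumption: a content hash stands in for the ID the object store would return.
    return hashlib.sha256(archive_bytes).hexdigest()


# Demo with a throwaway bundle directory.
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "omnibioai_wes_snakemake_v1"
    (bundle / "workflow").mkdir(parents=True)
    (bundle / "workflow" / "Snakefile").write_text("rule all:\n    input: []\n")
    (bundle / "manifest.json").write_text('{"name": "omnibioai_wes_snakemake_v1"}')

    archive_bytes = package_bundle(bundle)
    object_id = hypothetical_object_id(archive_bytes)

    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        print(sorted(archive.getnames()))
    print(len(object_id))
```

Archive member names are relative to the bundle root, so the artifact carries no trace of the author's local directory layout.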
OmniBioAI plugins do not access this repository directly.
Instead:
- Plugin queries Workflow Catalog Service
- User selects workflow + version
- Execution request is submitted
- Runtime system:
  - Resolves registry entry
  - Fetches bundle via `object_id`
  - Materializes workflow in execution environment

As a result:

- No Git dependency at runtime
- No filesystem-based discovery
- Fully reproducible execution
- Complete audit trail per `object_id`
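The resolve, fetch, and materialize sequence can be sketched with plain dictionaries standing in for the registry and the object store. Everything here is illustrative: `materialize`, `make_archive`, the `obj-001` ID, and the `.tar.gz` archive format are assumptions, not the platform's actual interfaces.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def make_archive(files):
    """Build an in-memory .tar.gz from a {name: bytes} mapping (demo helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, content in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def materialize(registry, object_store, key, workdir):
    """Resolve a registry entry, fetch the archive by object_id, and unpack it."""
    object_id = registry[key]                # 1. resolve registry entry
    archive_bytes = object_store[object_id]  # 2. fetch bundle via object_id
    workdir = Path(workdir).resolve()
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():  # 3. materialize regular files only
            if not member.isfile():
                continue
            target = (workdir / member.name).resolve()
            if workdir not in target.parents:
                # Refuse archive members that would escape the work directory.
                raise ValueError(f"unsafe path in archive: {member.name}")
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.extractfile(member).read())
    return workdir


key = ("wes", "snakemake", "omnibioai_wes_snakemake_v1", "1.0.0")
object_store = {
    "obj-001": make_archive({
        "manifest.json": b"{}",
        "workflow/Snakefile": b"rule all:\n    input: []\n",
    })
}
registry = {key: "obj-001"}

with tempfile.TemporaryDirectory() as tmp:
    root = materialize(registry, object_store, key, tmp)
    print(sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()))
```

Nothing in this path touches Git or scans a directory tree for workflows: the only inputs are the identity tuple and the `object_id` it resolves to.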

The design follows these principles:

- ID-first architecture
- Immutable workflow artifacts
- Engine-agnostic bundle specification
- Metadata-driven discovery
- CLI-first ingestion workflow
- Strict separation of:
  - Authoring
  - Registry
  - Storage
  - Execution

This repository is intended for:
- Workflow engineers
- Bioinformatics developers
- OmniBioAI platform maintainers
End users interact only through the OmniBioAI UI and APIs, not directly with this repository.
This repository is not:

- ❌ A runtime execution environment
- ❌ A plugin system
- ❌ A database or registry
- ❌ A production workflow scheduler
- ❌ A mutable shared execution workspace
omnibioai-workflow-bundles is a version-controlled authoring repository for engine-specific bioinformatics workflows that are packaged into immutable artifacts and registered in the OmniBioAI Workflow Registry for reproducible, metadata-driven execution across multiple workflow engines.