20 changes: 20 additions & 0 deletions README.md
@@ -30,6 +30,8 @@

Dyana is a sandbox environment using Docker and [Tracee](https://github.com/aquasecurity/tracee) for loading, running, and profiling a wide range of files, including machine learning models, ELF executables, Pickle-serialized files, JavaScript [and more](https://docs.dreadnode.io/open-source/dyana/topics/loaders). It provides detailed insights into GPU memory usage, filesystem interactions, network requests, and security-related events.

It also includes a lightweight host-side planning command, `dyana fit`, that recommends models likely to fit the current machine based on available RAM, GPU memory, and detected local runtimes.

## Installation

Install with:
@@ -70,6 +72,24 @@ uv run pytest dyana

See our docs on dyana usage [here](https://docs.dreadnode.io/open-source/dyana/basic-usage)

Quick example:

```bash
dyana fit --use-case coding --top-k 5
```

Constrain the recommendation surface:

```bash
dyana fit --use-case coding --runtime ollama --max-memory-gb 12 --explain-excluded
```

Get Dyana-native recommendations for the `automodel` loader:

```bash
dyana fit --use-case coding --runtime automodel
```

## License

Dyana is released under the [MIT license](LICENSE). Tracee is released under the [Apache 2.0 license](third_party_licenses/APACHE2.md).
26 changes: 26 additions & 0 deletions docs/basic-usage.md
@@ -12,6 +12,30 @@ Show help for a specific loader:
dyana help automodel
```

Plan model choices against the current machine before tracing:

```bash
dyana fit --use-case coding --top-k 5
```

Emit machine-readable fit recommendations:

```bash
dyana fit --use-case general --json
```
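
If you script against this output, a small helper can pick out the recommendations that fit a memory budget. This is a sketch: the `recommendations`, `model`, and `estimated_memory_gb` field names follow the project's test fixtures, and the payload below is hand-written rather than real `dyana fit` output.

```python
import json

def models_under_budget(fit_json: str, budget_gb: float) -> list[str]:
    """Names of recommended models whose memory estimate fits the budget."""
    result = json.loads(fit_json)
    return [
        rec["model"]
        for rec in result.get("recommendations", [])
        if rec.get("estimated_memory_gb", float("inf")) <= budget_gb
    ]

# Hand-written sample payload mirroring the documented fields:
sample = json.dumps({
    "recommendations": [
        {"model": "Qwen2.5-Coder 7B Instruct", "estimated_memory_gb": 8.5},
        {"model": "Some 70B Model", "estimated_memory_gb": 40.0},
    ]
})
print(models_under_budget(sample, 12.0))  # → ['Qwen2.5-Coder 7B Instruct']
```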

Restrict the planner to a specific runtime and memory budget:

```bash
dyana fit --use-case coding --runtime ollama --max-memory-gb 12 --explain-excluded
```

Plan specifically for Dyana's built-in automodel loader:

```bash
dyana fit --use-case coding --runtime automodel
```

Create a trace file for a loader run:

```bash
@@ -36,6 +60,8 @@ Show a summary of a trace file:
dyana summary --trace-path trace.json
```

`dyana fit` is host-side only. It does not start Docker, pull models, or execute artifacts. It is intended as a quick planning step before a real traced run.

## Default Safeguards

Network access is disabled by default for loader containers. Allow it explicitly when needed:
90 changes: 90 additions & 0 deletions docs/fit.md
@@ -0,0 +1,90 @@
# Fit Planning

`dyana fit` recommends a small set of models that are likely to fit the current machine.

Unlike `dyana trace`, this command is host-side only:

- it does not start Docker
- it does not run loaders
- it does not download models
- it does not execute artifacts

It is meant to answer a narrower question first: what is even worth trying on this hardware?

## What It Uses

The current prototype looks at:

- total system RAM
- detected NVIDIA GPU memory, if present
- Apple Silicon unified memory heuristics on `Darwin arm64`
- detected runtimes such as Dyana `automodel`, `ollama`, `llama.cpp`, and `mlx`
- a packaged local model and provider catalog

## Examples

Recommend coding-oriented models:

```bash
dyana fit --use-case coding --top-k 5
```

Get JSON output for automation:

```bash
dyana fit --use-case general --top-k 3 --json
```

Limit results to a specific runtime and budget:

```bash
dyana fit --use-case coding --runtime ollama --max-memory-gb 12
```

Prefer a Dyana-native execution path:

```bash
dyana fit --use-case coding --runtime automodel
```

Explain why some candidates were excluded:

```bash
dyana fit --use-case coding --explain-excluded
```

## Output

The text view shows:

- detected hardware summary
- detected runtimes
- ranked recommendations
- estimated memory use
- runtime and quantization choice
- a short rationale for each recommendation
- provider-specific artifact and invocation hints
- optional exclusion reasons for rejected candidates

The JSON view includes the same information in a machine-readable structure.
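
As a rough sketch, the payload is shaped like the following (abridged; field names taken from the project's test fixtures, and the exact schema may evolve):

```json
{
  "hardware": {"platform": "Linux", "total_ram_gb": 64.0, "total_vram_gb": 24.0},
  "use_case": "coding",
  "recommendations": [
    {
      "model": "Qwen2.5-Coder 7B Instruct",
      "runtime": "ollama",
      "quantization": "Q8_0",
      "estimated_memory_gb": 8.5,
      "score": 92,
      "invocation_hint": "ollama run qwen2.5-coder:7b"
    }
  ],
  "excluded": []
}
```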

## Preferences

The planner supports a small set of opinionated controls:

- `--runtime` to limit results to `automodel`, `ollama`, `mlx`, or `llama_cpp`
- `--max-memory-gb` to cap the effective memory budget
- `--preference balanced|quality|speed` to nudge quantization ranking
- `--explain-excluded` to include a short rejection reason for excluded candidates
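
The exact scoring is internal, but the `--preference` nudge can be pictured as a weighted trade-off between quality and speed per quantization. The weights and candidate scores below are illustrative stand-ins, not Dyana's real numbers:

```python
# Illustrative preference weights; Dyana's actual scoring may differ.
WEIGHTS = {
    "balanced": {"quality": 0.5, "speed": 0.5},
    "quality": {"quality": 0.8, "speed": 0.2},
    "speed": {"quality": 0.2, "speed": 0.8},
}

# Higher-bit quantizations keep more quality; lower-bit ones run faster.
CANDIDATES = [
    {"quant": "Q8_0", "quality": 0.95, "speed": 0.60},
    {"quant": "Q4_K_M", "quality": 0.85, "speed": 0.85},
]

def rank(preference: str) -> list[str]:
    """Order quantization candidates by the preference-weighted score."""
    w = WEIGHTS[preference]
    scored = sorted(
        CANDIDATES,
        key=lambda c: w["quality"] * c["quality"] + w["speed"] * c["speed"],
        reverse=True,
    )
    return [c["quant"] for c in scored]

print(rank("quality"))  # → ['Q8_0', 'Q4_K_M']
print(rank("speed"))    # → ['Q4_K_M', 'Q8_0']
```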

## Current Scope

This is intentionally lightweight. The prototype:

- uses simple fit heuristics instead of benchmark-backed throughput estimates
- ranks a packaged local catalog rather than a large external model index
- focuses on fit and practical starting points, not exhaustive provider support

The command is a planning tool. For real artifact execution and profiling, continue to use `dyana trace`.

When the selected provider is `automodel`, the recommendation includes a Dyana invocation hint using `dyana trace --loader automodel`.
7 changes: 7 additions & 0 deletions docs/index.md
@@ -2,12 +2,19 @@

Dyana is a sandbox environment using Docker and [Tracee](https://github.com/aquasecurity/tracee) for loading, running, and profiling a wide range of files, including machine learning models, ELF executables, pickle files, JavaScript, and more.

In addition to trace-time inspection, Dyana includes a small host-side planning surface for choosing models that are likely to fit your hardware before you run anything.

It provides visibility into:

- GPU memory usage
- Filesystem interactions
- Network requests
- Security-relevant runtime events
- Model fit recommendations for the current host

## Fit Planning

Use [`dyana fit`](fit.md) to rank a compact set of model recommendations against the current machine's RAM, GPU or unified memory budget, and detected local runtimes such as Ollama or MLX.

## Loaders

41 changes: 36 additions & 5 deletions dyana/cli.py
@@ -2,24 +2,26 @@
import pathlib
import platform as platform_pkg

import typer
from rich import box
from rich import print as rich_print
from rich.table import Table

try:
import cysimdjson

_HAS_CYSIMDJSON = True
except ImportError:
_HAS_CYSIMDJSON = False

import typer
from rich import box
from rich import print as rich_print
from rich.table import Table

import dyana.loaders as loaders_pkg
from dyana.fit import detect_hardware, fit_result_json, recommend_models
from dyana.loaders.loader import Loader
from dyana.tracer.tracee import Tracer
from dyana.view import (
view_disk_events,
view_disk_usage,
view_fit,
view_gpus,
view_header,
view_imports,
@@ -45,6 +47,35 @@
)


@cli.command(help="Recommend models that fit the current machine.")
def fit(
use_case: str = typer.Option(help="Target workload, e.g. coding, chat, reasoning, general.", default="general"),
top_k: int = typer.Option(help="Number of recommendations to return.", default=5),
runtime: str | None = typer.Option(help="Limit results to a specific runtime, e.g. ollama, mlx, llama_cpp.", default=None),
max_memory_gb: float | None = typer.Option(help="Override the available memory budget in GiB.", default=None),
preference: str = typer.Option(help="Ranking preference: balanced, quality, or speed.", default="balanced"),
explain_excluded: bool = typer.Option(False, help="Include a short explanation for excluded candidates."),
json_output: bool = typer.Option(False, "--json", help="Emit recommendations as JSON."),
) -> None:
if preference not in {"balanced", "quality", "speed"}:
raise typer.BadParameter("preference must be one of: balanced, quality, speed")

result = recommend_models(
detect_hardware(),
use_case=use_case,
top_k=top_k,
runtime=runtime,
max_memory_gb=max_memory_gb,
preference=preference,
explain_excluded=explain_excluded,
)

if json_output:
rich_print(fit_result_json(result))
else:
view_fit(result.model_dump())


@cli.command(
help="Show the available loaders.",
)
75 changes: 75 additions & 0 deletions dyana/cli_test.py
@@ -49,6 +49,15 @@ def test_loaders_help(self) -> None:
assert result.exit_code == 0
assert "--build" in _strip_ansi(result.output)

def test_fit_help(self) -> None:
result = runner.invoke(cli, ["fit", "--help"])
assert result.exit_code == 0
output = _strip_ansi(result.output)
assert "--use-case" in output
assert "--top-k" in output
assert "--runtime" in output
assert "--max-memory-gb" in output


class TestSummaryCommand:
def test_summary_with_modern_trace(self, tmp_path: t.Any) -> None:
@@ -137,6 +146,72 @@ def test_summary_missing_file(self) -> None:
assert result.exit_code != 0


class TestFitCommand:
def test_fit_text_output(self) -> None:
fake_result: dict[str, t.Any] = {
"hardware": {
"platform": "Linux",
"arch": "x86_64",
"total_ram_gb": 64.0,
"gpu_name": "RTX 4090",
"gpu_count": 1,
"total_vram_gb": 24.0,
"unified_memory": False,
"runtimes": {"automodel": True, "ollama": True, "llama_cpp": False, "mlx": False},
},
"use_case": "coding",
"runtime_filter": None,
"max_memory_gb": None,
"recommendations": [
{
"model_id": "qwen25-coder-7b",
"model": "Qwen2.5-Coder 7B Instruct",
"runtime": "ollama",
"provider": "Ollama",
"quantization": "Q8_0",
"mode": "gpu",
"estimated_memory_gb": 8.5,
"score": 92,
"rationale": "Fits comfortably.",
"artifact_hint": "Use an Ollama model tag.",
"invocation_hint": "ollama run qwen2.5-coder:7b",
}
],
"excluded": [],
}

with (
patch("dyana.cli.detect_hardware"),
patch("dyana.cli.recommend_models") as mock_recommend,
):
mock_recommend.return_value.model_dump.return_value = fake_result
result = runner.invoke(cli, ["fit", "--use-case", "coding"])

assert result.exit_code == 0
output = _strip_ansi(result.output)
assert "Hardware" in output
assert "Qwen2.5-Coder 7B Instruct" in output
assert "ollama run qwen2.5-coder:7b" in output
assert "automodel" in output

def test_fit_json_output(self) -> None:
payload = json.dumps({"use_case": "general", "recommendations": [], "excluded": []})
with (
patch("dyana.cli.detect_hardware"),
patch("dyana.cli.recommend_models"),
patch("dyana.cli.fit_result_json", return_value=payload),
):
result = runner.invoke(cli, ["fit", "--json"])

assert result.exit_code == 0
assert json.loads(result.output)["use_case"] == "general"

def test_fit_rejects_invalid_preference(self) -> None:
result = runner.invoke(cli, ["fit", "--preference", "unknown"])
assert result.exit_code == 2
assert "preference must be one of" in _strip_ansi(result.output)


def _noop_loader_init(self: t.Any, **kwargs: t.Any) -> None:
self.name = kwargs.get("name", "automodel")
self.settings = None
1 change: 1 addition & 0 deletions dyana/data/__init__.py
@@ -0,0 +1 @@
# Package data container for Dyana.