|
| 1 | +planning: |
| 2 | + high-priority: |
| 3 | + - id: PLAN-ROADMAP-001 |
| 4 | + title: Execute post-alignment BitNet implementation roadmap |
| 5 | + note: 'Recommended order: complete tasks 1 through 6 first to clear the Phase 3 bottleneck, then implement export and interoperability, then finish chain-bucket production work and CI/test-lane separation.' |
| 6 | + done: true |
| 7 | + completed: 2026-03-23 |
| 8 | + description: |
| 9 | + - Carry the repository from architecture scaffolding into a production-capable training, evaluation, serialization, and speculative decoding workflow. |
| 10 | + - 'The immediate bottleneck is Phase 3: building a real training core and a token-sequence data pipeline.' |
| 11 | + - This item captures the concrete next work after the completed repo-alignment/docs phase and the recent teacher-forced continuation training fix. |
| 12 | + done-summary: Completed all roadmap tasks. The final slice added a shared paper-model snapshot, repo-authored GGUF save/load, .gguf model loading, and a minimal export command, with targeted GGUF regressions plus a full dual-target test pass. |
| 13 | + remaining: All roadmap tasks are complete. The latest completion added shared snapshot-backed JSON checkpoint and GGUF state capture, repo-authored .gguf import/export, app-level .gguf loading, and verified round-trip coverage. |
| 14 | + technical-details: |
| 15 | + - Phase 1 repo-alignment and documentation work is complete. |
| 16 | + - BitLinear and the transformer skeleton are mostly present already. |
| 17 | + - The training core now reaches beyond output-head-only updates by applying AdamW updates to the paper model's final RMSNorm scale alongside the output head. |
| 18 | + - Repo-authored GGUF export/import now complements the repo-local JSON checkpoint through a shared paper-model snapshot layer and strict tensor/metadata validation. |
| 19 | + - Default validation now stays fast while expensive training and benchmark checks run in a dedicated SlowLane category and benchmark-report CI lane. |
| 20 | + technical-requirements: |
| 21 | + - Preserve the teacher-forced rolling-context training fix already present in BitNetPaperModel. |
| 22 | + - Avoid direct edits to docs/todo.yaml; keep the roadmap synced through the MCP todo API. |
| 23 | + - Keep the first loader/token pipeline compatible with the existing BitNetTokenizer and vocabulary semantics. |
| 24 | + - Prefer new Training/* files and narrow adapters over broad rewrites of dirty worktree files. |
| 25 | + implementation-tasks: |
| 26 | + - task: Create src/BitNetSharp.Core/Training/ with BitNetTrainingOptions, TrainingBatch, CrossEntropyLoss, AdamWOptimizer, and a trainer that owns the loop instead of BitNetPaperModel. |
| 27 | + done: true |
| 28 | + - task: Implement a real BitNetDataLoader for packed fixed-length token batches and held-out splits, and cover it with loader tests under tests/BitNetSharp.Tests/. |
| 29 | + done: true |
| 30 | + - task: Extend paper-model training beyond output-head-only updates to STE/AdamW-style updates across deeper transformer parameters. |
| 31 | + done: true |
| 32 | + - task: Add a small-corpus path first in scripts/ plus repo-local fixtures and tests, before attempting larger SlimPajama or RedPajama-scale ingestion. |
| 33 | + done: true |
| 34 | + - task: Extend TrainingReport to include validation metrics, checkpoint cadence, and evaluation summaries rather than loss history alone. |
| 35 | + done: true |
| 36 | + - task: Wire periodic evaluation through BitNetBenchmarkFixtures for WikiText2, C4, and RedPajama fixture slices during training. |
| 37 | + done: true |
| 38 | + - task: Add a dedicated training CLI surface in src/BitNetSharp.App/Program.cs so training configuration and execution are first-class. |
| 39 | + done: true |
| 40 | + - task: Implement model export and import beyond the repo-local JSON checkpoint, with GGUF or an intermediate bridge format as the next target. |
| 41 | + done: true |
| 42 | + - task: Finish chain-bucket productionization with chain-buckets.bin persistence, threshold-based acceptance metrics, and benchmark reporting for acceptance rate and tokens per second. |
| 43 | + done: true |
| 44 | + - task: Split expensive training and benchmark validations into a separate category or CI lane so the default suite stays fast. |
| 45 | + done: true |
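
The data-loader task above (packed fixed-length token batches plus held-out splits) can be illustrated with a small sketch. This is not the repository's `BitNetDataLoader`, which is not shown in this diff. The function names `pack_sequences`, `split_held_out`, and `iter_batches` are hypothetical, the tail-of-stream held-out split is an assumption, and Python is used only to keep the example short.

```python
from typing import Iterator, List, Sequence, Tuple

Example = Tuple[List[int], List[int]]  # (input token ids, next-token targets)


def pack_sequences(token_ids: Sequence[int], seq_len: int) -> List[Example]:
    """Cut one long token stream into fixed-length (input, target) pairs.

    Targets are the inputs shifted left by one token, so each window
    consumes seq_len + 1 tokens. A trailing partial window is dropped.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    examples: List[Example] = []
    # start < len - seq_len guarantees a full window of seq_len + 1 tokens.
    for start in range(0, len(token_ids) - seq_len, seq_len):
        window = list(token_ids[start:start + seq_len + 1])
        examples.append((window[:-1], window[1:]))
    return examples


def split_held_out(
    examples: List[Example], held_out_fraction: float = 0.1
) -> Tuple[List[Example], List[Example]]:
    """Deterministically reserve the last fraction of examples for validation."""
    if not 0.0 <= held_out_fraction < 1.0:
        raise ValueError("held_out_fraction must be in [0, 1)")
    held_out_count = int(len(examples) * held_out_fraction)
    cut = len(examples) - held_out_count
    return examples[:cut], examples[cut:]


def iter_batches(examples: List[Example], batch_size: int) -> Iterator[List[Example]]:
    """Yield consecutive batches; the last batch may be smaller."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


if __name__ == "__main__":
    stream = list(range(41))  # stand-in for tokenizer output
    packed = pack_sequences(stream, seq_len=4)
    train, held_out = split_held_out(packed, held_out_fraction=0.2)
    for batch in iter_batches(train, batch_size=3):
        print(len(batch), batch[0])
```

Splitting after packing keeps every held-out window disjoint from the training inputs, apart from the single boundary token that the last training window uses as a target. A real loader would likely also handle document boundaries and shuffling, which this sketch leaves out.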