Hi TinyEngine folks,
I wanted to share a small but unusual language-runtime project that may still be relevant to the broader system-algorithm co-design question your work represents. It targets language-task capability rather than the usual tiny vision or sensor workloads.
We built a public demo, called Engram, and deployed it on a commodity ESP32-C3.
Current public numbers:
Important scope note:
This is not presented as unrestricted, open-input, native LLM generation on an MCU.
The board-side path is closer to a flash-resident, table-driven runtime with:
- packed token weights
- hashed lookup structures
- fixed compiled probe batches
- streaming fold / checksum style execution over precompiled structures
So this is not a standard tiny dense-model path. It is closer to a task-specialized language runtime whose behavior has been crystallized into a compact executable form under severe physical constraints.
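To make the shape of that board-side path concrete, here is a minimal, illustrative C sketch of the ingredients listed above: a hashed lookup table over packed token weights, queried by a fixed probe batch, with a streaming checksum-style fold as the accumulation step. All names (`entry_t`, `fold_batch`, the FNV-1a hash, the table size) are my own assumptions for illustration, not Engram's actual code or data layout.

```c
#include <stdint.h>
#include <stddef.h>

#define TABLE_SIZE 64  /* power of two so masking replaces modulo */

/* Hypothetical packed-table entry: hash key plus a quantized weight.
 * On a real MCU this table would live in flash, not RAM. */
typedef struct {
    uint32_t hash;    /* 0 marks an empty slot */
    int16_t  weight;  /* packed (quantized) token weight */
} entry_t;

/* FNV-1a string hash; 0 is reserved as the "empty slot" sentinel. */
static uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h ? h : 1u;
}

/* Open-addressed insert with linear probing. */
static void insert(entry_t *tab, const char *tok, int16_t w) {
    uint32_t h = fnv1a(tok);
    size_t i = h & (TABLE_SIZE - 1);
    while (tab[i].hash != 0 && tab[i].hash != h)
        i = (i + 1) & (TABLE_SIZE - 1);
    tab[i].hash = h;
    tab[i].weight = w;
}

/* Returns 1 and writes *out if the token is present, else 0. */
static int lookup(const entry_t *tab, const char *tok, int16_t *out) {
    uint32_t h = fnv1a(tok);
    size_t i = h & (TABLE_SIZE - 1);
    while (tab[i].hash != 0) {
        if (tab[i].hash == h) { *out = tab[i].weight; return 1; }
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return 0;
}

/* Streaming fold over a fixed, precompiled probe batch:
 * a checksum-style accumulation rather than dense matrix math. */
static uint32_t fold_batch(const entry_t *tab, const char **batch, size_t n) {
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        int16_t w;
        if (lookup(tab, batch[i], &w))
            acc = acc * 31u + (uint32_t)(uint16_t)w;
    }
    return acc;
}
```

The point of the sketch is the execution model: per query, the runtime does a handful of hash probes and one linear fold over flash-resident structures, so peak SRAM and cycle cost are fixed at compile time rather than scaling with a model's width.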
Repo:
https://github.com/Alpha-Guardian/Engram
I'm posting here because TinyEngine and MCUNet are among the clearest public examples of system-algorithm co-design under extreme memory constraints.
What I'd be curious to hear is whether systems like this should be thought of as:
- completely outside the normal tiny-DNN family
- an extreme endpoint where some language-task capability may require its own tiny co-designed runtime path
- or an early sign that future tiny language systems may split into both very small dense models and highly specialized executable runtime forms
Would be very interested in your thoughts.