AI engineer building practical AI systems with deep learning, local LLMs, and on-device inference.
I focus on offline-first AI products, efficient inference, retrieval systems, and real-world AI applications.
Privacy-first offline document intelligence system with persistent local RAG, hybrid retrieval, and source-grounded answers.
Run local LLMs such as Gemma, Qwen, and LLaMA on Android with LiteRT and ONNX Runtime for offline, private, real-time chat and question answering.
Multi-task neural network architecture for ADAS-focused perception tasks.
Tools and experiments for training large language models from scratch, fine-tuning them, and understanding how they work.


