Carlos Fundora carlosfundora

🎯 Focusing

8 followers · 23 following

New Orleans, LA
07:33 (UTC -12:00)
@CarlosFundora

Achievements

Pinned

llama.cpp-1-bit-turbo Public

Forked from ggml-org/llama.cpp

HIP/ROCm fork optimized for AMD RDNA2 (gfx1030) with PrismML Q1_0_G128 1-bit quant support, RotorQuant, TurboQuant, EAGLE3 and P-EAGLE speculative decoding, and full Wave32 kernel optimizations.

C++ · 11 stars

sglang-1-bit-turbo Public

Forked from sgl-project/sglang

AMD ROCm (gfx1030) inference fork with RotorQuant/TurboQuant KV compression, PHANTOM-X zero-copy draft speculation, EAGLE3 speculative decoding, 12 RDNA2 crash fixes, and PrismML Bonsai Q1_0_G128 1…

Python · 5 stars

vllm-1-bit-turbo Public

Forked from vllm-project/vllm

HIP/ROCm fork optimized for AMD RDNA2 (gfx1030) with EAGLE3 speculative decoding, TurboQuant KV compression, PrismML Bonsai Q1_0_G128 1-bit GGUF support, and gfx1031 compatibility enablement.

Python · 1 star

gfxGRAPH Public

CUDA Graph → HIP Graph translation layer for AMD gfx1030 (RDNA2). Bridges all 4 CUDA Graph parity gaps on ROCm.

Python · 1 star · 1 fork

litellm-turbo Public

Forked from BerriAI/litellm

Python SDK and Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing, and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…

Python

ATLAS Public

Forked from itigges22/ATLAS

Adaptive Test-time Learning and Autonomous Specialization

Python · 1 star