forked from ggml-org/llama.cpp
Pull requests: PrismML-Eng/llama.cpp
#10 · (Performance) Optimized x86 and generic q1_0(_g128) dot · labels: ggml · opened Apr 3, 2026 by pl752
#7 · fix: Q1_0_g128 x86 CPU kernel — float truncation + AVX2 vectorization · labels: ggml · opened Apr 2, 2026 by wildcattrio · 4 tasks done
#6 · fix: Q1_0_g128 x86 CPU kernel - correct output + AVX2/AVX-512 VNNI · labels: ggml · opened Apr 2, 2026 by stfurkan
#5 · Fixes for CPU backend + instructions for targetting AMD GPUs · labels: ggml · opened Apr 2, 2026 by philtomson
#4 · fix: Q1_0_g128 CPU dot product int truncation · labels: ggml · opened Apr 2, 2026 by Marxist-Leninist
#3 · fix: Q1_0_g128 CPU kernel - correct output and AVX-512 SIMD · labels: ggml · opened Apr 1, 2026 by jordankzf
#2 · feat: port TQ3_0 KV cache from llama-turboquant · labels: examples, ggml, Nvidia GPU · opened Apr 1, 2026 by carlosfundora