From 98792c48e0be545aa51880d722777b83d2d56a59 Mon Sep 17 00:00:00 2001
From: JasonOA888
Date: Fri, 3 Apr 2026 09:05:57 +0800
Subject: [PATCH] fix(i2_s): remove unsafe to_float cast + guard BLAS from I2_S generic MUL_MAT
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The I2_S quantization format stores its scale outside per-row data.
The type traits table cast dequantize_row_i2_s (4 params) to
ggml_to_float_t (3 params), discarding the scale (UB on all platforms).

Also, the BLAS backend claimed generic MUL_MAT for I2_S when ubatch >= 32,
causing segfaults on Apple Silicon Metal+BLAS.

1. Set to_float = NULL for I2_S; it must use specialized gemv/gemm
2. Add I2_S exclusion in BLAS MUL_MAT and OUT_PROD support checks

Fixes #468 (Bug 1: UB), Fixes #512 (Apple Silicon segfault)
---
 3rdparty/llama.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/3rdparty/llama.cpp b/3rdparty/llama.cpp
index 1f86f058d..6f8a942da 160000
--- a/3rdparty/llama.cpp
+++ b/3rdparty/llama.cpp
@@ -1 +1 @@
-Subproject commit 1f86f058de0c3f4098dedae2ae8653c335c868a1
+Subproject commit 6f8a942da0316ccd7e722c057c1fc0586c276b10