Internal error: failed to load model with internal loader: grpc service not ready #9050

@iloving

Description

LocalAI version:
v4.0.0

Environment, CPU architecture, OS, and Version:
Darwin MacBook-Pro.lan 25.3.0 Darwin Kernel Version 25.3.0: Wed Jan 28 20:54:38 PST 2026; root:xnu-12377.91.3~2/RELEASE_ARM64_T6050 arm64

Running on an M5 MacBook Pro.

Describe the bug
I am unable to run any LLM model at all. I have tried several different ones, and they all fail in the same way.

A probably unrelated issue: the executable is compiled against protobuf 33.5, which it cannot find when protobuf v34 is installed. The result is a series of error messages about a missing library. I was able to fix that by explicitly reinstalling protobuf@33; unfortunately, it did not solve this problem.
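For reference, this is roughly how I checked the dylib mismatch and applied the workaround (the backend binary path is an assumption based on the default install location shown in the logs below; adjust as needed):

```shell
# Show which protobuf dylib the backend binary was linked against (macOS).
otool -L ~/.localai/backends/metal-llama-cpp/llama-cpp-fallback | grep -i protobuf

# Workaround: pin Homebrew protobuf back to the v33 series.
brew reinstall protobuf@33
```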

To Reproduce
On an Apple Silicon Mac, install the metal-llama-cpp backend and any appropriate model, then send a chat request to the model.
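The failure can also be triggered from the command line; this is a minimal request against the default address from the logs (model name taken from my setup, adjust for yours):

```shell
# Minimal chat request; the stream ends with
# "failed to load model with internal loader: grpc service not ready".
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "smolvlm-500m-instruct", "messages": [{"role": "user", "content": "test"}]}'
```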

Expected behavior
I expect the model to work.

Logs

➜  ~ cat localai.log
[21:06:47] STDOUT: Mar 17 21:06:47 DEBUG GPU vendor gpuVendor="" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/system/state.go"  caller.L=77 }
[21:06:47] STDOUT: Mar 17 21:06:47 DEBUG Total available VRAM vram=0 caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/system/state.go"  caller.L=79 }
[21:06:47] STDOUT: Mar 17 21:06:47 INFO  Using metal capability (arm64 on mac) env="LOCALAI_FORCE_META_BACKEND_CAPABILITY" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/system/capabilities.go"  caller.L=106 }
[21:06:47] STDOUT: Mar 17 21:06:47 INFO  Starting LocalAI threads=18 modelsPath="/Users/ilsa/.localai/models" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/startup.go"  caller.L=31 }
[21:06:47] STDOUT: Mar 17 21:06:47 INFO  LocalAI version version="v4.0.0 (8e8b7df715e620b64d07cddf6b73bff8f966dac5)" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/startup.go"  caller.L=32 }
[21:06:47] STDOUT: Mar 17 21:06:47 DEBUG agent_tasks.json not found, starting with empty tasks caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/services/agent_jobs.go"  caller.L=135 }
[21:06:47] STDOUT: Mar 17 21:06:47 DEBUG agent_jobs.json not found, starting with empty jobs caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/services/agent_jobs.go"  caller.L=199 }
[21:06:47] STDOUT: Mar 17 21:06:47 INFO  AgentJobService started retention_days=30 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/services/agent_jobs.go"  caller.L=1353 }
[21:06:47] STDOUT: Mar 17 21:06:47 DEBUG CPU capabilities capabilities=[AdvSIMD AdvSIMD_HPFPCvt FEAT_AES FEAT_AFP FEAT_BF16 FEAT_BTI FEAT_CRC32 FEAT_CSSC FEAT_CSV2 FEAT_CSV3 FEAT_DIT FEAT_DPB FEAT_DPB2 FEAT_DotProd FEAT_EBF16 FEAT_ECV FEAT_FCMA FEAT_FHM FEAT_FP16 FEAT_FPAC FEAT_FPACCOMBINE FEAT_FRINTTS FEAT_FlagM FEAT_FlagM2 FEAT_HBC FEAT_I8MM FEAT_JSCVT FEAT_LRCPC FEAT_LRCPC2 FEAT_LSE FEAT_LSE2 FEAT_MTE FEAT_MTE2 FEAT_MTE4 FEAT_MTE_CANONICAL_TAGS FEAT_MTE_NO_ADDRESS_TAGS FEAT_MTE_STORE_ONLY FEAT_PACIMP FEAT_PAuth FEAT_PAuth2 FEAT_PMULL FEAT_RDM FEAT_RPRES FEAT_SB FEAT_SHA1 FEAT_SHA256 FEAT_SHA3 FEAT_SHA512 FEAT_SME FEAT_SME2 FEAT_SME2p1 FEAT_SME_B16B16 FEAT_SME_F16F16 FEAT_SME_F64F64 FEAT_SME_I16I64 FEAT_WFxT FP_SyncExceptions SME_B16F32 SME_BI32I32 SME_F16F32 SME_F32F32 SME_I16I32 SME_I8I32 armv8_crc32 floatingpoint] caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/startup.go"  caller.L=40 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG [gguf] guessDefaultsFromFile: NGPULayers set NGPULayers=0x717c6e62f8d0 modelName="Qwen3.5 0.8B" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/config/gguf.go"  caller.L=48 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG [gguf] guessDefaultsFromFile: template already set name="qwen_qwen3.5-0.8b" modelName="Qwen3.5 0.8B" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/config/gguf.go"  caller.L=62 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG [gguf] guessDefaultsFromFile: NGPULayers set NGPULayers=0x717c6f0e9358 modelName="SmolVLM 500M Instruct" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/config/gguf.go"  caller.L=48 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG [gguf] guessDefaultsFromFile: template already set name="smolvlm-500m-instruct" modelName="SmolVLM 500M Instruct" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/config/gguf.go"  caller.L=62 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG No system backends found caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/gallery/backends.go"  caller.L=403 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG Registering backend name="mlx" runFile="/Users/ilsa/.localai/backends/mlx/run.sh" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/gallery/backends.go"  caller.L=513 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG Registering backend name="moonshine" runFile="/Users/ilsa/.localai/backends/metal-moonshine/run.sh" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/gallery/backends.go"  caller.L=513 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG Registering backend name="llama-cpp" runFile="/Users/ilsa/.localai/backends/metal-llama-cpp/run.sh" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/gallery/backends.go"  caller.L=513 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG Registering backend name="metal-llama-cpp" runFile="/Users/ilsa/.localai/backends/metal-llama-cpp/run.sh" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/gallery/backends.go"  caller.L=513 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG Registering backend name="metal-moonshine" runFile="/Users/ilsa/.localai/backends/metal-moonshine/run.sh" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/gallery/backends.go"  caller.L=513 }
[21:06:48] STDOUT: Mar 17 21:06:48 INFO  Preloading models path="/Users/ilsa/.localai/models" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/config/model_config_loader.go"  caller.L=271 }
[21:06:48] STDOUT:
[21:06:48] STDOUT:   Model name: moonshine-tiny
[21:06:48] STDOUT:
[21:06:48] STDOUT:
[21:06:48] STDOUT:
[21:06:48] STDOUT:   Model name: qwen_qwen3.5-0.8b
[21:06:48] STDOUT:
[21:06:48] STDOUT:
[21:06:48] STDOUT:
[21:06:48] STDOUT:   Imported from https://huggingface.co/bartowski/Qwen_Qwen3.5-0.8B-GGUF
[21:06:48] STDOUT:
[21:06:48] STDOUT:
[21:06:48] STDOUT:
[21:06:48] STDOUT:   Model name: smolvlm-500m-instruct
[21:06:48] STDOUT:
[21:06:48] STDOUT:
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG runtime_settings.json not found, using defaults caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/startup.go"  caller.L=225 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG reading file for dynamic config update filename="/Applications/LocalAI.app/Contents/Resources/configuration/api_keys.json" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/config_file_watcher.go"  caller.L=66 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG processing api keys runtime update numKeys=0 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/config_file_watcher.go"  caller.L=139 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG no API keys discovered from dynamic config file caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/config_file_watcher.go"  caller.L=153 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG total api keys after processing numKeys=0 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/config_file_watcher.go"  caller.L=156 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG reading file for dynamic config update filename="/Applications/LocalAI.app/Contents/Resources/configuration/external_backends.json" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/config_file_watcher.go"  caller.L=66 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG processing external_backends.json caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/config_file_watcher.go"  caller.L=165 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG external backends loaded from external_backends.json caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/config_file_watcher.go"  caller.L=182 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG reading file for dynamic config update filename="/Applications/LocalAI.app/Contents/Resources/configuration/runtime_settings.json" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/config_file_watcher.go"  caller.L=66 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG processing runtime_settings.json caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/config_file_watcher.go"  caller.L=190 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG runtime settings loaded from runtime_settings.json caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/config_file_watcher.go"  caller.L=362 }
[21:06:48] STDOUT: Mar 17 21:06:48 INFO  core/startup process completed! caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/application/startup.go"  caller.L=174 }
[21:06:48] STDOUT: Mar 17 21:06:48 INFO  LocalAI is started and running address="127.0.0.1:8080" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/cli/run.go"  caller.L=414 }
[21:06:48] STDOUT: Mar 17 21:06:48 INFO  Agent pool started stateDir="/Applications/LocalAI.app/Contents/Resources/data" apiURL="http://127.0.0.1:8080" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/services/agent_pool.go"  caller.L=177 }
[21:06:48] STDOUT: Mar 17 21:06:48 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
*removed repeating lines*
[21:06:51] STDOUT: Mar 17 21:06:51 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
[21:06:51] STDOUT: Mar 17 21:06:51 INFO  HTTP request method="GET" path="/system" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=144 }
[21:06:51] STDOUT: Mar 17 21:06:51 INFO  HTTP request method="GET" path="/v1/models" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=144 }
[21:06:51] STDOUT: Mar 17 21:06:51 DEBUG HTTP request method="GET" path="/api/resources" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
[21:06:52] STDOUT: Mar 17 21:06:52 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
*removed repeating lines*
[21:06:55] STDOUT: Mar 17 21:06:55 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
[21:06:55] STDOUT: Mar 17 21:06:55 INFO  HTTP request method="GET" path="/api/models/config-json/smolvlm-500m-instruct" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=144 }
[21:06:55] STDOUT: Mar 17 21:06:55 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
*removed repeating lines*
[21:06:56] STDOUT: Mar 17 21:06:56 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
[21:06:56] STDOUT: Mar 17 21:06:56 INFO  HTTP request method="GET" path="/system" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=144 }
[21:06:56] STDOUT: Mar 17 21:06:56 INFO  HTTP request method="GET" path="/v1/models" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=144 }
[21:06:56] STDOUT: Mar 17 21:06:56 DEBUG HTTP request method="GET" path="/api/resources" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
[21:06:57] STDOUT: Mar 17 21:06:57 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
*removed repeating lines*
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
[21:06:59] STDOUT: Mar 17 21:06:59 INFO  HTTP request method="GET" path="/api/models/config-json/smolvlm-500m-instruct" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=144 }
[21:06:59] STDOUT: Mar 17 21:06:59 INFO  HTTP request method="GET" path="/api/models/capabilities" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=144 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG context local model name not found, setting to the first model first model name="qwen_qwen3.5-0.8b" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/middleware/request.go"  caller.L=115 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG [gguf] guessDefaultsFromFile: NGPULayers set NGPULayers=0x717c6f0e9358 modelName="SmolVLM 500M Instruct" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/config/gguf.go"  caller.L=48 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG [gguf] guessDefaultsFromFile: template already set name="smolvlm-500m-instruct" modelName="SmolVLM 500M Instruct" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/config/gguf.go"  caller.L=62 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG input.Input input="<nil>" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/middleware/request.go"  caller.L=412 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Chat endpoint configuration read config=&{/Users/ilsa/.localai/models/smolvlm-500m-instruct.yaml <|im_start|>{% for message in messages %}{{message['role'] | capitalize}}{% if message['content'][0]['type'] == 'image' %}{{':'}}{% else %}{{': '}}{% endif %}{% for line in message['content'] %}{% if line['type'] == 'text' %}{{line['text']}}{% elif line['type'] == 'image' %}{{ '<image>' }}{% endif %}{% endfor %}<end_of_utterance>
[21:06:59] STDOUT: {% endfor %}{% if add_generation_prompt %}{{ 'Assistant:' }}{% endif %} {{SmolVLM-500M-Instruct-Q8_0.gguf}  false 0 0x717c6e62fd40 0x717c6e62fd48 0x717c6e62fd50 0x717c6e62fd80 false 0 false 0 0 0 0 0 0x717c6e62fd78 0x717c6e62fd70 0x717c6e62fd28 {false} <nil> map[]  0 0 0 0 } smolvlm-500m-instruct 0x717c6e62fd0a 0x717c6e62fd30 0x717c6e62fd89 map[] 0x717c6e62fd89 llama-cpp {<|im_start|>
[21:06:59] STDOUT: {{.Input -}}
[21:06:59] STDOUT: Assistant:  {{if eq .RoleName "assistant"}}Assistant{{else if eq .RoleName "system"}}System{{else if eq .RoleName "user"}}User{{end}}: {{.Content }}<end_of_utterance>
[21:06:59] STDOUT:  {{-.Input}}
[21:06:59] STDOUT:    false <nil>  } [FLAG_COMPLETION FLAG_CHAT] 0x717c6c9e4328 {   } [] [] []    map[] {false {false false false false false  false   []}   [] [] []   [] [] []    <nil> false <nil>} {<nil> <nil> <nil> [] []} map[] {   0 0  false false 0x717c6e62fd68 0x717c6e62fd60 0x717c6e62fd58 0x717c6f0e9358 0x717c6e62fd0b 0x717c6e62fd89 0x717c6e62fd89 0x717c6e62fd89  [<|im_end|> <dummy32000> </s> <| <end_of_utterance> <|endoftext|>] [] [] [] [] 0x717c6f0e9350 false   [] [] 0 false  0   0 false false 0 0 0 false  {0 0 0} mmproj-SmolVLM-500M-Instruct-Q8_0.gguf <nil> false     0 0 0 0 0} {false    false 0   } 0 {0 0} { } false []   [] [] { } {0 0 false false false false false 0 0 false}} caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/openai/chat.go"  caller.L=434 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Tool call routing decision shouldUseFn=false len(input.Functions)=0 len(input.Tools)=0 config.ShouldUseFunctions()=true config.FunctionToCall()="" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/openai/chat.go"  caller.L=534 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Parameters config=&{/Users/ilsa/.localai/models/smolvlm-500m-instruct.yaml <|im_start|>{% for message in messages %}{{message['role'] | capitalize}}{% if message['content'][0]['type'] == 'image' %}{{':'}}{% else %}{{': '}}{% endif %}{% for line in message['content'] %}{% if line['type'] == 'text' %}{{line['text']}}{% elif line['type'] == 'image' %}{{ '<image>' }}{% endif %}{% endfor %}<end_of_utterance>
[21:06:59] STDOUT: {% endfor %}{% if add_generation_prompt %}{{ 'Assistant:' }}{% endif %} {{SmolVLM-500M-Instruct-Q8_0.gguf}  false 0 0x717c6e62fd40 0x717c6e62fd48 0x717c6e62fd50 0x717c6e62fd80 false 0 false 0 0 0 0 0 0x717c6e62fd78 0x717c6e62fd70 0x717c6e62fd28 {false} <nil> map[]  0 0 0 0 } smolvlm-500m-instruct 0x717c6e62fd0a 0x717c6e62fd30 0x717c6e62fd89 map[] 0x717c6e62fd89 llama-cpp {<|im_start|>
[21:06:59] STDOUT: {{.Input -}}
[21:06:59] STDOUT: Assistant:  {{if eq .RoleName "assistant"}}Assistant{{else if eq .RoleName "system"}}System{{else if eq .RoleName "user"}}User{{end}}: {{.Content }}<end_of_utterance>
[21:06:59] STDOUT:  {{-.Input}}
[21:06:59] STDOUT:    false <nil>  } [FLAG_COMPLETION FLAG_CHAT] 0x717c6c9e4328 {   } [] [] []    map[] {false {false false false false false  false   []}   [] [] []   [] [] []    <nil> false <nil>} {<nil> <nil> <nil> [] []} map[] {   0 0  false false 0x717c6e62fd68 0x717c6e62fd60 0x717c6e62fd58 0x717c6f0e9358 0x717c6e62fd0b 0x717c6e62fd89 0x717c6e62fd89 0x717c6e62fd89  [<|im_end|> <dummy32000> </s> <| <end_of_utterance> <|endoftext|>] [] [] [] [] 0x717c6f0e9350 false   [] [] 0 false  0   0 false false 0 0 0 false  {0 0 0} mmproj-SmolVLM-500M-Instruct-Q8_0.gguf <nil> false     0 0 0 0 0} {false    false 0   } 0 {0 0} { } false []   [] [] { } {0 0 false false false false false 0 0 false}} caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/openai/chat.go"  caller.L=657 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG templated message for chat message="User: test<end_of_utterance>\n" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/templates/evaluator.go"  caller.L=142 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Prompt (before templating) prompt="User: test<end_of_utterance>\n" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/templates/evaluator.go"  caller.L=206 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Template found, input modified input="<|im_start|>\nUser: test<end_of_utterance>\nAssistant: " caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/templates/evaluator.go"  caller.L=224 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Prompt (after templating) prompt="<|im_start|>\nUser: test<end_of_utterance>\nAssistant: " caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/openai/chat.go"  caller.L=666 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Stream request received caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/openai/chat.go"  caller.L=675 }
[21:06:59] STDOUT: Mar 17 21:06:59 INFO  BackendLoader starting modelID="smolvlm-500m-instruct" backend="llama-cpp" model="SmolVLM-500M-Instruct-Q8_0.gguf" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/initializers.go"  caller.L=157 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Loading model in memory from file file="/Users/ilsa/.localai/models/SmolVLM-500M-Instruct-Q8_0.gguf" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/loader.go"  caller.L=230 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Loading Model with gRPC modelID="smolvlm-500m-instruct" file="/Users/ilsa/.localai/models/SmolVLM-500M-Instruct-Q8_0.gguf" backend="llama-cpp" options={llama-cpp SmolVLM-500M-Instruct-Q8_0.gguf smolvlm-500m-instruct {{}} 0x717c6c266608 map[] 20 2 false} caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/initializers.go"  caller.L=51 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Loading external backend uri="/Users/ilsa/.localai/backends/metal-llama-cpp/run.sh" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/initializers.go"  caller.L=75 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG external backend is file file=&{run.sh 1480 493 {0 63909110012 0x10ad43960} {16777232 33261 1 4824172 501 20 0 [0 0 0 0] {1773791630 841305493} {1773513212 0} {1773791465 628736845} {1773513212 0} 1480 8 4096 0 0 0 [0 0]}} caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/initializers.go"  caller.L=78 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Sending chunk chunk="{\"created\":1773796019,\"object\":\"chat.completion.chunk\",\"id\":\"d3a49712-ddc5-477e-9056-16956594b2d4\",\"model\":\"smolvlm-500m-instruct\",\"choices\":[{\"index\":0,\"finish_reason\":null,\"delta\":{\"role\":\"assistant\",\"content\":null}}],\"usage\":{\"prompt_tokens\":0,\"completion_tokens\":0,\"total_tokens\":0}}" caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/openai/chat.go"  caller.L=745 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Loading GRPC Process process="/Users/ilsa/.localai/backends/metal-llama-cpp/run.sh" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=124 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC Service will be running id="smolvlm-500m-instruct" address="127.0.0.1:62305" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=126 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC Service state dir dir="/var/folders/cw/rzy1gl1x76l3hthzjd5vzr4h0000gn/T/go-processmanager943411669" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=150 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC Service Started caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/initializers.go"  caller.L=90 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Wait for the service to start up caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/initializers.go"  caller.L=103 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG Options options=ContextSize:8192 Seed:79123738 NBatch:512 F16Memory:true MMap:true NGPULayers:99999999 Threads:18 MMProj:"/Users/ilsa/.localai/models/mmproj-SmolVLM-500M-Instruct-Q8_0.gguf" FlashAttention:"auto" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/initializers.go"  caller.L=104 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+++ realpath run.sh" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="++ dirname /Users/ilsa/.localai/backends/metal-llama-cpp/run.sh" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ CURDIR=/Users/ilsa/.localai/backends/metal-llama-cpp" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ cd /" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ echo 'CPU info:'" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stdout id="smolvlm-500m-instruct-127.0.0.1:62305" line="CPU info:" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=174 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ grep -e 'model\\sname' /proc/cpuinfo" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ head -1" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="grep: /proc/cpuinfo: No such file or directory" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ grep -e flags /proc/cpuinfo" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ head -1" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="grep: /proc/cpuinfo: No such file or directory" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ BINARY=llama-cpp-fallback" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ grep -q -e '\\savx\\s' /proc/cpuinfo" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="grep: /proc/cpuinfo: No such file or directory" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ grep -q -e '\\savx2\\s' /proc/cpuinfo" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="grep: /proc/cpuinfo: No such file or directory" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ grep -q -e '\\savx512f\\s' /proc/cpuinfo" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="grep: /proc/cpuinfo: No such file or directory" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ '[' -n '' ']'" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="++ uname" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ '[' Darwin == Darwin ']'" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ export DYLD_LIBRARY_PATH=/Users/ilsa/.localai/backends/metal-llama-cpp/lib:" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ DYLD_LIBRARY_PATH=/Users/ilsa/.localai/backends/metal-llama-cpp/lib:" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ '[' -f /Users/ilsa/.localai/backends/metal-llama-cpp/lib/ld.so ']'" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stdout id="smolvlm-500m-instruct-127.0.0.1:62305" line="Using binary: llama-cpp-fallback" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=174 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ echo 'Using binary: llama-cpp-fallback'" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG GRPC stderr id="smolvlm-500m-instruct-127.0.0.1:62305" line="+ exec /Users/ilsa/.localai/backends/metal-llama-cpp/llama-cpp-fallback --addr 127.0.0.1:62305" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=165 }
[21:06:59] STDOUT: Mar 17 21:06:59 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
*removed repeating lines*
[21:07:37] STDOUT: Mar 17 21:07:37 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
[21:07:37] STDOUT: Mar 17 21:07:37 ERROR failed starting/connecting to the gRPC service error=rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:62305: connect: connection refused" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/initializers.go"  caller.L=116 }
[21:07:38] STDOUT: Mar 17 21:07:38 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
*removed repeating lines*
[21:07:39] STDOUT: Mar 17 21:07:39 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
[21:07:39] STDOUT: Mar 17 21:07:39 DEBUG GRPC Service NOT ready caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/initializers.go"  caller.L=122 }
[21:07:39] STDOUT: Mar 17 21:07:39 ERROR Failed to load model modelID="smolvlm-500m-instruct" error=failed to load model with internal loader: grpc service not ready backend="llama-cpp" caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/initializers.go"  caller.L=177 }
[21:07:39] STDOUT: Mar 17 21:07:39 DEBUG No choices in the response, skipping caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/openai/chat.go"  caller.L=720 }
[21:07:39] STDOUT: Mar 17 21:07:39 DEBUG No choices in the response, skipping caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/openai/chat.go"  caller.L=720 }
[21:07:39] STDOUT: Mar 17 21:07:39 DEBUG No choices in the response, skipping caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/openai/chat.go"  caller.L=720 }
[21:07:39] STDOUT: Mar 17 21:07:39 ERROR Stream ended with error error=failed to load model with internal loader: grpc service not ready caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/openai/chat.go"  caller.L=757 }
[21:07:39] STDOUT: Mar 17 21:07:39 INFO  HTTP request method="POST" path="/v1/chat/completions" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=144 }
[21:07:40] STDOUT: Mar 17 21:07:40 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
*removed repeating lines*
[21:07:47] STDOUT: Mar 17 21:07:47 DEBUG HTTP request method="GET" path="/api/operations" status=200 caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/app.go"  caller.L=142 }
[21:07:47] STDOUT: Mar 17 21:07:47 DEBUG Closed all MCP sessions caller={caller.file="/home/runner/work/LocalAI/LocalAI/core/http/endpoints/mcp/tools.go"  caller.L=609 }
[21:07:47] STDOUT: Mar 17 21:07:47 ERROR error while shutting down grpc process error=failed to send signal terminated to process 50531: no such process caller={caller.file="/home/runner/work/LocalAI/LocalAI/pkg/model/process.go"  caller.L=155 }

Additional context
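I can also try launching the backend by hand to capture whatever it prints before the gRPC port is refused. This is a sketch based on the `exec` line in the log above; the address is arbitrary and the script path comes from my install:

```shell
# Run the backend launcher directly, tracing each step (-x), to see why
# the gRPC server never comes up on the chosen port.
DYLD_LIBRARY_PATH=~/.localai/backends/metal-llama-cpp/lib \
  bash -x ~/.localai/backends/metal-llama-cpp/run.sh --addr 127.0.0.1:62305
```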
