fix(deps): update module github.com/ollama/ollama to v0.21.2#66

Open
renovate[bot] wants to merge 1 commit into main from
renovate/github.com-ollama-ollama-0.x

Conversation


renovate[bot] commented Aug 28, 2024

ℹ️ Note

This PR body was truncated due to platform limits.

This PR contains the following updates:

Package: github.com/ollama/ollama
Change: v0.3.6 → v0.21.2

Release Notes

ollama/ollama (github.com/ollama/ollama)

v0.21.2

Compare Source

What's Changed

  • Improved reliability of the OpenClaw onboarding flow in ollama launch
  • Recommended models in ollama launch now appear in a fixed, canonical order
  • The OpenClaw integration now bundles Ollama's web search plugin

Full Changelog: ollama/ollama@v0.21.1...v0.21.2

v0.21.1

Compare Source

What's Changed

Kimi CLI

You can now install and run the Kimi CLI through Ollama.

ollama launch kimi --model kimi-k2.6:cloud

Kimi CLI with Kimi K2.6 excels at long horizon agentic execution tasks through a multi-agent system.

  • MLX runner adds logprobs support for compatible models
  • Faster MLX sampling with fused top-P and top-K in a single sort pass, plus repeat penalties applied in the sampler
  • Improved MLX prompt tokenization by moving tokenization into request handler goroutines
  • Better MLX thread safety for array management
  • GLM4 MoE Lite performance improvement with a fused sigmoid router head
  • Fixed model picker showing stale model after switching chats in the macOS app
  • Fixed structured outputs for Gemma 4 when think=false

Full Changelog: ollama/ollama@v0.21.0...v0.21.1

v0.21.0

Compare Source

Hermes Agent

ollama launch hermes

Hermes learns with you, automatically creating skills to better serve your workflows. Great for research and engineering tasks.

What's Changed

  • Gemma 4 on MLX. Added support for running Gemma 4 via MLX on Apple Silicon, including a text-only MLX runtime for the model. The MLX backend also picked up mixed-precision quantization, better capability detection, and a batch of new op wrappers (Conv2d, Pad, activations, trig, masked SDPA, and RoPE-with-freqs).
  • Hermes and GitHub Copilot CLI in ollama launch. Added both integrations, which can now be configured in one command alongside the rest of the supported coding agents.
  • OpenCode moved to inline config. ollama launch opencode now writes its config inline rather than to a separate file, matching how other integrations are handled.
  • ollama launch no longer rewrites config when nothing changed. Pressing → on a configured multi-model integration, or passing --model with the current primary, used to trigger a confirmation prompt and rewrite both the editor's config file and config.json. Now it's a no-op when the resolved model list matches what's already saved.
  • Fixed ollama launch openclaw --yes so it correctly skips the channels configuration step and non-interactive setups complete cleanly.
  • Restored the Gemma 4 nothink renderer with the e2b-style prompt.
  • Fixed the Gemma 4 compiler error that was breaking Metal builds.
  • Fixed macOS cross-compiles so they no longer trigger generate, which was breaking cmake builds on some Xcode versions.
  • Quieted cgo builds by suppressing deprecated warnings during go build.

Full Changelog: ollama/ollama@v0.20.7...v0.21.0

v0.20.7

Compare Source

What's Changed

  • Fix quality of gemma:e2b and gemma:e4b when thinking is disabled
  • ROCm: Update to ROCm 7.2.1 on Linux by @​saman-amd in #​15483

Full Changelog: ollama/ollama@v0.20.6...v0.20.7

v0.20.6

Compare Source

What's Changed

  • Gemma 4 tool calling ability is improved and updated to use Google's latest post-launch fixes
  • Parallel tool calling improved for streaming responses
  • Hermes agent Ollama integration guide is now available
  • Ollama app is updated to fix image attachment errors

New Contributors

@​matteocelani made their first contribution in #​15272

Full Changelog: ollama/ollama@v0.20.5...v0.20.6

v0.20.5

Compare Source

OpenClaw channel setup with ollama launch

What's Changed

  • OpenClaw channel setup: connect WhatsApp, Telegram, Discord, and other messaging channels through ollama launch openclaw
  • Enable flash attention for Gemma 4 on compatible GPUs
  • ollama launch opencode now detects curl-based OpenCode installs at ~/.opencode/bin
  • Fix /save command for models imported from safetensors

Full Changelog: ollama/ollama@v0.20.4...v0.20.5

v0.20.4

Compare Source

What's Changed

  • mlx: Improve M5 performance with NAX
  • gemma4: enable flash attention

Full Changelog: ollama/ollama@v0.20.3...v0.20.4

v0.20.3

Compare Source

What's Changed

  • Gemma 4 Tool Calling improvements
  • Added latest models to Ollama App
  • OpenClaw fixes for launching TUI

Full Changelog: ollama/ollama@v0.20.2...v0.20.3

v0.20.2

Compare Source

What's Changed

Full Changelog: ollama/ollama@v0.20.1...v0.20.2

v0.20.1

Compare Source

What's Changed

Full Changelog: ollama/ollama@v0.20.0...v0.20.1

v0.20.0

Compare Source

Gemma 4

Effective 2B (E2B)

ollama run gemma4:e2b

Effective 4B (E4B)

ollama run gemma4:e4b

26B (Mixture of Experts model with 4B active parameters)

ollama run gemma4:26b

31B (Dense)

ollama run gemma4:31b

What's Changed

Full Changelog: ollama/ollama@v0.19.0...v0.20.0-rc0

v0.19.0

Compare Source

Ollama is now powered by MLX on Apple Silicon in preview

Ollama on Apple silicon is now built on top of Apple’s machine learning framework, MLX, to take advantage of its unified memory architecture.

Read more: https://ollama.com/blog/mlx

What's Changed

  • Ollama's app will no longer incorrectly show "model is out of date"
  • ollama launch pi now includes a web search plugin that uses Ollama's web search
  • Improved KV cache hit rate when using the Anthropic-compatible API
  • Fixed tool call parsing issue with Qwen3.5 where tool calls would be output in thinking
  • MLX runner will now create periodic snapshots during prompt processing
  • Fixed KV cache snapshot memory leak in MLX runner
  • Fixed issue where flash attention would be incorrectly enabled for grok models
  • Fixed qwen3-next:80b not loading in Ollama

Full Changelog: ollama/ollama@v0.18.3...v0.19.0

v0.18.3

Compare Source

Visual Studio Code

Microsoft Visual Studio Code now directly integrates with Ollama via GitHub Copilot.

If you have Ollama installed, any local or cloud model from Ollama can be selected for use within Visual Studio Code.

What's Changed

  • GLM parser improvements for tool calls
  • OpenClaw integration improvements for gateway checks

Full Changelog: ollama/ollama@v0.18.2...v0.18.3

v0.18.2

Compare Source

What's Changed

  • Add extra check to ensure npm and git are installed before installing OpenClaw
  • Claude Code is now faster when run locally, by preventing cache breakages
  • Fix to correctly support ollama launch openclaw --model <model>
  • Register Ollama's websearch package correctly for OpenClaw

Full Changelog: ollama/ollama@v0.18.1...v0.18.2

v0.18.1

Compare Source

Web Search and Fetch in OpenClaw

Ollama now ships with web search and web fetch plugins for OpenClaw. These allow Ollama's models (local or cloud) to search the web for the latest content and news, and allow OpenClaw to fetch web pages and extract readable content for processing. This feature does not execute JavaScript.

When using local models with web search in OpenClaw, ensure you are signed in to Ollama with ollama signin.

ollama launch openclaw

You can install web search directly into OpenClaw as a plugin if you already have OpenClaw configured and working:

Ollama web search plugin:

openclaw plugins install @ollama/openclaw-web-search

Non-interactive (headless) mode for ollama launch

ollama launch can now run in non-interactive mode.

Perfect for:

  • Docker/containers: spin up an integration as a pipeline step to run evals, test prompts, or validate model behavior as part of your build. Tear it down when the job ends.

  • CI/CD: Generate code reviews, security checks, and other tasks within your CI

  • Scripts/automation: Kick off automated tasks with Ollama and Claude Code

  • --model must be specified to run in headless mode

  • --yes flag will auto-pull the model and skip any selectors

Try with: ollama launch claude --model kimi-k2.5:cloud --yes -- -p "how does this repository work?"

Use non-interactive mode in OpenClaw

You can ask OpenClaw to run tasks using Claude with subagents:

ollama launch claude --model kimi-k2.5:cloud --yes -- -p "how does this repository work?" using a subagent

What's Changed

  • ollama launch openclaw will now use the official Ollama auth and model provider for OpenClaw
  • Improvements to Ollama's benchmarking tool in ./cmd/bench
  • ollama launch openclaw will now skip --install-daemon when systemd is unavailable

Full Changelog: ollama/ollama@v0.18.0...v0.18.1

v0.18.0

Compare Source

Ollama 0.18 includes improved performance for OpenClaw and Ollama’s cloud models, including the new Nemotron-3-Super model by NVIDIA designed for high-performance agentic reasoning tasks.

Improved OpenClaw performance with Kimi-K2.5

This release of Ollama improves performance of cloud models and their reliability.

  • Up to 2x faster speeds with Kimi-K2.5
  • Tool calling accuracy has been improved

ollama launch openclaw --model kimi-k2.5

Ollama is now a provider in OpenClaw

Ollama can now be selected as an authentication and model provider during OpenClaw onboarding (thanks @​BruceMacD for contributing and @​steipete for reviewing!)

openclaw onboard --auth-choice ollama

More information: https://docs.openclaw.ai/providers/ollama

Nemotron-3-Super

Nemotron-3-Super is a new 122B-parameter model with strong reasoning and tool-calling capability and top performance on modern hardware:

  • ollama run nemotron-3-super:cloud
  • ollama run nemotron-3-super to run locally (requires 96GB+ of VRAM)

Nemotron-3-Super scores highest of any open model on PinchBench, a benchmark suite that measures how successful models are at completing tasks when used with OpenClaw.

ollama launch openclaw --model nemotron-3-super:cloud

Or using OpenClaw’s onboarding:

openclaw onboard \
	--auth-choice ollama \
	--custom-model-id nemotron-3-super:cloud

Non-interactive task support

ollama launch now supports non-interactive tasks by passing in --yes. This enables using Claude, Codex, Pi and more in scripts, GitHub Actions, and other non-interactive environments.

ollama launch claude \
	--model glm-5:cloud \
	--yes \
	-- "Do a quick code review of this pull request and respond on GitHub with a comment summarizing your feedback."

Lower latency on MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud

For customers in North America, MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud now respond much faster, up to 10x and up to 2x faster respectively, and often in less than a second. This is ideal for tasks that require a fast Time To First Token (TTFT) when needing quick answers from OpenClaw or quick back-to-back coding tasks.

ollama launch claude --model minimax-m2.5

Driver updates required for ROCm 7

This version of Ollama ships with ROCm 7, and requires updating drivers to the latest version for continued support.

What's Changed

  • Ollama's cloud models no longer require downloading via ollama pull. Setting :cloud as a tag will now automatically connect to cloud models.
  • New --yes flag for ollama launch that skips all prompts, making it possible to run AI assistants and other tools in non-interactive environments
  • Fixed issue where "Reset to Defaults" in Ollama's app would disable downloading automatic updates.
  • Ollama will now ensure context compaction occurs at the correct context length for each model when using ollama launch claude

Full Changelog: ollama/ollama@v0.17.7...v0.18.0

v0.17.7

Compare Source

What's Changed

  • Allow thinking levels such as "medium" to be correctly interpreted in Ollama's API for all thinking models
  • Add context length to support compaction when using ollama launch

Full Changelog: ollama/ollama@v0.17.6...v0.17.7

v0.17.6

Compare Source

What's Changed

  • Fixed issue where GLM-OCR would not work due to incorrect prompt rendering
  • Fixed tool calling parsing and rendering for Qwen 3.5 models

Full Changelog: ollama/ollama@v0.17.5...v0.17.6

v0.17.5

Compare Source

New models

  • Qwen3.5: the small Qwen 3.5 model series is now available in 0.8B, 2B, 4B and 9B parameter sizes.

What's Changed

  • Fixed crash in Qwen 3.5 models when split over GPU & CPU
  • Fixed issue where Qwen 3.5 models would repeat themselves due to no presence penalty (note: you may have to redownload the qwen3.5 models: ollama pull qwen3.5:35b for example)
  • ollama run --verbose will now show peak memory usage when using Ollama's MLX engine
  • Fixed memory issues and crashes in MLX runner
  • Fixed issue where Ollama would not be able to run models imported from Qwen3.5 GGUF files

Full Changelog: ollama/ollama@v0.17.4...v0.17.5

v0.17.4

Compare Source

New models

  • Qwen 3.5: a family of open-source multimodal models that delivers exceptional utility and performance.
  • LFM 2: LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient.

Note: for users on 0.17.1, this version will not automatically update. Re-downloading is required to receive the latest version of Ollama.

What's Changed

  • Tool call indices will now be included in parallel tool calls

Full Changelog: ollama/ollama@v0.17.3...v0.17.4

v0.17.3

Compare Source

What's Changed

  • Fixed issue where tool calls in the Qwen 3 and Qwen 3.5 model families would not be parsed correctly if emitted during thinking

Full Changelog: ollama/ollama@v0.17.2...v0.17.3

v0.17.2

Compare Source

What's Changed

  • Fixed issue where Ollama's app on Windows would crash when a new update has been downloaded

Full Changelog: ollama/ollama@v0.17.1...v0.17.2

v0.17.1

Compare Source

What's Changed

  • Nemotron architecture support in Ollama's engine
  • MLX engine now has improved memory usage
  • Ollama's app will now allow models that support tools to use web search capabilities
  • Improved LFM2 and LFM2.5 models in Ollama's engine
  • ollama create will no longer default to affine quantization for unquantized models when using the MLX engine
  • Added configuration for disabling automatic update downloading

Full Changelog: ollama/ollama@v0.17.0...v0.17.1

v0.17.0

Compare Source

OpenClaw

OpenClaw can now be installed and configured automatically via Ollama, making it the easiest way to get up and running with OpenClaw with open models like Kimi-K2.5, GLM-5, and Minimax-M2.5.

Get started

ollama launch openclaw

Web search in OpenClaw

When using cloud models, web search is enabled, allowing OpenClaw to search the internet.

What's Changed

  • Improved tokenizer performance
  • Ollama's macOS and Windows apps will now default to a context length based on available VRAM

Full Changelog: ollama/ollama@v0.16.3...v0.17.0

v0.16.3

Compare Source

What's Changed

  • New ollama launch cline added for the Cline CLI
  • ollama launch <integration> will now always show the model picker
  • Added Gemma 3, Llama and Qwen 3 architectures to MLX runner

Full Changelog: ollama/ollama@v0.16.2...v0.16.3

v0.16.2

Compare Source

What's Changed

  • ollama launch claude now supports searching the web when using :cloud models
  • Fixed rendering issue when running ollama in PowerShell
  • New setting in Ollama's app makes it easier to disable cloud models for sensitive and private tasks where data cannot leave your computer. For Linux or when running ollama serve manually, set OLLAMA_NO_CLOUD=1.
  • Fixed issue where experimental image generation models would not run in 0.16.0 and 0.16.1

Full Changelog: ollama/ollama@v0.16.1...v0.16.2-rc0

v0.16.1

Compare Source

What's Changed

  • Installing Ollama via the curl install script on macOS will now only prompt for your password if it's required
  • Installing Ollama via the install script on Windows will now show progress
  • Image generation models will now respect the OLLAMA_LOAD_TIMEOUT variable

Full Changelog: ollama/ollama@v0.16.0...v0.16.1

v0.16.0

Compare Source

New models

  • GLM-5: A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
  • MiniMax-M2.5: a new state-of-the-art large language model designed for real-world productivity and coding tasks.

New ollama command

The new ollama command makes it easy to launch your favorite apps with models using Ollama

What's Changed

  • Launch Pi with ollama launch pi
  • Improvements to Ollama's MLX runner to support GLM-4.7-Flash
  • Ctrl+G will now allow for editing text prompts in a text editor when running a model

Full Changelog: ollama/ollama@v0.15.6...v0.16.0

v0.15.6

Compare Source

What's Changed

  • Fixed context limits when running ollama launch droid
  • ollama launch will now download missing models instead of erroring
  • Fixed bug where ollama launch claude would cause context compaction when providing images

Full Changelog: ollama/ollama@v0.15.5...v0.15.6

v0.15.5

Compare Source

New models

  • Qwen3-Coder-Next: a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.
  • GLM-OCR: GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.

Improvements to ollama launch

  • ollama launch can now be provided arguments, for example ollama launch claude -- --resume
  • ollama launch will now run subagents when using ollama launch claude
  • Ollama will now set context limits for a set of models when using ollama launch opencode

What's Changed

  • Sub-agent support for ollama launch for planning, deep research, and similar tasks
  • ollama signin will now open a browser window to make signing in easier
  • Ollama will now default to the following context lengths based on VRAM:
    • < 24 GiB VRAM: 4,096 context
    • 24-48 GiB VRAM: 32,768 context
    • >= 48 GiB VRAM: 262,144 context
  • GLM-4.7-Flash support on Ollama's experimental MLX engine
  • ollama signin will now open the browser to the connect page
  • Fixed off by one error when using num_predict in the API
  • Fixed issue where tokens from a previous sequence would be returned when hitting num_predict
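The VRAM-based context defaults above reduce to a simple threshold table; here is a hypothetical Python sketch of that mapping (the function name is illustrative, not part of Ollama's code):

```python
def default_context_length(vram_gib: float) -> int:
    """Map available VRAM (in GiB) to the default context lengths listed above."""
    if vram_gib < 24:
        return 4_096        # < 24 GiB VRAM
    if vram_gib < 48:
        return 32_768       # 24-48 GiB VRAM
    return 262_144          # >= 48 GiB VRAM

# e.g. a 16 GiB GPU falls into the smallest bucket
print(default_context_length(16), default_context_length(32), default_context_length(96))
```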

Full Changelog: ollama/ollama@v0.15.4...v0.15.5

v0.15.4

Compare Source

What's Changed

  • ollama launch openclaw will now enter the standard OpenClaw onboarding flow if this has not yet been completed.

Full Changelog: ollama/ollama@v0.15.3...v0.15.4

v0.15.3

Compare Source

What's Changed

  • Renamed ollama launch clawdbot to ollama launch openclaw to reflect the project's new name
  • Improved tool calling for Ministral models
  • docs: add clawdbot by @​ParthSareen in #​13925
  • cmd/config: Use envconfig.Host() for base API in launch config packages by @​gabe-l-hart in #​13937
  • ollama launch will now use the value of OLLAMA_HOST when running it

Full Changelog: ollama/ollama@v0.15.2...v0.15.3

v0.15.2

Compare Source

What's Changed

  • New ollama launch clawdbot command for launching Clawdbot using Ollama models

Full Changelog: ollama/ollama@v0.15.1...v0.15.2

v0.15.1

Compare Source

What's Changed

  • GLM-4.7-Flash performance and correctness improvements, fixing repetitive answers and tool calling quality
  • Fixed performance issues on macOS and arm64 Linux
  • Fixed issue where ollama launch would not detect claude and would incorrectly update opencode configurations

Full Changelog: ollama/ollama@v0.15.0...v0.15.1

v0.15.0

Compare Source

ollama launch

A new ollama launch command to use Ollama's models with Claude Code, Codex, OpenCode, and Droid without separate configuration.

What's Changed

  • New ollama launch command for Claude Code, Codex, OpenCode, and Droid
  • Fixed issue where creating multi-line strings with """ would not work when using ollama run
  • Ctrl+J and Shift+Enter now work for inserting newlines in ollama run
  • Reduced memory usage for GLM-4.7-Flash models

v0.14.3

Compare Source

New models

  • Z-Image Turbo: a 6-billion-parameter text-to-image model from Alibaba’s Tongyi Lab that generates high-quality photorealistic images.
  • Flux.2 Klein: Black Forest Labs’ fastest image-generation models to date.
  • GLM-4.7-Flash: As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.
  • LFM2.5-1.2B-Thinking: LFM2.5 is a new family of hybrid models designed for on-device deployment.

What's Changed

  • Fixed issue where Ollama's macOS app would interrupt system shutdown
  • Fixed ollama create and ollama show commands for experimental models
  • The /api/generate API can now be used for image generation
  • Fixed minor issues in Nemotron-3-Nano tool parsing
  • Fixed issue where removing an image generation model would cause it to first load
  • Fixed issue where ollama rm would only stop the first model in the list if it were running

Full Changelog: ollama/ollama@v0.14.2...v0.14.3

v0.14.2

Compare Source

New models

  • TranslateGemma: A new collection of open translation models built on Gemma 3, helping people communicate across 55 languages.

What's Changed

  • Shift + Enter (or Ctrl + j) will now enter a newline in Ollama's CLI
  • Improve /v1/responses API to better conform to the OpenResponses specification

Full Changelog: ollama/ollama@v0.14.1...v0.14.2

v0.14.1

Compare Source

Image generation models (experimental)

Experimental image generation models are available for macOS and Linux (CUDA) in Ollama:

Available models
ollama run x/z-image-turbo

Note: x is a username on ollama.com where experimental models are uploaded

More models coming soon:

  1. Qwen-Image-2512
  2. Qwen-Image-Edit-2511
  3. GLM-Image

What's Changed

  • fix macOS auto-update signature verification failure

Full Changelog: ollama/ollama@v0.14.0...v0.14.1

v0.14.0

Compare Source

What's Changed

  • ollama run --experimental CLI will now open a new Ollama CLI that includes an agent loop and the bash tool
  • Anthropic API compatibility: support for the /v1/messages API
  • A new REQUIRES command for the Modelfile allows declaring which version of Ollama is required for the model
  • For older models, Ollama will avoid an integer underflow on low VRAM systems during memory estimation
  • More accurate VRAM measurements for AMD iGPUs
  • Ollama's app will now highlight swift source code
  • An error will now return when embeddings return NaN or -Inf
  • Ollama's Linux install bundles files now use zst compression
  • New experimental support for image generation models, powered by MLX

Full Changelog: ollama/ollama@v0.13.5...v0.14.0-rc2

v0.13.5

Compare Source

New Models

  • FunctionGemma: a specialized version of Google's Gemma 3 270M model, fine-tuned explicitly for function calling.

What's Changed

  • bert architecture models now run on Ollama's engine
  • Added built-in renderer & tool parsing capabilities for DeepSeek-V3.1
  • Fixed issue where nested properties in tools may not have been rendered properly

Full Changelog: ollama/ollama@v0.13.4...v0.13.5

v0.13.4

Compare Source

New Models

  • Nemotron 3 Nano: A new Standard for Efficient, Open, and Intelligent Agentic Models
  • Olmo 3 and Olmo 3.1: A series of Open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.

What's Changed

  • Enable Flash Attention automatically for models by default
  • Fixed handling of long contexts with Gemma 3 models
  • Fixed issue that would occur with Gemma 3 QAT models or other models imported with the Gemma 3 architecture

Full Changelog: ollama/ollama@v0.13.3...v0.13.4-rc0

v0.13.3

Compare Source

New models

  • Devstral-Small-2: a 24B model that excels at using tools to explore codebases, editing multiple files, and powering software-engineering agents.
  • rnj-1: Rnj-1 is a family of 8B parameter open-weight, dense models trained from scratch by Essential AI, optimized for code and STEM with capabilities on par with SOTA open-weight models.
  • nomic-embed-text-v2: nomic-embed-text-v2-moe is a multilingual MoE text embedding model that excels at multilingual retrieval.

What's Changed

  • Improved truncation logic when using /api/embed and /v1/embeddings
  • Extend Gemma 3 architecture to support rnj-1 model
  • Fix error that would occur when running qwen2.5vl with image input

Full Changelog: ollama/ollama@v0.13.2...v0.13.3

v0.13.2

Compare Source

New models

  • Qwen3-Next: The first installment in the Qwen3-Next series with strong performance in terms of both parameter efficiency and inference speed.

What's Changed

  • Flash attention is now enabled by default for vision models such as mistral-3, gemma3, qwen3-vl and more. This improves memory utilization and performance when providing images as input.
  • Fixed GPU detection on multi-GPU CUDA machines
  • Fixed issue where deepseek-v3.1 would always think even when thinking is disabled in Ollama's app

Full Changelog: ollama/ollama@v0.13.1...v0.13.2

v0.13.1

Compare Source

New models

  • Ministral-3: The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware.
  • Mistral-Large-3: A general-purpose multimodal mixture-of-experts model for production-grade tasks and enterprise workloads.

What's Changed

  • nomic-embed-text will now use Ollama's engine by default
  • Tool calling support for cogito-v2.1
  • Fixed issues with CUDA VRAM discovery
  • Fixed link to docs in Ollama's app
  • Fixed issue where models would be evicted on CPU-only systems
  • Ollama will now better render errors instead of showing Unmarshal: errors
  • Fixed issue where CUDA GPUs would fail to be detected with older GPUs
  • Added thinking and tool parsing for cogito-v2.1

Full Changelog: ollama/ollama@v0.13.0...v0.13.1

v0.13.0

Compare Source

New models

  • DeepSeek-OCR: DeepSeek-OCR uses optical 2D mapping to compress long contexts, achieving high OCR precision with reduced vision tokens and demonstrating practical value in document processing.
  • Cogito-V2.1: instruction tuned generative models, currently the best open-weight LLM by a US company

DeepSeek-OCR

DeepSeek-OCR is now available on Ollama. Example inputs:

ollama run deepseek-ocr "/path/to/image\n<|grounding|>Given the layout of the image."
ollama run deepseek-ocr "/path/to/image\nFree OCR."
ollama run deepseek-ocr "/path/to/image\nParse the figure."
ollama run deepseek-ocr "/path/to/image\nExtract the text in the image."
ollama run deepseek-ocr "/path/to/image\n<|grounding|>Convert the document to markdown."

New bench tool

Ollama's GitHub repo now includes a bench tool that can be used to test model performance. For the time being this is a separate tool that can be built in the Ollama GitHub repository:

First, install Go. Then from the root of the Ollama repository run:

go run ./cmd/bench -model gpt-oss:20b

For more information see the tool's documentation

What's Changed

  • DeepSeek-OCR is now supported
  • DeepSeek-V3.1 architecture is now supported in Ollama's engine
  • Fixed performance issues that arose in Ollama 0.12.11 on CUDA
  • Fixed issue where Linux install packages were missing required Vulkan libraries
  • Improved CPU and memory detection while in containers/cgroups
  • Improved VRAM information detection for AMD GPUs
  • Improved KV cache performance to no longer require defragmentation

Full Changelog: ollama/ollama@v0.12.11...v0.13.0

v0.12.11

Compare Source

Logprobs

Ollama's API and OpenAI-compatible API now support log probabilities. Log probabilities of output tokens indicate the likelihood of each token occurring in the sequence given the context. This is useful for different use cases:

  1. Classification tasks
  2. Retrieval (Q&A) evaluation
  3. Autocomplete
  4. Token highlighting and outputting bytes
  5. Calculating perplexity
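For example, the perplexity use case is just the exponential of the mean negative log probability of the returned tokens; a minimal sketch, independent of Ollama's API:

```python
import math

def perplexity(logprobs: list[float]) -> float:
    """Perplexity = exp of the average negative log probability."""
    return math.exp(-sum(logprobs) / len(logprobs))

# Two tokens, each with log probability -1.0, give perplexity e ≈ 2.718.
print(perplexity([-1.0, -1.0]))
```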

To enable Logprobs, provide "logprobs": true to Ollama's API:

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?",
  "logprobs": true
}'

When log probabilities are requested, response chunks will now include a "logprobs" field with the token, log probability and raw bytes (for partial unicode).

{
  "model": "gemma3",
  "created_at": "2025-11-14T22:17:56.598562Z",
  "response": "Okay",
  "done": false,
  "logprobs": [
    {
      "token": "Okay",
      "logprob": -1.3434503078460693,
      "bytes": [
        79,
        107,
        97,
        121
      ]
    }
  ]
}
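Client code can post-process these fields; the sketch below (illustrative, not part of any Ollama SDK) converts the log probability back into a probability and decodes the raw bytes:

```python
import math

# One entry from the "logprobs" field of the response chunk above.
entry = {
    "token": "Okay",
    "logprob": -1.3434503078460693,
    "bytes": [79, 107, 97, 121],
}

# "bytes" carries the raw UTF-8 bytes, so partial-unicode tokens can be reassembled.
text = bytes(entry["bytes"]).decode("utf-8")

# A log probability maps back to a probability in (0, 1] via exp().
prob = math.exp(entry["logprob"])

print(text, round(prob, 3))
```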

top_logprobs

When setting "top_logprobs", a number of most-likely tokens are also provided, making it possible to introspect alternative tokens. Below is an example request.

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?",
  "logprobs": true,
  "top_logprobs": 3
}'

This will generate a stream of response chunks with the following fields:

{
  "model": "gemma3",
  "created_at": "2025-11-14T22:26:10.466324Z",
  "response": "The",
  "done": false,
  "logprobs": [
    {
      "token": "The",
      "logprob": -0.8361086845397949,
      "bytes": [
        84,
        104,
        101
      ],
      "top_logprobs": [
        {
          "token": "The",
          "logprob": -0.8361086845397949,
          "bytes": [
            84,
            104,
            101
          ]
        },
        {
          "token": "Okay",
          "logprob": -1.2590975761413574,
          "bytes": [
            79,
            107,
            97,
            121
          ]
        },
        {
          "token": "That",
          "logprob": -1.2686877250671387,
          "bytes": [
            84,
            104,
            97,
            116
          ]
        }
      ]
    }
  ]
}
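To illustrate introspecting alternatives, the top_logprobs candidates can be ranked and their combined probability mass computed client-side (a minimal sketch; variable names are ours, not Ollama API fields):

```python
import math

# The "top_logprobs" alternatives from the response chunk above.
top_logprobs = [
    {"token": "The", "logprob": -0.8361086845397949},
    {"token": "Okay", "logprob": -1.2590975761413574},
    {"token": "That", "logprob": -1.2686877250671387},
]

# Rank alternatives by likelihood (a higher logprob means more likely).
ranked = sorted(top_logprobs, key=lambda c: c["logprob"], reverse=True)

# Probability mass covered by the top-3 candidates; close to 1 in this example.
coverage = sum(math.exp(c["logprob"]) for c in ranked)

print([c["token"] for c in ranked])  # most likely token first
print(round(coverage, 3))
```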

Special thanks

Thank you @​baptistejamin for adding Logprobs to Ollama's API.

Vulkan support (opt-in)

Ollama 0.12.11 includes support for Vulkan acceleration. Vulkan brings support for a broad range of GPUs from AMD, Intel, and iGPUs. Vulkan support is not yet enabled by default, and requires opting in by running Ollama with a custom environment variable:

OLLAMA_VULKAN=1 ollama serve

In PowerShell, use:

$env:OLLAMA_VULKAN="1"
ollama serve

For issues or feedback on using Vulkan with Ollama, create an issue labelled Vulkan and make sure to include server logs where possible to aid in debugging.

What's Changed

  • Ollama's API and the OpenAI-compatible API now support logprobs
  • Ollama's new app now supports WebP images
  • Improved rendering performance in Ollama's new app, especially when rendering code
  • The "required" field in tool definitions is now omitted if not specified
  • Fixed issue where "tool_call_id" would be omitted when using the OpenAI-compatible API
  • Fixed issue where ollama create would import data from both consolidated.safetensors and other safetensor files
  • Ollama now prefers dedicated GPUs over iGPUs when scheduling models
  • Vulkan can now be enabled by setting OLLAMA_VULKAN=1, for example: OLLAMA_VULKAN=1 ollama serve

New Contributors

Full Changelog: ollama/ollama@v0.12.10...v0.12.11

v0.12.10

Compare Source

ollama run now works with embedding models

ollama run can now run embedding models to generate vector embeddings from text:

ollama run embeddinggemma "Hello world"

Content can also be provided to ollama run via standard input:

echo "Hello world" | ollama run embeddinggemma
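The command prints an embedding vector for the input text. A minimal Python sketch, using made-up three-dimensional vectors in place of real model output, of how such vectors are typically compared with cosine similarity:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two equal-length vectors:
    # dot(a, b) / (|a| * |b|), in [-1, 1] for nonzero vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embedding output
v1 = [0.1, 0.9, 0.2]
v2 = [0.1, 0.8, 0.3]
print(f"similarity: {cosine_similarity(v1, v2):.3f}")
```

Texts with similar meaning produce embeddings with similarity closer to 1; real embedding models emit vectors with hundreds of dimensions, but the comparison works the same way.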

What's Changed

  • Fixed errors when running qwen3-vl:235b and qwen3-vl:235b-instruct
  • Enable flash attention for Vulkan (currently needs to be built from source)
  • Add Vulkan memory detection for Intel GPU using DXGI+PDH
  • Ollama will now return tool call IDs from the /api/chat API
  • Fixed hanging due to CPU discovery
  • Ollama will now show login instructions when switching to a cloud model in interactive mode
  • Fix reading stale VRAM data
  • ollama run now works with embedding models

New Contributors

Full Changelog: ollama/ollama@v0.12.9...v0.12.10

v0.12.9

Compare Source

What's Changed

  • Fix performance regression on CPU-only systems

Full Changelog: ollama/ollama@v0.12.8...v0.12.9

v0.12.8

Compare Source

[Ollama Halloween background image]

What's Changed

  • qwen3-vl performance improvements, including flash attention support by default
  • qwen3-vl will now output less leading whitespace in the response when thinking
  • Fixed issue where deepseek-v3.1 thinking could not be disabled in Ollama's new app
  • Fixed issue where qwen3-vl would fail to interpret images with transparent backgrounds
  • Ollama will now stop running a model before removing it via ollama rm
  • Fixed issue where prompt processing would be slower on Ollama's engine
  • Ignore unsupported iGPUs when doing device discovery on Windows

New Contributors

Full Changelog: ollama/ollama@v0.12.7...v0.12.8

v0.12.7

Compare Source

[Ollama screenshot, 2025-10-29]


Configuration

📅 Schedule: (UTC)

  • Branch creation
    • At any time (no schedule defined)
  • Automerge
    • At any time (no schedule defined)

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate Bot commented Nov 15, 2024

ℹ Artifact update notice

File name: go.mod

In order to perform the update(s) described in the table above, Renovate ran the go get command, which resulted in the following additional change(s):

  • 6 additional dependencies were updated
  • The go directive was updated for compatibility reasons

Details:

Package Change
go 1.23.0 -> 1.24.1
golang.org/x/crypto v0.26.0 -> v0.33.0
golang.org/x/exp v0.0.0-20240808152545-0cdaa3abc0fa -> v0.0.0-20250218142911-aa4b98e5adaa
golang.org/x/sync v0.8.0 -> v0.11.0
golang.org/x/net v0.27.0 -> v0.35.0
golang.org/x/sys v0.23.0 -> v0.30.0
golang.org/x/text v0.17.0 -> v0.22.0

@renovate renovate Bot commented Mar 1, 2026

ℹ️ Artifact update notice

File name: go.mod

In order to perform the update(s) described in the table above, Renovate ran the go get command, which resulted in the following additional change(s):

  • 6 additional dependencies were updated
  • The go directive was updated for compatibility reasons

Details:

Package Change
go 1.23.0 -> 1.24.1
golang.org/x/crypto v0.26.0 -> v0.43.0
golang.org/x/exp v0.0.0-20240808152545-0cdaa3abc0fa -> v0.0.0-20250218142911-aa4b98e5adaa
golang.org/x/sync v0.8.0 -> v0.17.0
golang.org/x/net v0.27.0 -> v0.46.0
golang.org/x/sys v0.23.0 -> v0.37.0
golang.org/x/text v0.17.0 -> v0.30.0

