Release Notes:
- Changed the way the context window is set for Ollama: it is now configured at the provider level instead of per model.
---------
Co-authored-by: Conrad Irwin <conrad.irwin@gmail.com>
`docs/src/ai/llm-providers.md` (16 additions, 5 deletions)
````diff
@@ -423,14 +423,23 @@ models are available.
 
 #### Ollama Context Length {#ollama-context}
 
-Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models.
-Zed API requests to Ollama include this as the `num_ctx` parameter, but the default values do not exceed `16384` so users with ~16GB of RAM are able to use most models out of the box.
-
-See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.
+Zed API requests to Ollama include the context length as the `num_ctx` parameter. By default, Zed uses a context length of `4096` tokens for all Ollama models.
 
 > **Note**: Token counts displayed in the Agent Panel are only estimates and will differ from the model's native tokenizer.
 
-Depending on your hardware or use-case you may wish to limit or increase the context length for a specific model via settings.json:
+You can set a context length for all Ollama models using the `context_window` setting. This can also be configured in the Ollama provider settings UI:
+
+```json [settings]
+{
+  "language_models": {
+    "ollama": {
+      "context_window": 8192
+    }
+  }
+}
+```
+
+Alternatively, you can configure the context length per-model using the `max_tokens` field in `available_models`:
 
 ```json [settings]
 {
@@ -452,6 +461,8 @@ Depending on your hardware or use-case you may wish to limit or increase the con
 }
 ```
 
+> **Note**: If `context_window` is set, it overrides any per-model `max_tokens` values.
+
 If you specify a context length that is too large for your hardware, Ollama will log an error.
 You can watch these logs by running: `tail -f ~/.ollama/logs/ollama.log` (macOS) or `journalctl -u ollama -f` (Linux).
 Depending on the memory available on your machine, you may need to adjust the context length to a smaller value.
````
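To illustrate the override behavior this change introduces, the two settings can coexist in one `settings.json`. This is a sketch only: the model entry under `available_models` (the name `llama3.1:8b`, its display name, and its `max_tokens` value) is illustrative and not taken from the diff. With both present, the provider-level `context_window` takes precedence over the per-model `max_tokens`:

```json
{
  "language_models": {
    "ollama": {
      // Provider-level context length: applies to all Ollama models
      // and overrides per-model `max_tokens` when set.
      "context_window": 8192,
      "available_models": [
        {
          // Hypothetical model entry for illustration.
          "name": "llama3.1:8b",
          "display_name": "Llama 3.1 8B",
          "max_tokens": 16384
        }
      ]
    }
  }
}
```

With this configuration, requests to Ollama would carry `num_ctx: 8192`, not `16384`, since the provider-level value wins.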