1 change: 0 additions & 1 deletion MODEL-MATRIX.md
Original file line number Diff line number Diff line change
@@ -54,7 +54,6 @@
| Dependency | Version | Reason |
|-----------|---------|--------|
| transformers | `>=5.0.0` | VL models require v5; standard models work with v5 too |
| transformers (git fallback) | `3c2517727ce28a30f5044e01663ee204deb1cdbe` | For VL if v5 has issues |
| vLLM | `==0.14` | Latest stable with LFM support |
| vLLM (VL custom wheel) | commit `72506c98349d6bcd32b4e33eec7b5513453c1502` | VL support not yet upstream |
| llama.cpp | Latest via `brew install` or b7075+ binaries | |
5 changes: 0 additions & 5 deletions deployment/gpu-inference/transformers.mdx
@@ -1,6 +1,6 @@
---
title: "Transformers"
description: "Transformers is a library for inference and training of pretrained models."

---

<Tip>
@@ -27,11 +27,6 @@
uv pip install "transformers>=5.0.0" torch accelerate
```

> **Note:** Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
> ```bash
> uv pip install git+https://github.com/huggingface/transformers.git@0c9a72e4576fe4c84077f066e585129c97bfd4e6 torch accelerate
> ```

GPU is recommended for faster inference.

## Basic Usage
@@ -75,7 +70,7 @@

* **`model_id`**: Can be a Hugging Face model ID (e.g., `"LiquidAI/LFM2.5-1.2B-Instruct"`) or a local path
* **`device_map="auto"`**: Automatically distributes across available GPUs/CPU (requires `accelerate`). Use `device="cuda"` for single GPU or `device="cpu"` for CPU only
* **`dtype="bfloat16"`**: Recommended for modern GPUs. Use `"auto"` for automatic selection, or `"float32"` (slower, more memory)

<Accordion title="Click to see a pipeline() example">
The [`pipeline()`](https://huggingface.co/docs/transformers/en/main_classes/pipelines) interface provides a simpler API for text generation with automatic chat template handling. It wraps model loading and tokenization, making it ideal for quick prototyping.
@@ -115,7 +110,7 @@

* **`do_sample`** (`bool`): Enable sampling (`True`) or greedy decoding (`False`, default)
* **`temperature`** (`float`, default 1.0): Controls randomness (0.0 = deterministic, higher = more random). Typical range: 0.1-2.0
* **`top_p`** (`float`, default 1.0): Nucleus sampling - limits to tokens with cumulative probability ≤ top\_p. Typical range: 0.1-1.0

* **`top_k`** (`int`, default 50): Limits to top-k most probable tokens. Typical range: 1-100
* **`min_p`** (`float`): Minimum token probability threshold. Typical range: 0.01-0.2
* **`max_new_tokens`** (`int`): Maximum number of tokens to generate (preferred over `max_length`)
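
The filters above compose in a predictable order. As an illustration only (a toy reimplementation of the filtering logic, not the transformers internals — the function name and structure are ours, and real implementations scale `min_p` by the top token's probability), this sketch shows how `temperature`, `top_k`, `top_p`, and `min_p` each prune the candidate distribution before sampling:

```python
import math

def filter_logits(logits, top_k=50, top_p=1.0, min_p=0.0, temperature=1.0):
    """Return the renormalized sampling distribution after applying the
    filters described above (simplified: min_p is an absolute floor here)."""
    # Temperature scaling: < 1.0 sharpens the distribution, > 1.0 flattens it.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank tokens by probability, most likely first.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    keep = set()
    cumulative = 0.0
    for rank, i in enumerate(order):
        if rank >= top_k:        # top_k: hard cap on candidate count
            break
        if probs[i] < min_p:     # min_p: probability floor (simplified)
            break
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:  # top_p: nucleus cutoff, crossing token kept
            break

    # Zero out filtered tokens and renormalize the survivors.
    kept_total = sum(probs[i] for i in keep)
    return [probs[i] / kept_total if i in keep else 0.0 for i in range(len(probs))]
```

With `top_k=1` this collapses to greedy decoding, which is why `do_sample=False` and aggressive filtering behave similarly.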
7 changes: 0 additions & 7 deletions deployment/gpu-inference/vllm.mdx
@@ -1,6 +1,6 @@
---
title: "vLLM"
description: "vLLM is a high-throughput and memory-efficient inference engine for LLMs. It supports efficient serving with PagedAttention, continuous batching, and optimized CUDA kernels."

---

<Tip>
@@ -52,7 +52,7 @@
Control text generation behavior using [`SamplingParams`](https://docs.vllm.ai/en/v0.4.1/dev/sampling_params.html). Key parameters:

* **`temperature`** (`float`, default 1.0): Controls randomness (0.0 = deterministic, higher = more random). Typical range: 0.1-2.0
* **`top_p`** (`float`, default 1.0): Nucleus sampling - limits to tokens with cumulative probability ≤ top\_p. Typical range: 0.1-1.0

* **`top_k`** (`int`, default -1): Limits to top-k most probable tokens (-1 = disabled). Typical range: 1-100
* **`min_p`** (`float`): Minimum token probability threshold. Typical range: 0.01-0.2
* **`max_tokens`** (`int`): Maximum number of tokens to generate
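
To build intuition for the `temperature` parameter in particular, here is a minimal sketch (ours, for illustration — not vLLM code) of how it reshapes the token distribution before the other filters apply:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Toy demonstration: temperature < 1.0 sharpens the distribution
    (approaching greedy decoding), temperature > 1.0 flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

At `temperature=0.1` (the value used in the examples on this page) the top token dominates almost completely, which is why low temperatures give near-deterministic output even with `top_k` and `top_p` left wide open.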
@@ -186,7 +186,7 @@

### Installation for Vision Models

To use LFM Vision Models with vLLM, install the precompiled wheel along with the required transformers version:

```bash
VLLM_PRECOMPILED_WHEEL_COMMIT=72506c98349d6bcd32b4e33eec7b5513453c1502 VLLM_USE_PRECOMPILED=1 uv pip install git+https://github.com/vllm-project/vllm.git
@@ -196,13 +196,6 @@
uv pip install "transformers>=5.0.0" pillow
```

<Note>
Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
```bash
uv pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow
```
</Note>

This installs vLLM with the necessary changes for LFM Vision Model support. Once these changes are merged upstream, you'll be able to use the standard vLLM installation.

### Basic Usage
25 changes: 23 additions & 2 deletions lfm/models/lfm2-24b-a2b.mdx
@@ -4,7 +4,6 @@ description: "24B parameter Mixture-of-Experts model with 2B active parameters
---

import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";

<a href="/lfm/models/text-models" className="back-button">← Back to Text Models</a>
@@ -54,6 +53,28 @@ LFM2-24B-A2B is Liquid AI's largest Mixture-of-Experts model, combining 24B tota
<TextLlamacpp ggufRepo="LiquidAI/LFM2-24B-A2B-GGUF" samplingFlags="--temp 0.1 --top-k 50 --repeat-penalty 1.05"></TextLlamacpp>
</Tab>
<Tab title="vLLM">
<TextVllm modelId="LiquidAI/LFM2-24B-A2B" samplingParams="temperature=0.1, top_k=50, repetition_penalty=1.05, "></TextVllm>
<Warning>
LFM2-24B-A2B requires vLLM ≥0.15.1 and transformers ≥5.1.0. Install transformers on top of the vLLM image.
</Warning>

**Install:**

```bash
uv pip install "vllm>=0.15.1"
uv pip install "transformers>=5.1.0"
```

**Run:**

```python
from vllm import LLM, SamplingParams

llm = LLM(model="LiquidAI/LFM2-24B-A2B")

sampling_params = SamplingParams(temperature=0.1, top_k=50, repetition_penalty=1.05, max_tokens=512)

output = llm.chat([{"role": "user", "content": "What is machine learning?"}], sampling_params)
print(output[0].outputs[0].text)
```
</Tab>
</Tabs>
50 changes: 4 additions & 46 deletions notebooks/LFM2_Inference_with_Transformers.ipynb
@@ -25,9 +25,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!uv pip install \"transformers>=5.0.0\" \"torch==2.9.0\" accelerate"
]
"source": "!uv pip uninstall torchvision -y\n!uv pip install \"transformers>=5.0.0\" torchvision accelerate"
},
{
"cell_type": "markdown",
@@ -43,34 +41,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from transformers import AutoModelForCausalLM, AutoTokenizer\n",
"\n",
"# Load model and tokenizer\n",
"model_id = \"LiquidAI/LFM2.5-1.2B-Instruct\"\n",
"model = AutoModelForCausalLM.from_pretrained(\n",
" model_id,\n",
" device_map=\"auto\",\n",
" dtype=\"bfloat16\",\n",
")\n",
"tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
"\n",
"# Generate answer\n",
"prompt = \"What is C. elegans?\"\n",
"inputs = tokenizer.apply_chat_template(\n",
" [{\"role\": \"user\", \"content\": prompt}],\n",
" add_generation_prompt=True,\n",
" return_tensors=\"pt\",\n",
" return_dict=True,\n",
").to(model.device)\n",
"\n",
"output = model.generate(**inputs, max_new_tokens=512)\n",
"\n",
"# Decode only the newly generated tokens\n",
"input_length = inputs[\"input_ids\"].shape[1]\n",
"response = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\n",
"print(response)"
]
"source": "from transformers import AutoModelForCausalLM, AutoTokenizer\n\n# Load model and tokenizer\nmodel_id = \"LiquidAI/LFM2.5-1.2B-Instruct\"\nmodel = AutoModelForCausalLM.from_pretrained(\n model_id,\n device_map=\"auto\",\n dtype=\"bfloat16\",\n)\ntokenizer = AutoTokenizer.from_pretrained(model_id)\n\n# Generate answer\nprompt = \"What is C. elegans?\"\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": prompt}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n tokenize=True,\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, do_sample=True, temperature=0.1, top_k=50, repetition_penalty=1.05, max_new_tokens=512)\n\n# Decode only the newly generated tokens\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
},
{
"cell_type": "markdown",
@@ -86,7 +57,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "from transformers import GenerationConfig\n\ngeneration_config = GenerationConfig(\n do_sample=True,\n temperature=0.1,\n top_k=50,\n repetition_penalty=1.05,\n max_new_tokens=512,\n)\n\nprompt = \"Explain quantum computing in simple terms.\"\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": prompt}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, generation_config=generation_config)\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
"source": "from transformers import GenerationConfig\n\ngeneration_config = GenerationConfig(\n do_sample=True,\n temperature=0.1,\n top_k=50,\n repetition_penalty=1.05,\n max_new_tokens=512,\n)\n\nprompt = \"Explain quantum computing in simple terms.\"\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": prompt}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n tokenize=True,\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, generation_config=generation_config)\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
},
{
"cell_type": "markdown",
@@ -102,20 +73,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from transformers import TextStreamer\n",
"\n",
"prompt = \"Tell me a story about space exploration.\"\n",
"inputs = tokenizer.apply_chat_template(\n",
" [{\"role\": \"user\", \"content\": prompt}],\n",
" add_generation_prompt=True,\n",
" return_tensors=\"pt\",\n",
" return_dict=True,\n",
").to(model.device)\n",
"\n",
"streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)\n",
"output = model.generate(**inputs, streamer=streamer, max_new_tokens=512)"
]
"source": "from transformers import TextStreamer\n\nprompt = \"Tell me a story about space exploration.\"\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": prompt}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n tokenize=True,\n return_dict=True,\n).to(model.device)\n\nstreamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)\noutput = model.generate(**inputs, streamer=streamer, do_sample=True, temperature=0.1, top_k=50, repetition_penalty=1.05, max_new_tokens=512)"
},
{
"cell_type": "markdown",
2 changes: 1 addition & 1 deletion notebooks/quickstart_snippets.ipynb
@@ -32,7 +32,7 @@
"snippet": "text-transformers"
},
"outputs": [],
"source": "from transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_id = \"LiquidAI/LFM2.5-1.2B-Instruct\"\nmodel = AutoModelForCausalLM.from_pretrained(\n model_id,\n device_map=\"auto\",\n dtype=\"bfloat16\",\n)\ntokenizer = AutoTokenizer.from_pretrained(model_id)\n\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": \"What is machine learning?\"}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, do_sample=True, temperature=0.1, top_k=50, repetition_penalty=1.05, max_new_tokens=512)\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
"source": "from transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_id = \"LiquidAI/LFM2.5-1.2B-Instruct\"\nmodel = AutoModelForCausalLM.from_pretrained(\n model_id,\n device_map=\"auto\",\n dtype=\"bfloat16\",\n)\ntokenizer = AutoTokenizer.from_pretrained(model_id)\n\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": \"What is machine learning?\"}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n tokenize=True,\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, do_sample=True, temperature=0.1, top_k=50, repetition_penalty=1.05, max_new_tokens=512)\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
},
{
"cell_type": "code",
2 changes: 1 addition & 1 deletion notebooks/💧_LFM2_5_VL_SFT_with_TRL.ipynb
@@ -44,7 +44,7 @@
"outputId": "01173385-066c-4114-d217-6d6e1d91f12b"
},
"outputs": [],
"source": "!uv pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe datasets trl"
"source": "!uv pip install \"transformers>=5.0.0\" datasets trl"
},
{
"cell_type": "markdown",
22 changes: 5 additions & 17 deletions scripts/generate_snippets.py
@@ -45,7 +45,7 @@
"sections": [
{"type": "label", "text": "Install:"},
{"type": "code_block", "language": "bash",
"code": 'pip install "transformers>=5.0.0" torch accelerate'},
"code": 'uv pip install "transformers>=5.0.0" torch accelerate'},
{"type": "label", "text": "Download & Run:"},
{"type": "code_block", "language": "python",
"code": (
@@ -82,7 +82,7 @@
"sections": [
{"type": "label", "text": "Install:"},
{"type": "code_block", "language": "bash",
"code": "pip install vllm==0.14"},
"code": "uv pip install vllm==0.14"},
{"type": "label", "text": "Run:"},
{"type": "code_block", "language": "python",
"code": (
@@ -121,13 +121,7 @@
"sections": [
{"type": "label", "text": "Install:"},
{"type": "code_block", "language": "bash",
"code": 'pip install "transformers>=5.0.0" pillow torch'},
{"type": "note", "children": [
{"type": "text",
"text": "Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:"},
{"type": "code_block_margin", "language": "bash",
"code": "pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow torch"},
]},
"code": 'uv pip install "transformers>=5.0.0" pillow torch'},
{"type": "label", "text": "Download & Run:"},
{"type": "notebook_code", "language": "python"},
],
@@ -142,15 +136,9 @@
"text": "vLLM support for LFM Vision Models requires a specific version. Install from the custom source below."},
{"type": "label", "text": "Install:"},
{"type": "code_block", "language": "bash",
"code": "VLLM_PRECOMPILED_WHEEL_COMMIT=72506c98349d6bcd32b4e33eec7b5513453c1502 \\\n VLLM_USE_PRECOMPILED=1 \\\n pip install git+https://github.com/vllm-project/vllm.git"},
"code": "VLLM_PRECOMPILED_WHEEL_COMMIT=72506c98349d6bcd32b4e33eec7b5513453c1502 \\\n VLLM_USE_PRECOMPILED=1 \\\n uv pip install git+https://github.com/vllm-project/vllm.git"},
{"type": "code_block", "language": "bash",
"code": 'pip install "transformers>=5.0.0" pillow'},
{"type": "note", "children": [
{"type": "text",
"text": "Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:"},
{"type": "code_block_margin", "language": "bash",
"code": "pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow"},
]},
"code": 'uv pip install "transformers>=5.0.0" pillow'},
{"type": "label", "text": "Run:"},
{"type": "notebook_code", "language": "python"},
],
2 changes: 1 addition & 1 deletion snippets/quickstart/text-transformers.mdx
@@ -3,7 +3,7 @@ export const TextTransformers = ({ modelId, samplingParams }) => (
<p><strong>Install:</strong></p>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="bash">
<code language="bash">
{`pip install "transformers>=5.0.0" torch accelerate`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
{`uv pip install "transformers>=5.0.0" torch accelerate`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
<p><strong>Download & Run:</strong></p>
2 changes: 1 addition & 1 deletion snippets/quickstart/text-vllm.mdx
@@ -3,7 +3,7 @@ export const TextVllm = ({ modelId, samplingParams }) => (
<p><strong>Install:</strong></p>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="bash">
<code language="bash">
{`pip install vllm==0.14`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
{`uv pip install vllm==0.14`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
<p><strong>Run:</strong></p>
10 changes: 1 addition & 9 deletions snippets/quickstart/vl-transformers.mdx
@@ -3,17 +3,9 @@ export const VlTransformers = ({ modelId }) => (
<p><strong>Install:</strong></p>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="bash">
<code language="bash">
{`pip install "transformers>=5.0.0" pillow torch`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
{`uv pip install "transformers>=5.0.0" pillow torch`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
<Note>
Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8', marginTop: '0.5rem'}} language="bash">
<code language="bash">
{`pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow torch`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
</Note>
<p><strong>Download & Run:</strong></p>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="python">
<code language="python">
12 changes: 2 additions & 10 deletions snippets/quickstart/vl-vllm.mdx
@@ -8,22 +8,14 @@ vLLM support for LFM Vision Models requires a specific version. Install from the
<code language="bash">
{`VLLM_PRECOMPILED_WHEEL_COMMIT=72506c98349d6bcd32b4e33eec7b5513453c1502 \\
VLLM_USE_PRECOMPILED=1 \\
pip install git+https://github.com/vllm-project/vllm.git`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
uv pip install git+https://github.com/vllm-project/vllm.git`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="bash">
<code language="bash">
{`pip install "transformers>=5.0.0" pillow`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
{`uv pip install "transformers>=5.0.0" pillow`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
<Note>
Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8', marginTop: '0.5rem'}} language="bash">
<code language="bash">
{`pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
</Note>
<p><strong>Run:</strong></p>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="python">
<code language="python">