1 change: 0 additions & 1 deletion MODEL-MATRIX.md
Original file line number Diff line number Diff line change
@@ -54,7 +54,6 @@
| Dependency | Version | Reason |
|-----------|---------|--------|
| transformers | `>=5.0.0` | VL models require v5; standard models work with v5 too |
| transformers (git fallback) | `3c2517727ce28a30f5044e01663ee204deb1cdbe` | For VL if v5 has issues |
| vLLM | `==0.14` | Latest stable with LFM support |
| vLLM (VL custom wheel) | commit `72506c98349d6bcd32b4e33eec7b5513453c1502` | VL support not yet upstream |
| llama.cpp | Latest via `brew install` or b7075+ binaries | |
5 changes: 0 additions & 5 deletions deployment/gpu-inference/transformers.mdx
@@ -1,6 +1,6 @@
---
title: "Transformers"
description: "Transformers is a library for inference and training of pretrained models."

---

<Tip>
@@ -27,11 +27,6 @@
uv pip install "transformers>=5.0.0" torch accelerate
```

> **Note:** Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
> ```bash
> uv pip install git+https://github.com/huggingface/transformers.git@0c9a72e4576fe4c84077f066e585129c97bfd4e6 torch accelerate
> ```

GPU is recommended for faster inference.

## Basic Usage
@@ -75,7 +70,7 @@

* **`model_id`**: Can be a Hugging Face model ID (e.g., `"LiquidAI/LFM2.5-1.2B-Instruct"`) or a local path
* **`device_map="auto"`**: Automatically distributes across available GPUs/CPU (requires `accelerate`). Use `device="cuda"` for single GPU or `device="cpu"` for CPU only
* **`dtype="bfloat16"`**: Recommended for modern GPUs. Use `"auto"` for automatic selection, or `"float32"` (slower, more memory)

<Accordion title="Click to see a pipeline() example">
The [`pipeline()`](https://huggingface.co/docs/transformers/en/main_classes/pipelines) interface provides a simpler API for text generation with automatic chat template handling. It wraps model loading and tokenization, making it ideal for quick prototyping.
@@ -115,7 +110,7 @@

* **`do_sample`** (`bool`): Enable sampling (`True`) or greedy decoding (`False`, default)
* **`temperature`** (`float`, default 1.0): Controls randomness (0.0 = deterministic, higher = more random). Typical range: 0.1-2.0
* **`top_p`** (`float`, default 1.0): Nucleus sampling - limits to tokens with cumulative probability ≤ top\_p. Typical range: 0.1-1.0

* **`top_k`** (`int`, default 50): Limits to top-k most probable tokens. Typical range: 1-100
* **`min_p`** (`float`): Minimum token probability threshold. Typical range: 0.01-0.2
* **`max_new_tokens`** (`int`): Maximum number of tokens to generate (preferred over `max_length`)
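
The filters above compose in a predictable order. As an illustration only (a toy reimplementation of the filtering logic, not the transformers internals — the function name and structure are ours, and real implementations scale `min_p` by the top token's probability), this sketch shows how `temperature`, `top_k`, `top_p`, and `min_p` each prune the candidate distribution before sampling:

```python
import math

def filter_logits(logits, top_k=50, top_p=1.0, min_p=0.0, temperature=1.0):
    """Return the renormalized sampling distribution after applying the
    filters described above (simplified: min_p is an absolute floor here)."""
    # Temperature scaling: < 1.0 sharpens the distribution, > 1.0 flattens it.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank tokens by probability, most likely first.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    keep = set()
    cumulative = 0.0
    for rank, i in enumerate(order):
        if rank >= top_k:        # top_k: hard cap on candidate count
            break
        if probs[i] < min_p:     # min_p: probability floor (simplified)
            break
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:  # top_p: nucleus cutoff, crossing token kept
            break

    # Zero out filtered tokens and renormalize the survivors.
    kept_total = sum(probs[i] for i in keep)
    return [probs[i] / kept_total if i in keep else 0.0 for i in range(len(probs))]
```

With `top_k=1` this collapses to greedy decoding, which is why `do_sample=False` and aggressive filtering behave similarly.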
7 changes: 0 additions & 7 deletions deployment/gpu-inference/vllm.mdx
@@ -1,6 +1,6 @@
---
title: "vLLM"
description: "vLLM is a high-throughput and memory-efficient inference engine for LLMs. It supports efficient serving with PagedAttention, continuous batching, and optimized CUDA kernels."

---

<Tip>
@@ -52,7 +52,7 @@
Control text generation behavior using [`SamplingParams`](https://docs.vllm.ai/en/v0.4.1/dev/sampling_params.html). Key parameters:

* **`temperature`** (`float`, default 1.0): Controls randomness (0.0 = deterministic, higher = more random). Typical range: 0.1-2.0
* **`top_p`** (`float`, default 1.0): Nucleus sampling - limits to tokens with cumulative probability ≤ top\_p. Typical range: 0.1-1.0

* **`top_k`** (`int`, default -1): Limits to top-k most probable tokens (-1 = disabled). Typical range: 1-100
* **`min_p`** (`float`): Minimum token probability threshold. Typical range: 0.01-0.2
* **`max_tokens`** (`int`): Maximum number of tokens to generate
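
To build intuition for the `temperature` parameter in particular, here is a minimal sketch (ours, for illustration — not vLLM code) of how it reshapes the token distribution before the other filters apply:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Toy demonstration: temperature < 1.0 sharpens the distribution
    (approaching greedy decoding), temperature > 1.0 flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

At `temperature=0.1` (the value used in the examples on this page) the top token dominates almost completely, which is why low temperatures give near-deterministic output even with `top_k` and `top_p` left wide open.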
@@ -186,7 +186,7 @@

### Installation for Vision Models

To use LFM Vision Models with vLLM, install the precompiled wheel along with the required transformers version:

```bash
VLLM_PRECOMPILED_WHEEL_COMMIT=72506c98349d6bcd32b4e33eec7b5513453c1502 VLLM_USE_PRECOMPILED=1 uv pip install git+https://github.com/vllm-project/vllm.git
@@ -196,13 +196,6 @@
uv pip install "transformers>=5.0.0" pillow
```

<Note>
Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
```bash
uv pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow
```
</Note>

This installs vLLM with the necessary changes for LFM Vision Model support. Once these changes are merged upstream, you'll be able to use the standard vLLM installation.

### Basic Usage
25 changes: 23 additions & 2 deletions lfm/models/lfm2-24b-a2b.mdx
@@ -4,7 +4,6 @@ description: "24B parameter Mixture-of-Experts model with 2B active parameters
---

import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";

<a href="/lfm/models/text-models" className="back-button">← Back to Text Models</a>
@@ -54,6 +53,28 @@ LFM2-24B-A2B is Liquid AI's largest Mixture-of-Experts model, combining 24B tota
<TextLlamacpp ggufRepo="LiquidAI/LFM2-24B-A2B-GGUF" samplingFlags="--temp 0.1 --top-k 50 --repeat-penalty 1.05"></TextLlamacpp>
</Tab>
<Tab title="vLLM">
<TextVllm modelId="LiquidAI/LFM2-24B-A2B" samplingParams="temperature=0.1, top_k=50, repetition_penalty=1.05, "></TextVllm>
<Warning>
LFM2-24B-A2B requires vLLM ≥0.15.1 and transformers ≥5.1.0. Install transformers on top of the vLLM image.
</Warning>

**Install:**

```bash
uv pip install "vllm>=0.15.1"
uv pip install "transformers>=5.1.0"
```

**Run:**

```python
from vllm import LLM, SamplingParams

llm = LLM(model="LiquidAI/LFM2-24B-A2B")

sampling_params = SamplingParams(temperature=0.1, top_k=50, repetition_penalty=1.05, max_tokens=512)

output = llm.chat([{"role": "user", "content": "What is machine learning?"}], sampling_params)
print(output[0].outputs[0].text)
```
</Tab>
</Tabs>
50 changes: 4 additions & 46 deletions notebooks/LFM2_Inference_with_Transformers.ipynb
@@ -25,9 +25,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!uv pip install \"transformers>=5.0.0\" \"torch==2.9.0\" accelerate"
]
"source": "!uv pip uninstall torchvision -y\n!uv pip install \"transformers>=5.0.0\" torchvision accelerate"
},
{
"cell_type": "markdown",
@@ -43,34 +41,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from transformers import AutoModelForCausalLM, AutoTokenizer\n",
"\n",
"# Load model and tokenizer\n",
"model_id = \"LiquidAI/LFM2.5-1.2B-Instruct\"\n",
"model = AutoModelForCausalLM.from_pretrained(\n",
" model_id,\n",
" device_map=\"auto\",\n",
" dtype=\"bfloat16\",\n",
")\n",
"tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
"\n",
"# Generate answer\n",
"prompt = \"What is C. elegans?\"\n",
"inputs = tokenizer.apply_chat_template(\n",
" [{\"role\": \"user\", \"content\": prompt}],\n",
" add_generation_prompt=True,\n",
" return_tensors=\"pt\",\n",
" return_dict=True,\n",
").to(model.device)\n",
"\n",
"output = model.generate(**inputs, max_new_tokens=512)\n",
"\n",
"# Decode only the newly generated tokens\n",
"input_length = inputs[\"input_ids\"].shape[1]\n",
"response = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\n",
"print(response)"
]
"source": "from transformers import AutoModelForCausalLM, AutoTokenizer\n\n# Load model and tokenizer\nmodel_id = \"LiquidAI/LFM2.5-1.2B-Instruct\"\nmodel = AutoModelForCausalLM.from_pretrained(\n model_id,\n device_map=\"auto\",\n dtype=\"bfloat16\",\n)\ntokenizer = AutoTokenizer.from_pretrained(model_id)\n\n# Generate answer\nprompt = \"What is C. elegans?\"\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": prompt}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n tokenize=True,\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, do_sample=True, temperature=0.1, top_k=50, repetition_penalty=1.05, max_new_tokens=512)\n\n# Decode only the newly generated tokens\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
},
{
"cell_type": "markdown",
@@ -86,7 +57,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "from transformers import GenerationConfig\n\ngeneration_config = GenerationConfig(\n do_sample=True,\n temperature=0.1,\n top_k=50,\n repetition_penalty=1.05,\n max_new_tokens=512,\n)\n\nprompt = \"Explain quantum computing in simple terms.\"\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": prompt}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, generation_config=generation_config)\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
"source": "from transformers import GenerationConfig\n\ngeneration_config = GenerationConfig(\n do_sample=True,\n temperature=0.1,\n top_k=50,\n repetition_penalty=1.05,\n max_new_tokens=512,\n)\n\nprompt = \"Explain quantum computing in simple terms.\"\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": prompt}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n tokenize=True,\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, generation_config=generation_config)\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
},
{
"cell_type": "markdown",
@@ -102,20 +73,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from transformers import TextStreamer\n",
"\n",
"prompt = \"Tell me a story about space exploration.\"\n",
"inputs = tokenizer.apply_chat_template(\n",
" [{\"role\": \"user\", \"content\": prompt}],\n",
" add_generation_prompt=True,\n",
" return_tensors=\"pt\",\n",
" return_dict=True,\n",
").to(model.device)\n",
"\n",
"streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)\n",
"output = model.generate(**inputs, streamer=streamer, max_new_tokens=512)"
]
"source": "from transformers import TextStreamer\n\nprompt = \"Tell me a story about space exploration.\"\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": prompt}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n tokenize=True,\n return_dict=True,\n).to(model.device)\n\nstreamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)\noutput = model.generate(**inputs, streamer=streamer, do_sample=True, temperature=0.1, top_k=50, repetition_penalty=1.05, max_new_tokens=512)"
},
{
"cell_type": "markdown",
2 changes: 1 addition & 1 deletion notebooks/quickstart_snippets.ipynb
@@ -32,7 +32,7 @@
"snippet": "text-transformers"
},
"outputs": [],
"source": "from transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_id = \"LiquidAI/LFM2.5-1.2B-Instruct\"\nmodel = AutoModelForCausalLM.from_pretrained(\n model_id,\n device_map=\"auto\",\n dtype=\"bfloat16\",\n)\ntokenizer = AutoTokenizer.from_pretrained(model_id)\n\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": \"What is machine learning?\"}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, do_sample=True, temperature=0.1, top_k=50, repetition_penalty=1.05, max_new_tokens=512)\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
"source": "from transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_id = \"LiquidAI/LFM2.5-1.2B-Instruct\"\nmodel = AutoModelForCausalLM.from_pretrained(\n model_id,\n device_map=\"auto\",\n dtype=\"bfloat16\",\n)\ntokenizer = AutoTokenizer.from_pretrained(model_id)\n\ninputs = tokenizer.apply_chat_template(\n [{\"role\": \"user\", \"content\": \"What is machine learning?\"}],\n add_generation_prompt=True,\n return_tensors=\"pt\",\n tokenize=True,\n return_dict=True,\n).to(model.device)\n\noutput = model.generate(**inputs, do_sample=True, temperature=0.1, top_k=50, repetition_penalty=1.05, max_new_tokens=512)\ninput_length = inputs[\"input_ids\"].shape[1]\nresponse = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)\nprint(response)"
},
{
"cell_type": "code",
2 changes: 1 addition & 1 deletion notebooks/💧_LFM2_5_VL_SFT_with_TRL.ipynb
@@ -44,7 +44,7 @@
"outputId": "01173385-066c-4114-d217-6d6e1d91f12b"
},
"outputs": [],
"source": "!uv pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe datasets trl"
"source": "!uv pip install \"transformers>=5.0.0\" datasets trl"
},
{
"cell_type": "markdown",
22 changes: 5 additions & 17 deletions scripts/generate_snippets.py
@@ -45,7 +45,7 @@
"sections": [
{"type": "label", "text": "Install:"},
{"type": "code_block", "language": "bash",
"code": 'pip install "transformers>=5.0.0" torch accelerate'},
"code": 'uv pip install "transformers>=5.0.0" torch accelerate'},
{"type": "label", "text": "Download & Run:"},
{"type": "code_block", "language": "python",
"code": (
@@ -82,7 +82,7 @@
"sections": [
{"type": "label", "text": "Install:"},
{"type": "code_block", "language": "bash",
"code": "pip install vllm==0.14"},
"code": "uv pip install vllm==0.14"},
{"type": "label", "text": "Run:"},
{"type": "code_block", "language": "python",
"code": (
@@ -121,13 +121,7 @@
"sections": [
{"type": "label", "text": "Install:"},
{"type": "code_block", "language": "bash",
"code": 'pip install "transformers>=5.0.0" pillow torch'},
{"type": "note", "children": [
{"type": "text",
"text": "Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:"},
{"type": "code_block_margin", "language": "bash",
"code": "pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow torch"},
]},
"code": 'uv pip install "transformers>=5.0.0" pillow torch'},
{"type": "label", "text": "Download & Run:"},
{"type": "notebook_code", "language": "python"},
],
@@ -142,15 +136,9 @@
"text": "vLLM support for LFM Vision Models requires a specific version. Install from the custom source below."},
{"type": "label", "text": "Install:"},
{"type": "code_block", "language": "bash",
"code": "VLLM_PRECOMPILED_WHEEL_COMMIT=72506c98349d6bcd32b4e33eec7b5513453c1502 \\\n VLLM_USE_PRECOMPILED=1 \\\n pip install git+https://github.com/vllm-project/vllm.git"},
"code": "VLLM_PRECOMPILED_WHEEL_COMMIT=72506c98349d6bcd32b4e33eec7b5513453c1502 \\\n VLLM_USE_PRECOMPILED=1 \\\n uv pip install git+https://github.com/vllm-project/vllm.git"},
{"type": "code_block", "language": "bash",
"code": 'pip install "transformers>=5.0.0" pillow'},
{"type": "note", "children": [
{"type": "text",
"text": "Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:"},
{"type": "code_block_margin", "language": "bash",
"code": "pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow"},
]},
"code": 'uv pip install "transformers>=5.0.0" pillow'},
{"type": "label", "text": "Run:"},
{"type": "notebook_code", "language": "python"},
],
2 changes: 1 addition & 1 deletion snippets/quickstart/text-transformers.mdx
@@ -3,7 +3,7 @@ export const TextTransformers = ({ modelId, samplingParams }) => (
<p><strong>Install:</strong></p>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="bash">
<code language="bash">
{`pip install "transformers>=5.0.0" torch accelerate`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
{`uv pip install "transformers>=5.0.0" torch accelerate`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
<p><strong>Download & Run:</strong></p>
2 changes: 1 addition & 1 deletion snippets/quickstart/text-vllm.mdx
@@ -3,7 +3,7 @@ export const TextVllm = ({ modelId, samplingParams }) => (
<p><strong>Install:</strong></p>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="bash">
<code language="bash">
{`pip install vllm==0.14`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
{`uv pip install vllm==0.14`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
<p><strong>Run:</strong></p>
10 changes: 1 addition & 9 deletions snippets/quickstart/vl-transformers.mdx
@@ -3,17 +3,9 @@ export const VlTransformers = ({ modelId }) => (
<p><strong>Install:</strong></p>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="bash">
<code language="bash">
{`pip install "transformers>=5.0.0" pillow torch`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
{`uv pip install "transformers>=5.0.0" pillow torch`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
<Note>
Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8', marginTop: '0.5rem'}} language="bash">
<code language="bash">
{`pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow torch`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
</Note>
<p><strong>Download & Run:</strong></p>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="python">
<code language="python">
12 changes: 2 additions & 10 deletions snippets/quickstart/vl-vllm.mdx
@@ -8,22 +8,14 @@ vLLM support for LFM Vision Models requires a specific version. Install from the
<code language="bash">
{`VLLM_PRECOMPILED_WHEEL_COMMIT=72506c98349d6bcd32b4e33eec7b5513453c1502 \\
VLLM_USE_PRECOMPILED=1 \\
pip install git+https://github.com/vllm-project/vllm.git`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
uv pip install git+https://github.com/vllm-project/vllm.git`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="bash">
<code language="bash">
{`pip install "transformers>=5.0.0" pillow`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
{`uv pip install "transformers>=5.0.0" pillow`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
<Note>
Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8', marginTop: '0.5rem'}} language="bash">
<code language="bash">
{`pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow`.split('\n').map((line, i) => <span key={i} className="line">{line}{'\n'}</span>)}
</code>
</pre>
</Note>
<p><strong>Run:</strong></p>
<pre className="shiki shiki-themes github-light github-dark" style={{backgroundColor: '#fff', '--shiki-dark-bg': '#24292e', color: '#24292e', '--shiki-dark': '#e1e4e8'}} language="python">
<code language="python">