Track major AI platform updates through RSSHub by BetterAndBetterII · Pull Request #21742 · DIYgod/RSSHub

BetterAndBetterII · 2026-04-15T04:44:02Z

These staged additions introduce a batch of routes for release notes, changelogs, blogs, and research updates across major AI vendors and labs so their platform changes can be followed from RSSHub feeds.

Constraint: Commit only the currently staged route additions and leave unrelated working tree changes untouched
Constraint: Keep the batch scoped to route and namespace files without extra dependency or framework changes
Rejected: Split the staged routes into multiple commits | user asked to commit the staged content as one batch
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Re-review each new route against RSSHub route conventions before opening a PR, especially maintainers, radar targets, and fallback parsing logic
Tested: Verified staged file set before commit
Not-tested: Lint, typecheck, route runtime behavior, and upstream sync/push

Involved Issue / 该 PR 相关 Issue

#20556

Example for the Proposed Route(s) / 路由地址示例

/anthropic/release-notes
/anthropic/science
/arena/blog
/artificialanalysis/changelog
/epoch/gradient-updates
/google/ai
/google/gemini-api/changelog
/google/gemini/blog
/kimi/changelog
/metr/notes
/metr/research
/metr/updates
/minimax/news
/moonshot/blog
/openai/changelog
/openai/model-release-notes
/qwen/blog
/qwen/news
/stanford/helm/updates
/zai/company
/zai/release-notes

New RSS Route Checklist / 新 RSS 路由检查表

- New Route / 新的路由
  - Follows Script Standard / 跟随路由规范
- Anti-bot or rate limit / 反爬/频率限制
  - If yes, does your code reflect this sign? / 如果有, 是否有对应的措施?
- Date and time / 日期和时间
  - Parsed / 可以解析
  - Correct time zone / 时区正确
- New package added / 添加了新的包
- Puppeteer


+            }
+            const link = new URL(href, baseUrl).href;
+            const pathDepth = new URL(link).pathname.split('/').filter(Boolean).length;
+            if (!link.startsWith(baseUrl) || !/\/[^/]+\/$/.test(link) || pathDepth < 4) {
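
The fragment above appears to screen candidate links before they are fetched. A minimal self-contained sketch of the same three conditions follows, assuming a hypothetical helper name `isArticleLink` and an illustrative `https://example.com/blog/` base URL, neither of which appears in the PR. The fragment's skip condition is inverted here to return `true` for links that would be kept.

```typescript
// Hypothetical helper, not part of the PR: restates the diff fragment's check as a predicate.
function isArticleLink(href: string, baseUrl: string): boolean {
    // Resolve relative hrefs against the base URL.
    const link = new URL(href, baseUrl).href;
    // Count non-empty path segments, e.g. "/a/b/c/d/" has 4.
    const pathDepth = new URL(link).pathname.split('/').filter(Boolean).length;
    // Keep links that stay under baseUrl, end in "/<segment>/", and are at least 4 segments deep.
    return link.startsWith(baseUrl) && /\/[^/]+\/$/.test(link) && pathDepth >= 4;
}

// Illustrative base URL (an assumption, not taken from the PR).
const exampleBase = 'https://example.com/blog/';
const keptLink = isArticleLink('topic/2026/post/', exampleBase);
const offSiteLink = isArticleLink('/other/a/b/c/', exampleBase);
const shallowLink = isArticleLink('topic/', exampleBase);
const noTrailingSlash = isArticleLink('topic/2026/post', exampleBase);
```

Whether a depth threshold of 4 suits every target site is not something the fragment alone settles, so the constant is best treated as site-specific.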


+        list,
+        (entry) =>
+            cache.tryGet(entry.link!, async () => {
+                if (!entry.link?.startsWith(baseUrl)) {


github-actions · 2026-04-15T04:49:01Z

Auto Review

[Rule 21] lib/routes/anthropic/release-notes.ts:9, lib/routes/google/blog-utils.ts:9, lib/routes/google/gemini-api-changelog.ts:8, lib/routes/openai/changelog.ts:9, lib/routes/zai/company.ts:8: Uses hardcoded browser UA strings instead of RSSHub's built-in config.trueUA.
[Rule 19] lib/routes/anthropic/science.ts:29: Uses custom limit query parameter. Remove ctx.req.query("limit") and rely on RSSHub built-in limit common parameter.
[Rule 14] lib/routes/anthropic/release-notes.ts:52, lib/routes/kimi/changelog.ts:48, lib/routes/zai/release-notes.ts:49: Uses parseDate(title) where title contains non-date text. Extract actual date or remove pubDate if no valid date can be parsed.
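
For Rule 14, one possible approach is to pull the date out of the title before parsing it, and to leave `pubDate` unset when nothing matches. The sketch below is an illustration under stated assumptions, not the PR's code: `extractDateFromTitle` is a hypothetical helper, it only recognizes English `Month D, YYYY` strings, and it interprets them as midnight UTC.

```typescript
// Hypothetical helper, not part of the PR or of RSSHub.
const MONTH_NAMES = [
    'january', 'february', 'march', 'april', 'may', 'june',
    'july', 'august', 'september', 'october', 'november', 'december',
];

function extractDateFromTitle(title: string): Date | undefined {
    // Look for "<word> <1-2 digits>, <4 digits>" anywhere in the title.
    const match = title.match(/\b([A-Za-z]+)\s+(\d{1,2}),\s*(\d{4})\b/);
    if (!match) {
        return undefined;
    }
    const month = MONTH_NAMES.indexOf(match[1].toLowerCase());
    if (month === -1) {
        return undefined; // the word before the day is not a month name
    }
    const day = Number(match[2]);
    const year = Number(match[3]);
    // Assumption: treat the date as midnight UTC, since the title carries no time zone.
    const date = new Date(Date.UTC(year, month, day));
    // Reject overflowed dates such as "February 31, 2026".
    if (date.getUTCMonth() !== month || date.getUTCDate() !== day) {
        return undefined;
    }
    return date;
}

const plainTitle = extractDateFromTitle('April 14, 2026');
const mixedTitle = extractDateFromTitle('Released April 9, 2026 (beta)');
const noDateTitle = extractDateFromTitle('Claude Managed Agents');
const impossibleDate = extractDateFromTitle('February 31, 2026');
```

A route could then set `pubDate` only when the helper returns a value. In the generated feed below, the item titled `April 14, 2026` carries a `pubDate` of `Mon, 13 Apr 2026 16:00:00 GMT`, which is consistent with the date having been read as midnight in a UTC+8 zone; parsing as UTC, as sketched here, is one way to avoid that shift, though the source site's intended time zone is not stated.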

github-actions · 2026-04-15T04:50:09Z

Successfully generated as following:

http://localhost:1200/anthropic/release-notes - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Anthropic Platform Release Notes</title>
    <link>https://platform.claude.com/docs/en/release-notes/overview</link>
    <atom:link href="http://localhost:1200/anthropic/release-notes" rel="self" type="application/rss+xml"></atom:link>
    <description>Updates to the Claude Platform, including the Claude API, client SDKs, and the Claude Console. - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Wed, 15 Apr 2026 04:49:33 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>April 14, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We announced the deprecation of the Claude Sonnet 4 model (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;claude-sonnet-4-20250514&lt;/code&gt;) and the Claude Opus 4 model (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;claude-opus-4-20250514&lt;/code&gt;), with retirement on the Claude API scheduled for June 15, 2026. We recommend migrating to &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models/overview#latest-models-comparison&quot;&gt;Claude Sonnet 4.6&lt;/a&gt; and &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models/overview#latest-models-comparison&quot;&gt;Claude Opus 4.6&lt;/a&gt; respectively. Read more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/model-deprecations&quot;&gt;model deprecations&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#april-14-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-april-14-2026</guid>
      <pubDate>Mon, 13 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>April 9, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/advisor-tool&quot;&gt;advisor tool&lt;/a&gt; in public beta. Pair a faster executor model with a higher-intelligence advisor model that provides strategic guidance mid-generation, so long-horizon agentic workloads get close to advisor-solo quality while the bulk of token generation happens at executor-model rates. Include the beta header &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;advisor-tool-2026-03-01&lt;/code&gt; in your requests.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#april-9-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-april-9-2026</guid>
      <pubDate>Wed, 08 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>April 8, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;strong class=&quot;font-semibold&quot;&gt;Claude Managed Agents&lt;/strong&gt; in public beta, a fully managed agent harness for running Claude as an autonomous agent with secure sandboxing, built-in tools, and server-sent event streaming. Create agents, configure containers, and run sessions through the API. All endpoints require the &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;managed-agents-2026-04-01&lt;/code&gt; beta header. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/managed-agents/overview&quot;&gt;Claude Managed Agents overview&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched the &lt;strong class=&quot;font-semibold&quot;&gt;&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;ant&lt;/code&gt; CLI&lt;/strong&gt;, a command-line client for the Claude API that enables faster interaction with the Claude API, native integration with Claude Code, and versioning of API resources in YAML files. Learn more in the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/api/sdks/cli&quot;&gt;CLI reference&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#april-8-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-april-8-2026</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>April 7, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We announced &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://anthropic.com/glasswing&quot;&gt;Claude Mythos Preview&lt;/a&gt; is available as a gated research preview for defensive cybersecurity work as part of &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://anthropic.com/glasswing&quot;&gt;Project Glasswing&lt;/a&gt;. Access is invitation-only.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;The &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/api/messages&quot;&gt;Messages API&lt;/a&gt; is now available on Amazon Bedrock as a research preview. The new Claude in Amazon Bedrock endpoint at &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;/anthropic/v1/messages&lt;/code&gt; uses the same request shape as the first-party Claude API and runs on AWS-managed infrastructure with zero operator access. Available in &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;us-east-1&lt;/code&gt;; contact your Anthropic account executive to request access. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/claude-in-amazon-bedrock-research-preview&quot;&gt;Claude in Amazon Bedrock (research preview)&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#april-7-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-april-7-2026</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>March 30, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve raised the &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;max_tokens&lt;/code&gt; cap to 300k on the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/batch-processing#extended-output-beta&quot;&gt;Message Batches API&lt;/a&gt; for Claude Opus 4.6 and Sonnet 4.6. Include the &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;output-300k-2026-03-24&lt;/code&gt; beta header to generate longer single-turn outputs for long-form content, structured data, and large code generation tasks.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;re retiring the 1M token context window beta for Claude Sonnet 4.5 and Claude Sonnet 4 on &lt;strong class=&quot;font-semibold&quot;&gt;April 30, 2026&lt;/strong&gt;. After that date, the &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;context-1m-2025-08-07&lt;/code&gt; beta header will have no effect on these models, and requests that exceed the standard 200k-token context window will return an error. To continue using 1M context windows, migrate to &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models/overview#latest-models-comparison&quot;&gt;Claude Sonnet 4.6&lt;/a&gt; or &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models/overview#latest-models-comparison&quot;&gt;Claude Opus 4.6&lt;/a&gt;, which support the full 1M token context window at standard pricing with no beta header required.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#march-30-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-march-30-2026</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>March 18, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve added model capability fields to the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/api/models/list&quot;&gt;Models API&lt;/a&gt;. &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;GET /v1/models&lt;/code&gt; and &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;GET /v1/models/{model_id}&lt;/code&gt; now return &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;max_input_tokens&lt;/code&gt;, &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;max_tokens&lt;/code&gt;, and a &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;capabilities&lt;/code&gt; object. Query the API to discover what each model supports.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#march-18-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-march-18-2026</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>March 16, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched the &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;display&lt;/code&gt; field for extended thinking, letting you omit thinking content from responses for faster streaming. Set &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;thinking.display: &quot;omitted&quot;&lt;/code&gt; to receive thinking blocks with an empty &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;thinking&lt;/code&gt; field and the &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;signature&lt;/code&gt; preserved for multi-turn continuity. Billing is unchanged. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/extended-thinking#controlling-thinking-display&quot;&gt;Controlling thinking display&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#march-16-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-march-16-2026</guid>
      <pubDate>Sun, 15 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>March 13, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;The &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/context-windows&quot;&gt;1M token context window&lt;/a&gt; is now generally available for Claude Opus 4.6 and Sonnet 4.6 at standard pricing. Requests over 200k tokens work automatically for these models with no beta header required. The 1M token context window remains in beta for Claude Sonnet 4.5 and Sonnet 4.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve removed the dedicated 1M rate limits for all supported models. Your standard account limits now apply across every context length.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve raised the media limit from 100 to 600 images or PDF pages per request when using the 1M token context window.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#march-13-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-march-13-2026</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>February 19, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;strong class=&quot;font-semibold&quot;&gt;automatic caching&lt;/strong&gt; for the Messages API. Add a single &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;cache_control&lt;/code&gt; field to your request body and the system automatically caches the last cacheable block, moving the cache point forward as conversations grow. No manual breakpoint management required. Works alongside existing block-level cache control for fine-grained optimization. Available on the Claude API and Azure AI Foundry (preview). Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/prompt-caching#automatic-caching&quot;&gt;Prompt caching&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve retired the Claude Sonnet 3.7 model (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;claude-3-7-sonnet-20250219&lt;/code&gt;) and the Claude Haiku 3.5 model (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;claude-3-5-haiku-20241022&lt;/code&gt;). All requests to these models will now return an error. We recommend upgrading to &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models/overview#latest-models-comparison&quot;&gt;Claude Sonnet 4.6&lt;/a&gt; and &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models/overview#latest-models-comparison&quot;&gt;Claude Haiku 4.5&lt;/a&gt; respectively. Researchers can request ongoing access through the &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://support.claude.com/en/articles/9125743-what-is-the-external-researcher-access-program&quot;&gt;External Researcher Access Program&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We announced the deprecation of the Claude Haiku 3 model (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;claude-3-haiku-20240307&lt;/code&gt;), with retirement scheduled for April 19, 2026. We recommend migrating to &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models/overview#latest-models-comparison&quot;&gt;Claude Haiku 4.5&lt;/a&gt;. Read more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/model-deprecations&quot;&gt;model deprecations&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#february-19-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-february-19-2026</guid>
      <pubDate>Wed, 18 Feb 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>February 17, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://www.anthropic.com/news/claude-sonnet-4-6&quot;&gt;Claude Sonnet 4.6&lt;/a&gt;, our latest balanced model combining speed and intelligence for everyday tasks. Sonnet 4.6 delivers improved agentic search performance while consuming fewer tokens. Sonnet 4.6 supports &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/extended-thinking&quot;&gt;extended thinking&lt;/a&gt; and a &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/context-windows&quot;&gt;1M token context window&lt;/a&gt; (beta). See &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models&quot;&gt;Models &amp;amp; Pricing&lt;/a&gt; for details.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;API &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/code-execution-tool&quot;&gt;code execution&lt;/a&gt; is now &lt;strong class=&quot;font-semibold&quot;&gt;free when used with web search or web fetch&lt;/strong&gt;. Sandboxed code execution improves model capability and token efficiency. See the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/code-execution-tool#usage-and-pricing&quot;&gt;pricing details&lt;/a&gt; for standalone usage.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;The &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-search-tool&quot;&gt;web search tool&lt;/a&gt; and &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/programmatic-tool-calling&quot;&gt;programmatic tool calling&lt;/a&gt; are now generally available (no beta header required). Web search and web fetch now support &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-search-tool#dynamic-filtering&quot;&gt;dynamic filtering&lt;/a&gt;, which uses code execution to filter results before they reach the context window for better performance and reduced token cost.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;The &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/code-execution-tool&quot;&gt;code execution tool&lt;/a&gt;, &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-fetch-tool&quot;&gt;web fetch tool&lt;/a&gt;, &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool&quot;&gt;tool search tool&lt;/a&gt;, &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/define-tools#providing-tool-use-examples&quot;&gt;tool use examples&lt;/a&gt;, and &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool&quot;&gt;memory tool&lt;/a&gt; are now generally available (no beta header required).&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#february-17-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-february-17-2026</guid>
      <pubDate>Mon, 16 Feb 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>February 7, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/fast-mode&quot;&gt;fast mode&lt;/a&gt; in research preview for Opus 4.6, providing significantly faster output token generation via the &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;speed&lt;/code&gt; parameter. Fast mode is up to 2.5x as fast at premium pricing. Interested customers should join the &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://claude.com/fast-mode&quot;&gt;waitlist&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#february-7-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-february-7-2026</guid>
      <pubDate>Fri, 06 Feb 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>February 5, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://www.anthropic.com/news/claude-opus-4-6&quot;&gt;Claude Opus 4.6&lt;/a&gt;, our most intelligent model for complex agentic tasks and long-horizon work. Opus 4.6 recommends &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking&quot;&gt;adaptive thinking&lt;/a&gt; (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;thinking: {type: &quot;adaptive&quot;}&lt;/code&gt;); manual thinking (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;type: &quot;enabled&quot;&lt;/code&gt; with &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;budget_tokens&lt;/code&gt;) is deprecated. Opus 4.6 does not support prefilling assistant messages. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-6&quot;&gt;What&#39;s new in Claude 4.6&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;The &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/effort&quot;&gt;effort parameter&lt;/a&gt; is now generally available (no beta header required) and supports Claude Opus 4.6. Effort replaces &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;budget_tokens&lt;/code&gt; for controlling thinking depth on new models.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/compaction&quot;&gt;compaction API&lt;/a&gt; in beta, providing server-side context summarization for effectively infinite conversations. Available on Opus 4.6.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve introduced &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/data-residency&quot;&gt;data residency controls&lt;/a&gt;, allowing you to specify where model inference runs with the &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;inference_geo&lt;/code&gt; parameter. US-only inference is available at 1.1x pricing for models released after February 1, 2026.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;The &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/context-windows&quot;&gt;1M token context window&lt;/a&gt; is now available in beta for Claude Opus 4.6, in addition to Sonnet 4.5 and Sonnet 4. Long context pricing applies to requests exceeding 200k input tokens.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#february-5-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-february-5-2026</guid>
      <pubDate>Wed, 04 Feb 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>January 29, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;&lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/structured-outputs&quot;&gt;Structured outputs&lt;/a&gt; are now generally available on the Claude API for Claude Sonnet 4.5, Claude Opus 4.5, and Claude Haiku 4.5. GA includes expanded schema support, improved grammar compilation latency, and a simplified integration path with no beta header required. The &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;output_format&lt;/code&gt; parameter has moved to &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;output_config.format&lt;/code&gt;. Existing beta users can continue using the beta header during the transition period. Structured outputs remain in public beta on Amazon Bedrock and Microsoft Foundry.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#january-29-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-january-29-2026</guid>
      <pubDate>Wed, 28 Jan 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>January 12, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;console.anthropic.com&lt;/code&gt; now redirects to &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;platform.claude.com&lt;/code&gt;. The Claude Console has moved to its new home as part of our Claude brand consolidation. Existing bookmarks and links will continue working via automatic redirect. For more details, see the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/release-notes/overview#september-16-2025&quot;&gt;September 16, 2025 announcement&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#january-12-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-january-12-2026</guid>
      <pubDate>Sun, 11 Jan 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>January 5, 2026</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve retired the Claude Opus 3 model (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;claude-3-opus-20240229&lt;/code&gt;). All requests to this model will now return an error. We recommend upgrading to &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models/overview#latest-models-comparison&quot;&gt;Claude Opus 4.5&lt;/a&gt;, which offers significantly improved intelligence at a third of the cost. Researchers can request ongoing access to Claude Opus 3 on the API through the &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://support.claude.com/en/articles/9125743-what-is-the-external-researcher-access-program&quot;&gt;External Researcher Access Program&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#january-5-2026</link>
      <guid isPermaLink="false">anthropic-release-notes-january-5-2026</guid>
      <pubDate>Sun, 04 Jan 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>December 19, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We announced the deprecation of the Claude Haiku 3.5 model. Read more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/model-deprecations&quot;&gt;Model deprecations&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#december-19-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-december-19-2025</guid>
      <pubDate>Thu, 18 Dec 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>December 4, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;&lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/structured-outputs&quot;&gt;Structured outputs&lt;/a&gt; now support Claude Haiku 4.5.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#december-4-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-december-4-2025</guid>
      <pubDate>Wed, 03 Dec 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>November 24, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://www.anthropic.com/news/claude-opus-4-5&quot;&gt;Claude Opus 4.5&lt;/a&gt;, our most intelligent model combining maximum capability with practical performance. Ideal for complex specialized tasks, professional software engineering, and advanced agents. Features step-change improvements in vision, coding, and computer use at a more accessible price point than previous Opus models. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models&quot;&gt;Models overview&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/programmatic-tool-calling&quot;&gt;programmatic tool calling&lt;/a&gt; in public beta, allowing Claude to call tools from within code execution to reduce latency and token usage in multi-tool workflows.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool&quot;&gt;tool search tool&lt;/a&gt; in public beta, enabling Claude to dynamically discover and load tools on-demand from large tool catalogs.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/effort&quot;&gt;effort parameter&lt;/a&gt; in public beta for Claude Opus 4.5, allowing you to control token usage by trading off between response thoroughness and efficiency.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve added &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/context-editing#client-side-compaction-sdk&quot;&gt;client-side compaction&lt;/a&gt; to our Python and TypeScript SDKs, automatically managing conversation context through summarization when using &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;tool_runner&lt;/code&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#november-24-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-november-24-2025</guid>
      <pubDate>Sun, 23 Nov 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>November 21, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;Search result content blocks are now generally available on Amazon Bedrock. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/search-results&quot;&gt;Search results&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#november-21-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-november-21-2025</guid>
      <pubDate>Thu, 20 Nov 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>November 19, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched a &lt;strong class=&quot;font-semibold&quot;&gt;new documentation platform&lt;/strong&gt; at &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://platform.claude.com/docs&quot;&gt;platform.claude.com/docs&lt;/a&gt;. Our documentation now lives side by side with the Claude Console, providing a unified developer experience. The previous docs site at docs.claude.com will redirect to the new location.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#november-19-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-november-19-2025</guid>
      <pubDate>Tue, 18 Nov 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>November 18, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;strong class=&quot;font-semibold&quot;&gt;Claude in Microsoft Foundry&lt;/strong&gt;, bringing Claude models to Azure customers with Azure billing and OAuth authentication. Access the full Messages API including extended thinking, prompt caching (5-minute and 1-hour), PDF support, Files API, Agent Skills, and tool use. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry&quot;&gt;Claude in Microsoft Foundry&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#november-18-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-november-18-2025</guid>
      <pubDate>Mon, 17 Nov 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>November 14, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/structured-outputs&quot;&gt;structured outputs&lt;/a&gt; in public beta, providing guaranteed schema conformance for Claude&#39;s responses. Use JSON outputs for structured data responses or strict tool use for validated tool inputs. Available for Claude Sonnet 4.5 and Claude Opus 4.1. To enable, use the beta header &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;structured-outputs-2025-11-13&lt;/code&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#november-14-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-november-14-2025</guid>
      <pubDate>Thu, 13 Nov 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>October 28, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We announced the deprecation of the Claude Sonnet 3.7 model. Read more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/model-deprecations&quot;&gt;Model deprecations&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve retired the Claude Sonnet 3.5 models. All requests to these models will now return an error.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve expanded context editing with thinking block clearing (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;clear_thinking_20251015&lt;/code&gt;), enabling automatic management of thinking blocks. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/context-editing&quot;&gt;Context editing&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#october-28-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-october-28-2025</guid>
      <pubDate>Mon, 27 Oct 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>October 16, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills&quot;&gt;Agent Skills&lt;/a&gt; (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;skills-2025-10-02&lt;/code&gt; beta), a new way to extend Claude&#39;s capabilities. Skills are organized folders of instructions, scripts, and resources that Claude loads dynamically to perform specialized tasks. The initial release includes:
        &lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;&lt;strong class=&quot;font-semibold&quot;&gt;Anthropic-managed Skills&lt;/strong&gt;: Pre-built Skills for working with PowerPoint (.pptx), Excel (.xlsx), Word (.docx), and PDF files&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;&lt;strong class=&quot;font-semibold&quot;&gt;Custom Skills&lt;/strong&gt;: Upload your own Skills via the Skills API (&lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;/v1/skills&lt;/code&gt; endpoints) to package domain expertise and organizational workflows&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;Skills require the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/code-execution-tool&quot;&gt;code execution tool&lt;/a&gt; to be enabled&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview&quot;&gt;Agent Skills&lt;/a&gt; and &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/api/skills/create-skill&quot;&gt;API reference&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
        &lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#october-16-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-october-16-2025</guid>
      <pubDate>Wed, 15 Oct 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>October 15, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://www.anthropic.com/news/claude-haiku-4-5&quot;&gt;Claude Haiku 4.5&lt;/a&gt;, our fastest and most intelligent Haiku model with near-frontier performance. Ideal for real-time applications, high-volume processing, and cost-sensitive deployments requiring strong reasoning. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models&quot;&gt;Models overview&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#october-15-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-october-15-2025</guid>
      <pubDate>Tue, 14 Oct 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>September 29, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://www.anthropic.com/news/claude-sonnet-4-5&quot;&gt;Claude Sonnet 4.5&lt;/a&gt;, our best model for complex agents and coding, with the highest intelligence across most tasks. Learn more in the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/models/overview&quot;&gt;models overview&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve introduced &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/about-claude/pricing#third-party-platform-pricing&quot;&gt;global endpoint pricing&lt;/a&gt; for AWS Bedrock and Google Vertex AI. The Claude API (1P) pricing is unaffected.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve introduced a new stop reason &lt;code class=&quot;relative inline bg-bg-300 px-2 py-0.5 rounded text-sm font-mono break-words box-decoration-clone&quot;&gt;model_context_window_exceeded&lt;/code&gt; that allows you to request the maximum possible tokens without calculating input size. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/handling-stop-reasons&quot;&gt;Handling stop reasons&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched the memory tool in beta, enabling Claude to store and consult information across conversations. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool&quot;&gt;Memory tool&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched context editing in beta, providing strategies to automatically manage conversation context. The initial release supports clearing older tool results and calls when approaching token limits. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/context-editing&quot;&gt;Context editing&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#september-29-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-september-29-2025</guid>
      <pubDate>Sun, 28 Sep 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>September 17, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched tool helpers in beta for the Python and TypeScript SDKs, simplifying tool creation and execution with type-safe input validation and a tool runner for automated tool handling in conversations. For details, see the documentation for &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://github.com/anthropics/anthropic-sdk-python/blob/main/tools.md&quot;&gt;the Python SDK&lt;/a&gt; and &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://github.com/anthropics/anthropic-sdk-typescript/blob/main/helpers.md#tool-helpers&quot;&gt;the TypeScript SDK&lt;/a&gt;.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#september-17-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-september-17-2025</guid>
      <pubDate>Tue, 16 Sep 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>September 16, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve unified our developer offerings under the Claude brand. You should see updated naming and URLs across our platform and documentation, but &lt;strong class=&quot;font-semibold&quot;&gt;our developer interfaces will remain the same&lt;/strong&gt;. Here are some notable changes:
        &lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;Claude Console (&lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://console.anthropic.com/&quot;&gt;console.anthropic.com&lt;/a&gt;) → Claude Console (&lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://platform.claude.com/&quot;&gt;platform.claude.com&lt;/a&gt;). The console will be available at both URLs until January 12, 2026. After that date, &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://console.anthropic.com/&quot;&gt;console.anthropic.com&lt;/a&gt; will automatically redirect to &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://platform.claude.com/&quot;&gt;platform.claude.com&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;Anthropic Docs (&lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://docs.claude.com/&quot;&gt;docs.claude.com&lt;/a&gt;) → Claude Docs (&lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://docs.claude.com/&quot;&gt;docs.claude.com&lt;/a&gt;)&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;Anthropic Help Center (&lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://support.claude.com/&quot;&gt;support.claude.com&lt;/a&gt;) → Claude Help Center (&lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; href=&quot;https://support.claude.com/&quot;&gt;support.claude.com&lt;/a&gt;)&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;API endpoints, headers, environment variables, and SDKs remain the same. Your existing integrations will continue working without any changes.&lt;/li&gt;
        &lt;/ul&gt;
        &lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#september-16-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-september-16-2025</guid>
      <pubDate>Mon, 15 Sep 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>September 10, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched the web fetch tool in beta, allowing Claude to retrieve full content from specified web pages and PDF documents. Learn more in &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-fetch-tool&quot;&gt;Web fetch tool&lt;/a&gt;.&lt;/li&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We&#39;ve launched the &lt;a class=&quot;inline-link&quot; href=&quot;https://platform.claude.com/docs/en/build-with-claude/claude-code-analytics-api&quot;&gt;Claude Code Analytics API&lt;/a&gt;, enabling organizations to programmatically access daily aggregated usage metrics for Claude Code, including productivity metrics, tool usage statistics, and cost data.&lt;/li&gt;
        &lt;/ul&gt;</description>
      <link>https://platform.claude.com/docs/en/release-notes/overview#september-10-2025</link>
      <guid isPermaLink="false">anthropic-release-notes-september-10-2025</guid>
      <pubDate>Tue, 09 Sep 2025 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>September 8, 2025</title>
      <description>&lt;ul class=&quot;list-none pl-8 [&amp;amp;_ul]:pl-8 [&amp;amp;_ol]:pl-8 mb-4&quot;&gt;
        &lt;li class=&quot;mb-2 last:mb-0 relative before:absolute before:left-[-2rem] before:w-6 before:text-center [ul&gt;&amp;amp;]:before:content-[&#39;•&#39;] [ol&gt;&amp;amp;]:before:content-[counter(item)&#39;.&#39;] [ol&gt;&amp;amp;]:[counter-increment:item]&quot;&gt;We launched a beta version of the &lt;a class=&quot;inline-link&quot; target=&quot;_blank&quot; rel=&quot;noopener n

...

github-actions · 2026-04-15T04:50:09Z

http://localhost:1200/anthropic/science - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Anthropic Science</title>
    <link>https://www.anthropic.com/science</link>
    <atom:link href="http://localhost:1200/anthropic/science" rel="self" type="application/rss+xml"></atom:link>
    <description>Anthropic is an AI safety and research company that&#39;s working to build reliable, interpretable, and steerable AI systems. - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Wed, 15 Apr 2026 04:49:38 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>Mar 23, 2026 · Science · Introducing our Science Blog</title>
      <description>&lt;div class=&quot;page-wrapper PostDetail-module-scss-module__UQuRMa__hero&quot;&gt;&lt;div class=&quot;PostDetail-module-scss-module__UQuRMa__illustrationHeroWrapper&quot;&gt;&lt;div class=&quot;Illustration-module-scss-module__WyGOtq__root Illustration-module-scss-module__WyGOtq__aspect-wide Illustration-module-scss-module__WyGOtq__padding-lg Illustration-module-scss-module__WyGOtq__radius-lg bg-olive&quot;&gt;&lt;div class=&quot;Illustration-module-scss-module__WyGOtq__inner&quot;&gt;&lt;img alt=&quot;Introducing our Science Blog&quot; loading=&quot;lazy&quot; width=&quot;1000&quot; height=&quot;1000&quot; decoding=&quot;async&quot; data-nimg=&quot;1&quot; class=&quot;&quot; style=&quot;color:transparent&quot; src=&quot;https://www-cdn.anthropic.com/images/4zrzovbb/website/60a35c504cedb3e3f581b211e4b8aef372ffe031-1000x1000.svg&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;page-wrapper&quot;&gt;&lt;article&gt;&lt;div class=&quot;&quot;&gt;&lt;div class=&quot;Body-module-scss-module__z40yvW__body&quot; data-theme=&quot;ivory&quot;&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;&lt;em&gt;We’re launching a new blog about AI and science. We’ll share work happening at Anthropic and elsewhere, our collaborations with external researchers and labs, and discuss practical workflows for scientists using AI in their research.&lt;/em&gt;&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Increasing the pace of scientific progress is a core part of Anthropic’s mission. &lt;a href=&quot;https://www.darioamodei.com/essay/machines-of-loving-grace&quot;&gt;&lt;em&gt;Machines of Loving Grace&lt;/em&gt;&lt;/a&gt; describes the prospect of a “compressed 21st century” in which decades of scientific progress occur over just a few years. 
We’re starting to see what the early stages of that compression look like: AI is helping mathematicians to &lt;a href=&quot;https://www-cs-faculty.stanford.edu/~knuth/papers/claude-cycles.pdf&quot;&gt;discover&lt;/a&gt; &lt;a href=&quot;https://www.math.inc/sphere-packing&quot;&gt;new proofs&lt;/a&gt;, individual researchers to &lt;a href=&quot;https://scottdodelson.substack.com/p/evolving-dark-energy-and-ai&quot;&gt;run computational analyses&lt;/a&gt; that once required dedicated teams, and biologists to &lt;a href=&quot;https://www.biorxiv.org/content/10.1101/2025.05.26.656231v1.full.pdf&quot;&gt;identify functional gene relationships&lt;/a&gt; across datasets of millions of cells.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Just as computers took on the task of computation, AI is now taking on parts of &lt;em&gt;cognition&lt;/em&gt;. As a side effect of this shift, work that used to require years of specialized training can increasingly be done more quickly and cheaply with AI. The rate of progress raises sociological questions about the practice of science and the role of scientific institutions: What should research apprenticeship look like? How do we maintain trust in the literature when AI becomes more central to producing it? What does it even mean to be a scientist when the bottleneck shifts from execution to management?&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Although the pace of improvement is rapid, some of these questions may feel premature today—AI’s scientific capabilities are, in many ways, still in beta. While models already seem superhuman at certain parts of the scientific workflow, they can also hallucinate results, be overly sycophantic, and get stuck on problems a domain practitioner would find trivial. 
Fields Medalist Timothy Gowers captured this tension well, &lt;a href=&quot;https://x.com/wtgowers/status/1984340187003175298?s=20&quot;&gt;writing&lt;/a&gt; that “it looks as though we have entered the brief but enjoyable era where our research is greatly sped up by AI but AI still needs us.”&lt;br&gt;&lt;br&gt;AI will alter the scientific process in ways that are only starting to become apparent. This blog will discuss the upsides and challenges of the current moment for AI and science, exploring the excitement as it unfolds.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;what-well-cover&quot;&gt;What we’ll cover&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;In this blog, we’ll share three main types of posts:&lt;/p&gt;&lt;ul class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;&lt;li&gt;&lt;strong&gt;Features:&lt;/strong&gt; Articles on a specific result or line of work, with enough detail to understand both the science and the role AI played in producing it. 
We’ll publish stories both from Anthropic researchers and guest contributors and collaborators.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Workflows:&lt;/strong&gt; Practical guides for researchers who want to use AI in their own work across various domains in the natural and formal sciences.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Field notes:&lt;/strong&gt; Roundups of developments across the field, including notable results, new tools, and open questions.&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;We’re publishing two pieces alongside this introduction: Matthew Schwartz’s “&lt;a href=&quot;https://www.anthropic.com/research/vibe-physics&quot;&gt;Vibe physics: The AI grad student&lt;/a&gt;,” a spotlight on supervising Claude through a real theoretical physics calculation, and &lt;a href=&quot;https://www.anthropic.com/research/long-running-Claude&quot;&gt;a tutorial on orchestrating long-running tasks for scientific computation&lt;/a&gt;.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;science-at-anthropic&quot;&gt;&lt;strong&gt;Science at Anthropic&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Anthropic has several initiatives aimed at accelerating scientific progress. Our &lt;a href=&quot;https://www.anthropic.com/news/ai-for-science-program&quot;&gt;AI for Science&lt;/a&gt; program provides API credits to researchers working on high-impact projects across biology, physics, chemistry, and other fields. &lt;a href=&quot;https://www.anthropic.com/news/claude-for-life-sciences&quot;&gt;Claude for Life Sciences&lt;/a&gt; is dedicated to making Claude useful for life sciences researchers and R&amp;amp;D teams, with partnerships across research institutions, pharma, and biotech. 
We recently shared some early &lt;a href=&quot;https://www.anthropic.com/news/accelerating-scientific-research&quot;&gt;results of these programs&lt;/a&gt;. And we’re a &lt;a href=&quot;https://www.anthropic.com/news/genesis-mission-partnership&quot;&gt;core partner&lt;/a&gt; in the &lt;a href=&quot;http://genesis.energy.gov/&quot;&gt;Genesis Mission&lt;/a&gt;, a multi-billion-dollar initiative across industry, academia, and government to accelerate American science with AI.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Beyond these dedicated efforts, researchers across Anthropic are working to improve our models&#39; core scientific capabilities and safely accelerate AI-assisted discovery. Many come from biophysics, chemistry, and neuroscience. We&#39;ll be reporting on their work and on efforts elsewhere in the field.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;If you have something you want to see covered here, please reach out to us at scienceblog@anthropic.com.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/article&gt;&lt;/div&gt;&lt;div class=&quot;page-wrapper&quot;&gt;&lt;div class=&quot;PostDetail-module-scss-module__UQuRMa__footnotes&quot;&gt;&lt;h4 class=&quot;headline-5&quot;&gt; &lt;/h4&gt;&lt;/div&gt;&lt;/div&gt;&lt;section class=&quot;LandingPageSection-module-scss-module__ZSMdoa__root&quot; data-theme=&quot;ivory&quot;&gt;&lt;div class=&quot;page-wrapper&quot;&gt;&lt;div class=&quot;SectionIntro-module-scss-module__i9TRza__root LinkGrid-module-scss-module__wTN57W__intro&quot;&gt;&lt;h2 class=&quot;headline-4 SectionIntro-module-scss-module__i9TRza__title&quot;&gt;Related content&lt;/h2&gt;&lt;/div&gt;&lt;div class=&quot;LinkGrid-module-scss-module__wTN57W__root&quot;&gt;&lt;div class=&quot;LinkGrid-module-scss-module__wTN57W__items LinkGrid-module-scss-module__wTN57W__threeItems&quot;&gt;&lt;div 
class=&quot;LinkGrid-module-scss-module__wTN57W__item&quot;&gt;&lt;h3 class=&quot;headline-6&quot;&gt;Automated Alignment Researchers: Using large language models to scale scalable oversight&lt;/h3&gt;&lt;p class=&quot;body-3 serif&quot;&gt;Can Claude develop, test, and analyze alignment ideas of its own? We ran an experiment to find out.&lt;/p&gt;&lt;a href=&quot;https://www.anthropic.com/research/automated-alignment-researchers&quot; class=&quot;ButtonTextLink-module-scss-module__q8IAwW__textLink LinkGrid-module-scss-module__wTN57W__cta&quot; referrerpolicy=&quot;no-referrer-when-downgrade&quot;&gt;&lt;span class=&quot;body-3&quot;&gt;Read more&lt;/span&gt;&lt;span class=&quot;ButtonTextLink-module-scss-module__q8IAwW__icon&quot;&gt;&lt;svg class=&quot;Icon-module-scss-module__lqbdHG__icon&quot; width=&quot;20&quot; height=&quot;20&quot; viewBox=&quot;0 0 21 21&quot;&gt;&lt;path d=&quot;M4.14585 9.87492L14.4584 9.87492L9.60419 5.04158L10.5 4.14575L16.8542 10.4999L10.5 16.8541L9.60419 15.9583L14.4584 11.1249L4.14585 11.1249L4.14585 9.87492Z&quot; fill=&quot;currentColor&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class=&quot;LinkGrid-module-scss-module__wTN57W__item&quot;&gt;&lt;h3 class=&quot;headline-6&quot;&gt;Trustworthy agents in practice&lt;/h3&gt;&lt;p class=&quot;body-3 serif&quot;&gt;AI “agents” represent the latest major shift in how people and organizations are using AI. 
Here, we explain how they work and how we ensure they&#39;re trustworthy.&lt;/p&gt;&lt;a href=&quot;https://www.anthropic.com/research/trustworthy-agents&quot; class=&quot;ButtonTextLink-module-scss-module__q8IAwW__textLink LinkGrid-module-scss-module__wTN57W__cta&quot; referrerpolicy=&quot;no-referrer-when-downgrade&quot;&gt;&lt;span class=&quot;body-3&quot;&gt;Read more&lt;/span&gt;&lt;span class=&quot;ButtonTextLink-module-scss-module__q8IAwW__icon&quot;&gt;&lt;svg class=&quot;Icon-module-scss-module__lqbdHG__icon&quot; width=&quot;20&quot; height=&quot;20&quot; viewBox=&quot;0 0 21 21&quot;&gt;&lt;path d=&quot;M4.14585 9.87492L14.4584 9.87492L9.60419 5.04158L10.5 4.14575L16.8542 10.4999L10.5 16.8541L9.60419 15.9583L14.4584 11.1249L4.14585 11.1249L4.14585 9.87492Z&quot; fill=&quot;currentColor&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class=&quot;LinkGrid-module-scss-module__wTN57W__item&quot;&gt;&lt;h3 class=&quot;headline-6&quot;&gt;Emotion concepts and their function in a large language model&lt;/h3&gt;&lt;p class=&quot;body-3 serif&quot;&gt;All modern language models sometimes act like they have emotions. What’s behind these behaviors? 
Our interpretability team investigates.&lt;/p&gt;&lt;a href=&quot;https://www.anthropic.com/research/emotion-concepts-function&quot; class=&quot;ButtonTextLink-module-scss-module__q8IAwW__textLink LinkGrid-module-scss-module__wTN57W__cta&quot; referrerpolicy=&quot;no-referrer-when-downgrade&quot;&gt;&lt;span class=&quot;body-3&quot;&gt;Read more&lt;/span&gt;&lt;span class=&quot;ButtonTextLink-module-scss-module__q8IAwW__icon&quot;&gt;&lt;svg class=&quot;Icon-module-scss-module__lqbdHG__icon&quot; width=&quot;20&quot; height=&quot;20&quot; viewBox=&quot;0 0 21 21&quot;&gt;&lt;path d=&quot;M4.14585 9.87492L14.4584 9.87492L9.60419 5.04158L10.5 4.14575L16.8542 10.4999L10.5 16.8541L9.60419 15.9583L14.4584 11.1249L4.14585 11.1249L4.14585 9.87492Z&quot; fill=&quot;currentColor&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/section&gt;&lt;section class=&quot;SubjectNewsletter-module-scss-module__qHNNfa__wrapper&quot; data-theme=&quot;slate&quot;&gt;&lt;div class=&quot;page-wrapper SubjectNewsletter-module-scss-module__qHNNfa__content&quot;&gt;&lt;div class=&quot;SubjectNewsletter-module-scss-module__qHNNfa__text-content&quot;&gt;&lt;h2 class=&quot;headline-2 SubjectNewsletter-module-scss-module__qHNNfa__title&quot;&gt;Subscribe to Anthropic Science&lt;/h2&gt;&lt;p class=&quot;body-1 SubjectNewsletter-module-scss-module__qHNNfa__subtitle&quot;&gt;Features on AI-assisted discoveries, practical workflows, and field notes across the sciences.&lt;/p&gt;&lt;/div&gt;&lt;div id=&quot;subject-newsletter-science&quot; class=&quot;SubjectNewsletter-module-scss-module__qHNNfa__form-container&quot;&gt;&lt;/div&gt;&lt;/div&gt;&lt;/section&gt;</description>
      <link>https://www.anthropic.com/research/introducing-anthropic-science</link>
      <guid isPermaLink="false">https://www.anthropic.com/research/introducing-anthropic-science</guid>
    </item>
    <item>
      <title>Mar 23, 2026 · Science · Long-running Claude for scientific computing</title>
      <description>&lt;div class=&quot;page-wrapper PostDetail-module-scss-module__UQuRMa__hero&quot;&gt;&lt;div class=&quot;PostDetail-module-scss-module__UQuRMa__illustrationHeroWrapper&quot;&gt;&lt;div class=&quot;Illustration-module-scss-module__WyGOtq__root Illustration-module-scss-module__WyGOtq__aspect-wide Illustration-module-scss-module__WyGOtq__padding-lg Illustration-module-scss-module__WyGOtq__radius-lg bg-heather&quot;&gt;&lt;div class=&quot;Illustration-module-scss-module__WyGOtq__inner&quot;&gt;&lt;img alt=&quot;Long-running Claude for scientific computing&quot; loading=&quot;lazy&quot; width=&quot;1000&quot; height=&quot;1000&quot; decoding=&quot;async&quot; data-nimg=&quot;1&quot; class=&quot;&quot; style=&quot;color:transparent&quot; src=&quot;https://www-cdn.anthropic.com/images/4zrzovbb/website/ac2fa660649f361111949b32136a308ef35b6864-1000x1000.svg&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;page-wrapper&quot;&gt;&lt;article&gt;&lt;div class=&quot;&quot;&gt;&lt;div class=&quot;Body-module-scss-module__z40yvW__body&quot; data-theme=&quot;ivory&quot;&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;&lt;em&gt;In this post, Siddharth Mishra-Sharma&lt;/em&gt;, &lt;em&gt;a researcher on the Discovery team, explains how to apply multi-day agentic coding workflows—test oracles, persistent memory, and orchestration patterns—to scientific computing tasks even outside of one’s domain.&lt;/em&gt;&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;the-premise&quot;&gt;&lt;strong&gt;The premise&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Most scientists currently using AI agents work in a conversational loop, managing each step of the process on a tight leash. 
As models have become &lt;a href=&quot;https://metr.org/time-horizons/&quot;&gt;significantly better at long-horizon tasks&lt;/a&gt; over the last year or so, a new way of working emerged: rather than getting involved with every detail, we can specify the high-level objective and set a team of agents loose to work autonomously. This makes it possible to complete projects in mere hours that might otherwise take us days, weeks, or even months. Certain types of scientific tasks fit well within this model, e.g., reimplementing a numerical solver, converting legacy scientific software written in an old Fortran dialect to a modern language, and debugging a large codebase against a reference implementation. These are tasks where the work is well-scoped, the success criteria are clear, and human oversight can be occasional rather than continuous.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Anthropic’s &lt;a href=&quot;https://www.anthropic.com/engineering/building-c-compiler&quot;&gt;C compiler project&lt;/a&gt; demonstrated a version of this, where Claude worked across roughly 2,000 sessions to build a C compiler capable of compiling the Linux kernel. This post describes how to set up a similar pattern for scientific computing tasks using Claude Code, with a typical academic lab in mind. As a concrete example, I will walk through using Claude Opus 4.6 to &lt;a href=&quot;https://github.com/smsharma/clax&quot;&gt;implement a differentiable version of a cosmological Boltzmann solver&lt;/a&gt;. This is numerical code that predicts the statistical properties of the afterglow of the Big Bang—the Cosmic Microwave Background, or CMB. 
It does this by evolving coupled equations for photons, baryons, neutrinos, and dark matter through the early universe.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Boltzmann solvers like &lt;a href=&quot;http://class-code.net/&quot;&gt;CLASS&lt;/a&gt; and &lt;a href=&quot;https://camb.info/&quot;&gt;CAMB&lt;/a&gt; are core pieces of scientific infrastructure in cosmology, allowing us to constrain cosmological models using data from surveys like &lt;em&gt;Planck &lt;/em&gt;and the&lt;em&gt; Simons Observatory.&lt;/em&gt; A differentiable version—one that can propagate gradients through the full solver—enables the use of gradient-based inference methods, dramatically speeding up parameter estimation. Writing it in JAX is a natural fit here, since it gives us automatic differentiation and compatibility with accelerators (e.g., GPUs) essentially for free. &lt;br&gt;&lt;br&gt;Notably, the task isn’t in my core scientific domain—I have a high-level familiarity with the tools and the science, but don’t have the expertise to complete it myself in any reasonable time frame. Groups who &lt;em&gt;do&lt;/em&gt; have that expertise have built &lt;a href=&quot;https://arxiv.org/abs/2311.03291&quot;&gt;differentiable&lt;/a&gt; &lt;a href=&quot;https://arxiv.org/abs/2602.15104&quot;&gt;solvers&lt;/a&gt; in JAX with a subset of the features present in CLASS. These efforts typically represent months to years of researcher-time. The point here was to see if an agent could go further with minimal steering from a non-domain expert.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;This kind of task is structurally different from the C compiler project, which can be farmed out to a large number of parallel agents. 
A Boltzmann solver, on the other hand, is a deeply coupled pipeline—a small numerical error or poor approximation in modeling how the early universe recombines can subtly shift everything downstream. It thus requires a different set of agent skills. Debugging requires tracing causally through the entire chain and drawing from domain knowledge, which may be better suited to a single agent working sequentially, spawning subagents as needed, and using the reference implementation to bisect discrepancies.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;We&#39;ll use an HPC cluster running the SLURM job scheduler as our compute environment, but the core ideas—a progress file, a test oracle, an agent prompt with clear rules—apply regardless of where you run Claude Code.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;draft-a-plan-and-iterate-locally&quot;&gt;&lt;strong&gt;Draft a plan and iterate locally&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;In this shift toward managing an autonomous research team of agents, you should spend most of your time (in consultation with Claude), crafting a set of instructions that clearly articulates the project’s deliverables and relevant context. These instructions should live in a CLAUDE.md file located in the root directory. Claude treats this file specially, keeping it in context and referencing it for the overall plan. 
Crucially, Claude can edit these instructions as it works, updating them for future work as it works through issues.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;&lt;a href=&quot;https://github.com/smsharma/clax/blob/6a6b2330cf25edded1bb31ec57a0091aa794a5d3/CLAUDE.md&quot;&gt;Here&lt;/a&gt; is an early CLAUDE.md for the cosmological Boltzmann solver project, showing the overall plan and design decisions codified after an initial attempt at writing the solver. To arrive at this, I specified the high-level goals of the project—achieving full feature-parity with the reference CLASS implementation while being fully differentiable, and having an accuracy target of 0.1% against CLASS in the main science deliverables—and iterated with Claude until the plan seemed satisfactory. Given that 0.1% is the typical level of agreement between the two canonical Boltzmann codes CLASS and CAMB, this seemed like a good science target.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;memory-across-sessions&quot;&gt;&lt;strong&gt;Memory across sessions&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;The progress file, which by convention we call here CHANGELOG.md, is the agent’s portable long-term memory, acting as a sort of lab notes. In CLAUDE.md, Claude was instructed to keep track of progress in this file.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;A good progress file might track current status, completed tasks, failed approaches and why they didn&#39;t work, accuracy tables at key checkpoints, and known limitations. The failed approaches are important—without them, successive sessions will re-attempt the same dead ends. An entry might look like: “Tried using Tsit5 for the perturbation ODE, system is too stiff. 
Switched to Kvaerno5.” &lt;a href=&quot;https://github.com/smsharma/clax/blob/main/CHANGELOG.md&quot;&gt;Here&lt;/a&gt; is the changelog for the running example, showing these elements.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;the-test-oracle&quot;&gt;&lt;strong&gt;The test oracle&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;While more open-ended scientific discovery via agents is certainly on the horizon, long-running autonomous scientific work today crucially depends on the agent having a way to know whether it’s making progress. For scientific code, this could be a reference implementation, a clearly quantifiable objective, or an existing test suite. It can also be helpful to instruct the agent to expand the test suite and run tests as it works, to prevent regressions. In my example task, Claude was instructed to construct and continuously run unit tests using &lt;a href=&quot;https://github.com/lesgourg/class_public&quot;&gt;CLASS C source&lt;/a&gt; as a reference implementation.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;git-as-coordination&quot;&gt;&lt;strong&gt;Git as coordination&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Git can be a good way to monitor and coordinate the agent’s progress in a hands-off manner. The agent should commit and push after every meaningful unit of work. This gives you a recoverable history if something goes awry, makes progress visible locally, and prevents work from being lost if, for instance, your compute allocation runs out mid-session.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Practically, this could be a set of instructions in CLAUDE.md, e.g. 
“Commit and push after every meaningful unit of work. Run `pytest tests/ -x -q` before every commit. Never commit code that breaks existing passing tests.”&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;For steering the agent, you can always SSH into the cluster and manually re-prompt and/or update its instructions. It is typically more ergonomic to simply ask a local instance of Claude Code to SSH in and run commands for you; this will also apply to everything described below.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;the-execution-loop&quot;&gt;&lt;strong&gt;The execution loop&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;As mentioned above, it’s often useful to first iterate on the plan locally until you have one that looks reasonable and is encoded in CLAUDE.md. From there, start a Claude Code session inside a terminal multiplexer like tmux on a compute node, tell the agent where to find your codebase, and let it cook. Because the session runs inside tmux, you can detach, close your laptop, and occasionally check on progress (in the case of the Boltzmann solver, I would check in on GitHub on my phone, e.g. while waiting in line for a coffee).&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;On an HPC cluster you might request a node through the SLURM scheduler, and an example job script that launches Claude Code in a tmux session might look like the following:&lt;/p&gt;&lt;div class=&quot;Body-module-scss-module__z40yvW__media-column Body-module-scss-module__z40yvW__inline&quot;&gt;&lt;div class=&quot;CodeBlock-module-scss-module__PbWBnq__codeBlock&quot;&gt;&lt;pre class=&quot;&quot; style=&quot;--height:300px;--height-expanded:0px&quot;&gt;&lt;code class=&quot;plaintext&quot;&gt;#!/bin/bash
#SBATCH --job-name=claude-agent
#SBATCH --partition=GPU-shared
#SBATCH --gres=gpu:h100-32:1
#SBATCH --time=48:00:00
#SBATCH --output=agent_%j.log
cd $PROJECT/my-solver
source .venv/bin/activate
export TERM=xterm-256color
tmux new-session -d -s claude &quot;claude; exec bash&quot;
tmux wait-for claude&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;&lt;br&gt;Once the job starts, you attach to the tmux session, give Claude Code direction (e.g., “Read CHANGELOG.md and pick up the next task”), and detach when you&#39;re satisfied it&#39;s on the right track. 
You can re-attach whenever you want to check in, steer, or start a new task using something like:&lt;/p&gt;&lt;div class=&quot;Body-module-scss-module__z40yvW__media-column Body-module-scss-module__z40yvW__inline&quot;&gt;&lt;div class=&quot;CodeBlock-module-scss-module__PbWBnq__codeBlock&quot;&gt;&lt;pre class=&quot;&quot; style=&quot;--height:300px;--height-expanded:0px&quot;&gt;&lt;code class=&quot;plaintext&quot;&gt;srun --jobid=JOBID --overlap --pty tmux attach -t claude&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;&lt;br&gt;&lt;strong&gt;The Ralph loop:&lt;/strong&gt; As models get more capable, they require less bespoke orchestration such as prompt engineering, RAG, or context stuffing. At a given point in time, however, it can be useful to provide some level of scaffolding as a capability uplift. 
For example, current models can suffer from &lt;em&gt;agentic laziness&lt;/em&gt;—when asked to complete a complex, multi-part task, they can sometimes find an excuse to stop before finishing the entire task (“It’s getting late, let’s pick back up again tomorrow?”).&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;To circumvent this, a useful orchestration pattern is the &lt;a href=&quot;https://ghuntley.com/loop/&quot;&gt;&lt;em&gt;Ralph loop&lt;/em&gt;&lt;/a&gt;, which is essentially a &lt;em&gt;for&lt;/em&gt; loop which kicks the agent back into context when it claims completion, and asks if it’s &lt;em&gt;really&lt;/em&gt; done. This can be useful for long-running tasks since the agent will admit the task is not up to spec, and continue working until it is. Other similar patterns include &lt;a href=&quot;https://github.com/gsd-build/get-shit-done&quot;&gt;GSD&lt;/a&gt; (and &lt;a href=&quot;https://arxiv.org/abs/2603.20179&quot;&gt;domain-specific&lt;/a&gt; &lt;a href=&quot;https://github.com/psi-oss/get-physics-done&quot;&gt;variants&lt;/a&gt;) as well as the native-to-Claude Code /loop command.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Ralph can be installed via /plugin. 
A typical invocation prompt in Claude Code could look like:&lt;/p&gt;&lt;div class=&quot;Body-module-scss-module__z40yvW__media-column Body-module-scss-module__z40yvW__inline&quot;&gt;&lt;div class=&quot;CodeBlock-module-scss-module__PbWBnq__codeBlock&quot;&gt;&lt;pre class=&quot;&quot; style=&quot;--height:300px;--height-expanded:0px&quot;&gt;&lt;code class=&quot;plaintext&quot;&gt;/ralph-loop:ralph-loop &quot;Please keep working on the task until the success criterion of 0.1% accuracy across the entire parameter range is achieved.&quot; --max-iterations 20 --completion-promise &quot;DONE&quot;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Here, Claude will iterate up to 20 times, stopping early only once it asserts that the task is done by emitting the “DONE” completion promise.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;the-result&quot;&gt;&lt;strong&gt;The result&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Claude worked on &lt;a 
href=&quot;https://github.com/smsharma/clax&quot;&gt;the project&lt;/a&gt; from scratch over a few days, reaching sub-percent agreement with the reference CLASS implementation across its various outputs. I asked Claude to reconstruct the accuracy of some of the main outputs of the code—the various CMB angular power spectra—over the course of the project, also labeling milestones during development. It produced the plot below, illustrating the path to sub-percent accuracy.&lt;/p&gt;&lt;div class=&quot;Body-module-scss-module__z40yvW__media-column&quot;&gt;&lt;figure class=&quot;ImageWithCaption-module-scss-module__Duq99q__e-imageWithCaption&quot;&gt;&lt;img loading=&quot;lazy&quot; width=&quot;1680&quot; height=&quot;880&quot; decoding=&quot;async&quot; data-nimg=&quot;1&quot; style=&quot;color:transparent&quot; srcset=&quot;/_next/image?url=https%3A%2F%2Fwww-cdn.anthropic.com%2Fimages%2F4zrzovbb%2Fwebsite%2Fd6b037407956ad8d5317c97730fd9f273a6a6afa-1680x880.png&amp;amp;w=1920&amp;amp;q=75 1x, /_next/image?url=https%3A%2F%2Fwww-cdn.anthropic.com%2Fimages%2F4zrzovbb%2Fwebsite%2Fd6b037407956ad8d5317c97730fd9f273a6a6afa-1680x880.png&amp;amp;w=3840&amp;amp;q=75 2x&quot; src=&quot;https://www.anthropic.com/_next/image?url=https%3A%2F%2Fwww-cdn.anthropic.com%2Fimages%2F4zrzovbb%2Fwebsite%2Fd6b037407956ad8d5317c97730fd9f273a6a6afa-1680x880.png&amp;amp;w=3840&amp;amp;q=75&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;figcaption class=&quot;caption&quot;&gt;The path to sub-percent accuracy over time as the agent worked on the codebase.&lt;br&gt;&lt;/figcaption&gt;&lt;/figure&gt;&lt;/div&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;The agent’s development trajectory was somewhat clunky. For example, there were clear gaps in its test coverage—for a while it was only testing the code at a single (fiducial) parameter point, drastically reducing its bug-catching surface area. 
It also made elementary mistakes, such as tripping over gauge conventions or spending hours chasing bugs that a cosmologist would spot instantly, but it kept making sustained progress towards the stated goal of sub-percent accuracy.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;A side effect of the project was that I learned a surprising amount about Boltzmann solvers and the physics they model by watching the git commit history. The project isn’t drawn from my core scientific domain, but following Claude’s incremental progress and looking up what I didn&#39;t recognize turned out to be an effective way to absorb the science. The &lt;a href=&quot;https://github.com/smsharma/clax/commits/main/&quot;&gt;commit log&lt;/a&gt; reads like lab notes from a fast, hyper-literal postdoc.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;While the resulting solver is not production-grade (e.g., it doesn’t match the reference CLASS implementation to an acceptable accuracy in every regime), it demonstrates that agent-driven development can compress months or even years of researcher work into days.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;This kind of compression changes what counts as idle time. A universal experience in AI research is to launch an experiment (e.g., a training run) overnight and then have the satisfaction of seeing the results in the morning. Not running the experiment comes with an opportunity cost. These days, not running agents feels like it has a cost as well. 
If you have the compute and projects with well-defined success criteria, every night you &lt;em&gt;don&#39;t&lt;/em&gt; have agents working for you is potential progress left on the table.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;acknowledgments&quot;&gt;&lt;strong&gt;Acknowledgments&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;We thank Eric Kauderer-Abrams for peer-review, as well as Xander Balwit, Ethan Dyer, and Rebecca Hiscott for providing helpful feedback.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/article&gt;&lt;/div&gt;</description>
      <link>https://www.anthropic.com/research/long-running-Claude</link>
      <guid isPermaLink="false">https://www.anthropic.com/research/long-running-Claude</guid>
    </item>
    <item>
      <title>Mar 23, 2026 | Science | Vibe physics: The AI grad student</title>
      <description>&lt;div class=&quot;page-wrapper PostDetail-module-scss-module__UQuRMa__hero&quot;&gt;&lt;div class=&quot;PostDetail-module-scss-module__UQuRMa__illustrationHeroWrapper&quot;&gt;&lt;div class=&quot;Illustration-module-scss-module__WyGOtq__root Illustration-module-scss-module__WyGOtq__aspect-wide Illustration-module-scss-module__WyGOtq__padding-lg Illustration-module-scss-module__WyGOtq__radius-lg bg-cactus&quot;&gt;&lt;div class=&quot;Illustration-module-scss-module__WyGOtq__inner&quot;&gt;&lt;img alt=&quot;Vibe physics: The AI grad student&quot; loading=&quot;lazy&quot; width=&quot;1000&quot; height=&quot;1000&quot; decoding=&quot;async&quot; data-nimg=&quot;1&quot; class=&quot;&quot; style=&quot;color:transparent&quot; src=&quot;https://www-cdn.anthropic.com/images/4zrzovbb/website/46e4aa7ea208ed440d5bd9e9e3a0ee66bc336ff1-1000x1000.svg&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;page-wrapper&quot;&gt;&lt;article&gt;&lt;div class=&quot;&quot;&gt;&lt;div class=&quot;Body-module-scss-module__z40yvW__body&quot; data-theme=&quot;ivory&quot;&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;&lt;em&gt;Can AI do theoretical physics? In this guest post, professor of physics &lt;a href=&quot;https://www.physics.harvard.edu/people/facpages/schwartz&quot;&gt;Matthew Schwartz&lt;/a&gt; decided to find out by supervising Claude through a real research calculation, start to finish, without ever touching a file himself. His account of what happened is below. 
&lt;/em&gt;&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;summary&quot;&gt;&lt;strong&gt;Summary&lt;/strong&gt;&lt;/h3&gt;&lt;ul class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;&lt;li&gt;I guided Claude Opus 4.5 through a real theoretical physics calculation, encapsulating the complexity of code and computations behind text prompts.&lt;/li&gt;&lt;li&gt;The result was a technically rigorous, impactful &lt;a href=&quot;https://arxiv.org/abs/2601.02484&quot;&gt;high-energy theoretical physics paper&lt;/a&gt; in two weeks instead of the usual year.&lt;/li&gt;&lt;li&gt;Over 110 separate drafts, 36M tokens, and 40+ hours of local CPU compute, Claude proved fast, indefatigable, and eager to please.&lt;/li&gt;&lt;li&gt;Claude is impressively capable, but also sloppy enough that I found domain expertise essential for evaluating its accuracy.&lt;/li&gt;&lt;li&gt;AI is not doing end-to-end science yet. But this project proves that I could create a set of prompts that can get Claude to do frontier science. This wasn’t true three months ago.&lt;/li&gt;&lt;li&gt;This may be the most important paper I’ve ever written—not for the physics, but for the method. There is no going back.&lt;/li&gt;&lt;/ul&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;who-am-i&quot;&gt;&lt;strong&gt;Who am I?&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;I’m &lt;a href=&quot;https://www.physics.harvard.edu/people/facpages/schwartz&quot;&gt;Matthew Schwartz&lt;/a&gt;, a professor of physics at Harvard and a principal investigator in the NSF Institute for Artificial Intelligence and Fundamental Interactions (&lt;a href=&quot;http://www.iaifi.org/&quot;&gt;IAIFI&lt;/a&gt;). 
My area of expertise is quantum field theory, which asks what matter is, how particles interact, and why the Universe has the rules it does. One might say I wrote the &lt;a href=&quot;https://www.amazon.com/Quantum-Field-Theory-Standard-Model/dp/1107034736&quot;&gt;book&lt;/a&gt; on the subject. I’ve been working with modern machine learning tools for over a decade. My &lt;a href=&quot;https://arxiv.org/abs/1612.01551&quot;&gt;first modern ML paper&lt;/a&gt;, from 2016, was an early application of deep learning to particle physics. In a &lt;a href=&quot;https://www.nature.com/articles/s42254-022-00538-z&quot;&gt;&lt;em&gt;Nature Reviews Physics&lt;/em&gt;&lt;/a&gt; piece in 2022, I compared the timescale of AI and human evolution, arguing that transferring understanding between biological and artificial intelligence would become a fundamental challenge. Since then, I’ve been trying to push AI towards &lt;a href=&quot;https://arxiv.org/abs/2408.04720&quot;&gt;more symbolic work&lt;/a&gt; (manipulating mathematical expressions rather than numerical data) and the core questions in theoretical physics.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;the-hype&quot;&gt;&lt;strong&gt;The hype&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;There has been a lot of recent hype about AI scientists doing end-to-end research autonomously. In August 2024, Sakana AI released their &lt;a href=&quot;https://sakana.ai/ai-scientist/&quot;&gt;AI Scientist&lt;/a&gt;, a system designed to automate the entire research lifecycle—from generating hypotheses to writing papers. In February 2025, Google released an &lt;a href=&quot;https://arxiv.org/abs/2502.18864&quot;&gt;AI co-scientist&lt;/a&gt; built on Gemini, promising to help researchers generate and evaluate hypotheses at scale. 
And in August 2025, the Allen Institute for AI (Ai2) launched the open-source &lt;a href=&quot;https://allenai.org/asta&quot;&gt;Asta&lt;/a&gt; ecosystem, featuring tools like &lt;a href=&quot;https://github.com/allenai/codescientist&quot;&gt;CodeScientist&lt;/a&gt; and &lt;a href=&quot;https://allenai.org/blog/autodiscovery&quot;&gt;AutoDiscovery&lt;/a&gt; to find patterns in complex datasets. Since then, a new entrant has appeared every few months—FutureHouse’s &lt;a href=&quot;https://edisonscientific.com/articles/announcing-kosmos&quot;&gt;Kosmos&lt;/a&gt;, the Autoscience Institute’s &lt;a href=&quot;https://autoscience.ai/&quot;&gt;Carl&lt;/a&gt;, the Simons Foundation’s &lt;a href=&quot;https://www.simonsfoundation.org/2025/11/04/meet-denario-an-ai-assistant-for-every-step-of-the-scientific-process/&quot;&gt;Denario&lt;/a&gt; project, among others—each promising some version of end-to-end autonomous research. Even as these approaches are visionary, their successes to date seem a bit forced: run &lt;a href=&quot;https://www.youtube.com/watch?v=no_elVGGgW8&quot;&gt;hundreds or thousands of trials&lt;/a&gt; and define the best one as interesting. While I believe we are not far from end-to-end science, I’m not convinced we can skip the intermediate steps. Maybe LLMs need to go to graduate school before advancing straight to the Ph.D.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;In mathematics, automated end-to-end AI agents have produced impressive results, at least for a certain class of problems. 
An early breakthrough was DeepMind’s &lt;a href=&quot;https://deepmind.google/blog/funsearch-making-new-discoveries-in-mathematical-sciences-using-large-language-models/&quot;&gt;FunSearch&lt;/a&gt;, launched in 2023, and later &lt;a href=&quot;https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/&quot;&gt;AlphaEvolve&lt;/a&gt;, which used LLMs to make new discoveries in combinatorics. A related project, &lt;a href=&quot;https://deepmind.google/blog/ai-solves-imo-problems-at-silver-medal-level/&quot;&gt;AlphaProof&lt;/a&gt;, earned a silver medal at the 2024 International Mathematical Olympiad, solving problems that stumped all but five human contestants, and in 2025, an advanced version of Gemini &lt;a href=&quot;https://deepmind.google/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/&quot;&gt;achieved the gold-medal standard&lt;/a&gt;. And, just as in science, &lt;a href=&quot;https://harmonic.fun/&quot;&gt;more&lt;/a&gt; &lt;a href=&quot;https://arxiv.org/abs/2601.14027&quot;&gt;achievements&lt;/a&gt; have continued to follow.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;What about theoretical physics? End-to-end AI scientists have found their footing in data-rich domains, but theoretical physics is not one of them. Unlike mathematics, theoretical physics problems can be more nebulous—less about formal proof search and more about physical intuition, choosing the right approximations, and navigating a landscape of subtleties that often trip up even experienced researchers. Even so, there are problems in physics where AI might be better suited. Not yet the paradigm-shifting questions at the frontier, but those where the conceptual framework is established and the goal well-defined. 
To find out if AI can solve these types of theory problems, I supervised Claude through a real research calculation at the level of a second-year grad student.&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;problem-selection&quot;&gt;&lt;strong&gt;Problem selection&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;In grad school, at least at my institution, first-year theory students (G1s) typically just take classes. Research often begins in the second year. G2 students start with well-defined projects that have a guarantee of success—often follow-ups from previous studies where the methods are established and the endpoints clear. This gives them a chance to learn the techniques, make mistakes in a controlled setting, and build confidence. It&#39;s also easy for me as an advisor: I can check their work, spot where they&#39;ve gone off track, and quickly reorient them.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;Advanced students (G3+) work on more open-ended, creative problems. These require choosing your own direction, deciding which approximations matter, and sometimes realizing the original question was wrong (such is the nature of research).&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;For this experiment, I deliberately chose a G2-style problem. My reasoning was that LLMs can already do all the coursework, so they are past the G1 stage. 
But if AI can&#39;t do the G2 projects—the ones with training wheels, where I know the answer and can check every step—then it certainly can&#39;t do the G3+ projects where creativity and good judgment are essential.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;The problem I chose was resumming the Sudakov shoulder in the C-parameter. For context, when you smash electrons and positrons at a collider, debris sprays out; the C-parameter is a single number that describes the shape of that spray, and its distribution has been measured with extreme precision. The theory that&#39;s supposed to predict that distribution is quantum chromodynamics, the study of the strong nuclear force, which holds nuclei together and powers the sun. The C-parameter is well-defined on paper but brutally hard to calculate, so you approximate. Every approximation is a stress-test—failures tell you something about the foundations of quantum field theory itself: what are the right building blocks and effective degrees of freedom (particles? jets? clouds of gluons?), and what gaps might lead to new insights? At one particular spot on the distribution, a kink called the Sudakov shoulder, the standard approximations break down, and the math starts producing nonsense. The goal of the project was to fix the prediction at this point.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;I picked this problem because it connects directly to the foundations of our understanding of quantum theory. But more importantly, it’s a highly technical calculation that I was confident I could do myself. 
The physics is understood in principle; what&#39;s missing is a careful, complete treatment.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;The dream was that I could ask:&lt;/p&gt;&lt;blockquote class=&quot;Body-module-scss-module__z40yvW__reading-column Body-module-scss-module__z40yvW__blockquote body-2 serif post-text&quot;&gt;&lt;em&gt;Write a paper on resummation to NLL level of the Sudakov shoulder in the C-parameter in e+e- collisions. Include a derivation of the factorization formula, comparison with previous results, numerical checks against Monte Carlo calculations using EVENT2, and a final plot of the resummed distribution with uncertainty bands.&lt;/em&gt;&lt;/blockquote&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;and out would pop the paper. We are not there yet, of course. I tried giving this prompt to all the frontier models, and—predictably—they all failed pitifully. But I wanted to see if I could &lt;em&gt;coach&lt;/em&gt; the model to succeed: to show, rather than tell it.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;To go about this scientifically, I encapsulated all the work. The rules were strict:&lt;/p&gt;&lt;ul class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;&lt;li&gt;Only give text prompts to &lt;a href=&quot;https://claude.ai/redirect/website.v1.170892e1-6a87-42f0-a44f-145133230533/code&quot;&gt;Claude Code&lt;/a&gt;. 
No editing files directly.&lt;/li&gt;&lt;li&gt;Don’t cut and paste my own calculations into the chat.&lt;/li&gt;&lt;li&gt;But pasting Gemini or GPT calculations was OK, as long as they were only text-prompted.&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;My question was: is there a set of prompts, like instructions to a talented G2, that can guide an AI to produce a high-quality physics paper (one that is genuinely interesting and pushes the field forward)?&lt;/p&gt;&lt;h3 class=&quot;Body-module-scss-module__z40yvW__reading-column headline-6 post-subsection&quot; id=&quot;initial-steps&quot;&gt;&lt;strong&gt;Initial steps&lt;/strong&gt;&lt;/h3&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;I knew from experience that LLMs struggle with context and organization over long projects. So I started by asking Claude to come up with a plan of attack: what tasks needed to be done in what order. I also asked GPT 5.2 and Gemini 3.0. Then, I had all three LLMs merge the best ideas from each, using web interfaces and copying one to another. Next, I gave those merges to Claude, asking it to break the outline into detailed subsections. The result is &lt;a href=&quot;https://www-cdn.anthropic.com/2595299ccf7f8b9a9c74823c24faaa5d9b216804.pdf&quot;&gt;here&lt;/a&gt;. There were 102 separate tasks across seven stages.&lt;/p&gt;&lt;p class=&quot;Body-module-scss-module__z40yvW__reading-column body-2 serif post-text&quot;&gt;From there, I turned to &lt;a href=&quot;https://claude.ai/redirect/website.v1.170892e1-6a87-42f0

...

github-actions · 2026-04-15T04:50:10Z

http://localhost:1200/arena/blog - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Arena Blog</title>
    <link>https://arena.ai/blog/</link>
    <atom:link href="http://localhost:1200/arena/blog" rel="self" type="application/rss+xml"></atom:link>
    <description>Latest updates, research, and leaderboard changes from Arena - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Wed, 15 Apr 2026 04:49:41 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>About Arena</title>
      <description>&lt;p&gt;Created by researchers from UC Berkeley, Arena (formerly LMArena) is a community-powered platform for understanding AI performance in the real world. Tens of millions of builders, researchers, and creative professionals come to Arena to use frontier models and give feedback on their responses, shaping a public leaderboard grounded in real-world use.&amp;nbsp;&lt;/p&gt;&lt;h2 id=&quot;our-mission&quot;&gt;&lt;strong&gt;Our Mission&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;To measure and advance the frontier of AI for real-world use.&lt;/p&gt;&lt;h2 id=&quot;our-vision&quot;&gt;&lt;strong&gt;Our Vision&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;To build the foundation for everyone to understand, shape, and benefit from AI.&lt;/p&gt;&lt;hr&gt;&lt;div class=&quot;kg-card kg-cta-card kg-cta-bg-grey kg-cta-immersive kg-cta-centered&quot; data-layout=&quot;immersive&quot;&gt;
        &lt;div class=&quot;kg-cta-sponsor-label-wrapper&quot;&gt;
        &lt;div class=&quot;kg-cta-sponsor-label&quot;&gt;
        &lt;span style=&quot;white-space: pre-wrap;&quot;&gt;ARENA.ai&lt;/span&gt;
        &lt;/div&gt;
        &lt;/div&gt;
        &lt;div class=&quot;kg-cta-content&quot;&gt;
        &lt;div class=&quot;kg-cta-content-inner&quot;&gt;
        &lt;div class=&quot;kg-cta-text&quot;&gt;
        &lt;p&gt;&lt;b&gt;&lt;strong style=&quot;white-space: pre-wrap;&quot;&gt;Where AI meets the real world. &lt;/strong&gt;&lt;/b&gt;&lt;br&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;Join a community of 5M+ shaping AI through collective feedback. &lt;/span&gt;&lt;/p&gt;
        &lt;/div&gt;
        &lt;a href=&quot;https://arena.ai/?ref=arena.ai&quot; class=&quot;kg-cta-button &quot; style=&quot;background-color: #000000; color: #ffffff;&quot;&gt;
        Explore Arena
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;/div&gt;
        &lt;/div&gt;&lt;h3 id=&quot;join-the-community&quot;&gt;Join The Community&lt;/h3&gt;&lt;p&gt;Join the team: &lt;a href=&quot;https://lmarena.ai/jobs?ref=lmarena.ai&quot;&gt;&lt;u&gt;arena.ai/jobs&lt;/u&gt;&lt;/a&gt;&lt;br&gt;Follow us on X: &lt;a href=&quot;https://arena.ai/blog/arena-expert-model-comparison/&quot; rel=&quot;noreferrer&quot;&gt;&lt;u&gt;@arena&lt;/u&gt;&lt;/a&gt;&lt;br&gt;Follow us on LinkedIn: &lt;a href=&quot;https://linkedin.com/company/arenaai/?ref=arena.ai&quot;&gt;&lt;u&gt;@ArenaAI&lt;/u&gt;&lt;/a&gt;&lt;br&gt;Join the conversation: &lt;a href=&quot;https://discord.gg/arena-ai?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;&lt;u&gt;discord.gg/&lt;/u&gt;&lt;/a&gt;&lt;u&gt;arena-ai&lt;/u&gt;&lt;/p&gt;
      </description>
      <link>https://arena.ai/blog/about/</link>
      <guid isPermaLink="false">https://arena.ai/blog/about/</guid>
      <pubDate>Invalid Date</pubDate>
    </item>
    <item>
      <title>Leaderboard Changelog</title>
      <description>&lt;p&gt;This page documents notable updates to our leaderboard—new models, new arenas, updates to the methodology, and more. Stay tuned! &lt;br&gt;&lt;br&gt;For model deprecations, check the &lt;a href=&quot;https://github.com/lmarena/lmarena.github.io/blob/main/_pages/model_list.md?ref=news.lmarena.ai&quot; rel=&quot;noreferrer&quot;&gt;public updates on GitHub&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;April 10, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.arcee.ai/blog/trinity-large?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;trinity-large&lt;/a&gt; has been renamed to &lt;a href=&quot;https://www.arcee.ai/blog/trinity-large?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;trinity-large-preview&lt;/a&gt; for clarity on the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=api&amp;amp;ref=arena.ai#/api/?type=model&amp;amp;url=3026980&quot; rel=&quot;noopener noreferrer&quot;&gt;wan2.7-image&lt;/a&gt; and &lt;a href=&quot;https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=api&amp;amp;ref=arena.ai#/api/?type=model&amp;amp;url=3026980&quot; rel=&quot;noopener noreferrer&quot;&gt;wan2.7-image-pro&lt;/a&gt; have been added to the Text-to-Image and Image Edit leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;April 9, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://ai.meta.com/blog/introducing-muse-spark-msl/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;muse-spark&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://huggingface.co/zai-org/GLM-5.1?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;glm-5.1&lt;/a&gt; has been added to the Code leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;April 7, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/zai-org/GLM-5.1?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;glm-5.1&lt;/a&gt; has been 
added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;April 6, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.byteplus.com/en/blog/dola-seed-2-0-pro?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;dola-seed-2.0-pro&lt;/a&gt; has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;April 2, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://chat.qwen.ai/?models=Qwen3.6-Plus-Preview&amp;amp;ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;qwen3.6-plus-preview&lt;/a&gt; has been added to the Code leaderboard.&lt;br&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/gpt-5.4?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.4-search&lt;/a&gt; has been added to the Search leaderboard.&lt;br&gt;&lt;a href=&quot;https://openai.com/index/introducing-gpt-5-4-mini-and-nano/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.4-mini-high&lt;/a&gt; has been added to the Code (WebDev) leaderboard&lt;br&gt;&lt;a href=&quot;https://huggingface.co/meituan-longcat/LongCat-Flash-Chat?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;longcat-flash-chat&lt;/a&gt; has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://aistudio.google.com/app/prompts/new_chat?model=gemma-4-31b-it&amp;amp;ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemma-4-31b&lt;/a&gt; and &lt;a href=&quot;https://aistudio.google.com/app/prompts/new_chat?model=gemma-4-26b-a4b-it&amp;amp;ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemma-4-26b-a4b&lt;/a&gt; have been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 31, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-opus-4-6?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;claude-opus-4-6&lt;/a&gt;&lt;strong&gt;, &lt;/strong&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-opus-4-6?ref=arena.ai&quot; rel=&quot;noopener 
noreferrer&quot;&gt;claude-opus-4-6-thinking&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;and&lt;strong&gt; &lt;/strong&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-sonnet-4-6?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;claude-sonnet-4-6&lt;/a&gt; have been added to the Vision leaderboard.&lt;br&gt;&lt;a href=&quot;https://docs.x.ai/developers/models/grok-4.20-multi-agent-beta-0309?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-4.20-multi-agent-beta-0309&lt;/a&gt; is now on the Text, Vision, and Search leaderboards. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 26, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://ai.google.dev/gemini-api/docs/google-search?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3-flash-grounding&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Search leaderboard.&lt;br&gt;&lt;a href=&quot;https://www.inceptionlabs.ai/blog/introducing-mercury-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;mercury-2&lt;/a&gt; has been added to the Text and Code Area leaderboards.&lt;br&gt;&lt;a href=&quot;https://runwayml.com/research/introducing-runway-gen-4?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;runway-gen4&lt;/a&gt; has been added to the Text-to-Image leaderboard.&lt;br&gt;&lt;a href=&quot;https://openai.com/index/introducing-gpt-5-4-mini-and-nano/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.4-nano-high&lt;/a&gt; and &lt;a href=&quot;https://openai.com/index/introducing-gpt-5-4-mini-and-nano/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.4-mini-high&lt;/a&gt; have been added to the Text leaderboard. 
&lt;br&gt;&lt;br&gt;&lt;strong&gt;March 20, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://mimo.xiaomi.com/mimo-v2-pro?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;mimo-v2-pro&lt;/a&gt; has been added to the Text and Code leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 19, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://qwen.ai/blog?id=qwen3.5-max-preview&amp;amp;ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;qwen3.5-max-preview&lt;/a&gt; has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://microsoft.ai/news/introducing-MAI-Image-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;mai-image-2&lt;/a&gt; has been added to the Text-to-Image leaderboard.&lt;br&gt;&lt;a href=&quot;https://docs.x.ai/developers/models/grok-4.20-beta-0309-reasoning?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-4.20-beta-0309-reasoning&lt;/a&gt; has been added to the Vision leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 18, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.minimax.io/news/minimax-m27-en?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;minimax-m2.7&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Text and Code leaderboards.&lt;br&gt;&lt;br&gt;&lt;strong&gt;March 16, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://docs.x.ai/developers/models/grok-4.20-beta-0309-reasoning?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-4.20-beta-0309-reasoning&lt;/a&gt; has been added to the Text and Code leaderboards.&lt;br&gt;&lt;a href=&quot;https://nova.amazon.com/faqs?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;amazon-nova-experimental-chat-26-02-10&lt;/a&gt; has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 12, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/gpt-5.4?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.4-high (codex-harness)&lt;/a&gt;, &lt;a 
href=&quot;https://platform.openai.com/docs/models/gpt-5.4?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.4-medium (codex-harness)&lt;/a&gt;, and &lt;a href=&quot;https://openai.com/index/introducing-gpt-5-3-codex/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.3-codex (codex-harness)&lt;/a&gt; have been added to the Code leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 11, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;nvidia-nemotron-3-super-120b-a12b&lt;/a&gt; has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/gpt-5.4?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.4&lt;/a&gt; is on the Text and Document leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 10, 2026&lt;/strong&gt;&lt;br&gt;The Video Arena leaderboards will reflect a single consolidated vote category going forward.&lt;br&gt;&lt;br&gt;Previously, prompts and generations were visible to a broader Discord community, allowing non-authors to vote. After the migration to &lt;a href=&quot;http://arena.ai/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;arena.ai&lt;/a&gt;, voting is now limited to the prompt author only. 
We are discontinuing the separate “Author Vote” category on the Video Arena leaderboards as all votes now originate from prompt authors.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 6, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://app.pixverse.ai/onboard?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;pixverse-v5.6&lt;/a&gt; has been added to the Text-to-Video and Image-to-Video leaderboards&lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 5, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/gpt-5.4?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.4-high&lt;/a&gt; and &lt;a href=&quot;https://platform.openai.com/docs/models/gpt-5.4?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.4&lt;/a&gt; are on the Text leaderboard&lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 4, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://modelstudio.console.alibabacloud.com/ap-southeast-1/?tab=doc&amp;amp;ref=arena.ai#/doc/?type=model&amp;amp;url=2840914_2&amp;amp;modelId=group-qwen3.5-flash&quot; rel=&quot;noopener noreferrer&quot;&gt;qwen3.5-flash&lt;/a&gt; has been added to the Text and Code leaderboards.&lt;br&gt;&lt;br&gt;&lt;strong&gt;March 3, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-lite?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3.1-flash-lite-preview&lt;/a&gt; has been added to the Text and Code leaderboards.&lt;br&gt;&lt;a href=&quot;https://huggingface.co/Qwen/Qwen3.5-122B-A10B?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;qwen3.5-122b-a10b&lt;/a&gt;, &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3.5-35B-A3B?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;qwen3.5-35b-a3b&lt;/a&gt; and &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3.5-27B?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;qwen3.5-27b&lt;/a&gt; have been added to the Text and Code leaderboards. 
&lt;br&gt;&lt;br&gt;The &lt;a href=&quot;https://arena.ai/leaderboard/document?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Document Arena leaderboard&lt;/a&gt; has been added. The Document Arena displays model rankings based on side-by-side evaluations of real-world document reasoning performance across user-uploaded PDF files.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;March 2, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://runwayml.com/research/introducing-runway-gen-4.5?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;runway-gen-4.5&lt;/a&gt; has been added to the Text-to-Video leaderboard&lt;/p&gt;&lt;p&gt;An older version of &lt;a href=&quot;https://docs.x.ai/developers/model-capabilities/images/generation?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-imagine-image&lt;/a&gt; has been renamed to &lt;a href=&quot;https://docs.x.ai/developers/model-capabilities/images/generation?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-imagine-image (20260207)&lt;/a&gt; in Image Edit to reflect the dated snapshot. 
The current default model now listed as &lt;a href=&quot;https://docs.x.ai/developers/model-capabilities/images/generation?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-imagine-image&lt;/a&gt; represents the latest version.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 26, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.pruna.ai/p-video?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;p-video&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Text-to-Video and Image-to-Video leaderboards.&lt;br&gt;&lt;a href=&quot;https://aistudio.google.com/prompts/new_chat?model=gemini-3.1-flash-image-preview&amp;amp;ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3.1-flash-image-preview (nano-banana-2) [web-search]&lt;/a&gt; has been added to the Text-to-Image and Image Edit leaderboards.&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-opus-4-6?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;claude-opus-4-6-search&lt;/a&gt; has been added to the Search leaderboard.&lt;br&gt;&lt;a href=&quot;https://fal.ai/models/fal-ai/kling-video/v3/pro/image-to-video?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;kling-v3-pro&lt;/a&gt; has been added to the Image-to-Video leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Code Arena WebDev Leaderboard Categorization&lt;/strong&gt;&lt;br&gt;The Code Arena WebDev leaderboard is now segmented into two distinct categories to reflect different generation environments and capability surfaces:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;HTML: ranks standalone HTML/CSS/JS generation within a single document scope.&lt;/li&gt;&lt;li&gt;React: ranks multi-file application generation, including component structure, routing, state coordination, and cross-file consistency.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;This separation provides clearer visibility into model performance across simple document-level outputs vs. 
structured application-level builds.&lt;br&gt;&lt;br&gt;&lt;strong&gt;February 25, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://developers.openai.com/api/docs/models/gpt-5.2-chat-latest?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.2-chat-latest-20260210&lt;/a&gt; has been added to the Vision leaderboard.&lt;br&gt;&lt;a href=&quot;https://grok.com/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-4.20-beta1&lt;/a&gt; has been added to the Text and Search leaderboards.&lt;br&gt;&lt;a href=&quot;https://seed.bytedance.com/en/seedream5_0_lite?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;seedream-5.0-lite&lt;/a&gt; has been added to the Text-to-Image and Image Edit leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 24, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/Qwen/Qwen3.5-397B-A17B?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;qwen3.5-397b-a17b&lt;/a&gt; has been added to the Code leaderboard.&lt;br&gt;&lt;a href=&quot;https://modelstudio.console.alibabacloud.com/?tab=api&amp;amp;ref=arena.ai#/api/?type=model&amp;amp;url=2867393&quot; rel=&quot;noopener noreferrer&quot;&gt;wan2.6-i2v&lt;/a&gt; and &lt;a href=&quot;https://modelstudio.console.alibabacloud.com/?tab=api&amp;amp;ref=arena.ai#/api/?type=model&amp;amp;url=2865250&quot; rel=&quot;noopener noreferrer&quot;&gt;wan2.6-t2v&lt;/a&gt; have been added to the Video leaderboard&lt;br&gt;&lt;a href=&quot;https://nova.amazon.com/faqs?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;amazon-nova-experimental-chat-26-01-10&lt;/a&gt; has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 23, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.arcee.ai/blog/trinity-large?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;trinity-large&lt;/a&gt; has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://app.reve.com/?ref=arena.ai&quot; rel=&quot;noopener 
noreferrer&quot;&gt;reve-v1.5&lt;/a&gt; has been added to the Text-to-Image leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 21, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://developers.openai.com/api/docs/models/gpt-5.2-chat-latest?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.2-chat-latest-20260210&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 20, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-sonnet-4-6?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;claude-sonnet-4-6&lt;/a&gt; has been added to the Text and Code leaderboard.&lt;br&gt;&lt;a href=&quot;https://huggingface.co/Qwen/Qwen3.5-397B-A17B?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;qwen3.5-397b-a17b&lt;/a&gt; has been added to the Vision leaderboard.&lt;br&gt;&lt;a href=&quot;https://www.recraft.ai/blog/introducing-recraft-v4-design-taste-meets-image-generation?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;recraft-v4&lt;/a&gt; has been added to the Text-to-Image leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 19, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/allenai/Molmo2-8B?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;molmo-2-8b&lt;/a&gt; has been added to the Vision and leaderboard.&lt;br&gt;&lt;a href=&quot;https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-pro/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3.1-pro-preview&lt;/a&gt; has been added to the Text, Vision and Code leaderboards.&lt;br&gt;&lt;a href=&quot;https://mimo.xiaomi.com/blog/mimo-v2-flash?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;mimo-v2-flash (thinking)&lt;/a&gt; has been added to the Text and Code leaderboards.&lt;br&gt;&lt;br&gt;&lt;strong&gt;February 16, 2026&lt;/strong&gt;&lt;br&gt;&lt;a 
href=&quot;https://www.minimax.io/news/minimax-m25?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;minimax-m2.5&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Text and Code leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 15, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://seed.bytedance.com/en/blog/dola-seed-2-0-preview-model-release-on-arena?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;dola-seed-2.0-preview&lt;/a&gt; has been added to the Text and Vision leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 12, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://arena.ai/leaderboard/text?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;glm-5&lt;/a&gt; has been added to the Code leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 11, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://arena.ai/leaderboard/text?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;glm-5&lt;/a&gt; has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 10, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://developers.googleblog.com/en/introducing-veo-3-1-and-new-creative-capabilities-in-the-gemini-api/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;veo-3.1-fast-audio-1080p&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;and&lt;strong&gt; &lt;/strong&gt;&lt;a href=&quot;https://developers.googleblog.com/en/introducing-veo-3-1-and-new-creative-capabilities-in-the-gemini-api/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;veo-3.1-audio-1080p&lt;/a&gt; have been added to the Text-to-Video and Image-to-Video leaderboards.&lt;br&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/stepfun-ai/Step-3.5-Flash?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;step-3.5-flash&lt;/a&gt; has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 9, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-opus-4-6?ref=arena.ai&quot; 
rel=&quot;noopener noreferrer&quot;&gt;claude-opus-4-6-thinking&lt;/a&gt; has been added to the Text and Code leaderboards.&lt;br&gt;&lt;br&gt;We’ve updated the Text-to-Image Arena with:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Prompt Categories: category-specific leaderboards for clearer domain-level performance &lt;/li&gt;&lt;li&gt;Quality Filtering: reducing noisy or underspecified prompts for more reliable rankings&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Older categories for user-generated vs. pre-generated prompts have been deprecated. &lt;br&gt;&lt;br&gt;&lt;strong&gt;February 7, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://docs.x.ai/developers/models?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-imagine-image-pro&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;and &lt;a href=&quot;https://docs.x.ai/developers/model-capabilities/images/generation?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-imagine-image&lt;/a&gt; have been added to the Text-to-Image and Image Edit leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 6, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-opus-4-6?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;claude-opus-4-6&lt;/a&gt; has been added to the Text and Code leaderboards.&lt;br&gt;&lt;a href=&quot;https://www.kimi.com/blog/kimi-k2-5.html?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;kimi-k2.5-instant&lt;/a&gt; has been added to the Text, Vision and Code leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 5, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.vidu.com/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;vidu-q3-pro&lt;/a&gt; is on the Image-to-Video leaderboard&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 4, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://docs.x.ai/docs/guides/video-generations?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-imagine-video-720p&lt;/a&gt; is on the Image-to-Video 
leaderboard&lt;/p&gt;&lt;p&gt;&lt;strong&gt;February 2, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.kimi.com/blog/kimi-k2-5.html?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;kimi-k2.5-thinking&lt;/a&gt; is on the Code Arena, WebDev leaderboard&lt;/p&gt;&lt;p&gt;&lt;strong&gt;January 29, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/zai-org/GLM-4.7-Flash?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;glm-4.7-flash&lt;/a&gt; has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://mistral.ai/news/devstral-2-vibe-cli?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;devstral-2&lt;/a&gt; has been added to the Code leaderboard.&lt;/p&gt;&lt;p&gt;The following models have been added to the Text-to-Video leaderboard:&lt;br&gt;&lt;a href=&quot;https://app.klingai.com/global/release-notes/zipxp988c2?type=dialog&amp;amp;ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;kling-o1-pro&lt;/a&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/Lightricks/LTX-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;ltx-2-19b&lt;/a&gt;&lt;br&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/Lightricks/LTX-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;ltx-2-19b&lt;/a&gt; has been added to the Image-to-Video leaderboard&lt;/p&gt;&lt;p&gt;The following models have been added to the Image Edit leaderboard:&lt;br&gt;&lt;a href=&quot;https://www.pruna.ai/p-image-edit?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;p-image-edit&lt;/a&gt;&lt;br&gt;&lt;a href=&quot;https://modelstudio.console.alibabacloud.com/?tab=api&amp;amp;ref=arena.ai#/api/?type=model&amp;amp;url=3001143&quot; rel=&quot;noopener noreferrer&quot;&gt;wan2.6-image&lt;/a&gt;&lt;/p&gt;&lt;p&gt;The following models have been added to the Text-to-Image leaderboard:&lt;br&gt;&lt;a href=&quot;https://www.pruna.ai/p-image?ref=arena.ai&quot; rel=&quot;noopener 
noreferrer&quot;&gt;p-image&lt;/a&gt;&lt;br&gt;&lt;a href=&quot;https://modelstudio.console.alibabacloud.com/?tab=api&amp;amp;ref=arena.ai#/api/?type=model&amp;amp;url=3001143&quot; rel=&quot;noopener noreferrer&quot;&gt;wan2.6-t2i&lt;/a&gt;&lt;br&gt;&lt;br&gt;The following models have been added to the Search leaderboard:&lt;br&gt;&lt;a href=&quot;https://ai.google.dev/gemini-api/docs/google-search?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3-flash-grounding&lt;/a&gt;&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-sonnet-4-5?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;claude-sonnet-4-5-search&lt;/a&gt;&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-opus-4-5?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;claude-opus-4-5-search&lt;/a&gt;&lt;br&gt;&lt;a href=&quot;https://openai.com/index/introducing-gpt-5-2/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.2-search-non-reasoning&lt;/a&gt;&lt;br&gt;&lt;br&gt;&lt;a href=&quot;https://www.kimi.com/blog/kimi-k2-5.html?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;kimi-k2.5-thinking&lt;/a&gt; has been added to the Vision leaderboard&lt;/p&gt;&lt;p&gt;&lt;strong&gt;January 28, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://docs.x.ai/docs/guides/video-generations?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-imagine-video-480p&lt;/a&gt; is on the Text-to-Video and Image-to-Video leaderboards&lt;br&gt;&lt;br&gt;&lt;strong&gt;January 27, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.kimi.com/blog/kimi-k2-5.html?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;kimi-k2.5-thinking&lt;/a&gt; is on the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;January 26, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&amp;amp;modelId=Hunyuan-Image-3.0-Instruct&amp;amp;ref=arena.ai&quot; rel=&quot;noopener 
noreferrer&quot;&gt;hunyuan-image-3.0-instruct&lt;/a&gt; is on the Image Edit leaderboard&lt;/p&gt;&lt;p&gt;&lt;strong&gt;January 23, 2026&lt;/strong&gt;&lt;br&gt;We&#39;ve added Multi-Image Edit as a new category to the Image Edit leaderboard, which consists of votes where multiple images were passed into the models. The previous overall leaderboard has been renamed to Single-Image Edit.&lt;br&gt;&lt;br&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/gpt-5.2-codex?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;gpt-5.2-codex&lt;/a&gt; has been added to the Code leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;January 21, 2026&lt;/strong&gt;&lt;br&gt;Video Arena leaderboards now include votes collected from both the&amp;nbsp;&lt;a href=&quot;http://lmarena.ai/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;lmarena.ai&lt;/a&gt;&amp;nbsp;website and our &lt;a href=&quot;https://www.discord.gg/lmarena?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Discord server&lt;/a&gt;. 
Read &lt;a href=&quot;https://arena.ai/blog/video-arena/&quot; rel=&quot;noreferrer&quot;&gt;our blog&lt;/a&gt; for more details.&lt;br&gt;&lt;br&gt;&lt;strong&gt;January 20, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/allenai/Olmo-3.1-32B-Instruct?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;olmo-3.1-32b-instruct&lt;/a&gt; and &lt;a href=&quot;https://huggingface.co/allenai/Olmo-3.1-32B-Think?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;olmo-3.1-32b-think&lt;/a&gt; have been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://z.ai/blog/glm-image?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;glm-image&lt;/a&gt; was added to the Text-to-Image leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;January 19, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://modelstudio.console.alibabacloud.com/?tab=api&amp;amp;ref=arena.ai#/api/?type=model&amp;amp;url=2982258&quot; rel=&quot;noopener noreferrer&quot;&gt;wan2.5-i2i-preview&lt;/a&gt; has been added to the Image Edit leaderboard&lt;/p&gt;&lt;p&gt;&lt;strong&gt;January 16, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/zai-org/GLM-4.6V?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;glm-4.6v&lt;/a&gt; has been added to the Vision leaderboard. &lt;br&gt;&lt;a href=&quot;https://blog.google/products/gemini/gemini-3-flash?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3-flash (thinking-minimal)&lt;/a&gt; has been updated on the Vision leaderboard.&lt;br&gt;&lt;a href=&quot;https://bfl.ai/models/flux-2-klein?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;flux-2-klein-9b&lt;/a&gt; &amp;amp; &lt;a href=&quot;https://bfl.ai/models/flux-2-klein?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;flux-2-klein-4b&lt;/a&gt; have been added to the Text-to-Image and Image Edit leaderboards. 
&lt;br&gt;&lt;a href=&quot;https://github.com/Tongyi-MAI/Z-Image?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;z-image-turbo&lt;/a&gt; has been added to the Text-to-Image leaderboard.&lt;br&gt;&lt;br&gt;&lt;strong&gt;January 14, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://ernie.baidu.com/blog/posts/ernie-5.0-0110-release-on-lmarena/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;ernie-5.0-0110&lt;/a&gt; has been added to the Text leaderboard. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;January 13, 2026&lt;/strong&gt;&lt;br&gt;We’ve completed a major improvement to our data pipeline that resolves several known issues and applies data filtering more consistently:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;The validation process led to minimal adjustments in leaderboard rankings.&amp;nbsp;&lt;/li&gt;&lt;li&gt;Models and leaderboards with fewer votes may see larger score fluctuations.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Here&#39;s a summary of the changes in the new pipeline:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Vote filters, such as identity leak detection and data quality filtering, are now applied more consistently across all votes.&lt;/li&gt;&lt;li&gt;Vote de-duplication is now enabled in text-to-image and video arenas.&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;January 8, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://hunyuan.tencent.com/video/en?tabIndex=0&amp;amp;ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;hunyuan-video-1.5&lt;/a&gt; has been added to the Text-to-Video and Image-to-Video leaderboards&lt;/p&gt;&lt;p&gt;&lt;strong&gt;January 7, 2026&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://ernie.baidu.com/blog/posts/ernie-5.0-preview-1220-release-on-lmarena/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;ernie-5.0-preview-1220&lt;/a&gt; has been added to the Vision leaderboard&lt;br&gt;&lt;a href=&quot;https://seed.bytedance.com/en/seedance1_5_pro?ref=arena.ai&quot; rel=&quot;noopener 
noreferrer&quot;&gt;seedance-v1.5-pro&lt;/a&gt; has been added to the Text-to-Video and Image-to-Video leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 31, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.minimax.io/news/minimaxm1?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;minimax-m2.1-preview&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;and &lt;a href=&quot;https://huggingface.co/zai-org/GLM-4.7?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;glm-4.7&lt;/a&gt; have been added to the Text leaderboard.&lt;br&gt;&lt;br&gt;&lt;strong&gt;December 29, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.minimax.io/news/minimaxm1?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;minimax-m2.1-preview&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the WebDev leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 23, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://mimo.xiaomi.com/blog/mimo-v2-flash?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;mimo-v2-flash (non-thinking)&lt;/a&gt; has been added to the Text and WebDev leaderboards.&lt;br&gt;&lt;br&gt;&lt;strong&gt;December 22, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://ernie.baidu.com/blog/posts/ernie-5.0-preview-1203-release-on-lmarena/?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;ernie-5.0-preview-1203&lt;/a&gt; has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://huggingface.co/zai-org/GLM-4.7?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;glm-4.7&lt;/a&gt; has been added to the WebDev leaderboard powered by the new Code Arena.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 19, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://nova.amazon.com/faqs?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;amazon-nova-experimental-chat-11-10&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 18, 2025&lt;/strong&gt;&lt;br&gt;&lt;a 
href=&quot;https://openai.com/index/introducing-gpt-5-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.2&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://x.ai/news/grok-4-1-fast?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-4-1-fast-search&lt;/a&gt; and &lt;a href=&quot;https://openai.com/index/introducing-gpt-5-2/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.2-search&lt;/a&gt; are on the Search leaderboard.&lt;br&gt;&lt;a href=&quot;https://api.reve.com/console/pricing?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;reve-v1.1&lt;/a&gt; and &lt;a href=&quot;https://api.reve.com/console/pricing?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;reve-v1.1-fast&lt;/a&gt; are on the Image Edit leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 17, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://blog.google/products/gemini/gemini-3-flash?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3-flash&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;and&lt;strong&gt; &lt;/strong&gt;&lt;a href=&quot;https://blog.google/products/gemini/gemini-3-flash?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3-flash (thinking-minimal)&lt;/a&gt; have been added to the Text, Vision and WebDev leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 16, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://bfl.ai/models/flux-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;flux-2-max&lt;/a&gt; is on the Text-to-Image and Image Edit leaderboards.&lt;br&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/gpt-image-1.5?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-image-1.5&lt;/a&gt; is on the Text-to-Image and Image Edit leaderboards.&lt;br&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/chatgpt-image-latest?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;chatgpt-image-latest 
(20251216)&lt;/a&gt; has been added to the Image Edit leaderboard.&lt;br&gt;&lt;a href=&quot;https://openai.com/index/introducing-gpt-5-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.2-high&lt;/a&gt; has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://www.ibm.com/new/announcements/ibm-granite-4-0-hyper-efficient-high-performance-hybrid-models?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;ibm-granite-h-small&lt;/a&gt; has been added to the Text leaderboard.&lt;br&gt;&lt;br&gt;&lt;strong&gt;December 15, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;nvidia-nemotron-3-nano-30b-a3b-bf16&lt;/a&gt; has been added to the Text leaderboard&lt;br&gt;&lt;br&gt;&lt;strong&gt;December 12, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://github.com/kandinskylab/kandinsky-5/?tab=readme-ov-file&amp;amp;ref=arena.ai#kandinsky-50-video-lite&quot; rel=&quot;noopener noreferrer&quot;&gt;kandinsky-5.0-t2v-lite&lt;/a&gt; and &lt;a href=&quot;https://github.com/kandinskylab/kandinsky-5/?ref=arena.ai#kandinsky-50-video-pro&quot; rel=&quot;noopener noreferrer&quot;&gt;kandinsky-5.0-t2v-pro&lt;/a&gt; have been added to the Text-to-Video leaderboard.&lt;br&gt;&lt;a href=&quot;https://app.klingai.com/global/release-notes/c605hp1tzd?type=dialog&amp;amp;ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;kling-2.6-pro&lt;/a&gt; has been added to the Text-to-Video and Image-to-Video leaderboards.&lt;br&gt;&lt;a href=&quot;https://bfl.ai/models/flux-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;flux-2-dev&lt;/a&gt; has been added to the Text-to-Image and Image Edit leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 11, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://openai.com/index/introducing-gpt-5-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.2&lt;/a&gt; and &lt;a 
href=&quot;https://openai.com/index/introducing-gpt-5-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.2-high&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;have been added to the &lt;a href=&quot;https://lmarena.ai/leaderboard/webdev?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;WebDev leaderboard&lt;/a&gt; powered by Code Arena.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 10, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://mistral.ai/news/mistral-3?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;mistral-large-3&lt;/a&gt; has been added to the &lt;a href=&quot;https://lmarena.ai/leaderboard/webdev?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;WebDev leaderboard&lt;/a&gt; powered by Code Arena.&lt;br&gt;&lt;a href=&quot;https://www.primeintellect.ai/blog/intellect-3?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;intellect-3&lt;/a&gt; has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 9, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://ernie.baidu.com/blog/posts/ernie-5.0-preview-1103-release-on-lmarena/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;ernie-5.0-preview-1103&lt;/a&gt; and &lt;a href=&quot;https://aws.amazon.com/blogs/aws/introducing-amazon-nova-2-lite-a-fast-cost-effective-reasoning-model/?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;nova-2-lite&lt;/a&gt; have been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 5, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://x.ai/news/grok-4-fast?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-4-fast&lt;/a&gt;&amp;nbsp;has been renamed to&amp;nbsp;&lt;a href=&quot;https://x.ai/news/grok-4-fast?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-4-fast-chat&lt;/a&gt; to better reflect the specific model variant. 
Additionally, &lt;a href=&quot;https://huggingface.co/allenai/Olmo-3-32B-Think?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;olmo-3-32b-think&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 4, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://x.ai/news/grok-4-fast?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-4-fast-reasoning&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://mistral.ai/news/devstral-2507?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;devstral-medium-2507&lt;/a&gt; has been added to the &lt;a href=&quot;https://lmarena.ai/leaderboard/webdev?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;new WebDev leaderboard&lt;/a&gt; (powered by Code Arena). &lt;br&gt;&lt;a href=&quot;https://openai.com/index/gpt-5-1/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.1-high&lt;/a&gt; and &lt;a href=&quot;https://openai.com/index/gpt-5-1/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.1&lt;/a&gt; have been added to the Vision leaderboard.&lt;br&gt;&lt;a href=&quot;https://api-docs.deepseek.com/news/news250929?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;deepseek-v3.2&lt;/a&gt; and &lt;a href=&quot;https://api-docs.deepseek.com/news/news250929?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;deepseek-v3.2-thinking&lt;/a&gt; have been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://seed.bytedance.com/en/seedream4_5?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;seedream-4.5&lt;/a&gt; has been added to the Image Edit and Text-to-Image leaderboards&lt;/p&gt;&lt;p&gt;&lt;strong&gt;December 3, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://openai.com/index/gpt-5-1/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.1-search&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;and&lt;strong&gt; 
&lt;/strong&gt;&lt;a href=&quot;https://ai.google.dev/gemini-api/docs/google-search?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3-pro-grounding&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;have been added to the Search leaderboard.&lt;br&gt;&lt;br&gt;&lt;strong&gt;December 2, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://mistral.ai/news/mistral-3?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;mistral-large-3&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://modelstudio.console.alibabacloud.com/?tab=api&amp;amp;ref=arena.ai#/api/?type=model&amp;amp;url=2865250&quot; rel=&quot;noopener noreferrer&quot;&gt;wan2.5-t2v-preview&lt;/a&gt; has been added to the Text-to-Video leaderboard.&lt;br&gt;&lt;a href=&quot;https://ai.studio/banana?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3-pro-image-preview-2k (nano-banana-pro)&lt;/a&gt; has been added to the Text-to-Image and Image Edit leaderboards.&lt;br&gt;&lt;br&gt;&lt;strong&gt;December 1, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.streamlake.ai/product/kat-coder?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;KAT-Coder-Pro-V1&lt;/a&gt;&lt;strong&gt; &lt;/strong&gt;has been added to the new WebDev leaderboard.&lt;br&gt;&lt;a href=&quot;https://bfl.ai/models/flux-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;flux-2-flex&lt;/a&gt; and &lt;a href=&quot;https://bfl.ai/models/flux-2?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;flux-2-pro&lt;/a&gt; have been added to the Text-to-Image and Image Edit leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;November 26, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-opus-4-5?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;claude-opus-4-5-20251101&lt;/a&gt; and &lt;a href=&quot;https://www.anthropic.com/news/claude-opus-4-5?ref=arena.ai&quot; rel=&quot;noopener 
noreferrer&quot;&gt;claude-opus-4-5-20251101-thinking-32k&lt;/a&gt; have been added to the Text and WebDev leaderboards&lt;/p&gt;&lt;p&gt;&lt;strong&gt;November 21, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://ernie.baidu.com/blog/posts/ernie-5.0-preview-1120-release-on-lmarena/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;ernie-5.0-preview-1120&lt;/a&gt; has been added to the Vision leaderboard.&lt;br&gt;&lt;a href=&quot;https://ai.studio/banana?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3-pro-image-preview (nano-banana-pro)&lt;/a&gt; has been added to the Text-to-Image and Image Edit leaderboards.&lt;br&gt;&lt;br&gt;Additionally, the following models have been added to the new WebDev leaderboard:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://openai.com/index/gpt-5-1/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.1&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://openai.com/index/gpt-5-1/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.1-medium&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/gpt-5.1-codex?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.1-codex&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/gpt-5.1-codex?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.1-codex-mini&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;November 20, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.inceptionlabs.ai/blog/mercury-refreshed?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;mercury&lt;/a&gt; has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;November 19, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://deepcogito.com/?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;cogito v2.1&lt;/a&gt; has been added to the WebDev leaderboard.&lt;br&gt;&lt;a href=&quot;https://openai.com/index/gpt-5-1/?ref=arena.ai&quot; 
rel=&quot;noopener noreferrer&quot;&gt;gpt-5.1-high&lt;/a&gt; and &lt;a href=&quot;https://openai.com/index/gpt-5-1/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gpt-5.1&lt;/a&gt; have been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;November 18, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://aistudio.google.com/app/prompts/new_chat?model=gemini-3-pro-preview&amp;amp;ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;gemini-3-pro&lt;/a&gt; has been added to the Text, Vision and WebDev leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;November 17, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://x.ai/news/grok-4-1?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-4.1-thinking&lt;/a&gt; and &lt;a href=&quot;https://x.ai/news/grok-4-1?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;grok-4.1&lt;/a&gt; have been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://modelstudio.console.alibabacloud.com/?tab=api&amp;amp;ref=arena.ai#/api/?type=model&amp;amp;url=2862677&quot; rel=&quot;noopener noreferrer&quot;&gt;wan2.5-t2i-preview&lt;/a&gt; has been added to the Text-to-Image leaderboard.&lt;br&gt;&lt;a href=&quot;https://modelstudio.console.alibabacloud.com/?tab=api&amp;amp;ref=arena.ai#/api/?type=model&amp;amp;url=2867393&quot; rel=&quot;noopener noreferrer&quot;&gt;wan2.5-i2v-preview&lt;/a&gt; has been added to the Image-to-Video leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;November 14, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://lumalabs.ai/ray?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;ray-3&lt;/a&gt; has been added to the Text-to-Video and Image-to-Video leaderboards.&lt;br&gt;&lt;br&gt;&lt;strong&gt;November 13, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://shengshu.feishu.cn/wiki/LGayww6Dni4Uijkb2N0crvuznhh?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;vidu-q2-turbo&lt;/a&gt; &amp;amp; &lt;a 
href=&quot;https://shengshu.feishu.cn/wiki/LGayww6Dni4Uijkb2N0crvuznhh?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;vidu-q2-pro&lt;/a&gt; are now on the Image-to-Video leaderboard.&lt;br&gt;&lt;br&gt;&lt;strong&gt;November 12, 2025&lt;/strong&gt;&lt;br&gt;The &lt;a href=&quot;https://lmarena.ai/leaderboard/webdev?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;WebDev leaderboard&lt;/a&gt; is now powered by the Code Arena experience.&lt;br&gt;&lt;a href=&quot;https://nova.amazon.com/faqs?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;amazon-nova-experimental-chat-10-20&lt;/a&gt; has been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;November 7, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://ernie-blog-dev.now.baidu.com/blog/posts/ernie-5.0-preview-1022-release-on-lmarena/?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;ernie-5.0-preview-1022&lt;/a&gt; has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://api.reve.com/console/pricing?ref=arena.ai&quot; rel=&quot;noopener noreferrer&quot;&gt;reve-edit-fast&lt;/a&gt; has been added to the Image Edit leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;November 6, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/gpt-image-1-mini?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;gpt-image-1-mini&lt;/a&gt; has been added to the Text-to-Image and Image Edit leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;November 5, 2025&lt;/strong&gt;&lt;br&gt;Introducing Arena Expert: a new LMArena evaluation framework to identify the toughest, most expert-level prompts from real users, powering a new Expert leaderboard.&lt;br&gt;&lt;br&gt;We also introduce Occupational Categories that underlie eight new leaderboards:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Software &amp;amp; IT Services&amp;nbsp;&lt;/li&gt;&lt;li&gt;Writing, Literature, &amp;amp; Language&lt;/li&gt;&lt;li&gt;Life, Physical, 
&amp;amp;&amp;nbsp;Social&amp;nbsp;Science&lt;/li&gt;&lt;li&gt;Entertainment, Sports, &amp;amp; Media&lt;/li&gt;&lt;li&gt;Business, Management, &amp;amp; Financial Ops&lt;/li&gt;&lt;li&gt;Mathematical&lt;/li&gt;&lt;li&gt;Legal &amp;amp; Government&lt;/li&gt;&lt;li&gt;Medicine &amp;amp; Healthcare&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Arena Expert aims to sharpen the difficulty level compared to Arena Hard. While Hard includes about a third of all LMArena prompts, Arena Expert includes only 5.5% of all prompts. Expert prompts are identified by their reasoning depth and specificity, producing sharper separations between models. By mapping all Arena prompts across occupational fields, the Occupational Categories system captures the full spectrum of real-world reasoning tasks.&lt;br&gt;&lt;br&gt;→ Read more on our blog:&amp;nbsp;&lt;a href=&quot;https://arena.ai/blog/arena-expert&quot; rel=&quot;noopener noreferrer&quot;&gt;http://news.lmarena.ai/arena-expert&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;November 3, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.minimax.io/news/minimax-m2?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;MiniMax-M2&lt;/a&gt; has been added to the WebDev leaderboard.&lt;br&gt;&lt;br&gt;&lt;strong&gt;October 30, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://hailuoai.video/?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Hailuo 2.3&lt;/a&gt; has been added to the Text-to-Video leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;October 28, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://hailuoai.video/?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Hailuo 2.3&lt;/a&gt; has been added to the Image-to-Video leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;October 20, 2025&lt;/strong&gt;&lt;br&gt;The following models have been added to the WebDev leaderboard:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-haiku-4-5?ref=arena.ai&quot; 
rel=&quot;noreferrer&quot;&gt;Claude-Haiku-4-5-20251001&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Qwen3-235b-a22b-instruct-2507&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-sonnet-4-5?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Claude-sonnet-4-5-20250929-thinking-32k&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://z.ai/blog/glm-4.6?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;GLM-4.6&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Additionally, &lt;a href=&quot;https://developers.googleblog.com/en/introducing-veo-3-1-and-new-creative-capabilities-in-the-gemini-api/?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Veo 3.1&lt;/a&gt; variants have been added to the Text-to-Video and Image-to-Video leaderboards.&lt;br&gt;&lt;br&gt;&lt;strong&gt;October 16, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-haiku-4-5?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Claude Haiku 4.5&lt;/a&gt; and &lt;a href=&quot;https://nova.amazon.com/faqs?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Amazon-Nova-Experimental-Chat-10-09&lt;/a&gt; have been added to the Text leaderboard.&lt;br&gt;&lt;br&gt;We&#39;ve also refined the logic for our Coding category to improve precision. Prompts that resemble code but are not coding-related (such as Markdown) have been removed. 
The new rule has been applied retroactively to the data, so while the Coding category is now smaller, it’s more accurate.&lt;br&gt;&lt;br&gt;&lt;strong&gt;October 14, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://platform.openai.com/docs/models/sora-2?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Sora 2&lt;/a&gt; and &lt;a href=&quot;https://platform.openai.com/docs/models/sora-2-pro?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Sora 2 Pro&lt;/a&gt; have been added to the Text-to-Video leaderboard.&lt;br&gt;&lt;br&gt;&lt;strong&gt;October 13, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://microsoft.ai/news/introducing-mai-image-1-debuting-in-the-top-10-on-lmarena/?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;MAI-1-Image&lt;/a&gt; has been added to the Text-to-Image leaderboard.&lt;br&gt;&lt;a href=&quot;https://app.klingai.com/global/image-to-video/frame-mode/new?ra=4&amp;amp;ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Kling 2.5 Turbo 1080p&lt;/a&gt; has been added to the Text-to-Video and Image-to-Video leaderboards. 
&lt;br&gt;&lt;br&gt;&lt;strong&gt;October 8, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://api-docs.deepseek.com/news/news250929?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;DeepSeek-V3.2-Exp&lt;/a&gt; and the thinking variant have been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;October 7, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://huggingface.co/inclusionAI/Ling-flash-2.0?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Ling Flash 2.0&lt;/a&gt; and &lt;a href=&quot;https://huggingface.co/inclusionAI/Ring-flash-2.0?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Ring Flash 2.0&lt;/a&gt; have been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;October 6, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://github.com/Tencent-Hunyuan/HunyuanVision?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Hunyuan Vision 1.5 Thinking&lt;/a&gt; has been added to the Vision leaderboard.&lt;br&gt;&lt;br&gt;&lt;strong&gt;October 4, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://hunyuan.tencent.com/image/en?tabIndex=0&amp;amp;ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Hunyuan Image 3.0&lt;/a&gt; has been added to the Text-to-Image leaderboard.&lt;/p&gt;&lt;p&gt;We added a filter to remove rows where a model battles against itself. 
This happens very rarely, in instances where we briefly serve the same model from two different API endpoints at the same time.&lt;br&gt;&lt;br&gt;&lt;strong&gt;October 3, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-sonnet-4-5?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Claude Sonnet 4.5 Thinking 32k&lt;/a&gt; and &lt;a href=&quot;https://docs.z.ai/guides/llm/glm-4.6?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;GLM 4.6&lt;/a&gt; have been added to the Text leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;October 2, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-sonnet-4-5?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Claude Sonnet 4.5&lt;/a&gt; has been added to the Text and Web Dev leaderboards.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;October 1, 2025&lt;/strong&gt;&lt;br&gt;&lt;a href=&quot;https://blog.reve.com/posts/reve-editing-model/?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Reve V1&lt;/a&gt; has been added to the Image Edit leaderboard.&lt;br&gt;&lt;br&gt;&lt;strong&gt;September 30, 2025&lt;/strong&gt;&lt;br&gt;The following models have been added to the Text leaderboard:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://api-docs.deepseek.com/news/news250922?ref=arena.ai&quot;&gt;&lt;u&gt;deepseek-v3.1-terminus&lt;/u&gt;&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://api-docs.deepseek.com/news/news250922?ref=arena.ai&quot;&gt;&lt;u&gt;deepseek-v3.1-terminus-thinking&lt;/u&gt;&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://developers.googleblog.com/en/continuing-to-bring-you-our-latest-models-with-an-improved-gemini-2-5-flash-and-flash-lite-release/?ref=arena.ai&quot;&gt;&lt;u&gt;gemini-2.5-flash-lite-preview-09-2025&lt;/u&gt;&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a 
href=&quot;https://developers.googleblog.com/en/continuing-to-bring-you-our-latest-models-with-an-improved-gemini-2-5-flash-and-flash-lite-release/?ref=arena.ai&quot;&gt;&lt;u&gt;gemini-2.5-flash-preview-09-2025&lt;/u&gt;&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://qwen.ai/blog?id=241398b9cd6353de490b0f82806c7848c5d2777d&amp;amp;from=research.latest-advancements-list&amp;amp;ref=arena.ai&quot;&gt;&lt;u&gt;qwen3-max-2025-09-23&lt;/u&gt;&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://qwen.ai/blog?id=99f0335c4ad9ff6153e517418d48535ab6d8afef&amp;amp;from=research.latest-advancements-list&amp;amp;ref=arena.ai&quot;&gt;&lt;u&gt;qwen3-vl-235b-a22b-instruct&lt;/u&gt;&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://qwen.ai/blog?id=99f0335c4ad9ff6153e517418d48535ab6d8afef&amp;amp;from=research.latest-advancements-list&amp;amp;ref=arena.ai&quot;&gt;&lt;u&gt;qwen3-vl-235b-a22b-thinking&lt;/u&gt;&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;September 25, 2025&lt;/strong&gt;&lt;br&gt;New model announcement:&lt;br&gt;&lt;a href=&quot;https://seed.bytedance.com/en/seedream4_0?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Seedream-4-2k&lt;/a&gt; has been added to the Text-to-Image and Image Edit leaderboards.&lt;br&gt;&lt;br&gt;Note that &lt;a href=&quot;https://seed.bytedance.com/en/seedream4_0?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Seedream-4-high-res-fal&lt;/a&gt; and &lt;a href=&quot;https://seed.bytedance.com/en/seedream4_0?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Seedream-4-fal&lt;/a&gt; are variants run on the &lt;a href=&quot;https://fal.ai/?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;fal.ai&lt;/a&gt; platform. Due to differences in hosting, they are named separately as distinct models. 
&lt;a href=&quot;https://seed.bytedance.com/en/seedream4_0?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Seedream-4-2k&lt;/a&gt; is the official endpoint provided by &lt;a href=&quot;https://seed.bytedance.com/en/?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;ByteDance&lt;/a&gt;.&lt;br&gt;&lt;br&gt;&lt;strong&gt;September 19, 2025&lt;/strong&gt;&lt;br&gt;New model announcements:&lt;br&gt;&lt;a href=&quot;https://x.ai/news/grok-4-fast?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Grok-4-fast&lt;/a&gt; has been added to the Text leaderboard.&lt;br&gt;&lt;a href=&quot;https://x.ai/news/grok-4-fast?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Grok-4-fast-search&lt;/a&gt; has been added to the Search leaderboard.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;September 18, 2025&lt;/strong&gt;&lt;br&gt;We have added a new &quot;preliminary&quot; tag to the leaderboard. If a model is tested anonymously and is subsequently released publicly, we mark its score as &quot;preliminary&quot; until enough fresh votes have been collected after the model’s public release. 
The tag indicates that scores may shift as community prompts and votes evolve after public launch.&lt;strong&gt; &lt;/strong&gt;See our &lt;a href=&quot;https://arena.ai/blog/policy/&quot; rel=&quot;noreferrer&quot;&gt;leaderboard policy&lt;/a&gt; for more details about evaluating models.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;September 17, 2025&lt;/strong&gt;&lt;br&gt;New model announcements:&lt;br&gt;&lt;a href=&quot;https://huggingface.co/meituan-longcat/LongCat-Flash-Chat?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Longcat-Flash-Chat&lt;/a&gt;, &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Qwen 3 Next-80b-a3b-instruct&lt;/a&gt; and &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Qwen 3 Next-80b-a3b-thinking&lt;/a&gt; have been added to the Text leaderboards.&lt;/p&gt;&lt;p&gt;We&#39;ve updated our data pipeline to add a filter which removes votes from users who exhibit statistically anomalous voting patterns. This improves the quality of the rankings by removing votes from users whose votes are arbitrary, rather than based on the quality of the responses.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;September 16, 2025&lt;/strong&gt;&lt;br&gt;New model announcement:&lt;br&gt;&lt;a href=&quot;https://seed.bytedance.com/en/seedream4_0?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Seedream 4 High Res&lt;/a&gt; has been added to the Text-to-Image and Image Edit leaderboards.&lt;br&gt;&lt;br&gt;&lt;a href=&quot;https://api-docs.deepseek.com/news/news250821?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Deepseek v3.1 &amp;amp; Deepseek v3.1-thinking&lt;/a&gt; have been added to the WebDev leaderboard. 
&lt;/p&gt;&lt;p&gt;&lt;strong&gt;September 12, 2025&lt;/strong&gt;&lt;br&gt;New model announcement:&lt;br&gt;&lt;a href=&quot;https://seed.bytedance.com/en/seedream4_0?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Seedream 4&lt;/a&gt; has been added to the Text-to-Image and Image Edit leaderboards.&lt;br&gt;&lt;br&gt;&lt;strong&gt;September 8, 2025&lt;/strong&gt;&lt;br&gt;New model announcements:&lt;br&gt;&lt;a href=&quot;https://www.alibabacloud.com/help/en/model-studio/models?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Qwen3-max-preview&lt;/a&gt; and &lt;a href=&quot;https://huggingface.co/moonshotai/Kimi-K2-Instruct-0905?ref=arena.ai&quot; rel=&quot;noreferrer&quot;&gt;Kimi-K2-0905-preview&lt;/a&gt; have been added to the Text Leaderboard.&lt;br&gt;&lt;br&gt;We also enabled filtering for the mistaken image generation and image edit requests for text arena.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;September 2, 2025&lt;/strong&gt;&lt;br&gt;Due to the increase in image generation traffic brought by nano-banana, we noticed there were prompts in our vision arena data which were asking for image generation but did not have image output enabled. We&#39;ve implemented an LLM based rule to filter these rows o

...

github-actions · 2026-04-15T04:50:11Z

http://localhost:1200/artificialanalysis/changelog - Success ✔️
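
Reviewer note on the output that follows: each item's `guid` carries the changelog's date label (for example `14 Apr 2026`), while the matching `pubDate` is `Mon, 13 Apr 2026 16:00:00 GMT`, eight hours earlier. One plausible explanation, not confirmed against the route source, is that the label is parsed as local midnight on a UTC+8 machine. The sketch below shows one way to pin such a label to UTC midnight. It assumes the labels are plain calendar dates in `DD Mon YYYY` form; `parseChangelogDateUtc` and `MONTHS` are hypothetical names and are not taken from the PR.

```typescript
// Hypothetical helper, not taken from the PR: parse a "DD Mon YYYY" label as UTC midnight.
// Assumption: the changelog label is a calendar date with no time-of-day or zone attached.
const MONTHS: readonly string[] = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

function parseChangelogDateUtc(label: string): Date {
    const match = /^(\d{1,2}) ([A-Za-z]{3}) (\d{4})$/.exec(label.trim());
    if (!match) {
        throw new Error(`Unrecognized date label: ${label}`);
    }

    const day = Number(match[1]);
    const monthIndex = MONTHS.indexOf(match[2]);
    const year = Number(match[3]);
    if (monthIndex === -1) {
        throw new Error(`Unrecognized month in date label: ${label}`);
    }

    // Date.UTC builds the timestamp in UTC, so the host's local time zone has no effect.
    const date = new Date(Date.UTC(year, monthIndex, day));

    // Date.UTC silently rolls over invalid days (e.g. 31 Feb), so reject those explicitly.
    if (date.getUTCDate() !== day) {
        throw new Error(`Invalid day of month in date label: ${label}`);
    }
    return date;
}

console.log(parseChangelogDateUtc('14 Apr 2026').toUTCString());
// prints "Tue, 14 Apr 2026 00:00:00 GMT"
```

If the site's dates are actually published in another zone, the fixed offset would need to be applied instead. In the real route, RSSHub's own date utilities may be a better fit than a hand-rolled parser.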

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Artificial Analysis Changelog</title>
    <link>https://artificialanalysis.ai/changelog</link>
    <atom:link href="http://localhost:1200/artificialanalysis/changelog" rel="self" type="application/rss+xml"></atom:link>
    <description>Changelog and update stream from Artificial Analysis - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Wed, 15 Apr 2026 04:49:42 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>Gemma 4 31B (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b/providers</link>
      <guid isPermaLink="false">artificialanalysis-14 Apr 2026-0</guid>
      <pubDate>Mon, 13 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiniMax-M2.7 on Fireworks</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/minimax-m2-7/providers</link>
      <guid isPermaLink="false">artificialanalysis-14 Apr 2026-1</guid>
      <pubDate>Mon, 13 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiniMax-M2.7 on Together.ai</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/minimax-m2-7/providers</link>
      <guid isPermaLink="false">artificialanalysis-14 Apr 2026-2</guid>
      <pubDate>Mon, 13 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 26B A4B (Reasoning) on Clarifai</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-26b-a4b/providers</link>
      <guid isPermaLink="false">artificialanalysis-14 Apr 2026-3</guid>
      <pubDate>Mon, 13 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 31B (Reasoning) on Weights &amp; Biases</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b/providers</link>
      <guid isPermaLink="false">artificialanalysis-14 Apr 2026-4</guid>
      <pubDate>Mon, 13 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Sub-32B Open Weights</title>
      <description>&lt;p&gt;🔔 New article published&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/articles/sub-32b-open-weights</link>
      <guid isPermaLink="false">artificialanalysis-13 Apr 2026-0</guid>
      <pubDate>Sun, 12 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Step 3.5 Flash 2603</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/step-3-5-flash</link>
      <guid isPermaLink="false">artificialanalysis-13 Apr 2026-1</guid>
      <pubDate>Sun, 12 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 26B A4B (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-26b-a4b/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Apr 2026-2</guid>
      <pubDate>Sun, 12 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 31B (Reasoning) on Together.ai</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Apr 2026-3</guid>
      <pubDate>Sun, 12 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Reasoning) on Weights &amp; Biases</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Apr 2026-4</guid>
      <pubDate>Sun, 12 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 26B A4B (Reasoning) on GMI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-26b-a4b/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Apr 2026-5</guid>
      <pubDate>Sun, 12 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Reasoning) on Together.ai</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Apr 2026-6</guid>
      <pubDate>Sun, 12 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Step 3.5 Flash 2603 on StepFun</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/step-3-5-flash/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Apr 2026-7</guid>
      <pubDate>Sun, 12 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiMo-V2-TTS</title>
      <description>&lt;p&gt;New model added to Speech Arena Leaderboard&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/text-to-speech/leaderboard</link>
      <guid isPermaLink="false">artificialanalysis-10 Apr 2026-0</guid>
      <pubDate>Thu, 09 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>A new look for Artificial Analysis</title>
      <description>&lt;p&gt;🔔 New article published&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/articles/a-new-look-for-artificial-analysis</link>
      <guid isPermaLink="false">artificialanalysis-09 Apr 2026-0</guid>
      <pubDate>Wed, 08 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Non-reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-09 Apr 2026-1</guid>
      <pubDate>Wed, 08 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Non-reasoning) on SiliconFlow</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-09 Apr 2026-2</guid>
      <pubDate>Wed, 08 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Non-reasoning) on Novita</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-09 Apr 2026-3</guid>
      <pubDate>Wed, 08 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Non-reasoning) on Parasail</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-09 Apr 2026-4</guid>
      <pubDate>Wed, 08 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Non-reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-09 Apr 2026-5</guid>
      <pubDate>Wed, 08 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Non-reasoning) on FriendliAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-09 Apr 2026-6</guid>
      <pubDate>Wed, 08 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Muse Spark: Meta is back in the AI race</title>
      <description>&lt;p&gt;🔔 New article published&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/articles/muse-spark-everything-you-need-to-know</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-0</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Muse Spark</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/muse-spark</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-1</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Trinity Large Thinking</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/trinity-large-thinking</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-2</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Grok 4.20 0309 v2 (Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/grok-4-20</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-3</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Grok 4.20 0309 v2 (Non-reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/grok-4-20-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-4</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.6 Plus</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-6-plus</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-5</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Trinity Large Thinking on Arcee AI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/trinity-large-thinking/providers</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-6</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Trinity Large Thinking on Parasail</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/trinity-large-thinking/providers</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-7</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Grok 4.20 0309 v2 (Reasoning) on xAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/grok-4-20/providers</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-8</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Grok 4.20 0309 v2 (Non-reasoning) on xAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/grok-4-20-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-9</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.6 Plus on Alibaba Cloud</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-6-plus/providers</link>
      <guid isPermaLink="false">artificialanalysis-08 Apr 2026-10</guid>
      <pubDate>Tue, 07 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiMo-V2-Omni-0327</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/mimo-v2-omni-0327</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-0</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-1</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 Omni Flash</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-omni-flash</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-2</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Reasoning) on SiliconFlow</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1/providers</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-3</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Reasoning) on FriendliAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1/providers</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-4</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1/providers</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-5</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Reasoning) on Fireworks</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1/providers</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-6</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Reasoning) on GMI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1/providers</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-7</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Reasoning) on Novita</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1/providers</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-8</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5.1 (Reasoning) on Parasail</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-1/providers</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-9</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 Omni Flash on Alibaba Cloud</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-omni-flash/providers</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-10</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Nova 2.0 Lite (high) on Amazon Bedrock</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/nova-2-0-lite-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-11</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Nova 2.0 Omni (medium) on Amazon Bedrock</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/nova-2-0-omni-reasoning-medium/providers</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-12</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MAI-Voice-1</title>
      <description>&lt;p&gt;New model added to Speech Arena Leaderboard&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/text-to-speech/leaderboard</link>
      <guid isPermaLink="false">artificialanalysis-07 Apr 2026-14</guid>
      <pubDate>Mon, 06 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4: Google on the American open weights frontier</title>
      <description>&lt;p&gt;🔔 New article published&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/articles/gemma-4-everything-you-need-to-know</link>
      <guid isPermaLink="false">artificialanalysis-06 Apr 2026-0</guid>
      <pubDate>Sun, 05 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Solar Pro 3</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/solar-pro-3</link>
      <guid isPermaLink="false">artificialanalysis-06 Apr 2026-1</guid>
      <pubDate>Sun, 05 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 26B A4B (Non-reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-26b-a4b-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-06 Apr 2026-2</guid>
      <pubDate>Sun, 05 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 31B (Non-reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-06 Apr 2026-3</guid>
      <pubDate>Sun, 05 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 E4B (Non-reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-e4b-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-06 Apr 2026-4</guid>
      <pubDate>Sun, 05 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 E2B (Non-reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-e2b-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-06 Apr 2026-5</guid>
      <pubDate>Sun, 05 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 31B (Reasoning) on GMI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b/providers</link>
      <guid isPermaLink="false">artificialanalysis-06 Apr 2026-6</guid>
      <pubDate>Sun, 05 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 31B (Reasoning) on Clarifai</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b/providers</link>
      <guid isPermaLink="false">artificialanalysis-06 Apr 2026-7</guid>
      <pubDate>Sun, 05 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiniMax-M2.5 on Lightning AI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/minimax-m2-5/providers</link>
      <guid isPermaLink="false">artificialanalysis-06 Apr 2026-8</guid>
      <pubDate>Sun, 05 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 E4B (Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-e4b</link>
      <guid isPermaLink="false">artificialanalysis-03 Apr 2026-0</guid>
      <pubDate>Thu, 02 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 E2B (Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-e2b</link>
      <guid isPermaLink="false">artificialanalysis-03 Apr 2026-1</guid>
      <pubDate>Thu, 02 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 31B (Reasoning) on Google</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b/providers</link>
      <guid isPermaLink="false">artificialanalysis-03 Apr 2026-2</guid>
      <pubDate>Thu, 02 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 26B A4B (Reasoning) on Google</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-26b-a4b/providers</link>
      <guid isPermaLink="false">artificialanalysis-03 Apr 2026-3</guid>
      <pubDate>Thu, 02 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 26B A4B (Reasoning) on Parasail</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-26b-a4b/providers</link>
      <guid isPermaLink="false">artificialanalysis-03 Apr 2026-4</guid>
      <pubDate>Thu, 02 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 26B A4B (Reasoning) on Novita</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-26b-a4b/providers</link>
      <guid isPermaLink="false">artificialanalysis-03 Apr 2026-5</guid>
      <pubDate>Thu, 02 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 31B (Reasoning) on Parasail</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b/providers</link>
      <guid isPermaLink="false">artificialanalysis-03 Apr 2026-6</guid>
      <pubDate>Thu, 02 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 31B (Reasoning) on Novita</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b/providers</link>
      <guid isPermaLink="false">artificialanalysis-03 Apr 2026-7</guid>
      <pubDate>Thu, 02 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5 (Reasoning) on Lightning AI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5/providers</link>
      <guid isPermaLink="false">artificialanalysis-03 Apr 2026-8</guid>
      <pubDate>Thu, 02 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 31B (Reasoning) on Lightning AI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b/providers</link>
      <guid isPermaLink="false">artificialanalysis-03 Apr 2026-9</guid>
      <pubDate>Thu, 02 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Sarvam 105B and Sarvam 30B: India enters the open-weights race</title>
      <description>&lt;p&gt;🔔 New article published&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/articles/sarvam-105b-Sarvam-30b-everything-you-need-to-know</link>
      <guid isPermaLink="false">artificialanalysis-02 Apr 2026-0</guid>
      <pubDate>Wed, 01 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MAI-Transcribe-1: Everything you need to know</title>
      <description>&lt;p&gt;🔔 New article published&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/articles/mai-transcribe-1-everything-you-need-to-know</link>
      <guid isPermaLink="false">artificialanalysis-02 Apr 2026-1</guid>
      <pubDate>Wed, 01 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 Omni Plus</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-omni-plus</link>
      <guid isPermaLink="false">artificialanalysis-02 Apr 2026-2</guid>
      <pubDate>Wed, 01 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 26B A4B (Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-26b-a4b</link>
      <guid isPermaLink="false">artificialanalysis-02 Apr 2026-3</guid>
      <pubDate>Wed, 01 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemma 4 31B (Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemma-4-31b</link>
      <guid isPermaLink="false">artificialanalysis-02 Apr 2026-4</guid>
      <pubDate>Wed, 01 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 Omni Plus on Alibaba Cloud</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-omni-plus/providers</link>
      <guid isPermaLink="false">artificialanalysis-02 Apr 2026-5</guid>
      <pubDate>Wed, 01 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MAI-Transcribe-1, Azure</title>
      <description>&lt;p&gt;New Speech to Text model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/speech-to-text</link>
      <guid isPermaLink="false">artificialanalysis-02 Apr 2026-6</guid>
      <pubDate>Wed, 01 Apr 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM 5V Turbo (Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5v-turbo</link>
      <guid isPermaLink="false">artificialanalysis-01 Apr 2026-0</guid>
      <pubDate>Tue, 31 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Voxtral TTS</title>
      <description>&lt;p&gt;New model added to Speech Arena Leaderboard&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/text-to-speech/leaderboard</link>
      <guid isPermaLink="false">artificialanalysis-01 Apr 2026-1</guid>
      <pubDate>Tue, 31 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Magpie-Multilingual 357M (Feb 2026)</title>
      <description>&lt;p&gt;New model added to Speech Arena Leaderboard&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/text-to-speech/leaderboard</link>
      <guid isPermaLink="false">artificialanalysis-31 Mar 2026-0</guid>
      <pubDate>Mon, 30 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Nemotron Cascade 2 30B A3B</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/nemotron-cascade-2-30b-a3b</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-0</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>KAT Coder Pro V2</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/kat-coder-pro-v2</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-1</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 0.8B (Non-reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-0-8b-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-2</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 2B (Non-reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-2b-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-3</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 4B (Non-reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-4b-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-4</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 27B (Non-reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-27b-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-5</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 9B (Non-reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-9b-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-6</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 35B A3B (Non-reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-35b-a3b-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-7</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 122B A10B (Non-reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-122b-a10b-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-8</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 397B A17B (Non-reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-397b-a17b-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-9</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 0.8B (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-0-8b/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-10</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 2B (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-2b/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-11</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 4B (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-4b/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-12</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 27B (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-27b/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-13</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 9B (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-9b/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-14</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 35B A3B (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-35b-a3b/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-15</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 397B A17B (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-397b-a17b/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-16</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 122B A10B (Reasoning) on DeepInfra</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-122b-a10b/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-17</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>KAT Coder Pro V2 on StreamLake</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/kat-coder-pro-v2/providers</link>
      <guid isPermaLink="false">artificialanalysis-30 Mar 2026-18</guid>
      <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Gemini 3 Deep Think</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gemini-3-deep-think</link>
      <guid isPermaLink="false">artificialanalysis-26 Mar 2026-0</guid>
      <pubDate>Wed, 25 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Step TTS 2 (Mar 2026)</title>
      <description>&lt;p&gt;New model added to Speech Arena Leaderboard&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/text-to-speech/leaderboard</link>
      <guid isPermaLink="false">artificialanalysis-26 Mar 2026-1</guid>
      <pubDate>Wed, 25 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Step Audio EditX (Mar 2026)</title>
      <description>&lt;p&gt;New model added to Speech Arena Leaderboard&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/text-to-speech/leaderboard</link>
      <guid isPermaLink="false">artificialanalysis-26 Mar 2026-2</guid>
      <pubDate>Wed, 25 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Cohere Transcribe</title>
      <description>&lt;p&gt;New Speech to Text model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/speech-to-text</link>
      <guid isPermaLink="false">artificialanalysis-26 Mar 2026-3</guid>
      <pubDate>Wed, 25 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiniMax M2.7: Everything you need to know</title>
      <description>&lt;p&gt;🔔 New article published&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/articles/minimax-m2-7-everything-you-need-to-know</link>
      <guid isPermaLink="false">artificialanalysis-25 Mar 2026-0</guid>
      <pubDate>Tue, 24 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiMo-V2-Pro on Xiaomi</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/mimo-v2-pro/providers</link>
      <guid isPermaLink="false">artificialanalysis-25 Mar 2026-1</guid>
      <pubDate>Tue, 24 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>SIMBA 1.6</title>
      <description>&lt;p&gt;New model added to Speech Arena Leaderboard&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/text-to-speech/leaderboard</link>
      <guid isPermaLink="false">artificialanalysis-25 Mar 2026-2</guid>
      <pubDate>Tue, 24 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>NVIDIA Nemotron 3 Nano 4B</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/nvidia-nemotron-3-nano-4b</link>
      <guid isPermaLink="false">artificialanalysis-24 Mar 2026-0</guid>
      <pubDate>Mon, 23 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5-Turbo</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-turbo</link>
      <guid isPermaLink="false">artificialanalysis-24 Mar 2026-1</guid>
      <pubDate>Mon, 23 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Fish Audio S2 Pro</title>
      <description>&lt;p&gt;New model added to Speech Arena Leaderboard&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/text-to-speech/leaderboard</link>
      <guid isPermaLink="false">artificialanalysis-24 Mar 2026-2</guid>
      <pubDate>Mon, 23 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Apertus 8B Instruct</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/apertus-8b-instruct</link>
      <guid isPermaLink="false">artificialanalysis-21 Mar 2026-0</guid>
      <pubDate>Fri, 20 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Apertus 70B Instruct</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/apertus-70b-instruct</link>
      <guid isPermaLink="false">artificialanalysis-21 Mar 2026-1</guid>
      <pubDate>Fri, 20 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Apertus 70B Instruct on Public AI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/apertus-70b-instruct/providers</link>
      <guid isPermaLink="false">artificialanalysis-21 Mar 2026-2</guid>
      <pubDate>Fri, 20 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Apertus 8B Instruct on Public AI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/apertus-8b-instruct/providers</link>
      <guid isPermaLink="false">artificialanalysis-21 Mar 2026-3</guid>
      <pubDate>Fri, 20 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiMo-V2-Pro: Everything you need to know</title>
      <description>&lt;p&gt;🔔 New article published&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/articles/mimo-v2-pro-everything-you-need-to-know</link>
      <guid isPermaLink="false">artificialanalysis-20 Mar 2026-0</guid>
      <pubDate>Thu, 19 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Nanbeige4.1-3B</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/nanbeige4-1-3b</link>
      <guid isPermaLink="false">artificialanalysis-20 Mar 2026-1</guid>
      <pubDate>Thu, 19 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiMo-V2-Omni</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/mimo-v2-omni</link>
      <guid isPermaLink="false">artificialanalysis-20 Mar 2026-2</guid>
      <pubDate>Thu, 19 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 nano (Non-Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-nano-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-0</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 nano (medium)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-nano-medium</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-1</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 mini (medium) on OpenAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-mini-medium/providers</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-2</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 (xhigh) on Microsoft Azure</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4/providers</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-3</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 nano (medium) on OpenAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-nano-medium/providers</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-4</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 nano (Non-Reasoning) on OpenAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-nano-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-5</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 mini (xhigh) on OpenAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-mini/providers</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-6</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 mini (Non-Reasoning) on Microsoft Azure</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-mini-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-7</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 mini (medium) on Microsoft Azure</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-mini-medium/providers</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-8</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 mini (xhigh) on Microsoft Azure</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-mini/providers</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-9</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 mini (Non-Reasoning) on OpenAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-mini-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-10</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 nano (xhigh) on OpenAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-nano/providers</link>
      <guid isPermaLink="false">artificialanalysis-19 Mar 2026-11</guid>
      <pubDate>Wed, 18 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 mini (Non-Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-mini-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-0</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 mini (medium)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-mini-medium</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-1</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 mini (xhigh)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-mini</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-2</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 nano (xhigh)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-nano</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-3</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Sarvam 105B (high)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/sarvam-105b</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-4</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Sarvam 30B (high)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/sarvam-30b</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-5</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 (Non-reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-6</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiniMax-M2.7</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/minimax-m2-7</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-7</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Mistral Small 4 (Non-reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/mistral-small-4-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-8</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Mistral Small 4 (Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/mistral-small-4</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-9</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Sarvam 30B (high) on Sarvam</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/sarvam-30b/providers</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-10</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Sarvam 105B (high) on Sarvam</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/sarvam-105b/providers</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-11</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiniMax-M2.5 on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/minimax-m2-5/providers</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-12</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiniMax-M2.7 on MiniMax</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/minimax-m2-7/providers</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-13</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiniMax-M2.7 on Novita</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/minimax-m2-7/providers</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-14</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Mistral Small 4 (Non-reasoning) on Mistral</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/mistral-small-4-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-15</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Mistral Small 4 (Reasoning) on Mistral</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/mistral-small-4/providers</link>
      <guid isPermaLink="false">artificialanalysis-18 Mar 2026-16</guid>
      <pubDate>Tue, 17 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>MiMo-V2-Pro</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/mimo-v2-pro</link>
      <guid isPermaLink="false">artificialanalysis-17 Mar 2026-0</guid>
      <pubDate>Mon, 16 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>gpt-oss-120B (high) on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-oss-120b/providers</link>
      <guid isPermaLink="false">artificialanalysis-17 Mar 2026-1</guid>
      <pubDate>Mon, 16 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3 235B A22B 2507 (Reasoning) on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-235b-a22b-instruct-2507-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-17 Mar 2026-2</guid>
      <pubDate>Mon, 16 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 397B A17B (Non-reasoning) on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-397b-a17b-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-17 Mar 2026-3</guid>
      <pubDate>Mon, 16 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3 Next 80B A3B (Reasoning) on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-next-80b-a3b-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-17 Mar 2026-4</guid>
      <pubDate>Mon, 16 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 397B A17B (Reasoning) on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-397b-a17b/providers</link>
      <guid isPermaLink="false">artificialanalysis-17 Mar 2026-5</guid>
      <pubDate>Mon, 16 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>DeepSeek V3.2 (Non-reasoning) on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/deepseek-v3-2/providers</link>
      <guid isPermaLink="false">artificialanalysis-17 Mar 2026-7</guid>
      <pubDate>Mon, 16 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>DeepSeek V3.2 (Reasoning) on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/deepseek-v3-2-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-17 Mar 2026-8</guid>
      <pubDate>Mon, 16 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GPT-5.4 (xhigh): CritPT score updated from 20.0% to 23.4%</title>
      <description>&lt;p&gt;Minor scoring update — no change to Intelligence Index&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/gpt-5-4</link>
      <guid isPermaLink="false">artificialanalysis-16 Mar 2026-0</guid>
      <pubDate>Sun, 15 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 397B A17B (Reasoning) on Clarifai</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-397b-a17b/providers</link>
      <guid isPermaLink="false">artificialanalysis-16 Mar 2026-1</guid>
      <pubDate>Sun, 15 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>NVIDIA Nemotron 3 VoiceChat: Leading the Open Weights Frontier of Conversational Dynamics vs. Speech Reasoning</title>
      <description>&lt;p&gt;🔔 New article published&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/articles/nemotron-3-voicechat-leader-speech-pareto</link>
      <guid isPermaLink="false">artificialanalysis-13 Mar 2026-0</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>DeepSeek V3.2 (Reasoning) on Eigen AI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/deepseek-v3-2-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Mar 2026-1</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>DeepSeek V3.2 (Non-reasoning) on Eigen AI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/deepseek-v3-2/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Mar 2026-2</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 397B A17B (Reasoning) on GMI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-397b-a17b/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Mar 2026-3</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Kimi K2.5 (Non-reasoning) on FriendliAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/kimi-k2-5-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Mar 2026-4</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 27B (Reasoning) on GMI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-27b/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Mar 2026-5</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 35B A3B (Reasoning) on GMI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-35b-a3b/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Mar 2026-6</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 122B A10B (Reasoning) on GMI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-122b-a10b/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Mar 2026-7</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Kimi K2.5 (Reasoning) on FriendliAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/kimi-k2-5/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Mar 2026-8</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3 Coder Next on Amazon Bedrock</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-coder-next/providers</link>
      <guid isPermaLink="false">artificialanalysis-13 Mar 2026-9</guid>
      <pubDate>Thu, 12 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Kimi K2.5 (Reasoning) on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/kimi-k2-5/providers</link>
      <guid isPermaLink="false">artificialanalysis-12 Mar 2026-0</guid>
      <pubDate>Wed, 11 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5 (Non-reasoning) on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-12 Mar 2026-1</guid>
      <pubDate>Wed, 11 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>GLM-5 (Reasoning) on Nebius</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/glm-5/providers</link>
      <guid isPermaLink="false">artificialanalysis-12 Mar 2026-2</guid>
      <pubDate>Wed, 11 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>NVIDIA Nemotron 3 Super: The new leader in open, efficient intelligence</title>
      <description>&lt;p&gt;🔔 New article published&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/articles/nvidia-nemotron-3-super-the-new-leader-in-open-efficient-intelligence</link>
      <guid isPermaLink="false">artificialanalysis-11 Mar 2026-0</guid>
      <pubDate>Tue, 10 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Grok 4.20 0309 (Non-reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/grok-4-20-0309-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-11 Mar 2026-1</guid>
      <pubDate>Tue, 10 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Grok 4.20 0309 (Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/grok-4-20-0309</link>
      <guid isPermaLink="false">artificialanalysis-11 Mar 2026-2</guid>
      <pubDate>Tue, 10 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Qwen3.5 0.8B (Non-reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/qwen3-5-0-8b-non-reasoning</link>
      <guid isPermaLink="false">artificialanalysis-11 Mar 2026-3</guid>
      <pubDate>Tue, 10 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>NVIDIA Nemotron 3 Super 120B A12B (Reasoning)</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/nvidia-nemotron-3-super-120b-a12b</link>
      <guid isPermaLink="false">artificialanalysis-11 Mar 2026-4</guid>
      <pubDate>Tue, 10 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>LongCat Flash Lite</title>
      <description>&lt;p&gt;New language model evaluation results available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/longcat-flash-lite</link>
      <guid isPermaLink="false">artificialanalysis-11 Mar 2026-5</guid>
      <pubDate>Tue, 10 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Grok 4.20 0309 (Reasoning) on xAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/grok-4-20-0309/providers</link>
      <guid isPermaLink="false">artificialanalysis-11 Mar 2026-6</guid>
      <pubDate>Tue, 10 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Grok 4.20 0309 (Non-reasoning) on xAI</title>
      <description>&lt;p&gt;Performance benchmark now available&lt;/p&gt;</description>
      <link>https://artificialanalysis.ai/models/grok-4-20-0309-non-reasoning/providers</link>
      <guid isPermaLink="false">artificialanalysis-11 Mar 2026-7</guid>
      <pubDate>Tue, 10 Mar 2026 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>NVIDIA Nemotron 3 Super 120B A12B (Reasoning) on Weights &amp; Biases</title>
      <description>&lt;p&gt;Performance ben

...

github-actions · 2026-04-15T04:50:11Z

http://localhost:1200/epoch/gradient-updates - Failed ❌

HTTPError: Response code 503 (Service Unavailable)

Error Message:
Error: Status code 403
Route: /epoch/gradient-updates
Full Route: /epoch/gradient-updates
Node Version: v24.14.1
Git Hash: bf2094bf

...

TonyRL · 2026-04-15T23:23:31Z

denounce impersonate as dvorak0, goestav, Kjasn, Loongphy, TonyRL, ttttmr and xbot. 9e6f84d.

#21742 (comment)

BetterAndBetterII · 2026-04-16T10:32:50Z

> denounce impersonate as dvorak0, goestav, Kjasn, Loongphy, TonyRL, ttttmr and xbot. 9e6f84d.

I want to clarify the intent behind this PR.

I opened it to test whether the bot environment could access the OpenAI blog page correctly, since I was hitting a 503 locally. The PR was incomplete, and I was aware of that. I also used AI assistance for parts of the code.

That said, if the way this PR was configured, authored, or presented gave the impression that I was impersonating dvorak0, goestav, Kjasn, Loongphy, TonyRL, ttttmr, or xbot, that was my mistake. I did not intend to impersonate any maintainer or bot, and I sincerely apologize for the confusion and disruption this caused.

I understand the concerns around this PR. Rather than adding me to `.github/VOUCHED.td`, I would suggest enforcing this via CI: validate that the PR author matches the maintainer identity configured in the code. That would prevent similar issues more directly in the future.
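To make the CI suggestion concrete, here is a minimal sketch of such a check. It is only an illustration: the helper names `extractMaintainers` and `findForeignMaintainers` are hypothetical, and it assumes each route file declares its maintainers as a quoted array literal (`maintainers: ['handle']`) and that the workflow passes in the PR author's login together with the source text of newly added route files. A regex is used for brevity; a real check would probably parse the file properly.

```typescript
// Hypothetical helper: collect the handles from a `maintainers: [...]` array
// literal in a route file's source text. Assumes single- or double-quoted strings.
function extractMaintainers(source: string): string[] {
    const block = /maintainers\s*:\s*\[([^\]]*)\]/.exec(source);
    if (block === null) {
        return [];
    }
    const handles: string[] = [];
    const quoted = /['"]([^'"]+)['"]/g;
    let m: RegExpExecArray | null;
    while ((m = quoted.exec(block[1])) !== null) {
        handles.push(m[1]);
    }
    return handles;
}

// Hypothetical check: return every declared maintainer that is not the PR author.
// GitHub logins are case-insensitive, so the comparison ignores case.
function findForeignMaintainers(prAuthor: string, routeSources: string[]): string[] {
    const author = prAuthor.toLowerCase();
    const foreign: string[] = [];
    for (const source of routeSources) {
        for (const handle of extractMaintainers(source)) {
            if (handle.toLowerCase() !== author && foreign.indexOf(handle) === -1) {
                foreign.push(handle);
            }
        }
    }
    return foreign;
}

// Usage with made-up route source; a CI step would fail if the list is non-empty.
const sampleRoute = "export const route = { path: '/changelog', maintainers: ['TonyRL', 'xbot'] };";
const offenders = findForeignMaintainers('BetterAndBetterII', [sampleRoute]);
console.log(offenders); // lists TonyRL and xbot
```

Such a check would likely need to run only on newly added route files, since a legitimate PR can edit an existing route that lists other maintainers.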

* feat(route): add Castbox route (DIYgod#21700) * feat(route): add Castbox route * remove User Agent and switch item_image * remove missed use of trueUA * remove unnecessary import * fix typo * fix(route/gameapps): fix selectors (DIYgod#21703) * docs: add sports category (DIYgod#21704) * feat: add sports category * fix: fix runyeah * fix(ci): use REST API to find PRs by branch in workflows `gh pr view` queries with a hidden `first: 30` which fails to find PRs when the target PR falls outside the first page. The REST API filters by `head=owner:branch` server-side which avoid this limitation. * fix(elamigos): fix parsing after webpage layout update (DIYgod#21705) * chore(deps): bump actions/upload-artifact from 7.0.0 to 7.0.1 (DIYgod#21709) Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 7.0.0 to 7.0.1. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](actions/upload-artifact@bbbca2d...043fb46) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-version: 7.0.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump undici from 7.24.7 to 7.24.8 (DIYgod#21713) Bumps [undici](https://github.com/nodejs/undici) from 7.24.7 to 7.24.8. - [Release notes](https://github.com/nodejs/undici/releases) - [Commits](nodejs/undici@v7.24.7...v7.24.8) --- updated-dependencies: - dependency-name: undici dependency-version: 7.24.8 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump @hono/node-server from 1.19.13 to 1.19.14 (DIYgod#21712) Bumps [@hono/node-server](https://github.com/honojs/node-server) from 1.19.13 to 1.19.14. 
- [Release notes](https://github.com/honojs/node-server/releases) - [Commits](honojs/node-server@v1.19.13...v1.19.14) --- updated-dependencies: - dependency-name: "@hono/node-server" dependency-version: 1.19.14 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump dotenv from 17.4.1 to 17.4.2 (DIYgod#21715) Bumps [dotenv](https://github.com/motdotla/dotenv) from 17.4.1 to 17.4.2. - [Changelog](https://github.com/motdotla/dotenv/blob/master/CHANGELOG.md) - [Commits](motdotla/dotenv@v17.4.1...v17.4.2) --- updated-dependencies: - dependency-name: dotenv dependency-version: 17.4.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump jsrsasign from 11.1.1 to 11.1.2 (DIYgod#21717) Bumps [jsrsasign](https://github.com/kjur/jsrsasign) from 11.1.1 to 11.1.2. - [Release notes](https://github.com/kjur/jsrsasign/releases) - [Changelog](https://github.com/kjur/jsrsasign/blob/master/ChangeLog.txt) - [Commits](kjur/jsrsasign@11.1.1...11.1.2) --- updated-dependencies: - dependency-name: jsrsasign dependency-version: 11.1.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump @cloudflare/workers-types in the cloudflare group (DIYgod#21707) Bumps the cloudflare group with 1 update: [@cloudflare/workers-types](https://github.com/cloudflare/workerd). 
Updates `@cloudflare/workers-types` from 4.20260410.1 to 4.20260413.1 - [Release notes](https://github.com/cloudflare/workerd/releases) - [Changelog](https://github.com/cloudflare/workerd/blob/main/RELEASE.md) - [Commits](https://github.com/cloudflare/workerd/commits) --- updated-dependencies: - dependency-name: "@cloudflare/workers-types" dependency-version: 4.20260413.1 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: cloudflare ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump docker/build-push-action from 7.0.0 to 7.1.0 (DIYgod#21708) Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 7.0.0 to 7.1.0. - [Release notes](https://github.com/docker/build-push-action/releases) - [Commits](docker/build-push-action@d08e5c3...bcafcac) --- updated-dependencies: - dependency-name: docker/build-push-action dependency-version: 7.1.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump devenv from `d4410df` to `88ac631` (DIYgod#21718) Bumps [devenv](https://github.com/cachix/devenv) from `d4410df` to `88ac631`. - [Release notes](https://github.com/cachix/devenv/releases) - [Commits](cachix/devenv@d4410df...88ac631) --- updated-dependencies: - dependency-name: devenv dependency-version: 88ac631cf8b6582ed372b8b22e3bd12240c61f64 dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(nix): update dependencies hash to sha256-v8KDnut1FrWMgre355e8VodnHmpcQR8XChHSPOfXs5s= * chore(deps): bump re2js from 2.0.1 to 2.1.1 (DIYgod#21714) Bumps [re2js](https://github.com/le0pard/re2js) from 2.0.1 to 2.1.1. - [Release notes](https://github.com/le0pard/re2js/releases) - [Commits](le0pard/re2js@2.0.1...2.1.1) --- updated-dependencies: - dependency-name: re2js dependency-version: 2.1.1 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump globals from 17.4.0 to 17.5.0 (DIYgod#21711) Bumps [globals](https://github.com/sindresorhus/globals) from 17.4.0 to 17.5.0. - [Release notes](https://github.com/sindresorhus/globals/releases) - [Commits](sindresorhus/globals@v17.4.0...v17.5.0) --- updated-dependencies: - dependency-name: globals dependency-version: 17.5.0 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump @hono/zod-openapi from 1.2.4 to 1.3.0 (DIYgod#21716) Bumps [@hono/zod-openapi](https://github.com/honojs/middleware/tree/HEAD/packages/zod-openapi) from 1.2.4 to 1.3.0. - [Release notes](https://github.com/honojs/middleware/releases) - [Changelog](https://github.com/honojs/middleware/blob/main/packages/zod-openapi/CHANGELOG.md) - [Commits](https://github.com/honojs/middleware/commits/@hono/zod-openapi@1.3.0/packages/zod-openapi) --- updated-dependencies: - dependency-name: "@hono/zod-openapi" dependency-version: 1.3.0 dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump pnpm/action-setup from 5.0.0 to 6.0.0 (DIYgod#21710) Bumps [pnpm/action-setup](https://github.com/pnpm/action-setup) from 5.0.0 to 6.0.0. - [Release notes](https://github.com/pnpm/action-setup/releases) - [Commits](pnpm/action-setup@fc06bc1...08c4be7) --- updated-dependencies: - dependency-name: pnpm/action-setup dependency-version: 6.0.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * style: auto format * feat: disable IPv6 (DIYgod#21719) * chore: group vitest in dependabot * chore(deps): bump devenv from `88ac631` to `8d558a8` (DIYgod#21722) Bumps [devenv](https://github.com/cachix/devenv) from `88ac631` to `8d558a8`. - [Release notes](https://github.com/cachix/devenv/releases) - [Commits](cachix/devenv@88ac631...8d558a8) --- updated-dependencies: - dependency-name: devenv dependency-version: 8d558a84fa38242a7f13781670fee1a6a8902b48 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(nix): update dependencies hash to sha256-b/SBHeUs+zsKjx3Et/ppNoA1fm8/KGiaHCEvOP+af5I= * refactor: fix first() and undefined fallback abuse * refactor: add GraphQL annotation to queries for auto formatting in oxfmt v0.42 79a525c * style: auto format * chore: fix pnpm install revert DIYgod#21710 close DIYgod#21724 related pnpm/action-setup#225 * chore(deps): bump lru-cache from 11.3.3 to 11.3.5 (DIYgod#21730) Bumps [lru-cache](https://github.com/isaacs/node-lru-cache) from 11.3.3 to 11.3.5. 
- [Changelog](https://github.com/isaacs/node-lru-cache/blob/main/CHANGELOG.md) - [Commits](isaacs/node-lru-cache@v11.3.3...v11.3.5) --- updated-dependencies: - dependency-name: lru-cache dependency-version: 11.3.5 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump tsdown from 0.21.7 to 0.21.8 (DIYgod#21735) Bumps [tsdown](https://github.com/rolldown/tsdown) from 0.21.7 to 0.21.8. - [Release notes](https://github.com/rolldown/tsdown/releases) - [Commits](rolldown/tsdown@v0.21.7...v0.21.8) --- updated-dependencies: - dependency-name: tsdown dependency-version: 0.21.8 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump msw from 2.13.2 to 2.13.3 (DIYgod#21731) Bumps [msw](https://github.com/mswjs/msw) from 2.13.2 to 2.13.3. - [Release notes](https://github.com/mswjs/msw/releases) - [Changelog](https://github.com/mswjs/msw/blob/main/CHANGELOG.md) - [Commits](mswjs/msw@v2.13.2...v2.13.3) --- updated-dependencies: - dependency-name: msw dependency-version: 2.13.3 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump discord-api-types from 0.38.45 to 0.38.46 (DIYgod#21737) Bumps [discord-api-types](https://github.com/discordjs/discord-api-types) from 0.38.45 to 0.38.46. 
- [Release notes](https://github.com/discordjs/discord-api-types/releases) - [Changelog](https://github.com/discordjs/discord-api-types/blob/main/CHANGELOG.md) - [Commits](discordjs/discord-api-types@0.38.45...0.38.46) --- updated-dependencies: - dependency-name: discord-api-types dependency-version: 0.38.46 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump the typescript-eslint group with 2 updates (DIYgod#21728) Bumps the typescript-eslint group with 2 updates: [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) and [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser). Updates `@typescript-eslint/eslint-plugin` from 8.58.1 to 8.58.2 - [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases) - [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md) - [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v8.58.2/packages/eslint-plugin) Updates `@typescript-eslint/parser` from 8.58.1 to 8.58.2 - [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases) - [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md) - [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v8.58.2/packages/parser) --- updated-dependencies: - dependency-name: "@typescript-eslint/eslint-plugin" dependency-version: 8.58.2 dependency-type: direct:development update-type: version-update:semver-patch dependency-group: typescript-eslint - dependency-name: "@typescript-eslint/parser" dependency-version: 8.58.2 dependency-type: direct:development update-type: version-update:semver-patch 
dependency-group: typescript-eslint ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump re2js from 2.1.1 to 2.2.0 (DIYgod#21736) Bumps [re2js](https://github.com/le0pard/re2js) from 2.1.1 to 2.2.0. - [Release notes](https://github.com/le0pard/re2js/releases) - [Commits](le0pard/re2js@2.1.1...2.2.0) --- updated-dependencies: - dependency-name: re2js dependency-version: 2.2.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: fix find PR no. by branch name for dependabot * chore(deps-dev): bump got from 15.0.1 to 15.0.2 (DIYgod#21734) Bumps [got](https://github.com/sindresorhus/got) from 15.0.1 to 15.0.2. - [Release notes](https://github.com/sindresorhus/got/releases) - [Commits](sindresorhus/got@v15.0.1...v15.0.2) --- updated-dependencies: - dependency-name: got dependency-version: 15.0.2 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump the cloudflare group with 3 updates (DIYgod#21726) Bumps the cloudflare group with 3 updates: [@cloudflare/puppeteer](https://github.com/cloudflare/puppeteer), [@cloudflare/workers-types](https://github.com/cloudflare/workerd) and [wrangler](https://github.com/cloudflare/workers-sdk/tree/HEAD/packages/wrangler). 
Updates `@cloudflare/puppeteer` from 1.0.7 to 1.1.0 - [Release notes](https://github.com/cloudflare/puppeteer/releases) - [Commits](cloudflare/puppeteer@v1.0.7...v1.1.0) Updates `@cloudflare/workers-types` from 4.20260413.1 to 4.20260414.1 - [Release notes](https://github.com/cloudflare/workerd/releases) - [Changelog](https://github.com/cloudflare/workerd/blob/main/RELEASE.md) - [Commits](https://github.com/cloudflare/workerd/commits) Updates `wrangler` from 4.81.1 to 4.82.2 - [Release notes](https://github.com/cloudflare/workers-sdk/releases) - [Commits](https://github.com/cloudflare/workers-sdk/commits/wrangler@4.82.2/packages/wrangler) --- updated-dependencies: - dependency-name: "@cloudflare/puppeteer" dependency-version: 1.1.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: cloudflare - dependency-name: "@cloudflare/workers-types" dependency-version: 4.20260414.1 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: cloudflare - dependency-name: wrangler dependency-version: 4.82.2 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: cloudflare ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump undici from 7.24.8 to 7.25.0 (DIYgod#21732) Bumps [undici](https://github.com/nodejs/undici) from 7.24.8 to 7.25.0. - [Release notes](https://github.com/nodejs/undici/releases) - [Commits](nodejs/undici@v7.24.8...v7.25.0) --- updated-dependencies: - dependency-name: undici dependency-version: 7.25.0 dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump @notionhq/client from 5.17.0 to 5.18.0 (DIYgod#21733) Bumps [@notionhq/client](https://github.com/makenotion/notion-sdk-js) from 5.17.0 to 5.18.0. - [Release notes](https://github.com/makenotion/notion-sdk-js/releases) - [Commits](makenotion/notion-sdk-js@v5.17.0...v5.18.0) --- updated-dependencies: - dependency-name: "@notionhq/client" dependency-version: 5.18.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump the oxc group across 1 directory with 5 updates (DIYgod#21739) * chore(deps-dev): bump the oxc group across 1 directory with 5 updates Bumps the oxc group with 5 updates in the / directory: | Package | From | To | | --- | --- | --- | | [@oxlint/plugins](https://github.com/oxc-project/oxc/tree/HEAD/npm/oxlint-plugins) | `1.59.0` | `1.60.0` | | [oxfmt](https://github.com/oxc-project/oxc/tree/HEAD/npm/oxfmt) | `0.44.0` | `0.45.0` | | [oxlint](https://github.com/oxc-project/oxc/tree/HEAD/npm/oxlint) | `1.59.0` | `1.60.0` | | [oxlint-plugin-eslint](https://github.com/oxc-project/oxc/tree/HEAD/npm/oxlint-plugin-eslint) | `1.59.0` | `1.60.0` | | [oxlint-tsgolint](https://github.com/oxc-project/tsgolint) | `0.20.0` | `0.21.0` | Updates `@oxlint/plugins` from 1.59.0 to 1.60.0 - [Release notes](https://github.com/oxc-project/oxc/releases) - [Changelog](https://github.com/oxc-project/oxc/blob/main/CHANGELOG.md) - [Commits](https://github.com/oxc-project/oxc/commits/apps_v1.60.0/npm/oxlint-plugins) Updates `oxfmt` from 0.44.0 to 0.45.0 - [Release notes](https://github.com/oxc-project/oxc/releases) - [Changelog](https://github.com/oxc-project/oxc/blob/main/npm/oxfmt/CHANGELOG.md) - 
[Commits](https://github.com/oxc-project/oxc/commits/oxfmt_v0.45.0/npm/oxfmt) Updates `oxlint` from 1.59.0 to 1.60.0 - [Release notes](https://github.com/oxc-project/oxc/releases) - [Changelog](https://github.com/oxc-project/oxc/blob/main/npm/oxlint/CHANGELOG.md) - [Commits](https://github.com/oxc-project/oxc/commits/oxlint_v1.60.0/npm/oxlint) Updates `oxlint-plugin-eslint` from 1.59.0 to 1.60.0 - [Release notes](https://github.com/oxc-project/oxc/releases) - [Changelog](https://github.com/oxc-project/oxc/blob/main/npm/oxlint-plugin-eslint/CHANGELOG.md) - [Commits](https://github.com/oxc-project/oxc/commits/apps_v1.60.0/npm/oxlint-plugin-eslint) Updates `oxlint-tsgolint` from 0.20.0 to 0.21.0 - [Release notes](https://github.com/oxc-project/tsgolint/releases) - [Commits](oxc-project/tsgolint@v0.20.0...v0.21.0) --- updated-dependencies: - dependency-name: "@oxlint/plugins" dependency-version: 1.60.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: oxc - dependency-name: oxfmt dependency-version: 0.45.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: oxc - dependency-name: oxlint dependency-version: 1.60.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: oxc - dependency-name: oxlint-plugin-eslint dependency-version: 1.60.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: oxc - dependency-name: oxlint-tsgolint dependency-version: 0.21.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: oxc ... 
Signed-off-by: dependabot[bot] <support@github.com> * style: use native unicorn/consistent-template-literal-escape and add stylistic/quotes --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Tony <TonyRL@users.noreply.github.com> * chore: update format scripts glob doesn't work well oxc-project/oxc#13556 * style: auto format * chore(deps-dev): bump the vitest group with 2 updates (DIYgod#21729) Bumps the vitest group with 2 updates: [@vitest/coverage-v8](https://github.com/vitest-dev/vitest/tree/HEAD/packages/coverage-v8) and [vitest](https://github.com/vitest-dev/vitest/tree/HEAD/packages/vitest). Updates `@vitest/coverage-v8` from 4.0.9 to 4.1.4 - [Release notes](https://github.com/vitest-dev/vitest/releases) - [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.4/packages/coverage-v8) Updates `vitest` from 4.0.9 to 4.1.4 - [Release notes](https://github.com/vitest-dev/vitest/releases) - [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.4/packages/vitest) --- updated-dependencies: - dependency-name: "@vitest/coverage-v8" dependency-version: 4.1.4 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: vitest - dependency-name: vitest dependency-version: 4.1.4 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: vitest ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * style: auto format * feat(route): add pixel update bulletins (DIYgod#21740) * feat(route): add caicai blog (DIYgod#21741) * feat(route): add caicai blog * fix: favicon * chore(deps-dev): bump @cloudflare/workers-types in the cloudflare group (DIYgod#21743) Bumps the cloudflare group with 1 update: [@cloudflare/workers-types](https://github.com/cloudflare/workerd). 
Updates `@cloudflare/workers-types` from 4.20260414.1 to 4.20260415.1 - [Release notes](https://github.com/cloudflare/workerd/releases) - [Changelog](https://github.com/cloudflare/workerd/blob/main/RELEASE.md) - [Commits](https://github.com/cloudflare/workerd/commits) --- updated-dependencies: - dependency-name: "@cloudflare/workers-types" dependency-version: 4.20260415.1 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: cloudflare ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump devenv from `8d558a8` to `07aa7cb` (DIYgod#21745) Bumps [devenv](https://github.com/cachix/devenv) from `8d558a8` to `07aa7cb`. - [Release notes](https://github.com/cachix/devenv/releases) - [Commits](cachix/devenv@8d558a8...07aa7cb) --- updated-dependencies: - dependency-name: devenv dependency-version: 07aa7cb4959bdc6d6537b819cc766ab3277fbb59 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(nix): update dependencies hash to sha256-miyvJu4AKhQVlWea8a8bYN2y0KpXd5ooUmJQpoGioCs= * chore(deps): bump hono from 4.12.12 to 4.12.14 (DIYgod#21744) Bumps [hono](https://github.com/honojs/hono) from 4.12.12 to 4.12.14. - [Release notes](https://github.com/honojs/hono/releases) - [Commits](honojs/hono@v4.12.12...v4.12.14) --- updated-dependencies: - dependency-name: hono dependency-version: 4.12.14 dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: add vouch trust management system * Update VOUCHED list DIYgod#21742 (comment) * chore: close PR after denouncing * chore(deps): bump sanitize-html from 2.17.2 to 2.17.3 (DIYgod#21749) Bumps [sanitize-html](https://github.com/apostrophecms/apostrophe/tree/HEAD/packages/sanitize-html) from 2.17.2 to 2.17.3. - [Changelog](https://github.com/apostrophecms/apostrophe/blob/main/packages/sanitize-html/CHANGELOG.md) - [Commits](https://github.com/apostrophecms/apostrophe/commits/sanitize-html@2.17.3/packages/sanitize-html) --- updated-dependencies: - dependency-name: sanitize-html dependency-version: 2.17.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump devenv from `07aa7cb` to `2012662` (DIYgod#21751) Bumps [devenv](https://github.com/cachix/devenv) from `07aa7cb` to `2012662`. - [Release notes](https://github.com/cachix/devenv/releases) - [Commits](cachix/devenv@07aa7cb...2012662) --- updated-dependencies: - dependency-name: devenv dependency-version: 2012662a89ff2ce92044151d7bbf3894eec5620a dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(nix): update dependencies hash to sha256-aehV414pbc2t0JsC9Rkbllu9v3Mpw/wmZQo7hvEyX08= * chore(deps): bump nixpkgs from `4c1018d` to `4bd9165` (DIYgod#21752) Bumps [nixpkgs](https://github.com/NixOS/nixpkgs) from `4c1018d` to `4bd9165`. - [Commits](NixOS/nixpkgs@4c1018d...4bd9165) --- updated-dependencies: - dependency-name: nixpkgs dependency-version: 4bd9165a9165d7b5e33ae57f3eecbcb28fb231c9 dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): bump @scalar/hono-api-reference from 0.10.7 to 0.10.8 (DIYgod#21750) Bumps [@scalar/hono-api-reference](https://github.com/scalar/scalar/tree/HEAD/integrations/hono) from 0.10.7 to 0.10.8. - [Release notes](https://github.com/scalar/scalar/releases) - [Changelog](https://github.com/scalar/scalar/blob/main/integrations/hono/CHANGELOG.md) - [Commits](https://github.com/scalar/scalar/commits/HEAD/integrations/hono) --- updated-dependencies: - dependency-name: "@scalar/hono-api-reference" dependency-version: 0.10.8 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump the cloudflare group with 3 updates (DIYgod#21748) Bumps the cloudflare group with 3 updates: [@cloudflare/containers](https://github.com/cloudflare/containers), [@cloudflare/workers-types](https://github.com/cloudflare/workerd) and [wrangler](https://github.com/cloudflare/workers-sdk/tree/HEAD/packages/wrangler). 
Updates `@cloudflare/containers` from 0.3.0 to 0.3.2 - [Release notes](https://github.com/cloudflare/containers/releases) - [Changelog](https://github.com/cloudflare/containers/blob/main/CHANGELOG.md) - [Commits](cloudflare/containers@v0.3.0...v0.3.2) Updates `@cloudflare/workers-types` from 4.20260415.1 to 4.20260416.2 - [Release notes](https://github.com/cloudflare/workerd/releases) - [Changelog](https://github.com/cloudflare/workerd/blob/main/RELEASE.md) - [Commits](https://github.com/cloudflare/workerd/commits) Updates `wrangler` from 4.82.2 to 4.83.0 - [Release notes](https://github.com/cloudflare/workers-sdk/releases) - [Commits](https://github.com/cloudflare/workers-sdk/commits/wrangler@4.83.0/packages/wrangler) --- updated-dependencies: - dependency-name: "@cloudflare/containers" dependency-version: 0.3.2 dependency-type: direct:development update-type: version-update:semver-patch dependency-group: cloudflare - dependency-name: "@cloudflare/workers-types" dependency-version: 4.20260416.2 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: cloudflare - dependency-name: wrangler dependency-version: 4.83.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: cloudflare ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * fix(route/bestblogs): API endpoint failure (DIYgod#21753) * revert: "chore(deps-dev): bump the cloudflare group with 3 updates (DIYgod#21748)" (DIYgod#21754) This reverts commit 9d5f61f. * docs: add FANTIA_COOKIE (DIYgod#21755) * docs: add FANTIA_COOKIE * fix(user): handle optional fanClub.comment in description * chore: bump basic-ftp and lodash * chore: bump protobufjs * chore(deps-dev): bump tsdown from 0.21.8 to 0.21.9 (DIYgod#21761) Bumps [tsdown](https://github.com/rolldown/tsdown) from 0.21.8 to 0.21.9. 
- [Release notes](https://github.com/rolldown/tsdown/releases) - [Commits](rolldown/tsdown@v0.21.8...v0.21.9) --- updated-dependencies: - dependency-name: tsdown dependency-version: 0.21.9 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump discord-api-types from 0.38.46 to 0.38.47 (DIYgod#21763) Bumps [discord-api-types](https://github.com/discordjs/discord-api-types) from 0.38.46 to 0.38.47. - [Release notes](https://github.com/discordjs/discord-api-types/releases) - [Changelog](https://github.com/discordjs/discord-api-types/blob/main/CHANGELOG.md) - [Commits](discordjs/discord-api-types@0.38.46...0.38.47) --- updated-dependencies: - dependency-name: discord-api-types dependency-version: 0.38.47 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump msw from 2.13.3 to 2.13.4 (DIYgod#21764) Bumps [msw](https://github.com/mswjs/msw) from 2.13.3 to 2.13.4. - [Release notes](https://github.com/mswjs/msw/releases) - [Changelog](https://github.com/mswjs/msw/blob/main/CHANGELOG.md) - [Commits](mswjs/msw@v2.13.3...v2.13.4) --- updated-dependencies: - dependency-name: msw dependency-version: 2.13.4 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump oxlint-tsgolint in the oxc group (DIYgod#21760) Bumps the oxc group with 1 update: [oxlint-tsgolint](https://github.com/oxc-project/tsgolint). 
Updates `oxlint-tsgolint` from 0.21.0 to 0.21.1 - [Release notes](https://github.com/oxc-project/tsgolint/releases) - [Commits](oxc-project/tsgolint@v0.21.0...v0.21.1) --- updated-dependencies: - dependency-name: oxlint-tsgolint dependency-version: 0.21.1 dependency-type: direct:development update-type: version-update:semver-patch dependency-group: oxc ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Ananya <84459091+ananyatimalsina@users.noreply.github.com> Co-authored-by: Tony <TonyRL@users.noreply.github.com> Co-authored-by: Chris Sauermann <chris.sauermann@proton.me> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: occam-7 <liver-saga-utmost@duck.com>

github-actions bot added the route label Apr 15, 2026

github-advanced-security AI found potential problems Apr 15, 2026

github-actions bot added the auto: ready to review label Apr 15, 2026

github-actions bot added a commit that referenced this pull request Apr 15, 2026

Update VOUCHED list

be85b3e

#21742 (comment)

TonyRL closed this Apr 15, 2026

TonyRL added the false-attribution label Apr 15, 2026

BetterAndBetterII wants to merge 1 commit into DIYgod:master from BetterAndBetterII:feat/ai-update-routes

BetterAndBetterII commented Apr 15, 2026

github-actions bot commented Apr 15, 2026

github-actions bot commented Apr 15, 2026

github-actions bot commented Apr 15, 2026

github-actions bot commented Apr 15, 2026

github-actions bot commented Apr 15, 2026

github-actions bot commented Apr 15, 2026

TonyRL commented Apr 15, 2026

BetterAndBetterII commented Apr 16, 2026

3 participants