
[fix][5875912] Fix autoquant-autodeploy example #878

Merged
Fridah-nv merged 2 commits into main from fridah/fix-ad-example on Feb 11, 2026

Conversation

@Fridah-nv (Contributor) commented Feb 10, 2026

What does this PR do?

Type of change: Bug fix

Overview: Fixes the autoquant-autodeploy example. See the bug ticket (5875912) for details.

Usage

# Add a code snippet demonstrating how to use this

Testing

Tested with

./scripts/run_auto_quant_and_deploy.sh \
    --hf_ckpt ./models/Qwen/Qwen3-8B \
    --save_quantized_ckpt ./qwen3_8B_autoquant \
    --quant fp8 \
    --effective_bits 10.0

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes/No
  • Did you write any new necessary tests?: Yes/No
  • Did you add or update any necessary documentation?: Yes/No
  • Did you update Changelog?: Yes/No

Additional Information

Summary by CodeRabbit

  • Refactor
    • Simplified LLM initialization by removing intermediate configuration layer
    • Updated attention backend from triton to flashinfer

Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
@Fridah-nv self-assigned this Feb 10, 2026
@Fridah-nv requested a review from a team as a code owner February 10, 2026 21:41
@Fridah-nv requested a review from meenchen February 10, 2026 21:41
@coderabbitai bot (Contributor) commented Feb 10, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Walkthrough

The pull request simplifies LLM initialization in the API server by removing intermediate AutoDeployConfig object creation and replacing it with direct LLM instantiation. The attn_backend parameter is explicitly changed from triton to flashinfer during this refactoring.

Changes

| Cohort / File(s) | Summary |
|---|---|
| LLM Initialization Refactoring: examples/llm_autodeploy/api_server.py | Removed AutoDeployConfig import and usage. Replaced AutoDeployConfig creation and to_llm_kwargs()-based LLM construction with direct LLM(...) instantiation. Changed attn_backend parameter from triton to flashinfer. |
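
To make the shape of this refactor concrete, here is a minimal sketch of the before and after construction paths. It does not import the real library: `StubLLM` and `StubAutoDeployConfig` are hypothetical stand-ins, and their fields (other than the `attn_backend` values and the `to_llm_kwargs()` name mentioned above) are assumptions for illustration, not the actual signatures used in api_server.py.

```python
from dataclasses import asdict, dataclass


@dataclass
class StubLLM:
    """Hypothetical stand-in for the LLM class used by api_server.py."""

    model: str
    attn_backend: str
    compile_backend: str


@dataclass
class StubAutoDeployConfig:
    """Hypothetical stand-in for the removed intermediate config layer."""

    model: str
    attn_backend: str
    compile_backend: str

    def to_llm_kwargs(self) -> dict:
        # Flatten the config into keyword arguments for the LLM constructor.
        return asdict(self)


def build_llm_before(ckpt: str, compile_backend: str) -> StubLLM:
    # Old shape: build a config object, then unpack it into the LLM.
    config = StubAutoDeployConfig(
        model=ckpt, attn_backend="triton", compile_backend=compile_backend
    )
    return StubLLM(**config.to_llm_kwargs())


def build_llm_after(ckpt: str, compile_backend: str) -> StubLLM:
    # New shape: pass the arguments straight to the LLM constructor,
    # with the attention backend set explicitly to flashinfer.
    return StubLLM(
        model=ckpt, attn_backend="flashinfer", compile_backend=compile_backend
    )
```

Both helpers return the same kind of object; the only observable difference in this sketch is the `attn_backend` value, which mirrors the change described in the summary.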

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title references a bug fix for an autoquant-autodeploy example and includes a ticket reference, which aligns with the actual changes removing AutoDeployConfig import and simplifying LLM initialization in the example code. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@examples/llm_autodeploy/api_server.py`:
- Around line 48-49: The BuildConfig type and the local variable build_config
are now unused dead code; remove the BuildConfig import and delete the two lines
that create and modify build_config (the BuildConfig(...) instantiation and the
assignment to build_config.plugin_config.tokens_per_block) so there are no
unused symbols left (search for BuildConfig and build_config in api_server.py to
locate the exact spots).
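
One way to locate those spots programmatically is sketched below with Python's standard `ast` module. The helper name `find_symbol_lines` and the sample source are hypothetical (the sample is not the real api_server.py, and the value assigned to `tokens_per_block` is a placeholder); the sketch only reports where each symbol is imported or referenced.

```python
import ast


def find_symbol_lines(source: str, symbols: set[str]) -> dict[str, list[int]]:
    """Map each symbol to the sorted line numbers where it is imported or named."""
    hits: dict[str, list[int]] = {name: [] for name in symbols}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id in hits:
            # Covers both reads and assignments of the bare name.
            hits[node.id].append(node.lineno)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                name = alias.asname or alias.name
                if name in hits:
                    hits[name].append(node.lineno)
    return {name: sorted(set(lines)) for name, lines in hits.items()}


# Hypothetical sample that mimics the pattern described in the comment.
SAMPLE = '''\
from some_pkg import BuildConfig, LLM

build_config = BuildConfig()
build_config.plugin_config.tokens_per_block = 64
llm = LLM(model="ckpt")
'''

if __name__ == "__main__":
    print(find_symbol_lines(SAMPLE, {"BuildConfig", "build_config"}))
```

If every reported line is either the import or one of the two `build_config` statements, the symbols are not consumed anywhere else and can be removed together.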
🧹 Nitpick comments (1)
examples/llm_autodeploy/api_server.py (1)

147-152: --backend argument appears unused.

args.backend is never referenced after parsing. Only args.compile_backend (line 53) is passed to LLM. If this was previously consumed by the removed AutoDeployConfig path, consider removing it.

#!/bin/bash
# Verify that args.backend is not used anywhere in this file
rg -n 'args\.backend\b' examples/llm_autodeploy/api_server.py
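
The same check can be generalized to every declared flag. The sketch below is an assumption-laden illustration, not part of the PR: `unused_cli_args` is a hypothetical helper that uses the standard `ast` module to compare `--flag` names passed to `add_argument` against `args.<name>` reads, and the sample source is made up.

```python
import ast


def unused_cli_args(source: str, namespace: str = "args") -> set[str]:
    """Return long-option names that are declared but never read from `namespace`.

    Limitations of this sketch: it ignores explicit dest= overrides,
    positional arguments, and access through getattr() or vars().
    """
    declared: set[str] = set()
    read: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "add_argument"
        ):
            for arg in node.args:
                if (
                    isinstance(arg, ast.Constant)
                    and isinstance(arg.value, str)
                    and arg.value.startswith("--")
                ):
                    # argparse derives the attribute name by replacing "-" with "_".
                    declared.add(arg.value[2:].replace("-", "_"))
        elif (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == namespace
        ):
            read.add(node.attr)
    return declared - read


# Hypothetical sample: --backend is parsed but only compile_backend is used.
SAMPLE = '''\
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--backend", type=str)
parser.add_argument("--compile_backend", type=str)
args = parser.parse_args()
print(args.compile_backend)
'''

if __name__ == "__main__":
    print(unused_cli_args(SAMPLE))
```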

@codecov bot commented Feb 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.44%. Comparing base (5e43b2a) to head (a0b47f0).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #878   +/-   ##
=======================================
  Coverage   73.44%   73.44%           
=======================================
  Files         197      197           
  Lines       20657    20657           
=======================================
  Hits        15172    15172           
  Misses       5485     5485           


Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
@Fridah-nv enabled auto-merge (squash) February 11, 2026 19:50
@Fridah-nv merged commit eb3e6ed into main Feb 11, 2026
37 checks passed
@Fridah-nv deleted the fridah/fix-ad-example branch February 11, 2026 20:33
2 participants