Support Qwen3.5 #32

Open
ErlisLushtaku wants to merge 7 commits into main from erlislushtaku/fix/support-qwen-3.5

Collaborator

@ErlisLushtaku ErlisLushtaku commented Apr 6, 2026

  • Updated dependencies to support Qwen3.5
  • Added structured outputs so the judge emits the scores directly. Previously it often produced other text until it hit the token limit.

@ErlisLushtaku ErlisLushtaku changed the title Support qwen 3.5 Support Qwen3.5 Apr 6, 2026
pyproject.toml Outdated
 [project.optional-dependencies]
-vllm = ["vllm==0.10.2", "transformers>=4.55.2,<5.0.0"]
+# vLLM on PyPI pins transformers<5; optional extra matches that so `uv lock` can resolve.
+vllm = ["vllm>=0.17.0,<1.0.0", "transformers>=4.56.0,<5.0.0"]
Collaborator

vllm>=0.17.0,<1.0.0 is a very wide range. A few concerns:

  • Was this tested with a prebuilt wheel or built from source? Building vLLM from source on cluster nodes often fails due to CUDA kernel compilation issues.
  • Is the StructuredOutputsParams import path (vllm.sampling_params) stable across this entire range? It may have been introduced in 0.17 and could move. For example, StructuredOutputsParams was a bit different in vllm==0.11.0. I think a tighter version range that we have actually tested makes more sense.

Collaborator Author

Good point. I tightened the range; 0.18.1 was working. I think StructuredOutputsParams is stable across the new range.
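
For illustration only, the range discussed in this thread could also be checked at runtime before relying on the import path. This is a minimal sketch, not code from the PR: the helper names (`parse_version`, `version_in_range`, `check_vllm_version`) are hypothetical, the bounds are the tightened `>=0.17.0,<0.19.0` range, and the parser handles plain `X.Y.Z` strings rather than the full PEP 440 grammar.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric release part of a version string.

    Simplified on purpose: '0.18.1.dev3' becomes (0, 18, 1).
    Pre-release ordering (rc, dev) is ignored, unlike PEP 440.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def version_in_range(version: str, lower: str, upper: str) -> bool:
    """Return True when lower <= version < upper (inclusive lower bound)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


def check_vllm_version(lower: str = "0.17.0", upper: str = "0.19.0") -> Optional[bool]:
    """Check the installed vllm against the range; None if vllm is not installed."""
    try:
        installed = metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None
    return version_in_range(installed, lower, upper)
```

Calling `check_vllm_version()` at startup would let the judge fail early with a clear message instead of hitting an import error deep inside the run. The dependency pin in pyproject.toml remains the primary guard; this is only a secondary check.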

_PAIR_SCORE_MAX = 10


def build_pair_score_output_choices() -> list[str]:
Collaborator

The cartesian product approach works for a single A-vs-B pair (11×11 = 121 choices), but won't scale to multi-criteria evaluation — with N dimensions it becomes 11^(2N) choices, which is unusable.

Maybe we could switch to a JSON schema constraint instead of choice, e.g. {"score_A": int, "score_B": int} per criterion. vLLM's StructuredOutputsParams already supports JSON schema constraints alongside choice, so this should be close to a drop-in change.

Collaborator Author

Agreed, updated
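
To make the scaling argument concrete, here is a rough sketch comparing the two approaches. It is not the code from this PR: the function names, the choice string format, and the criterion names are hypothetical, and the 0 to 10 score range is an assumption inferred from the 11×11 figure above.

```python
import json
from itertools import product
from typing import Dict, List

PAIR_SCORE_MAX = 10  # assumed to mirror _PAIR_SCORE_MAX in the PR


def count_choices(num_criteria: int, score_max: int = PAIR_SCORE_MAX) -> int:
    """Number of strings a choice constraint must enumerate: (score_max + 1)^(2N)."""
    return (score_max + 1) ** (2 * num_criteria)


def enumerate_single_pair_choices(score_max: int = PAIR_SCORE_MAX) -> List[str]:
    """Cartesian product for one A-vs-B pair (hypothetical output format)."""
    scores = range(score_max + 1)
    return [f"score_A: {a} score_B: {b}" for a, b in product(scores, scores)]


def build_pair_score_schema(criteria: List[str], score_max: int = PAIR_SCORE_MAX) -> Dict:
    """JSON schema with one {score_A, score_B} object per criterion.

    The schema grows linearly with the number of criteria.
    """
    bounded_int = {"type": "integer", "minimum": 0, "maximum": score_max}
    per_criterion = {
        "type": "object",
        "properties": {"score_A": bounded_int, "score_B": bounded_int},
        "required": ["score_A", "score_B"],
        "additionalProperties": False,
    }
    return {
        "type": "object",
        "properties": {name: per_criterion for name in criteria},
        "required": list(criteria),
        "additionalProperties": False,
    }


if __name__ == "__main__":
    print(len(enumerate_single_pair_choices()))  # one pair, one criterion
    print(count_choices(3))                      # three criteria
    print(json.dumps(build_pair_score_schema(["helpfulness", "accuracy"]), indent=2))
```

How the schema is handed to vLLM is left out on purpose, since the exact parameter name should be checked against the installed version. Whether a given structured-output backend enforces `minimum` and `maximum` may also vary, so validating the parsed scores after generation is still worthwhile.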

)
)
if truncated_completion_count:
print(
Collaborator

Flagging for a follow-up PR: the codebase mixes print() for warnings, progress, and debug info, making it hard to filter by severity or redirect output. We should migrate to Python's logging module (or at minimum a thin wrapper like logger = logging.getLogger(__name__)). What do you think, @geoalgo?
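
As a possible starting point for that follow-up, the truncation warning could look roughly like this with the standard logging module. This is only a sketch: `report_truncated`, its parameters, and the message text are hypothetical and not taken from the codebase.

```python
import logging

# Module-level logger, named after the module so output can be filtered per module.
logger = logging.getLogger(__name__)


def report_truncated(truncated_completion_count: int, total: int) -> None:
    """Emit a WARNING when some judge completions hit the token limit."""
    if truncated_completion_count:
        logger.warning(
            "%d of %d judge completions hit the token limit",
            truncated_completion_count,
            total,
        )
```

An entry point would then configure handlers once, for example with `logging.basicConfig(level=logging.INFO)`, and callers could raise the level to WARNING to silence progress output while keeping messages like this one.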

@ErlisLushtaku ErlisLushtaku force-pushed the erlislushtaku/fix/support-qwen-3.5 branch from ab3db1b to ef1c92c Compare April 7, 2026 14:19
- Switch from choice-based structured outputs to JSON schema constraint
- Tighten vllm version range from >=0.17.0,<1.0.0 to >=0.17.0,<0.19.0

2 participants