Windows wheels [C]: fix XNNPACK test hang by forcing single-threaded execution#18374

Draft
manuelcandales wants to merge 4 commits into main from manuel/windows-wheels-fix-C

@manuelcandales
Contributor

pthreadpool's condvar-based synchronization on Windows can deadlock with multiple threads due to a lost-wakeup bug in signal_num_recruited_threads. Reset the threadpool to 1 thread before loading the model to avoid the issue.
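
For readers unfamiliar with the term, a lost wakeup can be illustrated with a minimal Python sketch. This is not pthreadpool's code and does not reproduce the `signal_num_recruited_threads` logic; the function names below are hypothetical and only show the general failure class, where a notification sent before the waiter starts waiting is dropped unless the waiter also checks shared state.

```python
import threading


def wait_without_predicate(timeout=0.2):
    """Notify first, then wait: the notification is lost and wait() times out."""
    cond = threading.Condition()
    with cond:
        cond.notify()  # no thread is waiting yet, so this wakeup is dropped
    with cond:
        # Returns False on timeout; with no timeout this would block forever.
        return cond.wait(timeout)


def wait_with_predicate(timeout=0.2):
    """Record the event in shared state so a late waiter still observes it."""
    cond = threading.Condition()
    state = {"ready": False}
    with cond:
        state["ready"] = True  # the state change survives even if notify() is missed
        cond.notify()
    with cond:
        # wait_for() checks the predicate before blocking, so it returns at once.
        return cond.wait_for(lambda: state["ready"], timeout)


if __name__ == "__main__":
    print("without predicate:", wait_without_predicate())
    print("with predicate:", wait_with_predicate())
```

With a single thread in the pool there are no worker threads to signal, which is presumably why resetting the pool to 1 thread sidesteps the hang rather than fixing the underlying synchronization.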

The previous fix (setting sslBackend in pre_build_script.sh) only
applied to nested tokenizer submodules. The top-level submodule
checkout still used schannel via the reusable workflow's
`submodules: true`, causing SEC_E_ILLEGAL_MESSAGE errors when
cloning from git.gitlab.arm.com.

Move all submodule initialization into the pre-build script where
we can control the SSL backend, and disable submodule checkout in
the workflow.
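
The sequence described above can be sketched as follows. This is an illustration only, written in Python rather than as the actual contents of `pre_build_script.sh`; the helper names are hypothetical, and the `openssl` backend value is an assumption, since the commit message does not say which backend replaces schannel.

```python
import subprocess


def submodule_init_commands(ssl_backend="openssl"):
    """Return the git commands to run, in order.

    ssl_backend is an assumed value; the PR text only says the backend is
    set in the pre-build script so that schannel is no longer used.
    """
    return [
        # Select the SSL backend before any submodule clone happens.
        ["git", "config", "--global", "http.sslBackend", ssl_backend],
        # Initialize top-level and nested submodules in one pass.
        ["git", "submodule", "update", "--init", "--recursive"],
    ]


def run_submodule_init(ssl_backend="openssl"):
    """Run the commands, stopping at the first failure."""
    for cmd in submodule_init_commands(ssl_backend):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for cmd in submodule_init_commands():
        print(" ".join(cmd))
```

Setting the backend in global config, rather than per command, means the nested submodule clones started by `git submodule update` read the same setting.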

Move submodule initialization above the aarch64 sed workaround so
the file it edits is guaranteed to exist even if the caller disables
submodule checkout. Also remove the redundant UNAME_S assignment
later in the script.

The default 60-minute timeout from pytorch/test-infra is too tight for
the Windows wheel build + smoke test, causing jobs to be cancelled.

pthreadpool's condvar-based synchronization on Windows can deadlock
with multiple threads due to a lost-wakeup bug in signal_num_recruited_threads.
Reset the threadpool to 1 thread before loading the model to avoid the issue.

@pytorch-bot

pytorch-bot bot commented Mar 20, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18374

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 3 Unrelated Failures

As of commit 5bb9070 with merge base 94e9ca6:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Mar 20, 2026
@manuelcandales manuelcandales changed the title from "Windows wheels: fix XNNPACK test hang by forcing single-threaded execution" to "Windows wheels [C]: fix XNNPACK test hang by forcing single-threaded execution" Mar 20, 2026

# pthreadpool's condvar-based synchronization on Windows can deadlock
# with multiple threads. Force single-threaded execution.
_unsafe_reset_threadpool(1)

Contributor

if you set this to (2) does it still hang? Just curious

Contributor Author

I can try

Contributor Author

test: #18377

@manuelcandales manuelcandales force-pushed the manuel/build-windows-wheels-fix-2 branch from 8033f32 to 77989a2 March 20, 2026 20:15
Base automatically changed from manuel/build-windows-wheels-fix-2 to main March 20, 2026 20:19
Labels

ciflow/binaries, CLA Signed

2 participants