@nbouvrette nbouvrette commented Feb 11, 2026

Summary

During interpreter finalization (Py_FinalizeEx), active greenlets being deallocated trigger g_switch() to throw GreenletExit. This performs a stack switch and executes Python code in a partially-torn-down interpreter, causing:

  • SIGSEGV (signal 11) on greenlet 3.x
  • SIGABRT (signal 6 / std::runtime_error: Accessing state after destruction) on greenlet 2.x

On Python >= 3.11, CPython's restructured finalization internals (frame representation, data stack management, recursion tracking) make g_switch() safe during finalization, so no crash occurs. On Python < 3.11 this is not the case, and the interpreter crashes during shutdown whenever active greenlets exist.

Root Cause

We traced this to two unsafe operations during Py_FinalizeEx:

  1. _green_dealloc_kill_started_non_main_greenlet in PyGreenlet.cpp — calls g_switch() to throw GreenletExit into the greenlet, which performs a stack switch and runs Python code via _PyEval_EvalFrameDefault in a partially-torn-down interpreter.

  2. ~ThreadState in TThreadState.hpp — calls PyImport_ImportModule("gc") for leak detection, which is unsafe when the import machinery is partially torn down.
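
The first scenario can be sketched as a small script, assuming greenlet is installed in the child interpreter. This is an illustrative reproduction, not the one from the linked repository: `REPRO_SOURCE` and `run_in_fresh_interpreter` are hypothetical names, and whether the child actually crashes depends on the Python and greenlet versions in use.

```python
import subprocess
import sys

# Hypothetical reproduction script (not taken from the PR): one greenlet is
# started, suspended, and never finished, so it is still active when the
# interpreter finalizes and its module globals are cleared.
REPRO_SOURCE = """
import greenlet

def worker():
    try:
        # Suspend and hand control back to the main greenlet.
        greenlet.getcurrent().parent.switch()
    finally:
        # Runs only if GreenletExit is actually thrown into the greenlet.
        print("cleanup ran", flush=True)

g = greenlet.greenlet(worker)
g.switch()  # worker is now started and suspended
print("exiting with an active greenlet", flush=True)
"""


def run_in_fresh_interpreter(source, timeout=30):
    """Run `source` in a new interpreter so a crash cannot take down the caller."""
    return subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

Calling `run_in_fresh_interpreter(REPRO_SOURCE)` and inspecting `returncode` shows whether the child exited cleanly. On POSIX, a negative return code means the child was killed by a signal.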

Fix

This PR adds two _Py_IsFinalizing() guards, compiled only on Python < 3.11 (#if !GREENLET_PY311):

  1. In _green_dealloc_kill_started_non_main_greenlet: When finalizing, call murder_in_place() directly instead of g_switch(). This marks the greenlet as dead without throwing GreenletExit — avoiding the crash at the cost of not running cleanup code inside the greenlet.

  2. In ~ThreadState: When finalizing, skip the GC-based leak detection and perform only minimal safe cleanup (clearing strong references to prevent leaks).

On Python >= 3.11, no changes are made — the existing behavior (throwing GreenletExit via g_switch(), running cleanup code) continues to work correctly during finalization.

Behavioral Difference by Python Version

| Python Version | Behavior During Shutdown |
| --- | --- |
| < 3.11 (with fix) | murder_in_place(): greenlet killed without cleanup (safe) |
| >= 3.11 (unchanged) | g_switch() + GreenletExit: cleanup code runs normally (already safe) |
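
The decision summarized above can be modeled in a few lines of Python. This is only a model of the guard's logic, not the C++ code in PyGreenlet.cpp: `dealloc_strategy` is a hypothetical name, and the real guard is a compile-time `#if !GREENLET_PY311` combined with a runtime `_Py_IsFinalizing()` check.

```python
import sys


def dealloc_strategy(version_info=None, finalizing=None):
    """Return which path a started, non-main greenlet takes when deallocated.

    Hypothetical model of the guard described in this PR:
    - "murder_in_place": mark the greenlet dead without running its cleanup
    - "g_switch": throw GreenletExit into the greenlet so cleanup code runs
    """
    if version_info is None:
        version_info = sys.version_info
    if finalizing is None:
        # Python-level counterpart of the C-level finalization check.
        finalizing = sys.is_finalizing()

    if finalizing and tuple(version_info[:2]) < (3, 11):
        return "murder_in_place"
    return "g_switch"
```

Outside of finalization the function returns "g_switch" for every version, which matches the PR's statement that normal deallocation is unchanged.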

Tests

Adds test_interpreter_shutdown.py with 9 subprocess-based tests:

  • Safety tests (all Python versions): single, multiple, nested, threaded, deeply-nested active greenlets at shutdown, active exception context, stress test with 50 greenlets
  • Version-aware behavioral tests: verify that GreenletExit cleanup runs on Python >= 3.11, but is correctly skipped (via murder_in_place) on < 3.11
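
A subprocess-based safety check needs a way to tell a clean exit from a signal death. The helper below is a sketch of that idea and is not taken from test_interpreter_shutdown.py: `describe_exit` and `assert_no_crash` are hypothetical names, and the negative-return-code convention applies to POSIX systems.

```python
import signal


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode == 0:
        return "clean exit"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exit status {returncode}"


def assert_no_crash(returncode):
    """Fail with a descriptive message if the child did not exit cleanly."""
    assert returncode == 0, f"child interpreter did not exit cleanly: {describe_exit(returncode)}"
```

A test would pass the child's `returncode` to `assert_no_crash`, so a failure message names the signal (for example the SIGSEGV or SIGABRT described in the Summary) instead of showing a bare number.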

Reproduction

This was discovered in a production environment running Python 3.9.7 + uWSGI + greenlet, where worker recycling caused active greenlets to be cleaned up during interpreter shutdown. The resulting core dumps filled the disk (P0 incident). A full reproduction environment and detailed analysis are available at: https://github.com/nbouvrette/scheduling-tools/tree/main/greenlet-tests

Motivation

This fix stabilizes greenlet for the many production deployments still running Python 3.9 and 3.10, where upgrading Python is not immediately feasible. It also closes a longstanding gap in the test suite by adding the first interpreter-shutdown tests.

Fixes #411
See also #351, #376

nbouvrette added a commit to nbouvrette/greenlet that referenced this pull request Feb 11, 2026
Re-add Python 3.9 to requires-python and trove classifiers in
pyproject.toml. No C/C++ code was removed when 3.9 support was dropped
in 3.3.0 — the drop was purely a packaging metadata change.

Combined with the safe finalization fix in the parent commit (PR python-greenlet#495),
greenlet now works reliably on Python 3.9 during interpreter shutdown,
which was the primary stability concern for older Python versions.

This gives teams still running Python 3.9 in production a stable
greenlet release path while they plan their Python upgrade.

Also adds CHANGES.rst entries for both this change and the finalization
fix.

Co-authored-by: Cursor <cursoragent@cursor.com>
@nbouvrette nbouvrette mentioned this pull request Feb 11, 2026
4 tasks
@nbouvrette nbouvrette force-pushed the fix/safe-finalization-py310 branch from b7878d0 to a39c535 Compare February 11, 2026 04:54
nbouvrette added a commit to nbouvrette/greenlet that referenced this pull request Feb 11, 2026
Re-add Python 3.9 to requires-python, trove classifiers, and CI test
matrix. No C/C++ code was removed when 3.9 support was dropped in
3.3.0 — the drop was purely a packaging metadata change.

Combined with the safe finalization fix in the parent commit (PR python-greenlet#495),
greenlet now works reliably on Python 3.9 during interpreter shutdown,
which was the primary stability concern for older Python versions.

Changes:
- pyproject.toml: requires-python >= 3.9, add 3.9 trove classifier
- .github/workflows/tests.yml: add "3.9" to test matrix, exclude
  windows-11-arm (not available for 3.9)
- CHANGES.rst: add entries for both this change and the finalization fix

Co-authored-by: Cursor <cursoragent@cursor.com>
@nbouvrette nbouvrette force-pushed the fix/safe-finalization-py310 branch 3 times, most recently from 93cd07e to 4eb8fc1 Compare February 11, 2026 05:42
During interpreter finalization (Py_FinalizeEx), active greenlets being
deallocated would trigger g_switch() to throw GreenletExit. This performs
a stack switch and executes Python code in a partially-torn-down
interpreter, causing:
  - SIGSEGV (signal 11) on greenlet 3.x
  - SIGABRT (signal 6 / "Accessing state after destruction") on greenlet 2.x

On Python >= 3.11, CPython's restructured finalization internals (frame
representation, data stack management, recursion tracking) make g_switch()
during finalization safe. On Python < 3.11, this was not the case.

This commit adds two guards, compiled only on Python < 3.11
(!GREENLET_PY311):

1. In _green_dealloc_kill_started_non_main_greenlet (PyGreenlet.cpp):
   When the interpreter is finalizing, call murder_in_place() directly
   instead of attempting g_switch(). This marks the greenlet as dead
   without throwing GreenletExit, avoiding the crash at the cost of not
   running cleanup code inside the greenlet.

2. In ~ThreadState (TThreadState.hpp):
   When the interpreter is finalizing, skip the GC-based leak detection
   that calls PyImport_ImportModule("gc"), which is unsafe when the
   import machinery is partially torn down. Only perform minimal safe
   cleanup (clearing strong references).

On Python >= 3.11, no changes are made — the existing behavior (throwing
GreenletExit via g_switch, running cleanup code) continues to work
correctly during finalization.

Also adds test_interpreter_shutdown.py with 9 subprocess-based tests
covering:
  - Single/multiple/nested/threaded/deeply-nested active greenlets at
    shutdown (no-crash safety on all Python versions)
  - Version-aware behavioral tests verifying that GreenletExit cleanup
    code runs on Python >= 3.11 but is correctly skipped on < 3.11
  - Edge cases: active exception context, stress test with 50 greenlets

Fixes python-greenlet#411
See also python-greenlet#351, python-greenlet#376

Co-authored-by: Cursor <cursoragent@cursor.com>
@nbouvrette nbouvrette force-pushed the fix/safe-finalization-py310 branch from 4eb8fc1 to 292e126 Compare February 11, 2026 05:50
@nbouvrette
Author

@jamadden please let me know if you have any questions. I think this fixes the worst stability bug I am aware of, so it could make a lot of people happy 🙏

Development

Successfully merging this pull request may close these issues.

Segfault in ~ThreadStateCreator on shutdown on Python 3.11