feat: background workers = non-HTTP workers with shared state#2287
feat: background workers = non-HTTP workers with shared state#2287nicolas-grekas wants to merge 3 commits intophp:mainfrom
Conversation
e1655ab to
867e9b3
Compare
|
Interesting approach to parallelism, what would be a concrete use case for only letting information flow one way from the sidekick to the http workers? Usually the flow would be inverted, where a http worker offloads work to a pool of 'sidekick' workers and can optionally wait for a task to complete. |
da54ab8 to
a06ba36
Compare
|
Thank you for the contribution. Interesting idea, but I'm thinking we should merge the approach with #1883. The kind of worker is the same, how they are started is but a detail. @nicolas-grekas the Caddyfile setting should likely be per |
ad71bfe to
05e9702
Compare
|
@AlliBalliBaba The use case isn't task offloading (HTTP->worker), but out-of-band reconfigurability (environment->worker->HTTP). Sidekicks observe external systems (Redis Sentinel failover, secret rotation, feature flag changes, etc.) and publish updated configuration that HTTP workers pick up on their next request; with per-request consistency guaranteed via Task offloading (what you describe) is a valid and complementary pattern, but it solves a different problem. The non-HTTP worker foundation here could support both. @henderkes Agreed that the underlying non-HTTP worker type overlaps with #1883. The foundation (skip HTTP startup/shutdown, immediate readiness, cooperative shutdown) is the same. The difference is the API layer and the DX goals:
Happy to follow up with your proposals now that this is hopefully clarified. |
05e9702 to
8a56d4c
Compare
|
Great PR! Couldn't we create a single API that covers both use case? We try to keep the number of public symbols and config option as small as possible! |
Yes, that's why I'd like to unify the two API's and background implementations into one. Unfortunately the first task worker attempt didn't make it into |
|
The PHP-side API has been significantly reworked since the initial iteration: I replaced The old design used
Key improvements:
Other changes:
|
cb65f46 to
4dda455
Compare
|
Thanks @dunglas and @henderkes for the feedback. I share the goal of keeping the API surface minimal. Thinking about it more, the current API is actually quite small and already general:
The name "sidekick" works as a generic concept: a helper running alongside. The current set_vars/get_vars protocol covers the config-publishing use case. For task offloading (HTTP->worker) later, the same sidekick infrastructure could support:
Same worker type, same So the path would be:
The foundation (non-HTTP threads, cooperative shutdown, crash recovery, per-php_server scoping) is shared. Only the communication primitives differ. WDYT? |
b3734f5 to
ed79f46
Compare
|
|
|
Hmm, it seems they are on some versions, for example here: https://github.com/php/frankenphp/actions/runs/23192689128/job/67392820942?pr=2287#step:10:3614 For the cache, I'm not aware of a Github feature that allow to clear everything unfortunately 🙁 |
813a1ea to
04ef4fe
Compare
|
Here is a new iteration with a new API based on signaling threads + an exception (split in last commit):
Now defaults to 6 (same as HTTP workers) instead of -1 (never panic). Users can still set -1 explicitly if they want infinite retries.
Set for all workers:
Background workers now use the same
The per-thread vars cache wasn't reset for regular (non-worker) PHP scripts served by |
ef92ba2 to
bdbaf1c
Compare
|
Following up on the ShutdownException exploration: I explored replacing the signaling stream with a
The conclusion: you can't reliably deliver signals to interrupt C-level blocking calls in a Go+CGo process, cross-platform. So I kept the signaling stream as the primary graceful shutdown mechanism, and added a grace period with best-effort force-kill:
Other changes from previous update still in place: Docs updated. All 16 tests pass on PHP 8.5 with Note: the force-kill mechanism could also benefit HTTP workers stuck in blocking calls during shutdown. Keeping it scoped to background workers for now; let me know. |
bdbaf1c to
a5d96eb
Compare
refactor: address review feedback on background workers - Use `name` instead of `match` for background worker identification - Combine start + wait + lock + copy + unlock into single CGo call (go_frankenphp_worker_get_vars replaces three separate exports) - Remove lockedVarsStacks, InitLockedVarsStacks, and varsVersion - Set FRANKENPHP_WORKER_NAME for all workers (HTTP and background) - Split worker_bg_name into worker_name + is_background_worker flag - Rename httpEnabled to isBackgroundWorker on Go side - Remove name validation regex (same rules as HTTP workers) - Keep $_SERVER['argv'] for background workers (bin/console compat) Add generational cache back Review by henderkes
a5d96eb to
c814c8d
Compare
| function background_worker_should_stop(float $timeout = 0): bool | ||
| { | ||
| static $signalingStream; | ||
| $signalingStream ??= frankenphp_worker_get_signaling_stream(); | ||
| $s = (int) $timeout; | ||
|
|
||
| return match (@stream_select(...[[$signalingStream], [], [], $s, (int) (($timeout - $s) * 1e6)])) { | ||
| 0 => false, // timeout | ||
| false => true, // error (pipe closed) = stop | ||
| default => "stop\n" === fgets($signalingStream), | ||
| }; | ||
| } |
There was a problem hiding this comment.
Could we also wrap this in a separate utility function? frankenphp_sleep() or so. Allows us to possibly also change the implementation in the future.
There was a problem hiding this comment.
Perhaps frankenphp_wait_for_shutdown or something similar. Sleep is misleading.
|
Hmm looking at the current implementation, workers can access background workers if they are declared in the same If possible it would also be nice if we'd minimize the public api and didn't expose It would be cleaner for future use cases to do something like Tell me if you are getting tired of these reviews, I can also create a PR to your PR if you want 😄 |
…cture - Add drain() to threadHandler interface for per-handler shutdown logic - Remove applyReservedThreads, handle in calculateMaxThreads instead - Auto-start background workers via initWorkers (same as HTTP workers) - Remove separate auto-start mechanism (startAutoBackgroundWorkers) - Create registries internally from worker options Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Keep reviews coming in :) Any opinions on this? |
be7d9c0 to
d43ede9
Compare
Extract backgroundWorkerThread from workerThread to decouple lifecycle management. Background workers now have their own threadHandler with dedicated drain(), beforeScriptExecution(), afterScriptExecution(), and setupScript() methods. Removes all isBackgroundWorker checks from threadworker.go. The dispatch happens at thread creation time (convertToWorkerThread vs convertToBackgroundWorkerThread). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@AlliBalliBaba I specifically think there shouldn't even be a global option at all, much like for workers (where we need to keep it for backward compatibility). It will only lead to unexpected surprises for people and the benefit is rather small. Instead I think it would make sense to be able to pass an entry point from the php code that starts the background worker, anywhere in the Putting these things in the global block is not great because:
While the only benefit is the last point, which can also be a weakness. |
|
For thoughts:
|
Summary
Background workers are long-running PHP workers that run outside the HTTP cycle. They observe their environment (Redis, DB, filesystem, etc.) and publish configuration that HTTP workers read per-request - enabling real-time reconfiguration without restarts or polling.
PHP API
frankenphp_worker_set_vars(array $vars)- publishes config from a background worker (persistent memory, cross-thread)frankenphp_worker_get_vars(string|array $name, float $timeout = 30.0)- reads config from HTTP workers (blocks until first publish, generational cache)frankenphp_worker_get_signaling_stream()- returns a pipe-based stream forstream_select()integration (cooperative shutdown)Caddyfile configuration
backgroundmarks a worker as non-HTTPnamespecifies an exact worker name; workers withoutnameare catch-all for lazy-started namesmax_threadson catch-all sets a safety cap for lazy-started instances (defaults to 16)numandmax_threadscapped at 1 (pooling is a future feature)max_consecutive_failuresdefaults to 6 (same as HTTP workers)max_execution_timeautomatically disabled for background workersShutdown
Background workers are stopped cooperatively via the signaling stream: FrankenPHP writes
"stop\n"which is picked up bystream_select(). Workers have a 5-second grace period.After the grace period, a best-effort force-kill is attempted:
max_execution_timetimer cross-thread viatimer_settime(EG(max_execution_timer_timer))CancelSynchronousIo+QueueUserAPCinterrupts blocking I/O and alertable waitsArchitecture
BackgroundWorkerRegistryperphp_serverfor isolation and at-most-once semanticspemalloc) withRWMutexfor safe cross-thread sharingget_varscalls return the same array instance (===is O(1))IS_ARRAY_IMMUTABLE)ZSTR_IS_INTERNED) - skip copy/free for shared memory stringsstream_select()- compatible with amphp/ReactPHP event loops$_SERVER['FRANKENPHP_WORKER_NAME']set for all workers (HTTP and background)$_SERVER['FRANKENPHP_WORKER_BACKGROUND']set for all workers (true/false)Example
Test coverage
16 tests covering: basic vars, at-most-once start, validation, type support (enums, binary-safe strings), multiple background workers, multiple entrypoints, crash restart, signaling stream, worker restart lifecycle, non-background-worker error handling, identity detection, generational cache.
All tests pass on PHP 8.2, 8.3, 8.4, and 8.5 with
-race.Documentation
Full docs at
docs/background-workers.md.