Open
Description
Affected Stackable version
dev (24.11 prerelease)
Current and expected behavior
@xeniape ran into an issue (Stackable employees: see Slack) where pods were left running with expired certificates after a while, instead of being evicted by commons-op. Restarting commons-op evicted the pods as expected.
Our current working hypothesis is that commons-op's re-reconciliation timer did not advance while the computer was suspended, so the eviction was delayed by the length of the suspension.
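To make the hypothesis concrete, here is a minimal sketch of how such a gap could be measured. It assumes the timer is driven by a monotonic clock that stops during suspend (as `CLOCK_MONOTONIC` does on Linux) while the wall clock keeps counting. The function names and the tolerance parameter are hypothetical and not part of commons-op or kube-rs.

```rust
use std::time::Duration;

/// Estimates how much time the monotonic clock "missed", given the time
/// elapsed on both clocks since the same starting point.
/// Returns zero if the wall clock did not run ahead of the monotonic clock.
fn suspend_gap(monotonic_elapsed: Duration, wall_elapsed: Duration) -> Duration {
    wall_elapsed.saturating_sub(monotonic_elapsed)
}

/// Hypothetical heuristic: treat a gap larger than `tolerance` as a sign
/// that the machine was suspended (or that the wall clock was adjusted).
fn likely_suspended(
    monotonic_elapsed: Duration,
    wall_elapsed: Duration,
    tolerance: Duration,
) -> bool {
    suspend_gap(monotonic_elapsed, wall_elapsed) > tolerance
}
```

In practice the two inputs would come from `Instant::elapsed` and `SystemTime::elapsed` on values captured together. Note that wall-clock adjustments (for example by NTP) would also show up as a gap, so this is only a heuristic.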
Possible solution
One of:
- Change the timer to use wall time instead of monotonic time (which may not advance during suspend)
- Cap the re-reconciliation timer, causing spurious reconciles but at least limiting the issue
- Make the timer automatically expire when resuming from suspend
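The first two options can be combined, sketched here under stated assumptions: the eviction deadline is stored as a wall-clock `SystemTime`, and each requeue delay is recomputed from it and capped. `MAX_REQUEUE` and `next_requeue` are hypothetical names, and the 5-minute cap is an arbitrary illustrative value, not something commons-op currently uses.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical upper bound on a single wait between reconciles.
const MAX_REQUEUE: Duration = Duration::from_secs(300);

/// Returns how long to wait before the next reconcile.
///
/// `evict_at` is a wall-clock deadline, so after a suspend the next call
/// sees the true remaining time even if the previous sleep overran.
fn next_requeue(now: SystemTime, evict_at: SystemTime) -> Duration {
    // `duration_since` returns Err when `evict_at` is already in the past;
    // in that case the pod is due immediately.
    let remaining = evict_at.duration_since(now).unwrap_or(Duration::ZERO);
    // Cap the wait so a stalled timer is delayed by at most MAX_REQUEUE
    // of monotonic time after resume.
    remaining.min(MAX_REQUEUE)
}
```

In a kube-rs controller, the returned `Duration` could be passed to `Action::requeue`. The cost is the spurious reconciles mentioned above: a pod whose certificate expires in a day would be reconciled every `MAX_REQUEUE` instead of once.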
Whichever option we pick, we should probably also raise this upstream with kube-rs, and either fix it there or at least highlight the issue somehow.
Additional context
No response
Environment
No response
Would you like to work on fixing this bug?
None