fix: handle provider status default cases across providers.#107
fix: handle provider status default cases across providers.#107garvitssoni wants to merge 3 commits intomainfrom
Conversation
| lifecycleStatus = v1.LifecycleStatusFailed | ||
| default: | ||
| lifecycleStatus = v1.LifecycleStatusPending | ||
| lifecycleStatus = "" |
There was a problem hiding this comment.
| lifecycleStatus = "" | |
| lifecycleStatus = v1.LifecycleStatusEmpty |
|
How do we handle the empty status in dev-plane and the UI? In these cases, does it make more sense to assume the instance is terminated rather than returning an empty status? |
c0ff74d to
464b0c3
Compare
Hi @stephahart, On the UI side, an empty provider status is handled as a separate case and displayed as “Deleting.” This differs from “Terminated,” which is shown as “Stopped.” Combining these would present users with an inaccurate view of the instance state. On the dev-plane side, we track when an instance last entered Running, Stopped, or Terminated using the ProviderStatusTransitions struct. Cloud providers may occasionally return “not found” errors, which can resemble termination—even though the instance may not really be terminated yet. To ensure data accuracy, we only set LastTerminatedAt when termination is explicitly confirmed. If the status is empty or unknown, we do not record any transition time. In practice, an unknown status indicates “not yet confirmed,” prompting frequent polling (every 15 seconds). A confirmed Terminated state is treated as final—we record the timestamp and reduce polling to every 60 seconds. If we default unknown statuses to Terminated, we risk recording an incorrect termination time, slowing down polling prematurely, and potentially missing instances that recover later. Thanks! |
|
This is probably better than what we have, but it will only slightly shift the problem to flickering into and out of "unknown". One thing to consider now is: when reacting to the new state, we preserve the current state if the new state is "unknown". Providers implement func (c *LaunchpadClient) MergeInstanceForUpdate(_ v1.Instance, newInstance v1.Instance) v1.Instance {
return newInstance
}this could change to something like: func (c *LaunchpadClient) MergeInstanceForUpdate(currentInstance v1.Instance, newInstance v1.Instance) v1.Instance {
if newInstance.Status.LifecycleStatus == v1.LifecycleStatusEmpty {
newInstance.Status.LifecycleStatus = currentInstance.Status.LifecycleStatus
}
return newInstance
}which at least preserves the old status (so instead of running -> pending -> terminating, we'd actually see running -> terminating). |
| LifecycleStatusTerminating LifecycleStatus = "terminating" | ||
| LifecycleStatusTerminated LifecycleStatus = "terminated" | ||
| LifecycleStatusFailed LifecycleStatus = "failed" | ||
| LifecycleStatusEmpty LifecycleStatus = "" |
There was a problem hiding this comment.
"Unknown" might be better
Problem
Across several providers, unknown/transient status values were being coerced into concrete lifecycle states (commonly
pending). During deletion this led to incorrect “Starting/Deploying” signals and contributed to UI status regressions.Root cause
Default-case handling in provider status mappers treated unknown values as known states instead of surfacing “unknown.”
Fix
pending/failed.