Show specific diverged workflows in sandbox merge dialog#4396
Show specific diverged workflows in sandbox merge dialog#4396midigofrank merged 12 commits intomainfrom
Conversation
Add diverged_workflows/2 function to return list of workflow names that have diverged between sandbox and target projects, rather than just a boolean flag. Update merge modal to display these specific workflow names in a bullet list when divergence is detected. This helps users make informed decisions about whether to proceed with a merge by showing exactly which workflows will be affected. Fixes #4001
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4396 +/- ##
==========================================
- Coverage 89.38% 89.38% -0.01%
==========================================
Files 425 425
Lines 20135 20133 -2
==========================================
- Hits 17997 17995 -2
Misses 2138 2138 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@josephjclark let me know your thoughts on the wording |
- Remove dbg() call from diverged workflows conditional check - Extract duplicate diverged workflows logic into reusable helper function - Simplifies code maintenance by centralizing the workflow divergence check
|
Something is up here Frank - I see the warning even if the branches haven't diverged Divergence means that the parent project changed since the sandbox was forked. So the head version in the parent is not in the sandbox's history. |
| @doc """ | ||
| Returns the list of workflow names that have diverged between source and target projects. | ||
|
|
||
| Compares version hashes for workflows with matching names. A workflow is considered |
There was a problem hiding this comment.
Yep:
A workflow is considered
diverged if it exists in both projects but has different version hashes.
this isn't quite right: a workflow is diverged if the parent head is not in the workflow's history.
There was a problem hiding this comment.
Hmm, great catch!! Let me add tests
There was a problem hiding this comment.
@josephjclark I have fixed this but I think there is another issue we need to fix. Currently, we're squashing the hashes if the source(cli, app) is the same. This means the workflows will always diverge after making a change because we will squash the "cloned" hash
There was a problem hiding this comment.
Heh, it's bad. We don't clone the versions at all when creating sandbox
There was a problem hiding this comment.
Heh, it's bad. We don't clone the versions at all when creating sandbox
I lied, we do clone the versions during forking: https://github.com/OpenFn/lightning/blob/main/lib/lightning/projects/sandboxes.ex#L558
…rison Update divergence detection to properly identify when parent project has changed since fork. A workflow now diverges only if the parent's current HEAD version is absent from the sandbox's version history, not just when the two HEADs differ. This allows sandboxes that have pulled parent updates and moved forward to correctly show no divergence, while still detecting true conflicts where parent and sandbox evolved independently. Changes: - Modify get_workflow_version_hashes_by_name to return all version hashes grouped by workflow name instead of just HEAD hash - Update diverged_workflows/2 to check if parent HEAD exists in sandbox history list - Refactor has_diverged?/2 to delegate to diverged_workflows/2 - Add comprehensive tests validating version history comparison logic
|
@midigofrank I don't want to merge this until we've done the demo release - so I'm gonna pause review today. I'll review all your sandbox PRs together (maybe EOD tomorrow) |
|
@josephjclark mind giving this another review? |
|
Sorry Frank this still isn't looking right :(
I've just gone onto main and I can see that it's giving exactly the same warning. Ok, so this PR is correct in that it now lists the names of any affected workflows. But we've got another critical issue in that something is wrong in the divergence logic still. I'm going to raise this as a new issue. |
|
Hey @josephjclark , I think part of the reason why you keep getting diverged workflows is because we always squash the versions. The fix is in #4414 . Let me get it ready |
* Ensure workflows have version history before operations (#3958) Added WorkflowVersions.ensure_version_recorded/1 function to guarantee all workflows have at least one version recorded before operations that depend on version history. Key changes: - New ensure_version_recorded/1 function in WorkflowVersions module - Prevents squashing of first workflow version - Called before sandbox creation in Sandboxes module - Called before workflow merge in MergeProjects module - Called before document import in Provisioner module
This commit removes the version_history array field from the workflows table in favor of always querying the workflow_versions table directly. Changes: - Remove version_history field from Workflow schema - Update record_version to return WorkflowVersion instead of Workflow - Update ensure_version_recorded to return WorkflowVersion - Remove reconcile_history! function (no longer needed) - Update history_for and latest_hash to always query database - Fix provisioner to ignore version_history param during provisioning - Fix validate_extraneous_params to handle tuple params correctly - Add migration to drop version_history column Test updates: - Update all tests to expect WorkflowVersion return type - Remove tests for version_history field access - Update tests to use WorkflowVersions.history_for() instead - Fix provisioning controller tests to ensure version before comparison - Update CLI deploy tests to include version_history in expected state Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
Hey @josephjclark @elias-ba , I have merged in the 2 sandbox branches here. Would you mind giving this a fresh review? |
josephjclark
left a comment
There was a problem hiding this comment.
Looks great @midigofrank - just tested this and I'm getting no divergence warnings.
elias-ba
left a comment
There was a problem hiding this comment.
Hey @midigofrank this looks really great. I spotted few things I would like to bring to your attention. But most of them are just for raising awareness, you tell me if these are not super important to address right now and I will approve this PR.
| defp ensure_workflows_have_versions(project) do | ||
| workflows = | ||
| Workflows.list_project_workflows(project.id, | ||
| include: [:jobs, :edges, :triggers] | ||
| ) | ||
|
|
||
| Enum.each(workflows, &WorkflowVersions.ensure_version_recorded/1) | ||
| end |
There was a problem hiding this comment.
- If
ensure_version_recorded/1returns{:error, reason}for any workflow, the error is silently dropped. This will always succeed because Enum.each returns:okregardless. Can we consider usingEnum.eachwith explicit error checking, or switch to something like:
defp ensure_workflows_have_versions(project) do
workflows = Workflows.list_project_workflows(project.id, include: [:jobs, :edges, :triggers])
Enum.reduce_while(workflows, :ok, fn workflow, :ok ->
case WorkflowVersions.ensure_version_recorded(workflow) do
{:ok, _} -> {:cont, :ok}
{:error, reason} -> {:halt, {:error, reason}}
end
end)
end- This calls
ensure_version_recordedfor each workflow, which callslatest_version(1 query) and potentiallyrecord_version(1 transaction with multiple queries) per workflow. For projects with many workflows, this could add noticeable latency. Not a blocker but worth noting.
There was a problem hiding this comment.
Great catch, but I really didn't want to check it's failure. This was more of a side effect than a main thread thing. Remember, at this point someone is fetching the projectState.json
|
|
||
| defp create_sandbox_workflows(parent, sandbox) do | ||
| Enum.reduce(parent.workflows, %{}, fn parent_workflow, mapping -> | ||
| {:ok, _} = WorkflowVersions.ensure_version_recorded(parent_workflow) |
There was a problem hiding this comment.
This will raise a MatchError if ensure_version_recorded returns {:error, _}. While unlikely in practice, a DB issue or constraint violation would crash the caller. Can we consider using with or propagating the error ? I think same pattern is in lib/lightning/projects/merge_projects.ex:96 too
There was a problem hiding this comment.
Great catch, but error propagation at this point was kinda difficult. The existing pattern even with workflow creation was all matching {:ok, var} = insert(workflow | edge | ...) so I just used the same pattern
| assert length(updated_history) == 1 | ||
| assert [updated_version] = updated_history | ||
| assert String.starts_with?(updated_version, "cli:") | ||
| # we nolonger squash the first version |
There was a problem hiding this comment.
"no longer" instead of "nolonger" ?
| def ensure_version_recorded(%Workflow{} = workflow) do | ||
| latest_version = latest_version(workflow.id) | ||
|
|
||
| if latest_version do | ||
| {:ok, latest_version} | ||
| else | ||
| hash = generate_hash(workflow) | ||
| record_version(workflow, hash) | ||
| end | ||
| end |
There was a problem hiding this comment.
This has a small TOCTOU window. Two concurrent calls for the same workflow could both see latest_version as nil and both proceed to record_version.
Inside record_version, the Ecto.Multi transaction queries latest_version again, but without a row lock (FOR UPDATE), so under PostgreSQL's default READ COMMITTED isolation both transactions can see nil simultaneously. Since the unique index on (workflow_id, hash) was removed recently, there's no DB-level constraint preventing both inserts from succeeding, which could result in a duplicate row.
In practice the risk is very low because:
ensure_version_recordedis only called during sandbox creation, merge, and provisioning, infrequent, user-initiated operationsgenerate_hashis deterministic, so both calls produce the same hash, a duplicate row is cosmetically odd but functionally harmless for divergence detection
If we ever wanted to fully close this window, adding lock: "FOR UPDATE" to the latest_version query inside the record_version Multi would do it. Not suggesting it's needed for this PR though, just flagging for awareness.
There was a problem hiding this comment.
Aaah, great idea!! If it ever pops up, we'll include it
Description
diverged_workflows/2function to return list of workflow names that have diverged between sandbox and target projectsFixes #4001
Fixes #3958
Closes #4364
Validation steps
AI Usage
Please disclose whether you've used AI anywhere in this PR (it's cool, we just
want to know!):
You can read more details in our
Responsible AI Policy
Pre-submission checklist
/reviewwith Claude Code)
(e.g.,
:owner,:admin,:editor,:viewer)