Skip to content

Show specific diverged workflows in sandbox merge dialog#4396

Merged
midigofrank merged 12 commits intomainfrom
4001-show-diverged-workflows
Feb 18, 2026
Merged

Show specific diverged workflows in sandbox merge dialog#4396
midigofrank merged 12 commits intomainfrom
4001-show-diverged-workflows

Conversation

@midigofrank
Copy link
Collaborator

@midigofrank midigofrank commented Feb 9, 2026

Description

  • Add diverged_workflows/2 function to return list of workflow names that have diverged between sandbox and target projects
  • Update merge modal to display specific workflow names in a bullet list when divergence is detected
Screenshot 2026-02-09 at 11 36 49

Fixes #4001
Fixes #3958
Closes #4364

Validation steps

  1. Create a sandbox project
  2. Pick any workflow in the sandbox project and add a job
  3. Go back to the sandboxes page for the project and click to merge the sandbox.
  4. The merge modal should list your workflow as diverged, like in the attached screenshot

AI Usage

Please disclose whether you've used AI anywhere in this PR (it's cool, we just
want to know!):

  • I have used Claude Code
  • I have used another model
  • I have not used AI

You can read more details in our
Responsible AI Policy

Pre-submission checklist

  • I have performed an AI review of my code (we recommend using /review
    with Claude Code)
  • I have implemented and tested all related authorization policies.
    (e.g., :owner, :admin, :editor, :viewer)
  • I have updated the changelog.
  • I have ticked a box in "AI usage" in this PR

Add diverged_workflows/2 function to return list of workflow names that
have diverged between sandbox and target projects, rather than just a
boolean flag. Update merge modal to display these specific workflow
names in a bullet list when divergence is detected.

This helps users make informed decisions about whether to proceed with
a merge by showing exactly which workflows will be affected.

Fixes #4001
@github-project-automation github-project-automation bot moved this to New Issues in v2 Feb 9, 2026
@midigofrank midigofrank self-assigned this Feb 9, 2026
@codecov
Copy link

codecov bot commented Feb 9, 2026

Codecov Report

❌ Patch coverage is 98.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 89.38%. Comparing base (6f7c1ab) to head (ec3bacf).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
lib/lightning/projects/merge_projects.ex 90.90% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4396      +/-   ##
==========================================
- Coverage   89.38%   89.38%   -0.01%     
==========================================
  Files         425      425              
  Lines       20135    20133       -2     
==========================================
- Hits        17997    17995       -2     
  Misses       2138     2138              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@midigofrank
Copy link
Collaborator Author

@josephjclark let me know your thoughts on the wording

- Remove dbg() call from diverged workflows conditional check
- Extract duplicate diverged workflows logic into reusable helper function
- Simplifies code maintenance by centralizing the workflow divergence check
@josephjclark
Copy link
Collaborator

Something is up here Frank - I see the warning even if the branches haven't diverged

Divergence means that the parent project changed since the sandbox was forked. So the head version in the parent is not in the sandbox's history.

@doc """
Returns the list of workflow names that have diverged between source and target projects.

Compares version hashes for workflows with matching names. A workflow is considered
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep:

A workflow is considered
diverged if it exists in both projects but has different version hashes.

this isn't quite right: a workflow is diverged if the parent head is not in the workflow's history.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, great catch!! Let me add tests

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@josephjclark I have fixed this but I think there is another issue we need to fix. Currently, we're squashing the hashes if the source(cli, app) is the same. This means the workflows will always diverge after making a change because we will squash the "cloned" hash

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, it's bad. We don't clone the versions at all when creating sandbox

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually #3958 .

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, it's bad. We don't clone the versions at all when creating sandbox

I lied, we do clone the versions during forking: https://github.com/OpenFn/lightning/blob/main/lib/lightning/projects/sandboxes.ex#L558

…rison

Update divergence detection to properly identify when parent project has
changed since fork. A workflow now diverges only if the parent's current
HEAD version is absent from the sandbox's version history, not just when
the two HEADs differ.

This allows sandboxes that have pulled parent updates and moved forward
to correctly show no divergence, while still detecting true conflicts
where parent and sandbox evolved independently.

Changes:
- Modify get_workflow_version_hashes_by_name to return all version hashes
  grouped by workflow name instead of just HEAD hash
- Update diverged_workflows/2 to check if parent HEAD exists in sandbox
  history list
- Refactor has_diverged?/2 to delegate to diverged_workflows/2
- Add comprehensive tests validating version history comparison logic
@josephjclark
Copy link
Collaborator

@midigofrank I don't want to merge this until we've done the demo release - so I'm gonna pause review today. I'll review all your sandbox PRs together (maybe EOD tomorrow)

@midigofrank
Copy link
Collaborator Author

@josephjclark mind giving this another review?

@josephjclark
Copy link
Collaborator

Sorry Frank this still isn't looking right :(

  • I crate a new sandbox
  • I edit a step
  • I try to merge the sandbox - it warns me that it's diverged (it hasn't - I've edited the sandbox without changing the OG)

I've just gone onto main and I can see that it's giving exactly the same warning.

Ok, so this PR is correct in that it now lists the names of any affected workflows.

But we've got another critical issue in that something is wrong in the divergence logic still. I'm going to raise this as a new issue.

@midigofrank
Copy link
Collaborator Author

Hey @josephjclark , I think part of the reason why you keep getting diverged workflows is because we always squash the versions. The fix is in #4414 . Let me get it ready

midigofrank and others added 3 commits February 17, 2026 18:45
* Ensure workflows have version history before operations (#3958)

Added WorkflowVersions.ensure_version_recorded/1 function to guarantee
all workflows have at least one version recorded before operations that
depend on version history.

Key changes:
- New ensure_version_recorded/1 function in WorkflowVersions module
- Prevents squashing of first workflow version
- Called before sandbox creation in Sandboxes module
- Called before workflow merge in MergeProjects module
- Called before document import in Provisioner module
This commit removes the version_history array field from the workflows
table in favor of always querying the workflow_versions table directly.

Changes:
- Remove version_history field from Workflow schema
- Update record_version to return WorkflowVersion instead of Workflow
- Update ensure_version_recorded to return WorkflowVersion
- Remove reconcile_history! function (no longer needed)
- Update history_for and latest_hash to always query database
- Fix provisioner to ignore version_history param during provisioning
- Fix validate_extraneous_params to handle tuple params correctly
- Add migration to drop version_history column

Test updates:
- Update all tests to expect WorkflowVersion return type
- Remove tests for version_history field access
- Update tests to use WorkflowVersions.history_for() instead
- Fix provisioning controller tests to ensure version before comparison
- Update CLI deploy tests to include version_history in expected state

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@midigofrank
Copy link
Collaborator Author

Hey @josephjclark @elias-ba , I have merged in the 2 sandbox branches here. Would you mind giving this a fresh review?

Copy link
Collaborator

@josephjclark josephjclark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @midigofrank - just tested this and I'm getting no divergence warnings.

@midigofrank midigofrank merged commit 9c1e3ae into main Feb 18, 2026
7 checks passed
@midigofrank midigofrank deleted the 4001-show-diverged-workflows branch February 18, 2026 11:10
@github-project-automation github-project-automation bot moved this from New Issues to Done in v2 Feb 18, 2026
Copy link
Contributor

@elias-ba elias-ba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @midigofrank this looks really great. I spotted few things I would like to bring to your attention. But most of them are just for raising awareness, you tell me if these are not super important to address right now and I will approve this PR.

Comment on lines +150 to +157
defp ensure_workflows_have_versions(project) do
workflows =
Workflows.list_project_workflows(project.id,
include: [:jobs, :edges, :triggers]
)

Enum.each(workflows, &WorkflowVersions.ensure_version_recorded/1)
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. If ensure_version_recorded/1 returns {:error, reason} for any workflow, the error is silently dropped. This will always succeed because Enum.each returns :ok regardless. Can we consider using Enum.each with explicit error checking, or switch to something like:
defp ensure_workflows_have_versions(project) do
  workflows = Workflows.list_project_workflows(project.id, include: [:jobs, :edges, :triggers])

  Enum.reduce_while(workflows, :ok, fn workflow, :ok ->
    case WorkflowVersions.ensure_version_recorded(workflow) do
      {:ok, _} -> {:cont, :ok}
      {:error, reason} -> {:halt, {:error, reason}}
    end
  end)
end
  1. This calls ensure_version_recorded for each workflow, which calls latest_version (1 query) and potentially record_version (1 transaction with multiple queries) per workflow. For projects with many workflows, this could add noticeable latency. Not a blocker but worth noting.

Copy link
Collaborator Author

@midigofrank midigofrank Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch, but I really didn't want to check it's failure. This was more of a side effect than a main thread thing. Remember, at this point someone is fetching the projectState.json


defp create_sandbox_workflows(parent, sandbox) do
Enum.reduce(parent.workflows, %{}, fn parent_workflow, mapping ->
{:ok, _} = WorkflowVersions.ensure_version_recorded(parent_workflow)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will raise a MatchError if ensure_version_recorded returns {:error, _}. While unlikely in practice, a DB issue or constraint violation would crash the caller. Can we consider using with or propagating the error ? I think same pattern is in lib/lightning/projects/merge_projects.ex:96 too

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch, but error propagation at this point was kinda difficult. The existing pattern even with workflow creation was all matching {:ok, var} = insert(workflow | edge | ...) so I just used the same pattern

assert length(updated_history) == 1
assert [updated_version] = updated_history
assert String.starts_with?(updated_version, "cli:")
# we nolonger squash the first version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"no longer" instead of "nolonger" ?

Comment on lines +117 to +126
def ensure_version_recorded(%Workflow{} = workflow) do
latest_version = latest_version(workflow.id)

if latest_version do
{:ok, latest_version}
else
hash = generate_hash(workflow)
record_version(workflow, hash)
end
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has a small TOCTOU window. Two concurrent calls for the same workflow could both see latest_version as nil and both proceed to record_version.

Inside record_version, the Ecto.Multi transaction queries latest_version again, but without a row lock (FOR UPDATE), so under PostgreSQL's default READ COMMITTED isolation both transactions can see nil simultaneously. Since the unique index on (workflow_id, hash) was removed recently, there's no DB-level constraint preventing both inserts from succeeding, which could result in a duplicate row.

In practice the risk is very low because:

  • ensure_version_recorded is only called during sandbox creation, merge, and provisioning, infrequent, user-initiated operations
  • generate_hash is deterministic, so both calls produce the same hash, a duplicate row is cosmetically odd but functionally harmless for divergence detection

If we ever wanted to fully close this window, adding lock: "FOR UPDATE" to the latest_version query inside the record_version Multi would do it. Not suggesting it's needed for this PR though, just flagging for awareness.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaah, great idea!! If it ever pops up, we'll include it

@github-project-automation github-project-automation bot moved this from Done to In review in v2 Feb 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

sandboxes: when projects have diverged, tell me which workflows are in conflict sandboxes: no version history is available on old workflows

3 participants