WIP(Don't review): Add a PickDivergent step for evolog based graph resolution#13174
Draft
Caleb-T-Owens wants to merge 1 commit into master from
5cb22c3 to 577c1a0
Caleb-T-Owens commented Apr 3, 2026
fn merge_options_force_theirs(&self) -> anyhow::Result<Options> {
    Ok(self
        .tree_merge_options()?
        .with_file_favor(Some(gix::merge::tree::FileFavor::Theirs)))
@Byron for TreeFavor the options are Ancestor or Ours. Should there also be a Theirs variant?
Collaborator
Doing so would be a good motivation to rewrite the gix-merge algorithm into a form that is more suitable to handle the complexity that comes with it.
From what I remember, it already has more features than the Git merge, which also has quite a lot of complexity on its own given the tiny amount of tests for it (in the Git test suite).
So yeah, everything is possible, but it's going to be hard unless one finds an approach that makes it significantly easier.
577c1a0 to acc9aba
Not done yet:
Combining divergent changes
Problem
When local history and remote history both evolve the same logical change, a normal cherry-pick is not enough.
The simplest case is a single amended commit on each side:
- remote: `X` into `R`
- local: `X` into `L`
In that case, we can rebase both sides onto the desired parent and perform a three-way merge using `X` as the base.
The harder case is when either side split the original change into multiple commits. For example:
- remote: `X` into `R1 -> R2 -> R3`
- local: `X` into `L1 -> L2`
When pulling, we want to preserve the remote shape. That means the result should still be three remote-position commits, while local amendments are distributed across those three commits in remote order.
This document defines the terminology and algorithm for doing that.
Goals
The remote side determines the number and ordering of commits in the resolved result.
If the remote side split a logical change into multiple commits, those commits stay in that order.
Local amendments should either be assigned to an earlier matching remote commit or carried forward to a later one.
The solution should compose with normal step-graph rebasing instead of creating a separate parallel engine.
If the final merge for a remote position conflicts, the result should use the existing conflict representation.
Terms
Divergence ancestor
The commit from which the local and remote histories both descended before they diverged.
This is the preferred merge base when one exists. In earlier notes this was described as a “common amended ancestor”. In this document, we use divergence ancestor consistently.
Remote family
An ordered set of one or more remote commits that represent the remote-side evolution of one logical change.
Example:
`R1 -> R2 -> R3`
A remote family preserves remote structure. The output must contain one resolved commit for each remote family member.
Local sequence
The ordered list of local commits that represent the local-side evolution of the same logical change.
Example:
`L1 -> L2`
The local sequence is treated as input material to distribute across the remote family. Its internal commit structure does not need to be preserved.
Family order
The processing order of commits in a remote family.
This is not a separate family-local ordering. It is the normal traversal order already implied by the step graph: parentmost to childmost in the output history.
Rebase base
The commit that the current remote family member should be placed on top of in the output graph.
For the first member of a family, this is the parent produced by the preceding step in the rebase graph. For later members, it is the resolved parent created earlier in the same family (or whatever parent the step graph specifies at that position).
Hunk pool
A tree representing the unresolved local content that still needs to be assigned into the remote family.
At the beginning of processing, the hunk pool contains the entire local sequence flattened onto the first rebase base. As each remote family member absorbs overlapping hunks, the remaining hunks are carried forward in the pool.
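A minimal sketch of how the initial pool could be built, assuming a tree is modeled as a plain path-to-content map and each local commit as the set of paths it writes or deletes. `Tree`, `Change`, and `flatten_onto` are hypothetical names for illustration, not `but-rebase` types, and whole-file replacement stands in for the real merge that favors the local side.

```rust
use std::collections::BTreeMap;

/// Simplified tree: path -> file content (an assumption for this sketch).
pub type Tree = BTreeMap<String, String>;

/// One local commit, reduced to the paths it writes (`Some`) or deletes (`None`).
pub type Change = BTreeMap<String, Option<String>>;

/// Flatten the local sequence onto `base`, keeping only the final local content.
/// Later commits overwrite earlier ones, so the local commit structure is lost.
pub fn flatten_onto(base: &Tree, local_sequence: &[Change]) -> Tree {
    let mut pool = base.clone();
    for commit in local_sequence {
        for (path, content) in commit {
            match content {
                Some(text) => {
                    pool.insert(path.clone(), text.clone());
                }
                None => {
                    pool.remove(path);
                }
            }
        }
    }
    pool
}
```

The result is a single tree, which matches the definition above: the pool carries content, not history.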
Overlap
A hunk in the hunk pool overlaps a remote commit if it touches the same changed region as that remote commit’s delta after that remote commit has been materialized onto its output parent.
Overlap is determined using a zero-context diff of the rebased/materialized remote commit against its output parent.
In other words, for remote family member `Ri`, the classifier uses the zero-context diff of `Ri'` against its output parent, where `Ri'` is the remote commit after it has been rebased or otherwise materialized at its output position.
Rules:
The algorithm deliberately avoids trying to do clever sub-file distribution for rename cases.
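A minimal sketch of the overlap test, assuming both the pool hunks and the remote delta are zero-context hunks expressed against the same output parent. The `Hunk` type, its fields, and both function names are hypothetical and not part of `but-rebase`. Widening a pure insertion to one line is also an assumption made here, so that an insertion can still collide with an edit at the same position.

```rust
/// A changed region of one file, as a zero-context diff would report it.
/// `start` is the first affected line and `len` the number of affected lines.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Hunk {
    pub path: String,
    pub start: u32,
    pub len: u32,
}

impl Hunk {
    pub fn new(path: &str, start: u32, len: u32) -> Self {
        Hunk { path: path.to_string(), start, len }
    }

    /// Half-open line range `[start, end)`. An empty hunk (pure insertion)
    /// is widened to one line; this is a choice made for the sketch.
    fn range(&self) -> (u32, u32) {
        (self.start, self.start + self.len.max(1))
    }
}

/// Two hunks overlap when they touch the same path and their ranges intersect.
pub fn overlaps(pool_hunk: &Hunk, remote_hunk: &Hunk) -> bool {
    if pool_hunk.path != remote_hunk.path {
        return false;
    }
    let (a_start, a_end) = pool_hunk.range();
    let (b_start, b_end) = remote_hunk.range();
    a_start < b_end && b_start < a_end
}

/// A pool hunk overlaps a remote commit if it overlaps any hunk of its delta.
pub fn overlaps_remote_delta(pool_hunk: &Hunk, remote_delta: &[Hunk]) -> bool {
    remote_delta.iter().any(|remote| overlaps(pool_hunk, remote))
}
```

Rename handling is intentionally left out: matching is by path only, in line with the note above about avoiding sub-file distribution for renames.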
Why this belongs in the rebase engine
This operation needs full rebase context.
A remote family may be spread through a larger graph, and the remaining local hunks must be carried forward from one remote position to the next. That means we need the same parent tracking, commit materialization, and conflict handling that the existing rebase engine already provides.
Creating a separate divergence-specific history engine would duplicate a large amount of the rebase machinery and make composition harder.
Instead, divergent resolution should be modeled as a special step type that runs inside the rebase engine and then lowers to ordinary pick steps once resolved.
Proposed step: `PickDivergent`
Each remote family member is represented by one `PickDivergent` step.
Invariants
The following invariants are trusted input from the caller.
For all `PickDivergent` steps with the same `family_id`:
- `local_commits` is identical
- `ancestor` is identical
- `remote_commit`
After processing, a `PickDivergent` step is lowered into an ordinary `Pick` step in the output step graph. No additional divergence-specific metadata needs to survive in the resulting graph.
High-level idea
The remote family defines the output shape.
The local sequence is flattened into a single hunk pool. Then, for each remote family member in order:
Earlier remote commits get first claim on overlapping hunks. Assignment is greedy and single-claim: once a hunk is assigned to the current remote family member, it is removed from the pool and cannot be assigned again later in the traversal. Any hunks that were never claimed by earlier commits are guaranteed to land in the final remote family member.
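The greedy, single-claim traversal can be sketched as follows. This is an illustration under simplifying assumptions: hunks are line ranges per path, line numbers stay stable between remote positions (the real algorithm rebases the remaining pool before each step), and the family has at least one member. `Hunk`, `overlaps`, and `distribute` are hypothetical names.

```rust
use std::ops::Range;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Hunk {
    pub path: &'static str,
    pub lines: Range<u32>,
}

fn overlaps(a: &Hunk, b: &Hunk) -> bool {
    a.path == b.path && a.lines.start < b.lines.end && b.lines.start < a.lines.end
}

/// Distribute `pool` over the remote family in family order.
/// Returns one bucket of claimed hunks per remote family member.
pub fn distribute(mut pool: Vec<Hunk>, remote_deltas: &[Vec<Hunk>]) -> Vec<Vec<Hunk>> {
    let mut claimed = Vec::with_capacity(remote_deltas.len());
    for (i, delta) in remote_deltas.iter().enumerate() {
        let current = std::mem::take(&mut pool);
        let is_last = i + 1 == remote_deltas.len();
        let (matched, rest): (Vec<Hunk>, Vec<Hunk>) = if is_last {
            // The final member absorbs everything that is still unclaimed.
            (current, Vec::new())
        } else {
            // Earlier members only claim hunks that overlap their own delta.
            current
                .into_iter()
                .partition(|hunk| delta.iter().any(|remote| overlaps(hunk, remote)))
        };
        claimed.push(matched);
        // Single-claim: only the unclaimed remainder is carried forward.
        pool = rest;
    }
    claimed
}
```

Because a claimed hunk is moved out of the pool, a hunk that would overlap both an earlier and a later remote commit can only ever end up in the earlier one.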
Detailed algorithm
Assume:
- remote family: `R1, R2, ..., Rn`
- local sequence: `L1, L2, ..., Lm`
- divergence ancestor: `A`
- initial rebase base: `B0`
- `R1..Rn` are processed in family order.
1. Build the initial hunk pool
Rebase the local sequence onto `B0` and flatten it into a single tree, preserving the final local content rather than local commit structure.
Conceptually:
- start from `B0`
- apply `L1, L2, ..., Lm` in local order
In terms of the current merge implementation, this means preferring the local / “theirs” side while constructing the pool.
Call the resulting tree `P1`.
If a divergence ancestor `A` exists, also rebase it onto `B0` to produce `A1`.
2. For each remote family member, move the pool to the correct parent
For step `i`, let `Bi` be the actual parent commit for that remote position in the output graph.
Before resolving `Ri`:
- if the pool is not already based on `Bi`, rebase it onto `Bi`
- if an ancestor exists and is not already based on `Bi`, rebase the ancestor onto `Bi`
This keeps the pool and optional ancestor aligned with the current output position.
3. Materialize the remote commit for this position
Resolve the remote commit at this step exactly as the rebase engine would normally place it:
- cherry-pick `Ri` onto `Bi`
- if `Ri` is a merge commit, first construct the virtual parent/merge base exactly as existing cherry-pick logic does
Call the resulting commit tree for this position `Ri*`.
If the resolved commit for this position is ultimately written out, it keeps the remote commit’s message, author, and extra headers, and follows the same signing semantics as a normal `Pick`.
4. Partition the hunk pool
If `i < n`, compute the zero-context diff for the rebased/materialized remote delta, that is, the diff of `Ri'` against its output parent `Bi` with zero lines of context, where `Ri'` is the remote commit after it has been materialized onto its output parent.
Use that diff to partition the current pool `Pi` into:
- `Pi_match`: hunks that overlap `Ri'`
- `Pi_rest`: hunks that do not overlap `Ri'`
Assignment is greedy in family order and single-claim: if a hunk overlaps the current remote commit, it is assigned here, removed from the pool, and cannot later be assigned to another remote family member even if it would also overlap there.
If `i == n` (the last remote family member), skip filtering and define:
- `Pi_match = Pi`
- `Pi_rest = ∅`
This guarantees that any remaining local changes are incorporated into the final remote position.
5. Merge the matched pool into the current remote position
Produce the resolved tree for this remote position with a three-way merge:
- base: `Bi`
- ours: `Ri*`
- theirs: `Pi_match`
In shorthand: `merge(base = Bi, ours = Ri*, theirs = Pi_match)`
If the merge is clean, write a normal commit for this remote position.
If `Pi_match` is empty for a non-final step, we still emit the remote commit for that position normally, because preserving the remote structure takes priority over emitting only commits that absorb local hunks.
If the merge conflicts, write a conflicted commit using the existing conflict representation.
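To make the merge step concrete, here is a sketch of a three-way merge at whole-file granularity, assuming `Bi` acts as the base, `Ri*` as "ours", and `Pi_match` as "theirs". Trees are modeled as path-to-content maps, and `Tree`, `MergeOutcome`, and `three_way_merge` are hypothetical names. The real implementation merges file contents through `gix`; this sketch only decides per path which side wins.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified tree: path -> file content (an assumption for this sketch).
pub type Tree = BTreeMap<String, String>;

#[derive(Debug, PartialEq, Eq)]
pub struct MergeOutcome {
    pub tree: Tree,
    /// Paths changed differently on both sides; empty means a clean merge.
    pub conflicted_paths: Vec<String>,
}

pub fn three_way_merge(base: &Tree, ours: &Tree, theirs: &Tree) -> MergeOutcome {
    let paths: BTreeSet<&String> = base
        .keys()
        .chain(ours.keys())
        .chain(theirs.keys())
        .collect();
    let mut tree = Tree::new();
    let mut conflicted_paths = Vec::new();
    for path in paths {
        let (b, o, t) = (base.get(path), ours.get(path), theirs.get(path));
        let resolved = if o == t {
            o // both sides agree (including both deleting the path)
        } else if o == b {
            t // only theirs changed the path
        } else if t == b {
            o // only ours changed the path
        } else {
            // Both sides changed it differently: record the conflict and
            // keep ours as a placeholder.
            conflicted_paths.push(path.clone());
            o
        };
        if let Some(content) = resolved {
            tree.insert(path.clone(), content.clone());
        }
    }
    MergeOutcome { tree, conflicted_paths }
}
```

With an empty `Pi_match` (that is, `theirs` equal to `base`), the outcome is simply `ours`, which corresponds to emitting the remote commit unchanged for a non-final step.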
6. Carry the rest of the pool forward
If `i < n`, set: `P(i+1) = Pi_rest`
Then continue to the next remote family member.
Because the remainder is rebased onto the next remote position before reuse, later commits see the still-unassigned local changes in the correct context.
This makes the algorithm intentionally sequential and context-sensitive: overlap for `R(i+1)` is evaluated only after `Ri` has been materialized and after the remaining pool has been carried forward.
7. Lower the result into ordinary rebase steps
Once each `PickDivergent` step has been resolved, replace it in the output graph with a normal `Pick` step containing the newly created commit id.
From that point on, the rest of the rebase engine does not need to know that divergent resolution happened.
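The lowering could look roughly like this. The `Step` enum is a hypothetical stand-in for the real step type: the field names `family_id`, `remote_commit`, `local_commits`, and `ancestor` come from the invariants above, while their types, the string commit ids, and the `resolve` callback are assumptions made for the sketch.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Step {
    Pick {
        commit: String,
    },
    PickDivergent {
        family_id: u32,
        remote_commit: String,
        local_commits: Vec<String>,
        ancestor: Option<String>,
    },
}

/// Replace every `PickDivergent` step with a plain `Pick` of the commit that
/// `resolve` produced for it. Ordinary steps pass through untouched.
pub fn lower(steps: Vec<Step>, mut resolve: impl FnMut(&Step) -> String) -> Vec<Step> {
    steps
        .into_iter()
        .map(|step| {
            if matches!(step, Step::PickDivergent { .. }) {
                Step::Pick { commit: resolve(&step) }
            } else {
                step
            }
        })
        .collect()
}
```

After `lower` returns, no divergence-specific metadata is left in the step list, so downstream code only ever sees `Pick`.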
Worked example
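Suppose:
- divergence ancestor: `X`
- remote family: `R1 -> R2 -> R3`
- local sequence: `L1 -> L2`
We want the final history to remain three commits on the remote side.
- Flatten `L1 -> L2` onto the parent of `R1` to create `P1`
- Partition `P1` against the delta of `R1`
  - matching hunks: `P1_match`
  - remaining hunks: `P1_rest`
- Merge `P1_match` into `R1`
- Rebase `P1_rest` onto the parent of `R2`
- Partition the remaining pool against the delta of `R2`
- Merge the matching hunks into `R2`
- At `R3`, merge the entire remaining pool into the final remote position
Result:
- the remote shape `R1 -> R2 -> R3` is preserved
- any local changes not claimed earlier land in `R3`
Merge-commit handling
If a remote or ancestor commit involved in divergent resolution has multiple parents, the algorithm should use the same generalized cherry-pick behavior already present in `but-rebase`:
In other words, `PickDivergent` should reuse the crate’s existing N-to-M cherry-pick semantics rather than inventing separate merge behavior.
Conflict behavior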
There are two distinct failure modes:
1. Prerequisite merge-base construction fails
If the algorithm cannot construct the required effective base or effective destination state for a merge commit, the operation should fail immediately.
This matches existing `cherry_pick` behavior that reports `FailedToMergeBases`.
2. Final divergent merge conflicts
If the final merge for a remote position conflicts, the operation should still succeed by materializing a conflicted commit using the existing conflict representation.
That is the same user-visible behavior as other rebases in `but-rebase`.
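The two failure modes map naturally onto a `Result`: a missing effective base is an error, while a conflicting merge is still a successful outcome. The sketch below is illustrative only. `DivergentError`, `Resolved`, and `resolve_position` are hypothetical; the variant name `FailedToMergeBases` merely mirrors the error mentioned above, and the string "tree" is a placeholder for a real tree id.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum DivergentError {
    /// The effective base or destination state could not be constructed.
    FailedToMergeBases,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Resolved {
    /// The merge was clean; a normal commit is written.
    Clean { tree: String },
    /// The merge conflicted; a conflicted commit is written instead.
    Conflicted { tree: String },
}

/// `effective_base` is `None` when prerequisite merge-base construction failed.
/// `merge_is_clean` stands in for the result of the final three-way merge.
pub fn resolve_position(
    effective_base: Option<&str>,
    merge_is_clean: bool,
) -> Result<Resolved, DivergentError> {
    // Failure mode 1: abort immediately, nothing is written.
    let base = effective_base.ok_or(DivergentError::FailedToMergeBases)?;
    let tree = format!("merged-on-{}", base);
    // Failure mode 2 is not an error: the operation succeeds either way.
    Ok(if merge_is_clean {
        Resolved::Clean { tree }
    } else {
        Resolved::Conflicted { tree }
    })
}
```

Keeping conflicts on the `Ok` side means callers can continue the rebase and surface the conflicted commit the same way other rebases do.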