improve optimization complete logic by mgarrard · Pull Request #4828 · facebook/Ax

mgarrard · 2026-01-27T17:03:15Z

Summary:
This change updates the completion-state logic: if a node can transition, and that transition is to itself, the optimization is considered complete.

This works because should_transition_to_next_node only considers transition-blocking criteria (i.e., not max parallelism) when deciding whether to transition. If a node points to itself, we can assume that signifies the end of the optimization (steps are initialized this way earlier in this stack). This also allows the GS to be called into again and the TC to change, putting it back into a non-complete state.
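To make the rule concrete, here is a minimal sketch of the self-transition check. It is a toy model, not the Ax implementation: ToyCriterion, ToyNode, and the state dict are hypothetical names introduced only for illustration, and the toy should_transition_to_next_node simply assumes that edges are tried in creation order and that only blocking criteria count.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyCriterion:
    """Hypothetical stand-in for a transition criterion (not the Ax class)."""

    transition_to: str
    is_met: Callable[[], bool]
    block_transition_if_unmet: bool = True


@dataclass
class ToyNode:
    """Hypothetical stand-in for a generation node (not the Ax class)."""

    name: str
    transition_criteria: List[ToyCriterion] = field(default_factory=list)

    def should_transition_to_next_node(self) -> Tuple[bool, str]:
        # Group criteria by target node, keeping creation order (edge priority).
        edges: Dict[str, List[ToyCriterion]] = {}
        for tc in self.transition_criteria:
            edges.setdefault(tc.transition_to, []).append(tc)
        for target, tcs in edges.items():
            # Only transition-blocking criteria decide whether an edge fires.
            blocking = [tc for tc in tcs if tc.block_transition_if_unmet]
            if blocking and all(tc.is_met() for tc in blocking):
                return True, target
        return False, self.name


def optimization_complete(curr: ToyNode) -> bool:
    """Complete when the current node can transition and the target is itself."""
    if not curr.transition_criteria:
        return False
    can_transition, next_node = curr.should_transition_to_next_node()
    return can_transition and next_node == curr.name


# Assumed setup: a final node whose self-edge is met once a trial budget is hit.
state = {"completed": 0, "budget": 3}
final = ToyNode(
    "final",
    [ToyCriterion("final", lambda: state["completed"] >= state["budget"])],
)
print(optimization_complete(final))  # False: budget not reached yet
state["completed"] = 3
print(optimization_complete(final))  # True: the self-edge is met
state["budget"] = 5
print(optimization_complete(final))  # False: criterion changed, not complete
```

The last step mirrors the re-entry case described above: because completion is recomputed from the current node's criteria, changing a criterion is enough to leave the complete state.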

An alternative I considered is to check whether all transition edges are completed and at least one points to self. This would look something like the snippet below. It would be much more expensive to evaluate, and it only guards against a malformed strategy. Edges are already known to be created in order of importance, and self-transition edges should be considered ending edges once their importance is taken into account.

@property
def optimization_complete(self) -> bool:
    if len(self._curr.transition_criteria) == 0:
        return False

    # Check ALL transition edges, not just the first matching one
    for next_node, all_tc in self._curr.transition_edges.items():
        transition_blocking = [tc for tc in all_tc if tc.block_transition_if_unmet]
        if not transition_blocking:
            continue
        
        all_met = all(
            tc.is_met(experiment=self.experiment, curr_node=self._curr)
            for tc in transition_blocking
        )
        
        if all_met:
            # An edge's criteria are met - check where it points
            if next_node != self._curr.name:
                return False  # Can transition to different node, not complete
    
    # All met edges (if any) point to self
    # Check if we actually have any met criteria pointing to self
    can_transition, next_node = self._curr.should_transition_to_next_node(
        raise_data_required_error=False
    )
    return can_transition and next_node == self._curr.name

The third alternative is to introduce a "completion node", which I think could be viable in the future if we have more complex generation strategies than we currently support and the self generation logic is too cumbersome.

For now, though, I think this is a pretty nice simplification that should also have some compute wins, going from O(number of nodes * number of TC per node) to O(number of TC on the current node).
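As a rough illustration of that cost difference, the sketch below counts criterion evaluations for a full scan versus a current-node-only check. Everything here is hypothetical (the dict-based strategy, make_criterion, and both scan functions); it assumes the more expensive path touches every criterion on every node, and it does not reproduce the actual Ax code.

```python
from typing import Callable, Dict, List

# Global counter of how many times any criterion was evaluated.
calls = {"n": 0}


def make_criterion(result: bool) -> Callable[[], bool]:
    """Build a hypothetical criterion that records each evaluation."""

    def is_met() -> bool:
        calls["n"] += 1
        return result

    return is_met


def build_strategy(
    num_nodes: int, tc_per_node: int
) -> Dict[str, List[Callable[[], bool]]]:
    """Toy strategy: node name -> list of criterion callables."""
    return {
        f"node_{i}": [make_criterion(True) for _ in range(tc_per_node)]
        for i in range(num_nodes)
    }


def scan_all_nodes(strategy: Dict[str, List[Callable[[], bool]]]) -> int:
    """Evaluate every criterion on every node; return the evaluation count."""
    calls["n"] = 0
    for criteria in strategy.values():
        for is_met in criteria:
            is_met()
    return calls["n"]


def scan_current_node(
    strategy: Dict[str, List[Callable[[], bool]]], current: str
) -> int:
    """Evaluate only the current node's criteria; return the evaluation count."""
    calls["n"] = 0
    for is_met in strategy[current]:
        is_met()
    return calls["n"]


strategy = build_strategy(num_nodes=4, tc_per_node=3)
print(scan_all_nodes(strategy))               # 12
print(scan_current_node(strategy, "node_3"))  # 3
```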

Differential Revision: D91549954

meta-codesync · 2026-01-27T17:03:21Z

@mgarrard has exported this pull request. If you are a Meta employee, you can view the originating Diff in D91549954.

codecov-commenter · 2026-01-27T17:35:53Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.75%. Comparing base (ce4dc42) to head (9e28603).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #4828   +/-   ##
=======================================
  Coverage   96.75%   96.75%           
=======================================
  Files         591      591           
  Lines       61874    61882    +8     
=======================================
+ Hits        59869    59877    +8     
  Misses       2005     2005



Pull Request resolved: facebook#4828
Reviewed By: lena-kashtelyan


meta-codesync · 2026-02-10T01:26:10Z

This pull request has been merged in ed975ea.

meta-cla Bot added the CLA Signed label Jan 27, 2026

meta-codesync Bot added fb-exported meta-exported labels Jan 27, 2026

mgarrard force-pushed the export-D91549954 branch from fd87f1a to c77d15a Compare January 27, 2026 19:47

mgarrard force-pushed the export-D91549954 branch from c77d15a to 964c8ab Compare January 27, 2026 19:47

mgarrard force-pushed the export-D91549954 branch from 964c8ab to 5c24950 Compare February 9, 2026 20:24

mgarrard force-pushed the export-D91549954 branch from 5c24950 to e1596d1 Compare February 9, 2026 20:27

mgarrard force-pushed the export-D91549954 branch from e1596d1 to 9e28603 Compare February 9, 2026 20:27

meta-codesync Bot closed this in ed975ea Feb 10, 2026

facebook-github-bot added the Merged label Feb 10, 2026

mgarrard wants to merge 1 commit into facebook:main from mgarrard:export-D91549954
