Skip to content

Always provide a relevant TI context in Dag callback#61274

Merged
eladkal merged 80 commits intoapache:mainfrom
IndiTheo:bugfix/improve-dag-callback-context
Mar 8, 2026
Merged

Always provide a relevant TI context in Dag callback#61274
eladkal merged 80 commits intoapache:mainfrom
IndiTheo:bugfix/improve-dag-callback-context

Conversation

@IndiTheo
Copy link
Copy Markdown
Contributor

@IndiTheo IndiTheo commented Jan 31, 2026

This PR strives to make sense out of TI contexts passed to Dag callbacks. Rather than always passing the last task lexicogrpahically, the new logic passes the last "relevant" task to Dag's resulting state.

Behavior outline:

  1. On regular failure, pass the latest finished task out of the failed ones.
  2. On timeout, pass the task that started last out of the unfinished tasks.
  3. On "tasks deadlocked", pass a task that should have run next but wasn't able to.
  4. On success, pass the latest succeeded task.

There's a possibility it's better to pass no "ti" key at all to Dag-level contexts, or provide a None value. This might break the existing code even more, so currently the PR only improves the relevancy of the task passed. Yet we should consider the option of not passing any task instance in the future.


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)
    deepwiki.com was used to find the relevant logic in the repo, and an embedded chat was used to make a skeleton for tests.

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

@IndiTheo IndiTheo requested review from XD-DENG and ashb as code owners January 31, 2026 00:33
@boring-cyborg boring-cyborg Bot added the area:Scheduler including HA (high availability) scheduler label Jan 31, 2026
@IndiTheo IndiTheo changed the title Always provide a relevant task instance in Dag callback Always provide a relevant TI context in Dag callback Jan 31, 2026
Comment thread airflow-core/src/airflow/jobs/scheduler_job_runner.py Outdated
@IndiTheo IndiTheo marked this pull request as draft February 4, 2026 19:43
@IndiTheo IndiTheo marked this pull request as ready for review February 4, 2026 20:26
@IndiTheo
Copy link
Copy Markdown
Contributor Author

IndiTheo commented Feb 4, 2026

I want to get some feedback on the behavior, whether it's the desired one or if anything can be made better.

I will update the docs to explain the behavior of passed TIs to dag callbacks.

Copy link
Copy Markdown
Contributor

@Nataneljpwd Nataneljpwd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I have left a few comments on the changes, overall, great find!

Comment thread airflow-core/src/airflow/jobs/scheduler_job_runner.py Outdated
Comment thread airflow-core/src/airflow/models/dagrun.py Outdated
Comment thread airflow-core/src/airflow/models/dagrun.py Outdated
Comment thread airflow-core/src/airflow/models/dagrun.py Outdated
Comment thread airflow-core/src/airflow/models/dagrun.py Outdated
Comment thread airflow-core/src/airflow/models/dagrun.py Outdated
Comment thread airflow-core/src/airflow/models/dagrun.py Outdated
@IndiTheo IndiTheo requested a review from Nataneljpwd February 7, 2026 12:07
@IndiTheo
Copy link
Copy Markdown
Contributor Author

IndiTheo commented Mar 6, 2026

Slightly confused about the milestone

@eladkal
Copy link
Copy Markdown
Contributor

eladkal commented Mar 6, 2026

Slightly confused about the milestone

I'll set it for 3.1.9 but this is subject to confirmation from the release manager of 3.1.9

@kaxil
Copy link
Copy Markdown
Member

kaxil commented Mar 6, 2026

Since technically, this is a chance in behavior, let's put it in 3.2, can you add a newsfragment too please @Asquator : https://github.com/apache/airflow/tree/main/airflow-core/newsfragments

Comment thread airflow-core/docs/administration-and-deployment/logging-monitoring/callbacks.rst Outdated
IndiTheo and others added 2 commits March 6, 2026 16:37
Co-authored-by: Elad Kalif <45845474+eladkal@users.noreply.github.com>
@eladkal
Copy link
Copy Markdown
Contributor

eladkal commented Mar 8, 2026

static checks are failing

@IndiTheo
Copy link
Copy Markdown
Contributor Author

IndiTheo commented Mar 8, 2026

The test is failing because my newsfragment has multiple lines, and it can't...

Do I change the type to significant, or avoid the elaborate explanations by including just one general sentence?

Generally I'm not sure why we have to limit this to just one line... I don't think it's a significant change but a minor improvement, and I still wanted to include some details.

@IndiTheo
Copy link
Copy Markdown
Contributor Author

IndiTheo commented Mar 8, 2026

Newsfragment validation should pass now.

@eladkal eladkal merged commit 43c17c6 into apache:main Mar 8, 2026
73 checks passed
@boring-cyborg
Copy link
Copy Markdown

boring-cyborg Bot commented Mar 8, 2026

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

wjddn279 pushed a commit to wjddn279/airflow that referenced this pull request Mar 9, 2026
* Skeleton solution

* Handle more cases

* Refactored None handling

* Fixed task name

* Materialized tasks for double iteration

* Use default argument in min

* Use default argument in max

* Use default argument in max

* Changed end_date to start_date

* Fix logic for on_success_callback

* Handle the case where ti has no start_date

* Logic for DAG testing

* Cosmetics

* Simplified logic

* Pass last succeeded TI on success

* Added template method

* Cosmetics

* More cosmetics

* Lint

* Added default keys for mypy check to succeed

* Updated the docs

* Fixed timezone aware datetime comparison

* Clarified Dag run timeouts in the docs

* Fix condition to look for failures across all relevant TIs

* Mypy & ruff

* Timezone awareness for deadlocked tasks case

* Select from all tis for success callback

* Improved deadlock callback logic

* Mypy & ruff

* Made the test more explicit

* Format

* Updated backward docs version to 3.1.9 according to milestone

* Revert "Updated backward docs version to 3.1.9 according to milestone"

This reverts commit 4defea7.

* Type hint

Thank you

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>

* Reapply "Updated backward docs version to 3.1.9 according to milestone"

This reverts commit a31476a.

* Removed redundant iter() call

* Removed redundant logic in deadlock case

* Updated backward docs version to 3.2.0 according to milestone

Co-authored-by: Elad Kalif <45845474+eladkal@users.noreply.github.com>

* Added a newsfragment entry

* Removed trailing whitespaces

* Shortened nesfragment to just one line

---------

Co-authored-by: TheoS <asquator@proton.me>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Elad Kalif <45845474+eladkal@users.noreply.github.com>
dominikhei pushed a commit to dominikhei/airflow that referenced this pull request Mar 11, 2026
* Skeleton solution

* Handle more cases

* Refactored None handling

* Fixed task name

* Materialized tasks for double iteration

* Use default argument in min

* Use default argument in max

* Use default argument in max

* Changed end_date to start_date

* Fix logic for on_success_callback

* Handle the case where ti has no start_date

* Logic for DAG testing

* Cosmetics

* Simplified logic

* Pass last succeeded TI on success

* Added template method

* Cosmetics

* More cosmetics

* Lint

* Added default keys for mypy check to succeed

* Updated the docs

* Fixed timezone aware datetime comparison

* Clarified Dag run timeouts in the docs

* Fix condition to look for failures across all relevant TIs

* Mypy & ruff

* Timezone awareness for deadlocked tasks case

* Select from all tis for success callback

* Improved deadlock callback logic

* Mypy & ruff

* Made the test more explicit

* Format

* Updated backward docs version to 3.1.9 according to milestone

* Revert "Updated backward docs version to 3.1.9 according to milestone"

This reverts commit 4defea7.

* Type hint

Thank you

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>

* Reapply "Updated backward docs version to 3.1.9 according to milestone"

This reverts commit a31476a.

* Removed redundant iter() call

* Removed redundant logic in deadlock case

* Updated backward docs version to 3.2.0 according to milestone

Co-authored-by: Elad Kalif <45845474+eladkal@users.noreply.github.com>

* Added a newsfragment entry

* Removed trailing whitespaces

* Shortened nesfragment to just one line

---------

Co-authored-by: TheoS <asquator@proton.me>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Elad Kalif <45845474+eladkal@users.noreply.github.com>
Pyasma pushed a commit to Pyasma/airflow that referenced this pull request Mar 13, 2026
* Skeleton solution

* Handle more cases

* Refactored None handling

* Fixed task name

* Materialized tasks for double iteration

* Use default argument in min

* Use default argument in max

* Use default argument in max

* Changed end_date to start_date

* Fix logic for on_success_callback

* Handle the case where ti has no start_date

* Logic for DAG testing

* Cosmetics

* Simplified logic

* Pass last succeeded TI on success

* Added template method

* Cosmetics

* More cosmetics

* Lint

* Added default keys for mypy check to succeed

* Updated the docs

* Fixed timezone aware datetime comparison

* Clarified Dag run timeouts in the docs

* Fix condition to look for failures across all relevant TIs

* Mypy & ruff

* Timezone awareness for deadlocked tasks case

* Select from all tis for success callback

* Improved deadlock callback logic

* Mypy & ruff

* Made the test more explicit

* Format

* Updated backward docs version to 3.1.9 according to milestone

* Revert "Updated backward docs version to 3.1.9 according to milestone"

This reverts commit 4defea7.

* Type hint

Thank you

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>

* Reapply "Updated backward docs version to 3.1.9 according to milestone"

This reverts commit a31476a.

* Removed redundant iter() call

* Removed redundant logic in deadlock case

* Updated backward docs version to 3.2.0 according to milestone

Co-authored-by: Elad Kalif <45845474+eladkal@users.noreply.github.com>

* Added a newsfragment entry

* Removed trailing whitespaces

* Shortened nesfragment to just one line

---------

Co-authored-by: TheoS <asquator@proton.me>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Elad Kalif <45845474+eladkal@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Scheduler including HA (high availability) scheduler type:bug-fix Changelog: Bug Fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DAG on_failure_callback uses wrong context

9 participants