Flatten grid structure endpoint memory consumption #61273

Merged
kaxil merged 2 commits into apache:main from astronomer:reduce_memory_spike_grid_structure_data
Feb 1, 2026

@jedcunningham
Member

The grid structure endpoint was loading all serdags for the shown dagruns into memory at once, before merging them together.

Now we load 5 at a time and expunge each one from the SQLAlchemy session after it has been merged, so it can be gc'd sooner.

For normal, simple dags this makes almost no difference:

Before:
Average: 0.0163 secs
Requests/sec: 184.5453
95% in 0.0225 secs

After:
Average: 0.0177 secs
Requests/sec: 169.2285
95% in 0.0214 secs
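
To put "almost no difference" in numbers, the relative change can be worked out from the figures above. This is only a small arithmetic sketch; `percent_change` is a helper name made up for illustration, and the inputs are copied from the benchmark output as posted.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures copied from the simple-dag benchmark output above.
avg_before, avg_after = 0.0163, 0.0177      # average seconds per request
rps_before, rps_after = 184.5453, 169.2285  # requests per second
p95_before, p95_after = 0.0225, 0.0214      # 95th percentile, seconds

latency_change = percent_change(avg_before, avg_after)
throughput_change = percent_change(rps_before, rps_after)
p95_change = percent_change(p95_before, p95_after)

print(f"average latency: {latency_change:+.1f}%")
print(f"throughput:      {throughput_change:+.1f}%")
print(f"95th percentile: {p95_change:+.1f}%")
```

On these numbers the average and throughput shift by roughly 8 percent in the slower direction, while the 95th percentile is about 5 percent faster, which is consistent with run-to-run noise on requests this short.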

But for "edge case" dags (those with many serdag versions shown on a single grid, each of them large), it makes a significant difference and can be the difference between a healthy API server and one that OOMs.

Before:
[image: before_bad]

Average: 12.5568 secs
Requests/sec: 0.0796
90% in 12.9824 secs

After:
[image: after_bad]

Average: 12.2856 secs
Requests/sec: 0.0814
90% in 12.4549 secs

(This profiling was done with an intentionally bad dag to really highlight the difference)
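
For readers who want to see the effect without an Airflow install, here is a minimal, standard-library sketch of why merging as you go lowers peak memory. It is not the profiling harness used for this PR: `fake_serdags`, the payload shape, and the sizes are all invented for illustration, and `tracemalloc` only tracks Python-level allocations.

```python
import tracemalloc


def fake_serdags(count=20, tasks=500):
    """Yield hypothetical 'large serdag' payloads one at a time."""
    for version in range(count):
        # Each task id maps to a bulky per-version payload.
        yield {f"task_{i}": [version] * 100 for i in range(tasks)}


def load_all_then_merge():
    serdags = list(fake_serdags())  # every payload is alive at once
    merged = set()
    for serdag in serdags:
        merged.update(serdag)  # keep only the task ids
    return merged


def merge_as_you_go():
    merged = set()
    for serdag in fake_serdags():  # earlier payloads become garbage
        merged.update(serdag)
    return merged


def peak_bytes(fn):
    """Run fn and return the peak traced allocation size in bytes."""
    tracemalloc.start()
    fn()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


peak_all = peak_bytes(load_all_then_merge)
peak_streaming = peak_bytes(merge_as_you_go)
print(f"load all:  {peak_all / 1e6:.1f} MB peak")
print(f"streaming: {peak_streaming / 1e6:.1f} MB peak")
```

Both functions produce the same merged result; only the number of payloads alive at the same time differs, which is the property the change relies on.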


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Generated-by: Cursor (Claude 4.5 Opus)
Generated-by: Gemini CLI (Gemini 3 Pro)

Contributor

Copilot AI left a comment

Pull request overview

This PR optimizes memory consumption in the grid structure endpoint by implementing streaming and immediate garbage collection of serialized DAG (serdag) objects. Instead of loading all serdags into memory before processing, the endpoint now processes them in batches of 5 and expunges them from the SQLAlchemy session after use.

Changes:

  • Added streaming query execution using yield_per(5) to process serdags in batches
  • Implemented immediate expunging of serdag objects after processing to enable garbage collection
  • Refactored merging logic to process and merge DAGs incrementally instead of collecting all in memory first

Comment thread airflow-core/src/airflow/api_fastapi/core_api/routes/ui/grid.py
Comment thread airflow-core/src/airflow/api_fastapi/core_api/routes/ui/grid.py Outdated
Comment thread airflow-core/src/airflow/api_fastapi/core_api/routes/ui/grid.py Outdated
Member

@dheerajturaga dheerajturaga left a comment

Awesome! Very large serdags with many versions are a common occurrence in our use case. Any and all optimizations are welcome in this space 😄

Comment thread airflow-core/src/airflow/api_fastapi/core_api/routes/ui/grid.py Outdated
Comment thread airflow-core/src/airflow/api_fastapi/core_api/routes/ui/grid.py Outdated
@kaxil kaxil added this to the Airflow 3.1.8 milestone Feb 1, 2026
The grid structure endpoint was loading all serdags for the shown
dagruns into memory at once, before merging them together.

Now, we load 5 at a time and also expunge so they can be gc'd more
quickly.
@jedcunningham jedcunningham force-pushed the reduce_memory_spike_grid_structure_data branch from 17ca80a to c10b7f9 Compare February 1, 2026 07:16
@kaxil kaxil merged commit 40f6ec1 into apache:main Feb 1, 2026
129 checks passed
@kaxil kaxil deleted the reduce_memory_spike_grid_structure_data branch February 1, 2026 14:18
@kaxil
Member

kaxil commented Feb 1, 2026

#protm - Small but impactful change

shashbha14 pushed a commit to shashbha14/airflow that referenced this pull request Feb 2, 2026
jason810496 pushed a commit to abhijeets25012-tech/airflow that referenced this pull request Feb 3, 2026
vatsrahul1001 pushed a commit that referenced this pull request Feb 3, 2026
vatsrahul1001 added a commit that referenced this pull request Feb 3, 2026
(cherry picked from commit 40f6ec1)
vatsrahul1001 added a commit that referenced this pull request Feb 3, 2026
(cherry picked from commit 40f6ec1)
jhgoebbert pushed a commit to jhgoebbert/airflow_Owen-CH-Leung that referenced this pull request Feb 8, 2026
Ratasa143 pushed a commit to Ratasa143/airflow that referenced this pull request Feb 15, 2026
choo121600 pushed a commit to choo121600/airflow that referenced this pull request Feb 22, 2026
Subham-KRLX pushed a commit to Subham-KRLX/airflow that referenced this pull request Mar 4, 2026
vatsrahul1001 added a commit that referenced this pull request Mar 4, 2026
(cherry picked from commit 40f6ec1)
Ankurdeewan pushed a commit to Ankurdeewan/airflow that referenced this pull request Mar 15, 2026
Labels

area:API Airflow's REST/HTTP API

5 participants