Python: Orchestration output ADR by TaoChenOSU · Pull Request #4799 · microsoft/agent-framework

TaoChenOSU · 2026-03-19T21:17:51Z

Motivation and Context

Description

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the Contribution Guidelines
All unit tests pass, and I have added new tests where possible
Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

moonbox3 · 2026-03-20T08:30:32Z

docs/decisions/0020-orchestration-output-types.md

+
+Orchestrations (Concurrent, Sequential, Handoff, GroupChat, Magentic) are not standalone features — they are prebuilt workflow patterns built on top of the workflow system APIs. They serve as both ready-to-use solutions and as reference implementations that demonstrate how to correctly compose agents using the workflow primitives (`Executor`, `WorkflowBuilder`, `yield_output`, `as_agent()`, etc.).
+
+This dual role makes orchestrations critically important: the patterns they establish become the patterns that developers follow when building their own workflows. In practice, developers use the framework to build workflows that coordinate Foundry agents, and ultimately deploy those workflows as hosted agents on Azure AI Foundry. This path — from workflow definition to agent deployment — relies on a seamless integration between workflows and agents. If orchestrations model this integration poorly (e.g., producing outputs that don't compose cleanly with `as_agent()`), developers building custom workflows will inherit the same problems.


Suggested change

This dual role makes orchestrations critically important: the patterns they establish become the patterns that developers follow when building their own workflows. In practice, developers use the framework to build workflows that coordinate Foundry agents, and ultimately deploy those workflows as hosted agents on Azure AI Foundry. This path — from workflow definition to agent deployment — relies on a seamless integration between workflows and agents. If orchestrations model this integration poorly (e.g., producing outputs that don't compose cleanly with `as_agent()`), developers building custom workflows will inherit the same problems.

This dual role makes orchestrations critically important: the patterns they establish become the patterns that developers follow when building their own workflows. In practice, developers use the framework to build workflows that coordinate Foundry agents, and ultimately deploy those workflows as hosted agents on Microsoft Foundry. This path — from workflow definition to agent deployment — relies on a seamless integration between workflows and agents. If orchestrations model this integration poorly (e.g., producing outputs that don't compose cleanly with `as_agent()`), developers building custom workflows will inherit the same problems.


moonbox3 · 2026-03-20T08:31:40Z

docs/decisions/0020-orchestration-output-types.md

+
+```python
+workflow = SequentialBuilder(participants=[agent1, agent2, agent3]).build()
+events = await workflow.run(message="Write a report")


Isn't this missing a stream=True?

moonbox3 · 2026-03-20T08:33:46Z

docs/decisions/0020-orchestration-output-types.md

+- **Non-streaming**: Collects all output events, then merges their data into a single `AgentResponse`. The full conversation dump from the orchestration's final output becomes `AgentResponse.messages` alongside any intermediate agent responses — producing a response that conflates progress with the actual answer.
+- **Streaming**: Converts each output event into `AgentResponseUpdate` objects and yields them as they arrive. All updates — whether from intermediate agents or the final conversation dump — are yielded indiscriminately as streaming chunks.
+
+In both modes, `WorkflowAgent` processes all output events without distinguishing intermediate from final. When `intermediate_outputs=True`, this means intermediate agent responses and the final conversation dump are merged together. Even when `intermediate_outputs=False`, the final output is still the full conversation rather than the meaningful answer.


Is this true for single agents? I only get back one AgentResponse with one content that is the final answer?

No, in single agent, you get all contents from all model calls/tool results in a single AgentResponse (through get_final_response when streaming)

For workflows, you can still receive intermediate work by setting intermediate_outputs=True. Users can configure it depending on their scenarios.

moonbox3 · 2026-03-20T08:45:15Z

docs/decisions/0020-orchestration-output-types.md

+  - Adds a new concept to the workflow framework (`run_output` vs `output`).
+  - `WorkflowAgent`, sub-workflow consumers, and event processing logic all need to handle two output event types.
+
+### Option 3: Add `is_run_completed` flag to existing output event


When an AgentExecutor streams, it calls yield_output(update) for every AgentResponseUpdate chunk. The last streaming chunk is not itself the "final answer," it's just the last piece. Who assembles the full AgentResponse and sets is_run_completed=True? The ADR doesn't address this.

Currently AgentExecutor yields individual updates in streaming mode, so the orchestration layer above would need to aggregate and re-yield. How would this best work?

Right, the agent executors don't know. Only the author of the workflow knows what makes up the run output. Our orchestration layer doesn't use the AgentExecutor directly or as the last executor in the flow (which I think is not a great sign):

Sequential: the end executor which knows it must be the end of the workflow run.

Concurrent: the aggregator knows the end of the workflow run.

Handoff: subclasses the AgentExecutor so it knows when it's the end of the run.

GroupChat & Magentic: the manager knows the end.

There isn't an orchestration layer. The orchestrations are just like any workflow. When an AgentExecutor yields an update, it creates an output event.
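One way to picture the answer above: the executor that knows the run is ending collects the streamed chunks and emits a single assembled output flagged as final. The following is a minimal sketch under stated assumptions. `Update`, `OutputEvent`, and `FinalAggregator` are hypothetical stand-ins, not the framework's classes; only the `is_run_completed` name comes from the ADR.

```python
from dataclasses import dataclass, field


@dataclass
class Update:
    """Hypothetical stand-in for a streamed AgentResponseUpdate chunk."""
    text: str


@dataclass
class OutputEvent:
    """Hypothetical stand-in for a workflow output event."""
    data: object
    is_run_completed: bool = False


@dataclass
class FinalAggregator:
    """Hypothetical end-of-run executor: forwards chunks, then emits the assembled answer."""
    chunks: list[str] = field(default_factory=list)

    def on_update(self, update: Update) -> OutputEvent:
        # Each chunk is still surfaced as an ordinary (intermediate) output.
        self.chunks.append(update.text)
        return OutputEvent(data=update)

    def finish(self) -> OutputEvent:
        # Only the executor that knows the run is over sets the flag.
        return OutputEvent(data="".join(self.chunks), is_run_completed=True)


aggregator = FinalAggregator()
events = [aggregator.on_update(Update(t)) for t in ("Hel", "lo ", "world")]
events.append(aggregator.finish())
final = [e for e in events if e.is_run_completed]
```

In this sketch the last streamed chunk is never flagged; the flag appears only on the extra event that carries the joined text.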

moonbox3 · 2026-03-20T08:52:06Z

docs/decisions/0020-orchestration-output-types.md

+
+### Option 3: Add `is_run_completed` flag to existing output event
+
+Add an optional `is_run_completed: bool` parameter to the existing `yield_output()` method and `WorkflowEvent`:


I'd say that nothing in the framework would enforce that exactly one event has is_run_completed=True, or that it's the last output event. Custom workflow authors could forget it, set it multiple times, or set it on an intermediate event. Should the framework validate this invariant (in WorkflowRunResult or the event stream)? The ADR says "executors must remember to set it" in the cons section. Wouldn't that be a footgun for custom workflows?

I agree, this seems like a recipe for mistakes

Yeah, we can create a warning if there is more than one output event with is_run_completed=True.
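A check along these lines could be sketched as follows. This is an illustration, not the framework's implementation: `OutputEvent` and `check_run_completed` are hypothetical names, and the "missing flag" and "flag not on the last output" checks go beyond the warning proposed in the reply, covering the other two mistakes raised in the review question.

```python
import warnings
from dataclasses import dataclass


@dataclass
class OutputEvent:
    """Hypothetical stand-in for a workflow output event."""
    data: object
    is_run_completed: bool = False


def check_run_completed(events: list[OutputEvent]) -> list[str]:
    """Warn about misuse of the flag and return the problems found."""
    problems: list[str] = []
    flagged = [i for i, e in enumerate(events) if e.is_run_completed]
    if not flagged:
        problems.append("no output event has is_run_completed=True")
    elif len(flagged) > 1:
        problems.append(f"{len(flagged)} output events have is_run_completed=True")
    elif flagged[0] != len(events) - 1:
        problems.append("the is_run_completed=True event is not the last output")
    for problem in problems:
        # A warning rather than an error, matching the reply above.
        warnings.warn(problem, stacklevel=2)
    return problems
```

Returning the problems as well as warning keeps the check easy to assert on in tests of custom workflows.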

moonbox3 · 2026-03-20T08:54:03Z

docs/decisions/0020-orchestration-output-types.md

+
+Each orchestration pattern changes what data it yields as the final output and sets `is_run_completed=True`:
+
+| Orchestration | Current Final Output | New Final Output | Rationale |


Isn't there a breaking change here? WorkflowRunResult.get_outputs() currently returns list[Any] (all output event data). Changing final outputs from list[Message] to AgentResponse is a breaking change for any caller doing for msg in result.get_outputs()[0] or similar iteration. The ADR doesn't call this out as breaking, and the PR title doesn't have [BREAKING]. Just want to make sure we're okay with this.

Fundamentally it does not make sense to me that a Workflow defaults to returning an AgentResponse

Will mark this as breaking.

And to @eavanvalkenburg's comment: we don't force a workflow to output anything. Orchestrations are just one way to build workflows, and we think it makes more sense for them to output an AgentResponse than a list of messages, because orchestrations are designed to work with agents and we are increasingly seeing people build orchestrations, turn them into agents, and deploy them to Foundry.

Workflows don't return anything. It's event based. A workflow can generate output events containing anything.
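To make the breaking change concrete, here is a minimal sketch of how a caller could tolerate both output shapes during a migration. `Message`, `AgentResponse`, and `messages_from_output` are simplified, hypothetical stand-ins defined here for illustration; they are not the framework's types.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """Simplified stand-in for a chat message."""
    role: str
    text: str


@dataclass
class AgentResponse:
    """Simplified stand-in for the proposed final output type."""
    messages: list[Message] = field(default_factory=list)


def messages_from_output(output: object) -> list[Message]:
    """Normalize either output shape to a flat list of messages."""
    if isinstance(output, AgentResponse):  # proposed shape
        return list(output.messages)
    if isinstance(output, list):  # current shape: list[Message]
        return list(output)
    raise TypeError(f"unexpected output type: {type(output).__name__}")


old_output = [Message("user", "Write a report"), Message("assistant", "Done.")]
new_output = AgentResponse(messages=[Message("assistant", "Done.")])
```

Code that iterates directly over the first output, as in the comment above, would fail on the new shape; routing through a helper like this one is one way to bridge the two.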

moonbox3 · 2026-03-20T08:56:55Z

docs/decisions/0020-orchestration-output-types.md

+
+Orchestrations can be wrapped as agents using `workflow.as_agent()`. The `WorkflowAgent` processes workflow output events differently depending on the mode:
+
+- **Non-streaming**: Collects all output events, then merges their data into a single `AgentResponse`. The full conversation dump from the orchestration's final output becomes `AgentResponse.messages` alongside any intermediate agent responses — producing a response that conflates progress with the actual answer.


This says when intermediate_outputs=False, "only the is_run_completed=True event is surfaced." But today, WorkflowAgent collects all type='output' events and converts them. The proposed behavior requires WorkflowAgent to either:

Buffer all events and only use the `is_run_completed=True` one (discarding earlier outputs), or

Filter during collection

Which approach? Buffering means memory overhead for long-running orchestrations. Filtering means you lose the ability to retroactively include intermediate outputs.

I am not sure if I understand the question. This line is a description of the current behavior, not the proposal.

In the proposed solution, the WorkflowAgent will not change and will not discriminate between outputs. Users must define their workflow the way they want before turning it into an agent.

moonbox3 · 2026-03-20T08:58:45Z

docs/decisions/0020-orchestration-output-types.md

+
+| Orchestration | Current Final Output | New Final Output | Rationale |
+|---|---|---|---|
+| **Concurrent** | `list[Message]` (user prompt + one reply per agent) | `AgentResponse` containing all sub-agent response messages | The combined responses from all parallel agents represent the orchestration's answer. Messages are copied from each sub-agent's `AgentResponse`. |


What happens when is_run_completed event has AgentResponse with messages from multiple agents? Could this happen?

Here it says the final output is an AgentResponse "containing all sub-agent response messages." An AgentResponse has a single agent_id field. Whose agent_id is it? The orchestration's? And AgentResponse.messages would mix messages from different agents, correct? Is that semantically sound, or should there be a different container?

I think this makes sense, perhaps I need to reword this a bit.

Concurrent allows custom aggregation strategies, so not every concurrent orchestration will simply aggregate the messages into a list. Of course, the default one does simple aggregation. We can use the workflow name as the author of the agent response in that case, and the individual messages will be put into the list unchanged.

A different container also makes sense. There is no correct answer here. I think AgentResponse is a natural integration; the prerequisite is that users need to know that they are running a concurrent orchestration.
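The default aggregation described in this reply could be sketched as below. Everything here is an assumption for illustration: `Message`, `AgentResponse`, and `aggregate` are hypothetical stand-ins, and the `author_name` field is assumed so that per-agent attribution survives the merge.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    """Simplified stand-in; author_name is an assumed field."""
    author_name: str
    text: str


@dataclass
class AgentResponse:
    """Simplified stand-in with the single agent_id mentioned in the review."""
    messages: list[Message] = field(default_factory=list)
    agent_id: Optional[str] = None


def aggregate(workflow_name: str, responses: list[AgentResponse]) -> AgentResponse:
    """Merge sub-agent responses; messages are copied over unchanged."""
    merged = [m for r in responses for m in r.messages]
    # The workflow, not any one sub-agent, is recorded as the response's author.
    return AgentResponse(messages=merged, agent_id=workflow_name)


researcher = AgentResponse([Message("researcher", "Found 3 sources.")], agent_id="researcher")
writer = AgentResponse([Message("writer", "Draft ready.")], agent_id="writer")
combined = aggregate("report-workflow", [researcher, writer])
```

The merged response carries one `agent_id`, while each message keeps its own author, which is one way to address the question of whose `agent_id` the response has.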

moonbox3 · 2026-03-20T08:59:39Z

docs/decisions/0020-orchestration-output-types.md

+| **GroupChat** | `list[Message]` (all rounds + completion message) | `AgentResponse` containing the summary or completion message | The orchestrator's summary/end message is the meaningful result. Individual round messages are intermediate outputs visible when `intermediate_outputs=True`. |
+| **Magentic** | `list[Message]` (chat history + final answer) | `AgentResponse` containing the synthesized final answer | The manager's synthesized final answer is the meaningful result. Individual agent work is intermediate. |
+
+### Integration Points


If all orchestrations change their output from list[Message] to AgentResponse, existing consumers break. Are we including a deprecation period? A version flag? Or is this a clean break right before GA?

We are not GAing orchestrations.

The intent is to GA orchestrations, please check with Shawn. Even if that doesn't mean the same day as core GA, it will fast-follow.


eavanvalkenburg · 2026-03-20T15:15:58Z

docs/decisions/0020-orchestration-output-types.md

+
+Orchestrations can be wrapped as agents using `workflow.as_agent()`. The `WorkflowAgent` processes workflow output events differently depending on the mode:
+
+- **Non-streaming**: Collects all output events, then merges their data into a single `AgentResponse`. The full conversation dump from the orchestration's final output becomes `AgentResponse.messages` alongside any intermediate agent responses — producing a response that conflates progress with the actual answer.


This is what happens with single agents as well: there might be multiple turns and tool calls, you wait until all of that is done, and then you return the full set of Messages in one AgentResponse.

eavanvalkenburg · 2026-03-20T15:16:49Z

docs/decisions/0020-orchestration-output-types.md

+Orchestrations can be wrapped as agents using `workflow.as_agent()`. The `WorkflowAgent` processes workflow output events differently depending on the mode:
+
+- **Non-streaming**: Collects all output events, then merges their data into a single `AgentResponse`. The full conversation dump from the orchestration's final output becomes `AgentResponse.messages` alongside any intermediate agent responses — producing a response that conflates progress with the actual answer.
+- **Streaming**: Converts each output event into `AgentResponseUpdate` objects and yields them as they arrive. All updates — whether from intermediate agents or the final conversation dump — are yielded indiscriminately as streaming chunks.


This is also the same: regardless of where the Update originates, the user gets it back (there are really only 2 places in single agents, but still).

eavanvalkenburg · 2026-03-20T15:18:15Z

docs/decisions/0020-orchestration-output-types.md

+- **Non-streaming**: Collects all output events, then merges their data into a single `AgentResponse`. The full conversation dump from the orchestration's final output becomes `AgentResponse.messages` alongside any intermediate agent responses — producing a response that conflates progress with the actual answer.
+- **Streaming**: Converts each output event into `AgentResponseUpdate` objects and yields them as they arrive. All updates — whether from intermediate agents or the final conversation dump — are yielded indiscriminately as streaming chunks.
+
+In both modes, `WorkflowAgent` processes all output events without distinguishing intermediate from final. When `intermediate_outputs=True`, this means intermediate agent responses and the final conversation dump are merged together. Even when `intermediate_outputs=False`, the final output is still the full conversation rather than the meaningful answer.


No, in single agent, you get all contents from all model calls/tool results in a single AgentResponse (through get_final_response when streaming)


eavanvalkenburg · 2026-03-20T15:22:36Z

docs/decisions/0020-orchestration-output-types.md

+
+### Option 3: Add `is_run_completed` flag to existing output event
+
+Add an optional `is_run_completed: bool` parameter to the existing `yield_output()` method and `WorkflowEvent`:


I agree, this seems like a recipe for mistakes

eavanvalkenburg · 2026-03-20T15:23:09Z

docs/decisions/0020-orchestration-output-types.md

+  - Adds a new concept to the workflow framework (`run_output` vs `output`).
+  - `WorkflowAgent`, sub-workflow consumers, and event processing logic all need to handle two output event types.
+
+### Option 3: Add `is_run_completed` flag to existing output event


nit: is "run" the right term for this? Shouldn't it be "step"?

Could you elaborate?

eavanvalkenburg · 2026-03-20T15:27:47Z

docs/decisions/0020-orchestration-output-types.md

+
+## Definition of a Run
+
+A **run** represents a single invocation of the workflow — from receiving an initial request to the workflow returning to idle status. The `is_run_completed` flag on an output event signals that this output represents the final result of the current run.


How would this work, what if multiple executors set this at the same time?

This is just like outputs. Any executor in the workflow can create outputs. We can't prevent this because we don't know the internals of executors.

Similar to the comment above, we can create a warning when we see two output events with is_run_completed=True.

eavanvalkenburg · 2026-03-20T15:29:35Z

docs/decisions/0020-orchestration-output-types.md

+
+Each orchestration pattern changes what data it yields as the final output and sets `is_run_completed=True`:
+
+| Orchestration | Current Final Output | New Final Output | Rationale |


Fundamentally it does not make sense to me that a Workflow defaults to returning an AgentResponse

eavanvalkenburg · 2026-03-20T15:31:31Z

docs/decisions/0020-orchestration-output-types.md

+
+When `WorkflowAgent` converts workflow events to an `AgentResponse`:
+
+- Events with `is_run_completed=True` provide the `AgentResponse` that becomes the agent's response directly, with the name of the workflow as the author of the response.


This would make it inconsistent with single agents, because there you always get all intermediate work in an AgentResponse.

You can still get intermediate work too with workflows as agents.

TaoChenOSU added 2 commits March 19, 2026 14:14

ADR: Orchestration outputs

a4d8f6c

Add scope disclaimer

bd8eedc

TaoChenOSU self-assigned this Mar 19, 2026

TaoChenOSU added this to Agent Framework Mar 19, 2026

TaoChenOSU added the python, agent orchestration, and workflows labels Mar 19, 2026

TaoChenOSU moved this to In Progress in Agent Framework Mar 19, 2026

TaoChenOSU moved this from In Progress to In Review in Agent Framework Mar 19, 2026

github-actions bot changed the title ~~Orchestration output ADR~~ Python: Orchestration output ADR Mar 19, 2026

markwallace-microsoft added the documentation Improvements or additions to documentation label Mar 19, 2026

moonbox3 reviewed Mar 20, 2026


eavanvalkenburg reviewed Mar 20, 2026





4 participants