Commits
55 commits
57b3d70
Initial plan
Copilot Feb 5, 2026
5bbe648
Add comprehensive design document for airlock storage consolidation
Copilot Feb 5, 2026
1d2172e
Update design to use metadata-based stage management instead of data …
Copilot Feb 5, 2026
fa39c85
Add metadata-based blob operations and update constants for consolida…
Copilot Feb 5, 2026
138820b
Add implementation status document and update CHANGELOG
Copilot Feb 5, 2026
8941b1b
Complete Terraform infrastructure consolidation for core and workspac…
Copilot Feb 5, 2026
47dcdc8
Add storage helper functions and update implementation status
Copilot Feb 5, 2026
427515d
New approach: Use blob index tags for EventGrid filtering while keepi…
Copilot Feb 5, 2026
b09f990
Implement unified EventGrid subscriptions with metadata-based routing
Copilot Feb 5, 2026
231f434
Integrate metadata-based routing in airlock processor with feature fl…
Copilot Feb 5, 2026
76a3d62
Add comprehensive EventGrid unified subscription documentation
Copilot Feb 5, 2026
cd70948
Add ABAC conditions to restrict API access based on container metadat…
Copilot Feb 5, 2026
f08b384
Fix stage name consistency: use hyphenated 'in-progress' throughout
Copilot Feb 5, 2026
5211f36
Final consistency fixes for stage naming throughout codebase
Copilot Feb 5, 2026
76a09bd
Add security analysis revealing need for separate public/private stor…
Copilot Feb 5, 2026
2c6235b
Updated plan: Use ABAC with private endpoint filtering to maintain ne…
Copilot Feb 5, 2026
0df7e5c
Implement revised architecture: keep public accounts separate, use AB…
Copilot Feb 5, 2026
e375cf2
Plan: Consolidate to 1 core account using ABAC to allow public/privat…
Copilot Feb 5, 2026
72c9478
Final architecture: 1 core account via App Gateway + 1 per workspace …
Copilot Feb 5, 2026
2b66bb3
Implement 1 core account with App Gateway routing and PE-based ABAC
Copilot Feb 5, 2026
1d5b8ef
Add final architecture documentation and update CHANGELOG
Copilot Feb 5, 2026
7638186
Fix ABAC condition logic and align API access with ABAC permissions
Copilot Feb 5, 2026
d490b5a
Add clarifying comments to ABAC conditions explaining the Boolean logic
Copilot Feb 5, 2026
e20e33a
Add comprehensive analysis of copy operations and workspace_id ABAC o…
Copilot Feb 5, 2026
4f2fe0b
Implement Option B: Global workspace storage with workspace_id ABAC f…
Copilot Feb 5, 2026
aa6c32a
Update airlock documentation for Option B architecture and confirm UI…
Copilot Feb 5, 2026
85ab8af
Add e2e tests for Option B workspace isolation and metadata-based tra…
Copilot Feb 5, 2026
bee6cdc
Remove all "Option B" references from codebase and documentation
Copilot Feb 5, 2026
ff96ee5
Remove unimplemented App Gateway PE and correct documentation
Copilot Feb 5, 2026
e025056
Remove all Option A/B references and clarify 2 storage accounts
Copilot Feb 5, 2026
b98ede1
Remove unused legacy storage account name variables
Copilot Feb 5, 2026
4a9b185
remove comments, update docs
marrobi Feb 5, 2026
8421bdb
Update app gateway configuration
marrobi Feb 5, 2026
34f2636
linting
marrobi Feb 5, 2026
3d99220
Implement airlock security improvements: is_publicly_accessible_stage…
Copilot Feb 6, 2026
ad73137
Rebase changes onto copilot/redesign-airlock-storage-accounts: tighte…
Copilot Feb 6, 2026
105f38b
Fix 3 bugs found during pre-merge review: BlobCreatedTrigger missing …
Copilot Feb 6, 2026
7335c65
Merge pull request #18 from marrobi/copilot/tighten-public-accessible…
marrobi Feb 6, 2026
55f3590
Tests pass; flows and access need manual validation.
marrobi Feb 10, 2026
b0c50e8
update core version
marrobi Feb 10, 2026
fcead34
Merge branch 'main' into copilot/redesign-airlock-storage-accounts
marrobi Feb 10, 2026
bd14845
fix: make consolidated core storage publicly accessible for SAS uploads
marrobi Feb 10, 2026
8b405ef
Merge branch 'main' into copilot/redesign-airlock-storage-accounts
marrobi Feb 10, 2026
115e778
Fix linting.
marrobi Feb 11, 2026
051ef76
Merge branch 'copilot/redesign-airlock-storage-accounts' of https://g…
marrobi Feb 11, 2026
98764f9
Deployed, with docs; needs full testing.
marrobi Apr 2, 2026
816e4e8
fix linting
marrobi Apr 2, 2026
25c194e
tested import and export v2 flow
marrobi Apr 2, 2026
90fc2d7
fix cancelled
marrobi Apr 2, 2026
de87d71
Merge remote-tracking branch 'upstream/main' into copilot/redesign-ai…
marrobi Apr 2, 2026
a9125cb
Address PR comments.
marrobi Apr 2, 2026
d885d40
linting
marrobi Apr 2, 2026
4992cfd
Fix PR comments
marrobi Apr 2, 2026
3b9dbd6
update e2e tests
marrobi Apr 2, 2026
d3fa795
update role assignment
marrobi Apr 8, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -214,3 +214,4 @@ validation.txt

/index.html
.DS_Store
*_old.tf
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -2,6 +2,7 @@
## (Unreleased)

ENHANCEMENTS:
* Add per-workspace `airlock_version` property (1=legacy, 2=consolidated) for backwards-compatible airlock storage migration. Add core-level `enable_legacy_airlock` toggle. Remove `USE_METADATA_STAGE_MANAGEMENT` environment variable. ([#4853](https://github.com/microsoft/AzureTRE/pull/4853), [#4358](https://github.com/microsoft/AzureTRE/issues/4358))
* Specify default_outbound_access_enabled = false setting for all subnets ([#4757](https://github.com/microsoft/AzureTRE/pull/4757))
* Pin all GitHub Actions workflow steps to full commit SHAs to prevent supply chain attacks plus update to latest releases ([#4886](https://github.com/microsoft/AzureTRE/pull/4886))

61 changes: 61 additions & 0 deletions airlock_processor/BlobCreatedTrigger/__init__.py
@@ -11,6 +11,17 @@
from shared_code.blob_operations import get_blob_info_from_topic_and_subject, get_blob_client_from_blob_info


# Mapping from v2 container metadata stage to (completed_step, new_status)
V2_STAGE_COMPLETION_MAP = {
constants.STAGE_IMPORT_APPROVED: (constants.STAGE_APPROVAL_INPROGRESS, constants.STAGE_APPROVED),
constants.STAGE_IMPORT_REJECTED: (constants.STAGE_REJECTION_INPROGRESS, constants.STAGE_REJECTED),
constants.STAGE_IMPORT_BLOCKED: (constants.STAGE_BLOCKING_INPROGRESS, constants.STAGE_BLOCKED_BY_SCAN),
constants.STAGE_EXPORT_APPROVED: (constants.STAGE_APPROVAL_INPROGRESS, constants.STAGE_APPROVED),
constants.STAGE_EXPORT_REJECTED: (constants.STAGE_REJECTION_INPROGRESS, constants.STAGE_REJECTED),
constants.STAGE_EXPORT_BLOCKED: (constants.STAGE_BLOCKING_INPROGRESS, constants.STAGE_BLOCKED_BY_SCAN),
}


def main(msg: func.ServiceBusMessage,
stepResultEvent: func.Out[func.EventGridOutputEvent],
dataDeletionEvent: func.Out[func.EventGridOutputEvent]):
@@ -23,6 +34,12 @@ def main(msg: func.ServiceBusMessage,
topic = json_body["topic"]
request_id = re.search(r'/blobServices/default/containers/(.*?)/blobs', json_body["subject"]).group(1)

# Check if this event is from a v2 consolidated storage account
if constants.STORAGE_ACCOUNT_NAME_AIRLOCK_CORE in topic or constants.STORAGE_ACCOUNT_NAME_AIRLOCK_WORKSPACE_GLOBAL in topic:
_handle_v2_blob_created(json_body, topic, request_id, stepResultEvent, dataDeletionEvent)
return

# Legacy v1 handling below
# message originated from in-progress blob creation
if constants.STORAGE_ACCOUNT_NAME_IMPORT_INPROGRESS in topic or constants.STORAGE_ACCOUNT_NAME_EXPORT_INPROGRESS in topic:
try:
@@ -55,6 +72,9 @@ def main(msg: func.ServiceBusMessage,
elif constants.STORAGE_ACCOUNT_NAME_IMPORT_BLOCKED in topic or constants.STORAGE_ACCOUNT_NAME_EXPORT_BLOCKED in topic:
completed_step = constants.STAGE_BLOCKING_INPROGRESS
new_status = constants.STAGE_BLOCKED_BY_SCAN
else:
logging.warning(f"Unknown storage account in topic: {topic}")
return

# reply with a step completed event
stepResultEvent.set(
@@ -88,3 +108,44 @@ def send_delete_event(dataDeletionEvent: func.Out[func.EventGridOutputEvent], js
data_version=constants.DATA_DELETION_EVENT_DATA_VERSION
)
)


def _handle_v2_blob_created(json_body, topic, request_id, stepResultEvent, dataDeletionEvent):
"""Handle BlobCreated events from v2 consolidated storage accounts.

In v2, cross-account copies (e.g., import approval: core → workspace-global)
fire BlobCreated events. Container metadata determines the stage and appropriate
step result, matching the v1 pattern where BlobCreatedTrigger signals copy completion.
"""
storage_account_name, _, _ = get_blob_info_from_topic_and_subject(
topic=json_body["topic"], subject=json_body["subject"])

from shared_code.blob_operations_metadata import get_container_metadata
try:
metadata = get_container_metadata(storage_account_name, request_id)
except Exception:
logging.warning(f"Could not read container metadata for request {request_id} on {storage_account_name}, skipping")
return

stage = metadata.get('stage', '')
logging.info(f"V2 BlobCreated for request {request_id}: stage={stage}, account={storage_account_name}")

if stage in V2_STAGE_COMPLETION_MAP:
completed_step, new_status = V2_STAGE_COMPLETION_MAP[stage]
logging.info(f"V2 copy completed for request {request_id}: {completed_step} -> {new_status}")

stepResultEvent.set(
func.EventGridOutputEvent(
id=str(uuid.uuid4()),
data={"completed_step": completed_step, "new_status": new_status, "request_id": request_id},
subject=request_id,
event_type="Airlock.StepResult",
event_time=datetime.datetime.now(datetime.UTC),
data_version=constants.STEP_RESULT_EVENT_DATA_VERSION))

# Send delete event for the source container (same as v1)
send_delete_event(dataDeletionEvent, json_body, request_id)
else:
# Non-terminal stages (e.g., import-external from user upload, export-internal)
# are not copy completions — ignore them
logging.info(f"V2 BlobCreated for non-terminal stage '{stage}' on request {request_id}, no action needed")
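The new trigger routes on container metadata rather than on per-stage storage account names. A minimal self-contained sketch of that lookup — constant names come from the diff, but the string values here are assumptions for illustration:

```python
# Assumed stage/status constant values; the real ones live in
# airlock_processor/shared_code/constants.py (only the names appear in the diff).
STAGE_IMPORT_APPROVED = "import-approved"
STAGE_IMPORT_REJECTED = "import-rejected"
STAGE_APPROVAL_INPROGRESS = "approval_in_progress"
STAGE_REJECTION_INPROGRESS = "rejection_in_progress"
STAGE_APPROVED = "approved"
STAGE_REJECTED = "rejected"

# Terminal stage -> (completed_step, new_status), mirroring V2_STAGE_COMPLETION_MAP
V2_STAGE_COMPLETION_MAP = {
    STAGE_IMPORT_APPROVED: (STAGE_APPROVAL_INPROGRESS, STAGE_APPROVED),
    STAGE_IMPORT_REJECTED: (STAGE_REJECTION_INPROGRESS, STAGE_REJECTED),
}

def resolve_step_result(container_metadata: dict):
    """Return (completed_step, new_status) for a terminal stage, else None."""
    return V2_STAGE_COMPLETION_MAP.get(container_metadata.get("stage", ""))

# Terminal stage: a cross-account copy just completed, so a StepResult is due.
assert resolve_step_result({"stage": "import-approved"}) == ("approval_in_progress", "approved")
# Non-terminal stage (e.g. a user upload landing in a draft container): no action.
assert resolve_step_result({"stage": "import-external"}) is None
```

This is why `_handle_v2_blob_created` can ignore non-terminal stages entirely: only map hits represent copy completions.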
115 changes: 101 additions & 14 deletions airlock_processor/StatusChangedQueueTrigger/__init__.py
@@ -9,7 +9,7 @@

from exceptions import NoFilesInRequestException, TooManyFilesInRequestException

from shared_code import blob_operations, constants
from shared_code import blob_operations, constants, airlock_storage_helper, parsers
from pydantic import BaseModel, parse_obj_as


@@ -19,6 +19,8 @@ class RequestProperties(BaseModel):
previous_status: Optional[str]
type: str
workspace_id: str
review_workspace_id: Optional[str] = None
airlock_version: int = 1


class ContainersCopyMetadata:
@@ -31,6 +33,8 @@ def __init__(self, source_account_name: str, dest_account_name: str):


def main(msg: func.ServiceBusMessage, stepResultEvent: func.Out[func.EventGridOutputEvent], dataDeletionEvent: func.Out[func.EventGridOutputEvent]):
request_properties = None
request_files = None
try:
request_properties = extract_properties(msg)
request_files = get_request_files(request_properties) if request_properties.new_status == constants.STAGE_SUBMITTED else None
@@ -53,13 +57,25 @@ def handle_status_changed(request_properties: RequestProperties, stepResultEvent

logging.info('Processing request with id %s. new status is "%s", type is "%s"', req_id, new_status, request_type)

# Check if using metadata-based stage management (v2) or legacy per-stage accounts (v1)
use_metadata = request_properties.airlock_version >= 2

if new_status == constants.STAGE_DRAFT:
account_name = get_storage_account(status=constants.STAGE_DRAFT, request_type=request_type, short_workspace_id=ws_id)
blob_operations.create_container(account_name, req_id)
if use_metadata:
from shared_code.blob_operations_metadata import create_container_with_metadata
account_name = airlock_storage_helper.get_storage_account_name_for_request(request_type, new_status, ws_id, airlock_version=request_properties.airlock_version)
stage = airlock_storage_helper.get_stage_from_status(request_type, new_status)
create_container_with_metadata(account_name, req_id, stage, workspace_id=ws_id, request_type=request_type)
else:
account_name = get_storage_account(status=constants.STAGE_DRAFT, request_type=request_type, short_workspace_id=ws_id)
blob_operations.create_container(account_name, req_id)
return

if new_status == constants.STAGE_CANCELLED:
storage_account_name = get_storage_account(previous_status, request_type, ws_id)
if use_metadata:
storage_account_name = airlock_storage_helper.get_storage_account_name_for_request(request_type, previous_status, ws_id, airlock_version=request_properties.airlock_version)
else:
storage_account_name = get_storage_account(previous_status, request_type, ws_id)
container_to_delete_url = blob_operations.get_blob_url(account_name=storage_account_name, container_name=req_id)
set_output_event_to_trigger_container_deletion(dataDeletionEvent, request_properties, container_url=container_to_delete_url)
return
@@ -68,11 +84,74 @@ def handle_status_changed(request_properties: RequestProperties, stepResultEvent
set_output_event_to_report_request_files(stepResultEvent, request_properties, request_files)

if (is_require_data_copy(new_status)):
logging.info('Request with id %s. requires data copy between storage accounts', req_id)
containers_metadata = get_source_dest_for_copy(new_status=new_status, previous_status=previous_status, request_type=request_type, short_workspace_id=ws_id)
blob_operations.create_container(containers_metadata.dest_account_name, req_id)
blob_operations.copy_data(containers_metadata.source_account_name,
containers_metadata.dest_account_name, req_id)
if use_metadata:
# Metadata mode: Update container stage instead of copying
from shared_code.blob_operations_metadata import update_container_stage, create_container_with_metadata

# For import submit, use review_workspace_id so data goes to review workspace storage
effective_ws_id = ws_id
if new_status == constants.STAGE_SUBMITTED and request_type.lower() == constants.IMPORT_TYPE and request_properties.review_workspace_id:
effective_ws_id = request_properties.review_workspace_id

# Get the storage account (might change from core to workspace or vice versa)
source_account = airlock_storage_helper.get_storage_account_name_for_request(request_type, previous_status, ws_id, airlock_version=request_properties.airlock_version)
dest_account = airlock_storage_helper.get_storage_account_name_for_request(request_type, new_status, effective_ws_id, airlock_version=request_properties.airlock_version)
new_stage = airlock_storage_helper.get_stage_from_status(request_type, new_status)

if source_account == dest_account:
# Same storage account - just update metadata
logging.info(f'Request {req_id}: Updating container stage to {new_stage} (no copy needed)')
update_container_stage(source_account, req_id, new_stage, changed_by='system')

# In v2, same-account transitions don't fire BlobCreated events.
# For SUBMITTED, v1 relies on BlobCreatedTrigger to handle the malware scanning gate
# (skip to in_review when scanning is disabled). We handle this inline for v2.
if new_status == constants.STAGE_SUBMITTED:
try:
enable_malware_scanning = parsers.parse_bool(os.environ["ENABLE_MALWARE_SCANNING"])
except KeyError:
logging.error("environment variable 'ENABLE_MALWARE_SCANNING' does not exist. Cannot continue.")
raise
if not enable_malware_scanning:
logging.info(f'Request {req_id}: Malware scanning disabled, skipping to in_review')
stepResultEvent.set(
func.EventGridOutputEvent(
id=str(uuid.uuid4()),
data={"completed_step": constants.STAGE_SUBMITTED, "new_status": constants.STAGE_IN_REVIEW, "request_id": req_id},
subject=req_id,
event_type="Airlock.StepResult",
event_time=datetime.datetime.now(datetime.UTC),
data_version=constants.STEP_RESULT_EVENT_DATA_VERSION))
else:
logging.info(f'Request {req_id}: Malware scanning enabled, waiting for scan result')
elif new_status in [constants.STAGE_REJECTION_INPROGRESS, constants.STAGE_BLOCKING_INPROGRESS]:
# Terminal transitions: emit StepResult immediately since no BlobCreated event will fire
final_status = constants.STAGE_REJECTED if new_status == constants.STAGE_REJECTION_INPROGRESS else constants.STAGE_BLOCKED_BY_SCAN
logging.info(f'Request {req_id}: Emitting StepResult for terminal transition {new_status} -> {final_status}')
stepResultEvent.set(
func.EventGridOutputEvent(
id=str(uuid.uuid4()),
data={"completed_step": new_status, "new_status": final_status, "request_id": req_id},
subject=req_id,
event_type="Airlock.StepResult",
event_time=datetime.datetime.now(datetime.UTC),
data_version=constants.STEP_RESULT_EVENT_DATA_VERSION))
else:
# Different storage account (e.g., core → workspace on import approval,
# workspace → core on export approval) - need to copy.
# BlobCreatedTrigger will fire when the copy completes and emit the StepResult,
# matching the v1 async pattern for large data transfers.
logging.info(f'Request {req_id}: Copying from {source_account} to {dest_account}')
create_container_with_metadata(dest_account, req_id, new_stage, workspace_id=effective_ws_id, request_type=request_type)
blob_operations.copy_data(source_account, dest_account, req_id)
else:
# Legacy mode: Copy data between storage accounts
logging.info('Request with id %s. requires data copy between storage accounts', req_id)
review_ws_id = request_properties.review_workspace_id
containers_metadata = get_source_dest_for_copy(new_status=new_status, previous_status=previous_status, request_type=request_type, short_workspace_id=ws_id, review_workspace_id=review_ws_id)
blob_operations.create_container(containers_metadata.dest_account_name, req_id)
blob_operations.copy_data(containers_metadata.source_account_name,
containers_metadata.dest_account_name, req_id)
return

# Other statuses which do not require data copy are dismissed as we don't need to do anything...
@@ -102,7 +181,7 @@ def is_require_data_copy(new_status: str):
return False


def get_source_dest_for_copy(new_status: str, previous_status: str, request_type: str, short_workspace_id: str) -> ContainersCopyMetadata:
def get_source_dest_for_copy(new_status: str, previous_status: str, request_type: str, short_workspace_id: str, review_workspace_id: str = None) -> ContainersCopyMetadata:
# sanity
if is_require_data_copy(new_status) is False:
raise Exception("Given new status is not supported")
@@ -115,7 +194,7 @@ def get_source_dest_for_copy(new_status: str, previous_status: str, request_type
raise Exception(msg)

source_account_name = get_storage_account(previous_status, request_type, short_workspace_id)
dest_account_name = get_storage_account_destination_for_copy(new_status, request_type, short_workspace_id)
dest_account_name = get_storage_account_destination_for_copy(new_status, request_type, short_workspace_id, review_workspace_id=review_workspace_id)
return ContainersCopyMetadata(source_account_name, dest_account_name)


@@ -151,12 +230,14 @@ def get_storage_account(status: str, request_type: str, short_workspace_id: str)
raise Exception(error_message)


def get_storage_account_destination_for_copy(new_status: str, request_type: str, short_workspace_id: str) -> str:
def get_storage_account_destination_for_copy(new_status: str, request_type: str, short_workspace_id: str, review_workspace_id: str = None) -> str:
tre_id = _get_tre_id()

if request_type == constants.IMPORT_TYPE:
if new_status == constants.STAGE_SUBMITTED:
return constants.STORAGE_ACCOUNT_NAME_IMPORT_INPROGRESS + tre_id
# Import submit: copy to review workspace storage, or tre_id for legacy compatibility
dest_id = review_workspace_id if review_workspace_id else tre_id
return constants.STORAGE_ACCOUNT_NAME_IMPORT_INPROGRESS + dest_id
Comment on lines +238 to +240
Copilot AI Apr 2, 2026
In legacy (v1) mode this changes the destination of the submit copy from stalimip{tre_id} to stalimip{review_workspace_id} when a review workspace is configured. Core Terraform still provisions only stalimip{tre_id} for v1, so stalimip{review_workspace_id} is unlikely to exist and import submissions would fail. The in-progress storage account name should remain stalimip{tre_id}; the review workspace accesses it via private endpoints/DNS, not by changing the account suffix.

Suggested change
# Import submit: copy to review workspace storage, or tre_id for legacy compatibility
dest_id = review_workspace_id if review_workspace_id else tre_id
return constants.STORAGE_ACCOUNT_NAME_IMPORT_INPROGRESS + dest_id
# Import submit: in legacy/v1 mode the in-progress storage account remains TRE-scoped.
return constants.STORAGE_ACCOUNT_NAME_IMPORT_INPROGRESS + tre_id

elif new_status == constants.STAGE_APPROVAL_INPROGRESS:
return constants.STORAGE_ACCOUNT_NAME_IMPORT_APPROVED + short_workspace_id
elif new_status == constants.STAGE_REJECTION_INPROGRESS:
@@ -218,7 +299,13 @@ def set_output_event_to_trigger_container_deletion(dataDeletionEvent, request_pr


def get_request_files(request_properties: RequestProperties):
storage_account_name = get_storage_account(request_properties.previous_status, request_properties.type, request_properties.workspace_id)
use_metadata = request_properties.airlock_version >= 2
if use_metadata:
storage_account_name = airlock_storage_helper.get_storage_account_name_for_request(
request_properties.type, request_properties.previous_status, request_properties.workspace_id,
airlock_version=request_properties.airlock_version)
else:
storage_account_name = get_storage_account(request_properties.previous_status, request_properties.type, request_properties.workspace_id)
return blob_operations.get_request_files(account_name=storage_account_name, request_id=request_properties.request_id)


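In v2 the status-change handler above reduces to a small routing decision: a same-account transition is a metadata rewrite on the request's container, while a cross-account transition is a copy whose completion BlobCreatedTrigger reports asynchronously. A simplified, self-contained sketch of that decision (the return shape is illustrative, not the module's actual API):

```python
def plan_transition(source_account: str, dest_account: str, new_stage: str) -> dict:
    """Decide how a v2 status change is executed."""
    if source_account == dest_account:
        # No data moves; the container's 'stage' metadata is rewritten in place,
        # so any resulting StepResult must be emitted inline (no BlobCreated fires).
        return {"action": "update_metadata", "account": source_account, "stage": new_stage}
    # Data crosses accounts (e.g. core -> workspace-global on import approval);
    # BlobCreated on the destination signals completion, as in the v1 async pattern.
    return {"action": "copy", "source": source_account, "dest": dest_account, "stage": new_stage}

assert plan_transition("stalair", "stalair", "in-review")["action"] == "update_metadata"
assert plan_transition("stalair", "stalwsg", "import-approved")["action"] == "copy"
```

This split explains the inline StepResult emission for SUBMITTED and the rejection/blocking transitions: those paths never produce a BlobCreated event, so the handler cannot rely on BlobCreatedTrigger to advance the request.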
2 changes: 1 addition & 1 deletion airlock_processor/_version.py
@@ -1 +1 @@
-__version__ = "0.8.9"
+__version__ = "0.8.13"