Add azure support for storing MLRun artifacts #281

Merged
GiladShapira94 merged 17 commits into mlrun:development from GiladShapira94:azure-blob-support on Mar 29, 2026
@GiladShapira94
Collaborator

This PR fixes: https://iguazio.atlassian.net/browse/CEML-672

Summary

This PR adds Azure Blob Storage as an artifact storage in MLRun CE, alongside the existing S3/SeaweedFS backend.
Based on the storage mode (s3 or azure-blob) and the user-supplied values, the storage Secret and the MLRun artifact storage configuration are updated accordingly, for example the artifact_path and the credentials used to store files in SeaweedFS or Azure.

Main changes

  • Replaced the top-level s3.* values with a storage block containing storage.mode (s3 or azure-blob), storage.s3.*, and storage.azure.*
    • The storage-credentials Secret (replacing the old s3-credentials Secret) is now populated based on the selected mode: S3 env vars for s3 mode, Azure env vars for azure-blob mode
    • MLRun and Jupyter ConfigMaps now set paths (MLRUN_HTTPDB__REAL_PATH, artifact paths, feature store prefixes, model monitoring prefixes) based on the selected storage mode
    • Added install-time validation: helm install fails immediately if storage.s3.bucket is not set when mode: s3, or if storage.azure.containerName is not set when mode: azure-blob
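
The install-time validation described in the last bullet could be sketched roughly as follows. This is an illustration only, not the template from this PR: the file name, the error messages, and the final branch that rejects unknown modes are assumptions. Only the value keys storage.mode, storage.s3.bucket, and storage.azure.containerName come from the description above, and the sketch assumes values.yaml always defines the storage.s3 and storage.azure maps.

```yaml
{{- /* Hypothetical file: templates/validate-storage.yaml (name is an assumption). */ -}}
{{- /* Renders no manifest; it only aborts rendering when required values are missing. */ -}}
{{- $mode := .Values.storage.mode | default "" -}}
{{- if eq $mode "s3" -}}
  {{- /* s3 mode: a bucket name is mandatory. */ -}}
  {{- if not .Values.storage.s3.bucket -}}
    {{- fail "storage.s3.bucket must be set when storage.mode is s3" -}}
  {{- end -}}
{{- else if eq $mode "azure-blob" -}}
  {{- /* azure-blob mode: the container is user-managed, so no default is applied. */ -}}
  {{- if not .Values.storage.azure.containerName -}}
    {{- fail "storage.azure.containerName must be set when storage.mode is azure-blob" -}}
  {{- end -}}
{{- else -}}
  {{- /* Assumption: any other mode is rejected. The PR description does not state this. */ -}}
  {{- fail (printf "unsupported storage.mode %q, expected s3 or azure-blob" $mode) -}}
{{- end -}}
```

Because `fail` stops template rendering, a check of this shape would surface the error during `helm install` or `helm template`, before any resource is created.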

Other changes

  • Renamed mlpipeline-minio-artifact Secret to mlpipeline-seaweedfs-artifact across all pipeline templates
  • Added pipelines.metadata.enabled flag to gate all KFP metadata-grpc/envoy resources
  • Added seaweedfs.s3Service NodePort service and exposed it in NOTES.txt
  • Added SeaweedFS init container in ml-pipeline.yaml to wait for S3 gateway readiness
  • Added spark.hadoop.fs.s3a.change.detection.mode none to Spark defaults to fix SeaweedFS ETag incompatibility (CEML-669)
  • Bumped chart version to 0.11.0-rc.27

Important note: in both modes, KFP pipelines write artifacts to SeaweedFS.

Comment on lines +3 to +4
{{- $azure_container := .Values.storage.azure.containerName | default "" }}
{{- $is_azure := eq .Values.storage.mode "azure-blob" }}
Member

I'd move it to a template file (_helpers.tpl) and set it once. Then create a "schema" variable of "s3://" or "az://" according to the storage mode. That way you won't need many if/else blocks; you can put the schema template in as a prefix and the variables stay the same.
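
A rough sketch of what this suggestion might look like is shown below. The helper names (mlrun-ce.storage.scheme, mlrun-ce.storage.root) are hypothetical, and so is the exact path layout after the prefix. Note that the existing S3 value in this diff uses the form s3:///<bucket>, so a real helper would need to match whatever the chart already emits.

```yaml
{{- /* _helpers.tpl: hypothetical helpers, names are assumptions. */ -}}

{{- /* URL prefix selected by storage mode: "az://" for azure-blob, "s3://" otherwise. */ -}}
{{- define "mlrun-ce.storage.scheme" -}}
{{- if eq .Values.storage.mode "azure-blob" -}}az://{{- else -}}s3://{{- end -}}
{{- end -}}

{{- /* Bucket or container name, resolved once for all templates. */ -}}
{{- define "mlrun-ce.storage.root" -}}
{{- if eq .Values.storage.mode "azure-blob" -}}
{{- .Values.storage.azure.containerName -}}
{{- else -}}
{{- .Values.storage.s3.bucket -}}
{{- end -}}
{{- end -}}

# Usage inside a ConfigMap template: one line instead of an if/else per key.
data:
  MLRUN_FEATURE_STORE__DATA_PREFIXES__DEFAULT: "{{ include "mlrun-ce.storage.scheme" . }}{{ include "mlrun-ce.storage.root" . }}/projects/{project}/FeatureStore/{name}/{kind}"
```

With helpers like these, each ConfigMap would reference the same two includes, which would also address the inconsistent defaults between the MLRun and Jupyter ConfigMaps, since the name is resolved in one place.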

@@ -1,5 +1,7 @@
{{ if .Values.mlrun.enabled}}
{{- $bucket_name := .Values.global.infrastructure.aws.bucketName | default "mlrun" }}
{{- $azure_container := .Values.storage.azure.containerName | default "" }}
Member

@yaelgen Mar 24, 2026

The default here is empty string, but in jupyter-env-configmap.yaml the same variable defaults to "mlrun". I think it should be consistent and probably default "mlrun" in both places. This would also be resolved by the _helpers.tpl refactor Liran suggested.

Collaborator Author

I thought that for Azure Blob we don't want to use default values, since this service is managed by the user, unlike SeaweedFS, which is managed by us.

@@ -154,21 +154,21 @@ S3 Service Port - returns the port for pipeline config
S3 Access Key - uses top-level s3.accessKey for all components (MLRun, Jupyter, Pipelines)
Member

The values now live under storage.s3.*; please adjust the comment.

Collaborator Author

Done.

Member

This file is deleted but no replacement is created. Both the values.yaml references (name: storage-credentials) and the secret_name=storage-credentials auto-mount param point to this secret. How will this work? Am I missing something?

Collaborator Author

It is a change in MLRun 1.11 that mounts secrets as environment variables into the running pods. It works the same as before, but with a more generic approach.

MLRUN_CE__MODE: {{ .Values.jupyterNotebook.ce.mode | default "full" }}
MLRUN_CE__VERSION: {{ .Chart.Version }}
MLRUN_FUNCTION__SPEC__SERVICE_ACCOUNT__DEFAULT: {{ .Values.mlrun.api.functionSpecServiceAccountDefault | default "" | quote }}
MLRUN_FEATURE_STORE__DATA_PREFIXES__DEFAULT: s3:///{{ $bucket_name }}/projects/{project}/FeatureStore/{name}/{kind}
Member

MLRUN_FEATURE_STORE__DATA_PREFIXES__DEFAULT was removed here but not replaced. In mlrun-env-configmap.yaml it's set for both modes (S3 and Azure) inside the if/else block. Since Jupyter uses a separate ConfigMap (jupyter-common-env), it won't inherit that value. I guess it will fall back to MLRun's built-in default, which points to v3io://. Should this be added back here for both modes, same as in the MLRun ConfigMap, or does it work differently?

Collaborator Author

It is OK: these values are returned by the MLRun server, so it's better to have one source of truth.

@GiladShapira94 merged commit becd78f into mlrun:development on Mar 29, 2026
2 checks passed
3 participants