HDDS-14517. [Recon] Include all storage report fields in CSV report for Capacity Distribution by priyeshkaratha · Pull Request #9681 · apache/ozone

priyeshkaratha · 2026-01-28T09:16:03Z

What changes were proposed in this pull request?

This pull request simplifies how datanode storage metrics are downloaded. Instead of having multiple endpoints, the download logic is now handled by a single improved endpoint. The old endpoint that only supported pending deletion data has been removed. The new endpoint generates a detailed CSV with all relevant storage metrics, giving a more complete view of datanode health and usage.

Refactored Datanode Metrics Download: The dedicated /download endpoint for pending deletion metrics in PendingDeletionEndpoint has been removed.
Consolidated Storage Report: A new /download endpoint has been introduced in StorageDistributionEndpoint to provide a comprehensive CSV report.
Expanded Report Fields: The new CSV report now includes all datanode storage fields such as Capacity, Used Space, Remaining Space, Committed Space, Reserved Space, Minimum Free Space, and Pending Block Size.
Frontend URL Update: The frontend application (capacity.tsx) has been updated to use the new consolidated download URL.
Updated Test Coverage: Corresponding unit tests have been removed from TestPendingDeletionEndpoint and new, comprehensive tests have been added in TestStorageDistributionEndpoint for the new download functionality.
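
For readers who want a feel for the consolidated report, here is a minimal sketch of how a CSV with the fields listed above could be assembled. It is not the code from this PR: the `StorageCsvSketch` class, the `Row` holder, the `HostName` column, and the exact header labels are assumptions made for illustration only.

```java
import java.util.List;

/** Illustrative only: a stand-in for CSV generation, not the Recon implementation. */
final class StorageCsvSketch {

  /** Hypothetical per-datanode row holding the fields named in the PR description. */
  static final class Row {
    final String hostName;
    final long capacity;
    final long used;
    final long remaining;
    final long committed;
    final long reserved;
    final long minimumFreeSpace;
    final long pendingBlockSize;

    Row(String hostName, long capacity, long used, long remaining, long committed,
        long reserved, long minimumFreeSpace, long pendingBlockSize) {
      this.hostName = hostName;
      this.capacity = capacity;
      this.used = used;
      this.remaining = remaining;
      this.committed = committed;
      this.reserved = reserved;
      this.minimumFreeSpace = minimumFreeSpace;
      this.pendingBlockSize = pendingBlockSize;
    }
  }

  /** Assumed header labels; the real report may name or order columns differently. */
  static final String HEADER = "HostName,Capacity,Used Space,Remaining Space,"
      + "Committed Space,Reserved Space,Minimum Free Space,Pending Block Size";

  /** Renders the header followed by one line per datanode row. */
  static String toCsv(List<Row> rows) {
    StringBuilder csv = new StringBuilder(HEADER).append('\n');
    for (Row r : rows) {
      csv.append(escape(r.hostName)).append(',')
          .append(r.capacity).append(',')
          .append(r.used).append(',')
          .append(r.remaining).append(',')
          .append(r.committed).append(',')
          .append(r.reserved).append(',')
          .append(r.minimumFreeSpace).append(',')
          .append(r.pendingBlockSize).append('\n');
    }
    return csv.toString();
  }

  /** Quotes a text cell when it contains a comma, quote, or newline (RFC 4180 style). */
  static String escape(String value) {
    if (value.contains(",") || value.contains("\"") || value.contains("\n")) {
      return "\"" + value.replace("\"", "\"\"") + "\"";
    }
    return value;
  }
}
```

Numeric cells are written as raw byte counts here; whether the actual report uses raw or human-readable sizes is not stated in this conversation.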

What is the link to the Apache JIRA

HDDS-14517

How was this patch tested?

Tested using the modified test cases.

priyeshkaratha · 2026-01-28T09:18:06Z

Hi @devabhishekpal @devmadhuu @ArafatKhan2198 can you please review the changes?

devabhishekpal

This change looks good to me, but it also needs a frontend change.
The download button currently calls the pendingDeletion/download API endpoint.
However, this PR moves the download to storageDistribution/download, so the download button won't work.

priyeshkaratha · 2026-02-06T06:43:53Z

This change looks good to me, but it also needs frontend change. The download button currently calls the pendingDeletion/download API endpoint. However this changes it to storageDistribution/download so the download button won't work

Thanks @devabhishekpal for the review. Calling storageDistribution/download is already part of the PR. Did I miss anything specific?

devabhishekpal

Thanks for the patch, lgtm @priyeshkaratha

devmadhuu

Thanks for the patch @priyeshkaratha. Some comments, please check.

...ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/StorageDistributionEndpoint.java

devmadhuu · 2026-02-12T12:28:26Z

...ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/StorageDistributionEndpoint.java

  }

+  @GET
+  @Path("/download")


Since we are adding a new API, the README and the Swagger API doc should also be updated.
cc: @devabhishekpal

devmadhuu · 2026-02-12T12:38:09Z

...ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/StorageDistributionEndpoint.java

+    }
+
+    Map<String, DatanodeStorageReport> reportByUuid =
+        collectDatanodeReports().stream()


collectDatanodeReports() calls nodeManager.getAllNodes(), which iterates over all datanodes.

For each datanode, it calls nodeManager.getNodeStat() and nodeManager.getTotalFilesystemUsage().

This happens synchronously during the HTTP request.

For 1000+ datanodes, this could become a bottleneck. Can we use the already-collected metrics from DataNodeMetricsService instead? Chances are that a user will click to download the CSV as soon as they see the data in the table. If that is not feasible, can we add some kind of timeout protection?
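
As a rough illustration of the "timeout protection" idea raised above, the sketch below bounds a slow collection call with `Future.get(timeout, unit)` and returns a fallback when the deadline passes. The `TimeoutGuardSketch` class and `callWithTimeout` helper are hypothetical names, not part of Recon, and a real endpoint would more likely reuse a shared executor and map the timeout to an HTTP error response instead of creating an executor per call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrative only: bounds how long a report-collection call may block a request. */
final class TimeoutGuardSketch {

  /**
   * Runs the task on a worker thread and waits at most timeoutMillis for it.
   * Returns the fallback if the task does not finish in time or the caller is interrupted.
   */
  static <T> T callWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
    // A per-call executor keeps the sketch self-contained; a service would share one.
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<T> future = executor.submit(task);
    try {
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the worker so it does not keep running
      return fallback;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return fallback;
    } catch (ExecutionException e) {
      throw new IllegalStateException("collection failed", e.getCause());
    } finally {
      executor.shutdownNow();
    }
  }
}
```

A caller could wrap the datanode report collection in such a guard and respond with an error or a cached result when the fallback is returned.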

...e/recon/src/test/java/org/apache/hadoop/ozone/recon/api/TestStorageDistributionEndpoint.java

devmadhuu · 2026-02-12T12:45:37Z

...ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/StorageDistributionEndpoint.java

+        for (DatanodePendingDeletionMetrics metric : pendingDeletionMetrics) {
+          DatanodeStorageReport report = reportByUuid.get(metric.getDatanodeUuid());
+          if (report == null) {
+            continue; // skip if report is missing


When a datanode has pending deletion metrics but no storage report, the row is silently dropped from the CSV. This could mask operational issues.
Scenarios where this fails:

The datanode has just registered but hasn't sent a full report yet

The datanode is in the STALE state

A race condition occurs between metrics collection and report generation

Impact: Operators lose visibility into datanodes with pending deletions, which is exactly what they're trying to monitor.

priyeshkaratha changed the title ~~HDDS-14517. Include all storage report fields in downloaded CSV report for Capacity Distribution~~ HDDS-14517. [Recon] Include all storage report fields in downloaded CSV report for Capacity Distribution Jan 28, 2026

adoroszlai changed the title ~~HDDS-14517. [Recon] Include all storage report fields in downloaded CSV report for Capacity Distribution~~ HDDS-14517. [Recon] Include all storage report fields in CSV report for Capacity Distribution Jan 28, 2026

adoroszlai added the recon label Jan 28, 2026

devabhishekpal requested review from ArafatKhan2198, devabhishekpal and devmadhuu and removed request for ArafatKhan2198 January 30, 2026 08:26

HDDS-14517. Adding all the storage fields to report.

74f3f7f

priyeshkaratha force-pushed the HDDS-14517 branch from 5ce79dd to 74f3f7f Compare February 6, 2026 05:34

devabhishekpal reviewed Feb 6, 2026

devabhishekpal approved these changes Feb 6, 2026

priyeshkaratha marked this pull request as ready for review February 8, 2026 05:55

devmadhuu reviewed Feb 12, 2026

addressing review comments

80ebc79

priyeshkaratha force-pushed the HDDS-14517 branch from 940b403 to 80ebc79 Compare February 14, 2026 05:59

priyeshkaratha wants to merge 2 commits into apache:master from priyeshkaratha:HDDS-14517

4 participants
