HDDS-14517. [Recon] Include all storage report fields in CSV report for Capacity Distribution#9681
HDDS-14517. [Recon] Include all storage report fields in CSV report for Capacity Distribution#9681priyeshkaratha wants to merge 2 commits intoapache:masterfrom
Conversation
|
Hi @devabhishekpal @devmadhuu @ArafatKhan2198 can you please review the changes? |
5ce79dd to
74f3f7f
Compare
devabhishekpal
left a comment
There was a problem hiding this comment.
This change looks good to me, but it also needs frontend change.
The download button currently calls the pendingDeletion/download API endpoint.
However this changes it to storageDistribution/download so the download button won't work
Thanks @devabhishekpal for the review. calling storageDistribution/download is already part of the PR. Did I miss anything specific? |
devabhishekpal
left a comment
There was a problem hiding this comment.
Thanks for the patch, lgtm @priyeshkaratha
devmadhuu
left a comment
There was a problem hiding this comment.
Thanks for the patch @priyeshkaratha . Some comments, pls check.
...ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/StorageDistributionEndpoint.java
Outdated
Show resolved
Hide resolved
...ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/StorageDistributionEndpoint.java
Outdated
Show resolved
Hide resolved
...ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/StorageDistributionEndpoint.java
Outdated
Show resolved
Hide resolved
...ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/StorageDistributionEndpoint.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| @GET | ||
| @Path("/download") |
There was a problem hiding this comment.
Since we are adding new APIs, ReadMe also to be updated along with Swagger API doc
cc: @devabhishekpal
| } | ||
|
|
||
| Map<String, DatanodeStorageReport> reportByUuid = | ||
| collectDatanodeReports().stream() |
There was a problem hiding this comment.
- collectDatanodeReports() calls nodeManager.getAllNodes() which iterates all datanodes
- For each datanode, it calls nodeManager.getNodeStat() and nodeManager.getTotalFilesystemUsage()
- This happens synchronously during the HTTP request
For 1000+ datanodes, this could be a bottleneck, can we use already-collected metrics from DataNodeMetricsService, because chances are that user may click to download csv as soon as he see the data in table and if this is not feasible, then can we add some kind of timeout protection ?
...e/recon/src/test/java/org/apache/hadoop/ozone/recon/api/TestStorageDistributionEndpoint.java
Outdated
Show resolved
Hide resolved
...e/recon/src/test/java/org/apache/hadoop/ozone/recon/api/TestStorageDistributionEndpoint.java
Outdated
Show resolved
Hide resolved
| for (DatanodePendingDeletionMetrics metric : pendingDeletionMetrics) { | ||
| DatanodeStorageReport report = reportByUuid.get(metric.getDatanodeUuid()); | ||
| if (report == null) { | ||
| continue; // skip if report is missing |
There was a problem hiding this comment.
When a datanode has pending deletion metrics but no storage report, the row is silently dropped from the CSV. This could mask operational issues.
Scenarios Where This Fails:
- Datanode just registered but hasn't sent full report yet
- Datanode in STALE state
- Race condition between metrics collection and report generation
Impact: Operators lose visibility into datanodes with pending deletions, which is exactly what they're trying to monitor.
940b403 to
80ebc79
Compare
What changes were proposed in this pull request?
This pull request simplifies how datanode storage metrics are downloaded. Instead of having multiple endpoints, the download logic is now handled by a single improved endpoint. The old endpoint that only supported pending deletion data has been removed. The new endpoint generates a detailed CSV with all relevant storage metrics, giving a more complete view of datanode health and usage.
What is the link to the Apache JIRA
HDDS-14517
How was this patch tested?
Tested using modified testcases.