Fix NPE in NASBackupProvider when no running KVM host is available#12805
jmsperu wants to merge 2 commits into apache:4.22
ResourceManager.findOneRandomRunningHostByHypervisor() can return null when no KVM host in the zone has status=Up (e.g. during management server startup, brief agent disconnections, or host state transitions). NASBackupProvider.syncBackupStorageStats() and deleteBackup() call host.getId() without a null check, causing a NullPointerException that crashes the entire BackupSyncTask background job every sync interval.

This adds null checks in both methods:
- syncBackupStorageStats: log a warning and return early
- deleteBackup: throw CloudRuntimeException with a descriptive message
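The failure mode and the deleteBackup-style guard described above can be reproduced in isolation. The following is a minimal sketch, not the CloudStack code: `DeleteGuardSketch`, `Host`, `HostFinder` and both `delete*` methods are hypothetical stand-ins, `IllegalStateException` stands in for `CloudRuntimeException`, and returning the host ID is only a way to make the dereference visible.

```java
// Hypothetical stand-ins for illustration; these are not the CloudStack types.
final class DeleteGuardSketch {

    /** Minimal stand-in for a host record. */
    static final class Host {
        final long id;

        Host(long id) {
            this.id = id;
        }
    }

    /** Stand-in for the host lookup; returns null when no host is Up. */
    interface HostFinder {
        Host findRunningKvmHost(long zoneId);
    }

    /** Unguarded version: dereferences the lookup result directly. */
    static long deleteUnguarded(HostFinder finder, long zoneId) {
        Host host = finder.findRunningKvmHost(zoneId);
        return host.id; // throws NullPointerException when host is null
    }

    /** Guarded version: fails with a message naming the zone and the backup. */
    static long deleteGuarded(HostFinder finder, long zoneId, String backupUuid) {
        Host host = finder.findRunningKvmHost(zoneId);
        if (host == null) {
            // IllegalStateException is an assumption standing in for CloudRuntimeException.
            throw new IllegalStateException(String.format(
                    "Unable to find a running KVM host in zone %d to delete backup %s",
                    zoneId, backupUuid));
        }
        return host.id;
    }

    public static void main(String[] args) {
        HostFinder noHostUp = zoneId -> null; // simulates a zone with no host in status Up

        try {
            deleteUnguarded(noHostUp, 1L);
        } catch (NullPointerException e) {
            System.out.println("unguarded: NullPointerException");
        }

        try {
            deleteGuarded(noHostUp, 1L, "backup-uuid");
        } catch (IllegalStateException e) {
            System.out.println("guarded: " + e.getMessage());
        }
    }
}
```

Throwing a descriptive exception fits a delete request, where the caller needs to know that nothing was deleted and why. A periodic sync can skip the cycle and try again at the next interval.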

@blueorangutan package

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## 4.22 #12805 +/- ##
============================================
- Coverage 17.61% 17.60% -0.01%
+ Complexity 15662 15661 -1
============================================
Files 5917 5917
Lines 531415 531437 +22
Branches 64973 64976 +3
============================================
- Hits 93588 93586 -2
- Misses 427271 427292 +21
- Partials 10556 10559 +3

Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 17121

@jmsperu, this is probably due to the rebase: an import seems to be missing.

Good catch @DaanHoogland, the CollectionUtils import was dropped during the rebase. Fixed now; it should build cleanly.

@blueorangutan package

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17198

@blueorangutan test

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

Pull request overview
Fixes a crash in the NAS backup provider’s background sync/deletion flows when no running KVM host is available in a zone (i.e., ResourceManager.findOneRandomRunningHostByHypervisor(...) returns null), preventing BackupSyncTask from dying due to NullPointerException.
Changes:
- Add a null-host check in deleteBackup() and fail with a descriptive CloudRuntimeException.
- Add early returns in syncBackupStorageStats() when there are no repositories and when no running KVM host can be found (with a warning log).
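The two early returns in the sync path can be sketched in isolation as follows. This is an illustration under assumptions rather than the patched method: `SyncGuardSketch`, `HostFinder` and `syncStats` are hypothetical names, the order of the two checks is assumed, and the returned list of strings stands in for the real logging and the per-repository work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the CloudStack classes.
final class SyncGuardSketch {

    /** Stand-in for the host lookup; returns null when no host is Up. */
    interface HostFinder {
        Long findRunningKvmHostId(long zoneId);
    }

    /** Returns a trace of what the sync did so the control flow is visible. */
    static List<String> syncStats(List<String> repositories, HostFinder finder, long zoneId) {
        List<String> trace = new ArrayList<>();

        // Early return 1: nothing to sync (check order is an assumption of this sketch).
        if (repositories == null || repositories.isEmpty()) {
            trace.add("skip: no repositories");
            return trace;
        }

        // Early return 2: no running host, so record a warning and wait for the next cycle.
        Long hostId = finder.findRunningKvmHostId(zoneId);
        if (hostId == null) {
            trace.add("warn: no running KVM host in zone " + zoneId);
            return trace;
        }

        // Normal path: one entry per repository, using the host that was found.
        for (String repo : repositories) {
            trace.add("sync " + repo + " via host " + hostId);
        }
        return trace;
    }

    public static void main(String[] args) {
        HostFinder noHostUp = zoneId -> null;
        System.out.println(syncStats(List.of(), noHostUp, 1L));
        System.out.println(syncStats(List.of("nas-1"), noHostUp, 1L));
        System.out.println(syncStats(List.of("nas-1"), zoneId -> 42L, 1L));
    }
}
```

Because both guards return instead of throwing, a zone that temporarily has no running host costs one skipped sync cycle and leaves the background task alive.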
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collections;
import org.apache.commons.collections.CollectionUtils;
import java.util.Comparator;
import java.util.Date;

if (host == null) {
    throw new CloudRuntimeException(String.format("Unable to find a running KVM host in zone %d to delete backup %s", backup.getZoneId(), backup.getUuid()));
}

final Host host = resourceManager.findOneRandomRunningHostByHypervisor(Hypervisor.HypervisorType.KVM, zoneId);
if (host == null) {
    logger.warn("Unable to find a running KVM host in zone {} to sync backup storage stats", zoneId);
    return;
}

Rebased onto 4.22 as requested (previously #12680).