[NAS Backup] Suppress Errors in Disk Usage Calculation that Caused Backup to Fail.#13424
daviftorres wants to merge 10 commits into
This is the equivalent command for applying the fix:
We haven't confirmed the exact root cause of the
So, I am running tests with
Proposed Changes Rationale
backup_size=$(du -sb "$dest" 2>/dev/null | cut -f1) || true
timeout 60 umount "$mount_point" 2>/dev/null || true
rmdir "$mount_point" 2>/dev/null || true
echo -n "$backup_size"
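The `|| true` guard matters most when the script runs in strict mode. The following is a minimal sketch, assuming `set -eo pipefail` is in effect (an assumption; this thread does not show the script's shell options). It runs the size calculation in a child `bash` against a path that is guaranteed not to exist, once without the guard and once with it. The variable names `missing_dir`, `unguarded`, and `guarded` are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch only: compares an unguarded and a guarded `du | cut` under strict mode.
# ASSUMPTION: the real script uses `set -eo pipefail`; not confirmed here.

missing_dir="$(mktemp -d)/does-not-exist"   # path that cannot exist, so du fails

# Unguarded: du fails, pipefail propagates the status, errexit stops the
# child shell before it reaches the echo.
unguarded=$(bash -c '
  set -eo pipefail
  backup_size=$(du -sb "$1" 2>/dev/null | cut -f1)
  echo "reached end, size=[${backup_size}]"
' _ "$missing_dir") || unguarded="aborted"

# Guarded: `|| true` absorbs the failure, so the child shell keeps going
# with an empty backup_size.
guarded=$(bash -c '
  set -eo pipefail
  backup_size=$(du -sb "$1" 2>/dev/null | cut -f1) || true
  echo "reached end, size=[${backup_size}]"
' _ "$missing_dir") || guarded="aborted"

echo "unguarded: $unguarded"
echo "guarded:   $guarded"
```

Note that the guarded variant leaves `backup_size` empty rather than numeric, so whatever consumes the final `echo -n "$backup_size"` would need to tolerate an empty value.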
Dear @abh1sar, do you think you can help me with this bug? Regards,
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #13424 +/- ##
============================================
- Coverage 18.94% 18.94% -0.01%
+ Complexity 18376 18375 -1
============================================
Files 6192 6192
Lines 556550 556558 +8
Branches 67954 67955 +1
============================================
- Hits 105454 105453 -1
- Misses 439517 439526 +9
Partials 11579 11579
Pull request overview
This PR adjusts the KVM NAS backup script’s “statistics/cleanup” section so that failures while computing backup disk usage (and related cleanup commands) don’t cause an otherwise successful backup job to be marked as failed.
Changes:
- Capture `du` output into `backup_size` and suppress `du` stderr to avoid failing the script during size calculation.
- Add `timeout` around `umount` and suppress errors from `umount`/`rmdir`.
- Emit the computed backup size at the end of `backup_running_vm()`.
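The effect of wrapping a cleanup command in `timeout` can be sketched without a real mount. In the example below, `run_bounded` is a hypothetical helper (not part of this PR), and `sleep` stands in for an `umount` that hangs. It relies on GNU coreutils `timeout`, which exits with status 124 when the time limit is reached.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the PR: run a command under a time limit and
# report what happened instead of propagating the failure to the caller.
run_bounded() {
  local limit="$1"; shift
  local rc=0
  timeout "$limit" "$@" 2>/dev/null || rc=$?
  case "$rc" in
    0)   echo "ok" ;;
    124) echo "timed out" ;;     # GNU timeout's status when the limit is hit
    *)   echo "failed ($rc)" ;;
  esac
  return 0                       # never fail the surrounding job
}

run_bounded 1 sleep 3   # stands in for a hung umount
run_bounded 5 true      # stands in for a clean umount
run_bounded 5 false     # stands in for an umount that returns an error
```

Distinguishing status 124 from other failures makes it possible to log a hang separately from an ordinary error, which may help when diagnosing an unresponsive NAS.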
abh1sar left a comment
Hi @daviftorres,
Would it be possible to test the script changes in your environment, where the issue is reproducible?
Some log output would be nice.
@blueorangutan package
@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 18383
@blueorangutan test
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests
[SF] Trillian test result (tid-16435)
Handle potential errors when calculating disk usage.
Add timeout for unmounting backup mount point and cleanup.
Co-authored-by: Abhisar Sinha <63767682+abh1sar@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
eb53f2b to 8f93025
Hey @DaanHoogland, sure! If I understand it right, it is caused by the
Please point me in the right direction if you see that I am going the wrong way. Regards!
rmdir $mount_point
backup_size=$(du -sb "$dest" 2>>"$logFile" | cut -f1) || { log -ne "WARNING: du failed for $dest, reporting size as 0"; backup_size=0; }
timeout "$UNMOUNT_TIMEOUT" umount "$mount_point" 2>>"$logFile" || { log "WARNING: umount of $mount_point failed or timed out"; true; }
rmdir "$mount_point" 2>>"$logFile" || { log "WARNING: rmdir of $mount_point failed"; true; }
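One detail of the suggested `du ... | cut -f1 || { ...; backup_size=0; }` form is worth checking: the `||` branch sees the status of the whole pipeline, which is `cut`'s status unless `pipefail` is enabled. The sketch below, which assumes nothing about the real script's options, shows the three cases against a path that is guaranteed not to exist. The names `plain`, `strict`, and `defaulted` are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch only: when does the "report size as 0" fallback actually run?

missing_dir="$(mktemp -d)/does-not-exist"   # path that cannot exist, so du fails

# Without pipefail the pipeline returns cut's status (0): fallback is skipped
# and backup_size stays empty.
plain=$(bash -c '
  backup_size=$(du -sb "$1" 2>/dev/null | cut -f1) || backup_size=0
  echo "[${backup_size}]"
' _ "$missing_dir")

# With pipefail du's failure reaches `||`, so the fallback assigns 0.
strict=$(bash -c '
  set -o pipefail
  backup_size=$(du -sb "$1" 2>/dev/null | cut -f1) || backup_size=0
  echo "[${backup_size}]"
' _ "$missing_dir")

# Option-independent alternative: default an empty result at the point of use.
defaulted=$(bash -c '
  backup_size=$(du -sb "$1" 2>/dev/null | cut -f1) || true
  echo "[${backup_size:-0}]"
' _ "$missing_dir")

echo "plain=$plain strict=$strict defaulted=$defaulted"
```

If the script does enable `pipefail`, the suggested form behaves as intended; the `${backup_size:-0}` default is simply a way to get the same result regardless of that setting.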

Description
This PR tries to prevent the job from failing in the statistics section of a backup that has actually succeeded.
Apparently, it also fixes some silent failures I previously reported in #11727.
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?