Wait on network to be available before trying to run backups against remote repos by rzeigler · Pull Request #2176 · borgbase/vorta

rzeigler · 2024-12-25T18:52:44Z

Description

Reactively disable scheduled jobs against remote repositories when there is no network connection.

Related Issue

Motivation and Context

Running a backup against a remote repo will fail without a network connection.
This results in a notification on default settings, stress (because why are my backups failing) and could maybe result in backups not happening.
On my Linux systems configured to sleep, the first backup after resuming always seems to fail because the network has not finished activating yet.
This should allow that backup to just wait for a little while.

As a side benefit, this should also defer backups that happen in the normal course when there is no possible way they could work (i.e. when in airplane mode).

I thought about these changes for a little and its what I came up with. However, if the maintainers have a different approach they would like to try I'm more than willing to rework my PR or open a new one with that approach. I do also have a mac that I can in theory develop the stubbed mac implementation here, but I'm less familiar with how to programatically access the state of the mac network. If this approach is good, I can flesh that out also.

How Has This Been Tested?

Activate airplane mode from kde plasma network manager
Manipulate the event log in settings.db so that the scheduled backup will fire immediately
Start vorta
Observe the new logs indicating that the scheduled task is being postponed because no network
Re-enable the network
Observe the scheduling come back

I had a little bit of trouble figuring out how to manipulate vorta's database state so that I could actually start vorta, then put my system to sleep. Vorta always wanted to backup immediately. I assume this should be a reasonable facsimile of the behavior.

Regarding actual tests, I am not an experienced python developer and running nox locally seems to fail to collect tests in test_darwin.py which I wouldn't expect to execute at all on my system. That said, I don't think I changed anything covered by unit tests.

Screenshots (if appropriate):

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

I have read the CONTRIBUTING guide.
My code follows the code style of this project.
My change requires a change to the documentation.
I have updated the documentation accordingly.
I have added tests to cover my changes. I am unsure of how I would test this
All new and existing tests passed. See note re: nox, I'm not actually sure

I provide my contribution under the terms of the license of this repository and I affirm the Developer Certificate of Origin.

Don't try and start jobs with remote repos until the network is up. This should prevent job failures when, for instance, waking from sleep. Mac implementation is a stub currently. Signed-off-by: Ryan Zeigler <zeiglerr@gmail.com>

Signed-off-by: Ryan Zeigler <zeiglerr@gmail.com>

NetworkStatusMonitor is now a QObject with signal network_status_changed. The scheduler listens to this signal and handles rescheduling similar to the existing 'wake from suspend' logic works. Signed-off-by: Ryan Zeigler <zeiglerr@gmail.com>

It seems that sometimes the signal does't arrive. Signed-off-by: Ryan Zeigler <zeiglerr@gmail.com>

rzeigler · 2026-01-22T19:51:17Z

Would there be any hope of getting either feedback or a merge?

m3nu · 2026-01-22T21:04:03Z

Sorry, I missed this. There are too many stale PRs currently. Looks simple enough at first.

m3nu

Review Summary

Purpose: Defer scheduled backups for remote repos when network is unavailable (e.g., after sleep/resume).

Positive Aspects

Solves a real problem - Backup failures after sleep/resume due to network not being ready is a legitimate issue
Good architecture - Extends existing NetworkStatusMonitor pattern, uses Qt signals appropriately
Linux implementation is well done - Uses NetworkManager DBus API properly
Fails gracefully - When status can't be determined, assumes network is up (won't block backups unnecessarily)
Respects local repos - Only defers remote repository backups

Issues to Address

1. Bug: Inconsistent network state checking (Medium severity)

The code mixes two different NetworkManager enums:

In networkStateChanged slot (network_manager.py:135):

# This checks NMState (0-80 range)
self.network_status_changed.emit(state >= 60)

In is_network_active() (network_manager.py:30):

# This checks NMConnectivityState (0-4 range)
return self._nm.get_connectivity_state() is not NMConnectivityState.NONE

These are different DBus properties:

StateChanged signal emits NMState (values 0-80)
Connectivity property returns NMConnectivityState (values 0-4)

This could cause the initial _net_up state (set via is_network_active()) to disagree with signal-based updates. Both should use the same approach.

Suggested fix - Option A (Use State consistently):

def is_network_active(self):
    try:
        state = read_dbus_property(self._nm._nm, 'State')
        return state >= 60  # NM_STATE_CONNECTED_LOCAL or better
    except DBusException:
        logger.exception("Failed to check network state. Assuming connected")
        return True

Or Option B (Use Connectivity consistently):

@pyqtSlot("unsigned int")
def networkStateChanged(self, state):
    logger.debug(f'network state changed: {state}')
    # Re-check connectivity when state changes
    try:
        connectivity = self.get_connectivity_state()
        self.network_status_changed.emit(connectivity is not NMConnectivityState.NONE)
    except DBusException:
        self.network_status_changed.emit(True)

2. Typo (Trivial)

scheduler.py:93: "updating shcedule" → "updating schedule"

3. Unused import (Trivial)

scheduler.py:6: List is added to typing imports but never used in the file.

Notes

macOS stub - darwin.py is_network_active() always returns True. This is fine for now since it preserves existing behavior, and you've offered to implement it later.
Testing - Would be good to verify on Linux with NetworkManager by toggling airplane mode and confirming backups defer/resume correctly.

Overall this is a useful feature! Just needs the NMState vs NMConnectivityState inconsistency fixed before merge.

rzeigler · 2026-01-27T01:59:22Z

Suggestions are noted, I'll work on those updates.

…Connectivity and NMState I chose NMState because there is no Connectivity signal and immediately polling for Connectivity on state change seems to return the old value.

rzeigler · 2026-01-27T14:27:42Z

Some changes made. I standardized on reading 'State' instead of 'Connectivity' due to what appears to be a delay in 'Connectivity' being updated on the nm side after the State change emits.

m3nu · 2026-01-27T15:15:49Z

Quick Review — Minor Issues Found

All three original review comments from Jan 22 have been addressed. Found a few new minor items:

1. Wrong enum name for NMState value 10
Per NM docs, value 10 is NM_STATE_ASLEEP, not NM_STATE_DISABLED. No functional impact (only the int value matters), but the name is misleading.

2. Double "to" typo in scheduler.py comment

# Connect to network manager to to monitor that network status

Should be: # Connect to network manager to monitor the network status

3. Misleading else-branch log message in set_timer_for_profile
The else branch of if profile.schedule_make_up_missed and self._net_up always logs "the network is not available", but it also fires when schedule_make_up_missed is simply disabled. Suggest:

if profile.schedule_make_up_missed and self._net_up:
    # ... run the backup
elif profile.schedule_make_up_missed and not self._net_up:
    logger.debug('Skipping catchup %s (%s), the network is not available', profile.name, profile.id)

4. Trailing whitespace

network_manager.py: trailing tab on NM_STATE_CONNECTED_GLOBAL = 70
network_manager.py: trailing whitespace on blank line before is_network_connected

(These would likely be caught by ruff/pre-commit.)

rzeigler · 2026-01-27T16:15:13Z

I have addressed other notes with the exception of

1. Wrong enum name for NMState value 10 Per NM docs, value 10 is NM_STATE_ASLEEP, not NM_STATE_DISABLED. No functional impact (only the int value matters), but the name is misleading.

This is incorrect. Per the docs ASLEEP and DISABLED both have the value 10 and ASLEEP is deprecated.

m3nu · 2026-01-27T17:09:15Z

Thanks for your patience with this PR, @rzeigler, and for working through all the review feedback!

The three original review issues are all resolved, and the follow-up items have been addressed as well (and you were right about NM_STATE_DISABLED vs NM_STATE_ASLEEP).

One remaining issue I noticed:

Local repos blocked from catchup when network is down

In set_timer_for_profile(), the catchup logic gates on self._net_up without checking if the repo is remote:

# scheduler.py ~line 314
if profile.schedule_make_up_missed and self._net_up:
    # run the backup...
elif profile.schedule_make_up_missed and not self._net_up:
    logger.debug('Skipping catchup %s (%s), the network is not available', ...)

Meanwhile, reload_all_timers() correctly distinguishes local vs remote repos:

elif not profile.repo.is_remote_repo() or self._net_up:
    self.set_timer_for_profile(profile.id)

So reload_all_timers allows local repos through, but set_timer_for_profile then blocks their catchup anyway because self._net_up is False. Local repos don't need network access and should always be able to catch up.

Suggested fix:

needs_network = profile.repo is not None and profile.repo.is_remote_repo()
if profile.schedule_make_up_missed and (self._net_up or not needs_network):
    # run backup...
elif profile.schedule_make_up_missed and not self._net_up and needs_network:
    logger.debug('Skipping catchup %s (%s), the network is not available', profile.name, profile.id)

Also two minor typos remaining:

scheduler.py: # Connect to network manager to to monitor net status — double "to"
network_manager.py: # Connected and it appears that the the connected state — double "the"

ctheune · 2026-02-01T12:07:07Z

Quick note: #2133 is currently closed due to inactivity and #1815 is a duplicate that will be fixed by this.

Thanks @m3nu for picking this up.

rzeigler · 2026-02-08T18:20:20Z

Anything else on this?

m3nu · 2026-02-08T18:35:02Z

Anything else on this?

No, I think all good. I'll make a note to merge and test tomorrow.

m3nu

🔍 Code Review

PR: #2176 - Wait on network to be available before trying to run backups against remote repos
Branch: wait-for-network → master
Changes: +113 / -5 lines across 4 files

📋 Summary

Adds reactive network monitoring to defer scheduled backups against remote repos when there is no network connection. On Linux, listens to NetworkManager's StateChanged DBus signal; on macOS, the implementation is stubbed to always return True.

✅ What's Good

Clean approach to a real pain point (failed backups after resume from sleep)
Proper use of DBus signals for reactive network monitoring on Linux
Correct NMState enum mapping with documentation links
Graceful fallback: NullNetworkStatusMonitor.is_network_active() returns True
Thread-safe: properly integrates with existing lock pattern in scheduler
Only affects remote repos, leaving local backups unhindered
Good defensive re-fetch of network state on resume from suspend

🔍 Review Details

Severity	Count	Details
🔴 Error	1	Base class design
🟡 Warning	4	Various
🔵 Info	2	Style/minor

1. 🔴 Base class is_network_active() raises NotImplementedError — If a new platform subclass is added and forgets to implement this method, scheduled backups will crash instead of gracefully degrading. Consider returning True by default (like NullNetworkStatusMonitor does) so the behavior falls back to "assume connected".

2. 🟡 macOS is_network_active() is stubbed — DarwinNetworkStatus.is_network_active() always returns True with a # Not yet implemented comment. macOS users won't benefit from this feature. Worth tracking as a follow-up issue.

3. 🟡 reload_all_timers() race condition — When network goes down, timers for remote repos are removed via remove_job(). When network comes back, networkStatusChanged signal fires and reschedules. However, if the signal is missed (e.g., during a brief connectivity blip), profiles stay unscheduled until the 15-minute reload timer fires. This is probably acceptable but worth noting.

4. 🟡 set_timer_for_profile() partial coverage — The needs_network check only guards the schedule_make_up_missed path. If a profile has schedule_make_up_missed=False and network is down, it will still schedule a future timer that may fire while network is still down — the backup then fails at BorgCreateJob.prepare() time (existing behavior, but this PR could address it for consistency).

5. 🟡 @pyqtSlot("unsigned int") fragility — The slot signature should match the DBus signal type. NM's StateChanged emits u (uint32). Verify this works reliably across different Qt/DBus versions.

6. 🔵 Naming convention — networkStatusChanged method in scheduler.py uses camelCase, inconsistent with the rest of the Python codebase which uses snake_case. Consider renaming to network_status_changed_handler or similar.

7. 🔵 NM_STATE_CONNECTING excluded — _is_network_connected() correctly excludes NM_STATE_CONNECTING (40), meaning a brief "connecting" phase after resume won't trigger premature backup attempts. Good decision.

📊 Verdict: COMMENT 💬

Solid approach to a real problem. The Linux/NetworkManager integration looks correct. Main concerns are the base class design (should default to True instead of NotImplementedError), the partial coverage in set_timer_for_profile(), and the macOS stub that should be tracked for follow-up. None of these are blockers, but addressing them would strengthen the PR.

🤖 Reviewed by Claude Code

m3nu · 2026-02-09T18:22:22Z

I think we are good. The base class design follows existing convention and isn't a concern.

Thanks again for the patience!

rzeigler force-pushed the wait-for-network branch 2 times, most recently from 2fe7de9 to eaaadf0 Compare December 25, 2024 23:23

Wait on network for remote repos

42e8bf9

Don't try and start jobs with remote repos until the network is up. This should prevent job failures when, for instance, waking from sleep. Mac implementation is a stub currently. Signed-off-by: Ryan Zeigler <zeiglerr@gmail.com>

rzeigler force-pushed the wait-for-network branch from eaaadf0 to 42e8bf9 Compare December 26, 2024 02:13

Implement missed interface function in NullNetworkStatusMonitor

efc04e1

Signed-off-by: Ryan Zeigler <zeiglerr@gmail.com>

rzeigler force-pushed the wait-for-network branch from cce8273 to efc04e1 Compare December 26, 2024 13:18

rzeigler added 3 commits December 29, 2024 15:47

Request network status when waking from sleep

5f6a77f

It seems that sometimes the signal does't arrive. Signed-off-by: Ryan Zeigler <zeiglerr@gmail.com>

Merge branch 'master' into wait-for-network

b83e765

goebbe mentioned this pull request Jan 29, 2025

Run missed backups runs before network is ready #2133

Closed

2 tasks

Merge branch 'master' into wait-for-network

10447f8

m3nu requested changes Jan 22, 2026

View reviewed changes

rzeigler added 3 commits January 27, 2026 07:20

Merge remote-tracking branch 'upstream/master' into wait-for-network

e3a7eb1

Address typos and unused imports

9f23676

Only use the NM State for connectivity decisions instead of mixing NM…

02a466e

…Connectivity and NMState I chose NMState because there is no Connectivity signal and immediately polling for Connectivity on state change seems to return the old value.

rzeigler requested a review from m3nu January 27, 2026 14:41

rzeigler added 3 commits January 27, 2026 10:26

Autoformat fixes

517e82b

Avoid logging network is unavailable when nothing to make up.

4cfb475

Clearify comment verbiage

553fe94

Do not block catchup backups for local repos on network state

5b82ca0

ctheune mentioned this pull request Feb 1, 2026

Vorta trying to fast after resume #1815

Open

2 tasks

Merge branch 'master' into wait-for-network

3711ec1

m3nu reviewed Feb 9, 2026

View reviewed changes

m3nu merged commit 6deb619 into borgbase:master Feb 9, 2026
5 checks passed

mr-raj12 mentioned this pull request Feb 11, 2026

Add mypy type checking infrastructure and core module annotations (#2362) #2402

Open

9 tasks

rzeigler mentioned this pull request Feb 12, 2026

Wait on network to be available (mac impl) #2408

Open

9 tasks

Conversation

rzeigler commented Dec 25, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issue

Motivation and Context

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

Checklist:

Uh oh!

rzeigler commented Jan 22, 2026

Uh oh!

m3nu commented Jan 22, 2026

Uh oh!

m3nu left a comment

Choose a reason for hiding this comment

Review Summary

Positive Aspects

Issues to Address

1. Bug: Inconsistent network state checking (Medium severity)

2. Typo (Trivial)

3. Unused import (Trivial)

Notes

Uh oh!

rzeigler commented Jan 27, 2026

Uh oh!

rzeigler commented Jan 27, 2026

Uh oh!

m3nu commented Jan 27, 2026

Quick Review — Minor Issues Found

Uh oh!

rzeigler commented Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

m3nu commented Jan 27, 2026

Local repos blocked from catchup when network is down

Uh oh!

ctheune commented Feb 1, 2026

Uh oh!

rzeigler commented Feb 8, 2026

Uh oh!

m3nu commented Feb 8, 2026

Uh oh!

m3nu left a comment

Choose a reason for hiding this comment

🔍 Code Review

📋 Summary

✅ What's Good

🔍 Review Details

📊 Verdict: COMMENT 💬

Uh oh!

m3nu commented Feb 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

rzeigler commented Dec 25, 2024 •

edited

Loading

rzeigler commented Jan 27, 2026 •

edited

Loading