Waiting for a Linux system to be online

Designed by Freepik

What is an “online” system?

Networking is a complex topic, and there is lots of confusion around the definition of an “online” system. Sometimes the boot process gets delayed up to two minutes, because the system still waits for one or more network interfaces to be ready. Systemd provides the network-online.target that other service units can rely on, if they are deemed to require network connectivity. But what does “online” actually mean in this context, is a link-local IP address enough, do we need a routable gateway and how about DNS name resolution?

The requirements for an “online” network interface depend very much on the services using an interface. For some services it might be good enough to reach their local network segment (e.g. to announce Zeroconf services), while others need to reach domain names (e.g. to mount a NFS share) or reach the global internet to run a web server. On the other hand, the implementation of network-online.target varies, depending on which networking daemon is in use, e.g. systemd-networkd-wait-online.service or NetworkManager-wait-online.service. For Ubuntu, we created a specification that describes what we as a distro expect an “online” system to be. Having a definition in place, we are able to tackle the network-online-ordering issues that got reported over the years and can work out solutions to avoid delayed boot times on Ubuntu systems.

In essence, we want systems to reach the following networking state to be considered online:

  1. Do not wait for “optional” interfaces to receive network configuration
  2. Have IPv6 and/or IPv4 “link-local” addresses on every network interface
  3. Have at least one interface with a globally routable connection
  4. Have functional domain name resolution on any routable interface

A common implementation

NetworkManager and systemd-networkd are two very common networking daemons used on modern Linux systems. But they originate from different contexts and therefore show different behaviours in certain scenarios, such as wait-online. Luckily, on Ubuntu we already have Netplan as a unification layer on top of those networking daemons, that allows for common network configuration, and can also be used to tweak the wait-online logic.

With the recent release of Netplan v1.1 we introduced initial functionality to tweak the behaviour of the systemd-networkd-wait-online.service, as used on Ubuntu Server systems. When Netplan is used to drive the systemd-networkd backend, it will emit an override configuration file in /run/systemd/system/systemd-networkd-wait-online.service.d/10-netplan.conf, listing the specific non-optional interfaces that should receive link-local IP configuration. In parallel to that, it defines a list of network interfaces that Netplan detected to be potential global connections, and waits for any of those interfaces to reach a globally routable state.

Such override config file might look like this:

[Unit]
ConditionPathIsSymbolicLink=/run/systemd/generator/network-online.target.wants/systemd-networkd-wait-online.service

[Service]
ExecStart=
ExecStart=/lib/systemd/systemd-networkd-wait-online -i eth99.43:carrier -i lo:carrier -i eth99.42:carrier -i eth99.44:degraded -i bond0:degraded
ExecStart=/lib/systemd/systemd-networkd-wait-online --any -o routable -i eth99.43 -i eth99.45 -i bond0

In addition to the new features implemented in Netplan, we reached out to upstream systemd, proposing an enhancement to the systemd-networkd-wait-online service, integrating it with systemd-resolved to check for the availability of DNS name resolution. Once this is implemented upstream, we’re able to fully control the systemd-networkd backend on Ubuntu Server systems, to behave consistently and according to the definition of an “online” system that was lined out above.

Future work

The story doesn’t end there, because Ubuntu Desktop systems are using NetworkManager as their networking backend. This daemon provides its very own nm-online utility, utilized by the NetworkManager-wait-online systemd service. It implements a much higher-level approach, looking at the networking daemon in general instead of the individual network interfaces. By default, it considers a system to be online once every “autoconnect” profile got activated (or failed to activate), meaning that either a IPv4 or IPv6 address got assigned.

There are considerable enhancements to be implemented to this tool, for it to be controllable in a fine-granular way similar to systemd-networkd-wait-online, so that it can be instructed to wait for specific networking states on selected interfaces.

A note of caution

Making a service depend on network-online.target is considered an antipattern in most cases. This is because networking on Linux systems is very dynamic and the systemd target can only ever reflect the networking state at a single point in time. It cannot guarantee this state to be remained over the uptime of your system and has the potentially to delay the boot process considerably. Cables can be unplugged, wireless connectivity can drop, or remote routers can go down at any time, affecting the connectivity state of your local system. Therefore, “instead of wondering what to do about network.target, please just fix your program to be friendly to dynamically changing network configuration.” [source].