Waiting for a Linux system to be online

What is an “online” system?

Networking is a complex topic, and there is a lot of confusion around the definition of an “online” system. Sometimes the boot process is delayed by up to two minutes because the system is still waiting for one or more network interfaces to be ready. Systemd provides the network-online.target, which other service units can depend on if they require network connectivity. But what does “online” actually mean in this context? Is a link-local IP address enough? Do we need a routable gateway? And what about DNS name resolution?

The requirements for an “online” network interface depend very much on the services using that interface. For some services it might be good enough to reach their local network segment (e.g. to announce Zeroconf services), while others need to reach hosts by domain name (e.g. to mount an NFS share) or reach the global internet (e.g. to run a web server). In addition, the implementation of network-online.target varies depending on which networking daemon is in use, e.g. systemd-networkd-wait-online.service or NetworkManager-wait-online.service. For Ubuntu, we created a specification that describes what we as a distro expect an “online” system to be. With a definition in place, we are able to tackle the network-online ordering issues that were reported over the years and can work out solutions to avoid delayed boot times on Ubuntu systems.

In essence, we want systems to reach the following networking state to be considered online:

  1. Do not wait for “optional” interfaces to receive network configuration
  2. Have IPv6 and/or IPv4 “link-local” addresses on every network interface
  3. Have at least one interface with a globally routable connection
  4. Have functional domain name resolution on any routable interface
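
To make the four criteria more concrete, here is a minimal Python sketch that evaluates them against a snapshot of interface state. Everything in it is an assumption made for illustration: the Interface class, its fields and the system_is_online helper are hypothetical and are not part of Netplan or systemd. It also assumes that any address which is neither link-local nor loopback counts towards "routable", and that an optional interface may still satisfy criteria 3 and 4 if it happens to be up.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass
class Interface:
    """Hypothetical snapshot of one interface (not a Netplan or systemd type)."""
    name: str
    addresses: list[str]
    optional: bool = False
    has_default_route: bool = False
    dns_works: bool = False


def has_link_local(iface: Interface) -> bool:
    """True if the interface holds an IPv4 or IPv6 link-local address."""
    return any(ip_address(a).is_link_local for a in iface.addresses)


def is_routable(iface: Interface) -> bool:
    """True if the interface has a non-link-local, non-loopback address
    and a default route (our simplified notion of 'routable')."""
    wider_scope = any(
        not (ip_address(a).is_link_local or ip_address(a).is_loopback)
        for a in iface.addresses
    )
    return wider_scope and iface.has_default_route


def system_is_online(interfaces: list[Interface]) -> bool:
    # Criterion 1: optional interfaces are never waited for.
    required = [i for i in interfaces if not i.optional]
    # Criterion 2: every required interface needs a link-local address.
    if not all(has_link_local(i) for i in required):
        return False
    # Criterion 3: at least one interface must be routable.
    routable = [i for i in interfaces if is_routable(i)]
    if not routable:
        return False
    # Criterion 4: name resolution must work on any routable interface.
    return any(i.dns_works for i in routable)
```

For example, a host with one configured wired interface and an unconfigured, optional wireless interface would count as online, while adding a second, non-optional interface without any address would keep it offline.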

A common implementation

NetworkManager and systemd-networkd are two very common networking daemons used on modern Linux systems. But they originate from different contexts and therefore behave differently in certain scenarios, such as wait-online. Luckily, on Ubuntu we already have Netplan as a unification layer on top of those networking daemons, which allows for common network configuration and can also be used to tweak the wait-online logic.

With the recent release of Netplan v1.1 we introduced initial functionality to tweak the behaviour of the systemd-networkd-wait-online.service, as used on Ubuntu Server systems. When Netplan is used to drive the systemd-networkd backend, it will emit an override configuration file in /run/systemd/system/systemd-networkd-wait-online.service.d/10-netplan.conf, listing the specific non-optional interfaces that should receive link-local IP configuration. In parallel to that, it defines a list of network interfaces that Netplan detected to be potential global connections, and waits for any of those interfaces to reach a globally routable state.

Such an override config file might look like this:

[Unit]
ConditionPathIsSymbolicLink=/run/systemd/generator/network-online.target.wants/systemd-networkd-wait-online.service

[Service]
ExecStart=
ExecStart=/lib/systemd/systemd-networkd-wait-online -i eth99.43:carrier -i lo:carrier -i eth99.42:carrier -i eth99.44:degraded -i bond0:degraded
ExecStart=/lib/systemd/systemd-networkd-wait-online --any -o routable -i eth99.43 -i eth99.45 -i bond0
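
To illustrate how the two ExecStart= lines relate to the two interface lists, here is a small Python sketch that renders the [Service] part of such a drop-in. It is not Netplan's actual generator: the render_override function and its parameters are hypothetical, and it only uses the command-line options visible in the example above. The empty ExecStart= line clears the command inherited from the packaged unit before the new ones are added.

```python
WAIT_ONLINE = "/lib/systemd/systemd-networkd-wait-online"


def render_override(link_states: dict[str, str],
                    routable_candidates: list[str]) -> str:
    """Render the [Service] section of a wait-online drop-in.

    link_states maps each non-optional interface to the minimum
    operational state to wait for (e.g. "carrier" or "degraded").
    routable_candidates lists interfaces of which any one reaching
    the "routable" state is sufficient.
    """
    lines = ["[Service]", "ExecStart="]  # empty assignment resets the list
    if link_states:
        per_link = " ".join(
            f"-i {name}:{state}" for name, state in link_states.items()
        )
        lines.append(f"ExecStart={WAIT_ONLINE} {per_link}")
    if routable_candidates:
        any_of = " ".join(f"-i {name}" for name in routable_candidates)
        lines.append(f"ExecStart={WAIT_ONLINE} --any -o routable {any_of}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_override(
        {"lo": "carrier", "bond0": "degraded"},
        ["bond0"],
    ))
```

The first command waits for every listed interface to reach its own minimum state, while the second returns as soon as any one of the candidates becomes routable.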

In addition to the new features implemented in Netplan, we reached out to upstream systemd, proposing an enhancement to the systemd-networkd-wait-online service that integrates it with systemd-resolved to check for the availability of DNS name resolution. Once this is implemented upstream, we will be able to fully control the systemd-networkd backend on Ubuntu Server systems, so that it behaves consistently and according to the definition of an “online” system outlined above.

Future work

The story doesn’t end there, because Ubuntu Desktop systems use NetworkManager as their networking backend. This daemon provides its very own nm-online utility, used by the NetworkManager-wait-online systemd service. It implements a much higher-level approach, looking at the networking daemon in general instead of the individual network interfaces. By default, it considers a system to be online once every “autoconnect” profile has been activated (or has failed to activate), meaning that either an IPv4 or IPv6 address has been assigned.

This tool still needs considerable enhancements before it can be controlled in a fine-grained way, similar to systemd-networkd-wait-online, so that it can be instructed to wait for specific networking states on selected interfaces.

A note of caution

Making a service depend on network-online.target is considered an antipattern in most cases. This is because networking on Linux systems is very dynamic and the systemd target can only ever reflect the networking state at a single point in time. It cannot guarantee that this state is maintained over the uptime of your system, and it has the potential to delay the boot process considerably. Cables can be unplugged, wireless connectivity can drop, or remote routers can go down at any time, affecting the connectivity state of your local system. Therefore, “instead of wondering what to do about network.target, please just fix your program to be friendly to dynamically changing network configuration.” [source].
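
One common way to be “friendly to dynamically changing network configuration” is to retry connections with a growing delay instead of assuming the network is up at start time. The following Python sketch shows that idea; the connect_with_retry helper, its parameters and the default values are assumptions chosen for illustration, not a recommendation from systemd or Netplan.

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, base_delay=1.0,
                       max_delay=30.0,
                       connect=socket.create_connection,
                       sleep=time.sleep):
    """Try to open a TCP connection, backing off between failures.

    The delay doubles after each failed attempt, capped at max_delay.
    connect and sleep are injectable so the logic can be tested
    without touching the real network.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect((host, port), timeout=5)
        except OSError:
            # Covers unreachable networks, refused connections and
            # failed name resolution (socket.gaierror is an OSError).
            if attempt == attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

A long-running service would typically apply the same pattern whenever an established connection drops, not only at startup.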

Netplan v1.0 paves the way to stable, declarative network management

New “netplan status --diff” subcommand, finding differences between configuration and system state

As the maintainer and lead developer for Netplan, I’m proud to announce the general availability of Netplan v1.0 after more than 7 years of development. Over that time, we’ve had about 80 individual contributors from around the globe. This includes many contributions from our Netplan core team at Canonical, but also from other big corporations such as Microsoft or Deutsche Telekom. Those contributions, along with the many we receive from our community of individual contributors, solidify Netplan as a healthy and trusted open source project. In an effort to make Netplan even more dependable, we started shipping upstream patch releases, such as 0.106.1 and 0.107.1, which make it easier to integrate fixes into our users’ custom workflows.

With the release of version 1.0 we primarily focused on stability. However, being a major version upgrade, it allowed us to drop some long-standing legacy code from the libnetplan1 library. Removing this technical debt increases the maintainability of Netplan’s codebase going forward. The upcoming Ubuntu 24.04 LTS and Debian 13 releases will ship Netplan v1.0 to millions of users worldwide.

Highlights of version 1.0

In addition to stability and maintainability improvements, it’s worth looking at some of the new features that were included in the latest release:

  • Simultaneous WPA2 & WPA3 support.
  • Introduction of a stable libnetplan1 API.
  • Mellanox VF-LAG support for high performance SR-IOV networking.
  • New hairpin and port-mac-learning settings, useful for VXLAN tunnels with FRRouting.
  • New netplan status --diff subcommand, finding differences between configuration and system state.

Besides those highlights of the v1.0 release, I’d also like to shed some light on new functionality that was integrated within the past two years for those upgrading from the previous Ubuntu 22.04 LTS which used Netplan v0.104:

  • We added support for the management of new network interface types, such as veth, dummy, VXLAN, VRF or InfiniBand (IPoIB). 
  • Wireless functionality was improved by integrating Netplan with NetworkManager on desktop systems, adding support for WPA3 and adding the notion of a regulatory-domain, to choose proper frequencies for specific regions. 
  • To improve maintainability, we moved to Meson as Netplan’s buildsystem, added upstream CI coverage for multiple Linux distributions and integrations (such as Debian testing, NetworkManager, snapd or cloud-init), checks for ABI compatibility, and automatic memory leak detection. 
  • We increased consistency between the supported backend renderers (systemd-networkd and NetworkManager), by matching physical network interfaces on permanent MAC address, when the match.macaddress setting is being used, and added new hardware offloading functionality for high performance networking, such as Single-Root IO Virtualisation virtual function link-aggregation (SR-IOV VF-LAG).

The much improved Netplan documentation, which is now hosted on “Read the Docs”, and new command line subcommands, such as netplan status, make Netplan a well-suited tool for declarative network management and troubleshooting.

Integrations

Those changes pave the way to integrate Netplan in 3rd party projects, such as system installers or cloud deployment methods. By shipping the new python3-netplan Python bindings to libnetplan, it is now easier than ever to access Netplan functionality and network validation from other projects. We are proud that the Debian Cloud Team chose Netplan to be the default network management tool in their official cloud images for Debian Bookworm and beyond. Ubuntu’s NetworkManager package now uses Netplan as its default backend on Ubuntu 23.10 Desktop systems and beyond. Further integrations happened with cloud-init and the Calamares installer.

Please check out the Netplan version 1.0 release on GitHub! If you want to learn more, follow our activities on Netplan.io, GitHub, Launchpad, IRC or our Netplan Developer Diaries blog on Discourse.

Multi-Cloud Hosting: Benefits and Implementation Strategies

In today’s fast-moving digital landscape, multi-cloud hosting is not just a buzzword but an essential component for companies that strive for robustness, flexibility and scalability in their online presence. While traditional web hosting relies on a single server or a single cloud provider, multi-cloud hosting distributes resources across multiple cloud platforms. This innovative hosting solution offers a unique combination of benefits, including improved resilience, optimised performance and increased flexibility, which makes it particularly attractive to web hosting experts and tech-savvy companies. However, implementing a multi-cloud strategy can bring complex challenges, from data migration to security and cost control. A thorough understanding of both the benefits and the implementation strategies is therefore crucial to fully realise the potential of multi-cloud hosting.

1. Understanding the Multi-Cloud Environment: Fundamentals and Key Players

Multi-cloud hosting means using cloud services from different providers to create a diversified hosting environment. This strategy allows companies to take advantage of the strengths of individual cloud providers while reducing dependencies and improving resilience. The key players in the multi-cloud environment include large cloud platforms such as Amazon Web Services, Microsoft Azure, Google Cloud Platform and IBM Cloud, each of which offers its own services and pricing models. By understanding the specific features and offerings of each provider, companies can develop a multi-cloud strategy that meets their particular needs. It is essential to evaluate factors such as performance, security, compliance and cost in order to design a balanced and effective cloud solution.

2. The Benefits of Multi-Cloud Hosting: Flexibility, Scalability and Risk Reduction

The decisive advantage of multi-cloud hosting lies in its flexibility and scalability. Companies can allocate or release resources dynamically to respond to fluctuations in demand, without being bound by the limits of a single provider. This flexibility enables optimal performance and efficiency, especially when handling peak loads or globally distributed users. In addition, diversifying cloud services helps to reduce risk: because dependence on a single provider decreases, resilience against outages and other disruptions improves.

3. Strategies for Implementing Multi-Cloud Hosting: Best Practices for Experts

Implementing multi-cloud hosting requires careful planning and strategy. Best practices include assessing your own needs and goals, selecting compatible cloud services, and developing coherent data management and a governance structure. It is also important to consider aspects such as network design, security policies and cost management. By developing a comprehensive implementation plan, training staff and setting up effective monitoring and management tools, companies can maximise the benefits of multi-cloud hosting while minimising potential pitfalls.

4. Challenges and Solutions in Multi-Cloud Hosting: Security, Data Management and Cost Control

Despite its many benefits, multi-cloud hosting also brings specific challenges, particularly in the areas of security, data management and cost control. Security in a multi-cloud environment requires security policies and procedures to be applied consistently across all cloud services. Data management in a multi-cloud environment requires effective solutions for data integration, data quality and data lifecycle management, in order to avoid silos and ensure data integrity. Cost control in multi-cloud environments requires transparent billing models and continuous monitoring to avoid unexpected costs. By addressing these challenges with strategic solutions, companies can maximise the efficiency and security of their multi-cloud platforms.