System Update
Introduction
This page compares different system update mechanisms. Its purpose is to help the project pick a suitable mechanism to support going forward. Users may also find this page useful for picking a mechanism that suits their specific needs.
A system update mechanism must ensure that a device running an older release of the operating system runs a more recent release once the update is done. This includes updating everything that defines the system (rootfs, kernel, bootloader, etc.), restarting running processes and potentially rebooting. An ideal mechanism:
- never ends up in an inconsistent state (atomic update),
- always keeps the device usable (fallback to previous state when there are problems, or at least supporting a recovery mode),
- requires little additional resources (disk space, RAM),
- minimizes downtime while updating,
- works in combination with security technology (integrity protection),
- is secure (does not install or execute software created by an attacker).
These are conflicting requirements. Different mechanisms will have different strengths and weaknesses. Therefore the first chapter provides a more detailed definition of the different aspects and has a table comparing the mechanisms. The following sections then describe each mechanism in more detail.
A similar comparison was done for Automotive Grade Linux (AGL) here: https://lists.linuxfoundation.org/pipermail/automotive-discussions/2016-May/002061.html
Some talks at Embedded Linux Conference presented an overview of the current mechanisms:
- How do you update your embedded Linux devices? by Daniel Sangorrin / Keijiro Yano http://events.linuxfoundation.org/sites/events/files/slides/linuxcon-japan-2016-softwre-updates-sangorrin.pdf
- Comparison of Linux Software Update Technologies by Matt Porter http://events.linuxfoundation.org/sites/events/files/slides/Comparison%20of%20Linux%20Software%20Update%20Technologies_0.pdf
- Software update for IoT: the current state of play by Chris Simmonds http://de.slideshare.net/chrissimmonds/software-update-for-iot-the-current-state-of-play
- OSS Remote Firmware Updates for IoT-like Projects by Silvano Cirujano Cuesta http://events.linuxfoundation.org/sites/events/files/slides/OSS_Remote_Firmware_Updates_for_IoT-like_Projects.pdf
Comparison
- Type
- Block-based update mechanisms directly modify blocks in the partition(s) that they update, without going through the filesystem. This implies that the partition content is identical on all devices and that all devices must use exactly the same partition size. File-based update mechanisms modify files and directories. Therefore devices with different partition sizes can use the same update data, and it may be possible to update without a reboot.
- Disk layout
- Boot loader, number and kind of partitions, ...
- Rootfs
- The partition which contains the OS. May be strictly read-only (block-based update mechanisms) or read/write (file-based). Some update mechanisms support installing and updating a subset of the full OS.
- Updates from
- Describes where the update mechanism gets the update from.
- Updates what
- Describes which parts of the overall system the mechanism updates.
- Code stability
- Based on how long the code has been in use, personal experience, security track record in existing deployments, etc.
- OE/Yocto integration
- Whether the mechanism is already available and who supports it.
- Resource requirements on server
- Affect both build time and long-term storage capacity. Likely to depend on the complexity of the changes.
- Resource requirements on client
- Amount of temporary disk space, CPU/network load, ..., again for different scenarios.
- Failure resilience
- Summarizes how the mechanism copes with potential problems.
- Complexity
- Some mechanisms are harder to use correctly than others (usability). Also includes how difficult it is to set up the mechanism because of dependencies.
- Downtime
- How long normal operation of the device needs to be interrupted for an update.
- Security
- Compatibility with other technology, protection of the update mechanism itself.
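The difference between the two types can be illustrated with a minimal Python sketch. It is not taken from any of the mechanisms compared here: the function names are hypothetical, regular files stand in for partitions, and the strict size check simply mirrors the "exactly the same partition size" constraint stated above.

```python
import filecmp
import os
import shutil


def apply_block_update(image_path, partition_path):
    """Overwrite a partition with a raw image, bypassing the filesystem.

    Assumption: partition_path is a regular file standing in for a
    partition; a real block device reports its size differently.
    """
    if os.path.getsize(image_path) != os.path.getsize(partition_path):
        raise ValueError("image and partition sizes differ")
    with open(image_path, "rb") as src, open(partition_path, "r+b") as dst:
        shutil.copyfileobj(src, dst)
        dst.flush()
        os.fsync(dst.fileno())


def apply_file_update(new_tree, rootfs):
    """Copy only new or changed files from new_tree into rootfs."""
    updated = []
    for dirpath, _dirnames, filenames in os.walk(new_tree):
        rel_dir = os.path.relpath(dirpath, new_tree)
        target_dir = os.path.normpath(os.path.join(rootfs, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            # Compare file contents, not just size and timestamps.
            if not os.path.exists(dst) or not filecmp.cmp(src, dst, shallow=False):
                shutil.copy2(src, dst)
                updated.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(updated)
```

The block-based function never looks inside the image, which is why it depends on the partition size, while the file-based function depends only on the directory tree and can leave unchanged files alone.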
Mechanism | Type | Disk layout | Rootfs | Updates from | Updates what | Code stability | OE/Yocto integration | Resource requirements on server | Resource requirements on client | Failure resilience | Complexity | Downtime | Security |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
swupd | file-based | All files in a single partition. Arbitrary disk layout, filesystem and boot mechanism. | Read/write. Files provided by the OS are read-only, everything else is read/write (/etc, /var). OS can be split up into a core OS (always installed) and optional bundles which may or may not be installed. | HTTP(S) server | Files in the rootfs, boot loader and kernel via plugins | Used in Clear Linux OS. Code relatively stable, but would benefit from a rewrite (evolved from a prototype). | meta-swupd (work in progress, not part of the Yocto project). No support for automatically updating at the moment. | Build time and storage for each update linear with total number of files (file system analysis, zero packs) plus linear with number of modified files (compression). Optionally can prepare deltas from certain previous builds, which is linear with the number of modified files since each of those builds. | In the best case (delta prepared by server), a single archive with just some file diffs gets downloaded, unpacked and applied. In other cases, each new or modified file gets downloaded and unpacked. Staging new content needs free space in the rootfs partition, i.e. partition must be at least twice as large as the base OS. | No recovery mechanism built into swupd itself. Short period of time where interrupted update may leave behind inconsistent rootfs. No updates possible when there is not enough free space left. | Upgrade path must be considered as part of release process (deltas, incompatible changes) | Downloading and staging in parallel to normal operation. Services are kept running until after the update, at which point the device admin needs to restart services or reboot (needs to be automated). | Compatible with Linux IMA, Smack, SELinux. Incompatible with dm-verity. Relies on HTTPS and (optionally) signing to protect integrity of downloaded files. |
sbabic's swupdate | block-based | A complete partition. Arbitrary disk layout, kernel, filesystem and boot mechanism. | Read-only. Partition must not be mounted. Device must boot from a maintenance image (which could be in another partition on the same disk). | Local provisioning of an image file, or via the mongoose HTTP server. | Complete partition. Can change u-boot variables and can use plugins to make changes to other bootloaders. | Code relatively stable, but there is only a master branch and no versioning. | meta-swupdate (not part of the Yocto Project). | Only needs to build the update image and package it with a configuration file (sw-description). | Needs at least two partitions: production and maintenance. To perform the update, the image file is downloaded to the maintenance partition, so that partition must be big enough to hold the image file. | No recovery mechanism built in. If the bootloader can detect that a boot failed, it could boot into maintenance mode again. | Very easy to use and set up, even when not using the OE layer. | Needs a reboot into maintenance mode and, once the image is installed, another reboot into the new production image. | Images used for the update can be signed so that their integrity can be checked. |
Mender | block-based | Dual rootfs partitions with two extra partitions, a boot and a data partition. | Active and passive partition, read/write while in use. Kernel is stored on rootfs. | Mender server and/or HTTPS server | Complete rootfs, including kernel. | Master branch under development. There is a stable branch, but it lacks many basic features. | meta-mender. Not part of the Yocto Project. | Needs to store one compressed rootfs image for each update, plus a small metadata section. | Needs active/passive rootfs partition, and bandwidth to download compressed rootfs image. Needs no additional space on device beyond the partitions, space for the Mender binary, and a tiny local database. | Automatic rollback if the device either fails to boot, or the Mender daemon cannot connect to the Mender server afterwards. | Relatively easy to build with Yocto. More complex if not using Yocto. | Update is downloaded and applied during normal operation. Afterwards one reboot is required, no other downtime. | Secure connection to the server (TLS). Signing not currently supported, but planned. |
TODO: add OSTree (https://bugzilla.yoctoproject.org/show_bug.cgi?id=10704) and mender.io (https://bugzilla.yoctoproject.org/show_bug.cgi?id=10703)
TODO: move text from table into the sections of the individual mechanisms to keep the table short. Perhaps just keep a rating ("low/middle/high complexity") above in some of the columns? Rating would be relative to each other.
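The failure resilience column in the table mentions automatic rollback for a dual-rootfs layout: the new image is tried once and kept only if the device boots and can reach its server. The following Python sketch models that decision only. It is an illustration under stated assumptions, not Mender's implementation; the class, its methods and the two boolean health checks are hypothetical.

```python
class DualRootfs:
    """Hypothetical model of an active/passive rootfs pair with rollback."""

    def __init__(self, active_version):
        self.slots = {"A": active_version, "B": None}
        self.active = "A"
        self.pending = None  # slot to be booted on trial, not yet committed

    def _passive(self):
        return "B" if self.active == "A" else "A"

    def install(self, version):
        """Write the new image to the passive slot and mark it for a trial boot."""
        slot = self._passive()
        self.slots[slot] = version
        self.pending = slot

    def reboot(self, boots_ok, server_reachable):
        """Try the pending slot once; keep it only if both checks pass."""
        if self.pending is not None:
            if boots_ok and server_reachable:
                self.active = self.pending  # commit the update
            # Otherwise roll back: the previously active slot stays active.
            self.pending = None
        return self.slots[self.active]
```

Because the running rootfs is never modified, a failed trial boot costs one extra reboot but leaves the previous release intact, at the price of reserving a second rootfs partition.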
swupd
TODO: when using swupd purely as update mechanism (i.e. no bundles), space requirement on the server could be reduced to linear with the number of modified files by not creating the zero packs.
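To make "linear with the number of modified files" concrete, the set of files that would need new server-side storage can be found by comparing per-file hashes between two builds. The sketch below is a hedged illustration in Python; the function names and the dictionary-based manifest are hypothetical and do not reflect swupd's actual manifest format or tooling.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file path (relative to root) to the SHA-256 of its content."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            manifest[os.path.relpath(path, root)] = digest
    return manifest


def files_to_publish(old_manifest, new_manifest):
    """Return new or modified files; unchanged files need no new storage."""
    return sorted(
        path
        for path, digest in new_manifest.items()
        if old_manifest.get(path) != digest
    )


def files_to_delete(old_manifest, new_manifest):
    """Return files that existed in the old build but not in the new one."""
    return sorted(set(old_manifest) - set(new_manifest))
```

With this approach the work and storage per update scale with the size of the returned lists rather than with the total number of files in the OS.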
SWUpdate
TODO: additional explanations?
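The table notes that update images can be signed so that their integrity can be checked. A simpler, related step is to compare the image's SHA-256 digest against an expected value before installing it. The Python sketch below shows only that step and is not SWUpdate's implementation; the function names are hypothetical, and the expected digest is assumed to come from a trusted source such as signed metadata, since a digest alone does not authenticate the image.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large images need little RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_is_intact(path, expected_hex):
    """Return True only if the image matches the expected SHA-256 digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())
```

An updater would call `image_is_intact()` after the download and refuse to write the image to the target partition when it returns `False`.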