System Update

From Yocto Project

Revision as of 12:10, 7 December 2016

Introduction

This page compares different system update mechanisms. Its purpose is to help the project pick a suitable mechanism to support going forward. Users may also find this page useful for choosing a mechanism that suits their specific needs.

A system update mechanism must ensure that a device running an older release of the operating system ends up running a more recent release once the update completes. This includes updating everything that defines the system (rootfs, kernel, bootloader, etc.), restarting running processes and potentially rebooting. An ideal mechanism:

  • never ends up in an inconsistent state (atomic update),
  • always keeps the device usable (fallback to previous state when there are problems, or at least supporting a recovery mode),
  • requires little additional resources (disk space, RAM),
  • minimizes downtime while updating,
  • works in combination with security technology (integrity protection),
  • is secure (does not install or execute software created by an attacker).
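
The atomic-update requirement can be illustrated at the level of a single file. The following is a minimal sketch, not taken from any of the mechanisms compared on this page; the function name `atomic_write` and the `.update-` prefix are made up for illustration. It writes the new content to a temporary file and then renames it over the old one, so a reader sees either the old or the new content but never a mix.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the final
    # rename stays within one filesystem (required for an atomic rename).
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".update-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new data to disk before renaming
        os.replace(tmp_path, path)  # atomic on POSIX filesystems
    except BaseException:
        # The old file is untouched; only the partial copy is removed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A version that must also survive power loss would additionally sync the containing directory after the rename. Whole-system mechanisms apply the same idea at a larger scale, for example by switching between two rootfs partitions.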

These requirements conflict, so different mechanisms have different strengths and weaknesses. The first section therefore defines the different aspects in more detail and compares the mechanisms in a table. The following sections then describe each mechanism in more detail.

A similar comparison was done for Automotive Grade Linux (AGL) here: https://lists.linuxfoundation.org/pipermail/automotive-discussions/2016-May/002061.html

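Some talks at Embedded Linux Conference presented an overview of the current mechanisms:

  • How do you update your embedded Linux devices? by Daniel Sangorrin / Keijiro Yano http://events.linuxfoundation.org/sites/events/files/slides/linuxcon-japan-2016-softwre-updates-sangorrin.pdf
  • Comparison of Linux Software Update Technologies by Matt Porter http://events.linuxfoundation.org/sites/events/files/slides/Comparison%20of%20Linux%20Software%20Update%20Technologies_0.pdf. Video at https://youtu.be/pdHV9H9nZks?list=PLbzoR-pLrL6pRFP6SOywVJWdEHlmQE51q
  • Software update for IoT: the current state of play by Chris Simmonds http://de.slideshare.net/chrissimmonds/software-update-for-iot-the-current-state-of-play
  • OSS Remote Firmware Updates for IoT-like Projects by Silvano Cirujano Cuesta http://events.linuxfoundation.org/sites/events/files/slides/OSS_Remote_Firmware_Updates_for_IoT-like_Projects.pdf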

Comparison

Type
Block-based update mechanisms directly modify blocks in the partition(s) that they update, without going through the filesystem. This implies that the partition has to be the same on all devices and that all devices must use exactly the same partition size. File-based update mechanisms modify files and directories. Devices with different partition sizes can therefore use the same update data, and it may be possible to update without a reboot.
Disk layout
Boot loader, number and kind of partitions, ...
Rootfs
The partition which contains the OS. May be strictly read-only (block-based update mechanisms) or read/write (file-based). Some update mechanisms support installing and updating a subset of the full OS.
Updates from
Describes where the update mechanism gets the update from.
Updates what
Describes which parts of the overall system the mechanism updates.
Code stability
Based on how long the code has been in use, personal experience, security track record in existing deployments, etc.
OE/Yocto integration
Whether the mechanism is already available and who supports it.
Resource requirements on server
Affects both build time and long-term storage capacity. Likely to depend on the complexity of the changes.
Resource requirements on client
Amount of temporary disk space, CPU/network load, ..., again for different scenarios.
Failure resilience
Summarizes how the mechanism copes with potential problems.
Complexity
Some mechanisms are harder to use correctly than others (usability). Also includes how difficult it is to set up the mechanism because of its dependencies.
Downtime
How long normal operation of the device needs to be interrupted for an update.
Security
Compatibility with other technology, protection of the update mechanism itself.
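
To make the "Type" distinction above more concrete, here is a small sketch of the block-based idea, assuming partitions are modelled as in-memory byte strings. The names `changed_blocks` and `apply_blocks` and the block size are illustrative and do not come from any of the tools compared here. The size check mirrors the constraint that all devices must use exactly the same partition size.

```python
from typing import Dict

BLOCK_SIZE = 4096  # illustrative block size, not taken from any tool


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Return {block index: new block content} for every block that differs."""
    if len(old) != len(new):
        raise ValueError("block-based updates need identical partition sizes")
    delta = {}
    for offset in range(0, len(new), block_size):
        block = new[offset:offset + block_size]
        if block != old[offset:offset + block_size]:
            delta[offset // block_size] = block
    return delta


def apply_blocks(partition: bytearray, delta: Dict[int, bytes], block_size: int = BLOCK_SIZE) -> None:
    """Write each changed block at its fixed offset, without using a filesystem."""
    for index, block in delta.items():
        start = index * block_size
        partition[start:start + len(block)] = block
```

Because blocks are addressed by offset, the result is only correct if the target partition started out bit-identical to `old`. A file-based mechanism addresses content by path instead, which is why it tolerates different partition sizes.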
The mechanisms are compared along the following columns: Type, Disk layout, Rootfs, Updates from, Updates what, Code stability, OE/Yocto integration, Resource requirements (on server, on client), Failure resilience, Complexity, Downtime, Security.

Mechanism: swupd
  • Type: file-based
  • Disk layout: All files in a single partition. Arbitrary disk layout, filesystem and boot mechanism.
  • Rootfs: Read/write. Files provided by the OS are read-only, everything else is read/write (/etc, /var). The OS can be split up into a core OS (always installed) and optional bundles which may or may not be installed.
  • Updates from: HTTP(S) server
  • Updates what: Files in the rootfs, boot loader and kernel via plugins
  • Code stability: Used in Clear Linux OS. Code relatively stable, but would benefit from a rewrite (evolved from a prototype).
  • OE/Yocto integration: meta-swupd (work in progress, not part of the Yocto Project). No support for automatic updates at the moment.
  • Resource requirements on server: Build time and storage for each update are linear with the total number of files (file system analysis, zero packs) plus linear with the number of modified files (compression). Optionally, deltas can be prepared from certain previous builds, which is linear with the number of modified files since each of those builds.
  • Resource requirements on client: In the best case (delta prepared by server), a single archive with just some file diffs gets downloaded, unpacked and applied. In other cases, each new or modified file gets downloaded and unpacked. Staging new content needs free space in the rootfs partition, i.e. the partition must be at least twice as large as the base OS.
  • Failure resilience: No recovery mechanism built into swupd itself. There is a short period of time during which an interrupted update may leave behind an inconsistent rootfs. No updates are possible when there is not enough free space left.
  • Complexity: The upgrade path must be considered as part of the release process (deltas, incompatible changes).
  • Downtime: Downloading and staging happen in parallel to normal operation. Services are kept running until after the update, at which point the device admin needs to restart services or reboot (needs to be automated).
  • Security: Compatible with Linux IMA, Smack, SELinux. Incompatible with dm-verity. Relies on HTTPS and (optionally) signing to protect the integrity of downloaded files.

Mechanism: sbabic's swupdate
  • Type: block-based
  • Disk layout: Arbitrary disk layout, kernel, filesystem and boot mechanism.
  • Rootfs: Active and passive partition in dual-copy mode. In rescue mode, the device must boot from a maintenance image (which could be in another partition on the same disk).
  • Updates from: Local provisioning of an image file (USB, SD), generic URL, integrated (mongoose) HTTP server, external backend (Hawkbit server).
  • Updates what: See note.
  • Code stability: Code relatively stable, 3-month release cycle.
  • OE/Yocto integration: meta-swupdate (not part of the Yocto Project).
  • Resource requirements on server: Only needs to build the update image and package it with a configuration file (sw-description).
  • Resource requirements on client: dual-copy: needs at least two partitions (active/passive). rescue: see note.
  • Failure resilience: There is no recovery mechanism built in. If the bootloader is able to detect a failed boot, it could boot into maintenance mode again.
  • Complexity: This updater is very easy to use and set up, even when not using the OE layer.
  • Downtime: rescue: the device must reboot into maintenance mode and, once the image is installed, reboot again into the new production image. dual-copy: the update is downloaded and applied during normal operation. Afterwards one reboot is required, no other downtime.
  • Security: Signed images, HTTPS, encrypted images.

Mechanism: Mender
  • Type: block-based
  • Disk layout: Dual rootfs partitions with two extra partitions, a boot and a data partition.
  • Rootfs: Active and passive partition, read/write while in use. The kernel is stored on the rootfs.
  • Updates from: Mender server and/or HTTPS server
  • Updates what: Complete rootfs, including kernel.
  • Code stability: The master branch is under development. There is a stable branch, but it is lacking many basic features.
  • OE/Yocto integration: meta-mender. Not part of the Yocto Project.
  • Resource requirements on server: Needs to store one compressed rootfs image for each update, plus a small metadata section.
  • Resource requirements on client: Needs active/passive rootfs partitions and bandwidth to download the compressed rootfs image. Needs no additional space on the device beyond the partitions, space for the Mender binary, and a tiny local database.
  • Failure resilience: Automatic rollback if the device either fails to boot, or the Mender daemon cannot connect to the Mender server afterwards.
  • Complexity: Relatively easy to build with Yocto. More complex if not using Yocto.
  • Downtime: The update is downloaded and applied during normal operation. Afterwards one reboot is required, no other downtime.
  • Security: Secure connection to the server (TLS). Signing is not currently supported, but planned.
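
The active/passive layout and the rollback rule described for Mender above (roll back if the device fails to boot or cannot reach the server afterwards) can be sketched as a tiny state machine. This is an illustrative model only, not Mender's implementation; the class `DualRootfs`, its slot names "A" and "B" and its method names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DualRootfs:
    active: str = "A"      # slot the device currently runs from
    pending: bool = False  # True between installing and committing an update

    @property
    def passive(self) -> str:
        return "B" if self.active == "A" else "A"

    def install(self) -> str:
        """Model writing the new image to the passive slot and booting into it."""
        target = self.passive
        self.active, self.pending = target, True
        return target

    def after_reboot(self, booted: bool, server_reachable: bool) -> str:
        """Commit the new slot, or fall back to the previous one."""
        if self.pending and not (booted and server_reachable):
            self.active = self.passive  # the previous slot is still intact
        self.pending = False
        return self.active
```

For example, `DualRootfs().install()` returns `"B"`, and a following `after_reboot(booted=False, server_reachable=True)` returns `"A"` because the untouched previous slot becomes active again. On real hardware the "did it boot" decision typically has to be made by the bootloader, which this sketch does not model.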

TODO: add OSTree (https://bugzilla.yoctoproject.org/show_bug.cgi?id=10704) and mender.io (https://bugzilla.yoctoproject.org/show_bug.cgi?id=10703)

TODO: move text from table into the sections of the individual mechanisms to keep the table short. Perhaps just keep a rating ("low/middle/high complexity") above in some of the columns? Rating would be relative to each other.

swupd

TODO: when using swupd purely as update mechanism (i.e. no bundles), space requirement on the server could be reduced to linear with the number of modified files by not creating the zero packs.
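
The phrase "linear with the number of modified files" can be illustrated with a sketch of a file-based update plan. This is not swupd's manifest format or algorithm; `build_manifest` and `plan_update` are hypothetical helpers that hash every file and then compare two builds, so that only new or modified files need to be stored or downloaded.

```python
import hashlib
from typing import Dict, List, Tuple


def build_manifest(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each path to the SHA-256 digest of its content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def plan_update(old: Dict[str, str], new: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (paths to download, paths to delete) to get from old to new."""
    download = sorted(p for p, digest in new.items() if old.get(p) != digest)
    delete = sorted(p for p in old if p not in new)
    return download, delete
```

Building the manifest touches every file (linear with the total number of files), while the resulting download list only grows with the number of files that actually changed.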

SWUpdate

Disk layout
There is no constraint on how software is stored. SWUpdate supports raw flash (NOR, NAND), UBI volumes and disk partitions, and can also update files (provided in a tarball) in an existing filesystem. Each artifact can be stored on a different storage device.
Rootfs
There is no constraint on where software is stored. During an update, a single partition, multiple partitions or, more generally, multiple different storage devices can be updated.
SWUpdate is often used in one of the following setups:
  • rescue: The system reboots into maintenance mode and SWUpdate is started from a ramdisk. Only one copy of the software is stored on the system.
  • dual-copy: Two copies of the software (rootfs, kernel) are stored on the system and SWUpdate installs the update into the stand-by copy.
Updates from
  • local provisioning: USB, SD, etc.
  • generic URL: HTTP(S), FTP. SWUpdate uses the libcurl library and supports the protocols that libcurl provides.
  • Webserver: SWUpdate integrates a web server (mongoose).
  • External backend connector (suricatta mode), which binds SWUpdate to an external backend server. Currently, the Hawkbit server is supported (https://github.com/eclipse/hawkbit). Open to further backends.
Updates what
  • bootloader (risky!)
  • kernel
  • interface to the bootloader (U-Boot): allows changing U-Boot variables; plugins can be used to make changes to other bootloaders.
  • disk partitions
  • interface to update FPGAs, external microcontrollers, etc.
  • custom handlers: an interface allows adding custom installers written in C or Lua.
Resources on client
  • rescue: meta-swupdate provides a recipe to generate a compressed ramdisk with a small footprint. Including support for signed images, the whole rootfs is ~4MB. The minimal requirement for a complete rescue system (bootloader, kernel for SWUpdate and ramdisk) is 8MB. This makes it possible to put the rescue system in small storage such as SPI-NOR, while the software is stored on another, bigger device (NAND, eMMC).
  • dual-copy: Needs active/passive rootfs partitions and bandwidth to download the compressed rootfs image. No additional space is required if the image is streamed directly to the stand-by copy.
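
The remark about streaming the image directly to the stand-by copy can be sketched as a chunked copy. This is an illustrative example and not SWUpdate code; `stream_to_standby` and `CHUNK_SIZE` are names chosen for this sketch. Only one chunk is held in RAM at a time, and no temporary copy of the image is written anywhere else.

```python
import hashlib
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # illustrative; bounds the RAM used during the copy


def stream_to_standby(source: BinaryIO, standby: BinaryIO, chunk_size: int = CHUNK_SIZE) -> str:
    """Copy source to standby chunk by chunk and return the SHA-256 of the data."""
    digest = hashlib.sha256()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        standby.write(chunk)
        digest.update(chunk)
    return digest.hexdigest()
```

In practice `source` could be an HTTP response object and `standby` the stand-by partition's device node opened for binary writing; both are assumptions here. The returned digest can be compared against an expected value before the stand-by copy is marked bootable.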
Downtime
  • rescue: The device must reboot into maintenance mode and, once the image is installed, reboot again into the new production image.
  • dual-copy: The update is downloaded and applied during normal operation. Afterwards one reboot is required, no other downtime.
Security
  • HTTPS connection to the external server.
  • Signed images: the images used for the update can be signed so that their integrity can be checked.
  • Split into several processes: the process connected to the internet can run with a different user ID / group ID than the installer. The installer often runs with high privileges because it has to write to the hardware.
  • Support for encrypted artifacts
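
The idea behind signed images can be sketched as a two-step check: first authenticate a manifest, then compare every artifact against the digest listed in it. The sketch below is an assumption-laden illustration and not SWUpdate's actual format or verification code. In particular, it uses an HMAC from the Python standard library as a stand-in for a real signature; a real deployment would use an asymmetric signature so that the device only holds a public key. The manifest layout (a JSON object mapping artifact names to SHA-256 digests) and the function names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict


def sign_manifest(manifest: bytes, key: bytes) -> bytes:
    """Stand-in for a real signature: HMAC-SHA256 over the manifest bytes."""
    return hmac.new(key, manifest, hashlib.sha256).digest()


def verify_update(manifest: bytes, tag: bytes, key: bytes, artifacts: Dict[str, bytes]) -> bool:
    """Accept the update only if the manifest is authentic and every artifact matches it."""
    if not hmac.compare_digest(sign_manifest(manifest, key), tag):
        return False  # manifest was modified or created with another key
    expected = json.loads(manifest)  # {"artifact name": "sha256 hex digest"}
    if set(expected) != set(artifacts):
        return False  # missing or unexpected artifact
    return all(
        hmac.compare_digest(hashlib.sha256(data).hexdigest(), expected[name])
        for name, data in artifacts.items()
    )
```

Checking the manifest before looking at any artifact means that nothing supplied by an attacker is parsed or installed unless the authenticity check has passed first.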