A couple of days ago there was an incident involving Raspberry Pi devices running Raspbian. During a period of time devices could brick if you perform an ‘apt-get update && apt-get upgrade’. In this case they would brick by losing most of their functionality and if restarted they would not boot anymore.
The problem, in this case, was that a misconfigured package was uploaded (raspi-copies-and-fills) to the official repositories and if updated it would lead to a broken system. The change in the upstream repositories was quickly reverted when people started reporting problems but it existed long enough to significantly affect users.
The impact for people with physical access to their devices was not as severe as there is a quick fix which involves removing the SD card from your device, plug it into your PC and removing the ‘etc/ld.so.preload’ file.
For people who manage distributed Raspberry Pi devices with only remote access, there is no quick fix and the common catchall SSH does not work either as you were not able to launch new applications which is required as the sshd daemon is started on-demand. One needs to travel to each device to manually recovery it either by using the workaround presented or re-writing SD cards.
You can read more about the incident at the following links (blog posts and Raspberry Pi forums):
- Raspberry Pi: update breaks Raspbian Stretch
- raspi-copies-and-fills package gone AWOL?
- Ras3 crashes after update
- Updating "raspi-copies-and-fills" destroys my system
This incident clearly highlights the drawbacks of using a standard Linux package manager for deploying software Over-the-Air to distributed devices. It is not possible to automatically roll back in case there are problems and there is also no way to validate or sanity check the packages before installing them on your active system.
Ensuring a robust update process that never leaves devices bricked has always been one of the main goals of Mender, which utilizes a dual A/B system layout.
During the update process, the Mender client writes the updated image to the rootfs partition that is not running and configures the bootloader to boot from the updated rootfs partition. The device is then rebooted. If booting the updated partition fails, the partition that was running is booted instead, ensuring that the device does not get bricked. If the boot succeeds, Mender sets the updated partition to boot permanently when Mender starts as part of the boot process.
Persistent data can be stored in the data partition, which is left unchanged during the update process.
We do also support Raspbian and you can follow this tutorial to get started.