dbus: rauc: replace tacd-based update polling with native update polling #90
hnez wants to merge 35 commits into linux-automation:main
I've created a RAUC feature request to support service config reloading: rauc/rauc#1709
I have decided to base this PR on #105 so I do not have to worry about conflicting changes too much.
989ed7d to e3e26fc
I think we could review and merge this and carry the RAUC changes as patches in
    while let Some(ev) = status.next().await {
        info!("Current status: {} ({})", ev.active_state, ev.sub_state);

        if ev.active_state == "active" {
            break;
        }
    }
Shouldn't we handle the case where the state doesn't end up as "active" here?
For example with a timeout?
I have moved away from the `tacd::broker::Topic`-based Rube Goldberg machine for restarting the service and now use the D-Bus endpoint directly to trigger the restart and get immediate feedback.
This should remove the need for timeouts on our side. Instead, the timeouts from the RAUC service file (or the defaults) should take effect.
It turns out some amount of Rube Goldberg machinery is actually required to reload/restart a systemd service and be notified about the result of this action.
I have just pushed an updated version in the form of 6a64740.
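As a rough illustration of the concern raised in this thread (and not the code that landed in 6a64740), a wait loop can treat a failed unit and an overall deadline as explicit outcomes instead of looping until "active" shows up. The sketch below uses a standard library channel in place of the async status stream; `UnitStatus`, `WaitError` and `wait_for_active` are hypothetical names chosen for this example.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the status events seen in the review snippet.
#[derive(Debug, Clone, PartialEq)]
pub struct UnitStatus {
    pub active_state: String,
    pub sub_state: String,
}

/// Ways in which waiting for the unit can end without success.
#[derive(Debug, PartialEq)]
pub enum WaitError {
    /// The unit reported the "failed" active state (carries the sub state).
    Failed(String),
    /// The deadline passed before the unit became active.
    Timeout,
    /// The event source went away before the unit became active.
    Closed,
}

/// Wait until the unit reports "active", bounded by one overall deadline.
pub fn wait_for_active(rx: &Receiver<UnitStatus>, timeout: Duration) -> Result<(), WaitError> {
    let deadline = Instant::now() + timeout;

    loop {
        // Time left until the overall deadline (zero once it has passed).
        let remaining = deadline.saturating_duration_since(Instant::now());

        match rx.recv_timeout(remaining) {
            Ok(ev) if ev.active_state == "active" => return Ok(()),
            Ok(ev) if ev.active_state == "failed" => return Err(WaitError::Failed(ev.sub_state)),
            // Intermediate states such as "activating": keep waiting.
            Ok(_) => continue,
            Err(RecvTimeoutError::Timeout) => return Err(WaitError::Timeout),
            Err(RecvTimeoutError::Disconnected) => return Err(WaitError::Closed),
        }
    }
}
```

Whether such a deadline is needed at all depends on whether the timeouts from the RAUC service file already bound the wait, as argued above.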
    }

    #[cfg(not(feature = "demo_mode"))]
    pub(super) fn update_from_poll_status(&mut self, poll_status: zvariant::Dict) -> Result<bool> {
This would benefit from some documentation. How do we find the channel to update from the RAUC status information?
    let inhibit_files = primary_channel
        .inhibit_files
        .as_deref()
        .unwrap_or("/var/run/tacd/inhibit/dut-pwr;/var/run/tacd/inhibit/setup-mode");
    -     .unwrap_or("/var/run/tacd/inhibit/dut-pwr;/var/run/tacd/inhibit/setup-mode");
    +     .unwrap_or("/run/tacd/inhibit/dut-pwr;/run/tacd/inhibit/setup-mode");
Does this cover any existing paths in /var/run/tacd?
I am not sure whether I understand the question. We generate/remove these inhibit files in the tacd and check their existence in RAUC.
d9272db to adb2b67
RAUC is in the process of adding native polling support, which we want to integrate into the tacd. To make the switch reviewable, first remove the tacd-based polling and then add the native support in separate commits.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
RAUC native update polling only supports a single update channel,
while our tacd-based update polling supported multiple
(all channels which RAUC would have accepted updates from,
based on the enabled signing certificates, were polled for updates and the
user was asked if they wanted to install updates from them).
Prepare for the change by adding a concept of a single primary update
channel. The primary channel is the first enabled one, based on the
channel definition file name.
E.g. on production TACs these channel files are available:
root@lxatac-00011:~# ls /usr/share/tacd/update_channels/
01_stable.yaml 05_testing.yaml
They are sorted by name when they are read from disk, so if both
`stable.cert.pem` and `testing.cert.pem` are found
in `/etc/rauc/certificates-enabled/`, then the stable channel will be
the primary channel, but bundles from the testing channel may still be
installed via the command line interface (e.g. to facilitate a channel
switch).
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
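The selection rule from this commit message (sort by definition file name, take the first enabled channel) can be sketched as follows. This is a minimal illustration, not the tacd implementation; the `Channel` struct and `primary_channel` function are hypothetical and only carry the two properties the rule needs.

```rust
/// Hypothetical, trimmed-down view of an update channel definition.
#[derive(Debug, Clone, PartialEq)]
pub struct Channel {
    /// File name of the definition, e.g. "01_stable.yaml".
    pub file_name: String,
    /// Whether the matching signing certificate is enabled.
    pub enabled: bool,
}

/// Return the enabled channel whose definition file name sorts first.
pub fn primary_channel(channels: &[Channel]) -> Option<&Channel> {
    channels
        .iter()
        .filter(|c| c.enabled)
        .min_by(|a, b| a.file_name.cmp(&b.file_name))
}
```

With both `01_stable.yaml` and `05_testing.yaml` enabled this picks the stable channel. With only the testing certificate enabled, the testing channel becomes the primary one.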
This restricts the sources that the `/v1/tac/update/install` endpoint will accept update requests from to only the primary channel. The web interface has not exposed the feature to install arbitrary URLs for some time now, and users who want to do so are better served by the command line interface.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
This configures RAUC to poll for updates on our behalf. We do not use the information yet or enable automatic installation, but those are next steps. We also need to trigger RAUC to re-read the file for this to be useful. All of these features are added in follow-up commits.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
We will need that in the future to implement cleaner service restarts.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
This triggers a reload or restart of the RAUC systemd service (currently RAUC does not support reloads, so it will be a restart) and waits for the result.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
RAUC native polling provides us with information about the recent poll attempts. This includes information about the bundle version and whether it is an update over what is currently running on the device. In other words: it gives us everything we need to show update notifications again. Forward this information to the same places we used with the tacd-based update polling.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
The RAUC native polling interface provides more information than
just the basic `compatible` and `version` fields.
The extra information includes the following:
- `manifest_hash`
By using the `manifest_hash` in the `InstallBundle` call we can
(cryptographically) ensure that the exact bundle (content) that the
user agreed to install is actually being installed and that no switch
has happened in between.
- `effective_url`
This is the bundle URL after all HTTP redirects have been followed.
This is e.g. relevant when a "clever" update server is used that
redirects poll requests to specific bundles to e.g. implement staged
rollouts or to prevent updates from incompatible bundle versions.
By using this one can ensure that the bundle URL used matches
the `manifest_hash` provided and that the redirects did not change
(e.g. because the next step in a staged update was reached)
between the last poll and the user accepting an update.
The update dialog on the LCD is updated to use this mechanism now,
while the web interface will be updated later.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
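The idea behind using both fields is to pin the install request to exactly what the user saw. The sketch below models that in plain Rust and makes no claim about the actual RAUC D-Bus arguments; `PollResult`, `InstallRequest`, `from_poll` and `accepts` are hypothetical names, and `accepts` only stands in for the check that RAUC would perform on the real bundle.

```rust
/// Hypothetical subset of the information reported by a poll.
#[derive(Debug, Clone, PartialEq)]
pub struct PollResult {
    /// Bundle URL after all HTTP redirects have been followed.
    pub effective_url: String,
    /// Hash identifying the exact bundle content.
    pub manifest_hash: String,
}

/// What the user agreed to install, frozen when the dialog was shown.
#[derive(Debug, Clone, PartialEq)]
pub struct InstallRequest {
    pub url: String,
    pub manifest_hash: String,
}

impl InstallRequest {
    /// Pin the request to the poll result that was presented to the user.
    pub fn from_poll(shown: &PollResult) -> Self {
        Self {
            url: shown.effective_url.clone(),
            manifest_hash: shown.manifest_hash.clone(),
        }
    }

    /// Stand-in for the installer-side check: only the bundle whose
    /// manifest hash matches the pinned one may be installed.
    pub fn accepts(&self, actual_manifest_hash: &str) -> bool {
        self.manifest_hash == actual_manifest_hash
    }
}
```

Because the request carries the effective URL rather than the channel URL, a redirect that changes between the poll and the confirmation cannot silently swap the bundle. If the content behind the URL changes anyway, the hash no longer matches.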
Automatic installation and boot of updates can be useful when managing a fleet of devices. This is, however, a feature that requires strict user consent, which is why it is off by default. Add backend support for enabling this feature. Frontend support in the web interface will be added later. We always enable auto-reboot together with auto-install, since the migration scripts only run once at the end of the installation. A system that is updated, but not rebooted, would thus accumulate changes that are not migrated to the other slot.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
Users managing a fleet of devices with custom-built update bundles and update channels may want to automatically enable update polling and automatic installation of updates without having to do so explicitly via the web UI (at least we at Pengutronix do). Enable this use case by adding optional `force_polling` and `force_auto_install` config options to the update channel definition files.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
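One way to read the interaction between the user's settings and the two optional channel options is sketched below. The combination rule is an assumption made for illustration: a forced option switches the feature on regardless of the user setting, auto-install is only effective while polling is on, and auto-reboot follows auto-install as described in the previous commit. The struct and function names are hypothetical.

```rust
/// Hypothetical subset of an update channel definition file.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct ChannelConfig {
    pub force_polling: Option<bool>,
    pub force_auto_install: Option<bool>,
}

/// Hypothetical settings as chosen by the user (e.g. via the web UI).
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct UserSettings {
    pub polling: bool,
    pub auto_install: bool,
}

/// What would end up in the RAUC configuration under the assumed rule.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct EffectiveSettings {
    pub polling: bool,
    pub auto_install: bool,
    pub auto_reboot: bool,
}

pub fn effective_settings(user: UserSettings, channel: &ChannelConfig) -> EffectiveSettings {
    // Assumption: a missing option behaves like `false`.
    let polling = user.polling || channel.force_polling.unwrap_or(false);

    // Auto-install without polling would do nothing, so require polling.
    let auto_install =
        polling && (user.auto_install || channel.force_auto_install.unwrap_or(false));

    // Auto-reboot is always enabled together with auto-install.
    EffectiveSettings { polling, auto_install, auto_reboot: auto_install }
}
```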
The native RAUC polling feature allows configuring when a new update bundle is even considered as an update candidate (`candidate_criteria`), when it is considered for automatic installation (`install_criteria`) and under which conditions to auto-boot into another slot after installation (`reboot_criteria`). The defaults we have chosen in previous commits generally make sense. Still, allow users with custom update channels to customize them if they deem it necessary.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
The struct itself and also the functions that take parts of it as parameters (in particular `channel_list_update_task`) are getting unwieldy. Adding another parameter to `channel_list_update_task` would cross cargo clippy's threshold on "too many arguments". Work around that by splitting the Rauc struct and just passing the config part of it to `channel_list_update_task` as a whole.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
When in setup mode we do not want the system to auto-install updates and suddenly reboot without user input, as that would lead to a bad first experience with the TAC. Instead use an inhibit file to delay the first RAUC update poll until setup mode is exited. This commit does not yet make use of the inhibit file, it only generates and removes it.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
If a DUT is currently powered by the TAC it is unlikely that we should reboot for an update. This commit does not yet make use of the generated inhibit file. It only creates and removes it based on the DUT power state.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
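Both inhibit files follow the same pattern: the file exists while the condition holds and is removed once it no longer does. A minimal sketch of that pattern using only the standard library could look like this; `sync_inhibit_file` is a hypothetical helper and not the function used in the tacd.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Create the inhibit file if `inhibit` is true, remove it otherwise.
///
/// Both directions are idempotent: creating an existing file and
/// removing a missing file are treated as success.
pub fn sync_inhibit_file(path: &Path, inhibit: bool) -> io::Result<()> {
    if inhibit {
        // Make sure the containing directory exists before creating the file.
        if let Some(dir) = path.parent() {
            fs::create_dir_all(dir)?;
        }
        // Only the existence of the file matters, so it is left empty.
        fs::write(path, b"")
    } else {
        match fs::remove_file(path) {
            Ok(()) => Ok(()),
            Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
            Err(e) => Err(e),
        }
    }
}
```

A caller would invoke this whenever the DUT power state or the setup mode state changes, for example with a path below the inhibit directory discussed in the review comments.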
It would be quite surprising (in the negative sense) if the TAC rebooted without a warning while you are setting it up after unboxing. Instead wait for the setup to complete before looking for updates.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
Ideally we would not want to auto-reboot when the TAC is in use, but deciding when that is the case is not easy. One piece of information we do have is whether the DUT is currently powered. In that case we very likely do not want to reboot right now.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
The return value from `InstallerProxy::get_slot_status()` needs some post-processing. Break that out into a separate function, because we will need to call `get_slot_status` from another place soon.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
This function looks up the two rootfs slots and arranges them based on which one is booted and which one is not. This is currently only used in one place, but will be used in another soon.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
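The arrangement step can be pictured as returning a (booted, other) pair. The sketch below is an illustration under assumptions: the `Slot` struct is hypothetical, and identifying rootfs slots by a `rootfs` name prefix is a guess at the naming scheme rather than something stated in this PR.

```rust
/// Hypothetical, trimmed-down slot status entry.
#[derive(Debug, Clone, PartialEq)]
pub struct Slot {
    /// Slot name, assumed here to look like "rootfs.0" or "rootfs.1".
    pub name: String,
    /// Whether the system is currently running from this slot.
    pub booted: bool,
}

/// Return the two rootfs slots as `(booted, other)`.
///
/// Yields `None` if fewer than two rootfs slots are present or if it is
/// not exactly one of the first two that is booted.
pub fn arrange_rootfs_slots(slots: &[Slot]) -> Option<(&Slot, &Slot)> {
    let mut rootfs = slots.iter().filter(|s| s.name.starts_with("rootfs"));

    let first = rootfs.next()?;
    let second = rootfs.next()?;

    match (first.booted, second.booted) {
        (true, false) => Some((first, second)),
        (false, true) => Some((second, first)),
        _ => None,
    }
}
```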
I like this better now, because it makes the control flow more obvious (`bail!` returns right there, while `Err(anyhow!(...))` follows the normal control flow).
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
At least for now the RAUC service will only poll for updates if the booted slot was marked good during the current runtime of the service. We restart the service to dynamically update the configuration. This means we have to mark the booted slot as good again if we want polling.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
We include the `boot-id` and `uptime` options in the RAUC `send-headers` config in the hope of detecting boot-loops during update roll-out. This was not anticipated when writing the setup page, so we add it now.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
The channel list in the web interface now contains an "Upgrade" column
with one of the following:
- "Not enabled" for channels which are not enabled, which means bundles
from them cannot be installed.
- "Not primary" (this one is new) for channels which are enabled,
but are not the primary one and are thus not polled by the native RAUC
polling feature.
- "Polling disabled" if the polling feature is not enabled.
- A spinner if we do not know the status yet.
- "Up to date" if the TAC is in sync with this update channel.
- "Upgrade" (a button) if an update is available.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
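The column contents listed in this commit message amount to a small decision table. The sketch below writes it down in Rust only to stay consistent with the backend snippets on this page; it does not mirror the web interface's actual code. The `ChannelView` fields are hypothetical, and checking the conditions in the order of the list is an assumption.

```rust
/// What the "Upgrade" column shows for one channel.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum UpgradeCell {
    NotEnabled,
    NotPrimary,
    PollingDisabled,
    Spinner,
    UpToDate,
    UpgradeButton,
}

/// Hypothetical per-channel state available to the channel list.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ChannelView {
    pub enabled: bool,
    pub primary: bool,
    pub polling_enabled: bool,
    /// `None` while the poll status is not known yet.
    pub update_available: Option<bool>,
}

pub fn upgrade_cell(channel: &ChannelView) -> UpgradeCell {
    if !channel.enabled {
        UpgradeCell::NotEnabled
    } else if !channel.primary {
        UpgradeCell::NotPrimary
    } else if !channel.polling_enabled {
        UpgradeCell::PollingDisabled
    } else {
        match channel.update_available {
            None => UpgradeCell::Spinner,
            Some(false) => UpgradeCell::UpToDate,
            Some(true) => UpgradeCell::UpgradeButton,
        }
    }
}
```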
…dren
This allows us to only show some configuration options when a condition is met.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
Only make the toggle visible when polling is enabled, since auto-update without polling does nothing.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
This ensures that the exact bundle (content) that the user agreed to install is actually installed.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
With the change to native RAUC update polling only the primary update channel is checked for updates, not all update channels as before.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
This is an advanced config option that most users should not actually need, so hide it behind an expandable section (and only show it if polling is active).
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
This uses the native RAUC update polling support introduced in rauc/rauc#1672 to replace the tacd-internal update polling.
Among the various benefits of this approach are the following:
Other PRs related to this one:
TODO before un-drafting: