
Support for duplication of active image to the second image storage #258

@athoelke


Use cases

Some firmware update implementations want to duplicate the active firmware image into the memory used for staging a new firmware image. The following use cases have been identified:

  1. Maintaining a backup copy of the current active firmware, to maintain device operation in a situation where the active image is somehow corrupted. The bootloader would be designed to switch to the second/backup firmware image if the active image failed verification. Unlike a TRIAL state rollback, the second image is identical firmware.

    To work well, this type of implementation would also need to periodically verify the backup image, and to discard and re-duplicate the active image if the backup is found to be corrupted.

  2. If the system uses delta updates, and is designed to operate in-place on a copy of the current firmware, then the update system needs to make a copy of the active firmware into the staging memory prior to applying the delta update to it.
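As an illustration of use case 1, the following is a minimal sketch of how a bootloader might choose between the active image and its backup copy, and how the periodic backup check might be expressed. All of the names here (`fw_image_t`, `select_boot_image()`, `backup_needs_refresh()`) are hypothetical and not part of the Firmware Update API, and the additive checksum is a stand-in for whatever signature or hash verification a real bootloader would use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical image descriptor, for illustration only. */
typedef struct {
    const uint8_t *data;
    size_t size;
    uint32_t expected_checksum;
} fw_image_t;

typedef enum { BOOT_ACTIVE, BOOT_BACKUP, BOOT_NONE } boot_choice_t;

/* Toy integrity check. A real bootloader would verify a hash or signature. */
uint32_t toy_checksum(const uint8_t *data, size_t size)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < size; i++) {
        sum = sum * 31u + data[i];
    }
    return sum;
}

bool image_is_valid(const fw_image_t *img)
{
    return toy_checksum(img->data, img->size) == img->expected_checksum;
}

/* Prefer the active image; fall back to the identical backup copy. */
boot_choice_t select_boot_image(const fw_image_t *active,
                                const fw_image_t *backup)
{
    if (image_is_valid(active)) {
        return BOOT_ACTIVE;
    }
    if (image_is_valid(backup)) {
        return BOOT_BACKUP;
    }
    return BOOT_NONE;
}

/* Periodic check: the backup must be re-duplicated from a good active image. */
bool backup_needs_refresh(const fw_image_t *active, const fw_image_t *backup)
{
    return image_is_valid(active) && !image_is_valid(backup);
}
```

Unlike a TRIAL state rollback, both descriptors in this sketch are expected to refer to the same firmware version, so the same expected checksum applies to each.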

In some environments, copying the active firmware into the staging area could be considered to be a disruptive or long-running operation, in a similar way to erasing the staging area in preparation for a new update.

Analysis

Use case 2 could be supported by copying the image as part of the start operation, or even the write operations. However, that does not address use case 1, which needs the duplicated image to be the 'steady state' of the firmware component, that is, its state when READY.

Initial idea

It appears that the use cases could be supported within the existing FWU state model:

  • Ignore the name of the operation, and carry out the duplication of the firmware as part of the clean operation.

  • The READY state is used to indicate that the second image is a valid copy of the active image.

  • If the duplication fails, then the clean operation would end up in FAILED state. This is already permitted in the v1.0 state model, see 4.2.4 Behavior on error:

    If an operation fails because of other conditions, it is implementation defined whether the component state is unchanged, or is transitioned to FAILED state.

  • A reboot might take a component from READY state to FAILED, if either the active or second image has become corrupt. Repeating the clean operation would reinstate the backup copy and return the component to READY state.
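The behavior described in this list can be sketched as a toy model. This is only an illustration under stated assumptions: the `toy_` types and functions are hypothetical stand-ins, not the `psa_fwu_*` API, the `copy_fault` field exists only to simulate a storage failure, and corruption is detected by comparing the two copies, where a real implementation would verify each image.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_IMAGE_SIZE 8

/* Subset of the component states that matter for this sketch. */
typedef enum { TOY_READY, TOY_FAILED, TOY_UPDATED } toy_state_t;

typedef struct {
    toy_state_t state;
    uint8_t active[TOY_IMAGE_SIZE];
    uint8_t second[TOY_IMAGE_SIZE];
    bool copy_fault; /* test hook: simulate a failed copy to storage */
} toy_component_t;

/* Clean operation that also duplicates the active image.
 * Returns true on success. */
bool toy_clean(toy_component_t *c)
{
    if (c->state != TOY_FAILED && c->state != TOY_UPDATED) {
        return false; /* wrong state: component is left unchanged */
    }
    if (c->copy_fault) {
        c->state = TOY_FAILED; /* duplication failed */
        return false;
    }
    memcpy(c->second, c->active, TOY_IMAGE_SIZE);
    c->state = TOY_READY; /* second image is a valid copy */
    return true;
}

/* Reboot: READY can become FAILED if the two copies no longer match. */
void toy_reboot(toy_component_t *c)
{
    if (c->state == TOY_READY &&
        memcmp(c->active, c->second, TOY_IMAGE_SIZE) != 0) {
        c->state = TOY_FAILED;
    }
}
```

In this model, a client that finds the component in FAILED state after a reboot simply calls the clean operation again to reinstate the backup copy.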

Drawbacks

One concern with this approach is that psa_fwu_clean() might be misleading as a function name for an operation intended for any actions that are required to bring the component into the 'steady state' (READY) for the system.

In addition, the current specification does not suggest that a reboot from READY state could result in a transition to FAILED state, which is non-intuitive.

We also need to ensure that this interacts appropriately with alternative variations of the FWU state model:

  • If staging is volatile, then v1.0 requires that UPDATED and FAILED states are not visible following a reboot: see PSA_FWU_FLAG_VOLATILE_STAGING. This implies that a reboot that updates or reverts the firmware carries out the clean operation, and duplicates the image, presenting a READY state to the Update client.

    If the bootloader detects corruption in either image, it must rectify that by duplicating the firmware again prior to booting the application firmware.

  • If the system is sensitive to disruptive operations, then the Update client must be able to control the timing of the duplication process, by calling psa_fwu_clean() explicitly.

Therefore, a FWU v1.0-compliant implementation that both has volatile staging, and duplicates the active image before entering READY state, cannot be used by an application that is sensitive to disruptive operations and needs explicit control of when the duplication happens.

This is not just a hypothetical scenario: it is one of the envisaged systems that would make use of firmware image duplication.

Options

These use cases nearly fit into the current state model, except for the need for volatile pre-installation images (WRITING and CANDIDATE states) but non-volatile post-installation images (UPDATED and FAILED states).

In situations where that collision of volatile staging and controlled duplication does not arise, the firmware could use the above approach as described, with the existing FWU v1.0 API and state model.

Enabling full support for these use cases will require some alteration or extension of the API.

Option A

One option is to take the approach above, but adjust for the challenging scenarios:

  • Duplicate the firmware image as part of psa_fwu_clean(). Update the specification to describe this optional behavior for implementations that require this.
  • Alter the requirements around the volatility of UPDATED and FAILED states, so that this is decoupled from volatile staging for pre-installation images. This means that the meaning of the Volatile-staging flag has to be changed.
  • It might be good to add a separate flag to indicate if the implementation has volatile or non-volatile FAILED and UPDATED states.
  • It might be good to add a flag to indicate the component does duplication as part of the clean operation.
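One way to picture the decoupling in Option A is as three independent flag bits. The sketch below is an assumption-laden illustration: the `OPTA_` names and values are hypothetical (the issue does not propose names for the two new flags), and the helper functions only encode the reasoning above, not any specified behavior.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { OPTA_READY, OPTA_FAILED, OPTA_UPDATED } opta_state_t;

/* Hypothetical flag bits for illustration; not values from the specification. */
#define OPTA_FLAG_VOLATILE_STAGING      0x1u /* pre-installation images are volatile */
#define OPTA_FLAG_VOLATILE_POST_INSTALL 0x2u /* UPDATED/FAILED not visible after reboot */
#define OPTA_FLAG_CLEAN_DUPLICATES      0x4u /* clean also duplicates the active image */

/* State presented to the Update client after a reboot that completed
 * (UPDATED) or reverted (FAILED) an installation. */
opta_state_t opta_state_after_reboot(uint32_t flags, opta_state_t post_install)
{
    if (flags & OPTA_FLAG_VOLATILE_POST_INSTALL) {
        return OPTA_READY; /* clean was carried out during the reboot */
    }
    return post_install; /* client decides when to call clean */
}

/* True if the Update client can choose when the duplication happens. */
bool opta_client_controls_duplication(uint32_t flags)
{
    if (!(flags & OPTA_FLAG_CLEAN_DUPLICATES)) {
        return false; /* no duplication takes place at all */
    }
    return (flags & OPTA_FLAG_VOLATILE_POST_INSTALL) == 0;
}
```

With the flags separated like this, a component can report volatile staging for WRITING and CANDIDATE images while still leaving UPDATED and FAILED visible, so that a client sensitive to disruptive operations keeps control of the clean operation.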

Option B

Given that this is not fully compatible with the v1.0 API, as it changes the meaning of the volatile-staging flag, it might be cleaner and clearer to add to the state model for systems that require this additional behavior.

  • Add a new optional DUPLICATED state.
  • Add a new optional copy operation, psa_fwu_copy().
  • Add a new component flag, PSA_FWU_FLAG_COPY_IMAGE.

If the new flag is set for a FWU component, then the state model for the component changes as follows:

  • In READY state, the Update client must do the copy operation before starting a new update. The start operation is invalid when in READY state for this component.

    • A successful copy transitions the component to DUPLICATED state.
    • A failed copy transitions to FAILED state.
  • In DUPLICATED state, the Update client uses the start operation to begin a firmware update.

  • During reboot from DUPLICATED state, the component can transition to FAILED state, if one of the copies is corrupted.

If a firmware update is aborted, the Update client needs to cancel, clean, and copy, before starting again.

If staging is volatile, then the FAILED state is not visible after reboot - but the Update client can detect the difference between READY and DUPLICATED, and explicitly call psa_fwu_copy() at an appropriate time.
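The Option B transitions can be sketched as a small state machine. This is a toy model, not a proposed implementation: the `optb_` names and status codes are hypothetical stand-ins for the `psa_fwu_*` operations and error codes, only a subset of states is modelled, the reboot transition is omitted, and `copy_fault` exists only to simulate a failed copy.

```c
#include <stdbool.h>

typedef enum {
    OPTB_READY,
    OPTB_DUPLICATED, /* proposed new optional state */
    OPTB_WRITING,
    OPTB_FAILED
} optb_state_t;

/* Hypothetical status codes for this sketch. */
enum { OPTB_OK = 0, OPTB_ERR_BAD_STATE = -1, OPTB_ERR_STORAGE = -2 };

typedef struct {
    optb_state_t state;
    bool copy_image; /* models the proposed PSA_FWU_FLAG_COPY_IMAGE */
    bool copy_fault; /* test hook: simulate a failed duplication */
} optb_component_t;

/* Proposed copy operation: READY -> DUPLICATED, or FAILED on error. */
int optb_copy(optb_component_t *c)
{
    if (!c->copy_image || c->state != OPTB_READY) {
        return OPTB_ERR_BAD_STATE;
    }
    if (c->copy_fault) {
        c->state = OPTB_FAILED;
        return OPTB_ERR_STORAGE;
    }
    c->state = OPTB_DUPLICATED;
    return OPTB_OK;
}

/* Start: from DUPLICATED if the flag is set, otherwise from READY. */
int optb_start(optb_component_t *c)
{
    optb_state_t required = c->copy_image ? OPTB_DUPLICATED : OPTB_READY;
    if (c->state != required) {
        return OPTB_ERR_BAD_STATE;
    }
    c->state = OPTB_WRITING;
    return OPTB_OK;
}

/* Cancel: abandon an update in progress. */
int optb_cancel(optb_component_t *c)
{
    if (c->state != OPTB_WRITING) {
        return OPTB_ERR_BAD_STATE;
    }
    c->state = OPTB_FAILED;
    return OPTB_OK;
}

/* Clean: FAILED -> READY; no duplication happens here in Option B. */
int optb_clean(optb_component_t *c)
{
    if (c->state != OPTB_FAILED) {
        return OPTB_ERR_BAD_STATE;
    }
    c->state = OPTB_READY;
    return OPTB_OK;
}
```

Under this model, an aborted update follows the sequence cancel, clean, copy, and then start, and a component without the flag keeps the existing behavior of starting directly from READY.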

Labels: API design, Firmware Update API, enhancement
