Add VMware-to-KVM BFV-only resize #627

Draft
anokfireball wants to merge 7 commits into stable/2023.2-m3 from cross-hv-vmware-driver

@anokfireball anokfireball commented Jun 12, 2026

Member

Implements VMware-to-KVM cross-hypervisor instance migration for boot-from-volume (BFV) instances only, following the ADR decision. It adds (1) the conductor sanitization of image properties that would pin the instance to VMware HVs, (2) the required source-side VMware driver control flow, and (3) the confirm/revert control flow that commits or restores those properties.

The cross-HV branches are guarded by _is_cross_hv_resize, which is currently still a stub that returns False. As a consequence, the feature as a whole is dormant: no flavor pair is allowlisted, no detection runs, and no behavior changes. The feature activates once the detection layer allows a combination of source HV type and destination flavor (i.e. VMware to KVM).

The three commits map to the ADR's Resize Orchestration choice:

  1. compute: RequestSpec sanitization for cross-HV resize. Sanitizes img_hv_type and the VMware bus/vif/scsi/video model fields from request_spec.image.properties before _schedule(), so the scheduler can pick a KVM host. Originals go into a new MigrationContext.old_image_properties field (versioned object v1.3) for revert. _move_claim carries the journal across the destination claim.

  2. vmware: enable cross-HV BFV resize in VMware driver. Adds branches to migrate_disk_and_power_off, confirm_migration, and finish_revert_migration. Source-side power-off and volume/NIC detach use ReconfigVM only: no file deletion, no Cinder calls. The "shell" VM stays registered in vCenter under migration.uuid so confirm and revert can find it. Revert restores instanceUuid before re-attaching volumes, since the volume attach paths look up the VM by instance.uuid.

  3. compute: revert path for cross-HV resize. Restores request_spec.image.properties and system_metadata['image_*'] from the MigrationContext journal during revert, before _update_volume_attachments and the driver call, so the VMware driver sees the original values and the FCD driver returns VMware-protocol connection_info.

Also as per the ADR, there is no Cinder retype during resize. The FCD driver keeps serving NFS connection_info to KVM connectors, and volumes stay typed vstorageobject. Customers run a standard os-retype when they are ready to change the volume type.
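The sanitize-and-journal idea from steps 1 and 3 can be sketched as below. This is only an illustration, not the code in this PR: the contents of CROSS_HV_SANITIZE_PROPS are an assumption based on the "bus/vif/scsi/video model" wording above, and both helper names are hypothetical.

```python
# Assumed property list; the real CROSS_HV_SANITIZE_PROPS may differ.
CROSS_HV_SANITIZE_PROPS = (
    "img_hv_type",
    "hw_disk_bus",
    "hw_vif_model",
    "hw_scsi_model",
    "hw_video_model",
)


def sanitize_image_properties(properties):
    """Remove hypervisor-pinning keys in place and return the originals."""
    journal = {}
    for key in CROSS_HV_SANITIZE_PROPS:
        if key in properties:
            journal[key] = properties.pop(key)
    return journal


def restore_image_properties(properties, journal):
    """Put the journaled values back, as a revert would."""
    properties.update(journal)


props = {"img_hv_type": "vmware", "hw_vif_model": "vmxnet3", "os_type": "linux"}
journal = sanitize_image_properties(props)
# props now carries nothing that pins the instance to VMware;
# journal holds exactly what was removed.
restore_image_properties(props, journal)
```

Returning only the keys that were actually present keeps the journal minimal, so a revert never writes back a property the image did not have.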

@anokfireball anokfireball changed the title Cross hv vmware driver Add VMware-to-KVM BFV resize Jun 12, 2026
@anokfireball anokfireball changed the title Add VMware-to-KVM BFV resize Add VMware-to-KVM BFV-only resize Jun 12, 2026
Sanitize VMware-specific image properties on the RequestSpec before
scheduling, matching how request_spec.flavor is already handled in the
cold migrate flow.

The sanitizer mutates request_spec.image.properties in place and
returns a dict of the original values. The conductor saves this dict
to MigrationContext.old_image_properties (via instance.save()) BEFORE
the prep_resize cast, so the destination _move_claim can preserve it
when rebuilding MigrationContext.

The field mapping lives in CROSS_HV_SANITIZE_PROPS and is shared with
the VMware driver system_metadata sanitizer so they agree on which
properties to scrub.

resource_tracker._move_claim() preserves any existing
old_image_properties when it rebuilds MigrationContext on the
destination, so the rollback dict survives the claim step.

Change-Id: I56271a118ec3785eb5ba81a6f648e58d18169e73
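A rough sketch of the preservation step described for _move_claim, using plain dataclasses as stand-ins for the Nova objects. The class shapes and the helper name rebuild_migration_context are hypothetical; only the field name old_image_properties comes from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MigrationContext:
    """Stand-in for the versioned object, reduced to two fields."""
    migration_id: int
    old_image_properties: Optional[dict] = None


@dataclass
class Instance:
    """Stand-in for the instance, holding only its migration context."""
    migration_context: Optional[MigrationContext] = None


def rebuild_migration_context(instance, migration_id):
    """Build a fresh context but carry the rollback journal across."""
    previous = instance.migration_context
    journal = previous.old_image_properties if previous else None
    instance.migration_context = MigrationContext(
        migration_id=migration_id, old_image_properties=journal)
    return instance.migration_context
```

Without the carry-over, the context built on the destination would start empty and the revert path would have nothing to restore from.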
Add cross-HV BFV branches to the three VMware driver resize methods:

migrate_disk_and_power_off:
  - Detect cross-HV before _decode_host_addr (KVM dest uses plain IP)
  - Power off (tolerate already-off for Phase 2 idempotency)
  - Sanitize system_metadata image_* keys using shared
    CROSS_HV_SANITIZE_PROPS; save originals into
    MigrationContext.old_image_properties
  - Detach volumes and NICs via ReconfigVM (file is not deleted)
  - Swap instanceUuid to migration.uuid, rename shell
  - Return '[]' (libvirt finish_migration iterates this safely)
  - On retry, skip work if a shell already exists
  - On rollback after UUID swap, find the VM via migration.uuid

confirm_migration:
  - Find shell by migration.uuid, destroy it
  - Tolerate missing shell (log and continue)
  - Clean all cross_hv_* markers from system_metadata

finish_revert_migration:
  - Find shell by migration.uuid
  - Restore instanceUuid before volume re-attach (attach paths look
    up the VM by instance.uuid, so ordering matters)
  - Re-attach NICs and volumes, rename back
  - Power on

Detection is a stub (returns False) until the detection/guardrails
work lands separately.

Change-Id: I3db51a7dc0cb1dfc1592959d9823250a93fd3358
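The shell lifecycle above can be modelled with a small in-memory stand-in for vCenter. Everything here (FakeVCenter, the function names, the dict layout) is hypothetical and only illustrates the ordering and idempotency rules from the commit message; it performs no ReconfigVM calls.

```python
class FakeVCenter:
    """In-memory stand-in: VMs keyed by their instanceUuid."""

    def __init__(self):
        self.vms = {}

    def find(self, uuid):
        return self.vms.get(uuid)

    def set_instance_uuid(self, old, new):
        self.vms[new] = self.vms.pop(old)


def migrate_source(vc, instance_uuid, migration_uuid):
    """Power off, detach, then park the shell under migration_uuid."""
    if vc.find(migration_uuid) is not None:
        return "[]"  # retry: a shell already exists, skip the work
    vm = vc.find(instance_uuid)
    vm["power"] = "off"  # harmless if the VM was already off
    vm["detached"] = {"volumes": vm.pop("volumes"), "nics": vm.pop("nics")}
    vc.set_instance_uuid(instance_uuid, migration_uuid)
    return "[]"


def confirm(vc, migration_uuid):
    """Destroy the shell; report False if it was already gone."""
    return vc.vms.pop(migration_uuid, None) is not None


def revert(vc, instance_uuid, migration_uuid):
    """Restore instanceUuid first, because attach looks the VM up by it."""
    vc.set_instance_uuid(migration_uuid, instance_uuid)
    vm = vc.find(instance_uuid)
    vm.update(vm.pop("detached"))  # re-attach volumes and NICs
    vm["power"] = "on"
    return vm
```

Keying the shell by migration.uuid is what lets both a retry and a later confirm or revert find it without any extra bookkeeping.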
Two restoration steps for cross-HV resize revert:

1. compute/api.py:revert_resize(): restore
   request_spec.image.properties from
   MigrationContext.old_image_properties and persist via
   reqspec.save(). Mirrors the existing reqspec.flavor revert. Only
   fires when old_image_properties is set (cross-HV resizes).

2. compute/manager.py:_finish_revert_resize(): restore
   system_metadata image_* keys from the MigrationContext dict
   BEFORE _update_volume_attachments and driver.finish_revert_migration,
   so the VMware driver sees the original values and the FCD driver
   returns VMware-protocol connection_info.

   cross_hv_* markers are cleaned AFTER the driver call returns. The
   VMware driver reads the marker to route to its cross-HV revert
   branch.

Change-Id: Ib3689d4631c10b458802b450a3f40844766c90b9
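The ordering constraint in step 2 (restore first, call the driver, clean markers last) might look roughly like this. The function name, the marker key used in the usage lines, and the assumption that journal keys are stored without the image_ prefix are all hypothetical.

```python
MARKER_PREFIX = "cross_hv_"  # prefix named in the commit message
IMAGE_PREFIX = "image_"


def finish_revert(system_metadata, journal, driver_revert):
    """Restore image_* keys, call the driver, then drop cross_hv_* markers."""
    # 1. Restore before the driver call so the driver sees original values.
    for key, value in journal.items():
        system_metadata[IMAGE_PREFIX + key] = value
    # 2. The marker must still be present here: the driver uses it to
    #    choose its cross-HV revert branch.
    driver_revert(dict(system_metadata))
    # 3. Only after the driver returns are the markers removed.
    for key in [k for k in system_metadata if k.startswith(MARKER_PREFIX)]:
        del system_metadata[key]


seen_by_driver = {}
metadata = {"cross_hv_resize": "1", "image_os_type": "linux"}
finish_revert(metadata, {"hw_vif_model": "vmxnet3"}, seen_by_driver.update)
```

In this sketch the driver receives a copy that contains both the restored image_* key and the marker, while the caller's metadata ends up without the marker.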
@anokfireball anokfireball force-pushed the cross-hv-vmware-driver branch from 7de8abf to fa741a2 Compare June 16, 2026 09:11
@BzzyBriel BzzyBriel force-pushed the cross-hv-vmware-driver branch 3 times, most recently from bb2df8b to bd4f689 Compare June 16, 2026 15:51
compute_utils.heal_reqspec_is_bfv(
    self.context, self.request_spec, self.instance)
self._old_image_properties = {}
src_hv = self._source_cn.hypervisor_type if self._source_cn else None

@BzzyBriel BzzyBriel Jun 16, 2026

self._source_cn can be None not only in tests but also when reached via this (obsolete) code path.

We allow it only if it's:
- a BFV instance, that
- is powered on.
And only from VMware to CH, not backwards.

The image properties are now sanitized and saved, only on vmware->ch
hv transition, inside the new _prep_cross_hv_resize() step.

Change-Id: I50dedca220e70f51811c0f9b9327ca3ae7ea9e12
Co-authored-by: Jakob Karge <jakob.karge@sap.com>
Co-authored-by: Fabian Koller <github@kthxbye.cyou>
@BzzyBriel BzzyBriel force-pushed the cross-hv-vmware-driver branch from bd4f689 to 074afbd Compare June 16, 2026 15:58
host_list=self.host_list)

def _prep_cross_hv_resize(self, src_hv: str, dest_hv: str):
    if extra_specs_ops.match("VMware vCenter Server", src_hv) \

@anokfireball anokfireball Jun 17, 2026

Member Author

This only allowlists the combination for the conductor. We also need a check like this at the API layer, perhaps compute_utils.is_supported_cross_hypervisor_resize or similar, so the logic can be shared between the two locations, as is already done with compute_utils.is_cross_hypervisor_resize.
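One possible shape for such a shared check, as a sketch only. The function names are taken from this thread, but their signatures, the exception class, and the DEST_HV_TYPE value are assumptions; the BFV and powered-on conditions follow the commit message above.

```python
VMWARE_HV_TYPE = "VMware vCenter Server"  # string used in the conductor diff
DEST_HV_TYPE = "ch"  # hypothetical value for the destination hypervisor type

# Allowed (source, destination) pairs; one direction only, not backwards.
SUPPORTED_TRANSITIONS = {(VMWARE_HV_TYPE, DEST_HV_TYPE)}


class UnsupportedCrossHypervisorResize(Exception):
    """Hypothetical exception; the PR's real class is not shown here."""


def is_cross_hypervisor_resize(src_hv, dest_hv):
    """True when both types are known and differ."""
    return src_hv is not None and dest_hv is not None and src_hv != dest_hv


def is_supported_cross_hypervisor_resize(src_hv, dest_hv, is_bfv, powered_on):
    """Single allowlist check that conductor and API could both call."""
    return (src_hv, dest_hv) in SUPPORTED_TRANSITIONS and is_bfv and powered_on


def raise_on_unsupported_cross_hypervisor_resize(
        src_hv, dest_hv, is_bfv, powered_on):
    """Do nothing for same-HV resizes; reject unsupported cross-HV ones."""
    if not is_cross_hypervisor_resize(src_hv, dest_hv):
        return
    if not is_supported_cross_hypervisor_resize(
            src_hv, dest_hv, is_bfv, powered_on):
        raise UnsupportedCrossHypervisorResize(
            "resize from %s to %s is not supported" % (src_hv, dest_hv))
```

Treating an unknown source type (None) as "not cross-HV" also covers the case where the source compute node cannot be determined.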

Comment thread nova/exception.py
Comment thread nova/conductor/manager.py Outdated
Change-Id: Ia06d1e2f06226403b07f6e96806af056d49f610c
Change-Id: I4894fad4dcfab556818766d5a983bbe2645a9ea9
Change-Id: I74d1d48714e9fc441da97425ca45f82bbeb645bb
Comment thread nova/compute/api.py
        source_hv_type, dest_hv_type):
    request_spec = objects.RequestSpec.get_by_instance_uuid(
        context, instance.uuid)
    compute_utils.raise_on_unsupported_cross_hypervisor_resize(

Again a test-breaking change. Some tests apparently expect this method to return before a request_spec is first requested from the database. We should fix those tests.
