Add instance storage discovery patterns in config #592

Open
Ivaylogi98 wants to merge 1 commit into cloudfoundry:ubuntu-jammy from Ivaylogi98:ubuntu-jammy-add-instance-storage-patterns

Conversation

Ivaylogi98 commented May 4, 2026

Note: Should merge forward these changes to ubuntu-noble.

Summary

Adds two config fields to the AWS agent settings baked into the stemcell, required by
cloudfoundry/bosh-agent#407 (and subsequently by cloudfoundry/bosh-agent#396) to correctly discover NVMe instance storage on AWS Nitro
instances.

Background

On Nitro-based instances (e.g. m6id, i3en), NVMe PCIe enumeration order is
non-deterministic — /dev/nvme0n1 may be the root EBS volume, an attached EBS volume,
or instance storage depending on boot order. The bosh-agent now uses symlinks in
/dev/disk/by-id/ to identify and exclude EBS volumes, leaving only instance storage.

To keep the resolver IaaS-agnostic, the AWS-specific patterns are not hardcoded in the
agent binary — they are injected via agent config at stemcell build time.
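The exclusion described above can be sketched in shell. This is only an illustration of the idea, not the bosh-agent implementation: `list_instance_storage` is a hypothetical helper that takes the two glob patterns as arguments, resolves every managed-volume symlink to its real device, and prints the candidate devices that no symlink points at.

```shell
#!/bin/sh
# Hypothetical helper, not the bosh-agent code.
# $1: glob for candidate devices
# $2: glob for symlinks that point at managed (EBS) volumes
list_instance_storage() {
  device_glob=$1
  managed_glob=$2

  # Resolve every managed-volume symlink to its real device path,
  # collecting one path per line.
  managed=""
  for link in $managed_glob; do
    [ -e "$link" ] || continue   # the glob matched nothing
    managed="$managed$(readlink -f "$link")
"
  done

  # Print each candidate device that no managed symlink resolves to.
  for dev in $device_glob; do
    [ -e "$dev" ] || continue
    real=$(readlink -f "$dev")
    printf '%s' "$managed" | grep -qxF "$real" || printf '%s\n' "$dev"
  done
}
```

On an instance it might be called as `list_instance_storage '/dev/nvme*n1' '/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_*'`. The globs are deliberately expanded unquoted inside the function, so paths containing spaces are not supported in this sketch.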

Changes

stemcell_builder/stages/bosh_aws_agent_settings/apply.sh:

  • InstanceStorageDevicePattern: glob pattern matching all NVMe namespace devices
  • InstanceStorageManagedVolumePattern: glob pattern matching EBS volume symlinks in
    /dev/disk/by-id/, used to identify and exclude EBS volumes from instance storage
    discovery
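To get a feel for what the device pattern covers, here is a small sketch that checks a few device names against `/dev/nvme*n1` using a shell `case` pattern. The function name and the sample device names are made up for illustration, and shell pattern matching is only an approximation of whatever glob implementation the agent uses (a `case` pattern lets `*` match `/`, which filesystem globbing does not).

```shell
#!/bin/sh
# Hypothetical check: succeeds when a device path matches the
# namespace glob proposed in this PR.
matches_device_pattern() {
  case $1 in
    /dev/nvme*n1) return 0 ;;
    *) return 1 ;;
  esac
}

# Sample names: whole namespaces, a partition, a second namespace,
# and a controller device.
for dev in /dev/nvme0n1 /dev/nvme12n1 /dev/nvme0n1p1 /dev/nvme0n2 /dev/nvme0; do
  if matches_device_pattern "$dev"; then
    echo "$dev: match"
  else
    echo "$dev: no match"
  fi
done
```

Only the first two names match: the pattern requires the path to end in `n1`, so partitions such as `/dev/nvme0n1p1` and other namespaces such as `/dev/nvme0n2` are left out.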

Related

This repository uses a "Merge Forward" strategy

Changes should be made in the earliest applicable branch, and
merged forward through subsequent branches.

  1. Create a PR into the oldest branch (ubuntu-<short_name>)
  2. After this PR has been merged create a merge-to-<next_short_name> branch
  3. Merge ubuntu-<short_name> into merge-to-<next_short_name>
  4. Create a PR to merge merge-to-<next_short_name> into ubuntu-<next_short_name>
  5. Repeat as needed for subsequent branches
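Steps 2 and 3 might look roughly like this in git. This is a sketch under assumptions: `merge_forward` is a hypothetical helper, and it assumes the `merge-to-<next_short_name>` branch is created from the next branch, which the list above does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper covering steps 2 and 3 of the list above.
# $1: older branch, e.g. origin/ubuntu-jammy
# $2: next branch,  e.g. origin/ubuntu-noble
merge_forward() {
  from=$1
  to=$2
  # Strip everything up to and including "ubuntu-":
  # origin/ubuntu-noble -> merge-to-noble
  work="merge-to-${to##*ubuntu-}"

  # Step 2: create the merge branch (assumed to start from the next branch).
  # Step 3: merge the older branch into it.
  git checkout -q --no-track -b "$work" "$to" &&
    git merge -q --no-edit "$from"
}
```

After `merge_forward origin/ubuntu-jammy origin/ubuntu-noble`, the resulting `merge-to-noble` branch would be pushed and a PR opened against `ubuntu-noble` (step 4). Conflicts, if any, still have to be resolved by hand before pushing.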

coderabbitai Bot commented May 4, 2026

Walkthrough

The apply.sh script in the bosh_aws_agent_settings stage has been updated to include additional configuration parameters in the generated agent.json file. Two new settings have been added to the Platform.Linux configuration: InstanceStorageDevicePattern (set to /dev/nvme*n1) and InstanceStorageManagedVolumePattern (set to /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_*). The existing UseMonitIptablesFirewall setting remains in place alongside these additions.
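For orientation, the relevant part of the generated file could look like the sketch below. `write_agent_settings` is a hypothetical stand-in for the config generation in apply.sh: the real script writes more settings than shown here, its output path is not given in this thread, and the value `true` for UseMonitIptablesFirewall is an assumption. The two pattern values are the ones quoted in the walkthrough.

```shell
#!/bin/sh
# Hypothetical stand-in for the heredoc in apply.sh.
# $1: file to write (the real target path is not shown in this thread).
write_agent_settings() {
  # UseMonitIptablesFirewall's value below is assumed, not taken from the PR.
  cat > "$1" <<'JSON'
{
  "Platform": {
    "Linux": {
      "UseMonitIptablesFirewall": true,
      "InstanceStorageDevicePattern": "/dev/nvme*n1",
      "InstanceStorageManagedVolumePattern": "/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_*"
    }
  }
}
JSON
}
```

Quoting the heredoc delimiter (`'JSON'`) keeps the shell from expanding anything inside the body, so the `*` characters reach the file literally.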

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Title check: ✅ Passed. The title clearly and concisely summarizes the main change: adding instance storage discovery patterns to the AWS agent configuration.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Description check: ✅ Passed. The PR description comprehensively covers the purpose, background, technical changes, and includes the required merge-forward strategy guidance from the template.


coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@stemcell_builder/stages/bosh_aws_agent_settings/apply.sh`:
- Around line 15-16: The two config keys InstanceStorageDevicePattern and
InstanceStorageManagedVolumePattern are dead fields (not consumed by
bosh-agent); either remove them from the agent config output in apply.sh or
implement support in bosh-agent to consume them. To fix: if removing, delete the
two keys from the config generation in apply.sh and update any docs/tests; if
implementing, add parsing and use of
InstanceStorageDevicePattern/InstanceStorageManagedVolumePattern inside the
DevicePathResolver logic (respecting existing DevicePathResolutionType,
DiskIDTransformPattern and VolumeID matching semantics), add unit tests to
bosh-agent that verify matching/resolution against /dev/disk/by-id/ symlinks,
and update integration tests to exercise managed vs instance-storage resolution
paths.

📥 Commits

Reviewing files that changed from the base of the PR and between 265a297 and f8f6b3a.

📒 Files selected for processing (1)
  • stemcell_builder/stages/bosh_aws_agent_settings/apply.sh

Comment on lines +15 to +16
"InstanceStorageDevicePattern": "/dev/nvme*n1",
"InstanceStorageManagedVolumePattern": "/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_*"

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify udev persistent-storage rules handling NVMe by-id entries exist in the repo/chroot assets,
# and that no stage strips or masks them.

# Check for any udev rule files in the repo that touch disk/by-id or nvme by-id creation
rg -n "by-id" stemcell_builder/stages/ --glob "*.rules" --glob "*.sh" -A2 -B2

# Check for any stage that might remove or modify udev persistent-storage rules
rg -n "persistent-storage|60-persistent|61-persistent" stemcell_builder/stages/ -A2 -B2

# Check the udev_aws_rules stage assets for any disk/by-id references in nvme-id or rules files
fd -e rules -e sh . stemcell_builder/stages/udev_aws_rules/assets/ --exec cat {}

Repository: cloudfoundry/bosh-linux-stemcell-builder

Length of output: 769


🌐 Web query:

standard udev persistent storage nvme by-id Amazon Elastic Block Store Ubuntu Jammy

💡 Result:

On Ubuntu Jammy (22.04 LTS) EC2 instances with Nitro-based hardware, Amazon EBS volumes appear as NVMe devices (/dev/nvmeXn1) whose enumeration order is not persistent across reboots. Standard udev persistent storage uses /dev/disk/by-id symlinks built from the model, the serial (the EBS volume ID), and the namespace ID (e.g., nvme-Amazon_Elastic_Block_Store_vol0123456789abcdef_1), provided by /usr/lib/udev/rules.d/60-persistent-storage.rules.

For EBS-specific persistence that matches block device mappings (e.g., a /dev/sdf symlink to the NVMe device), install the amazon-ec2-utils package (version 1.4-1 in the Jammy repos), which provides the ebsnvme-id tool and installs /usr/lib/udev/rules.d/70-ec2-nvme-devices.rules. These rules run ebsnvme-id -u /dev/%k on devices whose model is "Amazon Elastic Block Store" to create symlinks such as /dev/sdf -> /dev/nvme1n1.

To install:

sudo apt update
sudo apt install amazon-ec2-utils

Example usage:

  • ebsnvme-id /dev/nvme0n1 --volume : vol-0123456789abcdef
  • ebsnvme-id /dev/nvme0n1 -u : sdf (for udev/symlink name)

Use /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_VOLID_1 in scripts and fstab to avoid enumeration issues. The standard by-id symlinks are always available; the EBS block-mapping symlinks require amazon-ec2-utils.


🏁 Script executed:

# Search the repository for how bosh-agent uses InstanceStorageDevicePattern and InstanceStorageManagedVolumePattern
rg "InstanceStorageDevicePattern|InstanceStorageManagedVolumePattern" . -A 3 -B 3

# Also check if there's a bosh-agent reference or version info
rg -l "bosh.*agent" . --type sh | head -5

Repository: cloudfoundry/bosh-linux-stemcell-builder

Length of output: 1193


🏁 Script executed:

# Search for how bosh-agent uses these patterns - look in related agent repositories or documentation
rg -n "InstanceStorageDevicePattern|InstanceStorageManagedVolumePattern|instance.*storage" . --type sh -A 2 -B 2 | head -50

# Check if there are any tests or examples that validate these patterns
fd -e yaml -e json . stemcell_builder/stages/bosh_aws_agent_settings/ --exec cat {}

# Check the actual apply.sh to see full context of how these are used
cat stemcell_builder/stages/bosh_aws_agent_settings/apply.sh

Repository: cloudfoundry/bosh-linux-stemcell-builder

Length of output: 1065


🏁 Script executed:

# Look for bosh-agent usage of these patterns - search in git history or comments
rg -n "InstanceStorage" . -A 5 -B 5

# Check if there's any documentation or README about these settings
find . -name "README*" -o -name "*.md" | xargs grep -l "InstanceStorage" 2>/dev/null

# Check git log for when these patterns were added to understand context
git log --oneline -n 20 stemcell_builder/stages/bosh_aws_agent_settings/apply.sh 2>/dev/null || echo "Git log not available"

# Search for any tests or validation of these patterns
find stemcell_builder/ \( -path "*/test*" -o -path "*/spec*" \) \( -name "*.rb" -o -name "*.sh" \) -print0 | xargs -0 grep -l "InstanceStorage" 2>/dev/null || echo "No tests found for InstanceStorage"

Repository: cloudfoundry/bosh-linux-stemcell-builder

Length of output: 1515


🌐 Web query:

bosh-agent InstanceStorageDevicePattern InstanceStorageManagedVolumePattern implementation how used

💡 Result:

No public sources explicitly document "InstanceStorageDevicePattern" or "InstanceStorageManagedVolumePattern" in the context of bosh-agent. These terms do not appear in official BOSH documentation, GitHub repositories for cloudfoundry/bosh-agent, or related CPIs (e.g., AWS, vSphere, Google).

Bosh-agent handles disk settings via /var/vcap/bosh/settings.json, parsing ephemeral, persistent, system, and raw_ephemeral disks from CPI-provided data. It uses the DiskSettings struct to extract path, volume_id, id, lun, etc., for mounting and partitioning.

For instance storage (e.g., AWS NVMe/local disks), CPIs like bosh-aws-cpi configure block device mappings, and raw_instance_storage: true exposes all disks as raw-ephemeral-* labels for jobs. The agent resolves device paths using strategies like virtio, scsi, and iscsi via DevicePathResolver, with options like StripVolumeRegex for ID matching (/dev/disk/by-id).

These patterns may be internal and CPI-specific (e.g., GoogleCPI or custom) for distinguishing managed volumes from instance-local storage, but no implementation details were found. They are likely used by CPIs to populate agent disk settings for proper resolution and mounting of local ephemeral storage versus managed persistent volumes.


🏁 Script executed:

# Search directly in bosh-agent repository (clone might be available or search via git)
# First check if bosh-agent is mentioned or linked anywhere in this repo
find . -name "*.md" -o -name "*.txt" | xargs grep -i "bosh-agent\|bosh.*agent" 2>/dev/null | head -20

# Check the git commit that added these patterns for more context
git log --all --pretty=format:"%h %s" | grep -i "instance.*storage" 2>/dev/null || echo "No git log available"

# Search for where these config values might be documented or used
rg "InstanceStorage" . --type yaml --type json 2>/dev/null || true

# Check if there's any reference to bosh-agent version or release info
rg "bosh.version|agent.version" . --type sh --type yaml

Repository: cloudfoundry/bosh-linux-stemcell-builder

Length of output: 783


🌐 Web query:

bosh-aws-cpi InstanceStorageDevicePattern device pattern matching ephemeral storage

💡 Result:

In the BOSH AWS CPI, there is no config key or code path named “InstanceStorageDevicePattern” for matching instance-storage devices. What the CPI actually does for ephemeral/instance storage device “pattern matching” is:

  1. When you enable raw instance storage (raw_instance_storage: true):
     • BOSH tells the agent to partition and label each instance-storage disk with part labels matching the pattern raw-ephemeral-*.
     • Your job is then to locate these disks inside the VM using /dev/disk/by-partlabel/raw-ephemeral-*.
     • Example (from docs): ls -la /dev/disk/by-partlabel/raw-ephemeral-*
  2. How this relates to “ephemeral storage”: all BOSH-managed VMs must store ephemeral data; BOSH prefers the first instance-storage disk when available, otherwise it creates a separate EBS volume to use as an ephemeral disk.
  3. How disk selection is configured: the AWS CPI supports instance-storage-backed ephemeral disks in two ways:
     a) raw_instance_storage: true (access to all instance storage disks; labels them raw-ephemeral-*)
     b) cloud_properties.ephemeral_disk.use_instance_storage (use instance storage for the ephemeral disk instead of EBS), with safeguards:
        • raw_instance_storage cannot be combined with ephemeral_disk.use_instance_storage
        • use_instance_storage requires the chosen instance type to actually have instance storage

So if your goal is “matching ephemeral storage devices” in the guest OS, the authoritative pattern you’re looking for is the agent partlabel pattern raw-ephemeral-*, not an “InstanceStorageDevicePattern”.
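The partlabel lookup mentioned in this result can be sketched as a small loop. `list_raw_ephemeral` is a hypothetical helper; the directory argument exists only so the function can be tried against a scratch directory, and it defaults to the /dev/disk/by-partlabel path quoted above.

```shell
#!/bin/sh
# Hypothetical helper: list raw-ephemeral-* partition labels and the
# devices they resolve to.
# $1: partlabel directory (defaults to /dev/disk/by-partlabel)
list_raw_ephemeral() {
  dir=${1:-/dev/disk/by-partlabel}
  for link in "$dir"/raw-ephemeral-*; do
    [ -e "$link" ] || continue   # no instance-storage labels present
    # Print "<label> -> <resolved device path>".
    printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
  done
}
```

On a VM without labeled instance storage the function prints nothing, because the unmatched glob is skipped by the existence check.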


🏁 Script executed:

# Look at the git commit that added these patterns for context
git show f8f6b3ab --stat 2>/dev/null || echo "Cannot show commit"

# Check if there are any other references to these pattern names elsewhere in the codebase
rg "DevicePattern|ManagedVolume" . 2>/dev/null

# Check prelude_agent.bash to understand the context better
cat stemcell_builder/lib/prelude_agent.bash 2>/dev/null | head -50

# Look for any comments or documentation about these settings
rg -B 10 "InstanceStorageDevicePattern" stemcell_builder/stages/bosh_aws_agent_settings/apply.sh

Repository: cloudfoundry/bosh-linux-stemcell-builder

Length of output: 50400


🌐 Web query:

bosh-linux-stemcell-builder InstanceStorageDevicePattern AWS ephemeral device matching

💡 Result:

There isn’t an “InstanceStorageDevicePattern” setting in bosh-linux-stemcell-builder (stemcells don’t define the AWS ephemeral-device mapping pattern). Instead, AWS ephemeral disks are handled at deploy/runtime by the BOSH AWS CPI + bosh-agent.

What’s available / relevant in AWS instance-storage (ephemeral) handling:

  • BOSH documents that with multiple instance-storage disks attached, bosh-agent partitions and labels them with the label pattern raw-ephemeral-*, and you can discover them via /dev/disk/by-partlabel/raw-ephemeral-*.
  • Example output shows symlinks like /dev/disk/by-partlabel/raw-ephemeral-0 -> ../../xvdba1 and raw-ephemeral-1 -> ../../xvdbb1.

So, for “AWS ephemeral device matching”, the authoritative match pattern exposed to jobs/release tasks is the by-partlabel prefix raw-ephemeral-, not an InstanceStorageDevicePattern from the stemcell builder.

Where the mapping/devices are defined (for context):

  • In the BOSH AWS CPI code, the “first raw ephemeral device” is selected per virtualization type; e.g. for hvm on i3/i3en it uses /dev/nvme0n1, otherwise /dev/xvdba. Additional raw ephemeral mappings are then derived by stepping to the next raw ephemeral disks.

If your goal is to locate ephemeral instance-storage devices inside a BOSH VM: use /dev/disk/by-partlabel/raw-ephemeral-* (agent-managed), rather than trying to match /dev/xvd* or /dev/nvme* by a custom pattern.


🏁 Script executed:

# Check if there are any other references to these config fields in CloudFoundry bosh-agent or bosh-aws-cpi
rg "InstanceStorageDevicePattern|InstanceStorageManagedVolumePattern" . 2>/dev/null | wc -l

# Check git commit message details
git show f8f6b3ab --pretty=fuller 2>/dev/null | head -50

Repository: cloudfoundry/bosh-linux-stemcell-builder

Length of output: 1553


🌐 Web query:

bosh-agent go code DiskSettings device pattern matching AWS EBS NVMe

💡 Result:

In the BOSH agent Go code, DiskSettings (from the boshsettings package) holds disk identification fields like ID, VolumeID, Path, and DeviceID for persistent/ephemeral disks provided by the CPI. The linux_platform.go uses a DevicePathResolver (boshdpresolv package) to resolve the real kernel device path via GetRealDevicePath(diskSetting), which symlinks it for association/mounting (e.g., AssociateDisk, MountPersistentDisk).

Resolution strategies are configurable via LinuxOptions.DevicePathResolutionType ("virtio", "scsi", "iscsi", or default), using specialized resolvers like virtioDevicePathResolver or idDevicePathResolver. These scan directories like /dev/disk/by-id/ for matches against the disk ID/VolumeID, falling back to path/mapped resolution if needed (e.g., timeouts noted for AWS vol- IDs in issues).

For AWS EBS NVMe (Nitro instances like c5/m5), the AWS CPI maps to stable paths like /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_ (from block_device_manager.rb) or /dev/nvmeXn1. Symlinks in /dev/disk/by-id/ enable pattern matching: resolvers glob/lookup symlinks containing the transformed VolumeID (e.g., vol-xxx -> nvme-Amazon...volxxx). AWS docs confirm NVMe devices /dev/nvmen; ebsnvme-id maps vol IDs to names.

DiskIDTransformPattern/Replace in LinuxOptions allows a regex transform of IDs for symlink matching (e.g., "^(vol-.*)$" -> "${1}"). partitionPath derives /dev/p1 from the device path.

Sources confirm that fallback to path resolution works reliably on AWS despite by-id mismatches; NVMe order varies, so ID/symlink matching ensures stability.
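The VolumeID-to-symlink matching described in this result can be illustrated with a short shell sketch. Both functions are hypothetical and are not the agent's DevicePathResolver; the hyphen-dropping transform follows the "vol-xxx -> nvme-Amazon...volxxx" example above, and the trailing `*` is an assumption meant to cover symlink names that carry a namespace suffix.

```shell
#!/bin/sh
# Hypothetical: build the by-id glob for an EBS volume ID.
# $1: volume ID such as vol-0123456789abcdef
# $2: by-id directory (defaults to /dev/disk/by-id)
ebs_by_id_glob() {
  dir=${2:-/dev/disk/by-id}
  # Drop the hyphen after "vol", as in the example from the result above.
  serial=$(printf '%s' "$1" | sed 's/^vol-/vol/')
  printf '%s/nvme-Amazon_Elastic_Block_Store_%s*\n' "$dir" "$serial"
}

# Hypothetical: print the real device behind the first matching symlink.
resolve_ebs_device() {
  for link in $(ebs_by_id_glob "$1" "${2:-/dev/disk/by-id}"); do
    [ -e "$link" ] || continue   # glob matched nothing
    readlink -f "$link"
    return 0
  done
  return 1
}
```

`resolve_ebs_device vol-0123456789abcdef` would print something like `/dev/nvme1n1` when a matching symlink exists, and return non-zero otherwise.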


These config fields are not used by bosh-agent and have no implementation.

Lines 15-16 add InstanceStorageDevicePattern and InstanceStorageManagedVolumePattern to the agent config, but:

  1. These field names do not appear anywhere in the bosh-agent or bosh-aws-cpi codebases.
  2. The bosh-agent uses different mechanisms: DevicePathResolutionType (already present in the config) with DevicePathResolver strategies that match disks via /dev/disk/by-id/ symlinks using the disk VolumeID and optional DiskIDTransformPattern regex transforms.
  3. No tests validate these new patterns, and no code consumes them.

Either:

  • Remove these unused fields, OR
  • Add the corresponding implementation in bosh-agent to actually use these patterns (with tests).

@github-project-automation github-project-automation Bot moved this from Inbox to Waiting for Changes | Open for Contribution in Foundational Infrastructure Working Group May 4, 2026