Skip to content

Fix NVMe remote disk detection with model-based identification #4325

Open
umfranci wants to merge 8 commits intomainfrom
umfranci/nvme-data-disks-11032026
Open

Fix NVMe remote disk detection with model-based identification #4325
umfranci wants to merge 8 commits intomainfrom
umfranci/nvme-data-disks-11032026

Conversation

@umfranci
Copy link
Copy Markdown
Collaborator

The current code uses a controller-based heuristic to identify remote disks, which is unreliable on certain VM configurations.

Changes

  • In nvmecli.py

    • get_devices(): Add support for new nvme-cli JSON schema (Subsystems → Controllers → Namespaces) alongside the legacy flat schema.
    • get_device_models(): New method returning a device-path-to-model mapping. Uses try/except for graceful fallback on malformed JSON.
  • In nvme.py

    • _remove_nvme_remote_disks(): Use model-based detection (checking for "NVMe Accelerator" in model name) as primary method. Includes prefix matching for controller paths (/dev/nvme0) against namespace keys (/dev/nvme0n1). Falls back to legacy controller-based heuristic if nvme-cli data is unavailable.
  • In features.py

    • _is_remote_data_disk(): Use model-based detection as primary method with legacy fallback.

Notes

  • No behavior change on VMs with SCSI disk controllers.
  • No behavior change on VMs where nvme-cli is unavailable (fallback path is preserved).

Copilot AI review requested due to automatic review settings March 18, 2026 04:32
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves NVMe remote disk detection on Azure by switching from a controller-based heuristic to primarily using nvme-cli’s reported device model (e.g., identifying remote disks via “NVMe Accelerator”), while retaining a legacy fallback when nvme-cli data isn’t available.

Changes:

  • Extend Nvmecli.get_devices() parsing to support both legacy and newer nvme-cli JSON schemas, including deriving missing namespace IDs from DevicePath.
  • Add Nvmecli.get_device_models() to map NVMe namespace device paths to model names.
  • Update NVMe remote-disk filtering in Nvme and Azure Disk feature logic to use model-based detection first with legacy fallback.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
lisa/tools/nvmecli.py Adds namespace-id derivation for older nvme-cli JSON and introduces model mapping across legacy/new schemas.
lisa/features/nvme.py Uses nvme-cli model mapping to filter out remote (“Accelerator”) NVMe devices with fallback to prior heuristic.
lisa/sut_orchestrator/azure/features.py Updates Azure remote data disk detection to prefer model-based identification with legacy fallback.

You can also share your feedback on Copilot code review. Take the survey.

@umfranci umfranci marked this pull request as ready for review March 18, 2026 10:13
Copilot AI review requested due to automatic review settings March 18, 2026 10:13
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves Azure NVMe remote disk detection by switching from a controller-based heuristic to model-based identification (using nvme-cli “ModelNumber”), and extends nvme-cli parsing to support both legacy and newer JSON schemas.

Changes:

  • Extend Nvmecli.get_devices() to tolerate missing NameSpace in legacy nvme-cli JSON by deriving it from DevicePath, and keep supporting the newer Subsystems→Controllers→Namespaces schema.
  • Add Nvmecli.get_device_models() to map /dev/nvme* namespace paths to model names from nvme-cli JSON (legacy + new schema).
  • Update NVMe remote disk filtering to prefer model-based detection (with legacy fallback) in both the NVMe feature and Azure disk feature logic.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
lisa/tools/nvmecli.py Adds namespace-id derivation for legacy JSON missing NameSpace, and introduces model extraction (get_device_models) across both nvme-cli schemas.
lisa/features/nvme.py Updates remote NVMe disk filtering to primarily use model-based detection with safe controller-prefix matching fallback behavior unchanged.
lisa/sut_orchestrator/azure/features.py Updates Azure remote data disk classification to prefer model-based detection, falling back to the legacy OS-controller heuristic.
**Key Test Cases:**
verify_nvme_basic|verify_nvme_disk_controller_type|verify_hot_add_disk_serial

**Impacted LISA Features:**
Nvme, Disk

**Tested Azure Marketplace Images:**
- canonical 0001-com-ubuntu-server-jammy 22_04-lts latest
- redhat rhel 8_10 latest

You can also share your feedback on Copilot code review. Take the survey.

Copilot AI review requested due to automatic review settings March 20, 2026 05:21
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves Azure NVMe remote-disk detection by switching from a controller-based heuristic to model-based identification, and updates nvme-cli parsing to support both legacy and newer JSON schemas.

Changes:

  • Extend Nvmecli.get_devices() to parse both legacy “flat” and newer “Subsystems → Controllers → Namespaces” JSON schemas (and derive missing namespace IDs from DevicePath when needed).
  • Add Nvmecli.get_device_models() and use it as the primary signal for identifying Azure remote NVMe disks (model contains “NVMe Accelerator”), with legacy heuristic fallback.
  • Update Azure disk feature logic to use model-based remote-disk detection first, preserving legacy fallback behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
lisa/tools/nvmecli.py Adds new-schema parsing and introduces get_device_models() for device-path → model mapping.
lisa/features/nvme.py Uses model-based detection to filter out remote NVMe devices from local NVMe enumeration, with legacy fallback.
lisa/sut_orchestrator/azure/features.py Updates remote data disk detection to prefer model-based classification, with legacy fallback.

Key Test Cases:
verify_nvme_disk_controller_type|verify_hot_add_disk_serial

Impacted LISA Features:
Nvme, Disk

Tested Azure Marketplace Images:

  • canonical 0001-com-ubuntu-server-jammy 22_04-lts latest
  • redhat rhel 9_5 latest

@umfranci umfranci requested a review from SRIKKANTH March 20, 2026 09:08
return True
return False

return [disk for disk in disk_list if not _is_remote(disk)]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea is not just return a disk list but also to modify the original disk_list by removing nvme remote disks. Any particular reason why a new list was returned without modifying disk_list? If no, please make that necessary change and then check the call hierarchy and test it with all the tests making use of this modified method.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the code to modify the original list - making it consistent with the existing code

# Derive the namespace ID from the DevicePath if missing.
if isinstance(device_path, str) and device_path and namespace_id is None:
# e.g., "/dev/nvme0n1" → "1", "/dev/nvme0n17" → "17"
ns_match = re.search(r"n(\d+)$", device_path)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is fine, please use like this instead, to make it less ambiguous:
ns_match = re.search(r"nvme\d+n(\d+)$", device_path)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the code to use the suggested pattern matching

Copilot AI review requested due to automatic review settings March 27, 2026 08:40
@umfranci umfranci force-pushed the umfranci/nvme-data-disks-11032026 branch from 05cdf43 to 8eabace Compare March 27, 2026 08:40
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment on lines +364 to +367
try:
nvme_devices = json.loads(nvme_list.stdout)["Devices"]
except (json.JSONDecodeError, KeyError, TypeError):
return {}
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_device_models() assumes json.loads(...)["Devices"] is iterable of dict-like objects. If Devices exists but is not a list of dicts (e.g., malformed output where Devices is a dict/str/null), the subsequent for nvme_device in nvme_devices will raise (e.g., AttributeError on nvme_device.get) instead of gracefully returning {}. Consider validating the parsed structure (top-level is dict and Devices is a list) before iterating, and return {} when it isn't.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants