Fix NVMe remote disk detection with model-based identification #4325
Fix NVMe remote disk detection with model-based identification #4325
Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves NVMe remote disk detection on Azure by switching from a controller-based heuristic to primarily using nvme-cli’s reported device model (e.g., identifying remote disks via “NVMe Accelerator”), while retaining a legacy fallback when nvme-cli data isn’t available.
Changes:
- Extend
Nvmecli.get_devices()parsing to support both legacy and newer nvme-cli JSON schemas, including deriving missing namespace IDs fromDevicePath. - Add
Nvmecli.get_device_models()to map NVMe namespace device paths to model names. - Update NVMe remote-disk filtering in
Nvmeand AzureDiskfeature logic to use model-based detection first with legacy fallback.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| lisa/tools/nvmecli.py | Adds namespace-id derivation for older nvme-cli JSON and introduces model mapping across legacy/new schemas. |
| lisa/features/nvme.py | Uses nvme-cli model mapping to filter out remote (“Accelerator”) NVMe devices with fallback to prior heuristic. |
| lisa/sut_orchestrator/azure/features.py | Updates Azure remote data disk detection to prefer model-based identification with legacy fallback. |
You can also share your feedback on Copilot code review. Take the survey.
There was a problem hiding this comment.
Pull request overview
This PR improves Azure NVMe remote disk detection by switching from a controller-based heuristic to model-based identification (using nvme-cli “ModelNumber”), and extends nvme-cli parsing to support both legacy and newer JSON schemas.
Changes:
- Extend
Nvmecli.get_devices()to tolerate missingNameSpacein legacy nvme-cli JSON by deriving it fromDevicePath, and keep supporting the newer Subsystems→Controllers→Namespaces schema. - Add
Nvmecli.get_device_models()to map/dev/nvme*namespace paths to model names from nvme-cli JSON (legacy + new schema). - Update NVMe remote disk filtering to prefer model-based detection (with legacy fallback) in both the NVMe feature and Azure disk feature logic.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lisa/tools/nvmecli.py | Adds namespace-id derivation for legacy JSON missing NameSpace, and introduces model extraction (get_device_models) across both nvme-cli schemas. |
| lisa/features/nvme.py | Updates remote NVMe disk filtering to primarily use model-based detection with safe controller-prefix matching fallback behavior unchanged. |
| lisa/sut_orchestrator/azure/features.py | Updates Azure remote data disk classification to prefer model-based detection, falling back to the legacy OS-controller heuristic. |
**Key Test Cases:**
verify_nvme_basic|verify_nvme_disk_controller_type|verify_hot_add_disk_serial
**Impacted LISA Features:**
Nvme, Disk
**Tested Azure Marketplace Images:**
- canonical 0001-com-ubuntu-server-jammy 22_04-lts latest
- redhat rhel 8_10 latest
You can also share your feedback on Copilot code review. Take the survey.
There was a problem hiding this comment.
Pull request overview
This PR improves Azure NVMe remote-disk detection by switching from a controller-based heuristic to model-based identification, and updates nvme-cli parsing to support both legacy and newer JSON schemas.
Changes:
- Extend
Nvmecli.get_devices()to parse both legacy “flat” and newer “Subsystems → Controllers → Namespaces” JSON schemas (and derive missing namespace IDs fromDevicePathwhen needed). - Add
Nvmecli.get_device_models()and use it as the primary signal for identifying Azure remote NVMe disks (model contains “NVMe Accelerator”), with legacy heuristic fallback. - Update Azure disk feature logic to use model-based remote-disk detection first, preserving legacy fallback behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lisa/tools/nvmecli.py | Adds new-schema parsing and introduces get_device_models() for device-path → model mapping. |
| lisa/features/nvme.py | Uses model-based detection to filter out remote NVMe devices from local NVMe enumeration, with legacy fallback. |
| lisa/sut_orchestrator/azure/features.py | Updates remote data disk detection to prefer model-based classification, with legacy fallback. |
Key Test Cases:
verify_nvme_disk_controller_type|verify_hot_add_disk_serial
Impacted LISA Features:
Nvme, Disk
Tested Azure Marketplace Images:
- canonical 0001-com-ubuntu-server-jammy 22_04-lts latest
- redhat rhel 9_5 latest
lisa/features/nvme.py
Outdated
| return True | ||
| return False | ||
|
|
||
| return [disk for disk in disk_list if not _is_remote(disk)] |
There was a problem hiding this comment.
Idea is not just return a disk list but also to modify the original disk_list by removing nvme remote disks. Any particular reason why a new list was returned without modifying disk_list? If no, please make that necessary change and then check the call hierarchy and test it with all the tests making use of this modified method.
There was a problem hiding this comment.
Updated the code to modify the original list - making it consistent with the existing code
lisa/tools/nvmecli.py
Outdated
| # Derive the namespace ID from the DevicePath if missing. | ||
| if isinstance(device_path, str) and device_path and namespace_id is None: | ||
| # e.g., "/dev/nvme0n1" → "1", "/dev/nvme0n17" → "17" | ||
| ns_match = re.search(r"n(\d+)$", device_path) |
There was a problem hiding this comment.
While this is fine, please use like this instead, to make it less ambiguous:
ns_match = re.search(r"nvme\d+n(\d+)$", device_path)
There was a problem hiding this comment.
Updated the code to use the suggested pattern matching
05cdf43 to
8eabace
Compare
| try: | ||
| nvme_devices = json.loads(nvme_list.stdout)["Devices"] | ||
| except (json.JSONDecodeError, KeyError, TypeError): | ||
| return {} |
There was a problem hiding this comment.
get_device_models() assumes json.loads(...)["Devices"] is iterable of dict-like objects. If Devices exists but is not a list of dicts (e.g., malformed output where Devices is a dict/str/null), the subsequent for nvme_device in nvme_devices will raise (e.g., AttributeError on nvme_device.get) instead of gracefully returning {}. Consider validating the parsed structure (top-level is dict and Devices is a list) before iterating, and return {} when it isn't.
The current code uses a controller-based heuristic to identify remote disks, which is unreliable on certain VM configurations.
Changes
In nvmecli.py
In nvme.py
In features.py
Notes