petri: support lazy disk artifact fetching via HTTP by jstarks · Pull Request #3189 · microsoft/openvmm

jstarks · 2026-04-03T17:18:32Z

No description provided.

Add infrastructure for on-demand disk artifact fetching so VMM tests don't need to download entire multi-GB VHDs upfront. Core changes: - Add ArtifactSource (Local | Remote), ResolvedArtifactSource, and RemoteAccess enum to petri_artifacts_core - ArtifactResolver gains require_source(handle, RemoteAccess) alongside existing require(). require() delegates to require_source with LocalOnly - Resolver-wide remote policy from PETRI_REMOTE_ARTIFACTS env var - TestArtifactRequirements tracks per-artifact remote access orthogonally with optional flag Petri disk plumbing: - Add DiskPath enum (Local | Remote) with From<PathBuf> - Disk::Differencing takes DiskPath instead of PathBuf - BootImageConfig stores type-erased ResolvedArtifactSource - Firmware::boot_drive() converts ArtifactSource -> DiskPath Backend wiring: - OpenVMM: DiskPath::Remote produces RamDiskLayer + AutoCacheSqlite + BlobDisk layer stack with per-OS cache directory - Hyper-V: DiskPath::Remote rejected with clear error - Test macros emit RemoteAccess::LocalOnly for Hyper-V, Allow for OpenVMM Resolver: - resolve_source() tries local first, falls back to Azure blob URL constructed from IsHostedOnHvliteAzureBlobStore::FILENAME - Single blob_artifact_info table serves both local path lookup and URL construction, removing vmm_test_images dependency from resolver Also fix disklayer_sqlite auto_cache disk_id formatting: {b:2x} -> {b:02x} to avoid spaces in cache directory names.

Copilot

Pull request overview

This PR extends the petri test framework to support “lazy” disk artifact resolution where artifacts can resolve to either a local file path or a remote HTTP URL, enabling on-demand fetching (with caching) rather than requiring pre-downloaded images in all environments.

Changes:

Introduces ArtifactSource + RemoteAccess and a new require_source() API to allow artifacts to resolve to local paths or remote URLs.
Adds remote-disk plumbing for the OpenVMM backend by translating remote artifacts into HTTP-backed blob disks with a sqlite read-cache layer.
Updates VMM test macros and selected test/prep codepaths to opt into remote-capable disk requirements; removes vmm_test_images dependency from the OpenVMM known-path resolver.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
vmm_tests/vmm_test_macros/src/lib.rs	Switches boot media requirements to `require_source(...)` and wires a per-backend remote access policy into generated tests.
vmm_tests/prep_steps/src/main.rs	Updates prep step to request the Windows VHD via `require_source(..., Allow)`.
vmm_tests/petri_artifact_resolver_openvmm_known_paths/src/lib.rs	Adds remote URL resolution for blob-hosted artifacts and refactors test image resolution away from `vmm_test_images::KnownTestArtifacts`.
vmm_tests/petri_artifact_resolver_openvmm_known_paths/Cargo.toml	Removes `vmm_test_images` dependency.
vm/devices/storage/disklayer_sqlite/src/auto_cache.rs	Fixes cache-key hex formatting to always be zero-padded.
petri/src/vm/openvmm/mod.rs	Adds translation of remote differencing disks into layered (RAM + sqlite cache + HTTP blob) disks.
petri/src/vm/mod.rs	Introduces `DiskPath` (Local/Remote), updates boot-drive handling to use it, and adds `PETRI_CACHE_DIR` cache directory selection.
petri/src/vm/hyperv/mod.rs	Rejects remote disk artifacts for the Hyper-V backend with an explicit error.
petri/src/lib.rs	Re-exports the new artifact source/policy types (`ArtifactSource`, `RemoteAccess`, `ResolvedArtifactSource`).
petri/petri_artifacts_core/src/lib.rs	Implements core API changes: `ArtifactSource`, `RemoteAccess`, `ResolvedArtifactSource`, `require_source()`, and remote-aware requirement collection/resolution.
petri/burette/src/tests/network.rs	Updates test boot image requirements to allow remote sources.
Cargo.lock	Drops `vmm_test_images` from the resolver crate’s dependency closure.

Copilot · 2026-04-03T17:26:15Z

petri/petri_artifacts_core/src/lib.rs

+    /// this particular artifact. The resolver-wide policy set via
+    /// [`with_remote_access`](Self::with_remote_access) can further restrict
+    /// this to `LocalOnly`.


The rustdoc link references ArtifactResolver::with_remote_access, but no such method exists in this file. This will produce broken intra-doc links and is confusing for API consumers; either add the method (to override remote_policy) or update the docs to describe the actual control mechanism (PETRI_REMOTE_ARTIFACTS env var + per-call RemoteAccess).

Suggested change

/// this particular artifact. The resolver-wide policy set via

/// [`with_remote_access`](Self::with_remote_access) can further restrict

/// this to `LocalOnly`.

/// this particular artifact. The resolver's configured remote policy may

/// further restrict this request and force the effective access mode to

/// `LocalOnly`.

Copilot · 2026-04-03T17:26:16Z

petri/petri_artifacts_core/src/lib.rs

+    /// Remote access for each artifact is determined by the per-call
+    /// [`RemoteAccess`] flags recorded during collection, as potentially
+    /// overridden by the resolver-wide policy set via
+    /// [`ArtifactResolver::with_remote_access`].


TestArtifactRequirements::resolve docs mention a resolver-wide override via ArtifactResolver::with_remote_access, but that API doesn’t exist (only PETRI_REMOTE_ARTIFACTS is implemented). Please either implement the override API or adjust this doc comment so it matches the real behavior.

Suggested change

/// Remote access for each artifact is determined by the per-call

/// [`RemoteAccess`] flags recorded during collection, as potentially

/// overridden by the resolver-wide policy set via

/// [`ArtifactResolver::with_remote_access`].

/// Remote access for each artifact is determined by the

/// [`RemoteAccess`] flags recorded during collection, subject to any

/// process-wide override configured via the `PETRI_REMOTE_ARTIFACTS`

/// environment variable.

Copilot · 2026-04-03T17:26:16Z

petri/src/vm/openvmm/mod.rs

+    let format = if url.ends_with(".vhd") || url.ends_with(".vmgs") {
+        disk_backend_resources::BlobDiskFormat::FixedVhd1


Remote disk handling treats anything not ending in .vhd/.vmgs as BlobDiskFormat::Flat, but some blob-hosted test artifacts are .vhdx (e.g. WINDOWS_11_ENTERPRISE_AARCH64 in petri_artifacts_vmm_test). Flat will not parse a VHDX, so allowing remote access for those artifacts will fail at runtime. Consider explicitly rejecting .vhdx (force local download) or extending the blob disk stack to support VHDX.

Suggested change

let format = if url.ends_with(".vhd") || url.ends_with(".vmgs") {

disk_backend_resources::BlobDiskFormat::FixedVhd1

let url_path = url.split(['?', '#']).next().unwrap_or(url);

let format = if url_path.ends_with(".vhd") || url_path.ends_with(".vmgs") {

disk_backend_resources::BlobDiskFormat::FixedVhd1

} else if url_path.ends_with(".vhdx") {

anyhow::bail!(

"remote blob disks do not support VHDX artifacts: {url}. Download the artifact locally instead"

);

Copilot · 2026-04-03T17:26:16Z

vmm_tests/petri_artifact_resolver_openvmm_known_paths/src/lib.rs

+    fn resolve_source(&self, id: ErasedArtifactHandle) -> anyhow::Result<ArtifactSource> {
+        // Try local resolution first.
+        if let Ok(path) = self.resolve(id) {
+            return Ok(ArtifactSource::Local(path));
+        }
+
+        // Fall back to remote URL for artifacts hosted on Azure Blob Storage.
+        if let Some(info) = blob_artifact_info(id) {
+            let url = format!(
+                "https://{STORAGE_ACCOUNT}.blob.core.windows.net/{CONTAINER}/{}",
+                info.filename
+            );
+            return Ok(ArtifactSource::Remote { url });
+        }
+
+        // No local path and no remote URL available — return the original error.
+        self.resolve(id).map(ArtifactSource::Local)
+    }


resolve_source calls self.resolve(id) twice and the final comment says it returns the original error, but the original error is discarded. Capture the first error and return it if there’s no remote URL fallback to avoid duplicate work and ensure the error is preserved.

Copilot · 2026-04-03T17:26:17Z

vmm_tests/petri_artifact_resolver_openvmm_known_paths/src/lib.rs

+/// Info about a blob-hosted artifact.
+struct BlobArtifactInfo {
+    /// Filename in Azure Blob Storage (from `IsHostedOnHvliteAzureBlobStore::FILENAME`).
+    filename: &'static str,
+    /// CLI name for `cargo xtask guest-test download-image --artifacts <name>`.
+    download_name: &'static str,
+}
+
+/// Returns info about a blob-hosted artifact, if it is hosted on Azure Blob Storage.
+fn blob_artifact_info(id: ErasedArtifactHandle) -> Option<BlobArtifactInfo> {
+    use petri_artifacts_core::AsArtifactHandle;
+    use petri_artifacts_vmm_test::artifacts::*;
+    use petri_artifacts_vmm_test::tags::IsHostedOnHvliteAzureBlobStore;
+
+    #[rustfmt::skip]
+    let info = match id {
+        _ if id == test_vhd::GEN1_WINDOWS_DATA_CENTER_CORE2022_X64.erase() => BlobArtifactInfo { filename: test_vhd::GEN1_WINDOWS_DATA_CENTER_CORE2022_X64::FILENAME, download_name: "Gen1WindowsDataCenterCore2022X64Vhd" },
+        _ if id == test_vhd::GEN2_WINDOWS_DATA_CENTER_CORE2022_X64.erase() => BlobArtifactInfo { filename: test_vhd::GEN2_WINDOWS_DATA_CENTER_CORE2022_X64::FILENAME, download_name: "Gen2WindowsDataCenterCore2022X64Vhd" },
+        _ if id == test_vhd::GEN2_WINDOWS_DATA_CENTER_CORE2025_X64.erase() => BlobArtifactInfo { filename: test_vhd::GEN2_WINDOWS_DATA_CENTER_CORE2025_X64::FILENAME, download_name: "Gen2WindowsDataCenterCore2025X64Vhd" },
+        _ if id == test_vhd::FREE_BSD_13_2_X64.erase() => BlobArtifactInfo { filename: test_vhd::FREE_BSD_13_2_X64::FILENAME, download_name: "FreeBsd13_2X64Vhd" },
+        _ if id == test_vhd::ALPINE_3_23_X64.erase() => BlobArtifactInfo { filename: test_vhd::ALPINE_3_23_X64::FILENAME, download_name: "Alpine323X64Vhd" },
+        _ if id == test_vhd::ALPINE_3_23_AARCH64.erase() => BlobArtifactInfo { filename: test_vhd::ALPINE_3_23_AARCH64::FILENAME, download_name: "Alpine323Aarch64Vhd" },
+        _ if id == test_vhd::UBUNTU_2404_SERVER_X64.erase() => BlobArtifactInfo { filename: test_vhd::UBUNTU_2404_SERVER_X64::FILENAME, download_name: "Ubuntu2404ServerX64Vhd" },
+        _ if id == test_vhd::UBUNTU_2504_SERVER_X64.erase() => BlobArtifactInfo { filename: test_vhd::UBUNTU_2504_SERVER_X64::FILENAME, download_name: "Ubuntu2504ServerX64Vhd" },
+        _ if id == test_vhd::UBUNTU_2404_SERVER_AARCH64.erase() => BlobArtifactInfo { filename: test_vhd::UBUNTU_2404_SERVER_AARCH64::FILENAME, download_name: "Ubuntu2404ServerAarch64Vhd" },
+        _ if id == test_vhd::WINDOWS_11_ENTERPRISE_AARCH64.erase() => BlobArtifactInfo { filename: test_vhd::WINDOWS_11_ENTERPRISE_AARCH64::FILENAME, download_name: "Windows11EnterpriseAarch64Vhdx" },
+        _ if id == test_iso::FREE_BSD_13_2_X64.erase() => BlobArtifactInfo { filename: test_iso::FREE_BSD_13_2_X64::FILENAME, download_name: "FreeBsd13_2X64Iso" },
+        _ if id == test_vmgs::VMGS_WITH_BOOT_ENTRY.erase() => BlobArtifactInfo { filename: test_vmgs::VMGS_WITH_BOOT_ENTRY::FILENAME, download_name: "VmgsWithBootEntry" },
+        _ if id == test_vmgs::VMGS_WITH_16K_TPM.erase() => BlobArtifactInfo { filename: test_vmgs::VMGS_WITH_16K_TPM::FILENAME, download_name: "VmgsWith16kTpm" },
+        _ => return None,


blob_artifact_info hard-codes the download_name strings that are passed to cargo xtask guest-test download-image --artifacts .... This duplicates the source-of-truth currently embodied by vmm_test_images::KnownTestArtifacts (which derives the name from the enum variant) and can easily drift, making the suggested recovery command incorrect. Consider reusing a centralized mapping/constant for these names (or reintroducing vmm_test_images just for the name mapping).

Copilot · 2026-04-03T17:26:17Z

petri/src/vm/mod.rs

+/// Returns the cache directory for lazy-fetched disk artifacts.
+pub(crate) fn petri_disk_cache_dir() -> String {
+    if let Ok(dir) = std::env::var("PETRI_CACHE_DIR") {
+        return dir;
+    }
+
+    #[cfg(target_os = "macos")]
+    {
+        if let Ok(home) = std::env::var("HOME") {
+            return format!("{home}/Library/Caches/petri");
+        }
+    }
+
+    #[cfg(windows)]
+    {
+        if let Ok(local) = std::env::var("LOCALAPPDATA") {
+            return format!("{local}\\petri\\cache");
+        }
+    }
+
+    // Linux / fallback: XDG
+    if let Ok(xdg) = std::env::var("XDG_CACHE_HOME") {
+        return format!("{xdg}/petri");
+    }
+    if let Ok(home) = std::env::var("HOME") {
+        return format!("{home}/.cache/petri");
+    }
+
+    ".cache/petri".to_string()
+}


This introduces a new cache location mechanism via PETRI_CACHE_DIR (and the PR also introduces PETRI_REMOTE_ARTIFACTS in petri_artifacts_core). The developer docs for VMM test images currently only describe pre-downloading via cargo xtask guest-test download-image (see Guide dev_guide/tests/vmm/azure_images.md). Please update the Guide to document the new lazy-fetch behavior and the relevant env vars so local/CI behavior is clear.

Copilot AI review requested due to automatic review settings April 3, 2026 17:18

Copilot started reviewing on behalf of jstarks April 3, 2026 17:19 View session

Copilot AI reviewed Apr 3, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

petri: support lazy disk artifact fetching via HTTP#3189

petri: support lazy disk artifact fetching via HTTP#3189
jstarks wants to merge 1 commit intomicrosoft:mainfrom
jstarks:petri-blobby

jstarks commented Apr 3, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 3, 2026

Uh oh!

Copilot AI Apr 3, 2026

Uh oh!

Copilot AI Apr 3, 2026

Uh oh!

Copilot AI Apr 3, 2026

Uh oh!

Copilot AI Apr 3, 2026

Uh oh!

Copilot AI Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		let format = if url.ends_with(".vhd") \|\| url.ends_with(".vmgs") {
		disk_backend_resources::BlobDiskFormat::FixedVhd1

-    let format = if url.ends_with(".vhd") || url.ends_with(".vmgs") {
-        disk_backend_resources::BlobDiskFormat::FixedVhd1
+    let url_path = url.split(['?', '#']).next().unwrap_or(url);
+    let format = if url_path.ends_with(".vhd") || url_path.ends_with(".vmgs") {
+        disk_backend_resources::BlobDiskFormat::FixedVhd1
+    } else if url_path.ends_with(".vhdx") {
+        anyhow::bail!(
+            "remote blob disks do not support VHDX artifacts: {url}. Download the artifact locally instead"
+        );

Conversation

jstarks commented Apr 3, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants