Skip to content

petri: support lazy disk artifact fetching via HTTP#3189

Draft
jstarks wants to merge 1 commit intomicrosoft:mainfrom
jstarks:petri-blobby
Draft

petri: support lazy disk artifact fetching via HTTP#3189
jstarks wants to merge 1 commit intomicrosoft:mainfrom
jstarks:petri-blobby

Conversation

@jstarks
Copy link
Copy Markdown
Member

@jstarks jstarks commented Apr 3, 2026

No description provided.

Add infrastructure for on-demand disk artifact fetching so VMM tests
don't need to download entire multi-GB VHDs upfront.

Core changes:
- Add ArtifactSource (Local | Remote), ResolvedArtifactSource, and
  RemoteAccess enum to petri_artifacts_core
- ArtifactResolver gains require_source(handle, RemoteAccess) alongside
  existing require(). require() delegates to require_source with LocalOnly
- Resolver-wide remote policy from PETRI_REMOTE_ARTIFACTS env var
- TestArtifactRequirements tracks per-artifact remote access orthogonally
  with optional flag

Petri disk plumbing:
- Add DiskPath enum (Local | Remote) with From<PathBuf>
- Disk::Differencing takes DiskPath instead of PathBuf
- BootImageConfig stores type-erased ResolvedArtifactSource
- Firmware::boot_drive() converts ArtifactSource -> DiskPath

Backend wiring:
- OpenVMM: DiskPath::Remote produces RamDiskLayer + AutoCacheSqlite +
  BlobDisk layer stack with per-OS cache directory
- Hyper-V: DiskPath::Remote rejected with clear error
- Test macros emit RemoteAccess::LocalOnly for Hyper-V, Allow for OpenVMM

Resolver:
- resolve_source() tries local first, falls back to Azure blob URL
  constructed from IsHostedOnHvliteAzureBlobStore::FILENAME
- Single blob_artifact_info table serves both local path lookup and URL
  construction, removing vmm_test_images dependency from resolver

Also fix disklayer_sqlite auto_cache disk_id formatting: {b:2x} -> {b:02x}
to avoid spaces in cache directory names.
Copilot AI review requested due to automatic review settings April 3, 2026 17:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the petri test framework to support “lazy” disk artifact resolution where artifacts can resolve to either a local file path or a remote HTTP URL, enabling on-demand fetching (with caching) rather than requiring pre-downloaded images in all environments.

Changes:

  • Introduces ArtifactSource + RemoteAccess and a new require_source() API to allow artifacts to resolve to local paths or remote URLs.
  • Adds remote-disk plumbing for the OpenVMM backend by translating remote artifacts into HTTP-backed blob disks with a sqlite read-cache layer.
  • Updates VMM test macros and selected test/prep codepaths to opt into remote-capable disk requirements; removes vmm_test_images dependency from the OpenVMM known-path resolver.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
vmm_tests/vmm_test_macros/src/lib.rs Switches boot media requirements to require_source(...) and wires a per-backend remote access policy into generated tests.
vmm_tests/prep_steps/src/main.rs Updates prep step to request the Windows VHD via require_source(..., Allow).
vmm_tests/petri_artifact_resolver_openvmm_known_paths/src/lib.rs Adds remote URL resolution for blob-hosted artifacts and refactors test image resolution away from vmm_test_images::KnownTestArtifacts.
vmm_tests/petri_artifact_resolver_openvmm_known_paths/Cargo.toml Removes vmm_test_images dependency.
vm/devices/storage/disklayer_sqlite/src/auto_cache.rs Fixes cache-key hex formatting to always be zero-padded.
petri/src/vm/openvmm/mod.rs Adds translation of remote differencing disks into layered (RAM + sqlite cache + HTTP blob) disks.
petri/src/vm/mod.rs Introduces DiskPath (Local/Remote), updates boot-drive handling to use it, and adds PETRI_CACHE_DIR cache directory selection.
petri/src/vm/hyperv/mod.rs Rejects remote disk artifacts for the Hyper-V backend with an explicit error.
petri/src/lib.rs Re-exports the new artifact source/policy types (ArtifactSource, RemoteAccess, ResolvedArtifactSource).
petri/petri_artifacts_core/src/lib.rs Implements core API changes: ArtifactSource, RemoteAccess, ResolvedArtifactSource, require_source(), and remote-aware requirement collection/resolution.
petri/burette/src/tests/network.rs Updates test boot image requirements to allow remote sources.
Cargo.lock Drops vmm_test_images from the resolver crate’s dependency closure.

Comment on lines +265 to +267
/// this particular artifact. The resolver-wide policy set via
/// [`with_remote_access`](Self::with_remote_access) can further restrict
/// this to `LocalOnly`.
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rustdoc link references ArtifactResolver::with_remote_access, but no such method exists in this file. This will produce broken intra-doc links and is confusing for API consumers; either add the method (to override remote_policy) or update the docs to describe the actual control mechanism (PETRI_REMOTE_ARTIFACTS env var + per-call RemoteAccess).

Suggested change
/// this particular artifact. The resolver-wide policy set via
/// [`with_remote_access`](Self::with_remote_access) can further restrict
/// this to `LocalOnly`.
/// this particular artifact. The resolver's configured remote policy may
/// further restrict this request and force the effective access mode to
/// `LocalOnly`.

Copilot uses AI. Check for mistakes.
Comment on lines +493 to +496
/// Remote access for each artifact is determined by the per-call
/// [`RemoteAccess`] flags recorded during collection, as potentially
/// overridden by the resolver-wide policy set via
/// [`ArtifactResolver::with_remote_access`].
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestArtifactRequirements::resolve docs mention a resolver-wide override via ArtifactResolver::with_remote_access, but that API doesn’t exist (only PETRI_REMOTE_ARTIFACTS is implemented). Please either implement the override API or adjust this doc comment so it matches the real behavior.

Suggested change
/// Remote access for each artifact is determined by the per-call
/// [`RemoteAccess`] flags recorded during collection, as potentially
/// overridden by the resolver-wide policy set via
/// [`ArtifactResolver::with_remote_access`].
/// Remote access for each artifact is determined by the
/// [`RemoteAccess`] flags recorded during collection, subject to any
/// process-wide override configured via the `PETRI_REMOTE_ARTIFACTS`
/// environment variable.

Copilot uses AI. Check for mistakes.
Comment on lines +213 to +214
let format = if url.ends_with(".vhd") || url.ends_with(".vmgs") {
disk_backend_resources::BlobDiskFormat::FixedVhd1
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remote disk handling treats anything not ending in .vhd/.vmgs as BlobDiskFormat::Flat, but some blob-hosted test artifacts are .vhdx (e.g. WINDOWS_11_ENTERPRISE_AARCH64 in petri_artifacts_vmm_test). Flat will not parse a VHDX, so allowing remote access for those artifacts will fail at runtime. Consider explicitly rejecting .vhdx (force local download) or extending the blob disk stack to support VHDX.

Suggested change
let format = if url.ends_with(".vhd") || url.ends_with(".vmgs") {
disk_backend_resources::BlobDiskFormat::FixedVhd1
let url_path = url.split(['?', '#']).next().unwrap_or(url);
let format = if url_path.ends_with(".vhd") || url_path.ends_with(".vmgs") {
disk_backend_resources::BlobDiskFormat::FixedVhd1
} else if url_path.ends_with(".vhdx") {
anyhow::bail!(
"remote blob disks do not support VHDX artifacts: {url}. Download the artifact locally instead"
);

Copilot uses AI. Check for mistakes.
Comment on lines +133 to +150
fn resolve_source(&self, id: ErasedArtifactHandle) -> anyhow::Result<ArtifactSource> {
// Try local resolution first.
if let Ok(path) = self.resolve(id) {
return Ok(ArtifactSource::Local(path));
}

// Fall back to remote URL for artifacts hosted on Azure Blob Storage.
if let Some(info) = blob_artifact_info(id) {
let url = format!(
"https://{STORAGE_ACCOUNT}.blob.core.windows.net/{CONTAINER}/{}",
info.filename
);
return Ok(ArtifactSource::Remote { url });
}

// No local path and no remote URL available — return the original error.
self.resolve(id).map(ArtifactSource::Local)
}
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolve_source calls self.resolve(id) twice and the final comment says it returns the original error, but the original error is discarded. Capture the first error and return it if there’s no remote URL fallback to avoid duplicate work and ensure the error is preserved.

Copilot uses AI. Check for mistakes.
Comment on lines +156 to +185
/// Info about a blob-hosted artifact.
struct BlobArtifactInfo {
/// Filename in Azure Blob Storage (from `IsHostedOnHvliteAzureBlobStore::FILENAME`).
filename: &'static str,
/// CLI name for `cargo xtask guest-test download-image --artifacts <name>`.
download_name: &'static str,
}

/// Returns info about a blob-hosted artifact, if it is hosted on Azure Blob Storage.
fn blob_artifact_info(id: ErasedArtifactHandle) -> Option<BlobArtifactInfo> {
use petri_artifacts_core::AsArtifactHandle;
use petri_artifacts_vmm_test::artifacts::*;
use petri_artifacts_vmm_test::tags::IsHostedOnHvliteAzureBlobStore;

#[rustfmt::skip]
let info = match id {
_ if id == test_vhd::GEN1_WINDOWS_DATA_CENTER_CORE2022_X64.erase() => BlobArtifactInfo { filename: test_vhd::GEN1_WINDOWS_DATA_CENTER_CORE2022_X64::FILENAME, download_name: "Gen1WindowsDataCenterCore2022X64Vhd" },
_ if id == test_vhd::GEN2_WINDOWS_DATA_CENTER_CORE2022_X64.erase() => BlobArtifactInfo { filename: test_vhd::GEN2_WINDOWS_DATA_CENTER_CORE2022_X64::FILENAME, download_name: "Gen2WindowsDataCenterCore2022X64Vhd" },
_ if id == test_vhd::GEN2_WINDOWS_DATA_CENTER_CORE2025_X64.erase() => BlobArtifactInfo { filename: test_vhd::GEN2_WINDOWS_DATA_CENTER_CORE2025_X64::FILENAME, download_name: "Gen2WindowsDataCenterCore2025X64Vhd" },
_ if id == test_vhd::FREE_BSD_13_2_X64.erase() => BlobArtifactInfo { filename: test_vhd::FREE_BSD_13_2_X64::FILENAME, download_name: "FreeBsd13_2X64Vhd" },
_ if id == test_vhd::ALPINE_3_23_X64.erase() => BlobArtifactInfo { filename: test_vhd::ALPINE_3_23_X64::FILENAME, download_name: "Alpine323X64Vhd" },
_ if id == test_vhd::ALPINE_3_23_AARCH64.erase() => BlobArtifactInfo { filename: test_vhd::ALPINE_3_23_AARCH64::FILENAME, download_name: "Alpine323Aarch64Vhd" },
_ if id == test_vhd::UBUNTU_2404_SERVER_X64.erase() => BlobArtifactInfo { filename: test_vhd::UBUNTU_2404_SERVER_X64::FILENAME, download_name: "Ubuntu2404ServerX64Vhd" },
_ if id == test_vhd::UBUNTU_2504_SERVER_X64.erase() => BlobArtifactInfo { filename: test_vhd::UBUNTU_2504_SERVER_X64::FILENAME, download_name: "Ubuntu2504ServerX64Vhd" },
_ if id == test_vhd::UBUNTU_2404_SERVER_AARCH64.erase() => BlobArtifactInfo { filename: test_vhd::UBUNTU_2404_SERVER_AARCH64::FILENAME, download_name: "Ubuntu2404ServerAarch64Vhd" },
_ if id == test_vhd::WINDOWS_11_ENTERPRISE_AARCH64.erase() => BlobArtifactInfo { filename: test_vhd::WINDOWS_11_ENTERPRISE_AARCH64::FILENAME, download_name: "Windows11EnterpriseAarch64Vhdx" },
_ if id == test_iso::FREE_BSD_13_2_X64.erase() => BlobArtifactInfo { filename: test_iso::FREE_BSD_13_2_X64::FILENAME, download_name: "FreeBsd13_2X64Iso" },
_ if id == test_vmgs::VMGS_WITH_BOOT_ENTRY.erase() => BlobArtifactInfo { filename: test_vmgs::VMGS_WITH_BOOT_ENTRY::FILENAME, download_name: "VmgsWithBootEntry" },
_ if id == test_vmgs::VMGS_WITH_16K_TPM.erase() => BlobArtifactInfo { filename: test_vmgs::VMGS_WITH_16K_TPM::FILENAME, download_name: "VmgsWith16kTpm" },
_ => return None,
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blob_artifact_info hard-codes the download_name strings that are passed to cargo xtask guest-test download-image --artifacts .... This duplicates the source-of-truth currently embodied by vmm_test_images::KnownTestArtifacts (which derives the name from the enum variant) and can easily drift, making the suggested recovery command incorrect. Consider reusing a centralized mapping/constant for these names (or reintroducing vmm_test_images just for the name mapping).

Copilot uses AI. Check for mistakes.
Comment on lines +3144 to +3173
/// Returns the cache directory for lazy-fetched disk artifacts.
pub(crate) fn petri_disk_cache_dir() -> String {
if let Ok(dir) = std::env::var("PETRI_CACHE_DIR") {
return dir;
}

#[cfg(target_os = "macos")]
{
if let Ok(home) = std::env::var("HOME") {
return format!("{home}/Library/Caches/petri");
}
}

#[cfg(windows)]
{
if let Ok(local) = std::env::var("LOCALAPPDATA") {
return format!("{local}\\petri\\cache");
}
}

// Linux / fallback: XDG
if let Ok(xdg) = std::env::var("XDG_CACHE_HOME") {
return format!("{xdg}/petri");
}
if let Ok(home) = std::env::var("HOME") {
return format!("{home}/.cache/petri");
}

".cache/petri".to_string()
}
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces a new cache location mechanism via PETRI_CACHE_DIR (and the PR also introduces PETRI_REMOTE_ARTIFACTS in petri_artifacts_core). The developer docs for VMM test images currently only describe pre-downloading via cargo xtask guest-test download-image (see Guide dev_guide/tests/vmm/azure_images.md). Please update the Guide to document the new lazy-fetch behavior and the relevant env vars so local/CI behavior is clear.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants