Skip to content

[Storage] upload roundtrip concerns with download API#4075

Draft
vincenttran-msft wants to merge 2 commits intoAzure:mainfrom
vincenttran-msft:vincenttran/check_upload_usability
Draft

[Storage] upload roundtrip concerns with download API#4075
vincenttran-msft wants to merge 2 commits intoAzure:mainfrom
vincenttran-msft:vincenttran/check_upload_usability

Conversation

@vincenttran-msft
Copy link
Copy Markdown
Member

@vincenttran-msft vincenttran-msft commented Mar 31, 2026

This is applicable to its current state, but will also still be applicable even after managed_download replaces download.

Against our current authoring, the only way to upload a downloaded blob is to buffer in-memory to be able to get it into a RequestContent:

 let download_response = source_blob_client.download(None).await?;
 let buffered: Bytes = download_response.into_body().collect().await?;
 // Only after full buffering can the data be passed to `upload`.
    dest_blob_client
        .upload(RequestContent::from(buffered.to_vec()), None)
        .await?;

Applying a similar workaround as @benbp :

    // content_length must be read from headers *before* deconstruct() consumes the response,
    // because the body stream itself has no len().
    let download_response = source_blob_client.download(None).await?;
    let content_length = download_response.content_length()?;
    let (_, _, body) = download_response.deconstruct();

 // use our custom SeekableStream adapter impl, but this makes retries impossible due to the always-error reset() impl
    let stream = DownloadBodySeekableStream {
        inner: Arc::new(Mutex::new(DownloadBodyInner {
            stream: Some(Box::pin(body)),
            remainder: Bytes::new(),
        })),
        len: content_length.unwrap_or(0),
    };

// But allows this to flow the bytes directly through instead of buffering in-memory
    dest_blob_client
        .upload(Body::SeekableStream(Box::new(stream)).into(), None)
        .await?;

@github-actions github-actions bot added the Storage Storage Service (Queues, Blobs, Files) label Mar 31, 2026
Comment on lines +1182 to +1196
// Act — download the blob
// `download` returns AsyncResponse<BlobClientDownloadResult>: the body is a PinnedStream.
// PinnedStream is a Pin<Box<dyn Stream<Item=Result<Bytes>> + Send>>: it is forward-only
// and cannot be reset or asked for its length without consuming it first.
let download_response = source_blob_client.download(None).await?;

// The content-length header is available, but the body stream itself has no `len()` method
// and no `reset()`. There is no conversion from AsyncResponseBody into RequestContent or
// Body::SeekableStream. The only path forward is to collect the entire stream into memory.
let buffered: Bytes = download_response.into_body().collect().await?;

// Only after full buffering can the data be passed to `upload`.
dest_blob_client
.upload(RequestContent::from(buffered.to_vec()), None)
.await?;
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the crux of the issue.

Comment on lines +1340 to +1346
let stream = DownloadBodySeekableStream {
inner: Arc::new(Mutex::new(DownloadBodyInner {
stream: Some(Box::pin(body)),
remainder: Bytes::new(),
})),
len: content_length.unwrap_or(0),
};
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the crux of the workaround: a custom DownloadBodySeekableStream adapter so that we can get it into a RequestContent without collecting in-memory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Storage Storage Service (Queues, Blobs, Files)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant