
fix: add timeout and disable option for BigQuery Storage Read API#171

Open
EtoYoshimura03 wants to merge 3 commits into adbc-drivers:main from EtoYoshimura03:fix/storage-read-api-timeout

@EtoYoshimura03

fix: add timeout on BigQuery Storage Read API gRPC session to prevent indefinite hang

Problem

When the BigQuery Storage Read API is used (default), the driver can hang indefinitely during query execution. The exact hang point is iter.ArrowIterator(), which issues a gRPC CreateReadSession RPC to bigquerystorage.googleapis.com.

This occurs in environments where HTTP/2 (required by gRPC) is silently blocked — most commonly consumer VPNs and SSL-inspection proxies. The TCP handshake to the storage endpoint succeeds (making the connection appear live), but the gRPC call never completes and no error is returned.

Reproduction

Confirmed with dbt Fusion + hidemy.name VPN (AWS EC2, Frankfurt/Estonia exit nodes):

Without VPN: completes in 2.4s, fails fast with 403 (IP restriction)
TCP connects to 142.250.x.x (Google), gRPC never reached
dbtf debug

With VPN: hangs indefinitely, no error, no timeout
TCP connects, gRPC CreateReadSession never returns
dbtf debug ← stuck here forever

Debug logs confirm the query itself executes successfully (select 1 as id) before the hang; the problem is exclusively in opening the Storage Read session for streaming results.

Root cause

bigquery.RowIterator.ArrowIterator() issues a CreateReadSession gRPC call with no deadline. In environments where HTTP/2 is transparently blocked (VPN, proxy), the call blocks forever.

EnableStorageReadClient() itself is not the issue — it creates a lazy stub and returns immediately.

Fix

record_reader.go: wraps iter.ArrowIterator() in a goroutine with a 30-second context.WithTimeout. On timeout, returns an adbc.StatusTimeout error with an explicit message directing users to set OptionBoolDisableStorageReadClient=true.
connection.go: adds disableStorageReadClient and storageReadAPIEndpoint fields; skips EnableStorageReadClient() when disabled; supports a custom gRPC endpoint for testing/emulators.
bigquery_database.go: exposes both options at the database level with GetOption/SetOption.
driver.go: declares the two new option constants.
driver_test.go: adds TestDisableStorageReadClientOption, verifying the default value, set/unset, construction via NewDatabase, and rejection of invalid values.

Error message shown to users on timeout

[bq] BigQuery Storage Read API timed out after 30s.
This may be caused by HTTP/2 being blocked by a VPN or SSL inspection proxy.
Set adbc.bigquery.sql.disable_storage_read_client=true to disable the Storage Read API.

Notes

The 30s timeout is intentionally generous; most healthy environments complete CreateReadSession in under 1s.
disable_storage_read_client=true currently falls back to non-Arrow row iteration (no REST→Arrow path yet); this is a known limitation noted in the option docs.
storage_read_api_endpoint is included for emulator/local testing compatibility.
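
As an illustration of the behaviour TestDisableStorageReadClientOption is described as checking (default value, set/unset, rejection of invalid values), here is a rough sketch of a string-valued boolean option. The database type below is a hypothetical stand-in, not the driver's actual type, and the strict "true"/"false" matching is an assumption; only the option key is taken from the PR description.

```go
package main

import (
	"fmt"
	"strconv"
)

// Option key as quoted in the PR description.
const optionDisableStorageReadClient = "adbc.bigquery.sql.disable_storage_read_client"

// database is a hypothetical stand-in for the driver's database object.
type database struct {
	// Zero value false means the Storage Read API stays enabled by default.
	disableStorageReadClient bool
}

// SetOption accepts only "true" or "false" for the boolean option
// (an assumption of this sketch) and rejects everything else.
func (d *database) SetOption(key, value string) error {
	if key != optionDisableStorageReadClient {
		return fmt.Errorf("unknown option %s", key)
	}
	switch value {
	case "true":
		d.disableStorageReadClient = true
	case "false":
		d.disableStorageReadClient = false
	default:
		return fmt.Errorf("invalid value %q for %s: expected \"true\" or \"false\"", value, key)
	}
	return nil
}

// GetOption returns the current value rendered as "true" or "false".
func (d *database) GetOption(key string) (string, error) {
	if key != optionDisableStorageReadClient {
		return "", fmt.Errorf("unknown option %s", key)
	}
	return strconv.FormatBool(d.disableStorageReadClient), nil
}

func main() {
	db := &database{}
	v, _ := db.GetOption(optionDisableStorageReadClient)
	fmt.Println("default:", v)

	_ = db.SetOption(optionDisableStorageReadClient, "true")
	v, _ = db.GetOption(optionDisableStorageReadClient)
	fmt.Println("after set:", v)

	fmt.Println("invalid:", db.SetOption(optionDisableStorageReadClient, "yes"))
}
```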

Related upstream issue: dbt-labs/dbt-fusion#1552 — HTTPS_PROXY/NO_PROXY environment variables are ignored by the Go ADBC network stack, which means proxy-based workarounds are not available to affected users.

This issue surfaces specifically with dbt Fusion: dbt Core works correctly under the same VPN (queries complete successfully). Without VPN, standard dbt build runs fail with a database error — likely due to IP allowlist restrictions on the BigQuery project — so VPN is a required part of the setup for affected users, not an optional workaround. The hang described here is therefore a real blocker, not an edge case.

Also, I haven't built the full project (dbt Fusion) with this fix. I would be grateful if anyone could help me with that.

@EtoYoshimura03 EtoYoshimura03 force-pushed the fix/storage-read-api-timeout branch 2 times, most recently from 8cf196f to 82f90f7 Compare April 21, 2026 13:21
Contributor

@lidavidm lidavidm left a comment

Thanks for the PR!

We don't have a fallback if BigQuery Storage is not available, FWIW. (See #66) What are you expecting to happen currently?

Comment thread go/driver_test.go
Contributor

nit: we generally use the assert/require packages to clean up the assertions

Comment thread go/driver.go Outdated
// OptionBoolDisableStorageReadClient disables the BigQuery Storage Read API (gRPC/HTTP2).
// When set to "true", the driver falls back to the standard REST API.
// Useful in environments where HTTP/2 is blocked or unavailable (e.g., SSL inspection proxies).
OptionBoolDisableStorageReadClient = "adbc.bigquery.sql.disable_storage_read_client"
Contributor

From #66, I think it would be preferable to have an option like bigquery.query.backend_api that accepts storage_read/jobs over a toggle
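
To make the suggestion concrete, one possible shape for an enumerated option is sketched below. Only the option name bigquery.query.backend_api and the two values storage_read and jobs come from the comment above; the BackendAPI type, the constants, and parseBackendAPI are hypothetical names, and exact-match validation is an assumption.

```go
package main

import "fmt"

// BackendAPI names the API used to fetch query results (hypothetical type).
type BackendAPI string

const (
	BackendStorageRead BackendAPI = "storage_read"
	BackendJobs        BackendAPI = "jobs"
)

// parseBackendAPI validates a raw option value against the known backends.
func parseBackendAPI(raw string) (BackendAPI, error) {
	switch v := BackendAPI(raw); v {
	case BackendStorageRead, BackendJobs:
		return v, nil
	default:
		return "", fmt.Errorf("invalid backend_api value %q: expected %q or %q",
			raw, BackendStorageRead, BackendJobs)
	}
}

func main() {
	for _, raw := range []string{"storage_read", "jobs", "rest"} {
		api, err := parseBackendAPI(raw)
		fmt.Printf("%q -> %q, err=%v\n", raw, api, err)
	}
}
```

Compared with a boolean toggle, an enumerated value leaves room for further backends later without adding another option.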

…ent to backend_api func - which accepts storage_read/jobs
@EtoYoshimura03
Author

Hello, thank you for the code review.

Fallback — is a REST fallback something you're planning in the future, or would you accept one from outside contributors? I read through #66 and the related discussion (including dbt-labs/dbt-fusion#236) and couldn't quite tell why the fallback has been declined — is it a "not right now" vs. "out of scope entirely" decision?

The answer matters for this PR: if a fallback isn't something you want in the driver at all, I'd rather keep just the timeout and drop backend_api to avoid adding a dead option. If it might land eventually, keeping the option as scaffolding makes sense.

> What are you expecting to happen currently?

What I'm expecting currently: honestly, this PR doesn't solve my actual problem (the driver still can't read results through my environment). But at least it turns the indefinite hang into a clear error with an actionable message, which is a meaningful improvement over the current behavior. That's the minimum I was aiming for.

@lidavidm
Contributor

We would welcome a "regular" fallback. That issue is only closed because dbt found an acceptable response in the interim; I don't think we on this repo declined the feature (that's why #66 is still open!)

Comment thread go/record_reader.go
}
ch := make(chan arrowIterResult, 1)
go func() {
	ai, err := iter.ArrowIterator()
Contributor

I wonder if we need to fork to be able to pass a context down...we already had to fork to fix something else (Google has been unresponsive on the issue filed)

Author

I don't know. Should I add a TODO here?

Contributor

@zeroshade is your Google friend up for fixing this too?

Contributor

Actually, this appears to be a plain getter: https://github.com/googleapis/google-cloud-go/blob/9773f607fd1d2d528cd82b2544fc10bce3c2ac74/bigquery/storage_iterator.go#L361-L377

Why do we need to wrap this in a timeout?

Comment thread go/record_reader.go Outdated
Code: adbc.StatusTimeout,
Msg: "[bq] BigQuery Storage Read API timed out after 30s. " +
	"This may be caused by HTTP/2 being blocked by a VPN or SSL inspection proxy, " +
	"or by running the driver inside a c-shared DLL where the gRPC stream stalls. " +
Contributor

What is this referring to?

Author

> What is this referring to?

That referred to some independent investigation I did while diagnosing the hang — with the standard cloud.google.com/go/bigquery library (and your fork at v1.73.1-patch), ArrowIterator() works fine over the same VPN/network. But when this ADBC driver is loaded as a -buildmode=c-shared DLL into the dbt Fusion Rust process, the gRPC stream stalls indefinitely in exactly that call. I couldn't pin down a root cause in the CGO/Go-runtime-under-Rust interaction, so I was hedging in the error message.

I will drop it, since it means nothing in the error message by itself, and leave only the timeout error message.

@EtoYoshimura03
Author

> We would welcome a "regular" fallback. That issue is only closed because dbt found an acceptable response in the interim; I don't think we on this repo declined the feature (that's why #66 is still open!)

Great, thanks for clarifying — that's encouraging. I'll keep backend_api as scaffolding in this PR then, and the actual REST fallback can follow in a separate PR

@EtoYoshimura03 EtoYoshimura03 requested a review from lidavidm April 29, 2026 11:23