fix: add timeout and disable option for BigQuery Storage Read API #171
EtoYoshimura03 wants to merge 3 commits into adbc-drivers:main
Conversation
Force-pushed 8cf196f to 82f90f7
nit: we generally use the assert/require packages to clean up the assertions
// OptionBoolDisableStorageReadClient disables the BigQuery Storage Read API (gRPC/HTTP2).
// When set to "true", the driver falls back to the standard REST API.
// Useful in environments where HTTP/2 is blocked or unavailable (e.g., SSL inspection proxies).
OptionBoolDisableStorageReadClient = "adbc.bigquery.sql.disable_storage_read_client"
From #66, I think it would be preferable to have an option like bigquery.query.backend_api that accepts storage_read/jobs over a toggle
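The reviewer's suggestion of a `backend_api` option that accepts explicit values rather than a boolean toggle could be validated like this; a minimal sketch, where the option key and function names are illustrative, not the driver's actual API:

```go
package main

import "fmt"

// Hypothetical option key following the reviewer's suggestion in #66.
const OptionQueryBackendAPI = "adbc.bigquery.query.backend_api"

// validateBackendAPI accepts only the two backends named in the review
// comment and rejects everything else with a descriptive error.
func validateBackendAPI(v string) (string, error) {
	switch v {
	case "storage_read", "jobs":
		return v, nil
	default:
		return "", fmt.Errorf("invalid value %q for %s: expected storage_read or jobs",
			v, OptionQueryBackendAPI)
	}
}

func main() {
	b, err := validateBackendAPI("storage_read")
	fmt.Println(b, err)
	_, err = validateBackendAPI("rest")
	fmt.Println(err)
}
```

An enum-style option leaves room for a future `rest` value if a REST fallback lands, whereas a boolean toggle would need a second option at that point.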
…ent to backend_api func - which accepts storage_read/jobs
Hello, thank you for the CR. Fallback — is a REST fallback something you're planning in the future, or would you accept one from outside contributors? I read through #66 and the related discussion (including dbt-labs/dbt-fusion#236) and couldn't quite tell why the fallback was declined — is it a "not right now" or an "out of scope entirely" decision? The answer matters for this PR: if a fallback isn't something you want in the driver at all, I'd rather keep just the timeout and drop backend_api to avoid adding a dead option. If it might land eventually, keeping the option as scaffolding makes sense.
What I'm expecting currently — honestly, this PR doesn't solve my actual problem (the driver still can't read results through my environment). But at least it turns the indefinite hang into a clear error with an actionable message, which is a meaningful improvement over the current behavior. That's the minimum I was aiming for.
We would welcome a "regular" fallback. That issue is only closed because dbt found an acceptable response in the interim; I don't think we on this repo declined the feature (that's why #66 is still open!)
}
ch := make(chan arrowIterResult, 1)
go func() {
	ai, err := iter.ArrowIterator()
I wonder if we need to fork to be able to pass a context down...we already had to fork to fix something else (Google has been unresponsive on the issue filed)
I don't know. Should I add a TODO here?
@zeroshade is your Google friend up for fixing this too?
Actually, this appears to be a plain getter: https://github.com/googleapis/google-cloud-go/blob/9773f607fd1d2d528cd82b2544fc10bce3c2ac74/bigquery/storage_iterator.go#L361-L377
Why do we need to wrap this in a timeout?
Code: adbc.StatusTimeout,
Msg: "[bq] BigQuery Storage Read API timed out after 30s. " +
	"This may be caused by HTTP/2 being blocked by a VPN or SSL inspection proxy, " +
	"or by running the driver inside a c-shared DLL where the gRPC stream stalls. " +
What is this referring to?
That referred to some independent investigation I did while diagnosing the hang — with the standard cloud.google.com/go/bigquery library (and your fork at v1.73.1-patch), ArrowIterator() works fine over the same VPN/network. But when this ADBC driver is loaded as a -buildmode=c-shared DLL into the dbt Fusion Rust process, the gRPC stream stalls indefinitely in exactly that call. I couldn't pin down a root cause in the CGO/Go-runtime-under-Rust interaction, so I was hedging in the error message.
I will drop it, since that hedge adds nothing to the error on its own, and keep only the timeout error message.
Great, thanks for clarifying — that's encouraging. I'll keep backend_api as scaffolding in this PR then, and the actual REST fallback can follow in a separate PR.
… change to neutral
fix: add timeout on BigQuery Storage Read API gRPC session to prevent indefinite hang
Problem
When the BigQuery Storage Read API is used (default), the driver can hang indefinitely during query execution. The exact hang point is iter.ArrowIterator(), which issues a gRPC CreateReadSession RPC to bigquerystorage.googleapis.com.
This occurs in environments where HTTP/2 (required by gRPC) is silently blocked — most commonly consumer VPNs and SSL-inspection proxies. The TCP handshake to the storage endpoint succeeds (making the connection appear live), but the gRPC call never completes and no error is returned.
Reproduction
Confirmed with dbt Fusion + hidemy.name VPN (AWS EC2, Frankfurt/Estonia exit nodes):
- Without VPN — `dbtf debug` completes in 2.4s and fails fast with 403 (IP restriction). TCP connects to 142.250.x.x (Google); gRPC is never reached.
- With VPN — `dbtf debug` hangs indefinitely with no error and no timeout. TCP connects, but the gRPC CreateReadSession never returns.
Debug logs confirm the query itself executes successfully (select 1 as id) before the hang — the problem is exclusively in opening the Storage Read session for streaming results.
Root cause
bigquery.RowIterator.ArrowIterator() issues a CreateReadSession gRPC call with no deadline. In environments where HTTP/2 is transparently blocked (VPN, proxy), the call blocks forever.
EnableStorageReadClient() itself is not the issue — it creates a lazy stub and returns immediately.
Fix
record_reader.go — wraps iter.ArrowIterator() in a goroutine with a 30-second context.WithTimeout. On timeout, returns an adbc.StatusTimeout error with an explicit message directing users to set OptionBoolDisableStorageReadClient=true.
connection.go — adds disableStorageReadClient and storageReadAPIEndpoint fields; skips EnableStorageReadClient() when disabled; supports custom gRPC endpoint for testing/emulators.
bigquery_database.go — exposes both options at the database level with GetOption/SetOption.
driver.go — declares the two new option constants.
driver_test.go — adds TestDisableStorageReadClientOption verifying default value, set/unset, construction via NewDatabase, and rejection of invalid values.
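The option-validation behavior the new test exercises can be sketched as follows; the option key is from this PR, while `setOption` is an illustrative stand-in for the driver's SetOption, not its real implementation:

```go
package main

import "fmt"

// Option key added by this PR (driver.go).
const optDisableStorageRead = "adbc.bigquery.sql.disable_storage_read_client"

// setOption accepts only "true"/"false" for the boolean option and
// rejects anything else, mirroring the checks in
// TestDisableStorageReadClientOption.
func setOption(opts map[string]string, key, value string) error {
	if key == optDisableStorageRead && value != "true" && value != "false" {
		return fmt.Errorf("invalid value %q for option %s", value, key)
	}
	opts[key] = value
	return nil
}

func main() {
	// Default: Storage Read API enabled, i.e. the option is "false".
	opts := map[string]string{optDisableStorageRead: "false"}
	fmt.Println(setOption(opts, optDisableStorageRead, "true"))
	fmt.Println(setOption(opts, optDisableStorageRead, "maybe"))
}
```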
Error message shown to users on timeout
[bq] BigQuery Storage Read API timed out after 30s.
This may be caused by HTTP/2 being blocked by a VPN or SSL inspection proxy.
Set adbc.bigquery.sql.disable_storage_read_client=true to disable the Storage Read API.
Notes
The 30s timeout is intentionally generous; most healthy environments complete CreateReadSession in under 1s.
disable_storage_read_client=true currently falls back to non-Arrow row iteration (no REST→Arrow path yet); this is a known limitation noted in the option docs.
storage_read_api_endpoint is included for emulator/local testing compatibility.
Related upstream issue: dbt-labs/dbt-fusion#1552 — HTTPS_PROXY/NO_PROXY environment variables are ignored by the Go ADBC network stack, which means proxy-based workarounds are not available to affected users.
This issue surfaces specifically with dbt Fusion: dbt Core works correctly under the same VPN (queries complete successfully). Without VPN, standard dbt build runs fail with a database error — likely due to IP allowlist restrictions on the BigQuery project — so VPN is a required part of the setup for affected users, not an optional workaround. The hang described here is therefore a real blocker, not an edge case.
Also, I didn't build the full project (dbt Fusion) with this fix. I'd be grateful if anyone could help me with that.