Stream CloudFetch result chunks to bound memory and prevent OOM on large results#1509
Open
msrathore-db wants to merge 3 commits into
Large query results downloaded via CloudFetch could exhaust the JVM heap because up to cloudFetchThreadPoolSize chunks were downloaded and held in memory concurrently, regardless of their size. On small heaps this caused an OutOfMemoryError while readying the first chunk.

Track each chunk's byte size from the result manifest and gate chunk scheduling on a configurable in-memory byte budget (default: a fraction of the JVM max heap) in addition to the existing thread-pool limit. At least one chunk is always allowed so an oversized chunk cannot stall consumption.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
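The gating rule in this commit message can be sketched roughly as follows. This is a minimal illustration, not the driver's implementation: the class name `ChunkByteBudget` and its methods are hypothetical, and the budget is passed in explicitly because the actual default heap fraction is not stated here.

```java
/**
 * Hypothetical sketch of a byte budget for in-flight chunks.
 * Names and structure are assumptions, not the driver's real classes.
 */
final class ChunkByteBudget {
    private final long maxBytes;   // configured budget in bytes
    private long bytesInUse = 0;   // bytes reserved by admitted chunks
    private int chunksInUse = 0;   // number of admitted, unconsumed chunks

    ChunkByteBudget(long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        this.maxBytes = maxBytes;
    }

    /**
     * Tries to reserve room for a chunk of the given size.
     * A chunk is admitted if it fits in the remaining budget, or if no
     * other chunk is currently held (so an oversized chunk cannot stall).
     */
    synchronized boolean tryAcquire(long chunkBytes) {
        // Written as a subtraction so an over-budget state cannot overflow.
        boolean fits = chunkBytes <= maxBytes - bytesInUse;
        if (chunksInUse == 0 || fits) {
            bytesInUse += chunkBytes;
            chunksInUse++;
            return true;
        }
        return false;
    }

    /** Returns a chunk's bytes to the budget once it has been consumed. */
    synchronized void release(long chunkBytes) {
        bytesInUse -= chunkBytes;
        chunksInUse--;
    }

    synchronized long bytesInUse() {
        return bytesInUse;
    }
}
```

A scheduler built on this sketch would call `tryAcquire` with the chunk size from the result manifest before submitting a download, and `release` after the consumer has finished with the chunk. The thread-pool limit would still apply on top of it.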
8e41b07 to 7c78368
…heap

Each chunk was decompressed by fully materializing the decompressed Arrow payload into an on-heap byte[] (several times the compressed size) before parsing. With multiple chunks downloading and decompressing in parallel, these transient on-heap copies are what exhausted the Java heap on small heaps.

Decompression is now streamed directly into the Arrow reader, so the decompressed payload is never held on-heap alongside the compressed bytes (the parsed vectors are off-heap). Chunks are still downloaded and parsed in parallel ahead of consumption, so throughput is unaffected; combined with the in-memory byte budget for concurrent downloads, peak heap for large CloudFetch results drops substantially.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
7c78368 to f8063a4
…size

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>

# Conflicts:
#	NEXT_CHANGELOG.md
8fca624 to 7c29f02
Summary
Fixes #1508.
Downloading large query results via CloudFetch could exhaust the JVM heap and throw java.lang.OutOfMemoryError: Java heap space (surfaced as Failed to ready chunk / Download failed for chunk index 0), especially on smaller heaps.
Root cause
The OutOfMemoryError is on-heap. Two factors combined to make on-heap usage scale with the CloudFetch download concurrency (cloudFetchThreadPoolSize, default 16) rather than with available heap:
- Each chunk was decompressed by fully materializing the decompressed Arrow payload into an on-heap byte[] (several times the compressed size) before parsing.
- Up to cloudFetchThreadPoolSize of those transient decompressed copies existed at once.
(The parsed Arrow vectors themselves are off-heap, so they were not the cause; the transient on-heap decompression buffers were.)
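The on-heap cost described above can be illustrated with a minimal sketch that contrasts full materialization with a streaming read. Everything here is an assumption made for illustration: gzip from `java.util.zip` stands in for whatever codec the driver actually uses, a byte-counting loop stands in for the Arrow reader, and the class and method names are hypothetical rather than the driver's `DecompressionUtil` API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch; gzip is a stand-in codec, not the driver's. */
final class DecompressionSketch {

    /** Materializing approach: the whole decompressed payload lands in one on-heap byte[]. */
    static byte[] decompressFully(byte[] compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray(); // full decompressed copy held on-heap
        }
    }

    /** Streaming approach: the caller pulls decompressed bytes on demand. */
    static InputStream decompressToStream(byte[] compressed) throws IOException {
        return new GZIPInputStream(new ByteArrayInputStream(compressed));
    }

    /** Stand-in consumer: reads through a small fixed buffer and counts bytes. */
    static long consume(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        try (InputStream s = in) {
            while ((n = s.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    /** Helper that produces compressed test input. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        }
        return bos.toByteArray();
    }
}
```

In the materializing variant, each in-flight chunk holds both its compressed bytes and a full decompressed copy, so the on-heap total grows with the number of concurrent downloads. In the streaming variant, the decompressed side needs only the small read buffer plus the decompressor's internal state.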
Fix
- Decompression is now streamed directly into the Arrow reader (DecompressionUtil.decompressToInputStream), so the decompressed payload is never materialized on-heap alongside the compressed bytes. Chunks are still downloaded, decompressed, and parsed in parallel ahead of consumption, so throughput is unaffected.
- Chunk scheduling is gated on a configurable in-memory byte budget (the cloudFetchMaxBytesInMemory connection property) in addition to the existing thread-pool limit. At least one chunk is always allowed so an oversized chunk cannot stall consumption; the budget is released as chunks are consumed.
Applies to both the SQL Execution (SEA) and Thrift result paths, and to RemoteChunkProvider and StreamingChunkProvider.
Testing
Verified against a SQL warehouse on a 26-column, 169,769-row table (results split into 8 CloudFetch chunks):
- … -Xmx128m; the patched driver streams the full result at -Xmx40m (peak ~38 MB), roughly a 3x reduction in required heap.
- At -Xmx2g, the patched driver reads the same result in ~14-16 s vs ~22-28 s unpatched, with lower peak heap: streaming decompression removes the per-chunk full-copy allocation and its GC churn, and the byte budget does not throttle when heap is ample.
- The jdbc-core unit suite passes; the only failing tests locally are pre-existing integration/e2e tests unrelated to this change (confirmed failing identically on main).
This pull request and its description were written by Isaac.
NO_CHANGELOG=true