feat(server): stream monolithic blob PUT straight to object store#7

Merged: CMGS merged 3 commits into main from feat/push-streaming-blob on Jun 28, 2026

@tonicmuroq tonicmuroq commented Jun 18, 2026

Problem

persistMonolithicUpload (the PUT /v2/<name>/blobs/<digest> path — the only push path our clients use) spooled the entire blob to a disk tempfile via the chunked-upload session machinery, hashed it, then read it back and uploaded to the object store. So a push was:

client ──> epoch: write whole blob to /var/cache/epoch/uploads tempfile (+ sha256)
                     THEN read it back ──> upload to GCS

Two full passes, ~2x disk I/O, and receive and upload run serially. For multi-GiB VM disk/memory blobs this was the bulk of push time (single PUTs were taking 2–4 min).

Change

A monolithic PUT already knows the digest up front (it's in the URL), so there's no reason to buffer. Stream the request body through a sha256 hasher straight into a concurrent multipart upload (minio ConcurrentStreamParts, bounded at PartSize*NumThreads = 64 MiB × 4), verify the digest once the stream drains, delete on mismatch.

  • No disk spool; receive and GCS upload overlap; multipart instead of a single stream.
  • Server-side sha256 verification preserved — hashed inline, mismatch → object deleted + DIGEST_INVALID.
  • No client change, no trust-model change. Bytes still transit epoch (this fixes the spool/serial waste; it does not move epoch out of the data path — that's presigned-PUT-direct, a separate change).
  • The chunked PATCH path still spools to disk (its digest is only known at finalize); no first-party client uses it.

Measured live (deployed to cocoonstack-us, image redirect-stream-20260618)

Server-side PUT durations (pure upload-through-epoch — most accurate):

| upload | result |
| --- | --- |
| real snapshot push win10-20260618-2 (~10.4 GB, 5 layers), big layer A | 201 in 61s |
| same push, big layer B | 201 in 39s |
| same push, small layers / manifest | 201 in 125–465 ms |
| synthetic 128 MB blob | 201 in 1.6s (~78 MB/s) |
| 128 MB with wrong digest | 400 DIGEST_INVALID (verify + delete works) |

  • No disk spool, confirmed live: the upload-spool dir stayed empty (4.0K) sampled 3× during a multi-GB PUT → bytes stream straight through, nothing buffered whole. This is the core fix.
  • Footprint during push: CPU < 1 core (peak ~775m), mem ~10 MiB (bounded by multipart part buffers, not blob size).
  • The old double-buffered path took ~3–4 min for comparable big blobs; now 39s / 61s.
  • End-to-end bake push (CI wall-clock) measured ~45–49 MB/s vs ~27 MB/s before ≈ 1.7×. (End-to-end is lower than the per-blob server rate above because it includes client-side cocoon snapshot export + per-blob sha256 buffering, not just the network upload.)

Remaining ceiling: bytes still transit epoch (TLS in + multipart out on ~3 CPU), so big layers cap ~80–110 MB/s. GCS-direct push (presigned PUT) is a possible follow-up.

Relation to #6

Independent of #6 (blob GET redirect). Together: pull bypasses the proxy entirely (#6), push stops double-buffering and parallelizes the GCS leg (this). Fully removing epoch from the push data path (presigned PUT direct to GCS, first-party-trust + GCS-CRC32C) is a possible follow-up.

@tonicmuroq tonicmuroq force-pushed the feat/push-streaming-blob branch 3 times, most recently from 3bd5da9 to e37e178 Compare June 22, 2026 16:47
@CMGS CMGS force-pushed the feat/push-streaming-blob branch from 4f23c9d to 740605c Compare June 28, 2026 17:10
persistMonolithicUpload spooled the entire blob to a disk tempfile (via
the chunked-upload session machinery), hashed it, then read it back and
uploaded to the object store — two full passes plus ~2x disk I/O, and
receive/upload run serially. For multi-GiB VM disk/memory blobs that's
the bulk of a push (single PUTs were taking 2-4 min).

Monolithic PUT already knows the digest up front (it's in the URL), so
there's no need to buffer: stream the request body through a sha256
hasher straight into a concurrent multipart upload (no disk), verify the
digest once the stream drains, and delete on mismatch so the
content-addressed key never keeps unverified bytes. Server-side digest
verification is preserved; no client change.

The chunked PATCH path (no first-party client uses it) still spools to
disk, since its digest is only known at finalize.

GCS S3-compat streaming multipart validated against staging: 12 MiB in
3 concurrent 5 MiB parts, sha256 round-trip verified.
@CMGS CMGS force-pushed the feat/push-streaming-blob branch from 740605c to dd18dc6 Compare June 28, 2026 18:04
CMGS added 2 commits June 29, 2026 02:21
Streaming writes direct to the digest key (kept for speed), then verifies.
- fail closed when BlobExists errors: carrying on with the stream could overwrite
  a valid blob with unverified bytes, and the mismatch path would then delete it
- delete a mismatched blob on a detached, bounded context so cancellation
  can't suppress it (leaving bytes the dedup trusts); log for a sweep
- 500 store failures return INTERNAL_ERROR, not BLOB_UPLOAD_INVALID
Each monolithic streaming upload holds ~256 MiB (64 MiB x 4 parts), so
unbounded concurrency OOMs the server. Gate them through a buffered-channel
semaphore sized by EPOCH_MAX_STREAMING_UPLOADS (default 8); excess uploads
block on TCP backpressure instead of allocating. The dedup short-circuit
and the disk-spooled chunked path stay ungated (they don't carry the cost).
@CMGS CMGS force-pushed the feat/push-streaming-blob branch from dd18dc6 to 1a3e9e0 Compare June 28, 2026 18:22
@CMGS CMGS merged commit b36d15a into main Jun 28, 2026
2 checks passed
@CMGS CMGS deleted the feat/push-streaming-blob branch June 28, 2026 18:35