Add server ingestion OOM protection #18784

Open
xiangfu0 wants to merge 1 commit into apache:master from xiangfu0:server-ingestion-oom-protection

@xiangfu0 xiangfu0 commented Jun 17, 2026

Summary

This PR adds server ingestion OOM protection for realtime ingestion. When protection is enabled for a consuming
segment, the server checks JVM heap usage before fetching more realtime messages. If heap usage is at or above the
configured throttle threshold, the consuming loop waits locally and resumes after heap usage drops to the recovery
threshold.

The protection uses Pinot's existing JVM heap usage tracking (ResourceUsageUtils, backed by MemoryMXBean), which is
the same process-level heap source used by query OOM protection/accounting. It adds a separate realtime-ingestion
backpressure gate on top of that shared heap signal; it does not reuse per-query attribution or query kill logic.
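
As a rough illustration of the shared heap signal, the sketch below reads process-level heap usage from the standard MemoryMXBean. HeapUsageSampler and its method names are hypothetical and are not Pinot's ResourceUsageUtils; only the java.lang.management calls are real JDK APIs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper for illustration only; not Pinot's ResourceUsageUtils. */
final class HeapUsageSampler {
  private HeapUsageSampler() {
  }

  /** Returns used/max as a ratio, or 0.0 when the JVM reports no maximum. */
  static double ratio(long usedBytes, long maxBytes) {
    // MemoryUsage.getMax() returns -1 when the maximum is undefined.
    if (maxBytes <= 0) {
      return 0.0;
    }
    return (double) usedBytes / maxBytes;
  }

  /** Samples the current process-level JVM heap usage ratio. */
  static double sampleHeapUsageRatio() {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    return ratio(heap.getUsed(), heap.getMax());
  }
}
```

The ratio is process-wide, which is why it can be shared between query and ingestion protection without any per-query or per-table attribution.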

The implementation keeps heap sampling, hysteresis, GC requests, and metrics in a single server-wide throttle state.
Each consuming thread only checks whether its table is enrolled and then reads the shared server throttle status, so the
per-table work stays light.
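
The hysteresis described above can be sketched as a small state machine. This is a minimal sketch under assumptions, not the PR's ServerThrottleState: the class name HysteresisThrottle and its update method are hypothetical, and the thresholds are passed in as plain ratios.

```java
/** Hypothetical hysteresis gate for illustration; not the PR's actual class. */
final class HysteresisThrottle {
  private final double _throttleThreshold;
  private final double _recoveryThreshold;
  private boolean _throttled;

  HysteresisThrottle(double throttleThreshold, double recoveryThreshold) {
    _throttleThreshold = throttleThreshold;
    _recoveryThreshold = recoveryThreshold;
  }

  /** Feeds one heap sample and returns whether ingestion should currently wait. */
  synchronized boolean update(double heapUsageRatio) {
    if (!_throttled && heapUsageRatio >= _throttleThreshold) {
      // Start throttling at or above the throttle threshold.
      _throttled = true;
    } else if (_throttled && heapUsageRatio <= _recoveryThreshold) {
      // Resume only once usage has dropped to the recovery threshold or lower.
      _throttled = false;
    }
    return _throttled;
  }
}
```

With thresholds 0.95 and 0.90, a sample of 0.92 keeps whatever state the gate was already in, which avoids flapping between throttled and unthrottled when heap usage hovers near a single cutoff.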

Default Behavior

With no config changes, Pinot servers do not hold or stop realtime ingestion for this feature.

Effective defaults:

| Setting | Default | Effect |
| --- | --- | --- |
| Server mode | DISABLE | Server-level protection is off unless explicitly changed. |
| Throttle threshold | 0.95 | When protection is active, ingestion waits when JVM heap usage is at or above 95%. |
| Recovery threshold | 0.90 | When protection is active, ingestion resumes when JVM heap usage drops to 90% or lower. |
| Check interval | 1000 ms | A throttled consuming loop rechecks the shared server throttle status once per second. |
| GC request interval | 30000 ms | While throttled, Pinot asks the JVM to run GC at most once every 30 seconds. |

Default runtime behavior:

  • Does not stop, hold, or throttle ingestion unless the server mode is changed from DISABLE or a table explicitly sets
    mode to ENABLE.
  • Uses server-local backpressure only; it does not pause the table through controller APIs or change stream offsets.
  • Applies only while a segment is in INITIAL_CONSUMING; catch-up paths such as CATCHING_UP and
    CONSUMING_TO_ONLINE are not gated.
  • Requests JVM GC while throttled, using the same GC-hint mechanism as query OOM protection, to avoid an
    ingestion-heavy server staying paused only because reclaimable objects are still counted as used heap.
  • Table-level mode=ENABLE opts a table in even when the server mode is DISABLE or would otherwise skip the table.
  • Table-level mode=DISABLE opts a table out even when the server mode would protect it.

Server Configuration

Server properties use the user-facing pinot.server.instance.* prefix. They can be supplied in the server instance config
at startup, and they are also dynamic cluster configs: updating the same keys through the cluster config updates the
shared server throttle state at runtime without a server restart.

Cluster config values override the boot-time server instance config while present. Removing a cluster config key falls
back to the boot-time instance config value, or to the default if the instance config did not set it.
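
The precedence rule above (cluster config, then boot-time instance config, then default) can be sketched with plain maps. ConfigPrecedence and resolve are hypothetical names for illustration; the PR's actual config plumbing is not shown here.

```java
import java.util.Map;

/** Hypothetical illustration of the lookup order; not Pinot's config classes. */
final class ConfigPrecedence {
  private ConfigPrecedence() {
  }

  /** Cluster config wins while present, then instance config, then the default. */
  static String resolve(String key, Map<String, String> clusterConfig, Map<String, String> instanceConfig,
      String defaultValue) {
    String value = clusterConfig.get(key);
    if (value != null) {
      return value;
    }
    value = instanceConfig.get(key);
    return value != null ? value : defaultValue;
  }
}
```

Removing a key from the cluster map makes the same lookup fall through to the instance map, which mirrors the fallback behavior described for deleted cluster config keys.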

Enable Upsert And Dedup Protection

pinot.server.instance.ingestion.oom.protection.mode=UPSERT_DEDUP_ONLY
pinot.server.instance.ingestion.oom.protection.heapUsageThrottleThreshold=0.95
pinot.server.instance.ingestion.oom.protection.heapUsageRecoveryThreshold=0.90
pinot.server.instance.ingestion.oom.protection.checkIntervalMs=1000
pinot.server.instance.ingestion.oom.protection.gcIntervalMs=30000

Update Through Cluster Config

Use the same full pinot.server.instance.* names when updating cluster config:

curl -X POST "http://<controller-host>:<controller-port>/cluster/configs" \
  -H "Content-Type: application/json" \
  -d '{
    "pinot.server.instance.ingestion.oom.protection.mode": "UPSERT_DEDUP_ONLY",
    "pinot.server.instance.ingestion.oom.protection.heapUsageThrottleThreshold": "0.95",
    "pinot.server.instance.ingestion.oom.protection.heapUsageRecoveryThreshold": "0.90",
    "pinot.server.instance.ingestion.oom.protection.checkIntervalMs": "1000",
    "pinot.server.instance.ingestion.oom.protection.gcIntervalMs": "30000"
  }'

To disable the server-level policy dynamically:

curl -X POST "http://<controller-host>:<controller-port>/cluster/configs" \
  -H "Content-Type: application/json" \
  -d '{"pinot.server.instance.ingestion.oom.protection.mode": "DISABLE"}'

Server Config Reference

| Property | Default | Description |
| --- | --- | --- |
| pinot.server.instance.ingestion.oom.protection.mode | DISABLE | Dynamic server-level policy. Supported values: ENABLE, UPSERT_DEDUP_ONLY, DISABLE. |
| pinot.server.instance.ingestion.oom.protection.heapUsageThrottleThreshold | 0.95 | Dynamic server-wide heap usage ratio that starts ingestion backpressure. |
| pinot.server.instance.ingestion.oom.protection.heapUsageRecoveryThreshold | 0.90 | Dynamic server-wide heap usage ratio at or below which ingestion resumes. Must be lower than heapUsageThrottleThreshold. |
| pinot.server.instance.ingestion.oom.protection.checkIntervalMs | 1000 | Dynamic wait interval between shared throttle checks while ingestion is held. |
| pinot.server.instance.ingestion.oom.protection.gcIntervalMs | 30000 | Dynamic minimum interval between JVM GC requests while ingestion is held. Set to 0 or a negative value to disable the explicit GC request. |
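
Two constraints from the reference above lend themselves to a small check: the recovery threshold must be lower than the throttle threshold, and a gcIntervalMs of 0 or less disables the explicit GC request. The ThrottleSettings class below is a hypothetical sketch of those two rules only; how the PR handles an invalid value at runtime is not shown here.

```java
/** Hypothetical settings holder for illustration; not the PR's config class. */
final class ThrottleSettings {
  final double _throttleThreshold;
  final double _recoveryThreshold;
  final long _gcIntervalMs;

  private ThrottleSettings(double throttleThreshold, double recoveryThreshold, long gcIntervalMs) {
    _throttleThreshold = throttleThreshold;
    _recoveryThreshold = recoveryThreshold;
    _gcIntervalMs = gcIntervalMs;
  }

  /** Rejects settings whose recovery threshold is not strictly below the throttle threshold. */
  static ThrottleSettings of(double throttleThreshold, double recoveryThreshold, long gcIntervalMs) {
    if (recoveryThreshold >= throttleThreshold) {
      throw new IllegalArgumentException(
          "heapUsageRecoveryThreshold must be lower than heapUsageThrottleThreshold");
    }
    return new ThrottleSettings(throttleThreshold, recoveryThreshold, gcIntervalMs);
  }

  /** A zero or negative interval turns the explicit GC request off. */
  boolean isGcRequestEnabled() {
    return _gcIntervalMs > 0;
  }
}
```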

Server Modes

# Protect all realtime tables unless a table opts out.
pinot.server.instance.ingestion.oom.protection.mode=ENABLE

# Protect only upsert and dedup realtime tables unless a table opts in/out.
pinot.server.instance.ingestion.oom.protection.mode=UPSERT_DEDUP_ONLY

# Leave server-level protection disabled unless a table opts in.
pinot.server.instance.ingestion.oom.protection.mode=DISABLE

Tune Server-Wide Thresholds

Thresholds are server-level only. They are intentionally not table-level knobs.

pinot.server.instance.ingestion.oom.protection.heapUsageThrottleThreshold=0.95
pinot.server.instance.ingestion.oom.protection.heapUsageRecoveryThreshold=0.90
pinot.server.instance.ingestion.oom.protection.checkIntervalMs=2000
pinot.server.instance.ingestion.oom.protection.gcIntervalMs=30000

Table Configuration

Tables can only opt in or opt out under ingestionConfig.streamIngestionConfig.oomProtectionConfig. Thresholds are
configured on the server and shared by all enrolled realtime consuming threads.

If table mode is unset, the table follows the server mode. Table mode supports only ENABLE and DISABLE:

  • ENABLE: protect this realtime table even when the server mode is DISABLE or would otherwise skip it.
  • DISABLE: turn protection off for this realtime table even when the server mode would protect it.
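
The enrollment rules above reduce to a short decision function: a table-level mode, when set, always wins; otherwise the server mode decides. The sketch below uses hypothetical names (EnrollmentResolver, isProtected) and models an unset table mode as null.

```java
/** Hypothetical enrollment check for illustration; not the PR's actual code. */
final class EnrollmentResolver {
  enum ServerMode { ENABLE, UPSERT_DEDUP_ONLY, DISABLE }

  enum TableMode { ENABLE, DISABLE }

  private EnrollmentResolver() {
  }

  /**
   * @param tableModeOrNull table-level override, or null when the table does not set a mode
   * @param upsertOrDedup whether the realtime table has upsert or dedup enabled
   */
  static boolean isProtected(ServerMode serverMode, TableMode tableModeOrNull, boolean upsertOrDedup) {
    if (tableModeOrNull != null) {
      // An explicit table mode overrides the server policy in both directions.
      return tableModeOrNull == TableMode.ENABLE;
    }
    switch (serverMode) {
      case ENABLE:
        return true;
      case UPSERT_DEDUP_ONLY:
        return upsertOrDedup;
      default:
        return false;
    }
  }
}
```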

Table Config Reference

| Field | Default | Description |
| --- | --- | --- |
| mode | unset | Optional table override. Supported values: ENABLE, DISABLE. Unset follows the server mode. |

Force Protection On For One Realtime Table

Use this to opt in a specific realtime table, including when the server mode is DISABLE or UPSERT_DEDUP_ONLY would
skip a non-upsert/non-dedup table.

{
  "ingestionConfig": {
    "streamIngestionConfig": {
      "streamConfigMaps": [
        {
          "streamType": "kafka"
        }
      ],
      "oomProtectionConfig": {
        "mode": "ENABLE"
      }
    }
  }
}

Disable Protection For One Table

"oomProtectionConfig": {
  "mode": "DISABLE"
}

Runtime Behavior

When protection is active for a consuming segment, the server waits before the next stream fetch. This reduces additional
heap pressure from realtime ingestion while leaving segment state, stream offsets, and table pause state unchanged.

The heap sample is the same process-level JVM heap usage used by query OOM protection. The ingestion feature does not
reuse per-query memory attribution or query kill logic; it only uses the shared heap usage signal to decide whether
realtime ingestion should wait.

Heap sampling, hysteresis, GC requests, and the active/heap gauges are tracked centrally per server. A consuming thread
does not compute table-specific thresholds. It checks local enrollment (mode plus server policy) and then observes the
shared server throttle state.

While throttled, Pinot requests JVM GC at a server-wide, rate-limited interval. This matters for ingestion-heavy servers:
once ingestion stops allocating, the JVM might not naturally trigger another collection immediately, so the reported used
heap can stay above the recovery threshold even when garbage is reclaimable. The explicit GC request is a JVM hint, just
like the query OOM pause path; JVM flags such as -XX:+DisableExplicitGC can still ignore it.
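
A rate-limited GC hint of this kind might look like the sketch below. RateLimitedGcRequester is a hypothetical name, the clock and the GC action are injected so the behavior can be exercised without touching the real JVM, and issuing the first request immediately on the first throttled check is an assumption rather than something the PR description states.

```java
import java.util.function.LongSupplier;

/** Hypothetical rate limiter for GC hints; not the PR's actual implementation. */
final class RateLimitedGcRequester {
  private final long _gcIntervalMs;
  private final LongSupplier _clockMs;
  private final Runnable _gcRequest;
  private long _lastRequestMs;
  private boolean _requestedOnce;

  RateLimitedGcRequester(long gcIntervalMs, LongSupplier clockMs, Runnable gcRequest) {
    _gcIntervalMs = gcIntervalMs;
    _clockMs = clockMs;
    _gcRequest = gcRequest;
  }

  /** Issues the GC hint if enabled and the minimum interval has elapsed; returns whether it ran. */
  synchronized boolean maybeRequestGc() {
    if (_gcIntervalMs <= 0) {
      // A zero or negative interval disables the explicit GC request.
      return false;
    }
    long nowMs = _clockMs.getAsLong();
    if (_requestedOnce && nowMs - _lastRequestMs < _gcIntervalMs) {
      return false;
    }
    _requestedOnce = true;
    _lastRequestMs = nowMs;
    _gcRequest.run();
    return true;
  }
}
```

In a real server the requester could be built as `new RateLimitedGcRequester(30_000L, System::currentTimeMillis, System::gc)`. System.gc() remains only a hint, and is a no-op when the JVM runs with -XX:+DisableExplicitGC.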

While waiting, the consume loop re-checks stop and segment end criteria at each interval, so force commit, time limit,
and stop paths do not wait indefinitely for heap recovery.
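
The wait behavior can be sketched as a loop that checks the stop condition on every interval instead of blocking until the heap recovers. ThrottleWaitLoop and awaitRecovery are hypothetical names; the two suppliers stand in for the shared throttle state and for the consume loop's stop and end-criteria checks.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical wait loop for illustration; not the PR's consume-loop code. */
final class ThrottleWaitLoop {
  private ThrottleWaitLoop() {
  }

  /**
   * Waits while the shared throttle is active.
   *
   * @return true if the throttle cleared, false if the wait ended early because of a stop
   *         condition or an interrupt
   */
  static boolean awaitRecovery(BooleanSupplier isThrottled, BooleanSupplier shouldStop, long checkIntervalMs) {
    while (isThrottled.getAsBoolean()) {
      if (shouldStop.getAsBoolean()) {
        // Force commit, time limit, or stop: give up waiting so the caller can proceed.
        return false;
      }
      try {
        Thread.sleep(checkIntervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and stop waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }
}
```

Because the stop check runs once per interval, the worst-case delay before a stop is honored is roughly one checkIntervalMs.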

Metrics

This PR adds server metrics for visibility:

  • REALTIME_INGESTION_OOM_PROTECTION_ACTIVE: global gauge set to 1 when the server-wide realtime ingestion throttle
    is active, otherwise 0.
  • REALTIME_INGESTION_OOM_PROTECTION_HEAP_USAGE_PERCENT: global gauge containing the latest JVM heap usage percentage
    observed by the shared server throttle state.

Example Updated

The upsert meetup RSVP realtime example now includes a table opt-in sample and a short operator guide:

  • pinot-tools/src/main/resources/examples/stream/upsertMeetupRsvp/upsertMeetupRsvp_realtime_table_config.json
  • pinot-tools/src/main/resources/examples/stream/upsertMeetupRsvp/README.md

Testing

  • ./mvnw -pl pinot-core -am -Dtest=ServerIngestionOomProtectionManagerTest -Dsurefire.failIfNoSpecifiedTests=false test
  • ./mvnw -pl pinot-core,pinot-server -am -Dtest=ServerIngestionOomProtectionManagerTest,RealtimeSegmentDataManagerTest -Dsurefire.failIfNoSpecifiedTests=false test
  • ./mvnw -pl pinot-segment-local,pinot-common,pinot-core -am -Dtest=TableConfigUtilsTest,TableConfigSerDeUtilsTest,RealtimeSegmentDataManagerTest -Dsurefire.failIfNoSpecifiedTests=false test
  • ./mvnw -pl pinot-plugins/pinot-metrics/pinot-yammer -am -Dtest=YammerServerPrometheusMetricsTest -Dsurefire.failIfNoSpecifiedTests=false test
  • ./mvnw spotless:apply -pl pinot-core,pinot-server,pinot-tools
  • ./mvnw license:format -pl pinot-core,pinot-server,pinot-tools
  • ./mvnw checkstyle:check -pl pinot-core,pinot-server,pinot-tools
  • ./mvnw license:check -pl pinot-core,pinot-server,pinot-tools
  • git diff --check

@xiangfu0 xiangfu0 force-pushed the server-ingestion-oom-protection branch from 6c03ba0 to 017ec1e Compare June 17, 2026 03:16
@xiangfu0 xiangfu0 requested a review from Copilot June 17, 2026 03:21

Copilot AI left a comment

Pull request overview

Adds server-local OOM protection/backpressure for realtime ingestion by sampling JVM heap usage and temporarily pausing stream fetches when usage exceeds configurable thresholds. This integrates into the server realtime consume loop, adds server + table override configs, and exposes new table-level metrics for observability.

Changes:

  • Introduces ServerIngestionOomProtectionManager and wires it into realtime consumption (gated in INITIAL_CONSUMING only).
  • Adds server instance config keys + table-level override config (serverIngestionOomProtectionConfig) with validation + SerDe/tests.
  • Adds new server metrics (gauges + meter) and updates the upsert realtime example docs/config.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| pinot-tools/src/main/resources/examples/stream/upsertMeetupRsvp/upsertMeetupRsvp_realtime_table_config.json | Adds table-level override example for server ingestion OOM protection. |
| pinot-tools/src/main/resources/examples/stream/upsertMeetupRsvp/README.md | Documents server + table configs and behavior for the example. |
| pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java | Adds server instance config keys/defaults for OOM protection. |
| pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ingestion/StreamIngestionConfig.java | Adds table-level override field for OOM protection in stream ingestion config. |
| pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ingestion/ServerIngestionOomProtectionConfig.java | New table-level config object for OOM protection mode/threshold overrides. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java | Validates table-level OOM protection thresholds in ingestion config validation. |
| pinot-segment-local/src/test/java/org/apache/pinot/segment/local/utils/TableConfigUtilsTest.java | Adds validation tests for OOM protection config constraints. |
| pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/ServerIngestionOomProtectionManager.java | New manager implementing heap-sampling + hysteresis + metrics/logging. |
| pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java | Creates and resets the OOM protection manager per realtime table. |
| pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeSegmentDataManager.java | Applies protection gating in consume loop for INITIAL_CONSUMING segments. |
| pinot-core/src/test/java/org/apache/pinot/core/data/manager/realtime/ServerIngestionOomProtectionManagerTest.java | Adds unit tests for policy/override behavior and wait/metrics paths. |
| pinot-core/src/test/java/org/apache/pinot/core/data/manager/realtime/RealtimeSegmentDataManagerTest.java | Adds integration-style tests asserting gating is applied/skipped by state. |
| pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerGauge.java | Adds new table gauges for protection active + heap usage percent. |
| pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerMeter.java | Adds new table meter for throttling occurrences. |
| pinot-common/src/test/java/org/apache/pinot/common/utils/config/TableConfigSerDeUtilsTest.java | Adds SerDe assertions for the new table-level config object. |

@codecov-commenter

codecov-commenter commented Jun 17, 2026

Codecov Report

❌ Patch coverage is 76.41509% with 50 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.82%. Comparing base (431d541) to head (261dce0).
⚠️ Report is 1 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| .../realtime/ServerIngestionOomProtectionManager.java | 77.77% | 21 Missing and 17 partials ⚠️ |
| ...a/manager/realtime/RealtimeSegmentDataManager.java | 50.00% | 2 Missing and 5 partials ⚠️ |
| .../pinot/server/starter/helix/BaseServerStarter.java | 0.00% | 3 Missing ⚠️ |
| ...ata/manager/realtime/RealtimeTableDataManager.java | 85.71% | 1 Missing ⚠️ |
| ...server/starter/helix/HelixInstanceDataManager.java | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18784      +/-   ##
============================================
+ Coverage     57.00%   64.82%   +7.81%     
- Complexity        1     1319    +1318     
============================================
  Files          2599     3390     +791     
  Lines        151667   210435   +58768     
  Branches      24564    32986    +8422     
============================================
+ Hits          86459   136413   +49954     
- Misses        57884    63026    +5142     
- Partials       7324    10996    +3672     
| Flag | Coverage Δ |
| --- | --- |
| custom-integration1 | 100.00% <ø> (?) |
| integration | 100.00% <ø> (?) |
| integration1 | 100.00% <ø> (?) |
| integration2 | 0.00% <ø> (?) |
| java-21 | 64.82% <76.41%> (+7.81%) ⬆️ |
| temurin | 64.82% <76.41%> (+7.81%) ⬆️ |
| unittests | 64.82% <76.41%> (+7.81%) ⬆️ |
| unittests1 | 57.01% <74.51%> (+0.01%) ⬆️ |
| unittests2 | 37.25% <33.49%> (?) |

@xiangfu0 xiangfu0 added ingestion Related to data ingestion pipeline oom-protection Related to out-of-memory protection mechanisms labels Jun 17, 2026
@xiangfu0 xiangfu0 force-pushed the server-ingestion-oom-protection branch 4 times, most recently from 2264fdc to 1924588 Compare June 17, 2026 19:47
Comment thread pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerGauge.java Outdated
Comment thread pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerMeter.java Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

@xiangfu0 xiangfu0 force-pushed the server-ingestion-oom-protection branch 3 times, most recently from 95da13e to f32d50b Compare June 17, 2026 22:52
@xiangfu0 xiangfu0 force-pushed the server-ingestion-oom-protection branch from f32d50b to bd76b67 Compare June 18, 2026 07:51

@xiangfu0 xiangfu0 left a comment

Found one high-signal issue; see inline comment.

@xiangfu0 xiangfu0 force-pushed the server-ingestion-oom-protection branch 5 times, most recently from b358136 to f20a57f Compare June 19, 2026 03:02
@xiangfu0 xiangfu0 force-pushed the server-ingestion-oom-protection branch from f20a57f to 261dce0 Compare June 19, 2026 03:35
@Nullable Cache<Pair<String, String>, SegmentErrorInfo> errorCache,
BooleanSupplier isServerReadyToConsumeData,
BooleanSupplier isServerReadyToServeQueries,
ServerIngestionOomProtectionManager.ServerThrottleState serverIngestionOomProtectionThrottleState,

Same compatibility issue here for pinot.server.instance.table.data.manager.provider.class: adding the new parameter to the interface method breaks existing custom providers at runtime. Pinot already supports external provider implementations via config, so this needs a compatibility shim rather than a signature rewrite.

  @Nullable SegmentOperationsThrottlerSet segmentOperationsThrottlerSet,
- ServerReloadJobStatusCache reloadJobStatusCache)
+ ServerReloadJobStatusCache reloadJobStatusCache,
+ ServerIngestionOomProtectionManager.ServerThrottleState serverIngestionOomProtectionThrottleState)

This widens an existing extension-point signature used by pinot.server.instance.data.manager.class. Any custom InstanceDataManager compiled against the current release will still load, but the first init(...) call on an upgraded server will hit AbstractMethodError because the old 5-arg method no longer satisfies the interface. Please preserve the old signature and thread the throttle state through a backward-compatible overload or setter instead of rewriting the contract in place.
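
One possible shape for such a backward-compatible overload is sketched below. Everything in it is hypothetical: InstanceDataManagerSketch and LegacyManager are stand-ins, and a single String parameter abbreviates the real argument list. The idea is that the old abstract method stays untouched and the wider variant is added as a default method that delegates to it, so implementations compiled against the old interface still satisfy the contract.

```java
/** Hypothetical stand-in for the extension-point interface; parameters are abbreviated. */
interface InstanceDataManagerSketch {
  /** Existing contract, left unchanged so old implementations keep working. */
  void init(String config);

  /** New overload; the default implementation ignores the throttle state and delegates. */
  default void init(String config, Object throttleState) {
    init(config);
  }
}

/** Hypothetical custom implementation written against the old contract only. */
final class LegacyManager implements InstanceDataManagerSketch {
  String _initializedWith;

  @Override
  public void init(String config) {
    _initializedWith = config;
  }
}
```

The server can then always call the two-argument overload: built-in implementations override it to use the throttle state, while a legacy provider falls back to its original init without hitting AbstractMethodError.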
