Runtime MAL/LAL rules can declare dynamic layers#13883
Merged
Runtime hot-update rules pushed through the runtime-rule admin API may now declare a layerDefinitions: block. Layers register through a new Layer.registerDynamic channel (post-seal-allowed) and are refcount-tracked by RuntimeLayerRegistry: removing the last declaring rule unregisters the layer. Ordinals are operator-pinned in the 100_000+ tier (no auto-allocation; the ordinal is persisted in ServiceTraffic primary keys and must be operator-stable across restarts).

Bundled and runtime are strictly separate ownership channels:
* Runtime overrides of bundled rules cannot carry layerDefinitions; REST rejects with applyStatus=layer_override_forbidden.
* Legacy override rows on disk-twins drop the static-loader substitution and apply post-seal via the dynamic channel so they remain operator-removable.
* RuleSetMerger no longer injects resolver-only ACTIVE rows into the static loader; pure-runtime rules go through RuleSync.runOnce.

Conflict validation surfaces structured 400s with new applyStatus codes: layer_ordinal_out_of_range, layer_name_conflict, layer_ordinal_collision, layer_name_invalid, layer_override_forbidden. The checker covers in-batch duplicates, self-update multi-claimant ordinal reuse, and bundled-vs-runtime conflicts.

E2E (test/e2e-v2/cases/runtime-rule/mal-storage/runtime-rule-flow.sh) extended with Phase 5e-5h: three layer-rejection assertions plus a restart-survival round-trip: POST -> swctl layer ls -> docker restart -> swctl layer ls still shows -> /delete -> swctl layer ls no longer shows.

See docs/en/concepts-and-designs/runtime-rule-hot-update.md#dynamic-layers for the full contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
This PR extends SkyWalking OAP runtime-rule hot-update to support dynamic layer declarations from runtime MAL/LAL rules via layerDefinitions:, with explicit operator-pinned ordinals (>= 100_000). It introduces a refcounted runtime ownership channel (RuntimeLayerRegistry + Layer.registerDynamic/unregisterDynamic) to ensure layers remain operator-removable and survive restarts via runtime-rule replay, while keeping bundled/boot-time layers strictly non-removable and non-overridable by runtime overrides.
Changes:
- Add runtime dynamic-layer registration APIs in `Layer` and implement refcounted ownership tracking via `RuntimeLayerRegistry` with structured conflict validation (`LayerConflictException`).
- Prevent pure-runtime rules from being injected into the static loader (`RuleSetMerger` behavior change) so runtime `layerDefinitions:` always go through the dynamic channel post-seal.
- Update REST runtime-rule apply flow to reject forbidden bundled overrides, surface layer conflicts as structured HTTP 400, extend E2E flow, and update docs/CHANGES.
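The refcounted ownership idea summarized above (a dynamic layer stays registered while at least one runtime rule declares it, and is unregistered when the last declaring rule is removed) can be illustrated with a rough sketch. This is not the PR's `RuntimeLayerRegistry`: the class name `RefCountedLayerSketch`, the method names `claim`, `removeRule`, and `isRegistered`, and the choice to throw when the same layer name is claimed with a different ordinal are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of refcount-style layer ownership; not the actual RuntimeLayerRegistry.
final class RefCountedLayerSketch {
    // layer name -> names of the runtime rules that currently declare it
    private final Map<String, Set<String>> claimants = new HashMap<>();
    // layer name -> operator-pinned ordinal
    private final Map<String, Integer> ordinals = new HashMap<>();

    /** Records that ruleName declares the layer; returns true if this call registered a new layer. */
    synchronized boolean claim(String ruleName, String layerName, int ordinal) {
        Integer existing = ordinals.get(layerName);
        if (existing != null && existing != ordinal) {
            // Assumption: a second claimant must pin the same ordinal as the first.
            throw new IllegalStateException(layerName + " is already pinned to ordinal " + existing);
        }
        ordinals.put(layerName, ordinal);
        claimants.computeIfAbsent(layerName, k -> new HashSet<>()).add(ruleName);
        return existing == null;
    }

    /** Drops every claim held by ruleName; returns the layers whose last claimant was just removed. */
    synchronized Set<String> removeRule(String ruleName) {
        Set<String> unregistered = new HashSet<>();
        Iterator<Map.Entry<String, Set<String>>> it = claimants.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<String>> entry = it.next();
            String layerName = entry.getKey();
            Set<String> owners = entry.getValue();
            if (owners.remove(ruleName) && owners.isEmpty()) {
                it.remove();                 // last declaring rule is gone
                ordinals.remove(layerName);  // so the layer is unregistered
                unregistered.add(layerName);
            }
        }
        return unregistered;
    }

    synchronized boolean isRegistered(String layerName) {
        return ordinals.containsKey(layerName);
    }
}
```

In this sketch, removing one of two declaring rules leaves the layer registered, and removing the second returns the layer name in the result set, which is where a real implementation would presumably call its unregister hook.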
Reviewed changes
Copilot reviewed 35 out of 35 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/e2e-v2/cases/runtime-rule/mal-storage/seed-rules/seed-rule-sibling-with-layer.yaml | Adds a happy-path runtime rule fixture declaring a dynamic layer for E2E Phase 5h. |
| test/e2e-v2/cases/runtime-rule/mal-storage/seed-rules/illegal-layer-out-of-range.yaml | Adds E2E fixture asserting ordinal floor enforcement (< 100_000 rejected). |
| test/e2e-v2/cases/runtime-rule/mal-storage/seed-rules/illegal-layer-name-invalid.yaml | Adds E2E fixture asserting layer name regex validation. |
| test/e2e-v2/cases/runtime-rule/mal-storage/seed-rules/illegal-layer-name-conflict.yaml | Adds E2E fixture asserting name conflict against built-in/boot-time layers. |
| test/e2e-v2/cases/runtime-rule/mal-storage/runtime-rule-flow.sh | Extends runtime-rule E2E flow with dynamic-layer rejection cases + restart survival round-trip. |
| test/e2e-v2/cases/runtime-rule/mal-storage/banyandb/e2e.yaml | Updates scenario description to include new layer phases. |
| oap-server/server-starter/src/main/resources/layer-extensions.yml | Documents new ordinal tiering convention and runtime tier prohibition in this file. |
| oap-server/server-core/src/test/java/org/apache/skywalking/oap/server/core/analysis/LayerDynamicRegisterTest.java | Adds unit coverage for Layer.registerDynamic/unregisterDynamic/isDynamic. |
| oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/rule/ext/RuntimeRuleOverrideResolver.java | Updates contract: ACTIVE substitutions only override existing disk entries (no resolver-only injection). |
| oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/rule/ext/RuleSetMerger.java | Stops merging resolver-only ACTIVE entries into static loader (pure-runtime handled post-seal). |
| oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java | Adds runtime dynamic-layer APIs, explicit ownership tracking, and sealed values() semantics. |
| oap-server/server-admin/runtime-rule/src/test/java/org/apache/skywalking/oap/server/receiver/runtimerule/layer/RuntimeLayerRegistryTest.java | Adds unit tests for refcounting, atomic updates, rollback, conflicts, and soft-claim behavior. |
| oap-server/server-admin/runtime-rule/src/test/java/org/apache/skywalking/oap/server/receiver/runtimerule/extension/DbOverrideRuntimeRuleResolverTest.java | Adds unit coverage for forbidding layerDefinitions on runtime overrides of bundled rules. |
| oap-server/server-admin/runtime-rule/src/test/java/org/apache/skywalking/oap/server/receiver/runtimerule/apply/MalFileApplierTest.java | Adds unit coverage for MAL runtime apply registering dynamic layers and conflict surfacing. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/state/EngineApplied.java | Adds optional appliedLayerClaims() token for orchestrator rollback on external failures. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/rest/RuntimeRuleService.java | Rejects forbidden bundled overrides pre-persist; surfaces layer conflicts as structured 400; unwraps wrapped conflicts. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/reconcile/DSLRuntimeApply.java | Ensures LayerConflictException bypasses generic compile-failed path for proper REST mapping. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/layer/RuntimeLayerRegistry.java | Introduces synchronized refcounted runtime-layer ownership + apply/rollback/removeRule APIs. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/layer/RuntimeLayerConflictChecker.java | Implements structured validation for in-batch and cross-channel/cross-rule layer conflicts. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/layer/LayerConflictException.java | Defines structured conflict exception with stable applyStatus codes. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/layer/LayerClaim.java | Adds immutable internal representation of (name, ordinal, normal) used by runtime layer registry. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/layer/AppliedClaims.java | Adds rollback token capturing registry mutations for orchestrator-driven rollback. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/extension/DbOverrideRuntimeRuleResolver.java | Drops static-loader substitution for forbidden layer overrides and adds YAML peek helpers. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/engine/mal/MalRuleEngine.java | Rolls back runtime layer claims during engine rollback and drops claims on unregister. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/engine/lal/LalRuleEngine.java | Rolls back runtime layer claims during engine rollback and drops claims on unregister. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/apply/MalFileApplier.java | Enables runtime layerDefinitions: by applying claims via RuntimeLayerRegistry and rolling back on failure. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/apply/LalFileApplier.java | Enables runtime layerDefinitions: in LAL and registers layers before compilation for name resolution. |
| oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/apply/DeltaClassifier.java | Escalates MAL edits with layerDefinitions changes to STRUCTURAL; enriches reasons with layer diffs. |
| oap-server/analyzer/meter-analyzer/src/test/java/org/apache/skywalking/oap/meter/analyzer/v2/prometheus/rule/RulesLayerDefinitionsTest.java | Minor formatting-only update (trailing newline). |
| oap-server/analyzer/meter-analyzer/src/main/java/org/apache/skywalking/oap/meter/analyzer/v2/prometheus/rule/Rules.java | Updates static-loader override semantics documentation to exclude resolver-only rules. |
| oap-server/analyzer/log-analyzer/src/test/java/org/apache/skywalking/oap/log/analyzer/v2/provider/LALConfigsLayerDefinitionsTest.java | Minor formatting-only update (trailing newline). |
| oap-server/analyzer/log-analyzer/src/main/java/org/apache/skywalking/oap/log/analyzer/v2/provider/LALConfigs.java | Updates docs to exclude resolver-only LAL rules from boot-time load path. |
| docs/en/concepts-and-designs/runtime-rule-hot-update.md | Documents dynamic layers contract: tiers, lifecycle, limitations, and applyStatus conflict matrix. |
| docs/en/changes/changes.md | Adds changelog entry for runtime dynamic layers. |
| CLAUDE.md | Adds comment-writing guidance to repository contributor instructions. |
wankai123 previously approved these changes on May 24, 2026
Address Copilot review on PR #13883. Project's surefire enableAssertions defaults to true so bare `assert` does fire, but JUnit assertions are the convention used by every other check in these tests and survive any test-runner configuration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surfaced by Copilot review on PR #13883 — bare Java `assert` in LayerDynamicRegisterTest. Codify the convention so future agents reach for assertTrue/assertEquals first. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
wankai123 approved these changes on May 24, 2026
Runtime MAL/LAL rules can declare dynamic layers
Runtime hot-update rules pushed through the runtime-rule admin API may now declare a `layerDefinitions:` block. Layers register through a new `Layer.registerDynamic` channel (post-seal allowed) and are refcount-tracked by `RuntimeLayerRegistry`: removing the last declaring rule unregisters the layer. Ordinals are operator-pinned in the `100_000+` tier with no auto-allocation; the ordinal is persisted in `ServiceTraffic` primary keys and must be operator-stable across restarts.

Bundled and runtime are strictly separate ownership channels:

- Runtime overrides of bundled rules cannot carry `layerDefinitions:`; REST rejects pre-persist with `applyStatus=layer_override_forbidden`.
- `RuleSetMerger` no longer injects resolver-only ACTIVE rows into the static loader; pure-runtime rules go through `RuleSync.runOnce` post-seal exclusively.

Conflict validation surfaces structured HTTP 400 with the existing `{applyStatus, catalog, name, message}` envelope using new `applyStatus` codes: `layer_ordinal_out_of_range`, `layer_name_conflict`, `layer_ordinal_collision`, `layer_name_invalid`, `layer_override_forbidden`. The checker handles in-batch duplicates, self-update multi-claimant ordinal reuse, and bundled-vs-runtime cross-channel collisions.

See docs/en/concepts-and-designs/runtime-rule-hot-update.md#dynamic-layers for the full contract: ordinal tiers, lifecycle, limitations, and conflict rules.
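To make the `applyStatus` codes above more concrete, here is a minimal sketch of how a single layer claim might be mapped to one of them. It is an illustration under stated assumptions, not the PR's `RuntimeLayerConflictChecker`: the class and method names are hypothetical, the name pattern is a placeholder (the real regex is not given in this description), the order of the checks is a guess, and the way each condition maps to `layer_name_conflict` versus `layer_ordinal_collision` is an assumption. Only the `100_000` floor and the code strings come from the description.

```java
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical sketch of claim validation; not the actual RuntimeLayerConflictChecker.
final class LayerConflictSketch {
    // Runtime-tier ordinal floor, as stated in the PR description.
    static final int RUNTIME_ORDINAL_FLOOR = 100_000;
    // Assumed placeholder pattern; the real layer-name regex is not stated here.
    static final Pattern NAME_PATTERN = Pattern.compile("[A-Z][A-Z0-9_]*");

    /**
     * Returns an applyStatus code if the claim is rejected, or empty if it is accepted.
     * taken maps already-registered layer names (bundled or runtime) to their ordinals.
     */
    static Optional<String> check(String name, int ordinal, Map<String, Integer> taken) {
        if (name == null || !NAME_PATTERN.matcher(name).matches()) {
            return Optional.of("layer_name_invalid");
        }
        if (ordinal < RUNTIME_ORDINAL_FLOOR) {
            return Optional.of("layer_ordinal_out_of_range");
        }
        Integer current = taken.get(name);
        if (current != null && current != ordinal) {
            // Same name already registered with a different ordinal.
            return Optional.of("layer_name_conflict");
        }
        for (Map.Entry<String, Integer> entry : taken.entrySet()) {
            if (entry.getValue() == ordinal && !entry.getKey().equals(name)) {
                // A different layer already holds this ordinal.
                return Optional.of("layer_ordinal_collision");
            }
        }
        return Optional.empty();
    }
}
```

A caller would presumably translate a non-empty result into the HTTP 400 envelope; the fifth code, `layer_override_forbidden`, concerns runtime overrides of bundled rules and is outside what this per-claim sketch models.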
If this is non-trivial feature, paste the links/URLs to the design doc.
Design discussion in CLAUDE memory; user-facing docs in `docs/en/concepts-and-designs/runtime-rule-hot-update.md#dynamic-layers`.
Update the documentation to include this new feature.
`docs/en/concepts-and-designs/runtime-rule-hot-update.md` gets a new "Dynamic layers" section; `oap-server/server-starter/src/main/resources/layer-extensions.yml` header updated for the new tier convention; `CLAUDE.md` adds a Comments style note.
Tests(including UT, IT, E2E) are added to verify the new feature.
Unit: 121 runtime-rule + 25 server-core analysis + 188 LAL compiler tests, all green. New classes: `LayerDynamicRegisterTest` (12 tests), `RuntimeLayerRegistryTest` (19), `DbOverrideRuntimeRuleResolverTest` (5). E2E: `runtime-rule-flow.sh` extended with Phases 5e/5f/5g (three layer-rejection assertions through the HTTP envelope) plus Phase 5h (restart-survival: POST sibling rule, swctl layer ls shows, docker restart OAP, swctl still shows, /inactivate + /delete, swctl no longer shows). Local e2e against banyandb backend: `SUMMARY: 1 passed, 0 failed` × 2 back-to-back runs.
If this pull request closes/resolves/fixes an existing issue, replace the issue number. Closes #.
Update the `CHANGES` log.