feat: context aware replay status #488

Open
wangyb-A wants to merge 3 commits into main from context_replay

wangyb-A (Contributor) commented Jun 24, 2026

Issue #, if available: first piece of #389

Summary

Track replay status per-context instead of as a single execution-wide flag.

Changes

  • context.py: DurableContext owns its replay status, seeded from its
    creator and refined per-operation via a _replay_aware() context manager.
    Added public context.is_replaying().
  • logger.py: Logger consults a per-context is_replaying callable
    instead of holding ExecutionState.
  • state.py: removed the global replay machinery (track_replay,
    is_replaying, _visited_operations); added has_prior_operations() for
    execution-level first-invocation detection.
  • execution.py: seeds the root context's status; uses
    has_prior_operations() for the plugin is_first_invocation flag.
  • concurrency/executor.py: branches track their own status; removed the
    per-branch track_replay call.
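The context.py and logger.py changes above can be pictured with a minimal sketch. It is not the PR's code: `SketchContext`, `ReplayStatus`, `ReplayAwareLogger`, `create_child`, and `mark_new` are hypothetical names invented for illustration, and only `is_replaying()` comes from the description. It shows a child context seeded from its creator and a logger that asks a per-context callable instead of shared execution state.

```python
import logging
from enum import Enum
from typing import Callable


class ReplayStatus(Enum):  # hypothetical name, not taken from the PR
    REPLAY = "replay"
    NEW = "new"


class SketchContext:
    """Hypothetical stand-in for DurableContext: owns its own replay status."""

    def __init__(self, status: ReplayStatus) -> None:
        self._status = status

    def is_replaying(self) -> bool:
        return self._status is ReplayStatus.REPLAY

    def create_child(self) -> "SketchContext":
        # A child is seeded from its creator's status at creation time.
        return SketchContext(self._status)

    def mark_new(self) -> None:
        # Hypothetical helper: flips only this context, never its children.
        self._status = ReplayStatus.NEW


class ReplayAwareLogger:
    """Drops records while the owning context reports that it is replaying."""

    def __init__(self, logger: logging.Logger, is_replaying: Callable[[], bool]) -> None:
        self._logger = logger
        self._is_replaying = is_replaying

    def info(self, msg: str) -> bool:
        # Returns True when the record was passed on, False when suppressed.
        if self._is_replaying():
            return False
        self._logger.info(msg)
        return True


root = SketchContext(ReplayStatus.REPLAY)
child = root.create_child()   # child starts in REPLAY, like its creator
root.mark_new()               # parent reaches new code; child is unaffected

base = logging.getLogger("sketch")
emitted = (
    ReplayAwareLogger(base, root.is_replaying).info("parent line"),
    ReplayAwareLogger(base, child.is_replaying).info("child line"),
)
```

Because each logger holds only a callable, the parent and child can disagree about replay status within the same invocation, which a single execution-wide flag cannot express.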

Replay boundary behavior

_replay_aware() flips REPLAY→NEW at three points:

  • before an op with no checkpoint (brand-new code),
  • after a non-terminal op (the resume point — its own logs stay deduped),
  • after a completed op when the next op doesn't exist yet (so bare logs
    immediately after a wait are treated as new, not suppressed).

The third case fixes a regression where a log right after a wait was
silently dropped on replay.
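The three flip points can be sketched as a small context manager. This is a reconstruction from the description only, under stated assumptions: operations are numbered by a counter, and `checkpoints` is a hypothetical mapping from operation index to whether that operation is terminal. The actual `_replay_aware()` in the PR may differ, for example by flipping before user-code operations.

```python
from contextlib import contextmanager


class SketchContext:
    """Hypothetical model of per-context replay tracking."""

    def __init__(self, checkpoints: dict[int, bool], replaying: bool) -> None:
        self._checkpoints = checkpoints  # op index -> True if terminal (assumed shape)
        self._replaying = replaying
        self._next_op = 0

    def is_replaying(self) -> bool:
        return self._replaying

    @contextmanager
    def _replay_aware(self):
        op = self._next_op
        self._next_op += 1
        if op not in self._checkpoints:
            # Flip 1: no checkpoint, so this is brand-new code.
            self._replaying = False
        try:
            yield
        finally:
            if self._replaying:
                terminal = self._checkpoints[op]
                next_exists = self._next_op in self._checkpoints
                # Flip 2: a non-terminal op is the resume point.
                # Flip 3: a completed op with no successor checkpoint.
                if not terminal or not next_exists:
                    self._replaying = False


# Replay of an invocation that previously completed a step (op 0) and a wait (op 1).
ctx = SketchContext({0: True, 1: True}, replaying=True)
seen = []
with ctx._replay_aware():            # op 0: checkpointed, terminal, op 1 exists
    seen.append(ctx.is_replaying())
seen.append(ctx.is_replaying())      # bare log between op 0 and op 1
with ctx._replay_aware():            # op 1: terminal, but op 2 has no checkpoint
    seen.append(ctx.is_replaying())
seen.append(ctx.is_replaying())      # bare log right after the wait
```

In this model the final bare log sees `is_replaying()` return `False`, which corresponds to the third case: a log immediately after a completed wait is treated as new rather than suppressed.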

Testing

Logging from examples:

Invocation 1 (requestId 8b1e9590) — first run, suspends at parent wait

| message | is_replaying | operationName |
|---|---|---|
| Workflow started (before wait) | false | |
| Preparing item | | prepare |
| Prepared, about to wait | false | |

Invocation 2 (requestId a46ca6bd) — parent wait fired

| message | is_replaying / child_is_replaying |
|---|---|
| Resumed after wait | false |
| Auditing in child context (before child wait) | false (child) |

(before-wait lines from invocation 1 are NOT repeated; de-duplicated)

Invocation 3 (requestId 22d99bc2) — child wait fired

| message | child_is_replaying | operationName |
|---|---|---|
| Resumed in child context (after child wait) | false | |
| Finalizing item | | finalize |
| Workflow completed | | |

(all parent + child before-wait lines NOT repeated; de-duplicated)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

wangyb-A marked this pull request as ready for review June 24, 2026 18:07

  @contextmanager
- def _replay_aware(self):
+ def _replay_aware(self, *, executes_user_code: bool = False):

is this true for context operations?

# - user-code op that is non-terminal (brand-new or retrying): the user
#   function is about to run real work, so flip to NEW before it.
if was_replaying and (
    not next_exists or (executes_user_code and not next_terminal)

why don't we always flip before the op?
