Skip to content

Roll back compile cache entry when the first trace throws#3635

Merged
zcbenz merged 2 commits into
ml-explore:mainfrom
adityasingh2400:fix-3624-compile-cache-empty-on-throw
Jun 9, 2026
Merged

Roll back compile cache entry when the first trace throws#3635
zcbenz merged 2 commits into
ml-explore:mainfrom
adityasingh2400:fix-3624-compile-cache-empty-on-throw

Conversation

@adityasingh2400

@adityasingh2400 adityasingh2400 commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Proposed changes

Fixes #3624.

The per-call lambda built by compile() marks a cache entry non-empty before it traces it:

if (entry.empty) {
  entry.empty = false;                 // marked filled BEFORE the trace
  entry.constants = std::move(constants);
  std::tie(entry.inputs, entry.outputs, entry.extra) =
      compile_trace(fun, inputs, shapeless);   // may throw
  ...
}

If compile_trace (or the subsequent dfs, simplify, or fuse steps) throws on the first trace, the entry is left with empty == false but no inputs, outputs, or tape. A later call with matching inputs then find()s that entry, sees it is not empty, skips tracing, and compile_replaces against an empty tape, returning empty outputs as a spurious success instead of re-tracing or re-raising. A nullary compiled function is the sharpest case since its inputs always match.

This moves entry.empty = false to after the trace, dfs, simplify, and fuse steps complete, so a throwing first trace leaves the entry empty and a later call re-traces cleanly while a failure stays a failure. This is the approach @zcbenz endorsed on the issue. A retry reassigns constants, inputs, outputs, and the tape when it re-enters the fill path, so the partially assigned fields from the failed attempt are overwritten.

Testing

Added a C++ regression test in tests/compile_tests.cpp that compiles a nullary function which raises on its first invocation, then calls it twice. Without the fix the second call returns 0 outputs instead of raising. With the fix both calls raise, and once the function stops raising a retry traces cleanly and returns the correct result.

Verified on macOS (CPU-only build): the new test fails on main (second call did not throw, out.size() was 0) and passes with the fix. The full compile suite (23 cases, 111 assertions) is green. Both touched files pass clang-format.

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

The per-call lambda in compile() marks a cache entry non-empty before it
traces it. If compile_trace (or the dfs, simplify, or fuse steps) throws on
the first trace, the entry is left empty == false with no inputs, outputs, or
tape. A later call with matching inputs then finds that entry, sees it is not
empty, skips tracing, and replaces against an empty tape, returning empty
outputs as a spurious success instead of re-tracing or re-raising. A nullary
compiled function is the sharpest case since its inputs always match.

Wrap the fill in a try/catch that resets entry.empty to true on any
exception, so a retry re-traces cleanly and a failure stays a failure.

Fixes ml-explore#3624
@zcbenz

zcbenz commented Jun 6, 2026

Copy link
Copy Markdown
Collaborator

I preferred the other fix: #3624 (comment)

@adityasingh2400

Copy link
Copy Markdown
Contributor Author

Done, reworked to that approach in 0fde210. The flag move is all that is needed since a failed first trace now leaves the entry empty and the retry path overwrites the partially assigned fields when it re-traces. The regression test is unchanged and still covers the throwing first trace, the throwing retry, and the clean retrace.

@zcbenz zcbenz merged commit 01368d8 into ml-explore:main Jun 9, 2026
30 of 32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

compile: failed first trace leaves a stale cache entry that returns empty-success on retry

2 participants