Pane(fix[wait_for_text]): waits for new output, joins wrapped lines, prefers wait-for-channel for completion by tony · Pull Request #47 · tmux-python/libtmux-mcp

tony · 2026-05-12T22:38:44Z

Closes #45.

What changed for users

wait_for_text becomes a "wait for output written after this call" primitive instead of "match anything currently in the pane." For agents whose first poll used to return found=True on stale scrollback, this is a real fix. The CHANGES ## libtmux-mcp 0.1.x (unreleased) section describes the published-release-reader contract.

Concrete user-visible changes vs 0.1.0a6:

Anchors on (history_size, cursor_y) at entry; only matches lines written after the call begins.
content_start / content_end parameters removed (the anchor supersedes them).
Joins visually wrapped lines for matching via tmux's -J flag — long error strings that wrap across rows now match.
Rejects empty pattern, sub-10 ms interval, non-positive timeout at entry.
Raises ToolError on pane death, pane respawn, or clear-history mid-wait.
WaitForTextResult.timed_out and ContentChangeResult.timed_out removed (mechanically not found / not changed).
Emits a notifications/message warning when polling approaches history-limit — the band where tmux's grid_collect_history trim+scroll bounce keeps hsize clamped at the cap and the strict-shrink guard can't catch every trim.
Server instructions, send_keys docstring, and wait_for_text docstring all point agents at wait_for_channel (composed with tmux wait-for -S) as the deterministic primitive for command completion. The polling-scraper wait_for_text is now framed for output the agent does not author.
docs/topics/prompting.md system-prompt fragment and docs/topics/troubleshooting.md "Timing" bullet brought in line with the channel-first guidance.

Theoretical grounding

Reviewed against tmux source pinned at 134ba6c:

Capture math: cmd-capture-pane.c:158 (top = gd->hsize + n)
Bottom-row clip: cmd-capture-pane.c:159-160
Trim mechanics: grid.c:386 (grid_collect_history decrements hsize by hlimit/10)
Scroll-after-trim sequence: grid-view.c:106 (grid_view_scroll_region_up)
Resize-grow / hsize pull-back: screen.c:451-465 (screen_resize_y)
Pane respawn preserves hsize: screen.c:127 (screen_reinit)
No native trim-detection primitive exists: format.c history format strings + options-table.c hooks list both checked, no monotonic counter or history-trimmed hook

The rollover detection's partial coverage is acknowledged explicitly. A 2 ms high-frequency probe across ~3000 samples confirmed there is no sub-poll window where hsize dips below entry once entry was sub-cap — so the strict-shrink guard catches clear-history and the entry-at-cap edge but not bursty trim. The runtime ctx.warning in the trim-risk band is the agent-facing mitigation; wait_for_channel is the deterministic alternative.

Test plan

rm -rf docs/_build; uv run ruff check . --fix --show-fixes; uv run ruff format .; uv run mypy; uv run py.test --reruns 0 -vvv; just build-docs

Green at every commit. Full test suite passes.

New regression tests cover: stale scrollback no-match (#45), bottom-row clip, mid-wait respawn, pane death under remain-on-exit, input rejection (empty pattern / tiny interval / non-positive timeout), wrap-spanning patterns, resize-grow not raising, history shrinking via clear-history raising, idle wait already in trim-risk band warning, burst wait advancing into trim-risk band warning, cancellation propagation, server instruction substring assertions, send_keys docstring cross-link.

Filed follow-ups (deferred, not in this PR)

wait_for_text: two-call state/capture race within a single poll tick #50 — wait_for_text two-call state/capture race within a single tick (architecture)
wait_for_text: optional include_baseline_row=True for in-place row writes #51 — include_baseline_row=True opt-in for same-row writes (enhancement)
wait_for_text: adaptive backoff to reduce subprocess pressure on parallel waits #52 — adaptive backoff to reduce subprocess pressure (performance)
wait_for_content_change: surface pane death and respawn as ToolError for parity with wait_for_text #53 — wait_for_content_change lifecycle parity with wait_for_text (enhancement)
wait_for_text: expose risk_band_warned: bool in WaitForTextResult for clients that don't subscribe to log notifications #54 — expose risk_band_warned: bool on WaitForTextResult for clients not subscribed to log notifications (enhancement)
search_panes: visual-row capture misses wrap-spanning patterns #55 — search_panes visual-row capture misses wrap-spanning patterns (parallel to wait_for_text's -J fix; the slow path is a 1-line change, the fast path is structurally wrap-blind in tmux)
Investigate libtmux-side trim detection for scrollback consumers libtmux#671 — investigate libtmux-side trim detection helper (broader infrastructure)

Breaking changes

content_start / content_end parameters removed from wait_for_text. Pre-alpha API; CHANGES carries a before/after block.
wait_for_text raises ToolError on empty pattern, interval < 0.01, timeout <= 0, pane death, pane respawn, and history_size shrink below entry baseline. Previously most of these silently miscaptured or returned found=False after the full timeout.
WaitForTextResult.timed_out and ContentChangeResult.timed_out removed. They were mechanically not found / not changed; callers should switch to the primary result field. CHANGES carries a before/after block.

codecov-commenter · 2026-05-12T22:59:44Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.18%. Comparing base (30da675) to head (f79625c).

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #47      +/-   ##
==========================================
+ Coverage   84.87%   85.18%   +0.31%     
==========================================
  Files          40       40              
  Lines        2268     2316      +48     
  Branches      286      295       +9     
==========================================
+ Hits         1925     1973      +48     
  Misses        260      260              
  Partials       83       83

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

tony · 2026-05-13T00:33:01Z

Code review

No issues found. Checked for bugs and AGENTS.md compliance.

🤖 Generated with Claude Code

_{- If this code review was useful, please react with 👍. Otherwise, react with 👎.}

tony · 2026-05-15T22:29:27Z

Code review

Found 1 issue:

CHANGES unreleased block orders the subheadings ### Breaking changes → ### Fixes → ### Dependencies, but AGENTS.md mandates ### Breaking changes → ### Dependencies → ### What's new → ### Fixes → ### Documentation → ### Development. Move the Dependencies block to land between Breaking changes and Fixes.

libtmux-mcp/CHANGES

Lines 7 to 60 in fbe4698

    
           <!-- END PLACEHOLDER - ADD NEW CHANGELOG ENTRIES BELOW THIS LINE --> 
        
           ### Breaking changes 
        
           **{tooliconl}`wait-for-text` drops `content_start` and `content_end`** 
        
           The baseline anchor introduced in this release supersedes the manual capture-range knobs: capture now follows the pane's grid position automatically, so the two parameters had no remaining purpose. Agents that named them should drop them from their call sites. (#45) 
        
           ```python 
        
           # Before 
        
           wait_for_text(pattern="OK", content_start=-100) 
        
           # After 
        
           wait_for_text(pattern="OK") 
        
           ``` 
        
           ### Fixes 
        
           **{tooliconl}`wait-for-text` now waits for *new* output, not stale scrollback** 
        
           Previously {tooliconl}`wait-for-text` returned `found=True` on the first poll if the pattern was already present anywhere in the pane when the call began. It now snapshots `#{history_size} + #{cursor_y}` at entry and only matches lines written below that baseline, so agents using the tool to synchronise on command output see the result they expected. For the synchronous "is the pattern in the pane right now?" use case, call {tooliconl}`search-panes` instead. (#45) 
        
           **{tooliconl}`wait-for-text` guards against tmux's bottom-row capture clip** 
        
           `capture-pane -S` clips a below-visible start back to the bottom visible row, so a cursor sitting at the last visible row (the normal state after a fresh shell prompt) would match whatever stale text already occupied that row. The poll loop now reads `#{pane_height}` at entry and short-circuits the capture when the computed start would land below the pane. (#45) 
        
           **{tooliconl}`wait-for-text` surfaces pane death and respawn** 
        
           Each poll tick now reads `#{pane_dead}` and `#{pane_pid}` alongside the grid state. A pane whose process exited under `remain-on-exit`, or one that was swapped out by `respawn-pane` mid-wait, raises a `ToolError` instead of silently miscapturing against a frozen or reset grid. (#45) 
        
           **{tooliconl}`wait-for-text` rejects footgun inputs** 
        
           Empty `pattern`, sub-10ms `interval`, and non-positive `timeout` each raise a `ToolError` at entry. The previous behaviour would have matched every line (empty pattern), spun the tmux server (zero interval), or completed a surprise single probe instead of waiting (`timeout=0`). (#45) 
        
           **{tooliconl}`wait-for-text` honours the timeout budget end-to-end** 
        
           `start_time` is now captured before the initial baseline read so a stalled tmux server cannot consume the budget before the deadline timer even starts. libtmux's `tmux_cmd` uses `Popen.communicate()` with no subprocess timeout, so the worst case was real, not theoretical. (#45) 
        
           ### Dependencies 
        
           **Minimum `libtmux>=0.56.0`** (was `>=0.55.1`). Unlocks the new tmux-command wrappers shipped in libtmux 0.56.0 — {meth}`~libtmux.Pane.respawn`, {meth}`~libtmux.Pane.copy_mode`, {meth}`~libtmux.Pane.pipe`, {meth}`~libtmux.Pane.swap`, {meth}`~libtmux.Pane.paste_buffer`, {meth}`~libtmux.Pane.clear_history`, {meth}`~libtmux.Pane.display_message`, {meth}`~libtmux.Server.delete_buffer`, and the {meth}`~libtmux.Session.next_window` / {meth}`~libtmux.Session.previous_window` / {meth}`~libtmux.Session.last_window` trio — so the MCP no longer falls back to raw `cmd()` calls for tmux commands the upstream API now covers. (#46) 
        
           ## libtmux-mcp 0.1.0a6 (2026-05-09) 
        
           libtmux-mcp 0.1.0a6 is the activation and registration cleanup release. It makes the server much easier for MCP clients to discover from ordinary "pane", "window", and "session" prompts, standardizes new setup docs around the `tmux` registration slug, and adds migration guidance for existing `libtmux` registrations. Existing installs keep working; the release changes defaults and documentation so new installs line up with the tool prefix users actually see. 
        
           ### What's new 
        
           **Bare tmux prompts now find the server** 
        
           Agents no longer need the word "tmux" in every prompt before this server becomes relevant. The server instructions now name the positive cases users naturally write, such as "split this pane", "current window", and "this session", while steering clients away from unrelated browser tabs, editor splits, tiling-window-manager panes, and notebook cells. The discovery anchors most likely to answer those prompts - {tooliconl}`list-panes`, {tooliconl}`list-windows`, and {tooliconl}`snapshot-pane` - also carry the preload hint used by clients that keep a small always-available schema set. See {ref}`prompting` for the user-facing activation contract. (#37) 
        
           **New installs are documented as `tmux`**

(AGENTS.md rule:

libtmux-mcp/AGENTS.md

Lines 419 to 421 in fbe4698

    
           **Fixed subheadings**, in this order when present: `### Breaking changes`, `### Dependencies`, `### What's new`, `### Fixes`, `### Documentation`, `### Development`. Dev tooling (helper scripts, internal automation) lives under `### Development`. For breaking changes, show the migration path with concrete inline code (e.g. a `# Before` / `# After` fenced code block). Dependency floor bumps use the form ``Minimum `pkg>=X.Y.Z` (was `>=X.Y.W`)``.

)

🤖 Generated with Claude Code

_{- If this code review was useful, please react with 👍. Otherwise, react with 👎.}

…onger matches (#45) why: wait_for_text returned found=True on the first poll whenever the pattern was already present in the pane when the call began, so agents using it to synchronise on command output saw the wrong result. The sibling wait_for_content_change already snapshots a baseline; wait_for_text now mirrors that pattern. what: - Snapshot (history_size, cursor_y) at entry; compute baseline_abs as the absolute grid index. Each poll re-reads history_size and captures from baseline_abs - hs_now + 1 onward, so the matched region tracks the pane's grid even as content scrolls into history. - Add _read_grid_position and _read_history_size helpers that expand the corresponding tmux format strings via display-message. - Drop content_start / content_end parameters; the baseline anchor supersedes them. Pre-alpha API contract makes this a clean drop. - Rewrite the docstring around the new "wait for new appearance" semantics, cross-reference search_panes for synchronous matches, and document edge cases (scrollback truncation, reverse-index sequences, clear/reset).

…#45) why: The baseline-anchor fix changes the user-visible contract for wait_for_text: a pattern already in the pane before the call must now return found=False, not the stale match the old code produced. The test suite needs to encode that contract so a future regression lands red. what: - Restructure WAIT_FOR_TEXT_FIXTURES with separate pre_command and during_command fields so fixtures can model "pattern was already on screen when wait_for_text started." - Add stale_scrollback_does_not_match: emit the pattern before wait_for_text starts and assert found=False on timeout. - Add matches_new_output_after_baseline: start the wait task, sleep, emit the marker via send_keys, assert found=True. Covers the positive path the baseline anchor was designed for.

why: The libtmux 0.56 bump merged on trunk ships Pane.display_message as a typed wrapper around display-message, and every other tool module in this package adopted it. The two helpers introduced for baseline anchoring (_read_grid_position, _read_history_size) reached back to raw pane.cmd("display-message", ...) instead. Bringing them in line keeps the codebase consistent with the same release-notes entry that markets the wrapper migration. what: - _read_grid_position and _read_history_size call pane.display_message(fmt, get_text=True) directly; the parse logic and empty-output fallback are unchanged. - Rename the local "result" binding to "stdout" to match the call surface used in meta.py.

why: capture-pane -S clips a below-visible start back to the bottom visible row (cmd-capture-pane.c, in every tmux release we support). When the cursor sits at the last visible row at entry, start_line = cy0 + 1 lands below the pane and tmux returns the bottom row's content. Stale text on that row — a fresh shell prompt, a printed marker — matches the pattern instantly, before any new output has appeared. The baseline-anchor fix that motivated this branch did not close this corner: the math is correct, but it assumed an empty capture result when start_line pointed below the visible region. what: - Add _read_pane_height alongside _read_history_size. Read pane_height once at entry; the bottom-row clip fires on entry-state, and a per-tick read costs another tmux round-trip without changing anything for the common case. - Before each capture, short-circuit lines to [] when start_line >= pane_height. Route through the existing deadline / sleep tail so the loop honours the timeout instead of tight- spinning. - tests/test_pane_tools.py: regression for #45. Respawn the pane with a sh -c that pre-fills, prints the marker without a trailing newline, and sleeps — freezing hsize and cursor_y for the wait window so the guard's first-tick contract is what's actually tested.

…dget why: start_time was captured after the initial _read_grid_position and _read_pane_height calls, so a stalled tmux server could block the tool for an arbitrary period before the user-supplied timeout deadline even started. libtmux's tmux_cmd uses Popen.communicate() without a subprocess timeout, so the worst case is real, not theoretical: a server-side stall on the first display-message leaks past the wall-clock budget the caller asked for. what: - Move start_time and deadline to the start of the resolved-pane block, before the baseline and pane-height reads. Same total work, no new branches; only the budget accounting changes.

why: Three parameter values turn the wait loop into a footgun rather than a wait primitive: - ``pattern=""`` — ``re.compile('')`` matches the zero-width position of every line, so the first poll returns ``found=True`` against whatever was already in the pane. - ``interval=0`` — ``asyncio.sleep(0)`` yields the event loop without idling, so the loop fires tmux subprocesses as fast as the scheduler hands them out (a self-inflicted server- side DoS). - ``timeout<=0`` — the loop body runs one probe before the deadline check, so a non-positive timeout silently turns "wait" into "synchronous probe" without saying so. Each is rejected at function entry so the failure surfaces as an explicit ``ToolError`` instead of corrupting the contract. what: - Inline parameter checks before the regex compile in wait_for_text; raise ToolError with the offending value in the message. - tests/test_pane_tools.py: one test per rejection, asserting the matching ToolError message.

why: The baseline-anchor mechanism assumes the pane's process and grid stay coherent for the duration of the wait. Two events break that assumption silently: - respawn-pane mid-wait. screen_reinit (screen.c) resets cy to 0 but preserves hsize, so the absolute anchor keeps pointing at the old process's grid — the new process's output ends up at rows the capture skips, and the wait times out as if nothing arrived. - pane death. With remain-on-exit set, tmux keeps the pane alive after its child exits and #{pane_dead} flips to 1. Without detection, the loop just keeps polling a stale frozen grid until the deadline. Both cases now surface as explicit ToolErrors so callers can react instead of timing out in the dark. what: - Replace _read_grid_position, _read_history_size, and _read_pane_height with a single _PaneState NamedTuple read via one display-message round-trip. Per-tick subprocess count drops from three reads (entry + per-tick history + per-tick we'd need height) to two (state + capture) without losing any fields. - Capture pane_pid at entry; each tick checks pane_dead first and then verifies pane_pid is unchanged. Either condition raises ToolError naming the affected pane and the transition. - tests/test_pane_tools.py: regression for respawn (asyncio.gather pattern: start wait_task, respawn after 0.1s, expect ToolError) and for pane death (set remain-on-exit, respawn into ``true``, poll #{pane_dead}, expect ToolError).

…me-row blind spot why: Two parts of the wait_for_text docstring oversold the baseline-anchor model: - The reverse-index note claimed ``\eM`` "rewrites history below the baseline." It does not. screen_write_reverseindex (screen-write.c) scrolls the visible region within [rupper, rlower] or decrements cy. ``hsize`` is never touched. A maintainer reading the tmux source to verify a related claim would lose trust in the docstring. - The "What 'new' means" block sells the tool as observing NEW text without saying that in-place updates to the entry cursor row (carriage-return rewrites, progress spinners, single-line status updates) are excluded by design. Users wiring this against TUIs that update in place see a silent timeout. what: - Rewrite the reverse-index note around the real hazard — in-place repaints below the baseline — and cite screen-write.c for the visible-region scroll behaviour rather than asserting history is rewritten. - Add a paragraph under "What 'new' means" that explicitly disclaims same-row rewrites and points readers at wait_for_content_change for the in-place case.

…md format why: The prior CHANGES entries on this branch carried commit-message shape (why: / what: bodies, mechanism-heavy bullets) into a changelog that AGENTS.md specifies as user-vocabulary prose under Bold subheadings. Three CHANGES commits were dropped from the branch via rebase; this single commit replaces them with prose entries in the project's actual changelog format. what: - Breaking changes: parameter drop with before/after migration block. - Fixes: one prose paragraph covering the baseline-anchor behaviour change (stale-scrollback fix, bottom-row clip handling, death/respawn surfacing) under a single user-vocabulary subheading. - Fixes: one prose paragraph covering input validation. - Section ordering: Breaking changes / Dependencies / Fixes, per AGENTS.md line 420.

…'t match the implementation why: Two comments in tests/test_pane_tools.py — the WaitForTextFixture docstring and the docstring on test_wait_for_text_matches_new_output_after_baseline — describe the positive-path coordination as using asyncio.gather. The implementation actually uses asyncio.create_task + a sequenced await on the emit coroutine, because gather would schedule both coroutines concurrently and lose the start-baseline-then-emit ordering the test relies on. A future maintainer following the comments would either rewrite the test incorrectly or doubt the existing shape. what: - WaitForTextFixture pre_command docstring: replace the asyncio.gather mention with asyncio.create_task plus a sequenced await. - test_wait_for_text_matches_new_output_after_baseline docstring: describe the actual create_task + sequenced-await mechanism and note explicitly that asyncio.gather is unsuitable here because it loses the ordering guarantee.

…t precedent why: The _stale_settled helper inside the wait_for_text parametrized test waited 2 seconds for (history_size, cursor_y) to stabilise across two consecutive polls. Every other settle-loop in this file (_pane_dead, _ready, _bottom_row_ready) uses a 5-second budget for similar tmux-state stability under shell activity. The 2-second outlier risked flaking on CI nodes whose zsh async-prompt hooks keep firing for longer than the budget, missing the two-consecutive-identical-reads window that signals settlement. what: - retry_until(_stale_settled, 2, raises=True) becomes retry_until(_stale_settled, 5, raises=True). Happy-path wall-clock is unaffected (the predicate returns True as soon as state stabilises, usually within milliseconds); only the failure-mode budget widens.

…nchor stays sound Adds a rollover guard inside the poll loop: when ``state.history_size < entry.history_size`` (grid_collect_history fired, clear-history ran, or the session's history-limit option was shrunk mid-wait), the absolute baseline ``hs0 + cy0`` is no longer recoverable. There is no server-side way to disambiguate "trimmed" from "still anchored", so surface the lost anchor as ToolError with a steer toward wait_for_channel for deterministic synchronization. Also fixes the stale ``pane_height`` reference in the bottom-row clip guard — it was captured once at entry and never refreshed, so a pane resize mid-wait left the guard keyed to a stale height. Compare against ``state.pane_height`` re-read each tick so the guard tracks the live visible region. Refs tmux grid.c grid_collect_history; verified against tmux master HEAD.

Passes ``join_wrapped=True`` (tmux ``-J``) to ``Pane.capture_pane`` so a pattern that crosses the pane's visual wrap is matched against the joined logical line. Without this, long error strings like "Build failed: module not found" got split across two rows by tmux and a naive ``re.search`` against each row in isolation never matched. The returned ``matched_lines`` entry for a wrap-crossing hit is the joined line and can therefore be longer than ``pane_width``; documented in the Notes section so agents that bound output understand the implication.

…nistic primitive Re-frames the wait family so agents pick the cheaper, race-free option first. Three coordinated doc edits: * ``send_keys`` docstring now forks the post-send choice into deterministic (``wait_for_channel`` + ``tmux wait-for -S``), pattern-match (``wait_for_text`` for output the agent doesn't control), and any-change (``wait_for_content_change``). * Server ``_INSTR_WAIT_NOT_POLL`` system instruction names ``wait_for_channel`` first; the 2 KB instruction budget is preserved by tightening the supporting prose. * ``wait_for_text`` docstring gains a ``When NOT to use this`` section that names the sequential-send_keys race and steers callers to the channel pattern with a pointer at the ``run_and_wait`` recipe. Tests: - ``test_base_instructions_prefer_wait_over_poll`` now requires ``wait_for_channel`` to appear in the system instructions and to be named before the fallbacks. - ``test_send_keys_docstring_cross_links_wait_for_channel`` is a cheap regression against future docstring drift.

The Phase 1 rollover predicate (``state.history_size < entry.history_size``) fires correctly on real row eviction — ``grid_collect_history`` trim, ``clear-history``, ``set-option history-limit <smaller>`` on tmux master — but ALSO fires on resize-grow when ``hscrolled > 0``. tmux's ``screen_resize_y`` (screen.c:451-465) decrements ``gd->hsize`` on a vertical grow by pulling rows from history back into the visible region. The rows are not freed; only the history/visible-region partition shifts and absolute indices stay valid. The Phase 1 predicate raised a spurious ``ToolError`` in that case. Trim and resize-grow are distinguished by ``pane_height``: trim leaves it unchanged, resize-grow increases it. Adding ``state.pane_height <= entry.pane_height`` as a conjunct makes the predicate the actual signature of row eviction, exempting resize-grow. Also walks back two over-broad claims that shipped with Phase 1: the "monotonically non-decreasing" framing in CHANGES (resize-grow proves it is not monotone) and the parenthetical about "shrinking history-limit mid-wait" — that retroactive-trim path requires tmux commit 195a9cf which is not in any tagged release as of tmux 3.6a. Adds test_wait_for_text_survives_resize_grow_with_scrolled_history.

… guard catches a subset A 2 ms high-frequency repro over ~3000 hsize samples showed the rollover guard's strict-inequality predicate (state.history_size < entry.history_size) never observes a dip during real history-limit overflow: tmux's grid_collect_history trims (~10% of hlimit) then immediately scrolls new lines back, so sampled hsize stays clamped near the cap. The predicate only fires when entry.history_size happened to land exactly at hlimit (environment luck). All earlier claims that "wait_for_text raises on history rollover" were therefore over-broad. Walk back the contract: * Docstring's "Scrollback rollover raises" Note now says "rollover detection is partial" and enumerates what is and isn't caught (clear-history yes; retroactive history-limit shrink on tmux >= 3.7 yes; grid_collect_history trim during continuous output NO). Points to the upcoming risk-band warning and to wait_for_channel for deterministic synchronization. * ToolError message changes from "history rolled over during wait" to "history shrank below entry baseline" -- describes what we actually detected (a shrink), not what we assumed (a rollover). * CHANGES breaking-change entry: same walk-back; references the new ### Fixes warning entry and the tracking issue filed in libtmux. * test_wait_for_text_raises_on_history_rollover renamed to test_wait_for_text_raises_when_history_is_cleared, since that is the path it actually exercises. Docstring trimmed to describe the test mechanism, with no dev-history narrative. * Same dev-history narrative removed from test_wait_for_text_survives_resize_grow_with_scrolled_history (same anti-pattern, same project guidance about not referencing the task or fix history in code comments). No behavior change beyond the message text and test rename. Phase 7 follow-up adds the risk-band warning predicate that catches what this guard does not.

… band The strict-shrink rollover guard misses grid_collect_history trim during continuous output: tmux trims ~10% of history-limit then immediately scrolls a line back, so sampled hsize stays clamped at the cap and the predicate state.history_size < entry.history_size never fires. A 2 ms high-frequency probe across ~3000 samples confirmed there is no sub-poll window where hsize dips below entry once entry was sub-cap. Emit a one-shot ctx.warning when sampled state enters the trim-risk band (top 10% of history-limit) AND the pane has advanced since the wait's baseline. MCP clients subscribed to notifications/message see "correctness is best-effort here. For deterministic synchronization use wait_for_channel." and can decide whether to keep waiting, retry, or switch primitives. Warning is one-shot per call so we don't spam. Adds history_limit: int to _PaneState; the field is read in the same display-message round-trip as the other per-tick fields, so the loop still costs two subprocesses per tick (state + capture). No CHANGES update in this commit -- the consolidated wait-for-text CHANGES entry will land in a later commit per the Narrative Bleed doctrine (no published-release user saw "no warning" to be "fixed"; this warning is just part of the new contract).

… output Adds the test the rollover claim should have had from the start. Earlier test_wait_for_text_raises_when_history_is_cleared exercises clear-history only -- not the grid_collect_history trim that fires during bursty output. This new test covers the actual real-rollover path. Shape: set history-limit=50 globally, split a fresh pane that inherits the small limit, attach a stub Context that records ctx.warning calls, start wait_for_text against a never-appearing pattern, burst 200 echo lines through the pane. tmux's grid_collect_history fires repeatedly, sampled history_size enters the risk band (>= 45), and the wait emits the one-shot trim-risk-band warning. The wait's found / timed_out result is intentionally not asserted -- the contract is best-effort once polling enters the risk band, so pinning a specific result would be over-specifying what tmux's grid model can't actually guarantee. The test pins the warning contract (what the tool guarantees) only.

…t for a published-release reader CHANGES walk-back and Phase 5 "spurious resize-grow" entries described intra-branch transitions a 0.1.0a6 user never saw. Consolidate the wait-for-text section into entries that describe only what the published release reader gets: * Breaking changes: one entry under "waits for new output, not stale scrollback" covers the #45 fix (real published bug), the new baseline-loss ToolErrors, the resize-grow exemption, the trim-risk-band warning, and the wait_for_channel cross-link. Reader needs zero branch context. * Fixes: keep the wrapped-line match and footgun-rejection entries (both real published bugs). Drop the "no longer raises spuriously on resize-grow" entry (phantom fix: 0.1.0a6 had no rollover guard to raise, spuriously or otherwise). The duplicate "waits for new output" Fixes entry is also dropped -- the consolidated Breaking changes entry covers it. Docstring Notes section in wait.py scrubbed of the tmux-commit-SHA citation and the forward-looking libtmux tracking-issue cross-reference. The tracking issue exists (filed at libtmux as part of this PR) but that's branch-internal narrative; users running the shipped wait_for_text don't need to know about a follow-up investigation issue to use the tool correctly.

tony · 2026-05-16T13:21:18Z

Code review

Found 2 issues:

CHANGES unreleased section ordering still violates AGENTS.md (the May 15 review's finding remains unresolved). AGENTS.md says "Fixed subheadings, in this order when present: ### Breaking changes, ### Dependencies, ### What's new, ### Fixes, ### Documentation, ### Development." Current order is Breaking changes → Documentation → Dependencies → Fixes. ### Documentation must come last (after ### Fixes), and ### Dependencies must come before ### Fixes.

libtmux-mcp/CHANGES

Lines 28 to 44 in 91ea744

    
           Trim that fires during continuous output cannot be reliably detected from `history_size` alone: tmux's trim-then-scroll bounce keeps sampled `hsize` clamped near `history-limit`. When polling approaches that band, the tool emits a `notifications/message` warning so MCP clients can decide whether to keep waiting, retry, or switch to {tooliconl}`wait-for-channel`. For deterministic command-completion synchronization, compose `tmux wait-for -S <channel>` into the shell command and call {tooliconl}`wait-for-channel`. (#45) 
        
           ### Documentation 
        
           **Wait family is re-framed around {tooliconl}`wait-for-channel` as the deterministic primitive** 
        
           The {tooliconl}`send-keys` docstring, the server system instructions, and the {tooliconl}`wait-for-text` docstring now point agents at {tooliconl}`wait-for-channel` with composed `tmux wait-for -S` for command completion, and reserve {tooliconl}`wait-for-text` / {tooliconl}`wait-for-content-change` for output the agent does not author. The `run_and_wait` recipe is the canonical status-preserving pattern. (#45 follow-up) 
        
           ### Dependencies 
        
           **Minimum `libtmux>=0.56.0`** (was `>=0.55.1`). Unlocks the new tmux-command wrappers shipped in libtmux 0.56.0 — {meth}`~libtmux.Pane.respawn`, {meth}`~libtmux.Pane.copy_mode`, {meth}`~libtmux.Pane.pipe`, {meth}`~libtmux.Pane.swap`, {meth}`~libtmux.Pane.paste_buffer`, {meth}`~libtmux.Pane.clear_history`, {meth}`~libtmux.Pane.display_message`, {meth}`~libtmux.Server.delete_buffer`, and the {meth}`~libtmux.Session.next_window` / {meth}`~libtmux.Session.previous_window` / {meth}`~libtmux.Session.last_window` trio — so the MCP no longer falls back to raw `cmd()` calls for tmux commands the upstream API now covers. (#46) 
        
           ### Fixes 
        
           **{tooliconl}`wait-for-text` matches patterns across visually-wrapped lines**

AGENTS.md rule:

libtmux-mcp/AGENTS.md

Lines 419 to 421 in 91ea744

    
           **Fixed subheadings**, in this order when present: `### Breaking changes`, `### Dependencies`, `### What's new`, `### Fixes`, `### Documentation`, `### Development`. Dev tooling (helper scripts, internal automation) lives under `### Development`. For breaking changes, show the migration path with concrete inline code (e.g. a `# Before` / `# After` fenced code block). Dependency floor bumps use the form ``Minimum `pkg>=X.Y.Z` (was `>=X.Y.W`)``.

tests/test_pane_tools.py:2074 uses bash/zsh brace expansion {1..100} while every other pre-fill in the same file uses POSIX-portable $(seq 1 N) (8 other call sites: lines 1332, 1519, 1563, 1601, 1662, 1716, 2018). When the new pane's shell is dash or any non-bash POSIX /bin/sh, the literal string {1..100} is iterated once and produces no scrollback — the _prefilled retry_until then exhausts its budget before wait_for_text ever runs. Verified across all tmux 3.2a–master checkouts: shell selection sits below tmux, this is purely a shell-portability issue. Use $(seq 1 100) to match the file's convention.

libtmux-mcp/tests/test_pane_tools.py

Lines 2072 to 2076 in 91ea744

    
           # history-limit is 50. Risk floor (top 10%) is 45. 
        
           # Print 100 lines to ensure hsize reaches the cap (50). 
        
           fresh_pane.send_keys("for i in {1..100}; do echo line$i; done", True) 
        
           def _prefilled() -> bool:

🤖 Generated with Claude Code

_{- If this code review was useful, please react with 👍. Otherwise, react with 👎.}

…y_limit once The risk-band warning's `advanced` conjunction (state.history_size > entry.history_size OR state.cursor_y != entry.cursor_y) gated the warning on the wait having moved since entry. Three real cases that warning was meant to cover slipped past it: * Entry already in the risk band, idle wait -- no output happens, nothing advances, warning never fires even though the baseline is in dangerous territory from tick zero. * Cursor pinned at the bottom row of the visible region -- subsequent scrolls advance history_size but leave cursor_y unchanged; if hsize also stayed at the cap across two polls, advanced returned False. * Trim+scroll bounce keeping hsize stable at the cap -- entry at 49, every subsequent poll also sees 49; advanced returns False. Drop the conjunction. The one-shot warned_risk_band flag is already the right noise filter -- it limits warnings to one per call regardless of how long the wait sits in the band. Firing on any tick where state.history_size >= risk_floor catches all three cases. Also cache baseline_hlimit at entry instead of re-reading state.history_limit each tick. history-limit is fixed at pane creation (retroactive change only lands in tmux 3.7+), so per-tick reads are redundant. The warning message uses the cached value too. Adds test_wait_for_text_warns_when_already_in_risk_band -- pre-fills a fresh pane past the risk floor, then runs an idle wait with no further output. Asserts the warning fires anyway. Pairs with the existing test_wait_for_text_warns_in_history_limit_risk_band which covers the advance-into-the-band path.

…ooting at wait_for_channel Two doc surfaces still recommended the send_keys -> wait_for_text -> capture_pane pattern that the wait-family re-framing walked back: * docs/topics/prompting.md has a "For general tmux workflows" code block tagged ``:class: system-prompt`` -- specifically meant for agents to copy into their AGENTS.md / CLAUDE.md / .cursorrules. Higher leverage than tool descriptions because it shapes the agent's mental model before any tool call. * docs/topics/troubleshooting.md item 3 ("Timing") had the same recommendation in a "what to check" bullet -- lower visibility but still misleading. Both now lead with wait_for_channel + composed tmux wait-for -S as the deterministic primitive for command completion, and demote wait_for_text / wait_for_content_change to "output you don't author" duty. Matches the send_keys docstring, server instructions, and the wait_for_text "When NOT to use this" section. No CHANGES update -- the underlying re-framing already shipped under ### Documentation. These are stragglers being brought in line with the announced change.

…proportions and ordering Three corrections against the Changelog Conventions in AGENTS.md: * **Section order.** The fixed subheading order is Breaking changes -> Dependencies -> What's new -> Fixes -> Documentation -> Development. The unreleased block had Documentation in the wrong slot (between Breaking changes and Dependencies) and Fixes ahead of Documentation out of order. Reordered to match. * **Proportional prose.** Dropped backend-mechanism phrasing ("trim-then-scroll bounce", "sampled hsize clamped near history-limit", "no longer falls back to raw cmd() calls") that violates the "internal jargon" anti-pattern. The big breaking-change entry is now three short paragraphs in user vocabulary (what changed, what raises, what the warning means). The Dependencies entry no longer walls-of-text 9 specific {meth} method names; it describes the capability ("typed wrappers for tmux commands the server invokes"). * **PR ref form.** Replaced the lone `(#45 follow-up)` qualifier with bare `(#45)` to match the project's pattern across 25+ other entries. The Documentation entry also picks up the doc-fragment work that Phase 12 + Phase 14a actually shipped (quickstart, gotchas, prompting, troubleshooting, recipes, send-keys topics), not just the three docstring surfaces named originally. The reordered breaking change pair leads with the bigger semantic change (waits for new output) and follows with the syntactic break (content_start / content_end removal) so readers see the meatier fix first.

…_wait recipe The run_and_wait prompt template appended ``exit $__mcp_status`` to its shell payload to preserve the command's exit status. In an interactive shell that exits the shell itself; with a single-pane session, the session and tmux server exit with it. Reproduced: channel signals, then list-panes / capture-pane fail with "no server running." The equivalent example in docs/tools/pane/wait-for-channel.md was worse -- ``pytest; status=$?; tmux wait-for -S tests_done; exit $status`` uses ``status`` (read-only in zsh) AND ``exit $status`` (kills the shell in any shell). Reproduced: ``zsh: read-only variable: status`` followed by a deadlocked wait-for. Replace both with the simpler ``{command}; tmux wait-for -S {channel}`` form. Shell ``;`` semantics fire the wait-for whether the command succeeds or fails, so the edge-triggered signal still never deadlocks on a crashed command -- which was the safety property the original ``exit $status`` chain was supposed to provide. Drop the "preserving the command's exit status" claim from the recipe docstring and rendered prose; preserving the status in an interactive shell requires out-of-band routing (a file, a tmux user variable, a scratch pane) that doesn't belong in a teaching recipe. The existing test pinned ``exit $__mcp_status`` as a required token in the rendered template -- that assertion is inverted to lock the new safe contract. Tests still pin the channel naming pattern and the wait-for suffix. Adds a Fixes CHANGES entry: the run_and_wait prompt shipped in 0.1.0a2, so this is a real user-facing bug for 0.1.0a6 users.

…at entry The cursor-position anchor captures rows below the entry cursor each tick, which means rows that already contained content at entry got matched by the regex on the first poll. Reviewer reproduced with pane state 0:0:10 and STALE_BELOW painted on row 1: the wait returned found=True with matched_lines=['STALE_BELOW'] in 10 ms. The docstring claim that wait_for_text matches "lines written after the call starts" was broader than the implementation could honestly deliver. Snapshot the rows below the entry cursor once at entry as a frozenset. On every tick, drop captured lines whose content is already in that set before running the regex. Result is an honest content-delta contract: "row exists at start_line onward AND was not at that content at entry time". Stale paint-style writes (TUI repaints, paste-text, manual cursor positioning) get filtered; new output appended below the entry cursor still matches; in-place updates of rows that pre-existed at entry are also filtered (false-negative edge case the docstring now calls out). Trade-off: one extra capture-pane call at entry. The frozenset lookup is O(1) per line. Per-tick cost is unchanged in subprocess count. Adds test_wait_for_text_ignores_stale_below_cursor as the regression the reviewer's reproduction would have caught: respawn the pane with ``printf 'TOP\nSTALE_BELOW\n'; printf '\033[H'; sleep 60`` (the ESC[H sequence moves the cursor back to (0,0) so STALE_BELOW sits on row 1 below the cursor), then assert wait_for_text(STALE_BELOW) returns found=False. No CHANGES update: this tightens behavior introduced earlier in the same PR (the baseline-anchor design). A 0.1.0a6 user never saw the cursor anchor at all, so the filter is not a user-visible fix — it's the docstring contract getting honest in the same release.

…remaining surfaces Phase 12 + Phase 14a swept six doc surfaces; a follow-up review caught three more: * docs/tools/index.md "Running a command?" entry led with send-keys -> wait-for-text + capture-pane. Now leads with the channel-composed pattern for authored commands, with wait-for-text and wait-for-content-change demoted to "output you don't author." * docs/recipes.md pytest workflow used wait-for-text(pattern="passed|failed|error"). Pytest is the agent's authored command -- the deterministic composition is "pytest; tmux wait-for -S pytest_done" + wait-for-channel. Updated the Act paragraph to match. * src/libtmux_mcp/prompts/recipes.py interrupt_gracefully kept wait_for_text on purpose -- C-c is a signal, not a command, so there is no shell statement to compose tmux wait-for -S into. Added a clarifying note in the rendered template so future readers don't apply the channel re-framing reflexively. No CHANGES update: the wait-family re-framing already has its entry under ### Documentation. This just completes the sweep.

… what early-anchor measures The pre-baseline start_time anchor was annotated as preventing a stalled tmux server from blowing the user-supplied timeout. That overstated what the early anchor does. libtmux's tmux_cmd uses Popen.communicate() with no subprocess timeout, so a hung tmux call blocks indefinitely regardless of where start_time was anchored. The early anchor's actual value: WaitForTextResult.elapsed_seconds reflects total call duration including the baseline read, so a hung tmux call's blowout is visible in the returned result. Measurement, not prevention. Comment-only change. No CHANGES, no behavior delta.

…ion than wait_for_text This branch hardens wait_for_text against pane respawn, pane death, and clear-history mid-wait, and re-frames the wait family as a fallback ladder that ends at wait_for_content_change. But the sibling tool itself never gained those guards -- so an agent that follows the recommended fallback order lands on the tool with the weakest baseline-loss detection. Respawn, death, and clear-history surface there as silent changed=True returns instead of ToolError. Add a single docstring paragraph naming the gap and pointing back at wait_for_channel for correctness-sensitive flows. Full parity with wait_for_text is filed as #53 -- this commit closes the disclosure gap so agents see the asymmetry from the tool description directly, not only from following the issue trail. Verified via the local MCP probe: wait_for_content_change now exposes the wait_for_text / ToolError / wait_for_channel tokens in its description across all four configured clients.

…m tool docstrings The recipe fix in #47 replaced run_and_wait's killing payload (`cmd; status=$?; tmux wait-for -S name; exit $status`) with the safe `cmd; tmux wait-for -S name` form, because `exit $status` in an interactive shell exits the shell itself and destroys single-pane sessions. The sweep missed the two surfaces where it matters most: * wait_for_tools.py module docstring (Safety section) -- told every agent reading it to use the killing payload. * wait_for_channel function docstring (composition example) -- the same payload appears as the canonical composition example, with a body paragraph explicitly recommending `status=$?; ...; exit $status` as "important." Both become MCP tool descriptions at the protocol level, so agents learn the destructive pattern directly from the channel tool that this PR positions as the deterministic primitive. That is the opposite of the framing the rest of the branch establishes. Replace both example payloads with `pytest; tmux wait-for -S tests_done` (the form recipes.py renders since the recipe fix), and add a "Do NOT chain exit $status" warning explaining the failure mode so an agent that has read older docs can correct course. Exit-status preservation in interactive shells is named as out-of-scope to match the recipe's stance. Verified via the local MCP probe against the swapped `tmux` server on claude / codex / cursor / gemini: wait_for_channel now teaches the safe pattern and contains no unguarded `exit $status` token.

…s surfaces The recipe fix in #47 removed exit-status preservation from the run_and_wait recipe itself but the supporting prose that describes what run_and_wait produces was not swept. Five surfaces still advertised the recipe as status-preserving: * docs/prompts.md sample render -- showed the OLD payload `pytest; __mcp_status=$?; tmux wait-for -S name; exit $__mcp_status` in the run_and_wait sample-render block, including the body paragraph that justified the exit-status idiom. Readers of the prompts catalogue page learned the killing pattern even though the rendered prompt itself no longer emits it. * src/libtmux_mcp/tools/pane_tools/io.py send_keys docstring -- referred to run_and_wait as "the canonical status-preserving pattern." The send_keys description is the entry point most agents read for picking a sync primitive; pointing them at a status- preserving recipe that no longer preserves status is a contract lie. * src/libtmux_mcp/tools/pane_tools/wait.py wait_for_text docstring -- carried "safe, status-preserving composition" in the channel cross-link. * CHANGES unreleased entry -- the Documentation bullet for the channel-first re-framing closed with "The run_and_wait recipe shows the canonical status-preserving pattern." * docs/demo.md -- the worked examples and "fundamental pattern" prose still chained `send_keys -> wait_for_text -> capture_pane`, rather than the channel-first pattern this PR establishes. Replace "status-preserving" with "safe-completion" on the io.py / CHANGES surfaces; rewrite the prompts.md sample render to match the actual run_and_wait output (now `cmd; tmux wait-for -S channel`) with the recipe's explicit "does not preserve exit status" disclaimer; switch demo.md's worked examples and "fundamental pattern" sentence to the wait_for_channel composition. Verified via the local MCP probe across claude / codex / cursor / gemini: send_keys description now contains "safe-completion" and no surface teaches the OLD payload string.

WaitForTextResult.timed_out was mechanically `not found`; ContentChangeResult.timed_out was mechanically `not changed`. There is no third state — every wait outcome is either match-found or timed-out, so the second boolean carries no information beyond the primary result field. The redundancy was visible in the test suite as `assert result.timed_out is (not expected_found)`, asserting the identity directly. Drop the field from both Pydantic models. Drop the assignments in the two return sites in wait.py. Drop the redundant assertions in test_pane_tools.py — each was paired with an equivalent assertion on `found` or `changed`. Replace the now-unused-result lints in the two timeout-warning tests with affirmative `assert result.found is False` / `assert result.changed is False` so the timeout outcome stays covered alongside the warning-emission contract. Update the timed_out references in test docstrings and in the JSON examples on the wait-for-text / wait-for-content-change docs pages. CHANGES gains a Breaking changes bullet. Pre-alpha, no compat shim. Agents that read `result.timed_out` should switch to `not result.found` / `not result.changed`. Verified via the local MCP probe across the swapped clients and the full quality-gate chain: 462 tests pass, ruff and mypy clean, build-docs succeeds, tool descriptions still teach the safe pattern.

…able why: AGENTS.md "Changelog Conventions" requires each deliverable to have a bold subheading naming it in user vocabulary, with cross-references in the prose body. The previous heading led with two `{class}` rST roles that crowded the user-visible change ("drop `timed_out`") and pushed it to the end of a long line. what: - Rename the bold subheading to "Wait result models drop `timed_out`" — names the deliverable, fits on one line, reads as user vocabulary. - Move the two `{class}` cross-references for WaitForTextResult and ContentChangeResult into the opening sentence of the body where they render inline. - Body wording unchanged otherwise; the migration guidance and PR reference stay intact.

tony force-pushed the stale-wait-for-text branch from aa8de89 to 317fbee Compare May 12, 2026 22:58

tony force-pushed the stale-wait-for-text branch 3 times, most recently from 5c72dd9 to 9cafb35 Compare May 15, 2026 01:04

tony force-pushed the stale-wait-for-text branch 3 times, most recently from afea390 to 273d08f Compare May 16, 2026 09:40

tony added 19 commits May 16, 2026 07:43

tony force-pushed the stale-wait-for-text branch from 273d08f to 91ea744 Compare May 16, 2026 13:01

tony changed the title ~~Pane(fix[wait_for_text]): anchor on baseline so stale scrollback no longer matches~~ Pane(fix[wait_for_text]): waits for new output, joins wrapped lines, prefers wait-for-channel for completion May 16, 2026

tony added 2 commits May 16, 2026 08:50

tony force-pushed the stale-wait-for-text branch from 91ea744 to 82cc9cd Compare May 16, 2026 13:52

tony added 8 commits May 16, 2026 09:00

tony force-pushed the stale-wait-for-text branch from 989322a to 0ded558 Compare May 16, 2026 17:07

tony added 2 commits May 16, 2026 12:36

tony merged commit b180d99 into main May 16, 2026
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pane(fix[wait_for_text]): waits for new output, joins wrapped lines, prefers wait-for-channel for completion#47

Pane(fix[wait_for_text]): waits for new output, joins wrapped lines, prefers wait-for-channel for completion#47
tony merged 31 commits into
mainfrom
stale-wait-for-text

tony commented May 12, 2026 •

edited

Loading

Uh oh!

codecov-commenter commented May 12, 2026 •

edited

Loading

Uh oh!

tony commented May 13, 2026

Uh oh!

tony commented May 15, 2026

Uh oh!

tony commented May 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

tony commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changed for users

Theoretical grounding

Test plan

Filed follow-ups (deferred, not in this PR)

Breaking changes

Uh oh!

codecov-commenter commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

tony commented May 13, 2026

Code review

Uh oh!

tony commented May 15, 2026

Code review

Uh oh!

tony commented May 16, 2026

Code review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tony commented May 12, 2026 •

edited

Loading

codecov-commenter commented May 12, 2026 •

edited

Loading