
Commit ee674b8

docs: update a few docs and code comments (#2564)
1 parent 02bf923 commit ee674b8

File tree

6 files changed (+144, -22 lines)


docs/agents.md

Lines changed: 42 additions & 1 deletion
@@ -211,7 +211,48 @@ agent = Agent[UserContext](
 
 ## Lifecycle events (hooks)
 
-Sometimes, you want to observe the lifecycle of an agent. For example, you may want to log events, or pre-fetch data when certain events occur. You can hook into the agent lifecycle with the `hooks` property. Subclass the [`AgentHooks`][agents.lifecycle.AgentHooks] class, and override the methods you're interested in.
+Sometimes, you want to observe the lifecycle of an agent. For example, you may want to log events, pre-fetch data, or record usage when certain events occur.
+
+There are two hook scopes:
+
+- [`RunHooks`][agents.lifecycle.RunHooks] observe the entire `Runner.run(...)` invocation, including handoffs to other agents.
+- [`AgentHooks`][agents.lifecycle.AgentHooks] are attached to a specific agent instance via `agent.hooks`.
+
+The callback context also changes depending on the event:
+
+- Agent start/end hooks receive [`AgentHookContext`][agents.run_context.AgentHookContext], which wraps your original context and carries the shared run usage state.
+- LLM, tool, and handoff hooks receive [`RunContextWrapper`][agents.run_context.RunContextWrapper].
+
+Typical hook timing:
+
+- `on_agent_start` / `on_agent_end`: when a specific agent begins or finishes producing a final output.
+- `on_llm_start` / `on_llm_end`: immediately around each model call.
+- `on_tool_start` / `on_tool_end`: around each local tool invocation.
+- `on_handoff`: when control moves from one agent to another.
+
+Use `RunHooks` when you want a single observer for the whole workflow, and `AgentHooks` when one agent needs custom side effects.
+
+```python
+from agents import Agent, RunHooks, Runner
+
+
+class LoggingHooks(RunHooks):
+    async def on_agent_start(self, context, agent):
+        print(f"Starting {agent.name}")
+
+    async def on_llm_end(self, context, agent, response):
+        print(f"{agent.name} produced {len(response.output)} output items")
+
+    async def on_agent_end(self, context, agent, output):
+        print(f"{agent.name} finished with usage: {context.usage}")
+
+
+agent = Agent(name="Assistant", instructions="Be concise.")
+result = await Runner.run(agent, "Explain quines", hooks=LoggingHooks())
+print(result.final_output)
+```
+
+For the full callback surface, see the [Lifecycle API reference](ref/lifecycle.md).
 
 ## Guardrails
 
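The hook timing documented above can be made concrete with a plain-Python simulation. This is only an illustrative sketch of the documented ordering for one agent turn that calls one tool; `RecordingHooks` and `simulate_turn` are hypothetical names, not the SDK's dispatch code:

```python
class RecordingHooks:
    """Collects hook event names in the order a run would fire them (illustrative only)."""

    def __init__(self):
        self.events = []

    def fire(self, name):
        self.events.append(name)


def simulate_turn(hooks):
    # Documented ordering for one agent turn that invokes one tool:
    hooks.fire("on_agent_start")  # the agent begins
    hooks.fire("on_llm_start")    # model call that decides to use a tool
    hooks.fire("on_llm_end")
    hooks.fire("on_tool_start")   # local tool invocation
    hooks.fire("on_tool_end")
    hooks.fire("on_llm_start")    # model call that produces the final output
    hooks.fire("on_llm_end")
    hooks.fire("on_agent_end")    # agent finishes with its final output
    return hooks.events


print(simulate_turn(RecordingHooks()))
```

`on_handoff` would slot in between `on_agent_end`-less turns when control moves to another agent; with `RunHooks`, the same observer keeps firing across that boundary.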

docs/mcp.md

Lines changed: 1 addition & 0 deletions
@@ -339,6 +339,7 @@ async with MCPServerStdio(
 ## 5. MCP server manager
 
 When you have multiple MCP servers, use `MCPServerManager` to connect them up front and expose the connected subset to your agents.
+See the [MCPServerManager API reference](ref/mcp/manager.md) for constructor options and reconnect behavior.
 
 ```python
 from agents import Agent, Runner
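The connect-up-front behavior can be sketched in plain Python. `ServerHandle` and `connect_all` below are hypothetical stand-ins, not the SDK's `MCPServerManager` API; they only illustrate "connect everything first, then expose the connected subset":

```python
class ServerHandle:
    """Hypothetical stand-in for an MCP server connection."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def connect(self):
        # A real server would open a stdio/HTTP session here; we just report health.
        return self.healthy


def connect_all(servers):
    """Connect every server up front and return only the connected subset."""
    connected = []
    for server in servers:
        if server.connect():
            connected.append(server)
        # Servers that fail to connect are simply excluded from what agents see.
    return connected


servers = [ServerHandle("files"), ServerHandle("search", healthy=False)]
print([s.name for s in connect_all(servers)])
```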

docs/tools.md

Lines changed: 24 additions & 13 deletions
@@ -673,10 +673,9 @@ Disabled tools are completely hidden from the LLM at runtime, making this useful
 
 ## Experimental: Codex tool
 
-The `codex_tool` wraps the Codex CLI so an agent can run workspace-scoped tasks (shell, file edits, MCP tools)
-during a tool call. This surface is experimental and may change.
-By default, the tool name is `codex`. If you set a custom name, it must be `codex` or start with `codex_`.
-When an agent includes multiple Codex tools, each must use a unique name (including vs non-Codex tools).
+The `codex_tool` wraps the Codex CLI so an agent can run workspace-scoped tasks (shell, file edits, MCP tools) during a tool call. This surface is experimental and may change.
+
+Use it when you want the main agent to delegate a bounded workspace task to Codex without leaving the current run. By default, the tool name is `codex`. If you set a custom name, it must be `codex` or start with `codex_`. When an agent includes multiple Codex tools, each must use a unique name.
 
 ```python
 from agents import Agent
@@ -705,21 +704,33 @@ agent = Agent(
 )
 ```
 
-What to know:
+Start with these option groups:
+
+- Execution surface: `sandbox_mode` and `working_directory` define where Codex can operate. Pair them together, and set `skip_git_repo_check=True` when the working directory is not inside a Git repository.
+- Thread defaults: `default_thread_options=ThreadOptions(...)` configures the model, reasoning effort, approval policy, additional directories, network access, and web search mode. Prefer `web_search_mode` over the legacy `web_search_enabled`.
+- Turn defaults: `default_turn_options=TurnOptions(...)` configures per-turn behavior such as `idle_timeout_seconds` and the optional cancellation `signal`.
+- Tool I/O: tool calls must include at least one `inputs` item with `{ "type": "text", "text": ... }` or `{ "type": "local_image", "path": ... }`. `output_schema` lets you require structured Codex responses.
+
+Thread reuse and persistence are separate controls:
+
+- `persist_session=True` reuses one Codex thread for repeated calls to the same tool instance.
+- `use_run_context_thread_id=True` stores and reuses the thread ID in run context across runs that share the same mutable context object.
+- Thread ID precedence is: per-call `thread_id`, then run-context thread ID (if enabled), then the configured `thread_id` option.
+- The default run-context key is `codex_thread_id` for `name="codex"` and `codex_thread_id_<suffix>` for `name="codex_<suffix>"`. Override it with `run_context_thread_id_key`.
+
+Runtime configuration:
 
 - Auth: set `CODEX_API_KEY` (preferred) or `OPENAI_API_KEY`, or pass `codex_options={"api_key": "..."}`.
 - Runtime: `codex_options.base_url` overrides the CLI base URL.
 - Binary resolution: set `codex_options.codex_path_override` (or `CODEX_PATH`) to pin the CLI path. Otherwise the SDK resolves `codex` from `PATH`, then falls back to the bundled vendor binary.
 - Environment: `codex_options.env` fully controls the subprocess environment. When it is provided, the subprocess does not inherit `os.environ`.
 - Stream limits: `codex_options.codex_subprocess_stream_limit_bytes` (or `OPENAI_AGENTS_CODEX_SUBPROCESS_STREAM_LIMIT_BYTES`) controls stdout/stderr reader limits. Valid range is `65536` to `67108864`; default is `8388608`.
-- Inputs: tool calls must include at least one item in `inputs` with `{ "type": "text", "text": ... }` or `{ "type": "local_image", "path": ... }`.
-- Thread defaults: configure `default_thread_options` for `model_reasoning_effort`, `web_search_mode` (preferred over legacy `web_search_enabled`), `approval_policy`, and `additional_directories`.
-- Turn defaults: configure `default_turn_options` for `idle_timeout_seconds` and cancellation `signal`.
-- Safety: pair `sandbox_mode` with `working_directory`; set `skip_git_repo_check=True` outside Git repos.
-- Run-context thread persistence: `use_run_context_thread_id=True` stores and reuses `thread_id` in run context, across runs that share that context. This requires a mutable run context (for example, `dict` or a writable object field).
-- Run-context key defaults: the stored key defaults to `codex_thread_id` for `name="codex"`, or `codex_thread_id_<suffix>` for `name="codex_<suffix>"`. Set `run_context_thread_id_key` to override.
-- Thread ID precedence: per-call `thread_id` input takes priority, then run-context `thread_id` (if enabled), then the configured `thread_id` option.
 - Streaming: `on_stream` receives thread/turn lifecycle events and item events (`reasoning`, `command_execution`, `mcp_tool_call`, `file_change`, `web_search`, `todo_list`, and `error` item updates).
 - Outputs: results include `response`, `usage`, and `thread_id`; usage is added to `RunContextWrapper.usage`.
-- Structure: `output_schema` enforces structured Codex responses when you need typed outputs.
+
+Reference:
+
+- [Codex tool API reference](ref/extensions/experimental/codex/codex_tool.md)
+- [ThreadOptions reference](ref/extensions/experimental/codex/thread_options.md)
+- [TurnOptions reference](ref/extensions/experimental/codex/turn_options.md)
 - See `examples/tools/codex.py` and `examples/tools/codex_same_thread.py` for complete runnable samples.
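The thread-ID precedence and default-key rules described in this section can be expressed as a small plain-Python sketch. `resolve_thread_id` and `default_run_context_key` are hypothetical helper names illustrating the documented rules, not SDK functions:

```python
def default_run_context_key(tool_name: str) -> str:
    # Documented defaults: "codex" -> "codex_thread_id",
    # "codex_<suffix>" -> "codex_thread_id_<suffix>".
    if tool_name == "codex":
        return "codex_thread_id"
    assert tool_name.startswith("codex_"), "custom names must start with codex_"
    return "codex_thread_id_" + tool_name[len("codex_"):]


def resolve_thread_id(per_call_thread_id, run_context, configured_thread_id, *,
                      tool_name="codex", use_run_context_thread_id=False,
                      run_context_thread_id_key=None):
    # Documented precedence: per-call thread_id, then run-context thread ID
    # (only if enabled), then the configured thread_id option.
    if per_call_thread_id is not None:
        return per_call_thread_id
    if use_run_context_thread_id:
        key = run_context_thread_id_key or default_run_context_key(tool_name)
        stored = run_context.get(key)
        if stored is not None:
            return stored
    return configured_thread_id


ctx = {"codex_thread_id": "thread_from_context"}
print(resolve_thread_id(None, ctx, "configured", use_run_context_thread_id=True))
```

Note the run context here is a plain `dict`; as the docs state, run-context persistence requires a mutable context object.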

src/agents/run_internal/oai_conversation.py

Lines changed: 27 additions & 2 deletions
@@ -40,20 +40,39 @@ def _fingerprint_for_tracker(item: Any) -> str | None:
 
 @dataclass
 class OpenAIServerConversationTracker:
-    """Track server-side conversation state for conversation-aware runs."""
+    """Track server-side conversation state for conversation-aware runs.
+
+    This tracker keeps three complementary views of what has already been acknowledged:
+
+    - Object identity for prepared items in the current Python process.
+    - Stable server item IDs and tool call IDs returned by the provider.
+    - Content fingerprints for retry/resume paths where object identity changes.
+
+    The runner uses these sets together to decide which deltas are still safe to send when a
+    run is resumed, retried after a transient failure, or rebuilt from serialized RunState.
+    """
 
     conversation_id: str | None = None
     previous_response_id: str | None = None
     auto_previous_response_id: bool = False
+
+    # In-process object identity for items that have already been delivered or acknowledged.
     sent_items: set[int] = field(default_factory=set)
     server_items: set[int] = field(default_factory=set)
+
+    # Stable provider identifiers returned by the Responses API.
     server_item_ids: set[str] = field(default_factory=set)
     server_tool_call_ids: set[str] = field(default_factory=set)
+
+    # Content-based dedupe for resume/retry paths where objects are reconstructed.
     sent_item_fingerprints: set[str] = field(default_factory=set)
     sent_initial_input: bool = False
     remaining_initial_input: list[TResponseInputItem] | None = None
     primed_from_state: bool = False
     reasoning_item_id_policy: ReasoningItemIdPolicy | None = None
+
+    # Mapping from normalized prepared items back to their original source objects so that
+    # mark_input_as_sent() can mark the right object identities after the model call succeeds.
     prepared_item_sources: dict[int, TResponseInputItem] = field(default_factory=dict)
     prepared_item_sources_by_fingerprint: dict[str, list[TResponseInputItem]] = field(
         default_factory=dict
@@ -75,7 +94,13 @@ def hydrate_from_state(
         model_responses: list[ModelResponse],
         session_items: list[TResponseInputItem] | None = None,
     ) -> None:
-        """Seed tracking from prior state so resumed runs do not replay already-sent content."""
+        """Seed tracking from prior state so resumed runs do not replay already-sent content.
+
+        This reconstructs the tracker from the original input, saved model responses, generated
+        run items, and optional session history. After hydration, retry logic can treat rebuilt
+        items as already acknowledged even though their Python object identities may differ from
+        the original run.
+        """
         if self.sent_initial_input:
             return
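The three complementary views the docstring describes can be illustrated with a hedged sketch. `already_sent` is a hypothetical helper (not a tracker method), and `toy_fingerprint` stands in for the module's `_fingerprint_for_tracker`; the real tracker keeps these sets on the dataclass fields listed above:

```python
import json


def already_sent(item, *, sent_items, server_item_ids, sent_item_fingerprints, fingerprint):
    # View 1: object identity -- this exact Python object was already delivered.
    if id(item) in sent_items:
        return True
    # View 2: the provider reported a stable server item ID for it.
    item_id = item.get("id")
    if item_id is not None and item_id in server_item_ids:
        return True
    # View 3: content fingerprint, for retry/resume paths where the object was rebuilt.
    fp = fingerprint(item)
    return fp is not None and fp in sent_item_fingerprints


def toy_fingerprint(item):
    # Stand-in for _fingerprint_for_tracker: a stable serialization of content.
    return json.dumps(item, sort_keys=True)


msg = {"role": "user", "content": "hello"}
rebuilt = {"role": "user", "content": "hello"}  # same content, different object
fps = {toy_fingerprint(msg)}
print(already_sent(rebuilt, sent_items=set(), server_item_ids=set(),
                   sent_item_fingerprints=fps, fingerprint=toy_fingerprint))
```

The fingerprint view is what keeps a resumed run from replaying `rebuilt` even though `id(rebuilt) != id(msg)`.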

src/agents/run_internal/session_persistence.py

Lines changed: 18 additions & 3 deletions
@@ -59,9 +59,19 @@ async def prepare_input_with_session(
     include_history_in_prepared_input: bool = True,
     preserve_dropped_new_items: bool = False,
 ) -> tuple[str | list[TResponseInputItem], list[TResponseInputItem]]:
-    """
-    Prepare input by combining it with session history and applying the optional input callback.
-    Returns the prepared input plus the appended items that should be persisted separately.
+    """Prepare model input from session history plus the new turn input.
+
+    Returns a tuple of:
+
+    1. The prepared input that should be sent to the model after normalization and dedupe.
+    2. The subset of items that should be appended to the session store for this turn.
+
+    The second value is intentionally not "everything returned by the callback". When a
+    ``session_input_callback`` reorders or filters history, we still need to persist only the
+    items that belong to the new turn. This function therefore compares the callback output
+    against deep-copied history and new-input lists, first by object identity and then by
+    content frequency, so retries and custom merge strategies do not accidentally re-persist
+    old history as fresh input.
     """
 
     if session is None:
@@ -102,6 +112,9 @@ async def prepare_input_with_session(
     if not isinstance(combined, list):
         raise UserError("Session input callback must return a list of input items.")
 
+    # The callback may reorder, drop, or duplicate items. Keep separate reference maps for
+    # the copied history and copied new-input lists so we can reconstruct which output items
+    # belong to the new turn and therefore still need to be persisted.
     history_refs = _build_reference_map(history_for_callback)
     new_refs = _build_reference_map(new_items_for_callback)
     history_counts = _build_frequency_map(history_for_callback)
@@ -135,6 +148,8 @@ async def prepare_input_with_session(
     else:
         prepared_items_raw = new_items_for_callback if preserve_dropped_new_items else []
 
+    # Normalize exactly as the runtime does elsewhere so the prepared model input and the
+    # persisted session items are derived from the same item shape and dedupe rules.
     prepared_as_inputs = [ensure_input_item_format(item) for item in prepared_items_raw]
     filtered = drop_orphan_function_calls(prepared_as_inputs)
     normalized = normalize_input_items_for_api(filtered)
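The "identity first, then content frequency" comparison described in the new docstring can be sketched in plain Python. `items_to_persist` is a hypothetical helper, and `repr` here stands in for the module's `_build_reference_map` / `_build_frequency_map` machinery; it only illustrates why reordered or copied history is not re-persisted:

```python
from collections import Counter


def items_to_persist(callback_output, history):
    """Return the callback output items that belong to the new turn (illustrative sketch)."""
    # Pass 1 key: object identity -- exact history objects are old content.
    history_ids = {id(item) for item in history}
    # Pass 2 key: content frequency -- equal-content copies of history items
    # (e.g. rebuilt after a retry) are discounted up to their count in history.
    history_budget = Counter(repr(item) for item in history)

    persist = []
    for item in callback_output:
        if id(item) in history_ids:
            continue  # the exact history object: old content
        key = repr(item)
        if history_budget[key] > 0:
            history_budget[key] -= 1  # a content-equal copy of history
            continue
        persist.append(item)  # genuinely new for this turn
    return persist


h1 = {"role": "user", "content": "hi"}
n1 = {"role": "user", "content": "what's new?"}
print(items_to_persist([h1, n1], [h1]))
```

Without the frequency pass, a callback that deep-copied history (so object identities change) would cause old messages to be appended to the session again on every retry.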

src/agents/run_state.py

Lines changed: 32 additions & 3 deletions
@@ -117,7 +117,19 @@
 
 @dataclass
 class RunState(Generic[TContext, TAgent]):
-    """Serializable snapshot of an agent run, including context, usage, and interruptions."""
+    """Serializable snapshot of an agent run, including context, usage, and interruptions.
+
+    ``RunState`` is the durable pause/resume boundary for human-in-the-loop flows. It stores
+    enough information to continue an interrupted run, including model responses, generated
+    items, approval state, and optional server-managed conversation identifiers.
+
+    Context serialization is intentionally conservative:
+
+    - Mapping contexts round-trip directly.
+    - Custom contexts may require a serializer and deserializer.
+    - When no safe serializer is available, the snapshot is still written but emits warnings and
+      records metadata describing what is required to rebuild the original context type.
+    """
 
     _current_turn: int = 0
     """Current turn number in the conversation."""
@@ -297,7 +309,13 @@ def _serialize_context_payload(
         context_serializer: ContextSerializer | None = None,
         strict_context: bool = False,
     ) -> tuple[dict[str, Any] | None, dict[str, Any]]:
-        """Validate and serialize the stored run context."""
+        """Validate and serialize the stored run context.
+
+        The returned metadata captures how the context was serialized so restore-time code can
+        decide whether a deserializer or override is required. This lets RunState remain durable
+        for simple mapping contexts without silently pretending that richer custom objects can be
+        reconstructed automatically.
+        """
         if self._context is None:
             return None, _build_context_meta(
                 None,
@@ -1906,7 +1924,18 @@ async def _build_run_state_from_json(
     context_deserializer: ContextDeserializer | None = None,
     strict_context: bool = False,
 ) -> RunState[Any, Agent[Any]]:
-    """Shared helper to rebuild RunState from JSON payload."""
+    """Shared helper to rebuild RunState from JSON payload.
+
+    Context restoration follows this precedence order:
+
+    1. ``context_override`` when supplied.
+    2. ``context_deserializer`` applied to serialized mapping data.
+    3. Direct mapping restore for contexts that were serialized as plain mappings.
+
+    When the snapshot metadata indicates that the original context type could not round-trip
+    safely, this function warns or raises (in ``strict_context`` mode) rather than silently
+    claiming that the rebuilt mapping is equivalent to the original object.
+    """
     schema_version = state_json.get("$schemaVersion")
     if not schema_version:
         raise UserError("Run state is missing schema version")
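The restoration precedence documented in that docstring can be sketched as a plain function. `restore_context` is a hypothetical helper illustrating the documented order, not the SDK's implementation:

```python
def restore_context(serialized, *, context_override=None, context_deserializer=None):
    # Documented precedence: explicit override, then custom deserializer,
    # then direct restore for contexts serialized as plain mappings.
    if context_override is not None:
        return context_override
    if context_deserializer is not None:
        return context_deserializer(serialized)
    if isinstance(serialized, dict):
        return dict(serialized)  # mapping contexts round-trip directly
    raise ValueError("cannot restore context without an override or deserializer")


print(restore_context({"user_id": 42}))
```

A richer custom context type would hit the first two branches; the mapping branch is what keeps simple `dict` contexts durable with no extra configuration.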
