Commit 9ba8787

docs: cover GA computer tool migration across Python guides (#2629)
1 parent 1215783 commit 9ba8787

4 files changed: +33 -0 lines changed


docs/models/index.md

Lines changed: 10 additions & 0 deletions
@@ -69,6 +69,16 @@ my_agent = Agent(
For lower latency, using `reasoning.effort="none"` with `gpt-5.4` is recommended. The gpt-4.1 family (including mini and nano variants) also remains a solid choice for building interactive agent apps.
#### ComputerTool model selection
If an agent includes [`ComputerTool`][agents.tool.ComputerTool], the effective model on the actual Responses request determines which computer-tool payload the SDK sends. Explicit `gpt-5.4` requests use the GA built-in `computer` tool, while explicit `computer-use-preview` requests keep the older `computer_use_preview` payload.
Prompt-managed calls are the main exception. If a prompt template owns the model and the SDK omits `model` from the request, the SDK defaults to the preview-compatible computer payload so it does not guess which model the prompt pins. To keep the GA path in that flow, either make `model="gpt-5.4"` explicit on the request or force the GA selector with `ModelSettings(tool_choice="computer")` or `ModelSettings(tool_choice="computer_use")`.
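The selection rule in the two paragraphs above can be sketched as a small pure function. This is a stdlib-only illustration of the documented behaviour, not the SDK's own code: `select_computer_payload`, its arguments, and the default display values are hypothetical, and models the text does not mention are rejected rather than guessed.

```python
from typing import Any, Optional

# Selector strings that, per the text, force the GA path on prompt-managed calls.
GA_SELECTORS = {"computer", "computer_use"}


def select_computer_payload(
    model: Optional[str],
    tool_choice: Optional[str] = None,
    *,
    environment: str = "browser",  # placeholder value
    display_width: int = 1024,     # placeholder value
    display_height: int = 768,     # placeholder value
) -> dict[str, Any]:
    """Return the computer-tool payload for a request (hypothetical helper)."""
    ga = {"type": "computer"}
    preview = {
        "type": "computer_use_preview",
        "environment": environment,
        "display_width": display_width,
        "display_height": display_height,
    }

    if model is None:
        # Prompt-managed call: the prompt owns the model, so the default is the
        # preview-compatible payload unless the GA selector is forced.
        return ga if tool_choice in GA_SELECTORS else preview
    if model == "gpt-5.4":
        return ga
    if model == "computer-use-preview":
        return preview
    # The text only describes the two models above; do not guess for others.
    raise ValueError(f"behaviour for model {model!r} is not covered by this sketch")


print(select_computer_payload("gpt-5.4"))
print(select_computer_payload(None))
print(select_computer_payload(None, tool_choice="computer"))
```

Keeping the rule in one function makes the prompt-managed exception easy to see: only the `model is None` branch looks at `tool_choice`.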
With a registered [`ComputerTool`][agents.tool.ComputerTool], `tool_choice="computer"`, `"computer_use"`, and `"computer_use_preview"` are normalized to the built-in selector that matches the effective request model. If no `ComputerTool` is registered, those strings continue to behave like ordinary function names.
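A minimal sketch of that normalization, assuming the decision about GA versus preview has already been made elsewhere. `normalize_tool_choice`, `has_computer_tool`, and `use_ga` are hypothetical names, and the returned dictionaries only illustrate the idea of a built-in selector versus a function selector; they are not taken from the SDK source.

```python
from typing import Any

# The three spellings the text says are accepted when a ComputerTool is registered.
COMPUTER_ALIASES = {"computer", "computer_use", "computer_use_preview"}


def normalize_tool_choice(
    tool_choice: str, *, has_computer_tool: bool, use_ga: bool
) -> dict[str, Any]:
    """Map a tool_choice string to a selector (hypothetical helper)."""
    if has_computer_tool and tool_choice in COMPUTER_ALIASES:
        # All aliases collapse to the selector that matches the effective model.
        return {"type": "computer" if use_ga else "computer_use_preview"}
    # Without a ComputerTool the string is just an ordinary function name.
    return {"type": "function", "name": tool_choice}


print(normalize_tool_choice("computer_use_preview", has_computer_tool=True, use_ga=True))
print(normalize_tool_choice("computer", has_computer_tool=False, use_ga=True))
```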
Preview-compatible requests must serialize `environment` and display dimensions up front, so prompt-managed flows that use a [`ComputerProvider`][agents.tool.ComputerProvider] factory should either pass a concrete `Computer` or `AsyncComputer` instance or force the GA selector before sending the request. See [Tools](../tools.md#computertool-and-the-responses-computer-tool) for the full migration details.
#### Non-GPT-5 models
If you pass a non–GPT-5 model name without custom `model_settings`, the SDK reverts to generic `ModelSettings` compatible with any model.

docs/results.md

Lines changed: 2 additions & 0 deletions
@@ -57,6 +57,8 @@ In practice:
Unlike the JavaScript SDK, Python does not expose a separate `output` property for the model-shaped delta only. Use `new_items` when you need SDK metadata, or inspect `raw_responses` when you need the raw model payloads.
Computer-tool replay follows the raw Responses payload shape. Preview-model `computer_call` items preserve a single `action`, while `gpt-5.4` computer calls can preserve batched `actions[]`. [`to_input_list()`][agents.result.RunResultBase.to_input_list] and [`RunState`][agents.run_state.RunState] keep whichever shape the model produced, so manual replay, pause/resume flows, and stored transcripts continue to work across both preview and GA computer-tool calls. Local execution results still appear as `computer_call_output` items in `new_items`.
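Code that reads a stored transcript may therefore need to handle both shapes. The sketch below assumes the replayed items are plain dictionaries with the field names mentioned above (`type`, `action`, `actions`); `actions_of` and the two sample items are hypothetical and only illustrate the idea.

```python
from typing import Any


def actions_of(item: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the actions of a computer_call item in either shape (hypothetical helper)."""
    if item.get("type") != "computer_call":
        return []
    if item.get("actions"):
        # GA shape: batched actions[].
        return list(item["actions"])
    if item.get("action") is not None:
        # Preview shape: exactly one action.
        return [item["action"]]
    return []


# Sample items for illustration only.
preview_item = {"type": "computer_call", "call_id": "c1", "action": {"type": "screenshot"}}
ga_item = {
    "type": "computer_call",
    "call_id": "c2",
    "actions": [
        {"type": "click", "x": 10, "y": 20, "button": "left"},
        {"type": "type", "text": "hello"},
    ],
}

for item in (preview_item, ga_item):
    print(item["call_id"], [action["type"] for action in actions_of(item)])
```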
### New items
[`new_items`][agents.result.RunResultBase.new_items] gives you the richest view of what happened during the run. Common item types are:

docs/streaming.md

Lines changed: 2 additions & 0 deletions
@@ -10,6 +10,8 @@ Keep consuming `result.stream_events()` until the async iterator finishes. A str
[`RawResponsesStreamEvent`][agents.stream_events.RawResponsesStreamEvent] events are raw events passed directly from the LLM. They use the OpenAI Responses API format, so each event has a type (such as `response.created` or `response.output_text.delta`) and data. These events are useful if you want to stream response messages to the user as soon as they are generated.
Computer-tool raw events keep the same preview-vs-GA distinction as stored results. Preview flows stream `computer_call` items with one `action`, while `gpt-5.4` can stream `computer_call` items with batched `actions[]`. The higher-level [`RunItemStreamEvent`][agents.stream_events.RunItemStreamEvent] surface does not add a special computer-only event name for this: both shapes still surface as `tool_called`, and the screenshot result comes back as `tool_output` wrapping a `computer_call_output` item.
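A consumer of the higher-level events could tell the two shapes apart by looking at the wrapped item rather than the event name. The sketch below is duck-typed and uses `SimpleNamespace` stand-ins instead of real SDK objects; the `type == "run_item_stream_event"` check and the `item.raw_item` attribute are assumptions about the event objects, and `describe_computer_event` is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Any, Optional


def _field(obj: Any, key: str) -> Any:
    """Read a field from either a dict or an attribute-style object."""
    if isinstance(obj, dict):
        return obj.get(key)
    return getattr(obj, key, None)


def describe_computer_event(event: Any) -> Optional[str]:
    """Summarize computer-related run item events; return None for anything else."""
    if getattr(event, "type", None) != "run_item_stream_event":
        return None
    raw = getattr(event.item, "raw_item", None)
    raw_type = _field(raw, "type")
    if event.name == "tool_called" and raw_type == "computer_call":
        batched = _field(raw, "actions")
        single = _field(raw, "action")
        count = len(batched) if batched else (1 if single is not None else 0)
        return f"computer call with {count} action(s)"
    if event.name == "tool_output" and raw_type == "computer_call_output":
        return "screenshot returned"
    return None


# Stand-in events for illustration only.
ga_event = SimpleNamespace(
    type="run_item_stream_event",
    name="tool_called",
    item=SimpleNamespace(
        raw_item={
            "type": "computer_call",
            "actions": [{"type": "click", "x": 1, "y": 2, "button": "left"}, {"type": "wait"}],
        }
    ),
)
output_event = SimpleNamespace(
    type="run_item_stream_event",
    name="tool_output",
    item=SimpleNamespace(raw_item={"type": "computer_call_output"}),
)

print(describe_computer_event(ga_event))
print(describe_computer_event(output_event))
```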
For example, this will output the text generated by the LLM token-by-token.

```python
# ... (the rest of this example lies outside the diff hunk)
```

docs/tools.md

Lines changed: 19 additions & 0 deletions
@@ -182,6 +182,25 @@ Local runtime tools require you to supply implementations:
- [`ApplyPatchTool`][agents.tool.ApplyPatchTool]: implement [`ApplyPatchEditor`][agents.editor.ApplyPatchEditor] to apply diffs locally.
- Local shell skills are available with `ShellTool(environment={"type": "local", "skills": [...]})`.
### ComputerTool and the Responses computer tool
`ComputerTool` is still a local harness: you provide a [`Computer`][agents.computer.Computer] or [`AsyncComputer`][agents.computer.AsyncComputer] implementation, and the SDK maps that harness onto the OpenAI Responses API computer surface.
For explicit [`gpt-5.4`](https://developers.openai.com/api/docs/models/gpt-5.4) requests, the SDK sends the GA built-in tool payload `{"type": "computer"}`. The older `computer-use-preview` model keeps the preview payload `{"type": "computer_use_preview", "environment": ..., "display_width": ..., "display_height": ...}`. This mirrors the platform migration described in OpenAI's [Computer use guide](https://developers.openai.com/api/docs/guides/tools-computer-use/):
- Model: `computer-use-preview` -> `gpt-5.4`
- Tool selector: `computer_use_preview` -> `computer`
- Computer call shape: one `action` per `computer_call` -> batched `actions[]` on `computer_call`
- Truncation: `ModelSettings(truncation="auto")` required on the preview path -> not required on the GA path
The SDK chooses that wire shape from the effective model on the actual Responses request. If you use a prompt template and the request omits `model` because the prompt owns it, the SDK keeps the preview-compatible computer payload unless you either keep `model="gpt-5.4"` explicit or force the GA selector with `ModelSettings(tool_choice="computer")` or `ModelSettings(tool_choice="computer_use")`.
When a [`ComputerTool`][agents.tool.ComputerTool] is present, `tool_choice="computer"`, `"computer_use"`, and `"computer_use_preview"` are all accepted and normalized to the built-in selector that matches the effective request model. Without a `ComputerTool`, those strings still behave like ordinary function names.
This distinction matters when `ComputerTool` is backed by a [`ComputerProvider`][agents.tool.ComputerProvider] factory. The GA `computer` payload does not need `environment` or dimensions at serialization time, so unresolved factories are fine. Preview-compatible serialization still needs a resolved `Computer` or `AsyncComputer` instance so the SDK can send `environment`, `display_width`, and `display_height`.
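The difference can be illustrated with a stdlib-only sketch. `FakeComputer` and `serialize_computer_tool` are hypothetical stand-ins rather than SDK classes, and the `environment` and `dimensions` attribute names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Union


@dataclass
class FakeComputer:
    """Hypothetical stand-in for a resolved computer harness."""
    environment: str = "browser"
    dimensions: tuple[int, int] = (1024, 768)


# Either a resolved instance or a factory that creates one later.
ComputerOrFactory = Union[FakeComputer, Callable[[], FakeComputer]]


def serialize_computer_tool(computer: ComputerOrFactory, *, use_ga: bool) -> dict[str, Any]:
    """Build the tool payload; only the preview shape needs a resolved computer."""
    if use_ga:
        # GA payload carries no environment or dimensions, so a factory is fine.
        return {"type": "computer"}
    if not isinstance(computer, FakeComputer):
        raise ValueError("preview serialization needs a resolved computer instance")
    width, height = computer.dimensions
    return {
        "type": "computer_use_preview",
        "environment": computer.environment,
        "display_width": width,
        "display_height": height,
    }


def make_computer() -> FakeComputer:
    return FakeComputer()


print(serialize_computer_tool(make_computer, use_ga=True))
print(serialize_computer_tool(FakeComputer(), use_ga=False))
```

Passing `make_computer` with `use_ga=False` raises `ValueError` in this sketch, which mirrors why prompt-managed flows need either a concrete instance or the forced GA selector.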
At runtime, both paths still use the same local harness. Preview responses emit `computer_call` items with a single `action`; `gpt-5.4` can emit batched `actions[]`, and the SDK executes them in order before producing a `computer_call_output` screenshot item. See `examples/tools/computer_use.py` for a runnable Playwright-based harness.
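The in-order execution described above can be sketched without the SDK. `RecordingComputer` and `run_computer_call` are hypothetical, only three action types are handled, and the output dictionary is an assumed shape for a `computer_call_output` screenshot item with a placeholder image string.

```python
from typing import Any


class RecordingComputer:
    """Hypothetical local harness that records the calls it receives."""

    def __init__(self) -> None:
        self.log: list[str] = []

    def click(self, x: int, y: int, button: str) -> None:
        self.log.append(f"click({x},{y},{button})")

    def type(self, text: str) -> None:
        self.log.append(f"type({text})")

    def screenshot(self) -> str:
        self.log.append("screenshot")
        return "BASE64_PNG_PLACEHOLDER"


def run_computer_call(call: dict[str, Any], computer: RecordingComputer) -> dict[str, Any]:
    """Execute one computer_call in either shape, then return a screenshot item."""
    # GA calls carry batched actions[]; preview calls carry a single action.
    actions = call.get("actions") or [call["action"]]
    for action in actions:
        kind = action["type"]
        if kind == "click":
            computer.click(action["x"], action["y"], action["button"])
        elif kind == "type":
            computer.type(action["text"])
        elif kind == "screenshot":
            pass  # the screenshot is captured once, after all actions
        else:
            raise NotImplementedError(f"action {kind!r} is not handled in this sketch")
    image = computer.screenshot()
    return {
        "type": "computer_call_output",
        "call_id": call["call_id"],
        "output": {"type": "computer_screenshot", "image_url": f"data:image/png;base64,{image}"},
    }


harness = RecordingComputer()
batched_call = {
    "type": "computer_call",
    "call_id": "call_1",
    "actions": [
        {"type": "click", "x": 10, "y": 20, "button": "left"},
        {"type": "type", "text": "hello"},
    ],
}
result = run_computer_call(batched_call, harness)
print(harness.log)
print(result["type"], result["call_id"])
```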

```python
from agents import Agent, ApplyPatchTool, ShellTool
from agents.computer import AsyncComputer
# ... (the rest of this example lies outside the diff hunk)
```
