**docs/agents.md** (2 additions, 0 deletions)
Supplying a list of tools doesn't always mean the LLM will use a tool. You can force tool use with `tool_choice`:

3. `none`, which requires the LLM to _not_ use a tool.
4. Setting a specific string, e.g. `my_tool`, which requires the LLM to use that specific tool.

When you are using OpenAI Responses tool search, named tool choices are more limited: you cannot target bare namespace names or deferred-only tools with `tool_choice`, and `tool_choice="tool_search"` does not target [`ToolSearchTool`][agents.tool.ToolSearchTool]. In those cases, prefer `auto` or `required`. See [Hosted tool search](tools.md#hosted-tool-search) for the Responses-specific constraints.
```python
from agents import Agent, Runner, function_tool, ModelSettings
```
**docs/context.md** (2 additions, 0 deletions)
`ToolContext` carries everything on `RunContextWrapper`, plus additional fields specific to the current tool call:

- `tool_name` – the name of the tool being invoked
- `tool_call_id` – a unique identifier for this tool call
- `tool_arguments` – the raw argument string passed to the tool
- `tool_namespace` – the Responses namespace for the tool call, when the tool was loaded through `tool_namespace()` or another namespaced surface
- `qualified_tool_name` – the tool name qualified with the namespace when one is available

Use `ToolContext` when you need tool-level metadata during execution. For general context sharing between agents and tools, `RunContextWrapper` remains sufficient. Because `ToolContext` extends `RunContextWrapper`, it can also expose `.tool_input` when a nested `Agent.as_tool()` run supplied structured input.
**docs/mcp.md** (2 additions, 0 deletions)
The hosted server exposes its tools automatically; you do not add it to `mcp_servers`.

If you want hosted tool search to load a hosted MCP server lazily, set `tool_config["defer_loading"] = True` and add [`ToolSearchTool`][agents.tool.ToolSearchTool] to the agent. This is supported only on OpenAI Responses models. See [Tools](tools.md#hosted-tool-search) for the complete tool-search setup and constraints.
### Streaming hosted MCP results
Hosted tools support streaming results in exactly the same way as function tools. Use `Runner.run_streamed` to receive incremental events as they arrive.
**docs/models/index.md** (10 additions, 0 deletions)
For lower latency, using `reasoning.effort="none"` with `gpt-5.4` is recommended.

If you pass a non–GPT-5 model name without custom `model_settings`, the SDK reverts to generic `ModelSettings` compatible with any model.

### Responses-only tool search features

The following tool features are supported only with OpenAI Responses models:

- [`ToolSearchTool`][agents.tool.ToolSearchTool]
- [`tool_namespace()`][agents.tool.tool_namespace]
- `@function_tool(defer_loading=True)` and other deferred-loading Responses tool surfaces

These features are rejected on Chat Completions models and on non-Responses backends. When you use deferred-loading tools, add `ToolSearchTool()` to the agent and let the model load tools through `auto` or `required` tool choice instead of forcing bare namespace names or deferred-only function names. See [Tools](../tools.md#hosted-tool-search) for the setup details and current constraints.
### Responses WebSocket transport

By default, OpenAI Responses API requests use HTTP transport. You can opt in to websocket transport when using OpenAI-backed models.
**docs/results.md** (3 additions, 0 deletions)
Unlike the JavaScript SDK, Python does not expose a separate `output` property for these run items:

- [`MessageOutputItem`][agents.items.MessageOutputItem] for assistant messages
- [`ReasoningItem`][agents.items.ReasoningItem] for reasoning items
- [`ToolSearchCallItem`][agents.items.ToolSearchCallItem] and [`ToolSearchOutputItem`][agents.items.ToolSearchOutputItem] for Responses tool search requests and loaded tool-search results
- [`ToolCallItem`][agents.items.ToolCallItem] and [`ToolCallOutputItem`][agents.items.ToolCallOutputItem] for tool calls and their results
- [`ToolApprovalItem`][agents.items.ToolApprovalItem] for tool calls that paused for approval
- [`HandoffCallItem`][agents.items.HandoffCallItem] and [`HandoffOutputItem`][agents.items.HandoffOutputItem] for handoff requests and completed transfers

Choose `new_items` over `to_input_list()` whenever you need agent associations, tool outputs, handoff boundaries, or approval boundaries.

When you use hosted tool search, inspect `ToolSearchCallItem.raw_item` to see the search request the model emitted, and `ToolSearchOutputItem.raw_item` to see which namespaces, functions, or hosted MCP servers were loaded for that turn.
**docs/streaming.md** (4 additions, 0 deletions)
- `handoff_requested`
- `handoff_occured`
- `tool_called`
- `tool_search_called`
- `tool_search_output_created`
- `tool_output`
- `reasoning_item_created`
- `mcp_approval_requested`

`handoff_occured` is intentionally misspelled for backward compatibility.

When you use hosted tool search, `tool_search_called` is emitted when the model issues a tool-search request and `tool_search_output_created` is emitted when the Responses API returns the loaded subset.

For example, this will ignore raw events and stream updates to the user.
**docs/tools.md**

| Defer large tool surfaces until runtime with tool search | [Hosted tool search](#hosted-tool-search) |
| Run tools in your own process or environment | [Local runtime tools](#local-runtime-tools) |
| Wrap Python functions as tools | [Function tools](#function-tools) |
| Let one agent call another without a handoff | [Agents as tools](#agents-as-tools) |
OpenAI offers a few built-in tools when using the [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel]:

- The [`CodeInterpreterTool`][agents.tool.CodeInterpreterTool] lets the LLM execute code in a sandboxed environment.
- The [`HostedMCPTool`][agents.tool.HostedMCPTool] exposes a remote MCP server's tools to the model.
- The [`ImageGenerationTool`][agents.tool.ImageGenerationTool] generates images from a prompt.
- The [`ToolSearchTool`][agents.tool.ToolSearchTool] lets the model load deferred tools, namespaces, or hosted MCP servers on demand.

Advanced hosted search options:
```python
async def main():
    ...
    print(result.final_output)
```
### Hosted tool search

Tool search lets OpenAI Responses models defer large tool surfaces until runtime, so the model loads only the subset it needs for the current turn. This is useful when you have many function tools, namespace groups, or hosted MCP servers and want to reduce tool-schema tokens without exposing every tool up front.

Start with hosted tool search when the candidate tools are already known when you build the agent. If your application needs to decide what to load dynamically, the Responses API also supports client-executed tool search, but the standard `Runner` does not auto-execute that mode.
```python
import asyncio
from typing import Annotated

from agents import Agent, Runner, ToolSearchTool, function_tool, tool_namespace


@function_tool(defer_loading=True)
def get_customer_profile(
    customer_id: Annotated[str, "The customer ID to look up."],
) -> str:
    """Fetch a CRM customer profile."""
    return f"profile for {customer_id}"


@function_tool(defer_loading=True)
def list_open_orders(
    customer_id: Annotated[str, "The customer ID to look up."],
) -> str:
    """List open orders for a customer."""
    return f"open orders for {customer_id}"


crm_tools = tool_namespace(
    name="crm",
    description="CRM tools for customer lookups.",
    tools=[get_customer_profile, list_open_orders],
)

agent = Agent(
    name="Operations assistant",
    model="gpt-5.4",
    instructions="Load the crm namespace before using CRM tools.",
    tools=[*crm_tools, ToolSearchTool()],
)


async def main() -> None:
    result = await Runner.run(agent, "Look up customer_42 and list their open orders.")
    print(result.final_output)


asyncio.run(main())
```
What to know:

- Hosted tool search is available only with OpenAI Responses models. The current Python SDK support depends on `openai>=2.25.0`.
- Add exactly one `ToolSearchTool()` when you configure deferred-loading surfaces on an agent.
- Searchable surfaces include `@function_tool(defer_loading=True)`, `tool_namespace(name=..., description=..., tools=[...])`, and `HostedMCPTool(tool_config={..., "defer_loading": True})`.
- Deferred-loading function tools must be paired with `ToolSearchTool()`. Namespace-only setups may also use `ToolSearchTool()` to let the model load the right group on demand.
- `tool_namespace()` groups `FunctionTool` instances under a shared namespace name and description. This is usually the best fit when you have many related tools, such as `crm`, `billing`, or `shipping`.
- OpenAI's official best-practice guidance is [Use namespaces where possible](https://developers.openai.com/api/docs/guides/tools-tool-search#use-namespaces-where-possible).
- Prefer namespaces or hosted MCP servers over many individually deferred functions when possible. They usually give the model a better high-level search surface and better token savings.
- Namespaces can mix immediate and deferred tools. Tools without `defer_loading=True` remain callable immediately, while deferred tools in the same namespace are loaded through tool search.
- As a rule of thumb, keep each namespace fairly small, ideally fewer than 10 functions.
- Named `tool_choice` cannot target bare namespace names or deferred-only tools. Prefer `auto`, `required`, or a real top-level callable tool name.
- `ToolSearchTool(execution="client")` is for manual Responses orchestration. If the model emits a client-executed `tool_search_call`, the standard `Runner` raises instead of executing it for you.
- Tool search activity appears in [`RunResult.new_items`](results.md#new-items) and in [`RunItemStreamEvent`](streaming.md#run-item-event-names) with dedicated item and event types.
- See `examples/tools/tool_search.py` for complete runnable examples covering both namespaced loading and top-level deferred tools.
- Official platform guide: [Tool search](https://developers.openai.com/api/docs/guides/tools-tool-search).
### Hosted container shell + skills

`ShellTool` also supports OpenAI-hosted container execution. Use this mode when you want the model to run shell commands in a managed container instead of your local runtime.
You can use any Python function as a tool. The Agents SDK will set up the tool automatically.

We use Python's `inspect` module to extract the function signature, along with [`griffe`](https://mkdocstrings.github.io/griffe/) to parse docstrings and `pydantic` for schema creation.

When you are using OpenAI Responses models, `@function_tool(defer_loading=True)` hides a function tool until `ToolSearchTool()` loads it. You can also group related function tools with [`tool_namespace()`][agents.tool.tool_namespace]. See [Hosted tool search](#hosted-tool-search) for the full setup and constraints.
0 commit comments