Commit ac8fc9d

docs: add responses websocket support (#2533)

1 parent e550663
7 files changed
Lines changed: 124 additions & 0 deletions

docs/examples.md

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ Check out a variety of sample implementations of the SDK in the examples section
 - Agent lifecycle management
 - Dynamic system prompts
 - Streaming outputs (text, items, function call args)
+- Responses websocket transport with a shared session helper across turns (`examples/basic/stream_ws.py`)
 - Prompt templates
 - File handling (local and remote, images and PDFs)
 - Usage tracking

docs/models/index.md

Lines changed: 39 additions & 0 deletions
@@ -61,6 +61,45 @@ For lower latency, using `reasoning.effort="none"` with `gpt-5.2` is recommended
 
 If you pass a non–GPT-5 model name without custom `model_settings`, the SDK reverts to generic `ModelSettings` compatible with any model.
 
+### Responses WebSocket transport
+
+By default, OpenAI Responses API requests use HTTP transport. You can opt in to websocket transport when using OpenAI-backed models.
+
+```python
+from agents import set_default_openai_responses_transport
+
+set_default_openai_responses_transport("websocket")
+```
+
+This affects OpenAI Responses models resolved by the default OpenAI provider (including string model names such as `"gpt-5.2"`).
+
+You can also configure websocket transport per provider or per run:
+
+```python
+from agents import Agent, OpenAIProvider, RunConfig, Runner
+
+provider = OpenAIProvider(
+    use_responses_websocket=True,
+    # Optional; if omitted, OPENAI_WEBSOCKET_BASE_URL is used when set.
+    websocket_base_url="wss://your-proxy.example/v1",
+)
+
+agent = Agent(name="Assistant")
+result = await Runner.run(
+    agent,
+    "Hello",
+    run_config=RunConfig(model_provider=provider),
+)
+```
+
+If you need prefix-based model routing (for example mixing `openai/...` and `litellm/...` model names in one run), use [`MultiProvider`][agents.MultiProvider] and set `openai_use_responses_websocket=True` there instead.
+
+Notes:
+
+- This is the Responses API over websocket transport, not the [Realtime API](../realtime/guide.md).
+- Install the `websockets` package if it is not already available in your environment.
+- You can use [`Runner.run_streamed()`][agents.run.Runner.run_streamed] directly after enabling websocket transport. For multi-turn workflows where you want to reuse the same websocket connection across turns (and nested agent-as-tool calls), the [`responses_websocket_session()`][agents.responses_websocket_session] helper is recommended. See the [Running agents](../running_agents.md) guide and [`examples/basic/stream_ws.py`](https://github.com/openai/openai-agents-python/tree/main/examples/basic/stream_ws.py).
+
 ## Non-OpenAI models
 
 You can use most other non-OpenAI models via the [LiteLLM integration](./litellm.md). First, install the litellm dependency group:
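The `MultiProvider` routing mentioned in the section above might be wired up as follows. This is an illustrative sketch rather than verbatim SDK documentation: `MultiProvider` and `openai_use_responses_websocket=True` come from the prose in this diff, the model name and prompt are placeholders, and actually running it requires an `OPENAI_API_KEY` in the environment.

```python
import asyncio

from agents import Agent, MultiProvider, RunConfig, Runner


async def main():
    # Prefix-routing provider: "openai/..." model names resolve to OpenAI
    # models (over websocket transport here, per the opt-in flag), while
    # other prefixes such as "litellm/..." keep their own providers.
    provider = MultiProvider(openai_use_responses_websocket=True)

    agent = Agent(name="Assistant", model="openai/gpt-5.2")
    result = await Runner.run(
        agent,
        "Hello",
        run_config=RunConfig(model_provider=provider),
    )
    print(result.final_output)


asyncio.run(main())
```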

docs/ref/index.md

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
 - set_default_openai_key
 - set_default_openai_client
 - set_default_openai_api
+- set_default_openai_responses_transport
+- ResponsesWebSocketSession
+- responses_websocket_session
 - set_tracing_export_api_key
 - set_tracing_disabled
 - set_trace_processors

Lines changed: 3 additions & 0 deletions

@@ -0,0 +1,3 @@
+# `Responses WebSocket Session`
+
+::: agents.responses_websocket_session

docs/release.md

Lines changed: 10 additions & 0 deletions
@@ -19,6 +19,16 @@ We will increment `Z` for non-breaking changes:
 
 ## Breaking change changelog
 
+### 0.10.0
+
+This minor release does **not** introduce a breaking change, but it includes a significant new feature area for OpenAI Responses users: websocket transport support for the Responses API.
+
+Highlights:
+
+- Added websocket transport support for OpenAI Responses models (opt-in; HTTP remains the default transport).
+- Added a `responses_websocket_session()` helper / `ResponsesWebSocketSession` for reusing a shared websocket-capable provider and `RunConfig` across multi-turn runs.
+- Added a new websocket streaming example (`examples/basic/stream_ws.py`) covering streaming, tools, approvals, and follow-up turns.
+
 ### 0.9.0
 
 In this version, Python 3.9 is no longer supported, as this major version reached EOL three months ago. Please upgrade to a newer runtime version.

docs/running_agents.md

Lines changed: 65 additions & 0 deletions
@@ -42,6 +42,71 @@ The runner then runs a loop:
 
 Streaming allows you to additionally receive streaming events as the LLM runs. Once the stream is done, the [`RunResultStreaming`][agents.result.RunResultStreaming] will contain the complete information about the run, including all the new outputs produced. You can call `.stream_events()` for the streaming events. Read more in the [streaming guide](streaming.md).
 
+### Responses WebSocket transport (optional helper)
+
+If you enable the OpenAI Responses websocket transport, you can keep using the normal `Runner` APIs. The websocket session helper is recommended for connection reuse, but it is not required.
+
+This is the Responses API over websocket transport, not the [Realtime API](realtime/guide.md).
+
+#### Pattern 1: No session helper (works)
+
+Use this when you just want websocket transport and do not need the SDK to manage a shared provider/session for you.
+
+```python
+import asyncio
+
+from agents import Agent, Runner, set_default_openai_responses_transport
+
+
+async def main():
+    set_default_openai_responses_transport("websocket")
+
+    agent = Agent(name="Assistant", instructions="Be concise.")
+    result = Runner.run_streamed(agent, "Summarize recursion in one sentence.")
+
+    async for event in result.stream_events():
+        if event.type == "raw_response_event":
+            continue
+        print(event.type)
+
+
+asyncio.run(main())
+```
+
+This pattern is fine for single runs. If you call `Runner.run()` / `Runner.run_streamed()` repeatedly, each run may reconnect unless you manually reuse the same `RunConfig` / provider instance.
+
+#### Pattern 2: Use `responses_websocket_session()` (recommended for multi-turn reuse)
+
+Use [`responses_websocket_session()`][agents.responses_websocket_session] when you want a shared websocket-capable provider and `RunConfig` across multiple runs (including nested agent-as-tool calls that inherit the same `run_config`).
+
+```python
+import asyncio
+
+from agents import Agent, responses_websocket_session
+
+
+async def main():
+    agent = Agent(name="Assistant", instructions="Be concise.")
+
+    async with responses_websocket_session() as ws:
+        first = ws.run_streamed(agent, "Say hello in one short sentence.")
+        async for _event in first.stream_events():
+            pass
+
+        second = ws.run_streamed(
+            agent,
+            "Now say goodbye.",
+            previous_response_id=first.last_response_id,
+        )
+        async for _event in second.stream_events():
+            pass
+
+
+asyncio.run(main())
+```
+
+Finish consuming streamed results before the context exits. Exiting the context while a websocket request is still in flight may force-close the shared connection.
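A middle ground between the two patterns is to construct the websocket-capable provider yourself and share a single `RunConfig` across turns. The following is a sketch under the same assumptions as the examples above: `OpenAIProvider(use_responses_websocket=True)` comes from this diff, `previous_response_id` / `last_response_id` mirror the Pattern 2 example, and an `OPENAI_API_KEY` is required to actually run it.

```python
import asyncio

from agents import Agent, OpenAIProvider, RunConfig, Runner


async def main():
    # Build the websocket-capable provider once and share one RunConfig,
    # so repeated Runner.run() calls reuse the same provider instance
    # instead of reconnecting on every turn.
    provider = OpenAIProvider(use_responses_websocket=True)
    run_config = RunConfig(model_provider=provider)

    agent = Agent(name="Assistant", instructions="Be concise.")

    first = await Runner.run(agent, "Say hello.", run_config=run_config)
    second = await Runner.run(
        agent,
        "Now say goodbye.",
        run_config=run_config,
        previous_response_id=first.last_response_id,
    )
    print(second.final_output)


asyncio.run(main())
```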
+
 ## Run config
 
 The `run_config` parameter lets you configure some global settings for the agent run:

mkdocs.yml

Lines changed: 3 additions & 0 deletions
@@ -94,6 +94,7 @@ plugins:
 - ref/run.md
 - ref/run_config.md
 - ref/run_state.md
+- ref/responses_websocket_session.md
 - ref/run_error_handlers.md
 - ref/memory.md
 - ref/repl.md
@@ -118,6 +119,8 @@ plugins:
 - ref/models/interface.md
 - ref/models/openai_chatcompletions.md
 - ref/models/openai_responses.md
+- ref/models/openai_provider.md
+- ref/models/multi_provider.md
 - ref/mcp/server.md
 - ref/mcp/util.md
 - Tracing:
