Commit 13f4a74

docs: update pages to add any-llm adapter (#2715)
1 parent 0a5b8c9 commit 13f4a74

File tree: 9 files changed (+50, -39 lines)

docs/examples.md

Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ Check out a variety of sample implementations of the SDK in the examples section
 - File handling (local and remote, images and PDFs)
 - Usage tracking
 - Runner-managed retry settings (`examples/basic/retry.py`)
-- Runner-managed retries with LiteLLM (`examples/basic/retry_litellm.py`)
+- Runner-managed retries through a third-party adapter (`examples/basic/retry_litellm.py`)
 - Non-strict output types
 - Previous response ID usage
@@ -68,7 +68,7 @@ Check out a variety of sample implementations of the SDK in the examples section
 - Stateless Responses compaction with `ModelSettings(store=False)` (`examples/memory/compaction_session_stateless_example.py`)

 - **[model_providers](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers):**
-  Explore how to use non-OpenAI models with the SDK, including custom providers and LiteLLM integration.
+  Explore how to use non-OpenAI models with the SDK, including custom providers and third-party adapters.

 - **[realtime](https://github.com/openai/openai-agents-python/tree/main/examples/realtime):**
   Examples showing how to build real-time experiences using the SDK, including:

docs/llms-full.txt

Lines changed: 2 additions & 2 deletions
@@ -38,7 +38,7 @@ The Agents SDK delivers a focused set of Python primitives—agents, tools, guar
 - [Realtime guide](https://openai.github.io/openai-agents-python/realtime/guide/): Deep dive into realtime session lifecycle, structured input, approvals, interruptions, and low-level transport control.

 ## Models and Provider Integrations
-- [Model catalog](https://openai.github.io/openai-agents-python/models/): Covers OpenAI model selection, non-OpenAI provider patterns, websocket transport, and the SDK's best-effort LiteLLM guidance in one place.
+- [Model catalog](https://openai.github.io/openai-agents-python/models/): Covers OpenAI model selection, non-OpenAI provider patterns, websocket transport, and third-party adapter guidance in one place.

 ## API Reference – Agents SDK Core
 - [API index](https://openai.github.io/openai-agents-python/ref/index/): Directory of all documented modules, classes, and functions in the SDK.
@@ -103,7 +103,7 @@ The Agents SDK delivers a focused set of Python primitives—agents, tools, guar
 ## API Reference – Extensions
 - [Handoff filters extension](https://openai.github.io/openai-agents-python/ref/extensions/handoff_filters/): Build filters that decide whether to trigger a handoff.
 - [Handoff prompt extension](https://openai.github.io/openai-agents-python/ref/extensions/handoff_prompt/): Customize prompt templates used when transferring control.
-- [LiteLLM extension](https://openai.github.io/openai-agents-python/ref/extensions/litellm/): Adapter for using LiteLLM-managed providers inside the SDK.
+- [Third-party adapters API reference](https://openai.github.io/openai-agents-python/ref/extensions/): API reference entry point for Any-LLM and LiteLLM model adapters and providers.
 - [SQLAlchemy session memory](https://openai.github.io/openai-agents-python/ref/extensions/memory/sqlalchemy_session/): Persist agent session history to SQL databases.

 ## Optional

docs/llms.txt

Lines changed: 2 additions & 2 deletions
@@ -49,10 +49,10 @@ The SDK focuses on a concise set of primitives so you can orchestrate multi-agen
 - [Tracing APIs](https://openai.github.io/openai-agents-python/ref/tracing/index/): Programmatic interfaces for creating traces, spans, and integrating custom processors.
 - [Realtime APIs](https://openai.github.io/openai-agents-python/ref/realtime/agent/): Classes for realtime agents, runners, sessions, and event payloads.
 - [Voice APIs](https://openai.github.io/openai-agents-python/ref/voice/pipeline/): Configure voice pipelines, inputs, events, and model adapters.
-- [Extensions](https://openai.github.io/openai-agents-python/ref/extensions/handoff_filters/): Extend the SDK with custom handoff filters, prompts, LiteLLM integration, and SQLAlchemy session memory.
+- [Extensions](https://openai.github.io/openai-agents-python/ref/extensions/handoff_filters/): Extend the SDK with custom handoff filters, prompts, third-party adapters, and SQLAlchemy session memory.

 ## Models and Providers
-- [Model catalog](https://openai.github.io/openai-agents-python/models/): Overview of OpenAI models, non-OpenAI provider patterns, websocket transport, and the SDK's best-effort LiteLLM guidance.
+- [Model catalog](https://openai.github.io/openai-agents-python/models/): Overview of OpenAI models, non-OpenAI provider patterns, websocket transport, and third-party adapter guidance.

 ## Optional
 - [Release notes](https://openai.github.io/openai-agents-python/release/): Track SDK changes, migration notes, and deprecations.

docs/models/index.md

Lines changed: 21 additions & 11 deletions
@@ -16,7 +16,7 @@ Start with the simplest path that fits your setup:
 | Use one non-OpenAI provider | Start with the built-in provider integration points | [Non-OpenAI models](#non-openai-models) |
 | Mix models or providers across agents | Select providers per run or per agent and review feature differences | [Mixing models in one workflow](#mixing-models-in-one-workflow) and [Mixing models across providers](#mixing-models-across-providers) |
 | Tune advanced OpenAI Responses request settings | Use `ModelSettings` on the OpenAI Responses path | [Advanced OpenAI Responses settings](#advanced-openai-responses-settings) |
-| Use LiteLLM for non-OpenAI Chat Completions providers | Treat LiteLLM as a beta fallback | [LiteLLM](#litellm) |
+| Use a third-party adapter for non-OpenAI or mixed-provider routing | Compare the supported beta adapters and validate the provider path you plan to ship | [Third-party adapters](#third-party-adapters) |

 ## OpenAI models

@@ -135,7 +135,7 @@ result = await Runner.run(

 #### Advanced routing with `MultiProvider`

-If you need prefix-based model routing (for example mixing `openai/...` and `litellm/...` model names in one run), use [`MultiProvider`][agents.MultiProvider] and set `openai_use_responses_websocket=True` there instead.
+If you need prefix-based model routing (for example mixing `openai/...` and `any-llm/...` model names in one run), use [`MultiProvider`][agents.MultiProvider] and set `openai_use_responses_websocket=True` there instead.

 `MultiProvider` keeps two historical defaults:

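As a rough illustration of the prefix-based routing this hunk describes, here is a minimal sketch in plain Python. The `route_model_name` helper and its prefix list are hypothetical, not the SDK's actual `MultiProvider` implementation:

```python
# Hypothetical sketch of prefix-based model routing, in the spirit of
# MultiProvider's `openai/...` and `any-llm/...` name handling.
# This is illustrative only, not the SDK's implementation.

def route_model_name(name: str) -> tuple[str, str]:
    """Split a prefixed model name into (provider_key, bare_model_name)."""
    known_prefixes = ("openai", "any-llm", "litellm")  # assumed prefix set
    prefix, sep, rest = name.partition("/")
    if sep and prefix in known_prefixes:
        return prefix, rest
    # Mirror the historical default: unprefixed names go to OpenAI.
    return "openai", name

print(route_model_name("any-llm/mistral/mistral-small"))
print(route_model_name("gpt-4.1"))
```

Unprefixed names fall through to the OpenAI provider, mirroring the historical default the docs mention.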
@@ -180,7 +180,7 @@ If you use a custom OpenAI-compatible endpoint or proxy, websocket transport als

 ## Non-OpenAI models

-If you need a non-OpenAI provider, start with the SDK's built-in provider integration points. In many setups, this is enough without adding LiteLLM. Examples for each pattern live in [examples/model_providers](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/).
+If you need a non-OpenAI provider, start with the SDK's built-in provider integration points. In many setups, this is enough without adding a third-party adapter. Examples for each pattern live in [examples/model_providers](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/).

 ### Ways to integrate non-OpenAI providers

@@ -189,7 +189,7 @@ If you need a non-OpenAI provider, start with the SDK's built-in provider integr
 | [`set_default_openai_client`][agents.set_default_openai_client] | One OpenAI-compatible endpoint should be the default for most or all agents | Global default |
 | [`ModelProvider`][agents.models.interface.ModelProvider] | One custom provider should apply to a single run | Per run |
 | [`Agent.model`][agents.agent.Agent.model] | Different agents need different providers or concrete model objects | Per agent |
-| LiteLLM (beta) | You need LiteLLM-specific provider coverage or routing | See [LiteLLM](#litellm) |
+| Third-party adapter | You need adapter-managed provider coverage or routing that the built-in paths do not provide | See [Third-party adapters](#third-party-adapters) |

 You can integrate other LLM providers with these built-in paths:

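The integration-point table in this hunk reduces to a small lookup from scope to entry point. The helper below is a hypothetical summary sketch, not an SDK API:

```python
# Hypothetical lookup summarizing the integration-point table above.
# The scope strings mirror the docs; this is not part of the SDK.

def pick_integration_point(scope: str) -> str:
    """Map an integration scope to the documented entry point."""
    table = {
        "global default": "set_default_openai_client",
        "per run": "ModelProvider",
        "per agent": "Agent.model",
        "adapter-managed": "third-party adapter (Any-LLM or LiteLLM, beta)",
    }
    if scope not in table:
        raise ValueError(f"unknown scope: {scope!r}")
    return table[scope]

print(pick_integration_point("per agent"))
```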
@@ -404,7 +404,7 @@ Stateful follow-up requests using `previous_response_id` or `conversation_id` ar
 - An agent can override only part of `retry.backoff` and keep sibling backoff fields from the runner.
 - `policy` is runtime-only, so serialized `ModelSettings` keep `max_retries` and `backoff` but omit the callback itself.

-For fuller examples, see [`examples/basic/retry.py`](https://github.com/openai/openai-agents-python/tree/main/examples/basic/retry.py) and [`examples/basic/retry_litellm.py`](https://github.com/openai/openai-agents-python/tree/main/examples/basic/retry_litellm.py).
+For fuller examples, see [`examples/basic/retry.py`](https://github.com/openai/openai-agents-python/tree/main/examples/basic/retry.py) and the [adapter-backed retry example](https://github.com/openai/openai-agents-python/tree/main/examples/basic/retry_litellm.py).

 ## Troubleshooting non-OpenAI providers

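The partial `retry.backoff` override described in the retry hunk above can be sketched with plain dictionaries. The field names here are illustrative, not the SDK's actual `ModelSettings` schema:

```python
# Hypothetical sketch of merging runner-level and agent-level retry settings:
# the agent overrides one backoff field while keeping the runner's siblings.
# Field names are illustrative, not the SDK's API.

def merge_retry_settings(runner: dict, agent: dict) -> dict:
    """Agent values win, but `backoff` is merged field-by-field."""
    merged = {**runner, **{k: v for k, v in agent.items() if k != "backoff"}}
    merged["backoff"] = {**runner.get("backoff", {}), **agent.get("backoff", {})}
    return merged

runner_retry = {"max_retries": 3, "backoff": {"initial": 1.0, "multiplier": 2.0}}
agent_retry = {"backoff": {"initial": 0.5}}  # override only one backoff field

print(merge_retry_settings(runner_retry, agent_retry))
# {'max_retries': 3, 'backoff': {'initial': 0.5, 'multiplier': 2.0}}
```

The agent's `initial` wins while the runner's `multiplier` survives, matching the "override only part of `retry.backoff`" behavior the bullets describe.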
@@ -443,14 +443,24 @@ You need to be aware of feature differences between model providers, or you may
 - Filter out multimodal inputs before calling models that are text-only
 - Be aware that providers that don't support structured JSON outputs will occasionally produce invalid JSON.

-## LiteLLM
+## Third-party adapters

-LiteLLM support is included on a best-effort, beta basis for cases where you need to bring non-OpenAI providers into an Agents SDK workflow.
+Reach for a third-party adapter only when the SDK's built-in provider integration points are not enough. If you are using OpenAI models only with this SDK, prefer the built-in [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel] path instead of Any-LLM or LiteLLM. Third-party adapters are for cases where you need to combine OpenAI models with non-OpenAI providers, or need adapter-managed provider coverage or routing that the built-in paths do not provide. Adapters add another compatibility layer between the SDK and the upstream model provider, so feature support and request semantics can vary by provider. The SDK currently includes Any-LLM and LiteLLM as best-effort, beta adapter integrations.

-If you are using OpenAI models with this SDK, we recommend the built-in [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel] path instead of LiteLLM.
+### Any-LLM

-If you need to combine OpenAI models with non-OpenAI providers, especially through Chat Completions-compatible APIs, LiteLLM is available as a beta option, but it may not be the optimal choice for every setup.
+Any-LLM support is included on a best-effort, beta basis for cases where you need Any-LLM-managed provider coverage or routing.

-If you need LiteLLM for a non-OpenAI provider, install `openai-agents[litellm]`, then start from [`examples/model_providers/litellm_auto.py`](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/litellm_auto.py) or [`examples/model_providers/litellm_provider.py`](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/litellm_provider.py). You can either use `litellm/...` model names or instantiate [`LitellmModel`][agents.extensions.models.litellm_model.LitellmModel] directly.
+Depending on the upstream provider path, Any-LLM may use the Responses API, Chat Completions-compatible APIs, or provider-specific compatibility layers.

-If you want LiteLLM responses to populate the SDK's usage metrics, pass `ModelSettings(include_usage=True)`.
+If you need Any-LLM, install `openai-agents[any-llm]`, then start from [`examples/model_providers/any_llm_auto.py`](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/any_llm_auto.py) or [`examples/model_providers/any_llm_provider.py`](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/any_llm_provider.py). You can use `any-llm/...` model names with [`MultiProvider`][agents.MultiProvider], instantiate `AnyLLMModel` directly, or use `AnyLLMProvider` at run scope. If you need to pin the model surface explicitly, pass `api="responses"` or `api="chat_completions"` when constructing `AnyLLMModel`.
+
+Any-LLM remains a third-party adapter layer, so provider dependencies and capability gaps are defined upstream by Any-LLM rather than by the SDK. Usage metrics are propagated automatically when the upstream provider returns them, but streamed Chat Completions backends may require `ModelSettings(include_usage=True)` before they emit usage chunks. Validate the exact provider backend you plan to deploy if you depend on structured outputs, tool calling, usage reporting, or Responses-specific behavior.
+
+### LiteLLM
+
+LiteLLM support is included on a best-effort, beta basis for cases where you need LiteLLM-specific provider coverage or routing.
+
+If you need LiteLLM, install `openai-agents[litellm]`, then start from [`examples/model_providers/litellm_auto.py`](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/litellm_auto.py) or [`examples/model_providers/litellm_provider.py`](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/litellm_provider.py). You can use `litellm/...` model names or instantiate [`LitellmModel`][agents.extensions.models.litellm_model.LitellmModel] directly.
+
+Some LiteLLM-backed providers do not populate SDK usage metrics by default. If you need usage reporting, pass `ModelSettings(include_usage=True)`, and validate the exact provider backend you plan to deploy if you depend on structured outputs, tool calling, or adapter-specific routing behavior.
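The `any-llm/...` model names mentioned in this hunk nest a provider segment before the model id. A hypothetical parser (ours, not the SDK's) illustrates the naming convention:

```python
# Hypothetical illustration of the `any-llm/<provider>/<model>` naming
# convention described above. The parsing helper is ours, not the SDK's.

def parse_any_llm_name(name: str) -> tuple[str, str]:
    """Split 'any-llm/<provider>/<model>' into (provider, model)."""
    prefix, _, provider_and_model = name.partition("/")
    if prefix != "any-llm" or "/" not in provider_and_model:
        raise ValueError(f"expected 'any-llm/<provider>/<model>', got {name!r}")
    provider, _, model = provider_and_model.partition("/")
    return provider, model

print(parse_any_llm_name("any-llm/mistral/mistral-small-latest"))
```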

docs/models/litellm.md

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 # LiteLLM

 <script>
-window.location.replace("../#litellm");
+window.location.replace("../#third-party-adapters");
 </script>

-This page moved to the [LiteLLM section in Models](index.md#litellm).
+This page moved to the [Third-party adapters section in Models](index.md#third-party-adapters).

 If you are not redirected automatically, use the link above.

docs/ref/extensions/litellm.md

Lines changed: 7 additions & 1 deletion
@@ -1,3 +1,9 @@
 # `LiteLLM Models`

-::: agents.extensions.models.litellm_model
+<script>
+window.location.replace("../third_party_adapters/");
+</script>
+
+This page moved to the [Third-party adapters API reference](third_party_adapters.md).
+
+If you are not redirected automatically, use the link above.

docs/tracing.md

Lines changed: 4 additions & 4 deletions
@@ -103,18 +103,18 @@ To customize this default setup, to send traces to alternative or additional bac

 ## Tracing with non-OpenAI models

-You can use an OpenAI API key with non-OpenAI Models to enable free tracing in the OpenAI Traces dashboard without needing to disable tracing.
+You can use an OpenAI API key with non-OpenAI models to enable free tracing in the OpenAI Traces dashboard without needing to disable tracing. See the [Third-party adapters](models/index.md#third-party-adapters) section in the Models guide for adapter selection and setup caveats.

 ```python
 import os
 from agents import set_tracing_export_api_key, Agent, Runner
-from agents.extensions.models.litellm_model import LitellmModel
+from agents.extensions.models.any_llm_model import AnyLLMModel

 tracing_api_key = os.environ["OPENAI_API_KEY"]
 set_tracing_export_api_key(tracing_api_key)

-model = LitellmModel(
-    model="your-model-name",
+model = AnyLLMModel(
+    model="your-provider/your-model-name",
     api_key="your-api-key",
 )

docs/usage.md

Lines changed: 5 additions & 14 deletions
@@ -29,23 +29,14 @@ print("Total tokens:", usage.total_tokens)

 Usage is aggregated across all model calls during the run (including tool calls and handoffs).

-### Enabling usage with LiteLLM models
+### Enabling usage with third-party adapters

-LiteLLM providers do not report usage metrics by default. When you are using [`LitellmModel`][agents.extensions.models.litellm_model.LitellmModel], pass `ModelSettings(include_usage=True)` to your agent so that LiteLLM responses populate `result.context_wrapper.usage`. See the [LiteLLM note](models/index.md#litellm) in the Models guide for setup guidance and examples.
+Usage reporting varies across third-party adapters and provider backends. If you rely on adapter-backed models and need accurate `result.context_wrapper.usage` values:

-```python
-from agents import Agent, ModelSettings, Runner
-from agents.extensions.models.litellm_model import LitellmModel
-
-agent = Agent(
-    name="Assistant",
-    model=LitellmModel(model="your/model", api_key="..."),
-    model_settings=ModelSettings(include_usage=True),
-)
+- With `AnyLLMModel`, usage is propagated automatically when the upstream provider returns it. For streamed Chat Completions backends, you may need `ModelSettings(include_usage=True)` before usage chunks are emitted.
+- With `LitellmModel`, some provider backends do not report usage by default, so `ModelSettings(include_usage=True)` is often required.

-result = await Runner.run(agent, "What's the weather in Tokyo?")
-print(result.context_wrapper.usage.total_tokens)
-```
+Review the adapter-specific notes in the [Third-party adapters](models/index.md#third-party-adapters) section of the Models guide and validate the exact provider backend you plan to deploy.

 ## Per-request usage tracking

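The "aggregated across all model calls" behavior described in this hunk can be sketched with a plain-Python stand-in for the usage object. The `Usage` shape below is illustrative, not the SDK's actual class:

```python
# Sketch of per-call usage aggregating into a run total, mirroring the
# "aggregated across all model calls" behavior described above.
# This Usage shape is illustrative, not the SDK's actual class.
from dataclasses import dataclass

@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

    def add(self, other: "Usage") -> None:
        self.input_tokens += other.input_tokens
        self.output_tokens += other.output_tokens

run_usage = Usage()
for call in (Usage(120, 40), Usage(300, 85)):  # e.g. a tool call + the final answer
    run_usage.add(call)

print(run_usage.total_tokens)  # 545
```

Whether each per-call `Usage` is actually populated is exactly the adapter caveat above: a backend that returns no usage contributes zeros to the total.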

mkdocs.yml

Lines changed: 5 additions & 1 deletion
@@ -156,7 +156,11 @@ plugins:
       - Extensions:
           - ref/extensions/handoff_filters.md
           - ref/extensions/handoff_prompt.md
-          - ref/extensions/litellm.md
+          - Third-party adapters:
+              - Any-LLM model: ref/extensions/models/any_llm_model.md
+              - Any-LLM provider: ref/extensions/models/any_llm_provider.md
+              - LiteLLM model: ref/extensions/models/litellm_model.md
+              - LiteLLM provider: ref/extensions/models/litellm_provider.md
           - ref/extensions/tool_output_trimmer.md
           - ref/extensions/memory/sqlalchemy_session.md
           - ref/extensions/memory/async_sqlite_session.md
- ref/extensions/memory/async_sqlite_session.md
