Summary
@tanstack/ai-anthropic's text adapter defaults the Anthropic max_tokens request field to 1024 when the caller doesn't pass one. For any non-trivial generation (codegen, agentic tool flows, long-form output) this silently truncates the response: it comes back with stop_reason: "max_tokens", often mid-tool-call, so the run looks like it "failed to do anything" rather than "ran out of output budget".
This is Anthropic-specific. Anthropic's Messages API requires max_tokens, so the adapter must send some value — but 1024 is far below what the targeted models can produce (Sonnet/Opus support 64K–128K output), and the package already knows each model's real ceiling.
Where
packages/ai-anthropic/src/adapters/text.ts:423
const defaultMaxTokens = modelOptions?.max_tokens ?? 1024
Why 1024 is the wrong default
- It's a ceiling, not a reservation. Billing is on tokens actually generated, so a higher default costs nothing unless the model genuinely produces more. The only effect of a low default is truncation.
- The data for a better default already exists. packages/ai-anthropic/src/model-meta.ts carries max_output_tokens per model (e.g. 128_000, 32_000). The default ignores it and hard-codes 1024.
- It's inconsistent across adapters. @tanstack/ai-openai has no equivalent ?? 1024 floor (OpenAI treats max_tokens as optional and defaults to the model max), so callers only get bitten on Anthropic.
- The failure mode is opaque. It surfaces as a confusing incomplete/failed agent run, not an obvious "you hit max_tokens", even though the adapter already has a case "max_tokens" branch and therefore knows it truncated.
Reproduction
Call chat({ adapter: anthropicText('claude-sonnet-4-5'), ... }) (or createAnthropicChat) with a prompt that asks the model to write a file / produce output longer than ~1024 tokens, and don't set maxTokens. The stream ends early with stop_reason: "max_tokens" and the tool/file output is cut off mid-stream.
Setting maxTokens explicitly on the chat() call works around it, which confirms the default is the cause.
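When reproducing, it helps to confirm that the early stop really is the output cap and not some other failure. The sketch below assumes access to the raw Anthropic response fields stop_reason and usage.output_tokens; MessageResult and describeTruncation are hypothetical names for illustration, and whether the adapter's stream exposes these fields in this shape is not something this issue establishes.

```typescript
// Subset of the Anthropic Messages API response used here (assumed shape).
interface MessageResult {
  stop_reason: string | null
  usage: { output_tokens: number }
}

// Returns a short diagnostic when generation stopped at the output cap, else null.
function describeTruncation(
  result: MessageResult,
  requestedMaxTokens: number,
): string | null {
  if (result.stop_reason !== 'max_tokens') return null
  return (
    `output truncated: ${result.usage.output_tokens} tokens generated ` +
    `against a max_tokens of ${requestedMaxTokens}`
  )
}

// Hand-written sample shaped like a truncated response, not captured output.
const sample: MessageResult = {
  stop_reason: 'max_tokens',
  usage: { output_tokens: 1024 },
}
console.log(describeTruncation(sample, 1024))
```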
Proposed fix
- Default max_tokens from ModelMeta.max_output_tokens for the resolved model when the caller doesn't specify one, falling back to a sane constant only for unknown models. (Optionally cap to a reasonable ceiling so an unspecified call can't accidentally request the full 128K. That said, since it's a ceiling-not-reservation and large values should be streamed, defaulting to the model max is defensible.)
- When a response stops on stop_reason: "max_tokens" while using the defaulted (caller-unspecified) cap, emit a warning so truncation isn't silent.
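A minimal sketch of both proposals, under stated assumptions: MODEL_META stands in for the table in model-meta.ts (its real shape and model ids are not shown here, so the ids below are placeholders), and resolveMaxTokens, shouldWarnOnStop, and the 4096 fallback are hypothetical names and values chosen for illustration, not the adapter's existing API.

```typescript
// Hypothetical slice of model metadata; the real table lives in model-meta.ts.
interface ModelMetaEntry {
  max_output_tokens: number
}

const MODEL_META: Record<string, ModelMetaEntry> = {
  'example-large-model': { max_output_tokens: 128_000 },
  'example-small-model': { max_output_tokens: 32_000 },
}

// Fallback for models missing from the table; the value is a placeholder.
const UNKNOWN_MODEL_FALLBACK = 4096

interface ResolvedMaxTokens {
  maxTokens: number
  defaulted: boolean // true when the caller did not choose the cap
}

// Caller value wins; otherwise use the model's ceiling, optionally clamped.
function resolveMaxTokens(
  model: string,
  callerMaxTokens?: number,
  ceiling?: number,
): ResolvedMaxTokens {
  if (callerMaxTokens !== undefined) {
    return { maxTokens: callerMaxTokens, defaulted: false }
  }
  const modelMax =
    MODEL_META[model]?.max_output_tokens ?? UNKNOWN_MODEL_FALLBACK
  const maxTokens =
    ceiling !== undefined ? Math.min(modelMax, ceiling) : modelMax
  return { maxTokens, defaulted: true }
}

// Warn only when truncation happened under a cap the caller never chose.
function shouldWarnOnStop(
  stopReason: string | null,
  resolved: ResolvedMaxTokens,
): boolean {
  return stopReason === 'max_tokens' && resolved.defaulted
}
```

Carrying the defaulted flag alongside the resolved value keeps the warning from firing when a caller deliberately set a low cap and hit it.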
Happy to open a PR with the model-meta-aware default + truncation warning (plus the docs/skill updates the repo conventions call for). Wanted to confirm the approach in an issue first.
Environment
@tanstack/ai-anthropic — present on main (v0.15.8) and on the published 0.10.1.