ai-anthropic: max_tokens defaults to 1024, silently truncating responses when caller doesn't set it

## Summary

`@tanstack/ai-anthropic`'s text adapter defaults the Anthropic `max_tokens` request field to **1024** when the caller doesn't pass one. For any non-trivial generation (codegen, agentic tool flows, long-form output) this silently truncates the response: it comes back with `stop_reason: "max_tokens"`, often mid-tool-call, so the run looks like it "failed to do anything" rather than "ran out of output budget".

This is Anthropic-specific. Anthropic's Messages API *requires* `max_tokens`, so the adapter must send some value, but 1024 is far below what the targeted models can produce (the package's own metadata lists per-model ceilings such as `32_000` and `128_000`), so the package already knows each model's real limit.

## Where

`packages/ai-anthropic/src/adapters/text.ts:423`

```ts
const defaultMaxTokens = modelOptions?.max_tokens ?? 1024
```

## Why 1024 is the wrong default

- **It's a ceiling, not a reservation.** Billing is on tokens actually generated, so a higher default costs nothing unless the model genuinely produces more. The only effect of a low default is truncation.
- **The data for a better default already exists.** `packages/ai-anthropic/src/model-meta.ts` carries `max_output_tokens` per model (e.g. `128_000`, `32_000`). The default ignores it and hard-codes 1024.
- **It's inconsistent across adapters.** `@tanstack/ai-openai` has no equivalent `?? 1024` default (OpenAI treats `max_tokens` as optional and defaults to the model max), so callers only get bitten on Anthropic.
- **The failure mode is opaque.** It surfaces as a confusing incomplete/failed agent run, not an obvious "you hit max_tokens" — even though the adapter already has a `case "max_tokens"` branch and therefore knows it truncated.
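
To make the "ceiling, not a reservation" point concrete, here is a minimal sketch of how a cap interacts with the length the model would otherwise produce. `emittedTokens` is a hypothetical helper written for this issue, not part of the package or the Anthropic SDK, and it deliberately simplifies generation to "the model wants N tokens".

```ts
// Hypothetical model of a capped generation: the API returns at most
// `maxTokens` tokens, and output billing follows the tokens returned.
function emittedTokens(
  wanted: number,
  maxTokens: number,
): { tokens: number; truncated: boolean } {
  return {
    tokens: Math.min(wanted, maxTokens),
    truncated: wanted > maxTokens,
  }
}

// A short reply produces the same output under either cap.
const shortLowCap = emittedTokens(800, 1024)
const shortHighCap = emittedTokens(800, 64_000)

// A long reply is only complete under the higher cap.
const longLowCap = emittedTokens(5_000, 1024)
const longHighCap = emittedTokens(5_000, 64_000)
```

In this model, raising the cap changes nothing for the 800-token reply, while the 5,000-token reply is cut to 1,024 tokens under the current default and completes under the higher one.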

## Reproduction

Call `chat({ adapter: anthropicText('claude-sonnet-4-5'), ... })` (or `createAnthropicChat`) with a prompt that asks the model to write a file or otherwise produce more than ~1024 tokens of output, and **don't** set `maxTokens`. The stream ends early with `stop_reason: "max_tokens"` and the tool/file output is cut off mid-stream.

Setting `maxTokens` explicitly on the `chat()` call works around it, which confirms the default is the cause.

## Proposed fix

1. Default `max_tokens` from `ModelMeta.max_output_tokens` for the resolved model when the caller doesn't specify one, falling back to a sane constant only for unknown models. (Optionally cap the default so an unspecified call can't accidentally request the full 128K. That said, `max_tokens` is a ceiling rather than a reservation and large outputs should be streamed anyway, so defaulting to the model max is defensible.)
2. When a response stops on `stop_reason: "max_tokens"` while using the defaulted (caller-unspecified) cap, emit a warning so truncation isn't silent.
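
A rough sketch of both steps, to anchor the discussion. Everything here is an assumption for illustration: the `ModelMeta` shape is reduced to the one field mentioned above, `FALLBACK_MAX_TOKENS`, `resolveMaxTokens`, and `truncationWarning` are hypothetical names, and the fallback value is a placeholder rather than a recommendation.

```ts
// Assumed minimal shape; the real ModelMeta in model-meta.ts may differ.
interface ModelMeta {
  max_output_tokens?: number
}

// Placeholder fallback for models missing from the metadata table.
const FALLBACK_MAX_TOKENS = 4096

interface ResolvedMaxTokens {
  value: number
  // True when the caller did not pass max_tokens and the adapter picked one.
  defaulted: boolean
}

// Step 1: prefer the caller's value, then the model's own ceiling,
// then the fallback. `ceiling` is the optional cap on the default.
function resolveMaxTokens(
  callerMaxTokens: number | undefined,
  meta: ModelMeta | undefined,
  ceiling: number = Number.POSITIVE_INFINITY,
): ResolvedMaxTokens {
  if (callerMaxTokens !== undefined) {
    return { value: callerMaxTokens, defaulted: false }
  }
  const modelMax = meta?.max_output_tokens ?? FALLBACK_MAX_TOKENS
  return { value: Math.min(modelMax, ceiling), defaulted: true }
}

// Step 2: produce a warning only when the response stopped on the cap
// AND that cap was chosen by the adapter rather than the caller.
function truncationWarning(
  stopReason: string | null,
  resolved: ResolvedMaxTokens,
): string | undefined {
  if (stopReason !== 'max_tokens' || !resolved.defaulted) return undefined
  return (
    `Response truncated at the default max_tokens (${resolved.value}); ` +
    `pass maxTokens explicitly to raise the limit.`
  )
}
```

Tracking `defaulted` alongside the value keeps the warning quiet for callers who set a low cap on purpose.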

Happy to open a PR with the model-meta-aware default + truncation warning (plus the docs/skill updates the repo conventions call for). Wanted to confirm the approach in an issue first.

## Environment

- `@tanstack/ai-anthropic` — present on `main` (v0.15.8) and on the published `0.10.1`.

