feat: rename AnalyticsFormat to API enum names with legacy aliases #328

Open
jamesbroadhead wants to merge 2 commits into main from stack/arrow-2-format-rename

Contributor

@jamesbroadhead jamesbroadhead commented Apr 29, 2026

Summary

Aligns the client-side analytics format model with the verbatim enum values in the Databricks Statement Execution API (Format) so we no longer translate to/from local aliases:

  • "JSON" → "JSON_ARRAY"
  • "ARROW" → "ARROW_STREAM"

The old spellings are kept as @deprecated aliases — see "Backwards compatibility" below — so existing useAnalyticsQuery({ format: "JSON" | "ARROW" }) callers keep working unchanged.

Stack

Based on #327. Review independently — no behavior change.

Backwards compatibility

AnalyticsFormat is widened to a four-member union: "JSON_ARRAY" | "ARROW_STREAM" | "JSON" | "ARROW", with the two legacy values tagged @deprecated and a JSDoc note describing the removal condition (no consumer remaining on appkit/appkit-ui < 0.33.0). The route handler normalizes incoming legacy values once at the entry point, so all downstream code (cache key, format branching, formatParameters) continues to operate on the canonical "JSON_ARRAY" | "ARROW_STREAM" pair. InferResultByFormat is widened so callers passing "ARROW" still get TypedArrowTable<...> inferred.

Net effect: this is not a breaking change. IDEs show a strikethrough/deprecation hint on the old spellings.
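
As a rough illustration of the widening and entry-point normalization described above, a minimal sketch follows. The four union members and the `normalizeAnalyticsFormat` name come from this PR; the `CanonicalAnalyticsFormat` and `LegacyAnalyticsFormat` aliases, the JSDoc wording, and the function body are assumptions made for illustration, not the merged code.

```typescript
/** Canonical values, matching the Statement Execution API `Format` enum. */
type CanonicalAnalyticsFormat = "JSON_ARRAY" | "ARROW_STREAM";

/**
 * Pre-rename spellings (alias name is hypothetical).
 * @deprecated Use "JSON_ARRAY" / "ARROW_STREAM". Removable once no consumer
 * remains on appkit/appkit-ui < 0.33.0.
 */
type LegacyAnalyticsFormat = "JSON" | "ARROW";

/** The widened four-member union accepted from callers. */
type AnalyticsFormat = CanonicalAnalyticsFormat | LegacyAnalyticsFormat;

/** Maps a legacy spelling to its canonical value; canonical values pass through. */
function normalizeAnalyticsFormat(
  format: AnalyticsFormat,
): CanonicalAnalyticsFormat {
  switch (format) {
    case "JSON":
      return "JSON_ARRAY";
    case "ARROW":
      return "ARROW_STREAM";
    default:
      // Narrowed to "JSON_ARRAY" | "ARROW_STREAM" here.
      return format;
  }
}
```

Calling the helper once at the route handler's entry point means every later branch only ever sees the two canonical values.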

Test plan

  • Existing tests pass after rename
  • Legacy format: "JSON" and format: "ARROW" callers still typecheck and return the right inferred result types
  • Server-side normalization preserves the existing cache-key shape (canonical values only) — no cache fragmentation between callers using different spellings
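
The cache-key item in the test plan could be checked along the following lines. This is a sketch under stated assumptions: `buildCacheKey` and its key shape are hypothetical stand-ins for whatever the route handler really uses, and only the idea of normalizing before building the key is taken from the PR.

```typescript
type CanonicalFormat = "JSON_ARRAY" | "ARROW_STREAM";
type AnalyticsFormat = CanonicalFormat | "JSON" | "ARROW";

// Same mapping the PR describes; the body is an assumption.
function normalizeAnalyticsFormat(format: AnalyticsFormat): CanonicalFormat {
  if (format === "JSON") return "JSON_ARRAY";
  if (format === "ARROW") return "ARROW_STREAM";
  return format;
}

// Hypothetical key builder: the key only ever contains canonical values,
// so legacy and canonical spellings share one cache entry.
function buildCacheKey(
  queryKey: string,
  format: AnalyticsFormat,
  parameters: Record<string, unknown>,
): string {
  return JSON.stringify([queryKey, normalizeAnalyticsFormat(format), parameters]);
}

const legacyKey = buildCacheKey("sales", "ARROW", { region: "emea" });
const canonicalKey = buildCacheKey("sales", "ARROW_STREAM", { region: "emea" });
const jsonKey = buildCacheKey("sales", "JSON", { region: "emea" });
```

Here `legacyKey` and `canonicalKey` are identical strings, while `jsonKey` differs because it resolves to the other canonical format.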

This pull request was AI-assisted by Isaac.

@jamesbroadhead jamesbroadhead requested a review from a team as a code owner May 11, 2026 14:47
@jamesbroadhead jamesbroadhead requested review from MarioCadenas and removed request for a team May 11, 2026 14:47
@jamesbroadhead jamesbroadhead deleted the branch main May 11, 2026 15:14
@jamesbroadhead jamesbroadhead changed the title from "refactor: rename AnalyticsFormat to API enum names (JSON_ARRAY, ARROW_STREAM)" to "feat: rename AnalyticsFormat to API enum names with legacy aliases" May 11, 2026
@jamesbroadhead jamesbroadhead changed the base branch from stack/arrow-1-coverage-tests to main May 11, 2026 16:21
Renames the client-side analytics format model from "JSON"/"ARROW" to
"JSON_ARRAY"/"ARROW_STREAM" to match the Statement Execution API enum
verbatim — no more local-name to API-name translation.

Pure mechanical rename. No behavior change. Internal type values only;
the lowercase user-facing values passed to useChartData ("json", "arrow",
"auto") are unchanged.

Carved out of #256 (#327 is layer 1, this is layer 2). The actual
inline-Arrow-IPC + warehouse-fallback fix sits on top of this in layer 3.

Note: this is a breaking change for any direct consumer of
useAnalyticsQuery passing explicit format: "JSON" or "ARROW" — they will
need to update to "JSON_ARRAY" / "ARROW_STREAM". Consumers using
useChartData (lowercase "json"/"arrow"/"auto") are unaffected.

Co-authored-by: Isaac

Widen AnalyticsFormat to also include the pre-rename "JSON" and "ARROW"
spellings, both marked @deprecated with a JSDoc note describing the
removal condition (no consumer on appkit/appkit-ui < 0.33.0). Add a
normalizeAnalyticsFormat helper and call it at the analytics route
handler entry point so all downstream code (cache key, format
branching, formatParameters) continues to operate on the canonical
"JSON_ARRAY" | "ARROW_STREAM" values.

InferResultByFormat is widened to also match "ARROW" so callers
passing the legacy spelling still get TypedArrowTable<...> inferred.
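
One way the widened inference could be expressed is sketched below. The real signatures of `InferResultByFormat` and `TypedArrowTable` are not shown in this PR, so the type parameters, the placeholder `TypedArrowTable` interface, and the row-array result for the JSON formats are all assumptions.

```typescript
// Placeholder for the library's Arrow table type (shape is hypothetical).
interface TypedArrowTable<Row> {
  readonly kind: "arrow";
  readonly sample?: Row;
}

type AnalyticsFormat = "JSON_ARRAY" | "ARROW_STREAM" | "JSON" | "ARROW";

// Both the canonical and the legacy Arrow spellings select the Arrow result.
type InferResultByFormat<Row, F extends AnalyticsFormat> =
  F extends "ARROW_STREAM" | "ARROW" ? TypedArrowTable<Row> : Row[];

// Compile-time type equality helper used only to check the sketch.
type Equals<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2)
    ? true
    : false;

type Row = { id: number };

// Each assignment fails to typecheck if the inferred type is not the expected one.
const legacyArrowIsTable: Equals<InferResultByFormat<Row, "ARROW">, TypedArrowTable<Row>> = true;
const canonicalArrowIsTable: Equals<InferResultByFormat<Row, "ARROW_STREAM">, TypedArrowTable<Row>> = true;
const legacyJsonIsRows: Equals<InferResultByFormat<Row, "JSON">, Row[]> = true;
```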

This lifts the breaking-change carve-out from the rename, so callers
of useAnalyticsQuery({ format: "JSON" | "ARROW" }) keep working with
only an IDE deprecation hint.

Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
@jamesbroadhead jamesbroadhead force-pushed the stack/arrow-2-format-rename branch from 2d40daf to 265a9ed May 11, 2026 16:30
jamesbroadhead added a commit that referenced this pull request May 11, 2026
Serverless warehouses return ARROW_STREAM + INLINE results as base64 Arrow
IPC in result.attachment rather than result.data_array. The previous code
path discarded inline data for any ARROW_STREAM response (designed for
EXTERNAL_LINKS), so these warehouses silently returned empty results.

This commit makes the analytics plugin work across classic and serverless
warehouses by handling both dispositions for ARROW_STREAM, decoding inline
Arrow IPC attachments server-side, and falling back to JSON_ARRAY when a
warehouse rejects ARROW_STREAM + INLINE.

Changes
- Inline Arrow IPC decoding (new arrow-schema.ts) via apache-arrow's
  tableFromIPC, producing the same row-object shape as JSON_ARRAY
  regardless of warehouse backend. apache-arrow@21.1.0 added as a server
  dep.
- Format fallback: ARROW_STREAM + INLINE requests automatically fall back
  to JSON_ARRAY if a classic warehouse rejects them. Explicit format
  requests are respected without fallback.
- Zod-validated SSE wire protocol for /api/analytics/query (shared schema
  between server and client; malformed payloads surface a clear error
  instead of silent undefined).
- Default remains JSON_ARRAY for compatibility.
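
The format-fallback bullet above can be sketched as follows. Everything here is illustrative: `executeWithFallback`, the `Execute` callback, `FormatRejectedError`, and the `allowFallback` flag are hypothetical names, and how the real plugin detects a rejection or decides that a format was requested explicitly is not stated in this commit message.

```typescript
type Format = "JSON_ARRAY" | "ARROW_STREAM";

interface StatementRequest {
  statement: string;
  format: Format;
  disposition: "INLINE" | "EXTERNAL_LINKS";
}

interface StatementResult {
  format: Format;
  rows: unknown[];
}

// Hypothetical error a warehouse client might raise for an unsupported combination.
class FormatRejectedError extends Error {
  readonly code = "FORMAT_REJECTED";
}

type Execute = (request: StatementRequest) => Promise<StatementResult>;

function isFormatRejection(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "FORMAT_REJECTED"
  );
}

// Tries the requested format; retries with JSON_ARRAY only when the request
// was ARROW_STREAM + INLINE, fallback is allowed, and the warehouse rejected it.
async function executeWithFallback(
  execute: Execute,
  statement: string,
  format: Format,
  allowFallback: boolean,
): Promise<StatementResult> {
  try {
    return await execute({ statement, format, disposition: "INLINE" });
  } catch (err) {
    const canFallBack =
      allowFallback && format === "ARROW_STREAM" && isFormatRejection(err);
    if (!canFallBack) throw err; // explicit requests surface the error
    return execute({ statement, format: "JSON_ARRAY", disposition: "INLINE" });
  }
}
```

Passing `allowFallback: false` models the "explicit format requests are respected without fallback" rule: the rejection propagates to the caller instead of being retried.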

Stack: layer 3 of 3 carved from #256.
- #327 — coverage backfill (layer 1)
- #328 — AnalyticsFormat rename to API enum names (layer 2)
- (this PR) — the actual fix

Fixes #242

Co-authored-by: Isaac