Optimize translation costs: default off, per-conversation toggle, batch mode, screen-aware deferral #6837

@beastoin

Translation via the /v4/listen API is the largest translation cost driver. Currently, every user with a primary language set triggers real-time Google Cloud Translation API calls on every conversation, even if they are monolingual. This issue proposes five changes to cut costs while keeping translation seamless for users who need it.

Current Behavior

  • Default ON: single_language_mode defaults to false. Any user who sets a primary language during onboarding gets auto-translation active on every listen session.
  • Global toggle only: Translation is all-or-nothing via Settings → Language → Automatic Translation. No per-conversation control.
  • Real-time per-segment: TranslationCoordinator processes every segment through language detection → classification → GCP Translation API ($15/M chars). The monolingual gate skips most calls after 4 consecutive same-language detections, but the coordinator is still instantiated for every session.
  • No screen awareness: Translation happens even when the phone screen is off and the user isn't viewing the transcript.
  • No time limit: No restriction on translating old segments retroactively.
  • Desktop waste: Desktop app sends language param to backend, backend translates and persists, but desktop never displays translations — wasted API calls.
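
To make the monolingual gate described above concrete, here is a minimal sketch of a streak-based gate. The class name `MonolingualGate`, its methods, and the reset-on-different-language rule are assumptions for illustration; the actual logic in backend/utils/translation_cache.py may differ.

```python
class MonolingualGate:
    """Hypothetical streak-based gate (not the real translation_cache.py code)."""

    def __init__(self, target_language: str, threshold: int = 4):
        self.target_language = target_language
        self.threshold = threshold  # 4 consecutive detections, per the issue text
        self._streak = 0

    def observe(self, detected_language: str) -> None:
        # Count consecutive segments already in the target language;
        # any other language resets the streak (an assumption).
        if detected_language == self.target_language:
            self._streak += 1
        else:
            self._streak = 0

    @property
    def skip_api(self) -> bool:
        # Once the streak reaches the threshold, translation calls can be skipped.
        return self._streak >= self.threshold
```

Note that a gate like this only saves API calls after the fourth segment, and the coordinator still has to exist to host it, which is the cost the issue points at.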

Expected Behavior

Smart, cost-efficient translation that's seamless when users need it but doesn't burn API calls when they don't.

Affected Areas

| File | Description |
| --- | --- |
| backend/routers/transcribe.py:318-328 | Translation language determination: currently enabled whenever single_language_mode=false and a language preference exists |
| backend/utils/translation_coordinator.py | Real-time coordinator: instantiated every session, even for monolingual users |
| backend/utils/translation.py | GCP Translation API client: _client.translate_text() calls |
| backend/utils/translation_cache.py | Monolingual gate + caching (already good, but the gate activates too late) |
| backend/routers/users.py:525-534 | Language preference endpoint: auto-sets single_language_mode based on language support, not user choice |
| app/lib/pages/settings/language_settings_page.dart | Global translation toggle UI |
| app/lib/services/sockets/transcription_service.dart | WebSocket language param |
| app/lib/providers/capture_provider.dart | TranslationEvent handler |
| app/lib/widgets/transcript.dart | Translation display in conversation view |

Solution

1. Default auto-translate OFF

  • Change single_language_mode default to true for new users
  • Existing users with translation enabled keep their setting (no migration needed)
  • Users who want translation explicitly enable it in Settings
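
One way the "new default, no migration" rule could be resolved at read time is sketched below. The function name, the `is_new_user` flag, and the idea that an unset value is stored as `None` are all assumptions; the real user-settings storage may look different.

```python
from typing import Optional

LEGACY_DEFAULT = False  # current behavior: translation on by default
NEW_DEFAULT = True      # proposed: single_language_mode on, so translation is off


def resolve_single_language_mode(stored: Optional[bool], is_new_user: bool) -> bool:
    """Hypothetical read-time resolution of single_language_mode."""
    # An explicitly stored choice always wins, so no migration is needed.
    if stored is not None:
        return stored
    # Only accounts created after the change pick up the new default.
    return NEW_DEFAULT if is_new_user else LEGACY_DEFAULT
```

Under this sketch, existing users with no stored value keep the legacy behavior. If the default were instead flipped for everyone with an unset value, those users would lose translation until they re-enable it.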

2. Per-conversation toggle in live conversation UI

  • Add a translate button/toggle in the live conversation capturing screen
  • When the user turns it on mid-conversation, the backend starts translating from that point
  • Once enabled for a conversation, keep auto-translate on for that conversation permanently (sticky per-conversation)
  • Settings → Language still has the global option to turn it off entirely
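
The interaction between the sticky per-conversation toggle and the global setting can be sketched as a small state object. This is one possible reading of the bullets above: `TranslationPrefs`, `globally_disabled`, and `auto_translate` are hypothetical names, and treating the global option as a hard override is an assumption.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class TranslationPrefs:
    """Hypothetical per-user translation state (names are illustrative)."""

    globally_disabled: bool = False  # Settings -> Language hard off switch
    auto_translate: bool = False     # proposed default: off
    sticky_conversations: Set[str] = field(default_factory=set)

    def enable_for(self, conversation_id: str) -> None:
        # Sticky: once enabled, the conversation stays enabled.
        self.sticky_conversations.add(conversation_id)

    def should_translate(self, conversation_id: str) -> bool:
        if self.globally_disabled:
            return False
        return self.auto_translate or conversation_id in self.sticky_conversations
```

For example, after `prefs.enable_for("conv-1")`, `prefs.should_translate("conv-1")` is true while other conversations remain untranslated, and setting `globally_disabled` turns everything off again.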

3. Batch translation for on-demand activation

  • When the user enables translation mid-conversation, batch-translate existing segments (only segments created in the last 24h)
  • Use translate_units_batch() (already exists) for efficient batching instead of per-segment real-time calls
  • Cap retroactive translation to 24h window to prevent unbounded cost on long histories
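
The 24h-capped backfill could look roughly like the sketch below. The signature of translate_units_batch() is not shown in this issue, so the batch translator is passed in as a plain callable (`translate_batch`, a stand-in), and segments are modeled as dicts with assumed `text`, `created_at`, and `translation` keys.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List, Optional

RETRO_WINDOW = timedelta(hours=24)  # cap from the proposal


def backfill_translations(
    segments: List[Dict],
    translate_batch: Callable[[List[str]], List[str]],  # stand-in for translate_units_batch()
    now: Optional[datetime] = None,
) -> int:
    """Translate untranslated segments from the last 24h in a single batch call."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - RETRO_WINDOW
    eligible = [
        s for s in segments
        if s["created_at"] >= cutoff and not s.get("translation")
    ]
    if not eligible:
        return 0  # nothing to do, no API call
    translated = translate_batch([s["text"] for s in eligible])
    for seg, text in zip(eligible, translated):
        seg["translation"] = text
    return len(eligible)
```

Older segments are simply left untouched, which bounds the cost of enabling translation on a long history.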

4. Screen-aware deferral

  • Track screen on/off state from the mobile app (send as WebSocket metadata or periodic signal)
  • When screen is off: defer translation, accumulate segments
  • When screen turns on: batch-translate accumulated segments in one API call
  • Net effect: same UX (translations appear when user looks), far fewer API calls (1 batch vs N real-time)
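
A minimal sketch of the deferral described above: segments are buffered while the screen is off and flushed in one batch when it turns on. `ScreenAwareTranslator` and its methods are hypothetical, and `translate_batch` is again a stand-in callable rather than the real backend function.

```python
from typing import Callable, Dict, List, Tuple


class ScreenAwareTranslator:
    """Hypothetical buffer that defers translation while the screen is off."""

    def __init__(self, translate_batch: Callable[[List[str]], List[str]]):
        self._translate_batch = translate_batch
        self._screen_on = True
        self._pending: List[Tuple[str, str]] = []  # (segment_id, text)

    def on_segment(self, segment_id: str, text: str) -> Dict[str, str]:
        if not self._screen_on:
            self._pending.append((segment_id, text))  # defer, no API call
            return {}
        return self._run([(segment_id, text)])

    def on_screen_state(self, screen_on: bool) -> Dict[str, str]:
        self._screen_on = screen_on
        if screen_on and self._pending:
            pending, self._pending = self._pending, []
            return self._run(pending)  # one batch call for everything deferred
        return {}

    def _run(self, items: List[Tuple[str, str]]) -> Dict[str, str]:
        translated = self._translate_batch([text for _, text in items])
        return {sid: tr for (sid, _), tr in zip(items, translated)}
```

With N segments arriving while the screen is off, this makes one batch call instead of N real-time calls; a production version would likely also cap the buffer size, which is omitted here.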

5. Desktop: skip translation entirely

  • Desktop app doesn't display translations — don't request them
  • Either: desktop sends single_language_mode=true override in WebSocket params, or backend checks source=desktop and skips translation
  • Remove unused google-cloud-translate from pusher/requirements.txt (dead dependency)
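
The backend-side variant of the desktop skip could be a single early return in the language-determination step. The function name and the `source` parameter value `"desktop"` are assumptions taken from the wording above, not from the actual code in backend/routers/transcribe.py.

```python
from typing import Optional


def resolve_translation_language(
    language: Optional[str],
    single_language_mode: bool,
    source: Optional[str] = None,
) -> Optional[str]:
    """Return the target language, or None when translation should be skipped."""
    # Desktop never displays translations, so never request them (assumed flag).
    if source == "desktop":
        return None
    # Existing rule: translate only when single_language_mode is off
    # and a language preference exists.
    if single_language_mode or not language:
        return None
    return language
```

Returning `None` early would also let the caller avoid instantiating the coordinator at all for that session.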

Impact

  • Cost: Major reduction in GCP Translation API spend — most users are monolingual and currently trigger the coordinator unnecessarily
  • UX: No degradation — users who need translation get it on-demand with the same seamless experience
  • Risk: Existing users who rely on always-on translation but have no explicitly stored setting would fall back to the new default and need to re-enable it once (one-time)

by AI for @beastoin
