Commit Graph

3 Commits

Author SHA1 Message Date
Cosmo
e96e7a1342 feat(voice): tool endpoints, timer widget, clean Siri-style overlay
All checks were successful
Deploy / deploy (push) Successful in 3m18s
Adds the infrastructure for Claude tool use + visual timer.

Tablet API surface (all bearer-authed with VOICE_API_KEY, middleware bypassed):
- /api/voice/tools/weather    — current + short forecast via Open-Meteo
- /api/voice/tools/transport  — tram arrivals by direction / route filter
- /api/voice/tools/events     — Google Calendar today/week
- /api/voice/tools/notes      — notes + shopping lists
- /api/voice/timer            — start (with seconds+label), cancel; GET list (cookie ok)
  Active timers persisted at /data/tablet-timers.json

UI:
- VoiceOverlay stripped to minimal Siri look: no agent emoji/name, just the
  pulsing orb (3-layer radial gradient, independent breath animations),
  subtle status label on wake only, transcription/response text centered.
  Agents distinguished by orb color (Cosmo indigo/violet, Люся pink).
- TimerWidget: bottom-right chip stack with countdown, progress bar, turns
  amber in last 10s. On expiry, fires fullscreen alarm overlay with beep
  (WebAudio osc) + Остановить button.

Other:
- lib/timers.ts — persistent timer store in /data
- lib/voice-tools.ts — shared bearer-auth helper
- middleware — bypass list now covers /api/voice/tools/* and /api/voice/timer
2026-04-23 13:33:31 +00:00
Cosmo
a780fc7bd5 feat(voice): play TTS through tablet speakers via ElevenLabs proxy
All checks were successful
Deploy / deploy (push) Successful in 2m58s
Stage 2 of voice integration — centralizes TTS on the tablet so the
Python satellite no longer needs ElevenLabs credentials or mpv.

- app/api/voice/tts — POST {text, agent}, proxies to ElevenLabs
  streaming endpoint with flash_v2_5 default, returns audio/mpeg.
  Per-agent voice id via COSMO_TTS_VOICE / LUSYA_TTS_VOICE env.
- VoiceOverlay — on response/error events fetches TTS and plays via
  HTMLAudioElement; on wake event stops playback (barge-in). Dismiss
  timer extended by text length so long responses do not cut off.
- Autoplay caveat: browser may block first playback until user taps
  anywhere on the page (FKB: enable Force Autoplay to bypass).
2026-04-23 12:52:26 +00:00
Cosmo
51c3d6016a feat(voice): SSE bridge + Siri-blob overlay for wake-word script
All checks were successful
Deploy / deploy (push) Successful in 3m12s
Adds the tablet side of voice assistant integration. External Python
script (openWakeWord + Groq STT + OpenClaw) will POST state transitions
to /api/voice/event with a bearer token, and the tablet shows a
fullscreen overlay with Siri-style animated blob + current agent +
recognized text / response text.

- lib/voice-bus.ts — in-process EventEmitter singleton, preserved
  across hot reloads via globalThis
- app/api/voice/event — POST, bearer-auth via VOICE_API_KEY env,
  validates event kind, broadcasts on voiceBus
- app/api/voice/stream — GET, SSE endpoint, per-connection listener
  with 15s keep-alive ping and abort-signal cleanup
- components/VoiceOverlay — full-screen overlay, 3-layer pulsing
  Siri blob, per-agent palette (cosmo indigo/violet, lusya pink/rose),
  auto-dismiss timeouts (wake=20s safety, response=6s, error=4s),
  auto-reconnect on SSE drop
- middleware bypasses /api/voice/event so the script does not need
  a user auth cookie
- VoiceOverlay mounted in HomePageInner outside tab routing so it
  appears on every view
2026-04-23 12:36:26 +00:00