openWakeWord pipeline on onnxruntime-web, running directly on the tablet. Chain:
mic (16 kHz, AudioWorklet) → melspectrogram.onnx → embedding_model.onnx
(sliding 76-frame window, stride 8) → cosmo.onnx → score 0..1.
A score ≥ 0.5 triggers the same VAD flow that push-to-talk uses.
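A minimal sketch of the windowing step between the mel and embedding models, assuming frames arrive one at a time. FrameWindower and isTriggered are hypothetical names, not the actual WakeWordDetector internals; only the 76/8/0.5 constants come from this commit.

```typescript
// Constants taken from the commit message.
const WINDOW_FRAMES = 76;
const STRIDE_FRAMES = 8;
const THRESHOLD = 0.5;

// Hypothetical helper: keeps the last 76 mel frames and yields a window
// once the buffer is full, then again after every 8 new frames.
class FrameWindower {
  private frames: Float32Array[] = [];
  private total = 0;

  // Returns a copy of the current window when one is due, otherwise null.
  push(frame: Float32Array): Float32Array[] | null {
    this.frames.push(frame);
    if (this.frames.length > WINDOW_FRAMES) this.frames.shift();
    this.total++;
    const due =
      this.total >= WINDOW_FRAMES &&
      (this.total - WINDOW_FRAMES) % STRIDE_FRAMES === 0;
    return due ? this.frames.slice() : null;
  }
}

// The wake word fires when the classifier score reaches the threshold.
function isTriggered(score: number): boolean {
  return score >= THRESHOLD;
}
```

Each returned window would be fed to embedding_model.onnx; the resulting embeddings then go to cosmo.onnx, whose score is checked with isTriggered.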
- public/wake/: cosmo.onnx (custom-trained on Daniil's voice) +
melspectrogram.onnx + embedding_model.onnx (~2.9MB total).
- lib/wake-word.ts: WakeWordDetector class. ort is loaded on the client via
<script src=/vad/ort.wasm.min.js>, a workaround for next-swc failing to
parse import.meta.url in the onnxruntime-web .mjs builds.
- VoiceController: a tap activates (required for the AudioContext user gesture),
then it listens continuously for the wake word; on detection it hands off to
the MicVAD flow. Long tap = off. A manual tap remains as a fallback.
Once this is deployed, the Python agent on .103 is no longer needed, so
home-voice-assistant can be archived. Only the ElevenLabs proxy on :8888 stays on .103.
Step 1 of migrating the voice stack from home-voice-assistant into the tablet itself:
- /api/voice/chat: Claude Haiku 4.5 with a tool loop (max 4 rounds), prompt
caching on the system prompt + older history, history stored in /data/voice-history/.
Emits command/response/error to voice-bus, so the orb blinks as before.
- /api/voice/stt: Groq whisper-large-v3-turbo, multipart or raw audio.
- lib/voice-text.ts: port of clean_for_speech (without pymorphy3; times are
rendered in the nominative case) plus strip_fillers + RESET_PATTERNS.
- lib/voice-executors.ts: tool executors via loopback fetch to the
existing /api/voice/tools/* and /api/voice/timer.
- Support for ANTHROPIC_PROXY/GROQ_PROXY (falling back to HTTPS_PROXY).
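A rough sketch of the bounded tool loop, not the actual /api/voice/chat code. The model and tool calls are modelled as plain synchronous callbacks to keep it short (the real ones would be async network calls). runToolLoop, ModelTurn, ToolCall and FALLBACK_REPLY are hypothetical, and counting each model call as one round is an assumption.

```typescript
// Hypothetical shapes; the real route talks to the Anthropic API.
type ToolCall = { name: string; input: unknown };
type ModelTurn = { text: string; toolCalls: ToolCall[] };

const MAX_ROUNDS = 4; // from the commit message
// Assumption: what to say when the round budget runs out is not specified.
const FALLBACK_REPLY = "Sorry, I could not finish that request.";

function runToolLoop(
  userText: string,
  callModel: (transcript: string[]) => ModelTurn,
  runTool: (call: ToolCall) => string
): { reply: string; rounds: number } {
  const transcript = [`user: ${userText}`];
  for (let round = 1; round <= MAX_ROUNDS; round++) {
    const turn = callModel(transcript);
    // No tool requests means the model has produced its final answer.
    if (turn.toolCalls.length === 0) {
      return { reply: turn.text, rounds: round };
    }
    // Execute every requested tool and feed the results back to the model.
    for (const call of turn.toolCalls) {
      transcript.push(`tool ${call.name}: ${runTool(call)}`);
    }
  }
  // Budget exhausted: stop instead of looping forever.
  return { reply: FALLBACK_REPLY, rounds: MAX_ROUNDS };
}
```

The cap matters on a voice path: a model that keeps requesting tools would otherwise leave the user waiting with no spoken response.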
After deploy, GROQ_API_KEY and ANTHROPIC_API_KEY must be set in tablet.env.
Steps 2 (push-to-talk in the browser) and 3 (wake-word) come separately.
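The proxy fallback listed above could look roughly like this. resolveProxy is a hypothetical helper, and treating an empty value as unset is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Prefer the service-specific proxy, then fall back to HTTPS_PROXY.
function resolveProxy(
  service: "ANTHROPIC" | "GROQ",
  env: Env
): string | undefined {
  const specific = env[`${service}_PROXY`];
  if (specific && specific.trim() !== "") return specific;
  const shared = env["HTTPS_PROXY"];
  return shared && shared.trim() !== "" ? shared : undefined;
}
```

In the route handlers this would be called with process.env; undefined means a direct connection.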
Bug: after a page reload, the «Таймер прозвенел» ("Timer went off") overlay
kept reopening. Two causes:
- dismissTimer in TimerWidget removed the timer only from local
useState, leaving /data/tablet-timers.json untouched. After a
reload the timer came back in the list, and firedRef (which is empty
after a reload) triggered the alarm again.
- lib/timers.ts kept expired timers for 30 minutes, giving them a chance
to fire again on every reload within that window.
Fix:
- dismissTimer now sends POST /api/voice/timer {action:cancel, id} with
cookie auth (the endpoint from the previous commit accepts both cookie and bearer).
- Retention in listActive reduced to 30 seconds: enough for the client
to see a fresh alarm; anything older is deleted automatically.
- The TimerWidget client-side filter is also 30 seconds.
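A sketch of the 30-second retention rule. The Timer shape and the listActive signature here are assumptions for illustration; the real function in lib/timers.ts also persists to /data/tablet-timers.json.

```typescript
// Assumed shape: endsAt is a Unix timestamp in milliseconds.
interface Timer {
  id: string;
  label: string;
  endsAt: number;
}

const RETENTION_MS = 30 * 1000; // from the commit message

// Keeps running timers plus those that expired within the retention window.
function listActive(timers: Timer[], now: number): Timer[] {
  return timers.filter((t) => now - t.endsAt <= RETENTION_MS);
}
```

With the window this short, a reload more than 30 seconds after the alarm no longer sees the expired timer, so it cannot fire again.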
UI:
- Replace Notes column on Home bento with TimerHomeWidget. Shows all
active timers as stacked cards with big 30px countdowns, per-timer
+1/-1 minute buttons and cancel. Colors: indigo default, amber in
last 10s, red when expired. Empty state suggests voice command.
- Existing chip TimerWidget (bottom-right) kept for ambient view on
other tabs — redundant on Home, but harmless.
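The color rule for the countdown cards can be sketched as a pure function. timerColor is a hypothetical name, and the exact boundary handling at 10s and 0s is an assumption.

```typescript
type TimerColor = "indigo" | "amber" | "red";

// indigo by default, amber in the last 10 seconds, red once expired.
function timerColor(endsAt: number, now: number): TimerColor {
  const remainingMs = endsAt - now;
  if (remainingMs <= 0) return "red";
  if (remainingMs <= 10 * 1000) return "amber";
  return "indigo";
}
```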
API:
- /api/voice/timer accepts cookie OR bearer (browser widget cancel
works with user's auth_token cookie; Python script uses bearer).
- New action 'adjust' — shifts endsAt by delta_seconds. Clamps so
endsAt never goes into the past.
- Cancel now supports {label} in addition to {id} (fuzzy substring
match, most-recently-started wins). Emits timer_cancel with id+label
so clients can refresh.
- findByLabel / adjustTimer helpers in lib/timers.ts.
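A minimal sketch of the two helpers, under stated assumptions: timestamps are Unix milliseconds, the Timer shape is illustrative, and the label match is case-insensitive (the commit only says "fuzzy substring"). The real versions in lib/timers.ts may differ.

```typescript
interface Timer {
  id: string;
  label: string;
  startedAt: number;
  endsAt: number;
}

// Shifts endsAt by deltaSeconds, clamping so it never lands before `now`.
function adjustTimer(timer: Timer, deltaSeconds: number, now: number): Timer {
  const shifted = timer.endsAt + deltaSeconds * 1000;
  return { ...timer, endsAt: Math.max(shifted, now) };
}

// Substring match on the label; the most recently started timer wins.
function findByLabel(timers: Timer[], query: string): Timer | undefined {
  const needle = query.trim().toLowerCase();
  if (needle === "") return undefined;
  return timers
    .filter((t) => t.label.toLowerCase().includes(needle))
    .sort((a, b) => b.startedAt - a.startedAt)[0];
}
```

For example, pressing -1 minute on a timer with 20 seconds left leaves endsAt equal to now rather than 40 seconds in the past.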
Tool endpoints (events, notes, transport, weather) call other /api/*
routes via loopback (http://localhost:3000). Those routes are
middleware-protected — cookie-less loopbacks were getting 401, which
surfaced to the voice agent as get_today_events → tool_http_502.
Add internal header bypass: middleware lets the request through when
x-voice-internal matches VOICE_API_KEY. Only our own tool endpoints
use this header, from inside the same container, so the blast radius
is limited to loopback traffic.
- middleware.ts: check x-voice-internal before cookie
- lib/voice-tools.ts: internalHeaders() helper
- app/api/voice/tools/{weather,transport,events,notes}: use it
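The bypass check might look roughly like this. isInternalVoiceRequest and the HeaderReader type are hypothetical; internalHeaders is named in this commit, but its signature here is assumed. The real middleware.ts operates on a NextRequest.

```typescript
// Minimal structural type so the sketch does not depend on Next.js types.
type HeaderReader = { get(name: string): string | null };

// Caller side: header attached to loopback fetches from the tool endpoints.
function internalHeaders(voiceApiKey: string): Record<string, string> {
  return { "x-voice-internal": voiceApiKey };
}

// Middleware side: true only when the header matches a configured key.
function isInternalVoiceRequest(
  headers: HeaderReader,
  voiceApiKey: string | undefined
): boolean {
  // Never bypass when the key is unset, or a missing header would "match".
  if (!voiceApiKey) return false;
  return headers.get("x-voice-internal") === voiceApiKey;
}
```

The middleware would run this check before the cookie check and let the request through on true.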
Adds the infrastructure for Claude tool use + visual timer.
Tablet API surface (all bearer-authed with VOICE_API_KEY, middleware bypassed):
- /api/voice/tools/weather — current + short forecast via Open-Meteo
- /api/voice/tools/transport — tram arrivals by direction / route filter
- /api/voice/tools/events — Google Calendar today/week
- /api/voice/tools/notes — notes + shopping lists
- /api/voice/timer — start (with seconds+label), cancel; GET list (cookie ok)
Active timers persisted at /data/tablet-timers.json
UI:
- VoiceOverlay stripped to minimal Siri look: no agent emoji/name, just the
pulsing orb (3-layer radial gradient, independent breath animations),
subtle status label on wake only, transcription/response text centered.
Agents distinguished by orb color (Cosmo indigo/violet, Люся pink).
- TimerWidget: bottom-right chip stack with countdown, progress bar, turns
amber in last 10s. On expiry, fires fullscreen alarm overlay with beep
(WebAudio osc) + "Остановить" (Stop) button.
Other:
- lib/timers.ts — persistent timer store in /data
- lib/voice-tools.ts — shared bearer-auth helper
- middleware — bypass list now covers /api/voice/tools/* and /api/voice/timer
Adds the tablet side of voice assistant integration. External Python
script (openWakeWord + Groq STT + OpenClaw) will POST state transitions
to /api/voice/event with a bearer token, and the tablet shows a
fullscreen overlay with Siri-style animated blob + current agent +
recognized text / response text.
- lib/voice-bus.ts — in-process EventEmitter singleton, preserved
across hot reloads via globalThis
- app/api/voice/event — POST, bearer-auth via VOICE_API_KEY env,
validates event kind, broadcasts on voiceBus
- app/api/voice/stream — GET, SSE endpoint, per-connection listener
with 15s keep-alive ping and abort-signal cleanup
- components/VoiceOverlay — full-screen overlay, 3-layer pulsing
Siri blob, per-agent palette (cosmo indigo/violet, lusya pink/rose),
auto-dismiss timeouts (wake=20s safety, response=6s, error=4s),
auto-reconnect on SSE drop
- middleware bypasses /api/voice/event so the script does not need
a user auth cookie
- VoiceOverlay mounted in HomePageInner outside tab routing so it
appears on every view
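A sketch of the globalThis singleton pattern that lets the bus survive hot reloads. To stay self-contained it uses a tiny hand-written VoiceBus instead of Node's EventEmitter (which the commit actually uses); VoiceEvent, getVoiceBus and the __voiceBus key are hypothetical names.

```typescript
// Assumed event shape for illustration.
type VoiceEvent = { kind: string; text?: string };
type Listener = (event: VoiceEvent) => void;

// Stand-in for EventEmitter with just subscribe and emit.
class VoiceBus {
  private listeners = new Set<Listener>();

  // Registers a listener and returns an unsubscribe function.
  on(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(event: VoiceEvent): void {
    this.listeners.forEach((listener) => listener(event));
  }
}

// Module state is reset when a module is re-evaluated on hot reload, but
// globalThis is not, so the instance is parked there.
const holder = globalThis as typeof globalThis & { __voiceBus?: VoiceBus };

function getVoiceBus(): VoiceBus {
  if (!holder.__voiceBus) holder.__voiceBus = new VoiceBus();
  return holder.__voiceBus;
}
```

The SSE route would call on() per connection and invoke the returned unsubscribe function from its abort handler, while the POST route calls emit().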