Commit Graph

24 Commits

Author SHA1 Message Date
Cosmo
05de9c284b feat(llm): direct Claude Haiku 4.5 backend with prompt caching
Adds a parallel LLM backend that bypasses OpenClaw and talks to
Anthropic Messages API directly. Selected via LLM_BACKEND=claude in
.env; default remains openclaw so nothing breaks for existing setup.

Why: OpenClaw gateway adds 500-1000ms overhead on every turn (auth,
memory fetch, routing). Direct Haiku 4.5 + prompt caching = faster
first token and -90% cost on cached chunks.

- satellite/llm_claude.py — Anthropic SDK streaming client, prompt
  caching on system prompt and all-but-last-2 history messages, per
  agent+date JSON history in HISTORY_DIR, reset_history() for the
  'сбрось' command, per-agent system prompts (Cosmo / Люся), fallback
  to error event if SDK/key missing.
- satellite/llm.py — dispatches to ask_claude_stream when backend=claude,
  exports LLM_BACKEND so modes.py can route reset too.
- satellite/modes.py — _handle_reset calls reset_history when backend
  is claude, keeps /new POST for openclaw.
- requirements.txt — anthropic >= 0.50.0
- .env.example — LLM_BACKEND, ANTHROPIC_API_KEY, ANTHROPIC_MODEL,
  HISTORY_DIR, MAX_HISTORY, HTTPS_PROXY block for non-RU egress.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 13:12:39 +00:00
Cosmo
584e21923c feat(notifier): route TTS to tablet when TABLET_TTS_ENABLED
When TABLET_URL and VOICE_API_KEY are set, the tablet handles TTS
via its ElevenLabs proxy — local speak() is skipped. Controlled
by TABLET_TTS_ENABLED (default true when tablet is configured).

- notifier.speak_locally() — gate used by all local speech paths
- llm._maybe_speak — no-op when tablet plays the voice
- modes._handle_reset — emits response event and skips local speak
  when tablet TTS is on; keeps spoken fallback otherwise

Tablet side in smart-home-tablet repo: /api/voice/tts endpoint +
VoiceOverlay audio playback (commit ba2e… pending).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 12:52:34 +00:00
Cosmo
e4e7529063 feat(notifier): push state events to Smart Home Tablet overlay
Adds a thin HTTP bridge so the tablet at https://tablet.digital-home.site
shows a Siri-style overlay reflecting the current assistant state
(wake / command / response / idle / error). Non-fatal: if the tablet
is offline or TABLET_URL/VOICE_API_KEY are unset, events are silently
skipped and the assistant keeps working.

- satellite/notifier.py — POST /api/voice/event with bearer token,
  reused requests.Session for keep-alive, 1.5s timeout
- satellite/modes.py — emits wake on activation, command after STT,
  response after LLM, idle on timeout
- satellite/llm.py — emits error on gateway connection/timeout/HTTP
- .env.example documents TABLET_URL and VOICE_API_KEY

Tablet side (separate repo smart-home-tablet, commit 51c3d60) exposes
POST /api/voice/event + GET /api/voice/stream (SSE) and renders a
full-screen overlay in components/VoiceOverlay.tsx.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 12:43:01 +00:00
a9001aef92 refactor: VAD upgrade, retry, dead code cleanup, AGENT removal
- audio: switch VAD to webrtcvad with RMS gate + fallback to RMS
- audio: honor FOLLOWUP_TIMEOUT — short silence wait after bot response
- llm: retry with exponential backoff on network errors and 5xx
- llm: VOICE_MAX_TOKENS env (default 300) instead of hardcoded 150
- tts: optional VAD-based barge-in (BARGE_IN_ENABLED, off by default)
- tts: remove dead start_barge_in_listener / was_barge_in helpers
- config: drop AGENT/LUSYA_AGENT — routing happens via session_key
- modes: remove unused imports, pass FOLLOWUP_TIMEOUT to follow-up record()
- docs: full rewrite of README and CLAUDE.md to match current architecture
2026-04-16 17:10:59 +03:00
Cosmo
a885cbe74b feat: VAD-based barge-in during TTS playback 2026-04-14 15:28:14 +00:00
Cosmo
cdf8748e48 feat: VAD-based barge-in during TTS playback 2026-04-14 15:28:12 +00:00
Cosmo
cd921e1540 fix: strip emoji from TTS text in clean_for_speech 2026-04-14 15:26:21 +00:00
3301b3559d Merge branch 'main' of https://git.digital-home.site/daniil/home-voice-assistant 2026-04-14 18:23:41 +03:00
0f4ae3a80c Edit new session 2026-04-14 18:22:01 +03:00
Cosmo
cc9de661cc feat: barge-in support — stop TTS when wake word detected during playback 2026-04-14 15:02:00 +00:00
Cosmo
182e7875ab fix: strip filler phrases from agent response before TTS 2026-04-14 11:49:58 +00:00
a0618c961d Add russian translate 2026-04-14 14:40:52 +03:00
cc8cbefe18 Delete logs 2026-04-14 13:45:38 +03:00
Cosmo
24c8e38be6 fix: replace VOICE_SESSION_KEY with COSMO_SESSION_KEY and LUSYA_SESSION_KEY 2026-04-14 09:45:48 +00:00
09d22177cd Edit voice settings 2026-04-13 23:28:41 +03:00
0494c24c47 Delete conversation from modes 2026-04-13 23:19:18 +03:00
28cccbdac1 Merge pull request 'feat: route voice through OpenClaw agent session (full memory + tools)' (#1) from feature/openclaw-agent-session into main
Reviewed-on: #1
2026-04-13 20:15:02 +00:00
Cosmo
d9d892664a feat: route voice requests through OpenClaw agent session
- Remove local Conversation history (now managed by gateway)
- Use x-openclaw-session-key for persistent agent sessions
- Agent now has full context: SOUL.md, MEMORY.md, tools
- Add VOICE_SESSION_KEY env var (default: agent:main:voice:home)
- Backward compatible: conv parameter kept for compatibility
2026-04-13 20:12:01 +00:00
a836bbb848 Add wakeword cosmo model v1 2026-04-13 18:33:53 +03:00
7239f85506 Edit tts mode 2026-04-13 18:33:19 +03:00
780f6f0084 Switch wake word from Porcupine to openwakeword + training pipeline
- Add training/ pipeline (step_1..step_5) and own-samples flow
- record_wav.py with single-shot and long-record modes, RMS-based silence filter
- remove_silent.py to drop silent samples and renumber
- modes.py: openwakeword inference with reset() and quiet predictions; commented Lusya block for later
- stt.py: drop local faster-whisper fallback, Groq-only
- config.py: remove unused STT_PROVIDER/WHISPER_*
- llm.py: replace __import__("os") hack with proper import
- tts.py: remove debug traceback in play_error_sound
- requirements.txt: add openwakeword/sounddevice/scipy, drop faster-whisper
- deploy/setup.sh: validate ELEVENLABS_API_KEY and WAKE_WORD_COSMO presence
- README.md, CLAUDE.md, project_roadmap memory updated to reflect new architecture
2026-04-13 15:40:44 +03:00
0a89bf5105 Edit code for success run 2026-04-12 21:58:40 +03:00
d.klimov
128cc70ab9 Add voice messages 2026-04-12 15:59:25 +03:00
7ca8268b78 Initial commit: Cosmo Voice Satellite
Two-agent voice assistant (Cosmo + Люся) via OpenClaw Gateway.
Streaming STT (Groq) + LLM + TTS (ElevenLabs) pipeline with
keep-alive sessions, barge-in, and daily conversation sessions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 13:34:08 +03:00