feat(llm): direct Claude Haiku 4.5 backend with prompt caching
Adds a parallel LLM backend that bypasses OpenClaw and talks to the Anthropic Messages API directly. It is selected via LLM_BACKEND=claude in .env; the default remains openclaw, so nothing breaks for existing setups.

Why: the OpenClaw gateway adds 500-1000 ms of overhead on every turn (auth, memory fetch, routing). Direct Haiku 4.5 plus prompt caching gives a faster first token and 90% lower cost on cached chunks.

- satellite/llm_claude.py: Anthropic SDK streaming client; prompt caching on the system prompt and on all but the last 2 history messages; per agent+date JSON history in HISTORY_DIR; reset_history() for the 'сбрось' ("reset") command; per-agent system prompts (Cosmo / Люся); falls back to an error event if the SDK or API key is missing.
- satellite/llm.py: dispatches to ask_claude_stream when backend=claude; exports LLM_BACKEND so modes.py can route reset too.
- satellite/modes.py: _handle_reset calls reset_history when the backend is claude and keeps the /new POST for openclaw.
- requirements.txt: anthropic >= 0.50.0
- .env.example: LLM_BACKEND, ANTHROPIC_API_KEY, ANTHROPIC_MODEL, HISTORY_DIR, MAX_HISTORY, and an HTTPS_PROXY block for non-RU egress.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
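The caching layout described in the commit message (system prompt plus all but the last 2 history messages) could be sketched roughly as follows. This is not the code from satellite/llm_claude.py: `build_request`, `UNCACHED_TAIL`, and the `max_tokens` value are hypothetical names and values chosen for illustration. The only API detail relied on is the Messages API `cache_control: {"type": "ephemeral"}` marker on content blocks.

```python
from typing import Any

# Assumption: the last two history messages stay outside the cached prefix,
# mirroring the "all-but-last-2" wording of the commit message.
UNCACHED_TAIL = 2


def build_request(model: str, system_prompt: str,
                  history: list[dict[str, str]]) -> dict[str, Any]:
    """Build Messages API keyword arguments with two cache breakpoints.

    Hypothetical helper: `history` is a list of {"role": ..., "content": ...}
    dicts with plain-string content.
    """
    # Breakpoint 1: the system prompt, sent as a text block with cache_control.
    system = [{
        "type": "text",
        "text": system_prompt,
        "cache_control": {"type": "ephemeral"},
    }]

    # Convert string content to block form so a block can carry cache_control.
    messages = [
        {"role": m["role"], "content": [{"type": "text", "text": m["content"]}]}
        for m in history
    ]

    # Breakpoint 2: the last message of the stable prefix, i.e. everything
    # except the final UNCACHED_TAIL messages. Short histories get no marker.
    cut = len(messages) - UNCACHED_TAIL
    if cut > 0:
        messages[cut - 1]["content"][-1]["cache_control"] = {"type": "ephemeral"}

    return {
        "model": model,
        "max_tokens": 1024,  # illustrative value, not taken from the commit
        "system": system,
        "messages": messages,
    }
```

Under these assumptions the returned dict would be passed to the SDK's streaming call, for example `client.messages.stream(**build_request(...))`. Marking the end of the stable prefix rather than the newest message means the breakpoint only moves when the history grows, which is presumably why the two most recent messages are left uncached.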
@@ -11,6 +11,9 @@ from . import notifier

 VOICE_SESSION_KEY = os.getenv("VOICE_SESSION_KEY", "agent:main:voice:home")

+# Feature flag: selects the LLM backend. openclaw (default) or claude (direct Anthropic).
+LLM_BACKEND = os.getenv("LLM_BACKEND", "openclaw").lower()
+
 # "stream": split by sentence (fast, but choppy intonation)
 # "full": collect the whole response, then run TTS (natural, but with a pause before it starts)
 TTS_MODE = os.getenv("TTS_MODE", "full")
@@ -65,7 +68,12 @@ def _post_with_retry(session, url, headers, payload):


 def ask_agent_stream(text: str, agent_id: str = "cosmo") -> str:
-    """Sends a request to the OpenClaw gateway and speaks the response."""
+    """Sends a request to the selected LLM backend and speaks the response."""
+    if LLM_BACKEND == "claude":
+        from .llm_claude import ask_claude_stream
+        return ask_claude_stream(text, agent_id)
+
+    # Otherwise, take the OpenClaw path (the old behaviour)
     def _maybe_speak(t: str):
         # If TTS runs on the tablet, skip local audio; the tablet reads the reply on the response event.
         if t.strip() and notifier.speak_locally():
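The diff above covers only the dispatch in satellite/llm.py; the per agent+date JSON history and reset_history() mentioned in the commit message live in satellite/llm_claude.py, which is not shown. A minimal sketch of how such a store might look is given below. The file naming scheme, the function signatures, and the reading of MAX_HISTORY as a cap on stored messages are all assumptions made for illustration, not the commit's actual code.

```python
import json
from datetime import date
from pathlib import Path


def history_path(history_dir: Path, agent_id: str, day: date) -> Path:
    # Assumed naming: one JSON file per agent per calendar day.
    return history_dir / f"{agent_id}_{day.isoformat()}.json"


def load_history(history_dir: Path, agent_id: str, day: date) -> list[dict]:
    """Return the stored messages, or an empty list if no file exists yet."""
    path = history_path(history_dir, agent_id, day)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def append_turn(history_dir: Path, agent_id: str, day: date,
                role: str, content: str, max_history: int = 20) -> list[dict]:
    """Append one message and keep only the newest `max_history` entries."""
    history = load_history(history_dir, agent_id, day)
    history.append({"role": role, "content": content})
    history = history[-max_history:]
    history_dir.mkdir(parents=True, exist_ok=True)
    history_path(history_dir, agent_id, day).write_text(
        # ensure_ascii=False keeps Cyrillic text readable in the file.
        json.dumps(history, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return history


def reset_history(history_dir: Path, agent_id: str, day: date) -> bool:
    """Delete the day's history file; return True if one was removed."""
    path = history_path(history_dir, agent_id, day)
    if path.exists():
        path.unlink()
        return True
    return False
```

Keying the file on both agent and date keeps Cosmo and Люся conversations separate and lets each day start with an empty context without an explicit reset.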