feat(llm): direct Claude Haiku 4.5 backend with prompt caching

Adds a parallel LLM backend that bypasses OpenClaw and talks to
Anthropic Messages API directly. Selected via LLM_BACKEND=claude in
.env; default remains openclaw so nothing breaks for existing setup.

Why: OpenClaw gateway adds 500-1000ms overhead on every turn (auth,
memory fetch, routing). Direct Haiku 4.5 + prompt caching = faster
first token and -90% cost on cached chunks.

- satellite/llm_claude.py — Anthropic SDK streaming client, prompt
  caching on system prompt and all-but-last-2 history messages, per
  agent+date JSON history in HISTORY_DIR, reset_history() for the
  'сбрось' command, per-agent system prompts (Cosmo / Люся), fallback
  to error event if SDK/key missing.
- satellite/llm.py — dispatches to ask_claude_stream when backend=claude,
  exports LLM_BACKEND so modes.py can route reset too.
- satellite/modes.py — _handle_reset calls reset_history when backend
  is claude, keeps /new POST for openclaw.
- requirements.txt — anthropic >= 0.50.0
- .env.example — LLM_BACKEND, ANTHROPIC_API_KEY, ANTHROPIC_MODEL,
  HISTORY_DIR, MAX_HISTORY, HTTPS_PROXY block for non-RU egress.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

This commit is contained in:

Cosmo

2026-04-23 13:12:39 +00:00

parent 584e21923c

commit 05de9c284b

5 changed files with 300 additions and 20 deletions

3

requirements.txt

View File

@@ -11,6 +11,9 @@ webrtcvad-wheels
 # STT через облако
 groq
 # LLM — прямой Claude (альтернатива OpenClaw, активируется LLM_BACKEND=claude)
 anthropic>=0.50.0
 # TTS
 elevenlabs

feat(llm): direct Claude Haiku 4.5 backend with prompt caching

3 requirements.txt Unescape Escape View File

3

requirements.txt

View File