name: sag
description: ElevenLabs text-to-speech with mac-style say UX.
homepage: https://sag.sh
metadata:
  {
    "openclaw":
      {
        "emoji": "🔊",
        "requires": { "bins": ["sag"], "env": ["ELEVENLABS_API_KEY"] },
        "primaryEnv": "ELEVENLABS_API_KEY",
        "install":
          [
            {
              "id": "brew",
              "kind": "brew",
              "formula": "steipete/tap/sag",
              "bins": ["sag"],
              "label": "Install sag (brew)",
            },
          ],
      },
  }

sag

Use sag for ElevenLabs TTS with local playback.

API key (required)
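
The metadata above lists the `sag` binary and the `ELEVENLABS_API_KEY` environment variable as requirements. A minimal preflight sketch that reports which of the two is missing; `check_sag_env` is a hypothetical helper name, not something sag provides:

```shell
# Preflight sketch: report what is missing before calling sag.
# check_sag_env is a hypothetical helper, not part of sag itself.
check_sag_env() {
  missing=0
  if ! command -v sag >/dev/null 2>&1; then
    # Formula name taken from the install metadata above.
    echo "missing: sag binary (brew install steipete/tap/sag)"
    missing=1
  fi
  if [ -z "${ELEVENLABS_API_KEY:-}" ]; then
    echo "missing: ELEVENLABS_API_KEY"
    missing=1
  fi
  return "$missing"
}

check_sag_env || echo "sag is not ready yet"
```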

Quick start

Model notes

Pronunciation + delivery rules

First fix: respell (e.g. "key-note"), add hyphens, adjust casing.
Numbers/units/URLs: --normalize auto (or off if it harms names).
Language bias: --lang en|de|fr|... to guide normalization.
v3: SSML <break> not supported; use [pause], [short pause], [long pause].
v2/v2.5: SSML <break time="1.5s" /> supported; <phoneme> not exposed in sag.
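
The pause rules above differ by model family, so it can help to pick the markup in one place. A small sketch, assuming a hypothetical helper `pause_for` (not part of sag) and a default break length of 1.5s; how the model itself is selected is not shown here:

```shell
# pause_for is a hypothetical helper: prints pause markup for a model family.
#   v3       -> bracketed tag, since SSML <break> is not supported
#   v2, v2.5 -> SSML break with the given duration (default 1.5s)
pause_for() {
  case "$1" in
    v3) printf '%s' '[pause]' ;;
    v2|v2.5) printf '<break time="%s" />' "${2:-1.5s}" ;;
    *) return 1 ;;  # unknown family: let the caller decide
  esac
}

text="Welcome to the key-note. $(pause_for v3) Doors open at 9."
echo "$text"

# With sag installed and ELEVENLABS_API_KEY set, the text could then be
# spoken with the flags listed above, for example:
#   sag --normalize auto --lang en "$text"
```

The respelling "key-note" follows the first rule in the list; `--normalize off` would be the fallback if normalization mangles a name.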

v3 audio tags (put at the start of a line)

Voice defaults

Confirm voice + speaker before long output.

When the user asks for a "voice" reply (e.g., "crazy scientist voice", "explain in voice"), generate audio and send it:

# Generate audio file
sag -v Clawd -o /tmp/voice-reply.mp3 "Your message here"

# Then include in reply:
# MEDIA:/tmp/voice-reply.mp3
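
The two steps above can be wrapped so the `MEDIA:` line is only emitted when a file was actually written. This is a sketch under assumptions: `voice_reply` is a hypothetical helper name, and the non-empty-file check is an added safeguard rather than documented sag behavior.

```shell
# voice_reply is a hypothetical wrapper around the command shown above.
# Usage: voice_reply VOICE OUTPUT_PATH TEXT
voice_reply() {
  voice=$1
  out=$2
  text=$3
  sag -v "$voice" -o "$out" "$text" || return 1
  # Only announce the file if something was written to it.
  [ -s "$out" ] || return 1
  echo "MEDIA:$out"
}

# Run only when sag and the API key are available.
if command -v sag >/dev/null 2>&1 && [ -n "${ELEVENLABS_API_KEY:-}" ]; then
  voice_reply Clawd /tmp/voice-reply.mp3 "Your message here"
else
  echo "sag or ELEVENLABS_API_KEY missing; skipping"
fi
```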

Voice character tips:

Crazy scientist: Use [excited] tags and dramatic pauses ([short pause]); vary intensity
Calm: Use [whispers] or slower pacing
Dramatic: Use [sings] or [shouts] sparingly
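
Since v3 audio tags go at the start of a line, a script for a character voice can be tagged line by line. A minimal sketch, assuming a hypothetical helper `tag_lines` (not part of sag) that reads text on stdin:

```shell
# tag_lines is a hypothetical helper: prefixes each non-empty input line
# with the given audio tag and passes blank lines through unchanged.
tag_lines() {
  while IFS= read -r line || [ -n "$line" ]; do
    if [ -n "$line" ]; then
      printf '%s %s\n' "$1" "$line"
    else
      printf '\n'
    fi
  done
}

printf '%s\n' 'It is alive!' 'Nobody believed me.' | tag_lines '[excited]'
```

The tagged text can then be passed to sag as the message. For a calm voice, the same helper works with `[whispers]`.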

Default voice for Clawd: lj2rcrvANS3gaWWnczSX (or just -v Clawd)