Speech-to-text transcription using Groq Whisper API. Supports m4a, mp3, wav, ogg, flac, webm.
Usage
After installing, this skill will be available to your AI coding assistant.
Verify installation:
npx agent-skills-cli list
Skill Instructions
name: transcribe
description: Speech-to-text transcription using Groq Whisper API. Supports m4a, mp3, wav, ogg, flac, webm.
Transcribe
Speech-to-text using the Groq Whisper API.
Setup
The script requires the GROQ_API_KEY environment variable. Check whether it is already set:
echo $GROQ_API_KEY
If not set, guide the user through setup:
- Ask if they have a Groq API key
- If not, have them sign up at https://console.groq.com/ and create an API key
- Have them add the following line to their shell profile (~/.zshrc or ~/.bashrc):
  export GROQ_API_KEY="<their-api-key>"
- Then run source ~/.zshrc (or the profile file they edited), or restart the terminal
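The check above prints the key itself, which may end up in terminal scrollback or logs. A minimal sketch of an alternative that only reports whether the variable is set; check_groq_key is a hypothetical helper name and is not part of the skill:

```shell
# Report whether GROQ_API_KEY is set, without printing its value.
# check_groq_key is a hypothetical helper name, not part of the skill.
check_groq_key() {
  if [ -n "${GROQ_API_KEY:-}" ]; then
    echo "GROQ_API_KEY is set"
  else
    echo "GROQ_API_KEY is not set" >&2
    return 1
  fi
}
```

The function returns a non-zero status when the key is missing, so it can gate the setup steps, for example: check_groq_key || echo "follow the setup steps above".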
Usage
{baseDir}/transcribe.sh <audio-file>
Supported Formats
- m4a, mp3, wav, ogg, flac, webm
- Max file size: 25MB
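These limits can be checked locally before any upload is attempted. The following is a sketch only: can_transcribe is a hypothetical helper that is not part of the skill, and it assumes that 25MB means 25 * 1024 * 1024 bytes, which the source does not specify.

```shell
# Preflight check: supported extension and size within the 25MB limit.
# can_transcribe is a hypothetical helper, not part of the skill.
# Assumption: 25MB is taken to mean 25 * 1024 * 1024 bytes.
can_transcribe() {
  local file=$1
  local max_bytes=$((25 * 1024 * 1024))
  local ext size

  [ -f "$file" ] || { echo "not a file: $file" >&2; return 1; }

  # Compare the lowercased extension against the supported list.
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    m4a|mp3|wav|ogg|flac|webm) ;;
    *) echo "unsupported format: $ext" >&2; return 1 ;;
  esac

  # wc -c is used because stat flags differ between Linux and macOS.
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -gt "$max_bytes" ]; then
    echo "file too large: $size bytes" >&2
    return 1
  fi
}
```

The extension test is a cheap filter on the file name only; it does not verify that the file actually contains audio in that format.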
Output
Writes the transcription to stdout as plain text, with punctuation and proper capitalization.
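Because the transcription goes to stdout, it can be redirected into a file. A minimal sketch, assuming the script exits with a non-zero status on failure (the source does not say so); transcribe_to_file is a hypothetical helper that takes the command to run as its first argument:

```shell
# Save a transcription next to the audio file as <name>.txt.
# transcribe_to_file is a hypothetical helper, not part of the skill.
# $1: command to run (e.g. the installed path of transcribe.sh)
# $2: audio file
transcribe_to_file() {
  local cmd=$1 audio=$2
  local out="${audio%.*}.txt"

  # Write to a temporary file first so a failed run leaves no partial transcript.
  if "$cmd" "$audio" > "$out.tmp"; then
    mv "$out.tmp" "$out"
    echo "$out"
  else
    rm -f "$out.tmp"
    return 1
  fi
}
```

Called with the script path and an audio file such as meeting.m4a, the helper would write meeting.txt beside the recording and print that path.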
