# Groq Whisper API (free)

Transcribe audio files using Groq's free Whisper inference API.

## Quick start

```bash
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a
```

Defaults:
- Model: `whisper-large-v3` (the recommended model)
- Output: `<input>.txt`

## Useful flags

```bash
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model whisper-large-v3 --out /tmp/transcript.txt
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --language en
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --prompt "Speaker names: Peter, Daniel"
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --json --out /tmp/transcript.json
```
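
For orientation, the flags above plausibly correspond to form fields on Groq's OpenAI-compatible transcription endpoint. The sketch below is not the contents of `transcribe.sh`: the function name `groq_transcribe`, the `GROQ_CURL` dry-run override, and the flag-to-field mapping are all assumptions made for illustration.

```bash
# Hypothetical sketch: call the transcription endpoint directly.
# Assumption: --model, --language, --prompt and --json map to the
# model, language, prompt and response_format form fields.
groq_transcribe() {
  local file="$1"
  local model="${2:-whisper-large-v3}"
  local language="${3:-}"
  local prompt="${4:-}"
  local format="${5:-text}"

  # Rebuild the positional parameters as the curl argument list.
  set -- -sS --fail \
    -H "Authorization: Bearer ${GROQ_API_KEY:-}" \
    -F "file=@${file}" \
    -F "model=${model}" \
    -F "response_format=${format}"

  # Optional fields are only sent when a value was provided.
  if [ -n "$language" ]; then set -- "$@" -F "language=${language}"; fi
  if [ -n "$prompt" ]; then set -- "$@" -F "prompt=${prompt}"; fi

  # GROQ_CURL=echo prints the request instead of sending it (hypothetical knob).
  "${GROQ_CURL:-curl}" "$@" "https://api.groq.com/openai/v1/audio/transcriptions"
}
```

Calling `GROQ_CURL=echo groq_transcribe /path/to/audio.m4a whisper-large-v3 en` prints the request that would be sent, which is a cheap way to check the fields before spending any quota.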

## API key

The script reads the key from the `GROQ_API_KEY` environment variable, which is already configured on the gateway.
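
Outside the gateway the variable may not be set, so a wrapper can fail fast instead of sending an unauthenticated request. A minimal sketch, assuming a hypothetical helper named `require_groq_key` (nothing in this document says `transcribe.sh` performs this check itself):

```bash
# Hypothetical helper: succeed only if GROQ_API_KEY is set and non-empty.
require_groq_key() {
  if [ -z "${GROQ_API_KEY:-}" ]; then
    echo "GROQ_API_KEY is not set; export it before transcribing" >&2
    return 1
  fi
}
```

A calling script could run `require_groq_key || exit 1` before invoking the transcription command.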

## Models available

- `whisper-large-v3` - Latest and most accurate on Groq (recommended)
- `whisper-large-v2` - Slightly older but still fast
- `whisper-base` - Faster but less accurate
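
Hosted model IDs change over time, so it may be worth confirming the IDs above against the live list before relying on them. A minimal sketch, assuming Groq's OpenAI-compatible `GET /openai/v1/models` endpoint; the function name `groq_list_models` and the `GROQ_CURL` dry-run override are hypothetical:

```bash
# Hypothetical helper: fetch the model list visible to the current API key.
# Set GROQ_CURL=echo to print the request instead of sending it.
groq_list_models() {
  "${GROQ_CURL:-curl}" -sS --fail \
    -H "Authorization: Bearer ${GROQ_API_KEY:-}" \
    "https://api.groq.com/openai/v1/models"
}
```

The response is JSON, so the Whisper entries can be picked out with a tool such as `grep` or `jq`.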

## Why Groq?

- **Free**: no per-minute charges
- **Fast**: Groq's LPU delivers near-real-time transcription
- **Generous free tier**: usage is subject to Groq's rate limits rather than per-minute billing