Overview
Paratran is a transcription tool for macOS on Apple Silicon. It wraps NVIDIA’s Parakeet TDT model (running through parakeet-mlx) in three interfaces: a CLI for one-off jobs, a REST API for long-running services, and an MCP server so Claude Code or Claude Desktop can transcribe audio as a tool call.
I built it because Whisper is slow on a laptop and most of the fast alternatives are cloud services. The default Parakeet model hits 6.34% average WER across eight English benchmarks, supports 25 languages, and runs roughly 30x faster than Whisper on the same hardware via MLX. It’s fast enough that a one-hour recording transcribes in well under a minute on an M-series chip.

Features
CLI
- Transcribe one file or a batch in a single command
- Output as plain text, JSON, SRT, or WebVTT (or all four at once)
- Greedy or beam search decoding with full control over beam size, length penalty, patience, and duration reward
- Chunking for long audio with configurable overlap so sentence boundaries survive joins
- Optional sentence splitting by max words, max duration, or silence gap
- BF16 by default, with an --fp32 flag for environments that need it
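To make the chunking bullet concrete, here is a minimal sketch of how overlapping windows can be laid over a long recording. It illustrates the general technique only and is not taken from paratran or parakeet-mlx. The function name `chunk_windows` and the 120 s / 15 s values are assumptions chosen for the example.

```python
def chunk_windows(total_seconds, chunk_seconds, overlap_seconds):
    """Return (start, end) windows in seconds that cover the audio with a fixed overlap."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    if not 0 <= overlap_seconds < chunk_seconds:
        raise ValueError("overlap_seconds must be in [0, chunk_seconds)")

    windows = []
    step = chunk_seconds - overlap_seconds  # how far each new window advances
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break  # the last window already reaches the end of the audio
        start += step
    return windows


# Hypothetical values: a 5-minute file, 120 s chunks, 15 s overlap.
for start, end in chunk_windows(300, 120, 15):
    print(f"{start:6.1f} -> {end:6.1f}")
```

Each window shares its last 15 seconds with the start of the next one, so a sentence that straddles a cut is seen whole in at least one chunk as long as it fits within the overlap.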
Client mode (no per-file model reload)
The model takes a few seconds to load. Reloading it for every file is wasteful, so the same paratran binary doubles as a client: start paratran serve once, then point subsequent invocations at it with -s http://localhost:8000 (or PARATRAN_SERVER in the environment) and they hand the file off over HTTP and print the result. The CLI surface is identical either way.
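As a rough illustration of that workflow from a script, the sketch below builds one paratran invocation for a batch of files and adds -s only when a server URL is supplied. The helper names `build_command` and `run_batch` are hypothetical, and passing several files as positional arguments is an assumption based on the batch feature. Only -s and --output-format come from the text above.

```python
import subprocess


def build_command(files, server=None, output_format="all"):
    """Build a paratran command line; target a running server if one is given."""
    cmd = ["paratran"]
    if server:
        # Client mode: hand the files to an already-loaded model over HTTP.
        cmd += ["-s", server]
    cmd += ["--output-format", output_format, *files]
    return cmd


def run_batch(files, server=None):
    """Run the command and raise if paratran exits with a non-zero status."""
    subprocess.run(build_command(files, server=server), check=True)
```

Calling `run_batch(["a.wav", "b.wav"], server="http://localhost:8000")` would then reuse the model held by `paratran serve` instead of loading it again.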
REST API
A FastAPI server exposes /health and /transcribe. Upload a wav, mp3, flac, m4a, ogg, or webm file and get back the text plus per-sentence timing and per-token timestamps as JSON. All decoding and sentence-splitting options are query parameters, and interactive docs ship at /docs.
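For callers who would rather use Python than the shell, here is a minimal standard-library sketch of the same upload. The function names `build_multipart` and `transcribe` are hypothetical. The form field name `file` and the default server URL follow the examples in this post. The shape of the returned JSON is not assumed beyond it being a JSON object.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field, filename, data):
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path, server="http://localhost:8000"):
    """POST an audio file to /transcribe and return the decoded JSON response."""
    audio = Path(path)
    body, content_type = build_multipart("file", audio.name, audio.read_bytes())
    request = urllib.request.Request(
        f"{server}/transcribe",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

The response keys are not listed here, so check the interactive docs at /docs before relying on specific field names.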
curl -X POST http://localhost:8000/transcribe -F "file=@recording.m4a"
MCP server
The same package installs paratran-mcp, an MCP server that exposes a transcribe tool to any MCP client. Stdio is the default (drop a few lines into .claude/settings.json and Claude Code can transcribe audio files in-place), and a streamable HTTP transport is available for remote or multi-client setups.
{
  "mcpServers": {
    "paratran": {
      "command": "uvx",
      "args": ["--from", "paratran", "paratran-mcp"]
    }
  }
}
Quick start
# Run without installing
uvx paratran recording.wav
# Or install as a tool
uv tool install paratran
paratran recording.wav
# Start the server once, transcribe many files instantly
paratran serve &
paratran -s http://localhost:8000 --output-format all recording.wav
The default model (parakeet-tdt-0.6b-v3) downloads on first use into the HuggingFace cache. --cache-dir or PARATRAN_MODEL_DIR redirects it to an external drive when the local SSD is tight.
Technology stack
- Python 3.11+, distributed on PyPI as paratran
- parakeet-mlx for inference, running on Apple’s MLX framework
- FastAPI and Uvicorn for the REST server
- The official mcp Python SDK for the MCP server, with stdio and streamable-HTTP transports
- macOS on M1/M2/M3/M4 (the MLX backend is Apple Silicon only)