A desktop application built with Claude Code that records audio and transcribes it locally using OpenAI's Whisper model — no cloud API required.
The tool captures microphone audio through a tkinter GUI, saves recordings as WAV files, and runs them through a local Whisper speech-recognition model for transcription. Everything runs on your machine — no data leaves your desktop.
The application is structured as four Python modules that separate concerns cleanly:
- Recording: uses sounddevice. Manages a state machine (idle/recording/paused), streams microphone input into memory, and writes timestamped WAV files via scipy.
- Transcription: loads the openai/whisper-base model through Hugging Face Transformers on first use. Converts int16 audio to float32, runs the speech-recognition pipeline, and saves transcripts as text files.

Dependencies:
- sounddevice: cross-platform audio I/O
- transformers + torch: Hugging Face pipeline and PyTorch backend for Whisper
- scipy + numpy: WAV file I/O and audio array processing
- tkinter: GUI framework (included with Python)

Install the dependencies and start the application:

    pip install -r requirements.txt
    python main.py
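The idle/recording/paused state machine mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `RecorderState`, its method names, and the `recording_YYYYMMDD_HHMMSS.wav` filename pattern are all assumptions made for the example. In the real application, `feed` would presumably be called from the sounddevice input callback.

```python
from datetime import datetime
from enum import Enum
from pathlib import Path


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    PAUSED = "paused"


class RecorderState:
    """Hypothetical sketch: tracks recorder state and buffers chunks in memory."""

    def __init__(self):
        self.state = State.IDLE
        self.chunks = []  # audio chunks captured while recording

    def start(self):
        if self.state is not State.IDLE:
            raise RuntimeError("can only start from idle")
        self.chunks = []
        self.state = State.RECORDING

    def pause(self):
        if self.state is not State.RECORDING:
            raise RuntimeError("can only pause while recording")
        self.state = State.PAUSED

    def resume(self):
        if self.state is not State.PAUSED:
            raise RuntimeError("can only resume from paused")
        self.state = State.RECORDING

    def feed(self, chunk):
        # Chunks that arrive while idle or paused are dropped.
        if self.state is State.RECORDING:
            self.chunks.append(chunk)

    def stop(self):
        if self.state is State.IDLE:
            raise RuntimeError("nothing to stop")
        captured, self.chunks = self.chunks, []
        self.state = State.IDLE
        return captured


def timestamped_wav_path(directory="recordings", now=None):
    """Build a path under `directory` using an assumed timestamp pattern."""
    now = now or datetime.now()
    return Path(directory) / f"recording_{now:%Y%m%d_%H%M%S}.wav"
```

Rejecting invalid transitions with an exception keeps the GUI buttons and the audio stream from drifting out of sync; a real implementation might instead disable the buttons that do not apply in the current state.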
The Whisper model downloads automatically on first run (~150 MB). Recordings are saved to recordings/ and transcripts to transcripts/.
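For readers curious about the int16-to-float32 step in the transcription path, here is a hedged sketch of one common way to do it. The function names `int16_to_float32` and `transcribe_wav` are hypothetical, and dividing by 32768 is an assumed scaling convention rather than something confirmed by the project. The pipeline call uses the Hugging Face `automatic-speech-recognition` task with the `openai/whisper-base` model named above.

```python
import numpy as np


def int16_to_float32(samples):
    """Scale int16 PCM samples to float32 in the range [-1.0, 1.0)."""
    samples = np.asarray(samples)
    if samples.dtype != np.int16:
        raise TypeError(f"expected int16 samples, got {samples.dtype}")
    return samples.astype(np.float32) / 32768.0


def transcribe_wav(wav_path, model_name="openai/whisper-base"):
    """Hypothetical helper: read a WAV file and return the transcript text."""
    # Imported lazily so the conversion above works without these packages.
    from scipy.io import wavfile
    from transformers import pipeline

    sample_rate, samples = wavfile.read(wav_path)
    audio = int16_to_float32(samples)
    if audio.ndim == 2:
        audio = audio.mean(axis=1)  # down-mix multi-channel audio to mono
    asr = pipeline("automatic-speech-recognition", model=model_name)
    result = asr({"raw": audio, "sampling_rate": sample_rate})
    return result["text"]
```

Calling `transcribe_wav` on a file in recordings/ would trigger the model download the first time it runs. The sketch builds the pipeline on every call for brevity; an application would more likely create it once and reuse it.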