voice-recognition
Local speech-to-text with OpenAI Whisper CLI. Supports Chinese, English, 100+ languages with translation and summarization.
Updated: March 4, 2026
SKILL.md Frontmatter
name: voice-recognition
description: Local speech-to-text with OpenAI Whisper CLI. Supports Chinese, English, 100+ languages with translation and summarization.
version: 1.0.0
Voice Recognition (Whisper)
Local speech-to-text with OpenAI Whisper CLI.
Features
- Local processing - No API key needed, free
- Multi-language - Chinese, English, 100+ languages
- Translation - Translate to English
- Summarization - Generate quick summary
Usage
Basic
# Chinese recognition
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a
# Force Chinese
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --zh
# English recognition
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --en
# Translate to English
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --translate
# With summary
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --summarize
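The script's internals are not shown here, so the following is only a sketch of how flags like these could map onto the Whisper CLI. The helper name `build_whisper_command` and its parameters are hypothetical; `--model`, `--language`, and `--task` are options of the `whisper` command itself. How `--summarize` works is not documented, so it is left out.

```python
import shlex
from typing import List, Optional


def build_whisper_command(
    audio_path: str,
    language: Optional[str] = None,
    translate: bool = False,
    model: str = "medium",
) -> List[str]:
    """Build an argument list for the `whisper` CLI (hypothetical helper)."""
    cmd = ["whisper", audio_path, "--model", model]
    if language is not None:
        # Force a language such as "zh" or "en" instead of auto-detecting it.
        cmd += ["--language", language]
    # Whisper's "translate" task produces English text from any source language.
    cmd += ["--task", "translate" if translate else "transcribe"]
    return cmd


if __name__ == "__main__":
    # Show the command that a --zh style invocation might correspond to.
    print(shlex.join(build_whisper_command("audio.m4a", language="zh")))
```

Returning a list rather than a single string means the result can be passed straight to `subprocess.run` without shell quoting problems, which matters for file names that contain spaces or non-ASCII characters.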
Quick Command (add to ~/.zshrc)
alias voice="python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py"
Then use:
voice ~/Downloads/audio.m4a --zh
Requirements
- OpenAI Whisper CLI: brew install openai-whisper
- Python 3.10+
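Both requirements can be checked before a first run. This is a minimal sketch, assuming the Whisper CLI is installed under the executable name `whisper`; the function `check_requirements` is a hypothetical helper, not part of the script.

```python
import shutil
import sys
from typing import Dict, Tuple


def check_requirements(min_python: Tuple[int, int] = (3, 10)) -> Dict[str, bool]:
    """Report whether the documented requirements appear to be met."""
    return {
        # shutil.which returns None when the executable is not on PATH.
        "whisper_cli": shutil.which("whisper") is not None,
        # sys.version_info compares element-wise against a plain tuple.
        "python": sys.version_info >= min_python,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```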
Files
- scripts/voice识别_升级版.py - Main script
- scripts/voice_tool_README.md - Documentation
Supported Formats
- MP3, M4A, WAV, OGG, FLAC, WebM
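A caller might want to reject unsupported files before starting a long transcription. The sketch below checks the file extension against the formats listed above; `SUPPORTED_EXTENSIONS` and `is_supported_audio` are illustrative names, and the check looks only at the extension, not at the file contents.

```python
from pathlib import Path

# Extensions taken from the "Supported Formats" list, lowercased.
SUPPORTED_EXTENSIONS = {".mp3", ".m4a", ".wav", ".ogg", ".flac", ".webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```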
Language Support
100+ languages including:
- Chinese (zh)
- English (en)
- Japanese (ja)
- Korean (ko)
- And more...
Notes
- Default model: medium (balance of speed and accuracy)
- First run downloads the model to ~/.cache/whisper
- Processing time varies by audio length and model size
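To tell whether the first run will trigger a download, one option is to look for the model file in the cache directory. This sketch assumes the file is stored as `<model>.pt` (for example `medium.pt`) directly under `~/.cache/whisper`; both function names are hypothetical.

```python
from pathlib import Path
from typing import Optional


def cached_model_path(model: str = "medium", cache_dir: Optional[str] = None) -> Path:
    """Return where the model file is assumed to live: <cache_dir>/<model>.pt."""
    root = Path(cache_dir) if cache_dir else Path.home() / ".cache" / "whisper"
    return root / f"{model}.pt"


def is_model_cached(model: str = "medium", cache_dir: Optional[str] = None) -> bool:
    """Return True if the assumed model file already exists on disk."""
    return cached_model_path(model, cache_dir).is_file()


if __name__ == "__main__":
    status = "cached" if is_model_cached() else "will be downloaded on first run"
    print(f"medium: {status}")
```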