tts-skill
// Multi-engine text-to-speech skill. Supports Qwen3-TTS local voice cloning, VoiceCraft online TTS, and OpenAI TTS.
updated:March 4, 2026
SKILL.md Frontmatter
name: tts-skill
description: Multi-engine text-to-speech skill. Supports Qwen3-TTS local voice cloning, VoiceCraft online TTS, and OpenAI TTS.
🎙️ TTS-Skill — Multi-Engine Text-to-Speech
TTS-Skill provides a single entrypoint for generating speech using multiple backends, with consistent output naming and progress feedback for long-running jobs.
Engines
- qwen3-tts: local voice cloning with a reference audio + transcript
- edge-tts: online voices with speed/pitch/style controls
- openai-tts: OpenAI speech generation via API
Command Syntax
/tts-skill [engine] [text] --voice [voice-keyword] [other options]
If you use the Python entrypoint:
python tts-skill.py [engine] [text] --voice [voice-keyword]
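The command shape above can be sketched with argparse. This is a minimal illustration, not the actual parser in tts-skill.py: the function name build_parser and the destination name text_file are hypothetical, and only the engines and flags named in this document (--voice, --text-file / -f, --output) are included.

```python
import argparse

# Engine names as listed in this document.
ENGINES = ("qwen3-tts", "edge-tts", "openai-tts")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring: [engine] [text] --voice [voice-keyword]."""
    parser = argparse.ArgumentParser(prog="tts-skill.py")
    parser.add_argument("engine", choices=ENGINES, help="TTS backend to use")
    # The text is optional on the command line because --text-file can supply it.
    parser.add_argument("text", nargs="?", default=None, help="text to speak")
    parser.add_argument("--voice", help="voice keyword")
    parser.add_argument("--text-file", "-f", dest="text_file",
                        help="read the text from a file instead")
    parser.add_argument("--output", help="explicit output path")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.engine, args.text, args.voice, args.text_file, args.output)
```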
Text Input
Pass text as a positional argument, or use --text-file / -f to read from a file.
Example:
python tts-skill.py qwen3-tts --text-file "input\\text.txt" --voice 寒冰射手
Notes:
- --text-file supports relative and absolute paths; relative paths are resolved from your current working directory
- If both positional text and --text-file are provided, --text-file takes priority
- UTF-8 is recommended (UTF-8 BOM is supported); on decode error it falls back to GBK
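The notes above describe a small resolution procedure: file before positional text, UTF-8 (with optional BOM) before GBK. A rough sketch of that behavior follows. The function names decode_text and resolve_text are hypothetical and the real script may be organized differently.

```python
from pathlib import Path
from typing import Optional


def decode_text(raw: bytes) -> str:
    """Decode as UTF-8, tolerating a BOM; fall back to GBK on a decode error."""
    try:
        # "utf-8-sig" strips a leading BOM if present and otherwise acts like UTF-8.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return raw.decode("gbk")


def resolve_text(positional: Optional[str], text_file: Optional[str]) -> str:
    """Pick the input text; a text file takes priority over positional text."""
    if text_file is not None:
        # Path.resolve() anchors a relative path at the current working directory.
        return decode_text(Path(text_file).resolve().read_bytes())
    if positional is not None:
        return positional
    raise ValueError("no text given: pass positional text or --text-file")
```

Reading bytes and decoding explicitly, rather than opening the file in text mode, keeps the fallback order visible in one place.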
You can also call engine scripts directly:
python engines/qwen3-tts-cli.py --text-file "input\\text.txt" --voice 寒冰射手
python engines/edge-tts-cli.py --text-file "input\\text.txt" --voice xiaoxiao
python engines/openai-tts-cli.py --text-file "input\\text.txt" --voice alloy
Local Voice Assets (Qwen3-TTS)
To add a clone voice, put a matching pair of files in assets/:
assets/Lei.wav
assets/Lei.txt
Supported audio formats: .wav, .mp3, .m4a, .flac.
Then:
python tts-skill.py qwen3-tts "测试文本" --voice Lei
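As an illustration of the pairing rule, the lookup below finds a reference audio file and its transcript for a voice name. It is only a sketch: find_voice is a hypothetical helper, and it assumes an exact filename match, whereas the real --voice keyword handling may be more lenient.

```python
from pathlib import Path
from typing import Optional, Tuple

# Audio formats listed as supported in this document.
AUDIO_EXTS = (".wav", ".mp3", ".m4a", ".flac")


def find_voice(assets_dir: Path, voice: str) -> Optional[Tuple[Path, Path]]:
    """Return (audio, transcript) for a voice, or None if the pair is incomplete."""
    transcript = assets_dir / f"{voice}.txt"
    if not transcript.is_file():
        return None
    # Try each supported extension in order and take the first file that exists.
    for ext in AUDIO_EXTS:
        audio = assets_dir / f"{voice}{ext}"
        if audio.is_file():
            return audio, transcript
    return None
```

With assets/Lei.wav and assets/Lei.txt in place, find_voice(Path("assets"), "Lei") would return both paths; if either file is missing it returns None.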
Output
If --output is not provided:
- Output directory: output/
- Filename pattern: YYYYMMDD_HHMMSS_<first-6-chars>.<ext>
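A default name following this pattern could be built as below. This sketch makes two assumptions the document does not state: that the six characters come from the start of the input text, and that characters unsafe in filenames are replaced with underscores. The name default_output_path is hypothetical.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_output_path(text: str, ext: str,
                        now: Optional[datetime] = None,
                        out_dir: str = "output") -> Path:
    """Build output/YYYYMMDD_HHMMSS_<first-6-chars>.<ext>."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    prefix = text.strip()[:6]
    # Assumption: replace whitespace and path-unsafe characters with "_".
    prefix = re.sub(r'[\\/:*?"<>|\s]', "_", prefix)
    return Path(out_dir) / f"{stamp}_{prefix}.{ext}"


if __name__ == "__main__":
    print(default_output_path("测试文本你好世界", "wav"))
```

Passing the timestamp in as a parameter keeps the function deterministic, which makes it easy to check against a fixed time.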
Progress & Timing (Qwen3-TTS)
Qwen3-TTS jobs print a live progress bar with ETA. After completion, tts-skill.py prints:
- total runtime
- total chars and Chinese chars
- average seconds per Chinese character (or per char if no Chinese)
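The summary figures can be computed along these lines. This is a sketch under stated assumptions: "Chinese chars" is taken to mean code points in the basic CJK Unified Ideographs range (U+4E00 to U+9FFF), total chars counts every character including whitespace, and timing_summary is a hypothetical name.

```python
import re

# Assumption: "Chinese chars" means the basic CJK Unified Ideographs block.
CJK = re.compile(r"[\u4e00-\u9fff]")


def timing_summary(text: str, elapsed_seconds: float) -> dict:
    """Summarize a finished job: runtime, character counts, average pace."""
    total_chars = len(text)
    chinese_chars = len(CJK.findall(text))
    # Average per Chinese character, or per character if there are none.
    denom = chinese_chars or total_chars
    avg = elapsed_seconds / denom if denom else 0.0
    return {
        "runtime_s": elapsed_seconds,
        "total_chars": total_chars,
        "chinese_chars": chinese_chars,
        "avg_s_per_char": avg,
    }


if __name__ == "__main__":
    print(timing_summary("你好, world", 4.0))
```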
Project Layout
tts-skill/
├── .trae/
│ └── plans/
├── assets/
│ ├── Lei.txt
│ ├── 寒冰射手.txt
│ ├── 布里茨.txt
│ └── 赵信.txt
├── engines/
│ ├── edge-tts-cli.py
│ ├── edge-tts.config
│ ├── openai-tts-cli.py
│ ├── openai-tts.config
│ ├── qwen3-tts-cli.py
│ └── qwen3-tts.config
├── input/
│ └── text.txt
├── output/
├── tts-skill.py
├── INSTALL.md
├── INSTALL.zh-CN.md
├── README.md
├── README.zh-CN.md
├── SKILL.md
└── SKILL.zh-CN.md
Chinese Spec
See SKILL.zh-CN.md.