
tts-skill

Multi-engine text-to-speech skill. Supports Qwen3-TTS local voice cloning, Edge-TTS online TTS, and OpenAI TTS.

stars: 194
forks: 37
updated: March 4, 2026
SKILL.md frontmatter:

name: tts-skill
description: Multi-engine text-to-speech skill. Supports Qwen3-TTS local voice cloning, Edge-TTS online TTS, and OpenAI TTS.

🎙️ TTS-Skill — Multi-Engine Text-to-Speech

TTS-Skill provides a single entrypoint for generating speech using multiple backends, with consistent output naming and progress feedback for long-running jobs.

Engines

  • qwen3-tts: local voice cloning with a reference audio + transcript
  • edge-tts: online voices with speed/pitch/style controls
  • openai-tts: OpenAI speech generation via API

Command Syntax

/tts-skill [engine] [text] --voice [voice-keyword] [other options]

If you use the Python entrypoint:

python tts-skill.py [engine] [text] --voice [voice-keyword]

Text Input

Pass text as a positional argument, or use --text-file / -f to read from a file.

Example:

python tts-skill.py qwen3-tts --text-file "input\\text.txt" --voice 寒冰射手

Notes:

  • --text-file supports relative and absolute paths; relative paths are resolved from your current working directory
  • If both positional text and --text-file are provided, --text-file takes priority
  • UTF-8 is recommended (a UTF-8 BOM is handled); if UTF-8 decoding fails, the file is re-read as GBK
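
The input rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the code in tts-skill.py; the function names read_text_file and resolve_text are hypothetical.

```python
from pathlib import Path
from typing import Optional


def read_text_file(path: str) -> str:
    """Read input text as UTF-8 (BOM tolerated), falling back to GBK."""
    # A relative path is resolved against the current working directory.
    raw = Path(path).read_bytes()
    try:
        # "utf-8-sig" strips a leading BOM if present, otherwise acts like UTF-8.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return raw.decode("gbk")


def resolve_text(positional_text: Optional[str], text_file: Optional[str]) -> str:
    """Pick the text to synthesize; the file wins when both are given."""
    if text_file:
        return read_text_file(text_file)
    if positional_text:
        return positional_text
    raise ValueError("no input text: pass text or --text-file")
```

Decoding with "utf-8-sig" first means a file saved with or without a BOM yields the same string, and only a genuine UTF-8 decode failure triggers the GBK path.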

You can also call engine scripts directly:

python engines/qwen3-tts-cli.py --text-file "input\\text.txt" --voice 寒冰射手
python engines/edge-tts-cli.py --text-file "input\\text.txt" --voice xiaoxiao
python engines/openai-tts-cli.py --text-file "input\\text.txt" --voice alloy
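
Because each engine script follows the engines/<engine>-cli.py naming shown above, a wrapper could assemble the same command lines programmatically. The sketch below is an assumption about how such a wrapper might look; build_engine_command is a hypothetical helper, and whether tts-skill.py dispatches this way is not stated here.

```python
import sys
from pathlib import Path
from typing import List

# Engine names as listed in the Engines section.
ENGINES = ("qwen3-tts", "edge-tts", "openai-tts")


def build_engine_command(engine: str, text_file: str, voice: str,
                         root: str = ".") -> List[str]:
    """Build the argv for calling an engine script directly."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    script = Path(root) / "engines" / f"{engine}-cli.py"
    # Only the two flags shown in the examples above are used.
    return [sys.executable, str(script),
            "--text-file", text_file,
            "--voice", voice]


if __name__ == "__main__":
    cmd = build_engine_command("edge-tts", "input/text.txt", "xiaoxiao")
    print(" ".join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with non-ASCII voice names such as 寒冰射手.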

Local Voice Assets (Qwen3-TTS)

To add a clone voice, put a matching pair of files in assets/:

assets/Lei.wav
assets/Lei.txt

Supported audio formats: .wav, .mp3, .m4a, .flac.

Then:

python tts-skill.py qwen3-tts "测试文本" --voice Lei
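
To check that a voice is set up correctly before running a job, the pairing rule can be verified with a short script. This is a sketch under the assumption that the audio file and the transcript share the voice keyword as their base name; find_voice_pair is a hypothetical helper, not part of the skill.

```python
from pathlib import Path
from typing import Tuple

# Audio formats listed as supported above.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".m4a", ".flac")


def find_voice_pair(voice: str, assets_dir: str = "assets") -> Tuple[Path, Path]:
    """Return (audio_path, transcript_path) for a clone voice."""
    assets = Path(assets_dir)
    transcript = assets / f"{voice}.txt"
    if not transcript.is_file():
        raise FileNotFoundError(f"missing transcript: {transcript}")
    # Take the first supported audio file that matches the voice name.
    for ext in AUDIO_EXTENSIONS:
        audio = assets / f"{voice}{ext}"
        if audio.is_file():
            return audio, transcript
    raise FileNotFoundError(f"no reference audio for voice {voice!r} in {assets}")
```

For example, find_voice_pair("Lei") succeeds only when both assets/Lei.txt and one of assets/Lei.wav, .mp3, .m4a, or .flac exist.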

Output

If --output is not provided:

  • Output directory: output/
  • Filename pattern: YYYYMMDD_HHMMSS_<first-6-chars>.<ext>
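
As an illustration of the naming pattern, the sketch below builds a default path from a timestamp and a six-character prefix. Two points are assumptions: that <first-6-chars> is taken from the start of the input text, and that punctuation and whitespace are dropped to keep the filename safe. default_output_path is a hypothetical name.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_output_path(text: str, ext: str,
                        now: Optional[datetime] = None,
                        out_dir: str = "output") -> Path:
    """Build output/YYYYMMDD_HHMMSS_<first-6-chars>.<ext>."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")
    # Assumption: keep only letters and digits (CJK characters count as
    # letters) and take the first six of them as the prefix.
    prefix = "".join(ch for ch in text if ch.isalnum())[:6]
    return Path(out_dir) / f"{stamp}_{prefix}.{ext}"
```

The timestamp keeps repeated runs from overwriting each other, while the text prefix makes files easier to recognize in a directory listing.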

Progress & Timing (Qwen3-TTS)

Qwen3-TTS jobs print a live progress bar with ETA. After completion, tts-skill.py prints:

  • total runtime
  • total character count and Chinese character count
  • average seconds per Chinese character (or per character if the text contains no Chinese)
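
The summary figures can be reproduced with a small calculation. The sketch below assumes that "Chinese characters" means code points in the CJK Unified Ideographs block (U+4E00 to U+9FFF) and that every character, including whitespace, counts toward the total; the skill's own counting rules may differ. summarize_timing is a hypothetical name.

```python
from typing import Dict, Union


def is_chinese(ch: str) -> bool:
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def summarize_timing(text: str, elapsed_seconds: float) -> Dict[str, Union[int, float]]:
    """Compute the totals and the per-character average for a finished job."""
    total_chars = len(text)
    chinese_chars = sum(1 for ch in text if is_chinese(ch))
    # Average over Chinese characters when there are any, else over all characters.
    basis = chinese_chars if chinese_chars else total_chars
    average = elapsed_seconds / basis if basis else 0.0
    return {
        "total_seconds": elapsed_seconds,
        "total_chars": total_chars,
        "chinese_chars": chinese_chars,
        "seconds_per_char": average,
    }
```

Timing the job itself can be done by reading time.perf_counter() before and after synthesis and passing the difference as elapsed_seconds.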

Project Layout

tts-skill/
├── .trae/
│   └── plans/
├── assets/
│   ├── Lei.txt
│   ├── 寒冰射手.txt
│   ├── 布里茨.txt
│   └── 赵信.txt
├── engines/
│   ├── edge-tts-cli.py
│   ├── edge-tts.config
│   ├── openai-tts-cli.py
│   ├── openai-tts.config
│   ├── qwen3-tts-cli.py
│   └── qwen3-tts.config
├── input/
│   └── text.txt
├── output/
├── tts-skill.py
├── INSTALL.md
├── INSTALL.zh-CN.md
├── README.md
├── README.zh-CN.md
├── SKILL.md
└── SKILL.zh-CN.md

Chinese Spec

See SKILL.zh-CN.md.