TTS Discord Bot

🎵 A powerful Discord bot that brings text-to-speech functionality to your server using the GPT-SoVITS v2 API. Transform text messages into natural-sounding speech with multiple character voices, including support for custom user voice samples.

✨ Features

🎙️ Text-to-Speech Capabilities

Multiple Character Voices: Choose from a variety of pre-configured character voices
Custom Voice Samples: Upload and use your own voice samples for personalized TTS
Real-time Voice Chat Integration: Automatically convert text messages to speech in voice channels
Intelligent Text Processing: Handles Discord mentions, markdown formatting, and special characters
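
The text processing mentioned above can be pictured with a minimal sketch. The helper name `clean_for_tts` and the regular expressions are assumptions made for illustration; the bot's actual rules may differ.

```python
import re

# Hypothetical patterns, for illustration only (not taken from the bot's source).
CODE_BLOCK_RE = re.compile(r"`{3}.*?`{3}", re.DOTALL)   # fenced code blocks
URL_RE = re.compile(r"https?://\S+")                    # links are rarely worth reading aloud
CUSTOM_EMOJI_RE = re.compile(r"<a?:(\w+):\d+>")         # custom emoji such as <:name:123>
MENTION_RE = re.compile(r"<(?:@[!&]?|#)\d+>")           # user, role and channel mentions
MARKDOWN_RE = re.compile(r"\*\*|__|~~|\|\||\*|`")       # bold, underline, strike, spoiler, code


def clean_for_tts(text: str) -> str:
    """Strip Discord-specific markup so only speakable text remains."""
    text = CODE_BLOCK_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = CUSTOM_EMOJI_RE.sub(r"\1", text)   # keep the emoji name, drop the ID
    text = MENTION_RE.sub(" ", text)
    text = MARKDOWN_RE.sub("", text)
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace
```

A real implementation would probably resolve mentions to display names instead of dropping them, which requires access to the guild's member list.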

🤖 AI Integration

LLM Commands: Ask questions and get AI-powered responses with voice playback
Image Analysis: Upload images along with questions for visual content analysis
Conversation History: Maintains chat context for more natural conversations
Gemini AI Integration: Powered by Google's Gemini language model
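
As a rough illustration of how per-user conversation context could be kept, here is a small in-memory sketch. The class name `ChatHistory`, the turn limit, and the dictionary layout are all assumptions; the project itself stores history under data/conversations/, which this sketch does not reproduce.

```python
from collections import defaultdict, deque


class ChatHistory:
    """Keep the most recent turns per user (hypothetical, in-memory only)."""

    def __init__(self, max_turns: int = 20) -> None:
        # Each user gets a bounded deque, so old turns fall off automatically.
        self._turns = defaultdict(lambda: deque(maxlen=max_turns))

    def add(self, user_id: int, role: str, text: str) -> None:
        self._turns[user_id].append({"role": role, "text": text})

    def get(self, user_id: int) -> list[dict]:
        # Use .get so that reading an unknown user does not create an entry.
        return list(self._turns.get(user_id, ()))

    def clear(self, user_id: int) -> None:
        # Roughly what a command like /clear_chat would need to do.
        self._turns.pop(user_id, None)
```

Bounding the history keeps prompts from growing without limit; the value 20 is arbitrary.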

🎛️ Voice Management

Per-user Voice Settings: Each user can configure their preferred voice character
Voice Channel Controls: Join/leave voice channels with simple commands
TTS Toggle: Enable/disable TTS functionality per user
Message Commands: Right-click any message to convert it to speech
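
Per-user voice settings could be persisted in a small JSON file, such as the one named by USER_VOICE_SETTINGS_FILE in the environment configuration. The sketch below assumes a flat mapping of user ID to character name; the class name, the default voice, and the file layout are illustrative guesses, not the bot's actual format.

```python
import json
from pathlib import Path


class UserVoiceSettings:
    """Hypothetical JSON-backed store mapping a user ID to a voice character."""

    def __init__(self, path: str, default_voice: str = "default") -> None:
        self.path = Path(path)
        self.default_voice = default_voice
        if self.path.exists():
            self._data = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self._data = {}

    def get_voice(self, user_id: int) -> str:
        # JSON object keys are strings, so IDs are stored as strings.
        return self._data.get(str(user_id), self.default_voice)

    def set_voice(self, user_id: int, character: str) -> None:
        self._data[str(user_id)] = character
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # ensure_ascii=False keeps non-ASCII character names readable in the file.
        self.path.write_text(
            json.dumps(self._data, ensure_ascii=False, indent=2), encoding="utf-8"
        )
```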

📋 Available Commands

Slash Commands

/ask - Ask the AI a question and get a voice response
/chat - Have a conversation with the AI assistant
/clear_chat - Clear conversation history
/set_voice - Configure your preferred voice character
/get_voice - Check your current voice settings
/play_tts - Convert text to speech and play it in a voice channel
/join_voice - Make the bot join your voice channel
/leave_voice - Make the bot leave the current voice channel
/tts_start - Enable TTS functionality
/tts_stop - Disable TTS functionality
/voice_add - Add a new voice character (requires manager role)
/voice_remove - Remove an existing voice character (requires manager role)
/voice_edit - Edit a voice character's audio or reference text (requires manager role)

Text Commands

!!tts start - Enable automatic TTS for your messages
!!tts stop - Disable automatic TTS for your messages

Context Menu Commands

Play TTS - Right-click any message to convert it to speech

🚀 Installation

Prerequisites

Python 3.11 or higher
FFmpeg (a Windows build, ffmpeg.exe, is included in the project)
GPT-SoVITS v2 API server running on http://127.0.0.1:9880/tts or your custom endpoint
Discord Bot Token
Google Gemini API Key (for AI features)
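
To check that the GPT-SoVITS server is reachable before starting the bot, a request along the following lines may help. This is a sketch under stated assumptions: the JSON field names (`text`, `text_lang`, `ref_audio_path`, `prompt_text`, `prompt_lang`, `media_type`) reflect the GPT-SoVITS v2 API as commonly documented and should be verified against your server version, and the helper names are hypothetical. It uses only the standard library so it can run before dependencies are installed.

```python
import json
import urllib.request

TTS_API_URL = "http://127.0.0.1:9880/tts"  # default endpoint listed in the prerequisites


def build_tts_payload(
    text: str,
    ref_audio_path: str,
    prompt_text: str,
    text_lang: str = "zh",
    prompt_lang: str = "zh",
) -> dict:
    # Field names are an assumption about the GPT-SoVITS v2 API; check your server.
    return {
        "text": text,                      # what should be spoken
        "text_lang": text_lang,
        "ref_audio_path": ref_audio_path,  # reference sample, as seen by the server
        "prompt_text": prompt_text,        # transcript of the reference sample
        "prompt_lang": prompt_lang,
        "media_type": "wav",
    }


def synthesize(payload: dict, url: str = TTS_API_URL, timeout: float = 60.0) -> bytes:
    """POST the payload as JSON and return the raw audio bytes."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `synthesize(build_tts_payload("Hello", "data/samples/character_voice.wav", "Sample text for this character"))` should return audio bytes if the server is running; the sample path here is only a placeholder.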

Setup Instructions

Clone the repository
```
git clone https://github.com/yourusername/TTS-Discord-Bot.git
cd TTS-Discord-Bot
```

Install dependencies
```
pip install -r requirements.txt
```

Configure environment variables
Create a .env file in the project root:
```
DISCORD_TOKEN=your_discord_bot_token
GUILD_ID=your_discord_guild_id
GOOGLE_API_KEY=your_gemini_api_key

# Optional overrides
TTS_API_URL=http://127.0.0.1:9880/tts/
TTS_TARGET_CHANNEL_ID=933384447145943071
VOICE_TEXT_INPUT_CHANNEL_IDS=1087044327315878020,1047857030226006016
VOICE_MANAGER_ROLE_ID=1003708775284342955
MESSAGE_BOT_TARGET_USER_ID=998254901538861157
VOICE_DIR=data/samples
USER_SETTINGS_FILE=data/user_settings.json
USER_VOICE_SETTINGS_FILE=data/user_voice.json
REVERSE_MAPPING_FILE=data/game_id_to_user_id.json
```
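
For orientation, the following sketch shows one way such variables could be read and validated. It is not the project's config.py: the function names, the returned dictionary, and the decision to treat the first three variables as required are assumptions. With python-dotenv installed, calling `load_dotenv()` beforehand would typically copy the .env entries into the process environment.

```python
import os
from collections.abc import Mapping

REQUIRED = ("DISCORD_TOKEN", "GUILD_ID", "GOOGLE_API_KEY")  # assumed to be mandatory


def parse_id_list(raw: str) -> list[int]:
    """Turn a comma-separated string such as "1, 2,3" into [1, 2, 3]."""
    return [int(part) for part in raw.split(",") if part.strip()]


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "discord_token": env["DISCORD_TOKEN"],
        "guild_id": int(env["GUILD_ID"]),
        "google_api_key": env["GOOGLE_API_KEY"],
        # Optional overrides fall back to defaults mentioned in this README.
        "tts_api_url": env.get("TTS_API_URL", "http://127.0.0.1:9880/tts"),
        "voice_text_input_channel_ids": parse_id_list(
            env.get("VOICE_TEXT_INPUT_CHANNEL_IDS", "")
        ),
        "voice_dir": env.get("VOICE_DIR", "data/samples"),
    }
```

Failing fast on missing required values tends to give a clearer error than a login failure later on.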

Set up voice samples
- Place voice sample files in the data/samples/ directory
- Configure sample metadata in data/sample_data.json

Start the bot
```
python run.py
```
Or use the provided batch file on Windows:
```
start.bat
```

📁 Project Structure

```
TTS-Discord-Bot/
├── bot/                           # Core bot modules
│   ├── api/                       # API integrations
│   │   ├── gemini_api.py          # Google Gemini API client
│   │   ├── gemini_chat_history.py # Chat history management
│   │   └── tts_handler.py         # TTS processing and audio generation
│   ├── commands/                  # Slash commands
│   │   ├── general.py             # General utility commands
│   │   ├── llm_commands.py        # AI/LLM related commands
│   │   ├── tts_commands.py        # TTS configuration commands
│   │   └── voice_commands.py      # Voice channel management
│   ├── events/                    # Event listeners
│   │   ├── message_listener.py    # Target channel message processing
│   │   ├── on_ready.py            # Bot startup events
│   │   └── voice_chat_text_channel_listener.py # Voice chat text processing
│   └── message_command/           # Context menu commands
│       ├── analyze_material.py    # Material analysis
│       └── play_tts.py            # Message TTS conversion
├── data/                          # Data storage
│   ├── samples/                   # Voice sample files
│   ├── conversations/             # Chat history storage
│   └── *.json                     # Configuration files
├── utils/                         # Utility functions
├── config.py                      # Configuration settings
└── run.py                         # Main entry point
```

⚙️ Configuration

Voice Samples

Add new voice characters by:

Placing audio files in data/samples/

Adding character metadata to data/sample_data.json:

```
{
  "character_name": {
    "file": "character_voice.wav",
    "text": "Sample text for this character"
  }
}
```

You can also manage voices in Discord using /voice_add, /voice_remove and /voice_edit (requires the role specified by VOICE_MANAGER_ROLE_ID).
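
Because each character entry points at an audio file and a reference text, a quick consistency check can catch typos before the bot starts. The sketch below assumes the metadata layout shown above; the function name `find_sample_problems` is hypothetical and not part of the project.

```python
import json
from pathlib import Path


def find_sample_problems(metadata_path: str, samples_dir: str) -> list[str]:
    """Return human-readable problems found in the voice sample metadata."""
    metadata = json.loads(Path(metadata_path).read_text(encoding="utf-8"))
    problems = []
    for name, entry in metadata.items():
        if not entry.get("text"):
            problems.append(f"{name}: missing reference text")
        audio = Path(samples_dir) / entry.get("file", "")
        # An empty "file" resolves to the directory itself, which is not a file.
        if not audio.is_file():
            problems.append(f"{name}: audio file not found ({entry.get('file')})")
    return problems
```

Running `find_sample_problems("data/sample_data.json", "data/samples")` from the project root should return an empty list when every character is set up correctly.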

Channel Configuration

Configure target channels in config.py, or override them with the matching variables in .env:

TTS_TARGET_CHANNEL_ID: Channel for message-based TTS
VOICE_TEXT_INPUT_CHANNEL_IDS: Channels for automatic voice chat TTS
VOICE_MANAGER_ROLE_ID: Role allowed to add, remove and edit voice characters
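
How these IDs might gate behavior can be sketched with two small predicates. Both function names and the exact conditions are assumptions for illustration; the bot's listeners may apply additional checks, such as whether the bot is connected to a voice channel.

```python
def should_speak(
    channel_id: int,
    author_is_bot: bool,
    author_tts_enabled: bool,
    voice_text_input_channel_ids: list[int],
) -> bool:
    """Decide whether a message should be converted to speech automatically."""
    if author_is_bot:
        return False  # never read out other bots (or the bot itself)
    if not author_tts_enabled:
        return False  # the user has not enabled TTS, or has disabled it
    return channel_id in voice_text_input_channel_ids


def can_manage_voices(member_role_ids: list[int], manager_role_id: int) -> bool:
    """Check whether a member holds the role named by VOICE_MANAGER_ROLE_ID."""
    return manager_role_id in member_role_ids
```

Keeping these checks as plain functions, separate from Discord event handlers, makes them easy to test without a running bot.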

🔧 Dependencies

disnake: Discord API wrapper
google-genai: Google Gemini API client
pydub: Audio processing
requests: HTTP requests
python-dotenv: Environment variable management
PyNaCl: Voice functionality

📝 Usage Examples

Basic TTS Usage

Join a voice channel
Use /tts_start to enable automatic TTS
Type messages in configured channels to hear them spoken
Use /set_voice to change your voice character

AI Conversations

Use /ask "What is the weather like?" for one-time questions
Use /chat "Hello!" to start a conversation
Upload images with /ask for visual analysis
Use /clear_chat to reset conversation history

Voice Sample Management

Upload your voice samples to data/samples/
Select "自己聲音 (需要先上傳語音樣本）" ("Own voice (upload a voice sample first)") in /set_voice
The bot will use your custom voice for TTS

🐳 Docker Support

A Dockerfile is included for containerized deployment:

```
docker build -t tts-discord-bot .
docker run -d --env-file .env tts-discord-bot
```

🤝 Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues for bugs and feature requests.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Built with disnake Discord API library
Powered by GPT-SoVITS v2 for high-quality TTS
AI features provided by Google Gemini
Supports multiple languages, with a focus on Traditional Chinese


mc-cloud-town/TTS-Discord-Bot
