mc-cloud-town/TTS-Discord-Bot

TTS Discord Bot

🎵 A powerful Discord bot that brings text-to-speech functionality to your server using the GPT-SoVITS v2 API. Transform text messages into natural-sounding speech with multiple character voices, including support for custom user voice samples.

✨ Features

🎙️ Text-to-Speech Capabilities

  • Multiple Character Voices: Choose from a variety of pre-configured character voices
  • Custom Voice Samples: Upload and use your own voice samples for personalized TTS
  • Real-time Voice Chat Integration: Automatically convert text messages to speech in voice channels
  • Intelligent Text Processing: Handles Discord mentions, markdown formatting, and special characters
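The text processing step could look roughly like the sketch below. It is a minimal illustration, not the bot's actual code: the function name `clean_for_tts` and every pattern are assumptions, chosen to cover Discord mentions, custom emoji, links, and common markdown markers.

```python
import re

# Hypothetical patterns for illustration; the bot's real rules may differ.
MENTION_RE = re.compile(r"<(?:@[!&]?|#)\d+>")     # user, role, and channel mentions
CUSTOM_EMOJI_RE = re.compile(r"<a?:(\w+):\d+>")   # custom emoji such as <:wave:42>
URL_RE = re.compile(r"https?://\S+")              # links are rarely worth reading aloud
CODE_BLOCK_RE = re.compile(r"`{3}.*?`{3}", re.DOTALL)
MARKDOWN_RE = re.compile(r"[*_~`|>]+")            # bold, italic, strike, code, spoiler, quote


def clean_for_tts(text: str) -> str:
    """Strip Discord-specific markup so the remaining text reads naturally."""
    text = CODE_BLOCK_RE.sub(" ", text)       # drop fenced code entirely
    text = URL_RE.sub(" ", text)              # drop links
    text = CUSTOM_EMOJI_RE.sub(r"\1", text)   # keep only the emoji name
    text = MENTION_RE.sub(" ", text)          # drop raw mention IDs
    text = MARKDOWN_RE.sub("", text)          # remove formatting markers
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace
```

A real implementation would probably resolve mentions to display names instead of dropping them, which requires access to the guild's member list.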

🤖 AI Integration

  • LLM Commands: Ask questions and get AI-powered responses with voice playback
  • Image Analysis: Upload images along with questions for visual content analysis
  • Conversation History: Maintains chat context for more natural conversations
  • Gemini AI Integration: Powered by Google's Gemini language model

🎛️ Voice Management

  • Per-user Voice Settings: Each user can configure their preferred voice character
  • Voice Channel Controls: Join/leave voice channels with simple commands
  • TTS Toggle: Enable/disable TTS functionality per user
  • Message Commands: Right-click any message to convert it to speech

📋 Available Commands

Slash Commands

  • /ask - Ask the AI a question and get a voice response
  • /chat - Have a conversation with the AI assistant
  • /clear_chat - Clear conversation history
  • /set_voice - Configure your preferred voice character
  • /get_voice - Check your current voice settings
  • /play_tts - Convert text to speech and play it in a voice channel
  • /join_voice - Make the bot join your voice channel
  • /leave_voice - Make the bot leave the current voice channel
  • /tts_start - Enable TTS functionality
  • /tts_stop - Disable TTS functionality
  • /voice_add - Add a new voice character (requires the voice manager role)
  • /voice_remove - Remove an existing voice character (requires the voice manager role)
  • /voice_edit - Edit a voice character's audio or reference text (requires the voice manager role)

Text Commands

  • !!tts start - Enable automatic TTS for your messages
  • !!tts stop - Disable automatic TTS for your messages
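As a rough sketch of how the two text commands might be recognized in a message handler, the hypothetical helper below returns `True` for start, `False` for stop, and `None` for any other message. The function name and the case-insensitive matching of the subcommand are assumptions, not taken from the bot's source.

```python
from typing import Optional


def parse_tts_command(content: str) -> Optional[bool]:
    """Return True for '!!tts start', False for '!!tts stop', else None."""
    parts = content.strip().split()
    if len(parts) != 2 or parts[0] != "!!tts":
        return None  # not a TTS text command
    action = parts[1].lower()  # assumption: subcommand is case-insensitive
    if action == "start":
        return True
    if action == "stop":
        return False
    return None  # unknown subcommand
```

Returning a tri-state value lets the caller treat ordinary messages (`None`) as text to be spoken rather than as a failed command.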

Context Menu Commands

  • Play TTS - Right-click any message to convert it to speech

🚀 Installation

Prerequisites

  • Python 3.11 or higher
  • FFmpeg (included in the project)
  • GPT-SoVITS v2 API server running on http://127.0.0.1:9880/tts or your custom endpoint
  • Discord Bot Token
  • Google Gemini API Key (for AI features)

Setup Instructions

  1. Clone the repository

    git clone https://github.com/mc-cloud-town/TTS-Discord-Bot.git
    cd TTS-Discord-Bot
  2. Install dependencies

    pip install -r requirements.txt
  3. Configure environment variables

    Create a .env file in the project root:

    DISCORD_TOKEN=your_discord_bot_token
    GUILD_ID=your_discord_guild_id
    GOOGLE_API_KEY=your_gemini_api_key
    
    # Optional overrides
    TTS_API_URL=http://127.0.0.1:9880/tts/
    TTS_TARGET_CHANNEL_ID=933384447145943071
    VOICE_TEXT_INPUT_CHANNEL_IDS=1087044327315878020,1047857030226006016
    VOICE_MANAGER_ROLE_ID=1003708775284342955
    MESSAGE_BOT_TARGET_USER_ID=998254901538861157
    VOICE_DIR=data/samples
    USER_SETTINGS_FILE=data/user_settings.json
    USER_VOICE_SETTINGS_FILE=data/user_voice.json
    REVERSE_MAPPING_FILE=data/game_id_to_user_id.json
  4. Set up voice samples

    • Place voice sample files in the data/samples/ directory
    • Configure sample metadata in data/sample_data.json
  5. Start the bot

    python run.py

    Or use the provided batch file on Windows:

    start.bat
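Before starting the bot, a small preflight check can catch a missing setting early. The sketch below is an illustration only: `preflight` and `REQUIRED_VARS` are hypothetical names, and it assumes that the three variables listed before the optional overrides are mandatory. It inspects a mapping, so the .env file must already be loaded into the environment (for example with python-dotenv's `load_dotenv()`).

```python
import os
import sys

# Assumption: these three variables are mandatory; the rest are optional overrides.
REQUIRED_VARS = ("DISCORD_TOKEN", "GUILD_ID", "GOOGLE_API_KEY")


def preflight(env=None, min_python=(3, 11)):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or higher is required")
    for name in REQUIRED_VARS:
        if not env.get(name, "").strip():
            problems.append(f"{name} is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("Setup problem:", problem)
```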

📁 Project Structure

TTS-Discord-Bot/
├── bot/                           # Core bot modules
│   ├── api/                       # API integrations
│   │   ├── gemini_api.py          # Google Gemini API client
│   │   ├── gemini_chat_history.py # Chat history management
│   │   └── tts_handler.py         # TTS processing and audio generation
│   ├── commands/                  # Slash commands
│   │   ├── general.py             # General utility commands
│   │   ├── llm_commands.py        # AI/LLM related commands
│   │   ├── tts_commands.py        # TTS configuration commands
│   │   └── voice_commands.py      # Voice channel management
│   ├── events/                    # Event listeners
│   │   ├── message_listener.py    # Target channel message processing
│   │   ├── on_ready.py            # Bot startup events
│   │   └── voice_chat_text_channel_listener.py # Voice chat text processing
│   └── message_command/           # Context menu commands
│       ├── analyze_material.py    # Material analysis
│       └── play_tts.py            # Message TTS conversion
├── data/                          # Data storage
│   ├── samples/                   # Voice sample files
│   ├── conversations/             # Chat history storage
│   └── *.json                     # Configuration files
├── utils/                         # Utility functions
├── config.py                      # Configuration settings
└── run.py                         # Main entry point

⚙️ Configuration

Voice Samples

Add new voice characters by:

  1. Placing audio files in data/samples/
  2. Adding character metadata to data/sample_data.json:
    {
      "character_name": {
        "file": "character_voice.wav",
        "text": "Sample text for this character"
      }
    }

You can also manage voices in Discord using /voice_add, /voice_remove and /voice_edit (requires the role specified by VOICE_MANAGER_ROLE_ID).
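To show how a character entry might feed a TTS request, here is a minimal sketch that turns one `sample_data.json` entry into a request payload. The helper `build_tts_payload` is hypothetical, and the payload field names (`text`, `text_lang`, `ref_audio_path`, `prompt_text`, `prompt_lang`) are assumptions based on the GPT-SoVITS v2 API; check them against the API server version you run. The default language `"zh"` is also an assumption.

```python
from pathlib import Path


def build_tts_payload(character, text, sample_data, voice_dir="data/samples", lang="zh"):
    """Build a request payload for one character from sample_data.json content."""
    entry = sample_data[character]  # raises KeyError if the character is not configured
    return {
        "text": text,                                               # text to synthesize
        "text_lang": lang,                                          # assumed language code
        "ref_audio_path": (Path(voice_dir) / entry["file"]).as_posix(),
        "prompt_text": entry["text"],                               # transcript of the reference audio
        "prompt_lang": lang,
    }
```

The resulting dictionary could then be sent to the endpoint configured in TTS_API_URL, for example with `requests.post(url, json=payload)`. Note that the reference audio path has to be readable by the API server process, which may not share the bot's working directory.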

Channel Configuration

Target channels are defined in config.py and can be overridden through the .env variables of the same name:

  • TTS_TARGET_CHANNEL_ID: Channel for message-based TTS
  • VOICE_TEXT_INPUT_CHANNEL_IDS: Channels for automatic voice chat TTS
  • VOICE_MANAGER_ROLE_ID: Role allowed to add, remove and edit voice characters
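A sketch of how these three settings could be read from the environment follows. It is not the project's config.py: `ChannelConfig`, `parse_id`, `parse_id_list`, and `load_channel_config` are hypothetical names, and the only format assumption is the comma-separated ID list shown in the .env example.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass
class ChannelConfig:
    tts_target_channel_id: Optional[int] = None
    voice_text_input_channel_ids: list = field(default_factory=list)
    voice_manager_role_id: Optional[int] = None


def parse_id(raw: Optional[str]) -> Optional[int]:
    """Convert a Discord ID string to int; empty or missing values become None."""
    raw = (raw or "").strip()
    return int(raw) if raw else None


def parse_id_list(raw: Optional[str]) -> list:
    """Split a comma-separated ID string, ignoring blanks and stray spaces."""
    return [int(part) for part in (raw or "").split(",") if part.strip()]


def load_channel_config(env: Optional[Mapping[str, str]] = None) -> ChannelConfig:
    """Read the channel and role settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return ChannelConfig(
        tts_target_channel_id=parse_id(env.get("TTS_TARGET_CHANNEL_ID")),
        voice_text_input_channel_ids=parse_id_list(env.get("VOICE_TEXT_INPUT_CHANNEL_IDS")),
        voice_manager_role_id=parse_id(env.get("VOICE_MANAGER_ROLE_ID")),
    )
```

Converting the IDs to integers up front means later comparisons against Discord channel and role IDs do not need any string handling.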

🔧 Dependencies

  • disnake: Discord API wrapper
  • google-genai: Google Gemini API client
  • pydub: Audio processing
  • requests: HTTP requests
  • python-dotenv: Environment variable management
  • PyNaCl: Voice functionality

📝 Usage Examples

Basic TTS Usage

  1. Join a voice channel
  2. Use /tts_start to enable automatic TTS
  3. Type messages in configured channels to hear them spoken
  4. Use /set_voice to change your voice character

AI Conversations

  1. Use /ask "What is the weather like?" for one-time questions
  2. Use /chat "Hello!" to start a conversation
  3. Upload images with /ask for visual analysis
  4. Use /clear_chat to reset conversation history

Voice Sample Management

  1. Upload your voice samples to data/samples/
  2. Select "自己聲音 (需要先上傳語音樣本)" ("My own voice (upload a voice sample first)") in /set_voice
  3. The bot will use your custom voice for TTS

🐳 Docker Support

A Dockerfile is included for containerized deployment:

docker build -t tts-discord-bot .
docker run -d --env-file .env tts-discord-bot

🤝 Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues for bugs and feature requests.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built with disnake Discord API library
  • Powered by GPT-SoVITS v2 for high-quality TTS
  • AI features provided by Google Gemini
  • Supports multiple languages, with a focus on Traditional Chinese
