🤖 Intelligent document chatbot powered by AI - Upload, index, and chat with your documents using advanced vector search.

papai0709/DocuMind

DocuMind 🤖📄

Intelligent Document Chatbot - Portable Edition

Transform your document library into an interactive AI-powered knowledge base. DocuMind works seamlessly on Windows, Mac, and Linux with zero-configuration setup.

✨ Features

  • 🤖 AI-Powered: GPT-4 and Azure OpenAI integration
  • 📄 Multi-Format: PDF, DOCX, TXT support
  • 🔍 Smart Search: Vector embeddings and semantic search
  • 💬 Dual Interfaces: Streamlit (GUI) and Flask (Web API)
  • 🌍 Cross-Platform: Windows, macOS, and Linux
  • 📦 Portable: Self-contained with automatic setup
  • 🔒 Secure: API keys protected; files and the vector index are stored locally

🚀 Quick Start (Recommended)

Option 1: One-Click Launch

Windows:

# Download and double-click
launchers\run_windows.bat

Mac/Linux:

# Download and run
./launchers/run_unix.sh

Universal (All Platforms):

python documind.py
# or
python launchers/launch.py

Option 2: Automatic Setup

# Run the portable setup (handles everything)
python setup/portable_setup.py

📥 Installation Methods

Method 1: Download & Run (Easiest)

  1. Download DocuMind
  2. Run the appropriate launcher for your OS
  3. Follow the configuration prompts
  4. Start chatting with your documents!

Method 2: Git Clone

git clone https://github.com/papai0709/DocuMind.git
cd DocuMind
python setup/portable_setup.py

Method 3: Manual Setup

# 1. Create virtual environment
python -m venv .venv

# 2. Activate it
# Windows:
.venv\Scripts\activate
# Mac/Linux:
source .venv/bin/activate

# 3. Install dependencies
pip install -r requirements/requirements-portable.txt

# 4. Configure API keys
python setup/easy_config.py

# 5. Launch
python documind.py

⚙️ Configuration

DocuMind includes multiple easy configuration options:

GUI Configuration Tool

python setup/easy_config.py
  • Graphical interface for easy setup
  • Automatic validation
  • Platform-specific guidance

Command Line Setup

python setup/portable_setup.py
  • Automated environment setup
  • Dependency installation
  • Configuration generation

Manual Configuration

Create a .env file in the project root:

For Azure OpenAI (Recommended):

USE_AZURE_OPENAI=true
AZURE_OPENAI_API_KEY=your_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_CHAT_DEPLOYMENT_NAME=gpt-4
AZURE_EMBEDDING_DEPLOYMENT_NAME=text-embedding-ada-002

For OpenAI:

USE_AZURE_OPENAI=false
OPENAI_API_KEY=your_openai_key_here
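
Before launching, it can help to confirm that the .env file contains every key the chosen provider needs. The following is a minimal sketch, not DocuMind's own validation code: `parse_env` and `missing_keys` are hypothetical helpers, the required-key lists are copied from the two examples above, and treating an absent USE_AZURE_OPENAI as `false` is an assumption.

```python
from pathlib import Path

# Required keys per provider, taken from the two .env examples above.
AZURE_KEYS = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_CHAT_DEPLOYMENT_NAME",
    "AZURE_EMBEDDING_DEPLOYMENT_NAME",
]
OPENAI_KEYS = ["OPENAI_API_KEY"]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values):
    """Return the required keys that are absent or empty."""
    # Assumption: a missing USE_AZURE_OPENAI is treated as "false".
    use_azure = values.get("USE_AZURE_OPENAI", "false").lower() == "true"
    required = AZURE_KEYS if use_azure else OPENAI_KEYS
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")  # assumed location: the project root
    if env_file.exists():
        missing = missing_keys(parse_env(env_file.read_text(encoding="utf-8")))
        print("Missing keys:", missing or "none")
    else:
        print("No .env file found in the current directory")
```

Run it from the project root; an empty result means the keys for the selected provider are all present, though it does not check that the values are valid.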

🖥️ Platform-Specific Instructions

Windows

  • Requirements: Python 3.8+ (download from python.org)
  • Launcher: launchers\run_windows.bat or python documind.py
  • Features: Colored console output, Windows integration

macOS

  • Requirements: Python 3.8+ (use Homebrew: brew install python)
  • Launcher: ./launchers/run_unix.sh or python documind.py
  • Features: Native terminal integration

Linux

  • Requirements: Python 3.8+, pip, venv
  • Install: sudo apt install python3 python3-venv python3-pip (Ubuntu/Debian)
  • Launcher: ./launchers/run_unix.sh or python documind.py

☁️ Azure Web Deployment

DocuMind is ready for deployment to Azure Web Apps. You can deploy it either as a Docker container (recommended) or by direct code deployment.

Option 1: Docker Container (Recommended)

  1. Build and push the image:

    docker build -t your-registry.azurecr.io/documind:latest .
    docker push your-registry.azurecr.io/documind:latest
  2. Create Web App for Containers:

    • Choose "Docker Container" as the publish option.
    • Select your image.
    • Set the WEBSITES_PORT environment variable to 8000.
  3. Configure Environment Variables:

    • Set all necessary environment variables (API keys, etc.) in the Azure Portal > Environment Variables.
    • Persistence: To save your vector database across restarts, mount an Azure Storage File Share to /data and set CHROMA_DB_PATH=/data/chroma_db.
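
To illustrate the persistence setting, here is a small sketch of how an application might resolve the database location from CHROMA_DB_PATH. `resolve_db_path` is a hypothetical helper, and falling back to data/chroma_db when the variable is unset is an assumption based on the local path named under Security & Privacy; DocuMind's actual lookup may differ.

```python
import os
from pathlib import Path

# Assumed fallback: the local path named in the Security & Privacy section.
DEFAULT_DB_PATH = "data/chroma_db"


def resolve_db_path(environ=None):
    """Return the vector database directory, preferring CHROMA_DB_PATH."""
    environ = os.environ if environ is None else environ
    return Path(environ.get("CHROMA_DB_PATH") or DEFAULT_DB_PATH)


if __name__ == "__main__":
    print("Vector database directory:", resolve_db_path())
```

With the File Share mounted at /data and CHROMA_DB_PATH=/data/chroma_db, the resolved directory lives on the share rather than on the container's ephemeral disk.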

Option 2: Code Deployment

  1. Create Web App (Linux):

    • Runtime stack: Python 3.10 or higher.
    • Publish: Code.
  2. Configuration:

    • Azure should automatically detect requirements.txt in the root.
    • Set the Startup Command in Configuration > General Settings to:
      sh startup.sh
  3. Environment:

    • Add your API keys and other settings in Environment Variables.

📁 Project Structure

DocuMind/
├── 🚀 Launchers
│   ├── documind.py            # Main entry point
│   └── launchers/
│       ├── launch.py          # Universal Python launcher
│       ├── run_windows.bat    # Windows batch file
│       └── run_unix.sh        # Mac/Linux shell script
├── 🔧 Setup & Configuration
│   ├── setup/
│   │   ├── portable_setup.py  # Automated installer
│   │   └── easy_config.py     # GUI/CLI configuration
│   └── config/
│       └── .env.example       # Configuration template
├── 📦 Dependencies
│   └── requirements/
│       ├── requirements-portable.txt  # Core cross-platform
│       ├── requirements-windows.txt   # Windows optimized
│       ├── requirements-unix.txt      # Mac/Linux optimized
│       └── requirements.txt           # Complete feature set
├── 🧠 Source Code
│   └── src/
│       ├── core/              # Core AI functionality
│       └── web/               # Web interfaces
├── 🧪 Tests & Data
│   ├── tests/                 # Test suite
│   └── data/                  # Document storage
└── 📚 Documentation
    └── docs/                  # Detailed guides

🎯 Usage

  1. Start DocuMind using your preferred launcher
  2. Upload documents via the web interface
  3. Ask questions about your documents
  4. Get AI responses with source citations
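
The question-answering step relies on semantic search: the question and each document chunk are represented as vectors, and the closest chunks are passed to the model along with their source names. The sketch below shows that ranking with cosine similarity. It is an illustration only: the vectors are hand-written toy values rather than real embeddings, and `cosine_similarity` and `top_k` are hypothetical helpers, not DocuMind functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed_chunks, k=2):
    """Rank (source, text, vector) entries by similarity to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), source, text)
        for source, text, vec in indexed_chunks
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy index: file names, texts, and vectors are made up for illustration.
    index = [
        ("handbook.pdf", "Vacation policy ...", [1.0, 0.0]),
        ("notes.txt", "Meeting notes ...", [0.0, 1.0]),
        ("faq.docx", "Leave and holidays ...", [1.0, 1.0]),
    ]
    for score, source, text in top_k([1.0, 0.1], index):
        print(f"{score:.3f}  {source}  {text}")
```

Keeping the source name next to each chunk is what makes citations possible: the answer can list the files its supporting chunks came from.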

Web Interfaces

Streamlit (Recommended):

Flask (API):

🔧 Advanced Configuration

Environment Variables

# Core Settings
APP_TITLE=DocuMind
APP_ICON=🤖
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
MAX_FILE_SIZE_MB=10

# Performance
ENABLE_CACHING=true
LOG_LEVEL=INFO
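
To show how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a minimal sketch of overlapping chunking. It assumes the sizes are measured in characters, which this README does not state; DocuMind may count tokens or split on sentence boundaries instead. `chunk_text` is a hypothetical helper.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # distance between window starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(2500))
    pieces = chunk_text(sample)
    print(len(pieces), [len(piece) for piece in pieces])
```

With the defaults above, each window starts 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next. A larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed.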

Performance Optimization

# Install optional performance packages
pip install watchdog psutil rich

🛠️ Troubleshooting

Common Issues

Python not found:

  • Install Python 3.8+ from python.org
  • Make sure Python is in your PATH

Import errors:

  • Run python setup/portable_setup.py to reinstall dependencies
  • Check virtual environment activation
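
A quick way to check both the interpreter version and virtual environment activation is a short script like the one below. This is a sketch using only the standard library; `in_virtualenv` and `python_is_supported` are hypothetical helpers, not part of DocuMind, and the 3.8 minimum comes from the platform requirements above.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside an environment created by venv."""
    return sys.prefix != sys.base_prefix


def python_is_supported(version_info=None, minimum=(3, 8)):
    """True when the (major, minor) version meets the stated minimum."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python:", sys.version.split()[0])
    print("Supported version:", python_is_supported())
    print("Virtual environment active:", in_virtualenv())
    print("Interpreter path:", sys.executable)
```

If the interpreter path does not point inside .venv, activate the environment and reinstall the dependencies.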

Configuration issues:

  • Run python setup/easy_config.py to reconfigure
  • Verify API keys in .env file

Port conflicts:

  • Streamlit: Change port with --server.port 8502
  • Flask: Modify port=5001 in flask_app.py
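
To find out whether a port is already taken before changing it, you can probe it from Python. The sketch below uses only the standard library; `port_is_free` and `first_free_port` are hypothetical helpers, and the probe only detects listeners on the local loopback address.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def first_free_port(candidates, host="127.0.0.1"):
    """Return the first candidate port with no listener, or None."""
    for port in candidates:
        if port_is_free(port, host):
            return port
    return None


if __name__ == "__main__":
    # 8501 is Streamlit's default port; 5001 is the Flask port mentioned above.
    for port in (8501, 8502, 5001):
        print(port, "free" if port_is_free(port) else "in use")
```

If 8501 is in use, pass the free port to Streamlit with --server.port, or update the Flask port as described above.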

Getting Help

  1. Check the docs/ folder for detailed guides
  2. Run diagnostic: python tests/test_setup.py
  3. Open an issue on GitHub

🔒 Security & Privacy

  • API Keys: Stored locally in .env (never committed)
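  • Documents: Stored and indexed locally; extracted text and your questions are sent to the OpenAI or Azure OpenAI endpoint you configure for embedding and chat
  • Data: Vector database stored locally in data/chroma_db/
  • Privacy: No telemetry; no data is shared beyond the AI provider you configure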

🎨 Customization

DocuMind is highly customizable:

  • Themes: Modify Streamlit themes in .streamlit/config.toml
  • Models: Switch between GPT-3.5, GPT-4, or custom models
  • UI: Customize web interfaces in src/web/
  • Processing: Adjust chunking and embedding parameters

🤝 Contributing

We welcome contributions! DocuMind is designed to be:

  • Portable: Works everywhere Python runs
  • Modular: Easy to extend and customize
  • Documented: Clear code and comprehensive docs

📄 License

MIT License - Feel free to use, modify, and distribute.

🌟 Why DocuMind?

  • Zero Configuration: Works out of the box
  • Professional Grade: Enterprise-ready with Azure OpenAI
  • Privacy First: Your files and vector index stay on your machine
  • Cross-Platform: One codebase, everywhere
  • Open Source: Transparent and customizable

Ready to make your documents intelligent? Get started in under 2 minutes!

# Clone, configure, and launch - it's that simple!
git clone https://github.com/papai0709/DocuMind.git
cd DocuMind
python documind.py