Skip to content

kathisnehith/GraphRAG-dev

Repository files navigation

GraphRAG Studio 🔗

A comprehensive GraphRAG (Graph Retrieval-Augmented Generation) implementation that transforms documents into knowledge graphs and enables intelligent querying through natural language processing.

Common Use Cases: Research papers analysis, legal document review, Medical documentation Q&A, educational content exploration, and enterprise document intelligence.

🌟 Overview

GraphRAG Studio is a complete solution for creating, storing, and querying knowledge graphs from your documents. It combines the power of Large Language Models (LLMs) with Neo4j graph databases to provide an interactive platform for document-based question answering with enhanced context understanding.

🚀 Features

  • Multi-LLM Support: Compatible with OpenAI GPT-4 and Anthropic Claude models
  • Document Processing: PDF document parsing and intelligent chunking
  • Knowledge Graph Creation: Automatic entity and relationship extraction with LLM
  • Neo4j Integration: Persistent graph storage with enhanced schema support
  • Interactive Web Interface: Streamlit-based user-friendly application
  • Graph Visualization: Interactive network visualization with PyVis
  • Natural Language Querying: Cypher query generation from natural language

Core Components

  • GraphTransformer: Converts documents to graph structures
  • Neo4jGraph: Database interface and query execution
  • GraphCypherQAChain: Natural language to Cypher translation
  • Visualize: Graph rendering and visualization utilities

🏗️ Project Structure

GraphRAG/
├── graphrag_studio_app.py      # Custom GraphRAG-studio-ready application
├── requirements.txt            # Python dependencies
├── app/
│   └── main.py                # query ready GraphRAG app
├── backend/
│   ├── graph_transformer.py   # Graph transformation logic
│   └── graph_query.py         # Graph querying logic
├── test_notebooks/
│   ├── Graphrag_pdf.ipynb     # PDF data processing notebook
│   ├── Graphrag_table.ipynb   # Table data processing notebook
├── tests/
│   └── test_viz.py            # Visualization tests
└── utils/
    ├── visualizer.py          # Graph visualization utilities
    └── retriver_visualizer.py # Retrieval visualization

📊 GraphRAG Workflow Visualization

GraphRAG Workflow

🚀 Quick Start

Using GraphRAG Studio (Recommended)

  1. Launch the application

    streamlit run graphrag_studio_app.py
  2. Configure your setup

    • Select your preferred LLM provider (OpenAI or Anthropic)
    • Enter your API key
    • Connect to your Neo4j database
  3. Process your documents

    • Upload a PDF document
    • Create and store graph documents
    • Start querying your knowledge graph!

📋 Usage Guide

1. Document Upload and Processing

  • Upload PDF documents through the web interface
  • Documents are automatically chunked using RecursiveCharacterTextSplitter
  • Optimal chunk size: 1200 characters with 40-character overlap

2. Knowledge Graph Creation

  • Choose between OpenAI GPT-4 or Anthropic Claude models
  • Graph transformer extracts entities and relationships
  • Nodes and relationships are automatically identified and structured

3. Database Storage

  • Graphs are stored in Neo4j with enhanced schema support
  • Persistent storage enables complex queries and relationships
  • Source document information is preserved

4. Natural Language Querying

  • Ask questions in plain English
  • System generates appropriate Cypher queries
  • Returns contextual answers based on graph relationships

🛠️ Main Installation

  1. Clone the repository

    git clone <repository-url>
    cd GraphRAG
  2. Install Python dependencies

    pip install -r requirements.txt
  3. Set up Neo4j Database

    • Create a Neo4j account at Neo4j AuraDB or Neo4j local instance
    • Create a new database instance
    • Note down the connection URI, username, and password
  4. Set up environment variables Create a .env file in the root directory:

    ANTHROPIC_API_KEY=your_anthropic_api_key_here
    OPENAI_API_KEY=your_openai_api_key_here
    NEO4J_URI=neo4j+s://your-neo4j-url
    NEO4J_USERNAME=neo4j
    NEO4J_PASSWORD=your_neo4j_password

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published