[Bug] Document resync fails with duplicate key error and deletion not working properly #22

@qowlgur121

Bug Description

Resyncing a document fails with a database constraint violation (duplicate key), and document deletion does not appear to remove all related data.

Steps to Reproduce

  1. Upload a document through Admin UI (POST /api/v1/sources/upload)
  2. Wait for document to reach COMPLETED status
  3. Click "Resync" button for the same document
  4. Error occurs: "Error: could not execute statement [ERROR: duplicate ke..."

Expected Behavior

  • Resync should successfully restart the ingestion pipeline for existing documents
  • Document deletion should properly remove all related data

Actual Behavior

  • Resync fails with a duplicate key constraint violation
  • Deletion appears incomplete: related data seems to remain after a document is deleted

Technical Analysis

The issue seems to be related to:

  1. Resync Logic: When resyncing existing files, the system appears to create duplicate entries instead of updating existing ones
  2. Deletion Logic: Document deletion may not be properly cascading through all related tables/indexes

Possible Root Causes

  1. File checksum uniqueness constraint conflict during resync
  2. Incomplete cleanup in PostgreSQL/Elasticsearch during deletion
  3. Missing transaction rollback handling in resync workflow
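
To illustrate the first suspected cause, here is a minimal in-memory sketch of how a resync that goes through the upload path could hit a checksum uniqueness constraint, and how reusing the existing record avoids it. Everything in it is an assumption made for illustration: the class and method names (SourceDocumentStore, ResyncSketch, naiveResync, resync), the PENDING status, and the idea that the constraint is on the checksum column have not been confirmed against the actual code or schema.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a table with a UNIQUE constraint on checksum.
class SourceDocumentStore {
    // Illustrative record; field names are not taken from the real schema.
    static final class Doc {
        final long id;
        final String checksum;
        String status;

        Doc(long id, String checksum, String status) {
            this.id = id;
            this.checksum = checksum;
            this.status = status;
        }
    }

    private final Map<String, Doc> byChecksum = new HashMap<>();
    private long nextId = 1;

    // Mimics an INSERT that fails when the checksum already exists.
    Doc insert(String checksum) {
        if (byChecksum.containsKey(checksum)) {
            throw new IllegalStateException("duplicate key for checksum " + checksum);
        }
        Doc doc = new Doc(nextId++, checksum, "PENDING");
        byChecksum.put(checksum, doc);
        return doc;
    }

    Optional<Doc> findByChecksum(String checksum) {
        return Optional.ofNullable(byChecksum.get(checksum));
    }

    int size() {
        return byChecksum.size();
    }
}

class ResyncSketch {
    // Suspected failing behaviour: resync treated like a fresh upload.
    static SourceDocumentStore.Doc naiveResync(SourceDocumentStore store, String checksum) {
        return store.insert(checksum);
    }

    // Possible fix: reuse the existing record and reset its status
    // so the ingestion pipeline can pick it up again.
    static SourceDocumentStore.Doc resync(SourceDocumentStore store, String checksum) {
        SourceDocumentStore.Doc doc = store.findByChecksum(checksum)
                .orElseGet(() -> store.insert(checksum));
        doc.status = "PENDING";
        return doc;
    }
}
```

In this sketch, calling naiveResync for a checksum that is already stored throws, while resync returns the same record with its status reset. In the real service the equivalent change would presumably be a lookup followed by an update inside one transaction, but that depends on how the resync path is actually implemented.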

Environment

  • Version: 1.0.0
  • Component: open-context-core (Spring Boot)
  • Database: PostgreSQL 16
  • Deployment: Docker Compose

Additional Context

This affects the core document management workflow and prevents users from:

  • Re-processing documents after system updates
  • Properly removing unwanted documents
  • Maintaining clean document state

Suggested Investigation Areas

  1. Check SourceDocumentRepository resync implementation
  2. Verify CASCADE delete constraints in database schema
  3. Review transaction management in document ingestion pipeline
  4. Validate Elasticsearch index cleanup during deletion
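
As a rough aid for items 2 and 4, the following in-memory sketch shows what a complete deletion would need to guarantee: dependent rows and index entries are gone along with the parent document. The class DeletionSketch, its maps, and the notion of "chunks" are hypothetical stand-ins for the PostgreSQL tables and the Elasticsearch index, not the project's actual data model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: one parent table, one dependent table, one search index.
class DeletionSketch {
    // document id -> file name (stands in for the parent table)
    final Map<Long, String> documents = new HashMap<>();
    // document id -> chunk ids (stands in for dependent rows)
    final Map<Long, List<String>> chunksByDocument = new HashMap<>();
    // chunk id -> owning document id (stands in for the search index)
    final Map<String, Long> searchIndex = new HashMap<>();

    void addDocument(long id, String name, List<String> chunkIds) {
        documents.put(id, name);
        chunksByDocument.put(id, new ArrayList<>(chunkIds));
        for (String chunkId : chunkIds) {
            searchIndex.put(chunkId, id);
        }
    }

    // Incomplete deletion: only the parent record is removed.
    void naiveDelete(long id) {
        documents.remove(id);
    }

    // Complete deletion: index entries and dependent rows first, then the parent.
    void deleteDocument(long id) {
        List<String> chunkIds = chunksByDocument.remove(id);
        if (chunkIds != null) {
            for (String chunkId : chunkIds) {
                searchIndex.remove(chunkId);
            }
        }
        documents.remove(id);
    }

    // Counts everything that still refers to the document; 0 means a clean delete.
    long leftoverReferences(long id) {
        long indexed = searchIndex.values().stream()
                .filter(owner -> owner == id)
                .count();
        long rows = chunksByDocument.containsKey(id) ? chunksByDocument.get(id).size() : 0;
        long parent = documents.containsKey(id) ? 1 : 0;
        return indexed + rows + parent;
    }
}
```

A check along the lines of leftoverReferences, run against the real database and index after a delete, would show whether the deletion is actually incomplete and which store keeps the stale data.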

Labels

  • backend
  • bug