Spring Boot integration service that transforms ABBYY OCR XML output into Equinor's standardized format using Apache Camel, with OAuth2 authentication and automated document processing

TorbenMerrald/equinorabbyyintegration

EquinorAbbyyIntegration

Document processing integration application that transforms ABBYY OCR XML output into Equinor's standardized XML format and delivers it to Equinor backend systems.

Features

  • XML Processing Pipeline: Polls folders for ABBYY-generated XML files and transforms them using XSLT
  • Schema Validation: Validates XML against XSD schemas before processing
  • Document Aggregation: Aggregates multiple XML documents by correlation ID
  • OAuth2 Authentication: Secure API access via Azure AD bearer tokens
  • Mailroom Integration: Sends scanned documents as PDF email attachments
  • Scheduled Cleanup: Automatic deletion of old processed files

Technologies

Technology      Version   Purpose
Spring Boot     2.6.0     Application framework
Apache Camel    3.13.0    Integration/routing engine
Java            8         Primary language
Groovy          -         Scripting and utilities
Saxon           -         XSLT transformation engine
Install4j       9.0.5     Windows installer generation

Project Structure

equinorabbyyintegration/
├── src/main/
│   ├── groovy/dk/bpas/equinor/gomintegration/
│   │   ├── GomApplication.java          # Spring Boot entry point
│   │   ├── routes/                       # Camel route definitions
│   │   ├── utilities/                    # Processing utilities
│   │   └── tools/                        # Services (email, cleanup)
│   └── resources/
│       ├── application.properties        # Configuration
│       ├── *.xsl                         # XSLT stylesheets
│       └── *.xsd                         # XML schemas
├── fromFolder/                           # Production input
├── fromTestFolder/                       # Test input
├── toFolder/                             # Production output
├── toTestFolder/                         # Test output
└── build.gradle                          # Build configuration

Configuration

Environment Variables

The following environment variables must be set for OAuth2 authentication:

Variable                    Description
DEV_OAUTH_CLIENT_ID         Development OAuth client ID
DEV_OAUTH_CLIENT_SECRET     Development OAuth client secret
QA_OAUTH_CLIENT_ID          QA OAuth client ID
QA_OAUTH_CLIENT_SECRET      QA OAuth client secret
PROD_OAUTH_CLIENT_ID        Production OAuth client ID
PROD_OAUTH_CLIENT_SECRET    Production OAuth client secret
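
As a rough illustration of how the variables above might be consumed, the sketch below resolves the client ID and secret for one environment prefix (DEV, QA or PROD) and fails fast when either is missing. The class and method names (OAuthCredentials, fromEnvironment) are hypothetical; this README does not show how the application actually reads these values.

```java
import java.util.Map;

/** Hypothetical helper: holds the OAuth2 client credentials for one environment. */
final class OAuthCredentials {
    final String clientId;
    final String clientSecret;

    private OAuthCredentials(String clientId, String clientSecret) {
        this.clientId = clientId;
        this.clientSecret = clientSecret;
    }

    /**
     * Looks up <PREFIX>_OAUTH_CLIENT_ID and <PREFIX>_OAUTH_CLIENT_SECRET
     * in the given map (pass System.getenv() in real use).
     */
    static OAuthCredentials fromEnvironment(String prefix, Map<String, String> env) {
        String id = require(env, prefix + "_OAUTH_CLIENT_ID");
        String secret = require(env, prefix + "_OAUTH_CLIENT_SECRET");
        return new OAuthCredentials(id, secret);
    }

    // Throws instead of returning null so a missing variable is caught at startup.
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }
}
```

A call such as OAuthCredentials.fromEnvironment("QA", System.getenv()) would then pick up the QA pair. Taking the map as a parameter keeps the lookup testable without touching the real process environment.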

Application Properties

application.properties holds the key settings:

  • Folder paths for file polling
  • OAuth2 endpoints for DEV/QA/PROD environments
  • Email settings for mailroom integration
  • File cleanup schedule and retention period
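
To make the retention idea concrete, here is a minimal sketch of deleting files older than a given period from one folder. It is not the project's cleanup service; OldFileCleaner and deleteOlderThan are hypothetical names, and the folder and retention period are passed in as parameters because this README does not list the actual property keys.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical cleanup helper, using only the JDK. */
final class OldFileCleaner {

    /**
     * Deletes regular files directly inside 'folder' whose last-modified time
     * is earlier than (now - retention). Subfolders are left untouched.
     * Returns the paths that were deleted.
     */
    static List<Path> deleteOlderThan(Path folder, Duration retention, Instant now)
            throws IOException {
        Instant cutoff = now.minus(retention);
        List<Path> deleted = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(folder)) {
            for (Path entry : entries) {
                boolean expired = Files.isRegularFile(entry)
                        && Files.getLastModifiedTime(entry).toInstant().isBefore(cutoff);
                if (expired) {
                    Files.delete(entry);
                    deleted.add(entry);
                }
            }
        }
        return deleted;
    }
}
```

In a Spring Boot application a method like this would typically be invoked from a scheduled task; passing 'now' explicitly makes the cutoff easy to test.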

Building

# Run tests
gradle test

# Build Spring Boot JAR
gradle bootJar

# Create Windows installer (requires Install4j license)
gradle copyInstall4JMedia

Running

java -jar build/libs/equinorabbyyintegration.jar

Data Flow

ABBYY XML Files → File Polling → BOM Removal → Namespace Stripping
    → Element Restructuring → XSD Validation → XSLT Transformation
    → Document Aggregation → OAuth2 Token → HTTP POST to Equinor
    → Output Saved
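
The early stages of the flow above (BOM removal, namespace stripping, XSD validation) can be sketched with the JDK's built-in JAXP APIs. This is an illustration under assumptions, not the project's code: the stylesheet below is a generic namespace-stripping template rather than the repository's remove_namespace.xsl, the schema is supplied by the caller, and the JDK's default XSLT processor stands in for Saxon. PreprocessingSketch and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import javax.xml.XMLConstants;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class PreprocessingSketch {

    // Illustrative XSLT 1.0 stylesheet: re-creates every element and attribute
    // by local name only. NOT the repository's remove_namespace.xsl.
    static final String STRIP_NAMESPACES_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='*'>"
      + "<xsl:element name='{local-name()}'>"
      + "<xsl:apply-templates select='@*|node()'/></xsl:element>"
      + "</xsl:template>"
      + "<xsl:template match='@*'>"
      + "<xsl:attribute name='{local-name()}'><xsl:value-of select='.'/></xsl:attribute>"
      + "</xsl:template>"
      + "<xsl:template match='text()|comment()|processing-instruction()'><xsl:copy/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Drops a leading UTF-8 byte order mark (EF BB BF) if one is present. */
    static byte[] stripBom(byte[] data) {
        boolean hasBom = data.length >= 3
            && (data[0] & 0xFF) == 0xEF
            && (data[1] & 0xFF) == 0xBB
            && (data[2] & 0xFF) == 0xBF;
        return hasBom ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    /** Applies the illustrative stylesheet and returns the namespace-free XML. */
    static String stripNamespaces(byte[] xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STRIP_NAMESPACES_XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new ByteArrayInputStream(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    /** Returns true if 'xml' validates against 'xsd'; false on any schema or parse error. */
    static boolean isValid(String xml, String xsd) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }
}
```

Chaining the three calls in the order stripBom, stripNamespaces, isValid mirrors the order of the stages in the diagram. In the real application these steps run inside Camel routes.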

Processing Steps

  1. Restructure XML: Uses remove_namespace.xsl and rename_details.xsl
  2. Validate XML: Uses abbyySummary.xsd, abbyyDetails.xsd, abbyySummary_Details.xsd, abbyyIncludes.xsd
  3. Transform to Equinor format: Uses Summary.xsl, Details.xsl, Summary_Details.xsl
  4. Aggregate: XmlAggregationStrategy combines multiple XMLs into one file
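
The aggregation step can be pictured as grouping XML fragments by correlation ID and emitting one combined document per group. The sketch below shows that idea in plain Java; it is not the project's XmlAggregationStrategy, whose behavior this README does not describe. The class name CorrelationAggregator and the Documents wrapper element are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for aggregation by correlation ID. */
final class CorrelationAggregator {

    // Fragments collected so far, keyed by correlation ID, in arrival order.
    private final Map<String, List<String>> groups = new LinkedHashMap<>();

    /** Adds one XML fragment (without an XML declaration) to its group. */
    void add(String correlationId, String xmlFragment) {
        groups.computeIfAbsent(correlationId, key -> new ArrayList<>()).add(xmlFragment);
    }

    /**
     * Closes the group and returns its fragments wrapped in a single root
     * element. The element name "Documents" is an assumption, not the
     * project's actual output format.
     */
    String complete(String correlationId) {
        List<String> parts = groups.remove(correlationId);
        if (parts == null) {
            throw new IllegalArgumentException("Unknown correlation ID: " + correlationId);
        }
        StringBuilder combined = new StringBuilder("<Documents>");
        for (String part : parts) {
            combined.append(part);
        }
        return combined.append("</Documents>").toString();
    }
}
```

In Camel, the same grouping is normally expressed with the aggregate EIP, where a correlation expression selects the group and an aggregation strategy merges each incoming exchange into the running result.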

Document Types Supported

  • Summary
  • Details
  • Summary_Details
  • Verification

Testing

Each document type has its own test case to verify that the generated Equinor XML files are correct.

gradle test

CI/CD

A push to Bitbucket triggers the pipeline defined in bitbucket-pipelines.yml:

  1. Run Gradle tests
  2. Build Spring Boot JAR
  3. Generate Windows EXE installer
  4. Upload artifacts

The EXE is built automatically if all tests pass.

Security

See AGENTS.md for security review findings and credential management practices.

License

BPA Solutions A/S
