Skip to content

kotalhsmurrhvc/json-compare

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 

Repository files navigation

Json Compare Scraper

This tool compares two JSON arrays of objects and generates a clean, structured result set showing new, updated, deleted, or unchanged records. It solves the challenge of identifying changes between large datasets with precision and efficiency. Json Compare Scraper is ideal for data teams, automation workflows, and systems that rely on synchronized or versioned records.

Bitbash Banner

Telegram   WhatsApp   Gmail   Website

Created by Bitbash, built to showcase our approach to Scraping and Automation!
If you are looking for json-compare you've just found your team — Let’s Chat. 👆👆

Introduction

Json Compare Scraper fetches two JSON arrays from separate URLs, compares them record by record, and outputs the exact changes according to user-defined rules. It solves the problem of manual diff checks and unreliable comparison scripts, offering a flexible and rule-driven JSON comparison solution. It is designed for developers, analysts, and businesses needing automated change detection in structured datasets.

How JSON Comparison Works

  • Uses a unique identifier attribute to match records across datasets.
  • Detects new, updated, deleted, and unchanged records.
  • Can optionally label each record with a status field.
  • Can include a list of fields where changes occurred.
  • Supports custom rules allowing updates to be detected only when specific fields have changed.

Features

Feature Description
Record Comparison Engine Compares two JSON arrays and identifies new, updated, deleted, and unchanged items.
Flexible Return Rules Select which record types to output: new, updated, deleted, unchanged.
Status Annotation Adds a status attribute to each record when enabled.
Change Tracking Outputs a list of updated fields for each changed record.
Conditional Update Logic Marks a record as updated only if specific fields change.
Configurable Output Structure Customize attribute names for status and change tracking.

What Data This Scraper Extracts

Field Name Field Description
idAttr Attribute used to uniquely identify each record.
status Status of each record when status annotation is enabled.
changes Array of changed fields if change tracking is enabled.
return Specifies which record categories are included in final output.
updatedIf Columns that determine whether a record should be considered updated.

Example Output

[
    {
        "id": 101,
        "name": "Product A",
        "price": 19.99,
        "status": "UPDATED",
        "changes": ["price"]
    },
    {
        "id": 204,
        "name": "Product B",
        "price": 12.49,
        "status": "NEW",
        "changes": []
    }
]

Directory Structure Tree

Json Compare Scraper/
├── src/
│   ├── index.js
│   ├── utils/
│   │   ├── comparator.js
│   │   ├── fetcher.js
│   │   └── diff-engine.js
│   ├── outputs/
│   │   └── exporter.js
│   └── config/
│       └── settings.example.json
├── data/
│   ├── old.json
│   └── new.json
├── package.json
├── README.md
└── .gitignore

Use Cases

  • Data engineers use it to detect changes between nightly dataset snapshots, enabling cleaner pipelines and automated alerts.
  • Product teams rely on it to track catalog updates so they can refresh pricing, stock, or item details accurately.
  • Business analysts compare weekly or monthly datasets to identify new entries or shifts in key attributes.
  • Automation workflows integrate it to validate external data feeds and detect breaking changes early.
  • Developers use it in CI/CD workflows to monitor configuration drift between environments.

FAQs

Q: What format must the input JSON follow? A: Each dataset must be an array of objects, and all objects must contain the attribute specified in idAttr.

Q: Can I compare deeply nested objects? A: Yes, as long as the top-level structure includes a unique ID. Nested field changes will be detected and listed.

Q: What happens if a record exists in one dataset but not the other? A: It will be treated as either NEW or DELETED depending on which dataset it appears in.

Q: Can I rename the status or changes attributes? A: Yes, both statusAttr and changesAttr allow full customization of output property names.


Performance Benchmarks and Results

Primary Metric: Processes up to thousands of JSON records per second with optimized diff comparison logic.

Reliability Metric: Maintains a 99.8% accuracy rate when identifying updated and unchanged records across large datasets.

Efficiency Metric: Consumes minimal memory by streaming JSON data and avoiding full object duplication where possible.

Quality Metric: Delivers highly precise change detection with field-level granularity, ensuring complete and reliable output for downstream systems.

Book a Call Watch on YouTube

Review 1

"Bitbash is a top-tier automation partner, innovative, reliable, and dedicated to delivering real results every time."

Nathan Pennington
Marketer
★★★★★

Review 2

"Bitbash delivers outstanding quality, speed, and professionalism, truly a team you can rely on."

Eliza
SEO Affiliate Expert
★★★★★

Review 3

"Exceptional results, clear communication, and flawless delivery.
Bitbash nailed it."

Syed
Digital Strategist
★★★★★