This project is a FastAPI application designed for matching brand and company names using fuzzy string matching. It leverages the rapidfuzz library for efficient string matching and integrates with a JSON dataset to provide accurate and fast brand-company associations.
- Fuzzy String Matching: Utilizes `rapidfuzz` for matching user-input brand and company names to a dataset.
- FastAPI Framework: Built using FastAPI for efficient and easy-to-use web API development.
- Dynamic Dataset Integration: Integrates with a JSON-based dataset for brands and companies, allowing dynamic data handling.
- Customizable Thresholds: Utilizes environment variables for customizable matching thresholds.
- Clone the Repository:
- Clone the repository from GitHub.
- Install Dependencies:
- Install required Python libraries: `fastapi`, `numpy`, `pandas`, `rapidfuzz`, `uvicorn`.
- Dataset Preparation:
- Place your `DatasetMapping.json` file in the project directory.
- Environment Variables:
- Set `THRESHOLD_COMPANY_MATCH` and `THRESHOLD_CONF_LEVEL` in your environment to customize the matching thresholds.
- Run the application using Uvicorn: `uvicorn main:app --reload`
- The application will be served at `http://127.0.0.1:8000/`.
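
The two thresholds named in the Environment Variables step could be read along the following lines. The variable names come from this README; the `read_threshold` helper, the float type, and the fallback values (80 and 70) are assumptions, since the project's actual defaults are not stated here.

```python
import os


def read_threshold(name, default):
    """Read a numeric threshold from the environment, falling back to default."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return float(raw)
    except ValueError:
        # Fail early with a clear message instead of matching with a bad value.
        raise ValueError(f"{name} must be a number, got {raw!r}") from None


# Assumed defaults, for illustration only.
THRESHOLD_COMPANY_MATCH = read_threshold("THRESHOLD_COMPANY_MATCH", 80.0)
THRESHOLD_CONF_LEVEL = read_threshold("THRESHOLD_CONF_LEVEL", 70.0)
```

With a helper like this, setting the variables in the shell before starting Uvicorn changes the thresholds without editing code.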
- Endpoint: `GET /`
- Parameters:
  - `brand`: The brand name to match.
  - `company`: The company name to match.
- Returns: A JSON response with the best match and confidence level.
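
A minimal client for the endpoint above might look like this, using only the standard library. The helper names `build_match_url` and `fetch_match` are hypothetical, and because the response field names are not documented here, the sketch returns the parsed JSON as-is rather than assuming a schema.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000/"


def build_match_url(brand, company, base_url=BASE_URL):
    """Build the GET / URL with URL-encoded brand and company parameters."""
    return base_url + "?" + urlencode({"brand": brand, "company": company})


def fetch_match(brand, company, base_url=BASE_URL):
    """Call the endpoint and return the decoded JSON body (the server must be running)."""
    with urlopen(build_match_url(brand, company, base_url), timeout=5) as resp:
        return json.load(resp)


print(build_match_url("Coca Cola", "The Coca-Cola Company"))
```

Encoding the parameters with `urlencode` keeps names containing spaces or ampersands from breaking the query string.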
Contributions to enhance or expand the project are welcome. Please fork the repository and submit a pull request with your changes.
- Adding more datasets for broader matching capabilities.
- Improving the matching algorithm for higher accuracy.
- Incorporating advanced data preprocessing techniques.