Official code for ACL2025 "🔍 Retrieval Models Aren’t Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models"
Official codebase for the ACL 2025 Findings paper: Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval.
smallevals — CPU-fast, GPU-blazing fast offline retrieval evaluation for RAG systems with tiny QA models.
Research-grade neuro-symbolic RAG framework where retrieval is a policy, not a vector search, built for evaluation, ablation, and reliability analysis.
A systems-level analysis of static RAG pipelines, isolating ingestion, retrieval, and ranking boundaries to expose structural failure modes before generation.
Visual RAPTOR ColBERT Integration System - Multimodal document retrieval with SigLIP, PyMuPDF, and evaluation metrics.
A controlled experiment evaluating whether hybrid (dense + sparse) retrieval surfaces evidence that dense-only RAG systems misrank—without changing generation behavior.