Our repository is split into two sections, one for model code and the other for evaluation.
- Clone the repository
```bash
git clone https://github.com/markendo/FEATHER
cd FEATHER
```
- Install packages
```bash
conda create -n feather python=3.10 -y
conda activate feather
cd prismatic-vlms
pip install -e .
cd ../vlm-evaluation
pip install -e .
cd ..
```
- Prepare data
The script for preparing evaluation datasets is at vlm-evaluation/scripts/datasets/prepare.py. More information about dataset preparation is available in the original vlm-evaluation codebase. Finally, copy your Hugging Face token to vlm-evaluation/.hf_token.
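The token step can be sketched as a small shell helper. This is a minimal sketch, not part of the repository: the function name `save_hf_token` and the `HF_TOKEN` environment variable are assumptions introduced for illustration. Only the target path `vlm-evaluation/.hf_token` comes from the instructions above.

```bash
# Hypothetical helper (not part of the FEATHER repository).
# Writes a Hugging Face token to <eval_dir>/.hf_token, readable by the owner only.
save_hf_token() {
  local token="${1:-}"                   # the token string
  local eval_dir="${2:-vlm-evaluation}"  # directory that should hold .hf_token

  if [ -z "$token" ]; then
    echo "error: empty token" >&2
    return 1
  fi
  if [ ! -d "$eval_dir" ]; then
    echo "error: directory not found: $eval_dir" >&2
    return 1
  fi

  # Write the token, then restrict the file to the owner.
  printf '%s\n' "$token" > "$eval_dir/.hf_token" && chmod 600 "$eval_dir/.hf_token"
}
```

From the repository root, with the token exported as `HF_TOKEN` (an assumed variable name), the call would be `save_hf_token "$HF_TOKEN"`.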
We provide code for our experiments evaluating several token-pruning criteria: FastV, our modified version that removes RoPE from the criterion, and our final FEATHER approach.
```bash
export DATASET_ROOT_DIR=/path/to/dataset/directory/
cd vlm-evaluation
bash scripts/eval_fastv.sh
bash scripts/eval_fastv_norope.sh
bash scripts/eval_feather.sh
```
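If you prefer to run the three evaluations in one go, the commands above could be wrapped as follows. This is a sketch under stated assumptions: the function name `run_pruning_evals` is hypothetical and not provided by the repository. It only checks that `DATASET_ROOT_DIR` points to an existing directory and that each script is present, then runs the scripts in the same order as above, stopping at the first failure.

```bash
# Hypothetical wrapper (not part of the FEATHER repository).
# Runs the three evaluation scripts in order and stops at the first failure.
run_pruning_evals() {
  local eval_dir="${1:-.}"   # path to the vlm-evaluation directory
  local name script

  # Fail early if the dataset root is unset or not a directory.
  if [ -z "${DATASET_ROOT_DIR:-}" ] || [ ! -d "$DATASET_ROOT_DIR" ]; then
    echo "error: DATASET_ROOT_DIR must point to an existing directory" >&2
    return 1
  fi
  # Make sure the child scripts can see the variable.
  export DATASET_ROOT_DIR

  for name in fastv fastv_norope feather; do
    script="scripts/eval_${name}.sh"
    if [ ! -f "$eval_dir/$script" ]; then
      echo "error: missing $eval_dir/$script" >&2
      return 1
    fi
    echo "== running $script =="
    # Run from inside eval_dir, as in the commands above.
    ( cd "$eval_dir" && bash "$script" ) || return 1
  done
}
```

From the repository root, this would be invoked as `run_pruning_evals vlm-evaluation` after setting `DATASET_ROOT_DIR`.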
Below are the results on OCID-Ref and the RefCOCO variants (RefCOCOg, RefCOCO+, and RefCOCO).
| Criteria | OCID-Ref | RefCOCOg | RefCOCO+ | RefCOCO |
|---|---|---|---|---|
| FastV | 5.8 | 5.0 | 6.6 | 7.6 |
| FastV w/o RoPE | 23.2 | 15.0 | 13.8 | 15.4 |
| FEATHER | 32.5 | 38.7 | 38.7 | 43.4 |
The main implementation of FEATHER is provided in prismatic-vlms/prismatic/models/backbones/llm/llama2_models.py. Note that results can vary slightly depending on the attention implementation used.
This repository is built on top of the prismatic-vlms and vlm-evaluation codebases.
```bibtex
@article{endo2025feather,
  author = {Endo, Mark and Wang, Xiaohan and Yeung-Levy, Serena},
  title = {Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration},
  journal = {ICCV},
  year = {2025},
}
```