CRAVE

This repo provides the official implementation of CRAVE, as described in the paper:

Borrowing Eyes for the Blind Spot: Overcoming Data Scarcity in Malicious Video Detection via Cross-Domain Retrieval Augmentation

Introduction

CRAVE is a Cross-domain Retrieval AugmEntation framework that addresses data scarcity in malicious video detection.
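
This README does not describe the retrieval step itself, so the following is only a generic illustration of what retrieving cross-domain neighbors for a video can look like, not the method from the paper. It assumes each item is already represented by a feature vector and ranks image-text items by cosine similarity to a video feature. The function names, item ids, and toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(query, bank, k=2):
    """Return the ids of the k bank items most similar to the query vector."""
    ranked = sorted(bank, key=lambda item_id: cosine(query, bank[item_id]), reverse=True)
    return ranked[:k]


# Toy features (hypothetical): one video query and three image-text items.
video_feature = [1.0, 0.0, 0.5]
image_text_bank = {
    "img_a": [0.9, 0.1, 0.4],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [0.5, 0.5, 0.5],
}

neighbors = retrieve_top_k(video_feature, image_text_bank, k=2)
print(neighbors)
```

The actual feature extraction and retrieval code lives in the preprocess and retrieve directories and may differ from this sketch in every detail.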

Structure

├── data    # dataset path
│   ├── FakeTT
│   ├── FVC
│   ├── HateMM
│   ├── MHClipEN
│   ├── Fakeddit
│   └── FHM
├── preprocess  # code for preprocessing data
│   ├── extract_audio.py
│   ├── extract_fea_image
│   ├── extract_fea_video
│   └── extract_key_frames
├── retrieve    # code of conducting retrieval
│   ├── make_image_retrieval_feature.py
│   ├── make_retrieval_result_keyframe.py
│   ├── utils.py
│   └── video_direct_retrieval
├── run         # script for preprocessing and retrieval
└── src         # code of model arch and training
    ├── config
    ├── main.py     # main code for training
    ├── model
    │   ├── Base
    │   └── CRAVE    # implementation of CRAVE
    └── utils
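
As a quick sanity check of the layout above, a small helper can report which dataset directories are still missing under data. This is a minimal sketch and not part of the repository; the function name missing_dataset_dirs is hypothetical, and the dataset names are taken from the tree above.

```python
from pathlib import Path

# Dataset directory names as listed in the structure tree.
DATASETS = ["FakeTT", "FVC", "HateMM", "MHClipEN", "Fakeddit", "FHM"]


def missing_dataset_dirs(root="data", datasets=DATASETS):
    """Return the dataset names that have no directory under root."""
    root = Path(root)
    return [name for name in datasets if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("All dataset directories are present.")
```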

Dataset

We provide IDs for each dataset split. Due to copyright restrictions, the raw datasets are not included in this repository. You can obtain them from their respective original project sites.

Video Dataset

FakeTT
FVC
HateMM
MHClipEN

Image-text Dataset

FHM
Fakeddit

Usage

Requirement

To set up the environment, run the following commands:

conda create --name CRAVE python=3.12
conda activate CRAVE
pip install -r requirements.txt

Preprocess

Download the datasets and store them under the data directory shown in the Structure section. Within each dataset's directory, save videos to videos and images to img.
For video datasets, save a data.jsonl file in each dataset directory, with each line containing vid, title, ocr, transcript, and label.
For image-text datasets, save a data.jsonl file in each dataset directory, with each line containing id, text, and label.
Run the following scripts to preprocess the data and generate the retrieval results:

bash run/preprocess.sh  # preprocess data
bash run/retrieve.sh    # generate retrieval result
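
Before running these scripts, it can help to confirm that every line of a data.jsonl file carries the fields listed above. The following is a minimal sketch of such a check and is not part of the repository. The field names come from this README; the helper name find_invalid_lines, the sample records, and the integer label values are assumptions made for illustration.

```python
import json

# Required fields per line, as listed in the Preprocess section.
VIDEO_FIELDS = {"vid", "title", "ocr", "transcript", "label"}
IMAGE_TEXT_FIELDS = {"id", "text", "label"}


def find_invalid_lines(lines, required_fields):
    """Return (line_number, missing_fields) for every record lacking a field."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        record = json.loads(line)
        missing = sorted(required_fields - set(record))
        if missing:
            problems.append((number, missing))
    return problems


# Hypothetical records for illustration only; real values come from the datasets.
sample = [
    json.dumps({"vid": "v1", "title": "t", "ocr": "", "transcript": "", "label": 0}),
    json.dumps({"vid": "v2", "title": "t", "label": 1}),
]
print(find_invalid_lines(sample, VIDEO_FIELDS))
```

To check a real file, pass an open file handle instead of the sample list, for example `find_invalid_lines(open("data/FakeTT/data.jsonl", encoding="utf-8"), VIDEO_FIELDS)`.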

Run

python src/main.py --config-name CRAVE_FakeTT.yaml     # run CRAVE on FakeTT
python src/main.py --config-name CRAVE_FVC.yaml        # run CRAVE on FVC
python src/main.py --config-name CRAVE_HateMM.yaml     # run CRAVE on HateMM
python src/main.py --config-name CRAVE_MHClipEN.yaml   # run CRAVE on MHClipEN
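
To run all four video datasets in sequence, the commands above can be wrapped in a short driver. This is a sketch under the assumption that the config files follow the CRAVE_<dataset>.yaml naming shown above; the helpers build_command and run_all are hypothetical and not part of the repository. By default it only prints the commands (dry run).

```python
import subprocess
import sys

# Video datasets with a CRAVE config, as listed in the commands above.
VIDEO_DATASETS = ["FakeTT", "FVC", "HateMM", "MHClipEN"]


def build_command(dataset):
    """Build the training command for one dataset."""
    return [sys.executable, "src/main.py", "--config-name", f"CRAVE_{dataset}.yaml"]


def run_all(datasets=VIDEO_DATASETS, dry_run=True):
    """Print each command, or execute it and stop at the first failure."""
    for dataset in datasets:
        command = build_command(dataset)
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all()  # pass dry_run=False to actually launch training
```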

Citation

@inproceedings{hong2025borrowing,
	author = {Hong, Rongpei and Lang, Jian and Zhong, Ting and Zhou, Fan},
	booktitle = {IEEE International Conference on Computer Vision ({ICCV})},
	year = {2025},
	title = {Borrowing Eyes for the Blind Spot: Overcoming Data Scarcity in Malicious Video Detection via Cross-Domain Retrieval Augmentation},
}
