Beyond Occurrences: Bayesian Gaussian Processes for Relative Prevalence Species Distribution Modeling
Biogeography is a Python package for modeling species distributions with Gaussian processes (GPs). The core model, EcoGP, combines environment-focused and spatially focused GPs to capture complex ecological dependencies in microbial species data. The package is designed for scalable variational inference. For more details, refer to our paper (link not yet available).
- EcoGP: Multitask Gaussian Process model for species distribution modeling.
- Training: Variational inference with batch learning.
- Configs: Configuration files that are easy to adapt to your own data.
- Baselines: Standard models for comparison.
- Dataset Support: Ready-to-use datasets for butterflies, Central Park, and toy examples.
- Clone the repository:
    git clone https://github.com/MicrobialDarkMatter/biogeography.git
    cd biogeography
- Install dependencies:
    pip install -r requirements.txt
- EcoGP/model.py: Main GP model.
- EcoGP/train.py: Training script.
- configs/: Configuration files for experiments.
- EcoGP/baselines/: Baseline models.
- data/: Datasets.
Run the training script with a configuration file:
    python EcoGP/train.py --config configs/config.py
- Replace config.py with your desired config file.
- You can modify hyperparameters, dataset paths, and model options.
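As a rough illustration of the config-file workflow above, the sketch below writes a small Python config and loads it by path using only the standard library. The option names (DATA_PATH, BATCH_SIZE, LEARNING_RATE, KERNEL), the file name my_config.py, and the load_config helper are all hypothetical and are not taken from this repository; check the files in configs/ for the options EcoGP actually reads.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_config(path):
    """Import a Python file by path and return its upper-case names as a dict."""
    spec = importlib.util.spec_from_file_location("user_config", str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return {name: getattr(module, name) for name in dir(module) if name.isupper()}


# Hypothetical config contents (assumption): the real option names live in configs/.
EXAMPLE_CONFIG = '''
DATA_PATH = "data/my_dataset.csv"
BATCH_SIZE = 256
LEARNING_RATE = 0.01
KERNEL = "rbf"
'''

# Write the example config to a temporary directory and load it back.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "my_config.py"
    config_path.write_text(EXAMPLE_CONFIG)
    config = load_config(config_path)

print(config)
```

Keeping each experiment in its own config file means a run can be reproduced by pointing --config at the same file again.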
- Variational setup using Pyro.
- Implements a multitask variational GP using GPyTorch.
- Supports batch learning.
- Option to customize kernels.
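To make the batch-learning point concrete, here is a minimal, library-free sketch of the mini-batching used in stochastic variational inference: the data are visited in shuffled batches, and each batch's log-likelihood sum is rescaled by N / batch size so that it is an unbiased estimate of the full-data term in the ELBO. The helper names minibatches and scaled_log_likelihood are illustrative assumptions; in EcoGP itself this bookkeeping is presumably handled through Pyro and GPyTorch rather than written by hand.

```python
import random


def minibatches(n_samples, batch_size, seed=0):
    """Yield shuffled index batches that together cover every sample once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    for start in range(0, n_samples, batch_size):
        yield indices[start:start + batch_size]


def scaled_log_likelihood(batch_log_liks, n_samples):
    """Rescale a batch's summed log-likelihood to estimate the full-data sum."""
    return n_samples / len(batch_log_liks) * sum(batch_log_liks)


# One pass (epoch) over 10 samples in batches of at most 4.
batches = list(minibatches(n_samples=10, batch_size=4))
for batch in batches:
    print(len(batch), sorted(batch))

# Two per-sample log-likelihoods standing in for a batch drawn from 10 samples.
estimate = scaled_log_likelihood([-1.0, -2.0], n_samples=10)
print(estimate)
```

The rescaling is what lets the per-step cost depend on the batch size rather than on the full dataset size.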
- Place your datasets in the data/ directory.
- Supported datasets: butterflies, Central Park, toy examples.
- Baseline models are in EcoGP/baselines/.
- Use these for comparison with EcoGP.
- Python 3.11+
- See requirements.txt for all dependencies.
If you use this package in your research, please cite:
TO COME
See LICENSE for details.