🔬 EMUSES - Scientific Predictive Modeling Platform

Enabling collaborative scientific research through interpretable predictive modeling and seamless model sharing.

EMUSES transforms scientific data into predictive insights, supporting research workflows from individual analysis to community-wide collaboration. It is built for researchers who need both quick results and deep analytical control, in domains such as neuroimaging, astronomy, genetics, sociology, and economics.

🚀 Quick Start (5 minutes)

Prerequisites

Python 3.11+ (recommended for optimal performance)
Basic command line familiarity
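
To confirm that the active interpreter meets the version prerequisite before installing, a check along these lines can help. This is a minimal sketch; the helper name `meets_minimum` is illustrative and not part of EMUSES.

```python
import sys

MIN_VERSION = (3, 11)  # minimum version stated in the prerequisites


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) pair is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} satisfies the 3.11+ prerequisite.")
    else:
        print(f"Python {current} is older than 3.11; create a newer environment first.")
```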

Installation & First Analysis

Recommended: Using Conda (Best for Scientific Computing)

# 1. Create isolated environment
conda create -n emuses-env python=3.11
conda activate emuses-env

# 2. macOS ONLY: Install OpenMP (one-time setup)
# Required for XGBoost and other ML libraries
conda install -c conda-forge llvm-openmp  # macOS only

# 3. Install EMUSES
pip install git+https://github.com/chrisfoulon/emuses.git

# 4. Verify installation
python -m emuses.cli --help

# 5. Run your first analysis with the sample data
# (paths are relative to a clone of the repository; run from the repo root)
python -m emuses.cli full output_folder docs/examples/sample_data/hcp_input_data.csv --scores docs/examples/sample_data/hcp_labels.csv

Alternative: Using pip + venv (Lightweight)

# 1. Create isolated environment
python -m venv emuses-env
source emuses-env/bin/activate  # Linux/macOS
# emuses-env\Scripts\activate   # Windows

# 2. macOS ONLY: Install OpenMP via Homebrew (one-time setup)
# Required for XGBoost; uncomment the next line on macOS
# brew install libomp

# 3. Install EMUSES
pip install git+https://github.com/chrisfoulon/emuses.git

# 4. Verify installation
python -m emuses.cli --help

# 5. Run your first analysis with the sample data
# (paths are relative to a clone of the repository; run from the repo root)
python -m emuses.cli full output_folder docs/examples/sample_data/hcp_input_data.csv --scores docs/examples/sample_data/hcp_labels.csv

📝 macOS Users: XGBoost requires OpenMP for multi-threading. This is a one-time setup: run brew install libomp in a pip/venv environment, or conda install -c conda-forge llvm-openmp in a conda environment. Important: Conda users must install llvm-openmp through conda, because the Homebrew version won't be found inside conda environments.
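
The choice between the two OpenMP install commands can be sketched as a small helper. This is an illustration only, not part of EMUSES: the function name `openmp_install_hint` is hypothetical, and it assumes that an active conda environment can be detected through the `CONDA_PREFIX` variable that `conda activate` sets.

```python
import os
import platform


def openmp_install_hint(system=None, environ=None):
    """Return the suggested OpenMP install command, or None when not on macOS."""
    system = platform.system() if system is None else system
    environ = os.environ if environ is None else environ
    if system != "Darwin":
        return None  # the OpenMP note applies to macOS only
    # Assumption: CONDA_PREFIX is set whenever a conda environment is active.
    if environ.get("CONDA_PREFIX"):
        return "conda install -c conda-forge llvm-openmp"
    return "brew install libomp"


if __name__ == "__main__":
    hint = openmp_install_hint()
    print(hint if hint else "No extra OpenMP step is needed on this platform.")
```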

✅ Success: Your first scientific prediction model is ready in output_folder/!

📊 Understanding Your Results: Your analysis includes prediction models, statistical heatmaps, correlation analysis, and effect size maps. See results guide →

🔬 Research Use Cases

🏠 Individual Researchers

# Local analysis with your data
python -m emuses.cli full my_results/ my_brain_data.csv --scores my_cognitive_scores.csv

Perfect for: Exploratory analysis, method development, personal research projects
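
For scripted runs, the same command can be launched from Python. The sketch below only wraps the `full` invocation shown above with the standard `subprocess` module; the helper names are hypothetical, and it does not assume that EMUSES exposes a Python API.

```python
import subprocess
import sys


def build_full_command(output_dir, input_csv, scores_csv):
    """Assemble the argument list for the `full` command."""
    return [
        sys.executable, "-m", "emuses.cli", "full",
        str(output_dir), str(input_csv),
        "--scores", str(scores_csv),
    ]


def run_full_analysis(output_dir, input_csv, scores_csv):
    """Run the analysis; raises CalledProcessError on a non-zero exit code."""
    command = build_full_command(output_dir, input_csv, scores_csv)
    return subprocess.run(command, check=True)
```

Using `sys.executable` keeps the run inside whichever environment the script itself was started from, which avoids picking up a different `python` on the PATH.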

🏛️ Research Labs

# Multi-user collaboration with shared models
python -m emuses.cli models install shared_model.zip
python -m emuses.cli models list --workspace our_lab

Perfect for: Team collaboration, model validation, reproducible workflows
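
When a lab shares several model archives at once, the `models install` command can be repeated per file. This is a hedged sketch: it assumes shared models arrive as `.zip` archives in one folder (as in the `shared_model.zip` example), and the helper names are hypothetical rather than part of EMUSES.

```python
import subprocess
import sys
from pathlib import Path


def install_commands(archive_dir):
    """Build one `models install` command per .zip archive, sorted by file name."""
    archives = sorted(Path(archive_dir).glob("*.zip"))
    return [
        [sys.executable, "-m", "emuses.cli", "models", "install", str(archive)]
        for archive in archives
    ]


def install_all(archive_dir):
    """Run each install command in turn, stopping at the first failure."""
    for command in install_commands(archive_dir):
        subprocess.run(command, check=True)
```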

🌍 Scientific Community

# Access community models and benchmarks
python -m emuses.cli models search "fMRI working memory"
python -m emuses.cli models info community_model_v2

Perfect for: Meta-analyses, benchmarking, scientific reproducibility

⭐ Key Features

🔬 Research-Optimized: Designed for complex scientific prediction tasks (neuroimaging, astronomy, genetics, sociology, economics, and more)
🔄 Multi-Mode Flexibility: Local, collaborative, or cloud-based workflows
📊 Model Registry: Share, discover, and reproduce predictive models
🎯 Interpretable by Design: Built for scientific rigor and interpretability
⚡ Quick Start: From installation to results in under 5 minutes
🔬 Deep Control: Comprehensive configuration for advanced users

📚 Documentation Paths

Choose your path based on your needs:

🚀 Quick Start Guide

5-minute path to your first results
→ For time-constrained researchers who need immediate results

📖 Model Registry Guide

Comprehensive model sharing documentation
→ For researchers who want to understand model registry capabilities

🔬 Research Workflows Guide

Scientific use case patterns and methodological examples
→ For researchers implementing scientific analysis workflows across diverse domains (neuroimaging, astronomy, genetics, sociology, economics)

🔧 API Documentation

Interactive API reference
→ For computational scientists integrating EMUSES into workflows

👥 Developer Guide

Integration and contribution guide
→ For extending EMUSES or contributing to development

🎯 Getting Started by Setup Mode

🟢 Local Mode (Recommended for Beginners)

# Automatic setup - no configuration needed
python -m emuses.cli full output/ input_data.csv --scores scores.csv

🟡 Database Mode (Lab Collaboration)

# Multi-user setup with PostgreSQL
python -m emuses.cli models status  # Shows current mode
# See: docs/USER_GUIDE.md#database-mode-setup

🔴 Cloud Mode (Production/Community)

# Full production deployment
# See: docs/USER_GUIDE.md#cloud-mode-setup

🏗️ Installation Options

Standard Installation (Conda - Recommended)

# Create environment
conda create -n emuses-env python=3.11
conda activate emuses-env

# macOS: Install OpenMP
conda install -c conda-forge llvm-openmp  # macOS only

# Install EMUSES
pip install git+https://github.com/chrisfoulon/emuses.git

Development Installation

git clone https://github.com/chrisfoulon/emuses.git
cd emuses

# Create conda environment
conda create -n emuses-dev python=3.11
conda activate emuses-dev

# macOS: Install OpenMP
conda install -c conda-forge llvm-openmp  # macOS only

# Install in editable mode
pip install -e .

Production Installation

# With Docker for full deployment
docker pull ghcr.io/chrisfoulon/emuses:latest
# See: docs/deployment/ for complete setup

🧪 Sample Data & Examples

EMUSES includes real-world sample data from the Human Connectome Project:

Input: Neuroimaging features from 1068 subjects
Target: Fluid intelligence prediction task
Location: docs/examples/sample_data/

Perfect for testing workflows and learning EMUSES capabilities.
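
Before running an analysis, it can be useful to check the dimensions of the sample files. The sketch below uses only the standard `csv` module and assumes the files are comma-separated with a single header row, which the README does not state explicitly; `csv_shape` is an illustrative helper, not an EMUSES function.

```python
import csv


def csv_shape(path):
    """Return (data_rows, columns), assuming the first row is a header."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty file yields zero columns
        data_rows = sum(1 for _ in reader)
    return data_rows, len(header)


if __name__ == "__main__":
    # Paths as listed above; run from the root of a repository clone.
    for name in ("hcp_input_data.csv", "hcp_labels.csv"):
        rows, cols = csv_shape(f"docs/examples/sample_data/{name}")
        print(f"{name}: {rows} rows, {cols} columns")
```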

🤝 Research Community

EMUSES enables reproducible scientific research through:

Model Sharing: Publish and discover predictive models
Reproducible Workflows: Standardized analysis pipelines
Community Benchmarks: Compare methods across research groups
Open Science: Transparent and reproducible research practices

📄 Citation

If you use EMUSES in your research, please cite:

@software{emuses2024,
  title={EMUSES: Scientific Predictive Modeling Platform},
  author={Foulon, Chris and Contributors},
  year={2024},
  url={https://github.com/chrisfoulon/emuses},
  version={0.9.0}
}

🔗 Links

🌐 Documentation: Full documentation portal
🚀 Quick Start: 5-minute tutorial
📊 Model Registry: Model sharing guide
🐛 Issues: GitHub Issues
💬 Discussions: GitHub Discussions

📊 Project Status

Current Version: 0.9.0-dev (Model Registry Complete)
Next Release: 1.0.0 (Production Ready)
Test Coverage: 47.1%
Status: Pre-production, active development

Documentation developed using LAD (LLM-Assisted Development) methodology with human oversight.

🧠 Built for scientists, by scientists | ⚡ Quick results, deep control | 🤝 Individual to community scale
