Comprehensive Python toolkit for Drosophila olfactory research: DoOR database integration, FlyWire connectomics, pathway analysis, and neural network preprocessing.
Extract, analyze, and integrate Drosophila melanogaster odorant-receptor response data with connectome analysis. No R installation required.
NEW in v1.0.0: Complete mushroom body circuit validation with ORN→PN→KC→MBON pathway tracing! 🎉
- ✅ Pure Python - Extract DoOR R data files without installing R
- 🚀 Fast - Parquet-based caching for quick loading
- 📊 693 odorants × 78 receptors - Comprehensive olfactory data
- 🔍 Search & Filter - Query by odorant name, receptor, or properties
- 🧠 Interglomerular Cross-Talk - Analyze lateral inhibition pathways
- 🔬 NetworkX Graphs - 108,980+ pathways across 38 glomeruli
- 📈 Statistical Analysis - Hub detection, community detection, asymmetry
- 🎨 Publication-Ready Figures - High-resolution network visualizations
- 🎯 ORN → PN → KC → MBON Tracing - Complete learning circuit pathways
- 🧬 Anatomical Validation - Validate LASSO-identified receptors in MB circuits
- 🏆 Priority Ranking - Integrate behavioral importance with connectivity
- 📊 Circuit Classification - Appetitive (α/β) vs Aversive (γ) lobe specialization
- 🔬 Experimental Design - Generate priority matrices for optogenetic validation
- 🗺️ FlyWire Integration - Map receptors to neural connectivity (100K+ cells)
- 🛤️ Pathway Analysis - Trace Or47b, Or42b, Or92a pathways
- 🤖 ML-Ready - PyTorch/NumPy integration with sparse encoding
- 🧪 Experiment Design - PGCN blocking protocol generation
- 🎓 LASSO Behavioral Prediction - Identify sparse receptor circuits from optogenetic data
# Core package
pip install door-python-toolkit
# With all features (quotes keep shells such as zsh from expanding the brackets)
pip install "door-python-toolkit[all]"
# Individual feature sets
pip install "door-python-toolkit[flywire]" # FlyWire integration
pip install "door-python-toolkit[connectomics]" # Connectomics module
pip install "door-python-toolkit[torch]" # PyTorch support
pip install "door-python-toolkit[extract]" # DoOR extraction

from door_toolkit import DoOREncoder
# Load encoder
encoder = DoOREncoder("door_cache")
# Encode single odorant → 78-dim PN activation vector
pn_activation = encoder.encode("acetic acid")
print(pn_activation.shape) # (78,)
# Search odorants
acetates = encoder.list_available_odorants(pattern="acetate")
print(f"Found {len(acetates)} acetates") # 36

from door_toolkit.connectomics import CrossTalkNetwork
from door_toolkit.connectomics.pathway_analysis import analyze_single_orn
# Load network
network = CrossTalkNetwork.from_csv('interglomerular_crosstalk_pathways.csv')
network.set_min_synapse_threshold(10)
# Analyze DL5 glomerulus
results = analyze_single_orn(network, 'ORN_DL5', by_glomerulus=True)
print(f"Found {results.num_pathways} cross-talk pathways")

- Installation
- Core DoOR Features
- Connectomics Module
- FlyWire Integration
- Mushroom Body Circuit Validation
- Pathway Analysis
- Neural Network Preprocessing
- Command-Line Interface
- API Reference
- Examples
- Citation
- Contributing
- License
The Database of Odorant Responses (DoOR) is a comprehensive collection of odorant-receptor response measurements for Drosophila melanogaster.
Published: Münch & Galizia (2016), Scientific Data 3:160122
Citation: https://doi.org/10.1038/sdata.2016.122
| Metric | Value |
|---|---|
| Odorants | 693 compounds |
| Receptors | 78 ORN types (Or, Ir, Gr) |
| Measurements | 7,381 odorant-receptor pairs |
| Sparsity | 86% (typical for chemical screens) |
| Response Range | [0, 1] normalized |
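The sparsity figure follows from the other rows of the table. The short check below assumes that sparsity means the fraction of odorant-receptor cells without a measurement; that definition is an assumption, the counts are taken from the table.

```python
# Counts taken from the table above.
n_odorants = 693
n_receptors = 78
n_measured = 7_381

total_cells = n_odorants * n_receptors      # all possible odorant-receptor pairs
fill_fraction = n_measured / total_cells    # share of pairs with a measurement
sparsity = 1.0 - fill_fraction              # share of pairs left empty (assumed definition)

print(f"{total_cells} cells, {fill_fraction:.1%} measured, {sparsity:.1%} empty")
```

Under that definition the 7,381 measurements fill roughly one cell in seven of the 693 × 78 matrix, which is where the 86% figure comes from.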
from door_toolkit import DoORExtractor
# Extract R data files to Python formats
extractor = DoORExtractor(
input_dir="path/to/DoOR.data/data", # Unzipped DoOR R package
output_dir="door_cache"
)
extractor.run()

from door_toolkit import DoOREncoder
# Load encoder
encoder = DoOREncoder("door_cache")
# Encode batch
odors = ["acetic acid", "1-pentanol", "ethyl acetate"]
pn_batch = encoder.batch_encode(odors)
print(pn_batch.shape) # (3, 78)
# Get metadata
stats = encoder.get_receptor_coverage("acetic acid")
print(f"Active receptors: {stats['n_active']}")

Comprehensive tools for analyzing interglomerular cross-talk in the Drosophila olfactory system using FlyWire connectome data.
✅ Network Construction
- NetworkX-based directed graph (108,980+ pathways)
- Hierarchical representation: individual neurons + glomerulus meta-nodes
- 2,828 neurons across 38 glomeruli
- Synapse-weighted edges with configurable thresholds
✅ Four Analysis Modes
- Single ORN Focus - All pathways from one ORN/glomerulus
- ORN Pair Comparison - Bidirectional cross-talk quantification
- Full Network View - Global topology and statistics
- Pathway Search - Find specific connections
✅ Statistical Analyses
- Hub neuron detection (degree, betweenness, closeness, eigenvector centrality)
- Community detection (Louvain, greedy modularity, label propagation)
- Asymmetry quantification
- Path length distributions
✅ Biophysical Parameters
- Research-based parameters (Wilson, Olsen, Kazama labs)
- Dale's law enforcement
- Synaptic time constants for ACh and GABA
from door_toolkit.connectomics import CrossTalkNetwork
from door_toolkit.connectomics.pathway_analysis import analyze_single_orn, compare_orn_pair
from door_toolkit.connectomics.statistics import NetworkStatistics
from door_toolkit.connectomics.visualization import NetworkVisualizer
# Load network
network = CrossTalkNetwork.from_csv('interglomerular_crosstalk_pathways.csv')
network.set_min_synapse_threshold(10)
# Mode 1: Analyze single glomerulus
results = analyze_single_orn(network, 'ORN_DL5', by_glomerulus=True)
print(f"Found {results.num_pathways} pathways from DL5")
# Mode 2: Compare two glomeruli
comparison = compare_orn_pair(network, 'ORN_DL5', 'ORN_VA1v', by_glomerulus=True)
print(f"Asymmetry ratio: {comparison.get_asymmetry_ratio():.3f}")
# Mode 3: Full network analysis
stats = NetworkStatistics(network)
hubs = stats.detect_hub_neurons(method='betweenness', threshold_percentile=95)
communities = stats.detect_communities(algorithm='louvain', level='glomerulus')
print(f"Found {len(hubs)} hub neurons, {max(communities.values()) + 1} communities")
# Mode 4: Pathway search
from door_toolkit.connectomics.pathway_analysis import find_pathways
pathways = find_pathways(network, 'ORN_VM7v', 'ORN_D', by_glomerulus=True)
print(f"Found {pathways['num_pathways']} pathways")
# Visualization
visualizer = NetworkVisualizer(network)
visualizer.plot_full_network(output_path='network.png', min_synapse_display=50)
visualizer.plot_single_orn_pathways('ORN_DL5', output_path='DL5_pathways.png')
visualizer.plot_glomerulus_heatmap(output_path='heatmap.png')

The antennal lobe processes olfactory information through:
- ORNs - Express specific odorant receptors, converge into glomeruli
- Local Neurons (LNs) - GABAergic inhibitory neurons mediating lateral inhibition
- Projection Neurons (PNs) - Cholinergic neurons to higher brain centers
Lateral inhibition mechanisms:
- ORN → LN → ORN: Lateral inhibition between glomeruli (52% of pathways, median 3 synapses)
- ORN → LN → PN: Feedforward inhibition to PNs (16% of pathways)
- ORN → PN → feedback: Feedback loops (20% of pathways, up to 1,018 synapses)
Our analysis revealed:
- Hub LNs: lLN2T_c, lLN2X04, lLN8, LN60b (prime optogenetic targets)
- 15 functional communities with one major 22-glomerulus cluster
- VM7v acts as convergence hub receiving from multiple glomeruli
- Asymmetric connectivity patterns suggesting specialized functions
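To make the pathway categories and the synapse threshold concrete, here is a minimal standard-library sketch. It is not the toolkit's implementation: the record fields (`source`, `via`, `target`, `synapses`), the toy values, and the `>=` threshold rule are all assumptions chosen for illustration.

```python
from collections import Counter

# Toy pathway records; field names and values are hypothetical,
# not the schema of interglomerular_crosstalk_pathways.csv.
pathways = [
    {"source": "ORN_DL5", "via": "LN", "target": "ORN_VA1v", "synapses": 3},
    {"source": "ORN_DL5", "via": "LN", "target": "ORN_VM7v", "synapses": 42},
    {"source": "ORN_VA1v", "via": "LN", "target": "PN_VM7v", "synapses": 15},
    {"source": "ORN_VM7v", "via": "PN", "target": "ORN_DL5", "synapses": 120},
]


def pathway_type(record):
    """Label a pathway by the cell classes it passes through, e.g. 'ORN→LN→ORN'."""
    src = record["source"].split("_")[0]
    dst = record["target"].split("_")[0]
    return f"{src}→{record['via']}→{dst}"


def type_shares(records, min_synapses):
    """Fraction of pathways per type after dropping connections below the threshold."""
    kept = [r for r in records if r["synapses"] >= min_synapses]
    if not kept:
        return {}
    counts = Counter(pathway_type(r) for r in kept)
    return {kind: n / len(kept) for kind, n in counts.items()}


shares = type_shares(pathways, min_synapses=10)
for kind, share in sorted(shares.items()):
    print(f"{kind}: {share:.0%}")
```

Raising the threshold removes the weak 3-synapse lateral connection, which shifts the relative share of each pathway type. The same effect is worth keeping in mind when comparing percentages computed at different thresholds.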
The connectomics module includes a robust identifier resolution system that automatically normalizes messy ORN/glomerulus names and maps receptor names to their glomerulus names.
Key features:
- Format-agnostic: Accepts "DL3", "dl3", "ORN_DL3", "ORN-DL3", "Glomerulus DL3"; all resolve to "ORN_DL3"
- Receptor-to-glomerulus mapping: Automatically maps "Or7a" → "ORN_DL5", "Ir31a" → "ORN_VL2p", "Gr21a" → "ORN_V"
- Complete coverage: Includes 44 receptors (33 Or, 10 Ir, 1 Gr) mapped to their FlyWire glomeruli
- Fuzzy matching: Suggests alternatives when exact matches fail (ranked by similarity)
- Clear errors: Provides actionable error messages with top 10 suggestions
In FlyWire, neurons are labeled by glomerulus name (e.g., ORN_VL2p for Ir31a), not by receptor name. The resolver handles this translation automatically, so you can use familiar receptor names like "Ir31a" or "Or7a" in your code.

The system combines normalization (case-insensitive, separator-agnostic) with receptor mapping and fuzzy matching to prevent "non-matching ORN name" errors. All pathway analysis functions (analyze_single_orn, compare_orn_pair, find_pathways) accept both receptor names and glomerulus names.

See examples/connectomics/example_orn_identifier_resolution.py for a complete demonstration.
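The resolution steps described above (normalize, map receptor to glomerulus, fall back to ranked suggestions) can be sketched with the standard library alone. This is an illustrative approximation, not the toolkit's resolver: `resolve_orn`, the lookup tables, and the similarity cutoff are hypothetical, and only a handful of the 44 receptors are included.

```python
import difflib
import re

# Small illustrative subset; the names come from the examples in this README.
KNOWN_GLOMERULI = ["DL3", "DL5", "VL2p", "V", "VA1v", "VM7v"]
RECEPTOR_TO_GLOMERULUS = {"or7a": "DL5", "ir31a": "VL2p", "gr21a": "V"}

# Lower-case lookup that restores canonical capitalisation (e.g. 'vl2p' -> 'VL2p').
_CANONICAL = {g.lower(): g for g in KNOWN_GLOMERULI}


def resolve_orn(name):
    """Return the canonical 'ORN_<glomerulus>' label or raise ValueError."""
    key = name.strip().lower()
    # Drop an optional 'ORN' or 'Glomerulus' prefix together with its separator.
    key = re.sub(r"^(orn|glomerulus)[\s_\-]*", "", key)
    # Receptor names are translated to their glomerulus before the lookup.
    key = RECEPTOR_TO_GLOMERULUS.get(key, key).lower()
    if key in _CANONICAL:
        return f"ORN_{_CANONICAL[key]}"
    # No exact match: offer up to 10 suggestions ranked by similarity.
    close = difflib.get_close_matches(key, list(_CANONICAL), n=10, cutoff=0.4)
    hint = ", ".join(f"ORN_{_CANONICAL[c]}" for c in close) or "none"
    raise ValueError(f"Unknown ORN identifier {name!r}; closest matches: {hint}")


for raw in ["DL3", "dl3", "ORN-DL3", "Glomerulus DL3", "Or7a", "Ir31a"]:
    print(f"{raw!r} -> {resolve_orn(raw)}")
```

Keeping a lower-case lookup table rather than simply upper-casing the input matters because glomerulus names such as VL2p and VA1v carry a meaningful lower-case suffix.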
Map DoOR receptor data to FlyWire neural connectivity and community labels.
- Parse 100K+ FlyWire community labels efficiently
- Map DoOR receptors to FlyWire root IDs
- Generate 3D spatial activation maps
- Export mappings in JSON/CSV formats
- DoORFlyWireIntegrator.get_connectivity_matrix_door_indexed() translates FlyWire glomerulus labels (e.g., ORN_DL5) into DoOR receptor names (Or7a) so tuning and connectivity matrices share the same index before statistical analysis.
- scripts/analysis_1_tuning_vs_connectivity.py now logs detailed overlap diagnostics and generates a diagnostic report if insufficient overlapping receptors are found, making namespace issues easy to detect.
from door_toolkit.flywire import FlyWireMapper
# Initialize mapper
mapper = FlyWireMapper(
community_labels_path="processed_labels.csv.gz",
door_cache_path="door_cache",
auto_parse=True
)
# Find cells expressing specific receptor
or42b_cells = mapper.find_receptor_cells("Or42b")
print(f"Found {len(or42b_cells)} Or42b neurons")
# Map all receptors
mappings = mapper.map_door_to_flywire()
print(f"Mapped {len(mappings)} receptors")
# Create spatial activation map
spatial_map = mapper.create_spatial_activation_map("ethyl butyrate")
print(f"Active at {spatial_map.total_cells} locations")
# Export mappings
mapper.export_mapping("flywire_mapping.json", format="json")

# Map receptors to FlyWire
door-flywire --labels processed_labels.csv.gz --cache door_cache --map-receptors
# Find specific receptor
door-flywire --labels processed_labels.csv.gz --find-receptor Or42b
# Create spatial map
door-flywire --labels processed_labels.csv.gz --cache door_cache \
--spatial-map "ethyl butyrate" --output spatial_map.json

NEW! Validate LASSO-identified receptors using complete FlyWire mushroom body pathways.
You've identified important receptors using LASSO regression on behavioral data. But do these receptors actually connect to the learning circuit?
This module answers: "Are my receptors anatomically positioned in the mushroom body (MB), and which should I test first?"
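LASSO Behavioral Prediction → FlyWire Pathway Tracing → Priority Matrix → Optogenetics

| Stage | Or67c example |
|---|---|
| LASSO Behavioral Prediction | Or67c (weight=0.126) |
| FlyWire Pathway Tracing | 23 ORNs → 6 PNs → 341 KCs, 56.7% γ lobe |
| Priority Matrix | Final Score: 0.920, Circuit: Aversive |
| Optogenetics | TEST FIRST! |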
✅ Complete Pathway Tracing
- Trace: ORN → PN → KC → MBON
- Synapse-level connectivity (5.3M connections)
- Cell type classification (137K neurons)
- Mushroom body compartments (α/β, γ, α'β' lobes)
✅ Circuit Validation Metrics
- ORN→PN Strength: % of ORN output reaching PNs (commitment to learning pathway)
- KC Coverage: % of Kenyon Cells contacted (breadth of MB access)
- Lobe Specialization: α/β (appetitive) vs γ (aversive) fraction
- Circuit Score: Composite 0-1 score for "in learning circuit"
✅ Integration with Behavioral Data
- Load LASSO regression results
- Combine behavioral importance + anatomical validation
- Generate experimental priority matrix
- Export publication-ready figures
✅ Sensillum Mapping
- Automatic mapping: ab2B → Or85a, ab3A → Or22a, ab1A → Or42b
- Translates sensillum labels to specific Or receptors
from door_toolkit.flywire import FlyWireMapper
from door_toolkit.flywire.mushroom_body_tracer import MushroomBodyTracer
# Step 1: Map receptors to FlyWire ORN neurons
mapper = FlyWireMapper("processed_labels.csv.gz", auto_parse=True)
or67c_cells = mapper.find_receptor_cells("Or67c")
print(f"Found {len(or67c_cells)} Or67c ORNs")
# Step 2: Initialize mushroom body tracer
tracer = MushroomBodyTracer(
synapse_path="connections_princeton.csv.gz",
cell_types_path="consolidated_cell_types.csv.gz"
)
# Step 3: Trace complete pathway (ORN → PN → KC → MBON)
pathway = tracer.trace_receptor_pathway(
receptor_name="Or67c",
orn_ids=[cell["root_id"] for cell in or67c_cells]
)
print(f"Pathway Summary:")
print(f" ORNs: {pathway.n_orns}")
print(f" PNs: {len(pathway.unique_pns)}")
print(f" KCs: {len(pathway.unique_kcs)}")
print(f" Synapses (ORN→PN): {pathway.total_orn_to_pn_synapses}")
print(f" Synapses (PN→KC): {pathway.total_pn_to_kc_synapses}")
print(f" KC compartments: {pathway.kc_compartments}")
# Step 4: Calculate connectivity metrics
metrics = tracer.calculate_connectivity_metrics(pathway)
print(f"\nConnectivity Metrics:")
print(f" ORN→PN strength: {metrics.orn_to_pn_strength:.2%}")
print(f" KC coverage: {metrics.kc_coverage:.2%}")
print(f" α/β lobe (appetitive): {metrics.alpha_beta_fraction:.2%}")
print(f" γ lobe (aversive): {metrics.gamma_fraction:.2%}")
print(f" Circuit score: {metrics.circuit_score:.3f}")
print(f" Circuit type: {metrics.to_dict()['circuit_type']}")
# Step 5: Export results
tracer.export_pathway_csv([pathway], "pathway_summary.csv")
tracer.export_metrics_csv([metrics], "connectivity_metrics.csv")

Run the complete workflow from LASSO results to experimental priorities:
# Full pipeline: examples/advanced/flywire_mb_pathway_analysis.py
python examples/advanced/flywire_mb_pathway_analysis.py

Output:
Top 3 High-Priority Receptors:
1. Or67c - Final Score: 0.920 (AVERSIVE, γ lobe) → TEST FIRST ⭐⭐⭐
2. Or22b - Final Score: 0.686 (APPETITIVE, α/β) → TEST SECOND ⭐⭐
3. Or85a - Final Score: 0.658 (APPETITIVE, α/β) → TEST SECOND ⭐⭐
Files generated:
✓ final_priority_matrix.csv - Ranked receptors with all metrics
✓ flywire_pathway_summaries.csv - ORN→PN→KC pathway stats
✓ flywire_connectivity_metrics.csv - Circuit validation scores
✓ priority_scatter.png - LASSO vs Connectivity plot
✓ priority_bar.png - Priority ranking visualization
Or67c (Top Candidate):
LASSO Weight: 0.126 (HIGHEST)
Pathway: 23 ORNs → 6 PNs → 341 KCs
Circuit: 56.7% γ lobe (AVERSIVE learning)
Final Score: 0.920
Recommendation: TEST FIRST - Silencing will impair learned aversive responses
Or85a (ab2B sensillum):
LASSO Weight: 0.067 (3rd highest)
Pathway: 42 ORNs → 5 PNs → 391 KCs
Circuit: 55.6% α/β lobe (APPETITIVE learning)
ORN→PN Strength: 84.2% (HIGHEST commitment!)
Final Score: 0.658
Recommendation: TEST SECOND - Strong appetitive circuit
Circuit Types:
- Appetitive (α/β lobe): Reward/feeding learning (Or22b, Or85a, Or42b)
- Aversive (γ lobe): Avoidance/punishment learning (Or67c, Or49a)
Connectivity Metrics:
- High ORN→PN strength (>70%): Strong commitment to learning pathway
- High KC coverage (>20%): Broad access to memory encoding
- Lobe specialization (>50%): Clear circuit type assignment
- Circuit score (>0.80): High confidence in MB circuit membership
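The thresholds above can be applied with a few lines of plain Python. The sketch below is an assumption-laden illustration rather than the tracer's code: the helper names are hypothetical, the 2,000-cell total mirrors the `total_kcs_in_brain=2000` argument shown in the API reference, and the per-lobe counts are invented.

```python
def kc_coverage(n_kcs, total_kcs=2000):
    """Fraction of Kenyon cells contacted; total_kcs is an assumed brain-wide count."""
    return n_kcs / total_kcs


def lobe_fractions(kc_compartments):
    """Convert per-lobe KC counts into fractions of the contacted KCs."""
    total = sum(kc_compartments.values())
    if total == 0:
        return {}
    return {lobe: n / total for lobe, n in kc_compartments.items()}


def circuit_type(fractions, threshold=0.5):
    """Apply the >50% lobe-specialization rule from the list above."""
    if fractions.get("γ", 0.0) > threshold:
        return "aversive"
    if fractions.get("α/β", 0.0) > threshold:
        return "appetitive"
    return "mixed"


# Hypothetical per-lobe counts for one receptor (not FlyWire output).
compartments = {"α/β": 90, "γ": 180, "α'β'": 30}
fractions = lobe_fractions(compartments)

print(f"KC coverage: {kc_coverage(sum(compartments.values())):.1%}")
print(f"γ fraction: {fractions['γ']:.1%} -> {circuit_type(fractions)}")
```

With the same assumed total, the 341 KCs reported for Or67c would correspond to a coverage of about 17%, so a receptor can show clear lobe specialization while sitting below the 20% coverage guideline.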
from door_toolkit.pathways import LassoBehavioralPredictor
# Step 1: Run LASSO to identify important receptors
predictor = LassoBehavioralPredictor(
doorcache_path="door_cache",
behavior_csv_path="reaction_rates_summary.csv"
)
# Fit models for different optogenetic conditions
results_hex = predictor.fit_behavior("opto_hex")
results_eb = predictor.fit_behavior("opto_EB")
results_benz = predictor.fit_behavior("opto_benz_1")
print(f"Or22b LASSO weight (hexanol): {results_hex.lasso_weights.get('Or22b', 0):.4f}")
print(f"Or67c LASSO weight (EB): {results_eb.lasso_weights.get('Or67c', 0):.4f}")
print(f"Or85a LASSO weight (benz): {results_benz.lasso_weights.get('Or85a', 0):.4f}")
# Step 2: Validate with FlyWire (see above)
# ...
# Step 3: Generate final priority matrix
# Combines: 60% behavioral importance + 40% circuit connectivity

# Run complete mushroom body analysis
python examples/advanced/flywire_mb_pathway_analysis.py
# Output: flywire_mb_analysis/
# ├── final_priority_matrix.csv # Experimental priorities
# ├── flywire_pathway_summaries.csv # Pathway statistics
# ├── flywire_connectivity_metrics.csv # Circuit validation
# ├── priority_scatter.png # Visualization
# ├── priority_bar.png # Rankings
# └── UPDATED_SUMMARY.md # Complete report

Research Question: "Which receptors are critical for learned olfactory behavior?"
Workflow:
- ✅ LASSO identifies Or67c, Or22b, Or85a as important (sparse circuit)
- ✅ FlyWire validates all 3 reach mushroom body via PN→KC pathways
- ✅ Circuit analysis reveals:
  - Or67c: 56.7% γ lobe → aversive learning
  - Or22b: 69.5% α/β lobe → appetitive learning
  - Or85a: 55.6% α/β lobe → appetitive learning
- ✅ Priority matrix ranks Or67c #1 (score: 0.920)
- ✅ Optogenetic validation confirms Or67c silencing impairs learning
Result: Anatomically validated, prioritized receptor list for experiments! 🎯
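The priority matrix is described as combining 60% behavioral importance with 40% circuit connectivity. A minimal sketch of such a blend is shown below. How the toolkit normalizes LASSO weights is not documented here, so dividing by the largest absolute weight is an assumption, `priority_scores` is a hypothetical helper, and the Or22b weight and all circuit scores are placeholders. The output is not expected to reproduce the scores reported above.

```python
def priority_scores(lasso_weights, circuit_scores, w_behavior=0.6, w_circuit=0.4):
    """Rank receptors by a weighted blend of behavioral and anatomical evidence."""
    # Assumed normalization: scale absolute LASSO weights into [0, 1].
    max_abs = max((abs(w) for w in lasso_weights.values()), default=0.0) or 1.0
    scores = {}
    for receptor, weight in lasso_weights.items():
        behavior = abs(weight) / max_abs
        circuit = circuit_scores.get(receptor, 0.0)  # missing anatomy counts as zero
        scores[receptor] = w_behavior * behavior + w_circuit * circuit
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Or67c and Or85a weights are quoted from the text; everything else is a placeholder.
lasso = {"Or67c": 0.126, "Or22b": 0.080, "Or85a": 0.067}
circuit = {"Or67c": 0.70, "Or22b": 0.65, "Or85a": 0.60}

ranked = priority_scores(lasso, circuit)
for rank, (receptor, score) in enumerate(ranked, start=1):
    print(f"{rank}. {receptor}: {score:.3f}")
```

Because the behavioral term carries the larger weight, a receptor with strong anatomy but a small LASSO weight cannot outrank the top behavioral candidate unless the gap in circuit scores is large.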
Quantitative analysis of olfactory pathways and experiment protocol generation.
- Trace known pathways (Or47b→feeding, Or42b, Or92a→avoidance)
- Custom pathway analysis
- Shapley importance computation
- PGCN experiment protocol generation
- Behavioral prediction
from door_toolkit.pathways import PathwayAnalyzer, BlockingExperimentGenerator, BehavioralPredictor
# Pathway analysis
analyzer = PathwayAnalyzer("door_cache")
# Trace Or47b feeding pathway
pathway = analyzer.trace_or47b_feeding_pathway()
print(f"Pathway strength: {pathway.strength:.3f}")
print(f"Top receptors: {pathway.get_top_receptors(5)}")
# Custom pathway
custom = analyzer.trace_custom_pathway(
receptors=["Or92a"],
odorants=["geosmin"],
behavior="avoidance"
)
# Shapley importance
importance = analyzer.compute_shapley_importance("feeding")
top_receptors = sorted(importance.items(), key=lambda x: -x[1])[:10]
# Generate experiment protocol
generator = BlockingExperimentGenerator("door_cache")
protocol = generator.generate_experiment_1_protocol() # Single-unit veto
protocol.export_json("experiment_protocol.json")
# Behavioral prediction (heuristic)
predictor = BehavioralPredictor("door_cache")
prediction = predictor.predict_behavior("hexanol")
print(f"Valence: {prediction.predicted_valence}")
print(f"Confidence: {prediction.confidence:.2%}")
# LASSO behavioral prediction (data-driven)
from door_toolkit.pathways import LassoBehavioralPredictor
lasso_predictor = LassoBehavioralPredictor(
doorcache_path="door_cache",
behavior_csv_path="reaction_rates_summary.csv"
)
# Fit model for optogenetic condition
results = lasso_predictor.fit_behavior("opto_hex")
print(f"R² = {results.cv_r2_score:.3f}")
print(f"Selected {results.n_receptors_selected} receptors")
# Get top predictive receptors
for receptor, weight in results.get_top_receptors(5):
    print(f" {receptor}: {weight:.4f}")
# Generate plots
results.plot_predictions(save_to="opto_hex_predictions.png")
results.plot_receptors(save_to="opto_hex_receptors.png")
# Export results
results.export_csv("opto_hex_results.csv")
results.export_json("opto_hex_model.json")
# Compare multiple conditions
comparison = lasso_predictor.compare_conditions(
conditions=["opto_hex", "opto_EB", "opto_benz_1"],
plot=True,
save_dir="comparison_results"
)

The LassoBehavioralPredictor uses sparse regression (LASSO) to identify minimal receptor circuits that predict behavioral responses from optogenetic manipulation experiments.
Features:
- Automatic odorant name matching between behavioral data and DoOR
- Cross-validated LASSO regression with automatic λ selection
- Sparse receptor circuit identification (typically 3-10 receptors)
- Multiple prediction modes: test odorant, trained odorant, or interaction features
- Visualization: predicted vs actual PER, receptor importance rankings
- Export to CSV/JSON for downstream analysis
Workflow:
- Load optogenetic behavioral data (PER responses)
- Match odorant names to DoOR receptor profiles
- Fit LASSO models with cross-validation
- Extract sparse receptor weights
- Visualize and export results
Example dataset format (reaction_rates_summary.csv):
dataset,3-Octonol,Benzaldehyde,Ethyl_Butyrate,Hexanol,Linalool
opto_hex,0.25,0.00,0.19,0.69,0.19
opto_EB,0.13,0.00,0.22,0.20,0.00
opto_benz_1,0.25,0.02,0.44,0.59,0.12
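For readers preparing their own behavioral file, the wide layout above (one row per optogenetic condition, one column per test odorant) can be read with the standard `csv` module. This is only a sketch of the format, not how LassoBehavioralPredictor loads data; `load_reaction_rates` is a hypothetical helper and the rows are inlined so the snippet runs without a file.

```python
import csv
import io

# The three example rows from above, inlined for a self-contained demo.
CSV_TEXT = """dataset,3-Octonol,Benzaldehyde,Ethyl_Butyrate,Hexanol,Linalool
opto_hex,0.25,0.00,0.19,0.69,0.19
opto_EB,0.13,0.00,0.22,0.20,0.00
opto_benz_1,0.25,0.02,0.44,0.59,0.12
"""


def load_reaction_rates(handle):
    """Return {condition: {odorant: response rate}} from the wide CSV layout."""
    table = {}
    for row in csv.DictReader(handle):
        condition = row.pop("dataset")
        table[condition] = {odor: float(rate) for odor, rate in row.items()}
    return table


rates = load_reaction_rates(io.StringIO(CSV_TEXT))
strongest = max(rates["opto_hex"], key=rates["opto_hex"].get)
print(f"opto_hex responds most to {strongest} ({rates['opto_hex'][strongest]:.2f})")
```

To read a real file, pass an open file handle (for example `open("reaction_rates_summary.csv", newline="")`) instead of the `StringIO` object.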
Biological Interpretation:
- Positive weights → receptors associated with higher PER
- Negative weights → receptors associated with lower PER (potential inhibition)
- Zero weights → receptors excluded by LASSO (not predictive)
- Sparse circuits (3-7 receptors) suggest minimal testable hypotheses
# Trace pathways
door-pathways --cache door_cache --trace or47b-feeding
# Custom pathway
door-pathways --cache door_cache --custom-pathway \
--receptors Or92a --odorants geosmin --behavior avoidance
# Shapley importance
door-pathways --cache door_cache --shapley feeding --output importance.json
# Generate experiment
door-pathways --cache door_cache --generate-experiment 1 \
--output exp1_protocol.json --format markdown
# Predict behavior
door-pathways --cache door_cache --predict-behavior "ethyl butyrate"

Prepare DoOR data for neural network training with sparse encoding and augmentation.
- Sparse KC-like encoding (5% sparsity)
- Hill equation concentration-response modeling
- Noise augmentation (Gaussian, Poisson, dropout)
- PyTorch/NumPy/HDF5 export
- PGCN-compatible dataset generation
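For orientation, the concentration-response modeling mentioned above rests on the Hill equation, r(c) = r_max · cⁿ / (EC50ⁿ + cⁿ). The sketch below implements that textbook form in plain Python; it is not the toolkit's ConcentrationResponseModel, which may use a different parameterization (for example a baseline term), and `hill_response` is a hypothetical name.

```python
def hill_response(concentration, ec50, hill_coefficient, r_max=1.0):
    """Textbook Hill equation: r_max * c^n / (ec50^n + c^n)."""
    if concentration < 0 or ec50 <= 0:
        raise ValueError("concentration must be >= 0 and ec50 must be > 0")
    c_n = concentration ** hill_coefficient
    return r_max * c_n / (ec50 ** hill_coefficient + c_n)


# Responses across a dilution series for an illustrative EC50 of 0.05.
for c in (0.001, 0.01, 0.1, 1.0):
    r = hill_response(c, ec50=0.05, hill_coefficient=1.0)
    print(f"c={c:<6} response={r:.3f}")
```

By construction the response equals half of `r_max` when the concentration equals EC50, and a larger Hill coefficient makes the transition around that point steeper.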
from door_toolkit.neural import DoORNeuralPreprocessor
# Initialize preprocessor
preprocessor = DoORNeuralPreprocessor(
"door_cache",
n_kc_neurons=2000,
random_seed=42
)
# Create sparse encoding
sparse_data = preprocessor.create_sparse_encoding(sparsity_level=0.05)
print(f"Shape: {sparse_data.shape}")
print(f"Sparsity: {(sparse_data > 0).mean():.2%}")
# Generate augmented dataset
aug_orn, aug_kc, labels = preprocessor.generate_noise_augmented_responses(
n_augmentations=5,
noise_level=0.1
)
# Export PGCN dataset
preprocessor.export_pgcn_dataset(
output_dir="pgcn_dataset",
format="pytorch", # or "numpy", "h5"
include_sparse=True
)
# Train/val split
train, val = preprocessor.create_training_validation_split(train_fraction=0.8)

import numpy as np
from door_toolkit.neural.concentration_models import ConcentrationResponseModel
model = ConcentrationResponseModel()
# Fit Hill equation
concentrations = np.array([0.001, 0.01, 0.1, 1.0])
responses = np.array([0.1, 0.3, 0.7, 0.9])
params = model.fit_hill_equation(concentrations, responses)
print(f"EC50: {params.ec50:.3f}")
print(f"Hill coefficient: {params.hill_coefficient:.3f}")
# Generate concentration series
conc, resp = model.generate_concentration_series(params, n_points=50)
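# Model odor mixtures
# params1 and params2 stand for Hill fits of the individual mixture components,
# each obtained from fit_hill_equation() as shown above
mixture_responses = model.model_mixture_interactions(
    [params1, params2],
    concentrations,
    interaction_type="additive"
)

# Sparse encoding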
door-neural --cache door_cache --sparse-encode --sparsity 0.05 \
--output sparse_data.npy
# Augment dataset
door-neural --cache door_cache --augment --n-augmentations 5 \
--output-dir augmented_data/
# Export PGCN dataset
door-neural --cache door_cache --export-pgcn \
--output-dir pgcn_dataset/ --format pytorch
# Dataset statistics
door-neural --cache door_cache --stats

# Extract DoOR data
door-extract --input DoOR.data/data --output door_cache
# Validate cache contents
door-extract --validate door_cache
# List odorants (optional substring filter)
door-extract --list-odorants door_cache --pattern acetate
# Encode an odorant and show receptor responses
door-extract --cache door_cache --odor "ethyl butyrate" --coverage
# Compare multiple odorants
door-extract --cache door_cache --odors "ethyl butyrate" "acetic acid" \
--top 15 --coverage --save reports/odor-comparison
# Inspect receptor response profiles
door-extract --cache door_cache --receptor Or42b --top 25

# FlyWire integration
door-flywire --labels processed_labels.csv.gz --cache door_cache --map-receptors
# Pathway analysis
door-pathways --cache door_cache --trace or47b-feeding
# Neural preprocessing
door-neural --cache door_cache --sparse-encode --sparsity 0.05 --output sparse_data.npy

Add --debug to any command for detailed tracebacks and logging.
Receptor group shortcuts:
- or: Odorant receptors (OrXX)
- ir: Ionotropic receptors (IrXX)
- gr: Gustatory receptors (GrXX)
- neuron: Antennal/palp neuron classes (ab*, ac*, pb*)
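As a rough illustration of how these shortcuts partition receptor names, the following sketch classifies a name by its prefix. The regular expressions are inferred from the descriptions above and are not taken from the CLI source; `receptor_group` is a hypothetical helper.

```python
import re

# Patterns inferred from the shortcut descriptions; treat them as assumptions.
GROUP_PATTERNS = {
    "or": re.compile(r"^Or\d+[a-z]?$"),
    "ir": re.compile(r"^Ir\d+[a-z]?$"),
    "gr": re.compile(r"^Gr\d+[a-z]?$"),
    "neuron": re.compile(r"^(ab|ac|pb)\d+[A-Z]?$"),
}


def receptor_group(name):
    """Return the shortcut group for a receptor or neuron name, or None."""
    for group, pattern in GROUP_PATTERNS.items():
        if pattern.match(name):
            return group
    return None


for name in ["Or42b", "Ir31a", "Gr21a", "ab2B", "unknown"]:
    print(f"{name}: {receptor_group(name)}")
```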
Extract DoOR R data files to Python formats.
from door_toolkit import DoORExtractor
extractor = DoORExtractor(input_dir, output_dir)
extractor.run()
extractor.extract_response_matrix()
extractor.extract_odor_metadata()

Encode odorant names to neural activation patterns.
from door_toolkit import DoOREncoder
encoder = DoOREncoder(cache_path, use_torch=False)
encoder.encode(odor_name)
encoder.batch_encode(odor_names)
encoder.list_available_odorants(pattern)
encoder.get_receptor_coverage(odor_name)
encoder.get_odor_metadata(odor_name)

Main class for connectomics network analysis.
from door_toolkit.connectomics import CrossTalkNetwork
network = CrossTalkNetwork.from_csv(filepath, config=None)
network.set_min_synapse_threshold(threshold)
network.get_pathways_from_orn(orn_identifier, by_glomerulus=False)
network.get_pathways_between_orns(source, target, by_glomerulus=False)
network.find_shortest_paths(source, target, max_paths=10)
network.get_hub_neurons(neuron_category=None, top_n=10)
network.get_network_statistics()
network.export_to_graphml(filepath)
network.export_to_gexf(filepath)

Statistical analysis of connectomics networks.
from door_toolkit.connectomics.statistics import NetworkStatistics
stats = NetworkStatistics(network)
stats.detect_hub_neurons(method='degree', threshold_percentile=90.0)
stats.detect_communities(algorithm='louvain', level='glomerulus')
stats.calculate_asymmetry_matrix()
stats.analyze_path_lengths(source_glomerulus=None)
stats.generate_full_report()

from door_toolkit.connectomics.pathway_analysis import (
analyze_single_orn,
compare_orn_pair,
find_pathways
)
# Mode 1: Single ORN
results = analyze_single_orn(network, orn_identifier, by_glomerulus=True)
# Mode 2: ORN pair comparison
comparison = compare_orn_pair(network, orn1, orn2, by_glomerulus=True)
# Mode 4: Pathway search
pathways = find_pathways(network, source, target, by_glomerulus=False)

from door_toolkit.connectomics.visualization import NetworkVisualizer
visualizer = NetworkVisualizer(network)
visualizer.plot_full_network(output_path='network.png', **kwargs)
visualizer.plot_single_orn_pathways(orn_identifier, output_path='pathways.png')
visualizer.plot_glomerulus_heatmap(output_path='heatmap.png')

NEW! Trace complete pathways through mushroom body learning circuits.
from door_toolkit.flywire.mushroom_body_tracer import MushroomBodyTracer
# Initialize tracer
tracer = MushroomBodyTracer(
synapse_path="connections_princeton.csv.gz",
cell_types_path="consolidated_cell_types.csv.gz",
min_synapse_threshold=1
)
# Trace pathway: ORN → PN → KC → MBON
pathway = tracer.trace_receptor_pathway(receptor_name, orn_ids)
# Calculate connectivity metrics
metrics = tracer.calculate_connectivity_metrics(pathway, total_kcs_in_brain=2000)
# Export results
tracer.export_pathway_csv([pathway], "pathway_summary.csv")
tracer.export_metrics_csv([metrics], "connectivity_metrics.csv")

Key Classes:
- PathwayStep: Single synapse connection
- MushroomBodyPathway: Complete ORN→PN→KC pathway
- ConnectivityMetrics: Circuit validation scores

Attributes:
- pathway.n_orns: Number of ORN neurons
- pathway.n_pns: Number of PN neurons contacted
- pathway.n_kcs: Number of KC neurons contacted
- pathway.kc_compartments: Dict of KC counts by lobe (α/β, γ, α'β')
- metrics.orn_to_pn_strength: ORN→PN pathway strength (0-1)
- metrics.kc_coverage: Fraction of KCs contacted (0-1)
- metrics.alpha_beta_fraction: Fraction in appetitive lobe (0-1)
- metrics.circuit_score: Overall connectivity score (0-1)
IMPORTANT: Prevents confusion between receptor counts and unique glomerulus counts in many-to-one mappings.
from door_toolkit.integration.mapping_accounting import (
compute_mapping_stats,
format_mapping_summary,
log_mapping_stats,
write_mapping_stats_json
)
# Compute comprehensive mapping statistics
mapping = {'OR82A': 'VA6', 'OR94A': 'VA6', 'OR7A': 'DL5'} # Example with collision
stats = compute_mapping_stats(
mapping,
note="Example mapping",
adult_only=False # Include larval receptors
)
# Get compact summary
summary = format_mapping_summary(stats)
# "3 receptors → 2 unique glomeruli (1 collision)"
# Check for many-to-one collapses
if stats['collision_count'] > 0:
    print(f"Collisions: {stats['collision_summary']}")
    # ['VA6: OR82A, OR94A']
# Write JSON artifact for reproducibility
write_mapping_stats_json("mapping_stats.json", stats)

Key Stats Returned:
- n_receptors_mapped: Number of receptor genes successfully mapped
- n_unique_glomeruli_from_mapped_receptors: Number of distinct glomeruli (may differ!)
- collision_count: Number of glomeruli with ≥2 receptors (many-to-one)
- collisions: Dict of glomerulus → [receptor list] for collisions
- collision_summary: Human-readable collision descriptions
📚 See: docs/RECEPTOR_GLOMERULUS_MAPPING_ACCOUNTING.md for complete documentation on preventing receptor vs glomerulus count confusion.
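The distinction between receptor counts and glomerulus counts is easy to see in a few lines of plain Python. The sketch below is not the toolkit's compute_mapping_stats; `mapping_collisions` and its shortened result keys are hypothetical, and the input is the collision example used above.

```python
from collections import defaultdict


def mapping_collisions(mapping):
    """Group receptors by glomerulus and report many-to-one collapses."""
    by_glomerulus = defaultdict(list)
    for receptor, glomerulus in mapping.items():
        by_glomerulus[glomerulus].append(receptor)
    # A collision is any glomerulus that receives two or more receptors.
    collisions = {g: sorted(r) for g, r in by_glomerulus.items() if len(r) >= 2}
    return {
        "n_receptors": len(mapping),
        "n_unique_glomeruli": len(by_glomerulus),
        "collisions": collisions,
    }


result = mapping_collisions({"OR82A": "VA6", "OR94A": "VA6", "OR7A": "DL5"})
print(
    f"{result['n_receptors']} receptors -> "
    f"{result['n_unique_glomeruli']} unique glomeruli "
    f"({len(result['collisions'])} collision)"
)
```

Reporting both numbers side by side avoids quoting a receptor count where a glomerulus count is meant once the mapping is used to index a connectivity matrix.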
Complete working examples are available in the examples/ directory:
- examples/basic/encode_odorants.py - Encode odorants to PN activations
- examples/basic/search_odorants.py - Search and filter odorants
- examples/basic/receptor_analysis.py - Analyze receptor responses

- examples/connectomics/example_1_single_orn_analysis.py - Mode 1: Single ORN focus
- examples/connectomics/example_2_orn_pair_comparison.py - Mode 2: ORN pair comparison
- examples/connectomics/example_3_full_network_analysis.py - Mode 3: Full network view
- examples/connectomics/example_4_pathway_search.py - Mode 4: Pathway search
- examples/connectomics/example_orn_identifier_resolution.py - Robust identifier resolution demo
- examples/connectomics/analyze_data_characteristics.py - Data quality analysis

- examples/advanced/flywire_integration_example.py - FlyWire mapping
- examples/advanced/flywire_mb_pathway_analysis.py - NEW! Mushroom body circuit validation
- examples/advanced/pathway_analysis_example.py - Pathway tracing
- examples/advanced/neural_preprocessing_example.py - Neural network prep
- examples/lasso_behavioral_prediction_demo.py - LASSO regression for behavioral prediction
# Extract DoOR data first
door-extract --input DoOR.data/data --output door_cache
# Run examples
python examples/basic/encode_odorants.py
python examples/connectomics/example_1_single_orn_analysis.py
python examples/advanced/flywire_integration_example.py
# NEW: Mushroom body circuit validation
python examples/advanced/flywire_mb_pathway_analysis.py

From LASSO to Optogenetics:
# 1. Run LASSO behavioral prediction
python examples/lasso_behavioral_prediction_demo.py
# 2. Validate receptors with FlyWire mushroom body analysis
python examples/advanced/flywire_mb_pathway_analysis.py
# Output:
# behavioral_prediction_results/
# ├── opto_hex_results.csv # LASSO identified receptors
# └── opto_hex_predictions.png
#
# flywire_mb_analysis/
# ├── final_priority_matrix.csv # Experimental priorities
# ├── priority_scatter.png
# └── UPDATED_SUMMARY.md # Complete analysis report
# 3. Use priority matrix to design optogenetic experiments!

- Python ≥ 3.8
- pandas ≥ 1.5.0
- numpy ≥ 1.21.0
- pyarrow ≥ 12.0.0
- networkx ≥ 2.8
- matplotlib ≥ 3.5.0
- scipy ≥ 1.9.0
- pyreadr ≥ 0.4.7 - Required for DoORExtractor
- torch ≥ 2.0.0 - For PyTorch integration
- seaborn ≥ 0.11.0 - For heatmaps
- python-louvain ≥ 0.16 - For Louvain community detection
- plotly ≥ 5.11.0 - For interactive visualizations
- h5py ≥ 3.7.0 - For HDF5 export
# Clone repository
git clone https://github.com/colehanan1/door-python-toolkit.git
cd door-python-toolkit
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install development dependencies
make install-dev
# Extract DoOR data
make extract INPUT=path/to/DoOR.data/data OUTPUT=door_cache
# Run tests
make test
# Lint and format
make lint
make format

This toolkit extracts data from the original DoOR R packages:
- DoOR.data - https://github.com/ropensci/DoOR.data
- DoOR.functions - https://github.com/ropensci/DoOR.functions
Download DoOR data:
wget https://github.com/ropensci/DoOR.data/archive/refs/tags/v2.0.0.zip
unzip v2.0.0.zip
door-extract --input DoOR.data-2.0.0/data --output door_cache
FlyWire connectome data is available from:
- FlyWire - https://flywire.ai/
- Community labels - Available through CAVE API
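On systems without `wget` or `unzip`, the DoOR download step above can be done from Python instead. This is a minimal sketch using only the standard library. The URL is the one from the `wget` command, while `download_file` and `unpack` are hypothetical helper names, not part of the toolkit.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the wget command above.
DOOR_ZIP_URL = "https://github.com/ropensci/DoOR.data/archive/refs/tags/v2.0.0.zip"


def download_file(url, dest):
    """Download url to dest, skipping the download if dest already exists."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, dest)
    return dest


def unpack(zip_path, target_dir):
    """Extract zip_path into target_dir and return the archive member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

Calling `download_file(DOOR_ZIP_URL, "v2.0.0.zip")` followed by `unpack("v2.0.0.zip", ".")` mirrors the `wget` and `unzip` commands. The `door-extract` step is then run exactly as shown above.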
- DoOR extraction: Full dataset in <10 seconds
- FlyWire parsing: 100K+ labels in <30 seconds
- Network construction: 108,980 pathways loaded in <5 seconds
- Receptor mapping: >80% success rate
- Sparse encoding: Maintains 5±1% sparsity
- Memory usage: <2GB for largest datasets
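The sparsity figure above can be checked on your own encodings. The sketch below assumes that "sparsity" here means the fraction of non-zero entries in an encoded vector, which is an interpretation rather than something this section states. `active_fraction` and `within_target` are hypothetical helpers, and the example vector is made up.

```python
def active_fraction(vector):
    """Fraction of entries in vector that are non-zero."""
    values = list(vector)
    if not values:
        raise ValueError("vector is empty")
    return sum(1 for v in values if v != 0) / len(values)


def within_target(vector, target=0.05, tolerance=0.01):
    """True if the active fraction lies within target +/- tolerance."""
    return abs(active_fraction(vector) - target) <= tolerance


# Made-up example: 5 active units out of 100.
encoded = [0.0] * 95 + [0.8, 0.3, 0.5, 0.9, 0.1]
print(active_fraction(encoded))
print(within_target(encoded))
```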
Run the comprehensive test suite:
# Install dev dependencies
pip install -e ".[dev]"  # quotes keep shells such as zsh from expanding the brackets
# Run tests
pytest tests/ -v
# With coverage
pytest tests/ --cov=door_toolkit --cov-report=html
# Specific test modules
pytest tests/test_connectomics.py -v
pytest tests/test_encoder.py -v
- Couto, A., et al. (2005) "Molecular, Anatomical, and Functional Organization of the Drosophila Olfactory System." Current Biology 15(17): 1535-1547. DOI: 10.1016/j.cub.2005.07.034
- Hallem, E. A. & Carlson, J. R. (2006) "Coding of Odors by a Receptor Repertoire." Cell 125(1): 143-160. DOI: 10.1016/j.cell.2006.01.050
- Silbering, A. F., et al. (2011) "Complementary Function and Integrated Wiring of the Evolutionarily Distinct Drosophila Olfactory Subsystems." Journal of Neuroscience 31(38): 13357-13375. DOI: 10.1523/JNEUROSCI.2360-11.2011
- Fishilevich, E. & Vosshall, L. B. (2005) "Genetic and Functional Subdivision of the Drosophila Antennal Lobe." Current Biology 15(17): 1548-1553. DOI: 10.1016/j.cub.2005.07.066
- Benton, R., et al. (2009) "Variant Ionotropic Glutamate Receptors as Chemosensory Receptors in Drosophila." Cell 136(1): 149-162. DOI: 10.1016/j.cell.2008.12.001
If you use this toolkit in your research, please cite:
@software{door_python_toolkit,
author = {Hanan, Cole and Contributors},
title = {DoOR Python Toolkit: Comprehensive Tools for Drosophila Olfactory Research},
year = {2025},
version = {1.0.0},
url = {https://github.com/colehanan1/door-python-toolkit},
note = {Production-ready toolkit with mushroom body circuit validation and LASSO behavioral prediction}
}
@article{muench2016door,
title={DoOR 2.0--Comprehensive Mapping of Drosophila melanogaster Odorant Responses},
author={M{\"u}nch, Daniel and Galizia, C Giovanni},
journal={Scientific Reports},
volume={6},
number={1},
pages={21841},
year={2016},
publisher={Nature Publishing Group}
}
@article{flywire2024,
title={FlyWire: online community for whole-brain connectomics},
author={FlyWire Consortium and Others},
journal={Nature},
year={2024}
}
- Wilson & Laurent (2005). Role of GABAergic inhibition in shaping odor-evoked spatiotemporal patterns in the Drosophila antennal lobe. Journal of Neuroscience.
- Olsen & Wilson (2008). Lateral presynaptic inhibition mediates gain control in olfactory glomeruli. Nature.
- Kazama & Wilson (2009). Origins of correlated activity in an olfactory circuit. Nature Neuroscience.
Contributions welcome! Please:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open a Pull Request
Development setup:
git clone https://github.com/colehanan1/door-python-toolkit.git
cd door-python-toolkit
python -m venv .venv
source .venv/bin/activate
make install-dev
make test
Code Style:
- Follow PEP 8
- Use Black for formatting (make format)
- Add type hints
- Write docstrings for public APIs
- Add tests for new features
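As an illustration of the style points above, here is a small function with type hints, a docstring, and an accompanying test. `scale_to_unit_max` is a hypothetical example written for this purpose. It is not an existing toolkit function, and the docstring layout shown is one common convention rather than a project requirement.

```python
from typing import List, Sequence


def scale_to_unit_max(responses: Sequence[float]) -> List[float]:
    """Scale responses so that the largest absolute value is 1.0.

    Args:
        responses: Raw response values for one odorant.

    Returns:
        A new list with each value divided by the maximum absolute value.
        An all-zero input is returned unchanged (as a list).

    Raises:
        ValueError: If responses is empty.
    """
    if len(responses) == 0:
        raise ValueError("responses must not be empty")
    peak = max(abs(v) for v in responses)
    if peak == 0:
        return list(responses)
    return [v / peak for v in responses]


def test_scale_to_unit_max() -> None:
    """pytest-style test covering the normal and all-zero cases."""
    assert scale_to_unit_max([0.5, -2.0, 1.0]) == [0.25, -1.0, 0.5]
    assert scale_to_unit_max([0.0, 0.0]) == [0.0, 0.0]
```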
"Odorant not found"
→ Use encoder.list_available_odorants() to see exact names (case-insensitive)
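When a name is only slightly off, a fuzzy lookup over the available names can point to the intended odorant. The sketch below uses the standard library's `difflib`. `suggest_odorants` is a hypothetical helper, the name list is illustrative only, and it assumes that `encoder.list_available_odorants()` returns a list of strings.

```python
import difflib


def suggest_odorants(query, available_names, n=3, cutoff=0.6):
    """Return up to n names from available_names that resemble query.

    Names are compared in lower case so that case differences do not
    matter; the original spelling is returned.
    """
    by_lower = {name.lower(): name for name in available_names}
    matches = difflib.get_close_matches(
        query.lower(), list(by_lower), n=n, cutoff=cutoff
    )
    return [by_lower[m] for m in matches]


# Illustrative names only; in practice pass encoder.list_available_odorants().
names = ["ethyl acetate", "ethyl butyrate", "acetic acid", "1-hexanol"]
print(suggest_odorants("Ethyl acetat", names))
```

The best match is listed first, so the first element is usually the name to retry with.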
"Cache not found"
→ Run DoORExtractor first to extract R data files
"High sparsity"
→ Normal for DoOR (86%). Use fillna(0.0) or filter to well-covered receptors
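Both strategies can be sketched in plain Python on a toy table. The response values are made up and the helper names are hypothetical. With a pandas DataFrame, the rough equivalents would be `df.fillna(0.0)` for filling and `df.notna().mean()` for per-column coverage.

```python
def receptor_coverage(responses):
    """Fraction of measured (non-None) values per receptor."""
    return {
        receptor: sum(v is not None for v in values) / len(values)
        for receptor, values in responses.items()
    }


def keep_well_covered(responses, min_coverage=0.5):
    """Drop receptors whose coverage is below min_coverage."""
    coverage = receptor_coverage(responses)
    return {r: v for r, v in responses.items() if coverage[r] >= min_coverage}


def fill_missing(responses, fill_value=0.0):
    """Replace missing values with fill_value."""
    return {
        r: [fill_value if v is None else v for v in values]
        for r, values in responses.items()
    }


# Toy data (made-up numbers): one list of odorant responses per receptor.
toy = {
    "Or42b": [0.9, None, 0.4, 0.1],
    "Or47b": [None, None, None, 0.2],
    "Or92a": [0.3, 0.5, None, None],
}
dense = fill_missing(keep_well_covered(toy, min_coverage=0.5))
print(dense)
```

Filtering before filling avoids treating a receptor that was rarely measured as if it were a receptor that rarely responds.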
PyTorch not available
→ Install with pip install door-python-toolkit[torch]
FileNotFoundError: interglomerular_crosstalk_pathways.csv
→ Ensure data files are in correct location or provide full path
MemoryError when loading large files
→ Increase synapse threshold to reduce network size:
network.set_min_synapse_threshold(20) # Only strong connections
Visualization is cluttered
→ Filter by synapse strength:
visualizer.plot_full_network(min_synapse_display=50, show_individual_neurons=False)
Community detection fails
→ Install python-louvain: pip install python-louvain
Heatmap not showing
→ Install seaborn: pip install seaborn
Qt/matplotlib crash
→ The module uses the non-interactive 'Agg' backend by default. If issues persist, check your matplotlib configuration.
- DoOR database creators: Daniel Münch & C. Giovanni Galizia
- Original R package: rOpenSci DoOR project
- FlyWire Consortium: For comprehensive connectome data
- Contributors: Cole Hanan and the Drosophila neuroscience community
- Raman Lab: WashU neuroscience research
MIT License - see LICENSE file for details.
- PyPI: https://pypi.org/project/door-python-toolkit/
- GitHub: https://github.com/colehanan1/door-python-toolkit
- Documentation: https://door-python-toolkit.readthedocs.io
- Issues: https://github.com/colehanan1/door-python-toolkit/issues
- Original DoOR: https://github.com/ropensci/DoOR.data
- FlyWire: https://flywire.ai/
- Raman Lab: https://ramanlab.wustl.edu/
Made with ❤️ for the Drosophila neuroscience community