End‑to‑end deep‑learning pipeline for smart‑building power data
Data cleaning ▶ feature engineering ▶ LSTM/CNN‑LSTM/Transformer ▶ anomaly detection ▶ transfer learning.
| Skill | How it’s demonstrated here |
|---|---|
| Time‑series feature engineering | Cyclical encodings (sin/cos), holiday/weekend flags, Isolation‑Forest outlier capping, hour‑level resampling. |
| Model development at scale | LSTM, CNN‑LSTM, and Transformer architectures Bayesian‑tuned with Keras Tuner. |
| Rigorous evaluation | RMSE, MAE, R², CV‑RMSE, multi‑horizon error curves, hour‑of‑day error maps. |
| Transfer learning | 72 % RMSE drop by fine‑tuning the champion model on a low‑data floor. |
| Anomaly analytics | Mean + 3 σ threshold, 51 anomalies characterised by hour & weekday distributions. |
| Domain‑specific KPIs | EUI (205.4 kWh m⁻² yr⁻¹), load factor, peak/base loads, daily archetype profiles. |
| Reproducibility | Global np/tf seeds, Conda environment.yml, exact hyper‑params logged, notebook + HTML render. |
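The cyclical encodings from the feature-engineering row can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the `add_cyclical_features` helper, column names, and the dummy hourly index are all assumptions for the demo.

```python
import numpy as np
import pandas as pd

def add_cyclical_features(df: pd.DataFrame) -> pd.DataFrame:
    """Encode hour-of-day and day-of-week on the unit circle so that
    23:00 and 00:00 land next to each other, plus a weekend flag."""
    out = df.copy()
    hour = out.index.hour
    dow = out.index.dayofweek
    out["hour_sin"] = np.sin(2 * np.pi * hour / 24)
    out["hour_cos"] = np.cos(2 * np.pi * hour / 24)
    out["dow_sin"] = np.sin(2 * np.pi * dow / 7)
    out["dow_cos"] = np.cos(2 * np.pi * dow / 7)
    out["is_weekend"] = (dow >= 5).astype(int)
    return out

# Demo on a dummy hourly index (2018-07-01 is a Sunday)
idx = pd.date_range("2018-07-01", periods=48, freq="h")
demo = add_cyclical_features(pd.DataFrame({"kwh": np.ones(48)}, index=idx))
```

The sin/cos pair matters because a raw `hour` column makes 23 and 0 look maximally far apart, which misleads distance-based and gradient-based models alike.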
```text
EnergyAI-TimeSeriesLab/
├── EnergyAI-TimeSeriesLab.ipynb   # main notebook (fully explained, runnable)
├── EnergyAI-TimeSeriesLab.html    # static render for quick browsing
├── dataset/                       # Floor1‑7 CSVs (tracked with Git LFS)
├── environment.yml                # conda spec for Apple‑Silicon + TensorFlow‑Metal
└── README.md
```

```bash
# 1️⃣ Clone + set up env
git clone https://github.com/mnikoopayan/EnergyAI-TimeSeriesLab.git
cd EnergyAI-TimeSeriesLab
conda env create -f environment.yml
conda activate tf_m1

# 2️⃣ (If you fork) Pull the seven Floor*.csv files into ./dataset
#    Files are tracked with Git LFS, so `git lfs pull` will fetch them.

# 3️⃣ Launch the lab
jupyter lab EnergyAI-TimeSeriesLab.ipynb
```

All code blocks are cell‑by‑cell runnable; each section is self‑contained, so you can jump straight to modelling or anomaly analytics.
| Model | RMSE (kWh) | MAE (kWh) | R² | CV‑RMSE (%) | Train time |
|---|---|---|---|---|---|
| Tuned LSTM | 3.62 | 2.17 | 0.986 | 16.5 | 203 s |
| CNN‑LSTM | 3.71 | 2.18 | 0.985 | 16.9 | 314 s |
| Transformer | 4.29 | 2.42 | 0.981 | 19.6 | 422 s |
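The metrics in the table above can be computed as sketched below. The helper name and the synthetic data are illustrative assumptions; CV‑RMSE here is taken as RMSE normalised by the mean observed load, expressed as a percentage (the usual ASHRAE‑style definition).

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def forecast_metrics(y_true, y_pred):
    """RMSE/MAE in kWh, dimensionless R², and CV-RMSE as a
    percentage of the mean observed load."""
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    mae = mean_absolute_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    cv_rmse = 100 * rmse / np.mean(y_true)
    return {"rmse": rmse, "mae": mae, "r2": r2, "cv_rmse_pct": cv_rmse}

# Illustrative check on synthetic data (not the table's numbers)
y = np.array([20.0, 22.0, 25.0, 21.0])
m = forecast_metrics(y, y + np.array([0.5, -0.5, 0.5, -0.5]))
```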
| Horizon | RMSE | MAE | R² | NMAE % |
|---|---|---|---|---|
| 1 h | 4.22 | 2.59 | 0.981 | 11.9 |
| 3 h | 4.89 | 2.92 | 0.975 | 13.4 |
| 6 h | 4.92 | 3.02 | 0.974 | 13.8 |
| 12 h | 6.71 | 3.56 | 0.952 | 16.4 |
| 24 h | 8.13 | 4.00 | 0.929 | 18.4 |
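Per‑horizon error curves like the table above come from scoring each output column of the multi‑output forecaster separately. A minimal sketch, with random arrays standing in for the real test targets and model predictions:

```python
import numpy as np

def per_horizon_rmse(y_true, y_pred):
    """y_true, y_pred: (n_samples, n_horizons) arrays of kWh.
    Returns one RMSE per forecast horizon, which is how error
    growth from 1 h to 24 h ahead is tabulated."""
    err = y_pred - y_true
    return np.sqrt(np.mean(err ** 2, axis=0))

rng = np.random.default_rng(42)
y = rng.uniform(10, 40, size=(100, 5))            # stand-in test targets
noise = rng.normal(0, [1, 2, 3, 4, 5], (100, 5))  # error grows with horizon
rmse_by_h = per_horizon_rmse(y, y + noise)
```

The monotone rise across columns mirrors the table: uncertainty compounds as the model predicts further ahead.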
| Strategy | Test RMSE |
|---|---|
| Train from scratch | 20.78 kWh |
| Fine‑tuned champion | 5.76 kWh |
Δ = −72 % error, showing how a pretrained sequence encoder cuts the data needed on a new floor.
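The freeze‑and‑fine‑tune recipe behind that gap can be sketched in Keras. The architecture, layer names, and shapes below are stand‑ins (the real champion would be restored with `keras.models.load_model`); the point is the pattern: freeze the sequence encoder, retrain the head with a small learning rate on the scarce floor's data.

```python
import numpy as np
from tensorflow import keras

# Stand-in for the pretrained champion LSTM; shapes are illustrative.
pretrained = keras.Sequential([
    keras.Input(shape=(24, 4)),           # 24-step window, 4 features
    keras.layers.LSTM(32, name="encoder"),
    keras.layers.Dense(1, name="head"),
])

# Freeze the sequence encoder so only the head adapts, then
# fine-tune with a small learning rate on the low-data floor.
pretrained.get_layer("encoder").trainable = False
pretrained.compile(optimizer=keras.optimizers.Adam(1e-4), loss="mse")

x_small = np.random.rand(64, 24, 4).astype("float32")  # scarce target-floor data
y_small = np.random.rand(64, 1).astype("float32")
pretrained.fit(x_small, y_small, epochs=1, batch_size=16, verbose=0)
```

Freezing keeps the temporal representations learned on the data‑rich floors intact, so the few target‑floor samples are spent only on recalibrating the output mapping.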
- Data wrangling – 689 128 rows × 30 sensor channels (per floor) cleaned; power → energy via hourly means.
- Feature factory – time & cyclical features, holiday flags, Isolation‑Forest capping at 99.5 % quantile.
- Model zoo – three deep‑learning families, Bayesian‑optimised then fully trained with early‑stopping.
- Champion selection – LSTM wins on RMSE + compute cost; extended to multi‑output forecaster.
- Explainability hooks – predictions‑vs‑actuals plots, error heat‑maps, anomaly visualisations.
- Cross‑floor transfer – demonstrates model reuse when Phase‑II deployment data are scarce.
- Building KPIs – EUI, load factor, weekday/weekend archetypes for facility ops teams.
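The Isolation‑Forest capping step from the feature factory can be sketched as below. The helper, the contamination rate, and the demo series are illustrative assumptions; the idea is to clip flagged spikes at the 99.5 % quantile rather than drop rows, so the hourly time grid stays unbroken.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

def cap_outliers(series: pd.Series, q: float = 0.995) -> pd.Series:
    """Flag anomalous readings with an Isolation Forest, then clip
    flagged values above the q-quantile back down to it."""
    x = series.to_numpy().reshape(-1, 1)
    # -1 marks outliers; contamination is an illustrative guess
    flags = IsolationForest(contamination=0.01, random_state=42).fit_predict(x)
    cap = series.quantile(q)
    capped = series.copy()
    capped[(flags == -1) & (series > cap)] = cap
    return capped

# Demo: two huge spikes among plausible kWh readings
s = pd.Series([20.0] * 200 + [21.0] * 198 + [500.0, 600.0])
clean = cap_outliers(s)
```

Capping instead of deleting preserves a contiguous series, which the windowed LSTM/CNN‑LSTM/Transformer inputs depend on.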
- Python ≥ 3.10
- TensorFlow 2.16 + Metal for Apple Silicon (see environment.yml)
- keras‑tuner, scikit‑learn, seaborn, pandas, matplotlib
- Integrate weather covariates (dry‑bulb, dew‑point) via API pull.
- Serve forecasts through FastAPI + Streamlit dashboard.
- Extend transfer‑learning demo to unseen buildings (domain adaptation).
Dataset: CU‑BEMS — Smart‑building electricity & indoor environmental sensor data (one‑minute resolution, Jul 2018 – Dec 2019).
Chulalongkorn University, Bangkok; 11,700 m² academic office building.
- Citation: Pipattanasomporn M. et al. “CU‑BEMS, smart building electricity consumption and indoor environmental sensor datasets.” Scientific Data 7, 241 (2020). DOI: 10.1038/s41597‑020‑00582‑3.
- License: Creative Commons CC‑BY 4.0 — free to share and adapt with attribution.
Mohammad Saleh Nikoopayan Tak – PhD candidate @ NJIT | Data‑Science for the Built Environment
LinkedIn • GitHub • Google Scholar • ResearchGate
Open to collaborations and conversations on smart‑building data science!