Reproducibility of Machine Learning-Based Fault Detection and Diagnosis for HVAC Systems in Buildings: An Empirical Study
Reproducibility is a cornerstone of credible scientific research. The topic has gained prominence in fields such as psychology, medicine, and artificial intelligence, where concerns about non-replicable results sparked ongoing discussions about research practices. However, its status within machine learning for building systems remains underexamined. This work contributes to closing that gap by analyzing the reproducibility of machine learning-based fault detection and diagnosis studies published over the past decade. We found that nearly all articles are not reproducible due to insufficient disclosure across key dimensions of reproducibility. Notably, 72% of the articles do not specify whether the dataset used is public, proprietary, or commercially available. Only two papers share a link to their code, and one of those links is broken. Two-thirds of the publications were authored exclusively by academic researchers, yet no significant differences in reproducibility were observed compared to publications with industry-affiliated authors. These findings highlight the need for targeted interventions, including reproducibility guidelines, training for researchers, and policies by journals and conferences that promote transparency and reproducibility.
Repository layout:

- data/ - Recorded reproducibility variables (checklist).
- notebooks/ - Notebooks for data preprocessing, the Scopus API call, and data analysis.
- results/ - Output of the reproducibility assessment (e.g., plots).
- paper/ - The manuscript.
- figures/ - Figures used in the study.
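The notebooks retrieve the study corpus via the Scopus Search API. The exact query used in the study is defined in the notebooks; the sketch below is only an illustrative, hypothetical example of how such a query string and request URL can be assembled (the keywords and year range shown are placeholders, not the study's actual search terms).

```python
# Hypothetical sketch of building a Scopus Search API query.
# The real query lives in notebooks/; keywords and years here are placeholders.
import urllib.parse


def build_scopus_query(keywords, start_year, end_year):
    """Combine keyword and publication-year filters into a Scopus advanced query."""
    terms = " AND ".join(f'TITLE-ABS-KEY("{k}")' for k in keywords)
    # Scopus PUBYEAR filters are strict inequalities, so widen the bounds by one.
    return f"{terms} AND PUBYEAR > {start_year - 1} AND PUBYEAR < {end_year + 1}"


query = build_scopus_query(["fault detection", "HVAC", "machine learning"], 2014, 2024)
# The request itself requires an Elsevier API key (X-ELS-APIKey header), omitted here.
url = (
    "https://api.elsevier.com/content/search/scopus?"
    + urllib.parse.urlencode({"query": query})
)
print(query)
```

Running the actual search additionally requires institutional access and an API key from the Elsevier developer portal.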


