Vol.4,No.3,2024
OPEN ACCESS
ARTICLE
Evaluating suitability of regression models in small data regimes using concrete with recycled copper tailings as a case study
Ahed Habib, Samer Barakat, Samir Dirar, Salah Al-Toubat, Zaid A. Al-Sadoon
Sustainable Structures, Vol. 4, No. 3, 2024. DOI: 10.54113/j.sust.2024.000056. Published online: 2024-12-1
Abstract
The use of regression models to predict construction material properties is well established, yet their performance on small datasets remains unclear. This study investigates the performance of different regression models combined with various data preprocessing techniques in contexts where data are limited. Specifically, it evaluates the suitability of five regression models across nine data processing scenarios, using concrete with recycled copper tailings as a case study, and aims to determine which combinations of regression model and preprocessing method yield the most accurate predictions in small data regimes. The work is motivated by the need to improve prediction reliability for construction materials, where experimental data are often scarce or costly to obtain. A dataset of 21 experimental specimens is used to evaluate the models on several concrete properties: fresh density, compressive strength, flexural strength, pull-off strength, abrasion resistance, water penetration, rapid chloride ion permeability, and air permeability. Based on an evaluation that uses 10-fold cross-validation to verify accuracy, the results show that selecting the optimal regression model and data preprocessing technique substantially improves prediction outcomes, even with limited data. The findings suggest that even small datasets, when handled correctly, can provide robust insights.
Keywords
Regression models; small data regimes; copper tailing concrete; multivariable regression; data preprocessing