Integrating model tree and modified stepwise regression in concrete slump prediction and steel fabrication estimating

Publication: Canadian Journal of Civil Engineering
18 June 2021

Abstract

The M5 model tree algorithm is integrated with a multiple linear regression technique called modified stepwise regression (MSR), yielding a new method for modeling complex civil engineering problems. We purposefully chose artificial neural networks (ANN) as the benchmark for the proposed “M5+MSR” because the two methods fall at opposite ends of the model interpretability spectrum in machine learning. This research addresses the critical question of how to balance the trade-off among bias, variance, and model complexity in machine learning by contrasting “M5+MSR” against other commonly applied methods. In two application cases (Case 1: concrete workability; Case 2: steel fabrication estimating), the proposed “M5+MSR” produced explainable regression tree models that were substantially less complex than ANN while achieving prediction errors comparable to ANN. The resulting “M5+MSR” models consistently outperformed ANN on model overfitting metrics, by 19% in Case 1 and 21% in Case 2, indicating better learning performance. The proposed method could find applications in a wide range of complicated engineering problems that entail fitting prediction models to laboratory or field data.
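To make the general idea of a model tree concrete, the following is a minimal sketch, not the authors' method. It assumes a single split chosen by standard deviation reduction (the criterion commonly associated with M5) and fits a plain ordinary least squares model in each leaf. The abstract does not describe how MSR works, so ordinary least squares stands in for it here; pruning and smoothing are omitted. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept (a stand-in for MSR, by assumption)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Evaluate a fitted linear model on the rows of X."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def best_split(X, y, min_leaf=5):
    """Return the (feature, threshold) with the largest standard deviation reduction."""
    n, d = X.shape
    base = np.std(y)
    best_feature, best_threshold, best_sdr = None, None, 0.0
    for j in range(d):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            n_left = int(left.sum())
            if n_left < min_leaf or n - n_left < min_leaf:
                continue  # skip splits that leave too few samples in a leaf
            sdr = (base
                   - (n_left / n) * np.std(y[left])
                   - ((n - n_left) / n) * np.std(y[~left]))
            if sdr > best_sdr:
                best_feature, best_threshold, best_sdr = j, t, sdr
    return best_feature, best_threshold

def fit_model_stump(X, y, min_leaf=5):
    """One-split model tree: a linear regression in each of the two leaves."""
    j, t = best_split(X, y, min_leaf)
    if j is None:
        return {"leaf": fit_linear(X, y)}
    left = X[:, j] <= t
    return {"feature": j, "threshold": t,
            "left": fit_linear(X[left], y[left]),
            "right": fit_linear(X[~left], y[~left])}

def predict_model_stump(model, X):
    """Route each row to its leaf and apply that leaf's linear model."""
    if "leaf" in model:
        return predict_linear(model["leaf"], X)
    left = X[:, model["feature"]] <= model["threshold"]
    out = np.empty(len(X))
    out[left] = predict_linear(model["left"], X[left])
    out[~left] = predict_linear(model["right"], X[~left])
    return out

# Synthetic piecewise-linear data (hypothetical, not from the paper).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 2))
y = np.where(X[:, 0] <= 5.0, 1.0 + 0.5 * X[:, 1], 30.0 - 0.5 * X[:, 1])
y = y + rng.normal(0.0, 0.1, size=200)

model = fit_model_stump(X, y)
rmse_tree = np.sqrt(np.mean((predict_model_stump(model, X) - y) ** 2))
rmse_global = np.sqrt(np.mean((predict_linear(fit_linear(X, y), X) - y) ** 2))
print(model["feature"], round(float(model["threshold"]), 2))
print(f"tree RMSE={rmse_tree:.3f}, single-regression RMSE={rmse_global:.3f}")
```

The fitted model stays readable: one split rule plus two short regression equations, which is the kind of interpretability the abstract contrasts with ANN. A full model tree would apply the split recursively and prune the result.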


Supplementary Material

Supplementary data (cjce-2020-0753suppla.docx)

Information & Authors

Information

Published In

Canadian Journal of Civil Engineering
Volume 49, Number 4, April 2022
Pages: 478–486

History

Received: 16 November 2020
Accepted: 10 June 2021
Published online: 18 June 2021

Key Words

  1. multiple linear regression
  2. model tree
  3. modified stepwise regression
  4. concrete slump
  5. steel fabrication
  6. estimating

Authors

Affiliations

Arash Mohsenijam
Supreme Group, 28169 96 Ave., Acheson, AB T7X 6J7, Canada.
Department of Civil and Environmental Engineering, University of Alberta, 116 St & 85 Ave., Edmonton, AB T6G 2R3, Canada.
Serhii Naumets
Department of Civil and Environmental Engineering, University of Alberta, 116 St & 85 Ave., Edmonton, AB T6G 2R3, Canada.
