A Proposal for a New Python Library Implementing Stepwise Procedure

Careful variable selection in problems with large volumes of data is extremely important: it reduces the complexity of the model, improves the interpretation of the results, and increases computational efficiency, supporting more accurate and relevant analyses. This paper presents a comprehensi...

Bibliographic Details
Main Authors: Luiz Paulo Fávero, Helder Prado Santos, Patrícia Belfiore, Alexandre Duarte, Igor Pinheiro de Araújo Costa, Adilson Vilarinho Terra, Miguel Ângelo Lellis Moreira, Wilson Tarantin Junior, Marcos dos Santos
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Algorithms
Subjects: supervised machine learning; stepwise function; Python function
Online Access: https://www.mdpi.com/1999-4893/17/11/502
author Luiz Paulo Fávero
Helder Prado Santos
Patrícia Belfiore
Alexandre Duarte
Igor Pinheiro de Araújo Costa
Adilson Vilarinho Terra
Miguel Ângelo Lellis Moreira
Wilson Tarantin Junior
Marcos dos Santos
collection DOAJ
description Careful variable selection in problems with large volumes of data is extremely important: it reduces the complexity of the model, improves the interpretation of the results, and increases computational efficiency, supporting more accurate and relevant analyses. This paper presents a comprehensive approach to selecting variables in multiple regression models using the stepwise procedure. As the main contribution of this study, we present a stepwise function implemented in Python that improves the effectiveness of statistical analyses by allowing intuitive and efficient selection of statistically significant variables. The application of the function is illustrated in a real-world case study of real estate pricing, which validates its effectiveness in improving the fit of regression models. In addition, we present a methodological framework for treating joint problems in data analysis, such as heteroskedasticity, multicollinearity, and non-normality of residuals. This framework offers a robust computational implementation to mitigate such issues. This study aims to advance the understanding and application of statistical methods in Python, providing valuable tools for researchers, students, and professionals from various areas.
format Article
id doaj-art-17948ddf59014740afcfce4145ac0a23
institution Kabale University
issn 1999-4893
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Algorithms
spelling doaj-art-17948ddf59014740afcfce4145ac0a23
2024-11-26T17:45:25Z
eng
MDPI AG
Algorithms
1999-4893
2024-11-01
Volume 17, Issue 11, Article 502
10.3390/a17110502
A Proposal for a New Python Library Implementing Stepwise Procedure
Luiz Paulo Fávero (Faculty of Economics, Administration, and Accounting, University of Sao Paulo, Sao Paulo 05508-900, Brazil)
Helder Prado Santos (Faculty of Economics, Administration, and Accounting, University of Sao Paulo, Sao Paulo 05508-900, Brazil)
Patrícia Belfiore (Department of Management Engineering, Federal University of ABC, Sao Bernardo do Campo 09606-045, Brazil)
Alexandre Duarte (Polytechnic School, University of Sao Paulo, Sao Paulo 05508-010, Brazil)
Igor Pinheiro de Araújo Costa (Production Engineering Department, Fluminense Federal University, Niteroi 24210-240, Brazil)
Adilson Vilarinho Terra (Production Engineering Department, Fluminense Federal University, Niteroi 24210-240, Brazil)
Miguel Ângelo Lellis Moreira (Production Engineering Department, Fluminense Federal University, Niteroi 24210-240, Brazil)
Wilson Tarantin Junior (Faculty of Economics, Administration, and Accounting, University of Sao Paulo, Sao Paulo 05508-900, Brazil)
Marcos dos Santos (Systems and Computing Department, Military Institute of Engineering, Rio de Janeiro 22290-270, Brazil)
https://www.mdpi.com/1999-4893/17/11/502
supervised machine learning
stepwise function
Python function
title A Proposal for a New Python Library Implementing Stepwise Procedure
topic supervised machine learning
stepwise function
Python function
url https://www.mdpi.com/1999-4893/17/11/502
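The description in this record refers to a stepwise procedure that keeps only statistically significant variables in a multiple regression model. The record does not show the authors' function, so the following is only a minimal, generic sketch of one common variant (backward elimination by p-value), not the library's API. The names `solve_ols` and `backward_stepwise`, the threshold `alpha=0.05`, and the toy columns `price`, `area`, and `noise` are all hypothetical, and the p-values use a normal approximation rather than the Student's t distribution that a statistics package would normally use.

```python
import math
from statistics import NormalDist


def solve_ols(X, y):
    """Fit y = X b by the normal equations; return (coefficients, standard errors)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)] for i in range(k)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [xtx[i] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[r] - sum(X[r][j] * beta[j] for j in range(k)) for r in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]
    return beta, se


def backward_stepwise(data, target, alpha=0.05):
    """Drop the least significant predictor until every p-value is <= alpha.

    Assumption: two-sided p-values come from a normal approximation, and the
    fit is not perfect (a zero standard error would divide by zero).
    """
    names = [c for c in data if c != target]
    y = data[target]
    while names:
        X = [[1.0] + [data[c][r] for c in names] for r in range(len(y))]
        beta, se = solve_ols(X, y)
        pvals = {
            c: 2 * (1 - NormalDist().cdf(abs(b / s)))
            for c, b, s in zip(names, beta[1:], se[1:])  # skip the intercept
        }
        worst = max(pvals, key=pvals.get)
        if pvals[worst] <= alpha:
            break
        names.remove(worst)
    return names


# Invented toy data: price = 2 + 3 * area plus a small residual; "noise" is unrelated.
data = {
    "price": [5.5, 7.5, 10.5, 14.5, 17.5, 19.5, 22.5, 26.5],
    "area": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
selected = backward_stepwise(data, "price")
print(selected)
```

On this toy data the unrelated column is dropped and only `area` remains. A practical implementation would normally delegate the fit and the p-values to an established statistics package and would also cover the diagnostics named in the description (heteroskedasticity, multicollinearity, and normality of residuals), which this sketch does not attempt.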