Testing the Wind Energy Data Based on Environmental Factors Predicted by Machine Learning with Analysis of Variance

Bibliographic Details
Main Authors: Yasemin Ayaz Atalan, Abdulkadir Atalan
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Applied Sciences
Subjects: wind energy prediction; renewable energy; machine learning; ANOVA; statistical validation
Online Access: https://www.mdpi.com/2076-3417/15/1/241
_version_ 1841549420454215680
author Yasemin Ayaz Atalan
Abdulkadir Atalan
collection DOAJ
description This study proposes a two-stage methodology for predicting wind energy production from time, environmental, technical, and locational variables. In the first stage, five machine learning algorithms, random forest (RF), gradient boosting (GB), k-nearest neighbors (kNN), linear regression (LR), and decision tree (Tree), were used to estimate energy output. Among these, RF performed best, with the lowest error metrics (MSE: 0.003, RMSE: 0.053) and the highest R² value (0.988). In the second stage, analysis of variance (ANOVA) was conducted to evaluate the statistical relationships between the independent variables and the predicted dependent variable, identifying wind speed (p < 0.001) and rotor speed (p < 0.001) as the most influential factors. Furthermore, the RF and GB models produced the predictions most closely aligned with the actual data, achieving R² values of 88.83% and 89.30%, respectively, in the ANOVA validation phase. These findings demonstrate the robustness of integrating machine learning models with statistical verification methods.
format Article
id doaj-art-e05b4d2ecdae41bab650db7f50b4b404
institution Kabale University
issn 2076-3417
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-e05b4d2ecdae41bab650db7f50b4b404; 2025-01-10T13:14:54Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2024-12-01; vol. 15, no. 1, art. 241; doi:10.3390/app15010241; Testing the Wind Energy Data Based on Environmental Factors Predicted by Machine Learning with Analysis of Variance; Yasemin Ayaz Atalan (Department of Energy Management, Çanakkale Onsekiz Mart University, Çanakkale 17100, Turkey); Abdulkadir Atalan (Faculty of Engineering, Çanakkale Onsekiz Mart University, Çanakkale 17100, Turkey); https://www.mdpi.com/2076-3417/15/1/241; wind energy prediction; renewable energy; machine learning; ANOVA; statistical validation
title Testing the Wind Energy Data Based on Environmental Factors Predicted by Machine Learning with Analysis of Variance
topic wind energy prediction
renewable energy
machine learning
ANOVA
statistical validation
url https://www.mdpi.com/2076-3417/15/1/241
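
The abstract in this record reports MSE, RMSE, and R² for the prediction stage and an ANOVA for the validation stage. As a rough illustration of what those quantities measure, the sketch below computes them with the Python standard library only. It is not the authors' code: the function names, the toy numbers, and the idea of grouping predicted output by wind-speed bin are all assumptions made for this example, and the one-way ANOVA shown here is a simpler design than the multi-factor analysis the abstract implies.

```python
import math
from statistics import fmean


def regression_metrics(actual, predicted):
    """Return (MSE, RMSE, R^2) for paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = fmean(actual)
    # Residual and total sums of squares.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    mse = ss_res / len(actual)
    return mse, math.sqrt(mse), 1.0 - ss_res / ss_tot


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = fmean(x for g in groups for x in g)
    group_means = [fmean(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy values, invented for illustration only (not data from the article).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
mse, rmse, r2 = regression_metrics(actual, predicted)

# Hypothetical predicted output grouped by low / medium / high wind speed.
by_wind_speed = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(by_wind_speed)

print(f"MSE={mse:.4f} RMSE={rmse:.4f} R2={r2:.4f}")
print(f"F={f_stat:.2f} with df=({df_b}, {df_w})")
```

A large F statistic relative to its degrees of freedom corresponds to a small p-value, which is the sense in which the abstract reports wind speed and rotor speed as significant (p < 0.001). Turning F into a p-value requires the F distribution, which the standard library does not provide.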