Internal validation parameters of linear regression equations in QSAR problem

The article discusses a set of internal validation parameters that are (or can be) used to describe the quality of regression models in quantitative structure-activity relationship (QSAR) problems. These parameters include the well-known coefficient of determination, root mean square deviation, mean absol...

Bibliographic Details
Main Authors: Inna Khristenko, Volodymyr Ivanov
Format: Article
Language: English
Published: V. N. Karazin Kharkiv National University 2023-05-01
Series: Вісник Харківського національного університету: Серія хімія (Bulletin of Kharkiv National University: Chemistry Series)
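Subjects: quantitative structure-activity relationships (QSAR); regression models; internal validation; topological descriptors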
Online Access: https://periodicals.karazin.ua/chemistry/article/view/23245
author Inna Khristenko
Volodymyr Ivanov
collection DOAJ
description The article discusses a set of internal validation parameters that are (or can be) used to describe the quality of regression models in quantitative structure-activity relationship (QSAR) problems. These parameters include the well-known coefficient of determination, root mean square deviation, mean absolute error, and others. Indices based on the Kullback-Leibler divergence, used as a measure of the distance between two sets, are also investigated. All the parameters (indices) were calculated for several regression models that describe the boiling points of saturated hydrocarbons (alkanes). The regression models include a four-component additive scheme and equations describing the property as a function of topological indices. Two types of regression are based on these indices: linear dependence on a single topological index, and linear dependence on a topological index together with the number of carbon atoms in the hydrocarbon. The linear regression equations are characterized by internal validation parameters that evaluate the quality of the equations from different perspectives. It is shown that a wide set of test parameters not only offers an additional, alternative description of regression models, but also provides the most complete description of the predictive characteristics and quality of the obtained regression model.
format Article
id doaj-art-559796a2220a4aff850597c0e9845505
institution Kabale University
issn 2220-637X
2220-6396
language English
publishDate 2023-05-01
publisher V. N. Karazin Kharkiv National University
record_format Article
series Вісник Харківського національного університету: Серія хімія
spelling doaj-art-559796a2220a4aff850597c0e9845505; 2025-01-10T11:26:48Z; eng; V. N. Karazin Kharkiv National University; Вісник Харківського національного університету: Серія хімія; 2220-637X; 2220-6396; 2023-05-01; 40; 12; 21; 10.26565/2220-637X-2023-40-02; 23245; Internal validation parameters of linear regression equations in QSAR problem; Inna Khristenko (0); Volodymyr Ivanov (1); V. N. Karazin Kharkiv National University, 4 Svobody sq., Kharkiv, 61022, Ukraine (both authors); https://periodicals.karazin.ua/chemistry/article/view/23245
topic quantitative structure-activity relationships (qsar)
regression models
internal validation
topological descriptors
url https://periodicals.karazin.ua/chemistry/article/view/23245
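The internal validation parameters named in the abstract can be illustrated with a minimal sketch. It assumes ordinary least squares with an intercept and uses synthetic "property" values rather than the article's boiling-point data. The function names (`fit_linear`, `predict`, `r2`, `rmsd`, `mae`, `kl_index`) are hypothetical. The Kullback-Leibler index shown here, which normalizes observed and predicted values to sum to one, is only one plausible reading of "a measure of distance between two sets" and may differ from the authors' definition. The Wiener index of an n-alkane chain, n(n² − 1)/6, stands in for a generic topological index.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Evaluate the fitted linear equation on descriptors X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def r2(y, y_hat):
    """Coefficient of determination."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def rmsd(y, y_hat):
    """Root mean square deviation."""
    return np.sqrt(np.mean((y - y_hat) ** 2))


def mae(y, y_hat):
    """Mean absolute error."""
    return np.mean(np.abs(y - y_hat))


def kl_index(y, y_hat):
    """Kullback-Leibler divergence between the two sets of values.

    Assumption: each set is strictly positive (e.g. temperatures in kelvin)
    and is normalized to sum to 1 before comparison.
    """
    p = y / np.sum(y)
    q = y_hat / np.sum(y_hat)
    return np.sum(p * np.log(p / q))


# Descriptors for linear chains with 1..6 carbon atoms.
n_c = np.arange(1, 7, dtype=float)
wiener = n_c * (n_c ** 2 - 1) / 6  # Wiener index of a path graph

# Synthetic property values (NOT measured boiling points).
y_obs = 120 + 45 * n_c - 0.5 * wiener + np.array([1, -1, 0.5, -0.5, 1, -1])

# Model A: one topological index. Model B: index plus number of carbon atoms.
X_a = wiener
X_b = np.column_stack([wiener, n_c])
pred_a = predict(fit_linear(X_a, y_obs), X_a)
pred_b = predict(fit_linear(X_b, y_obs), X_b)

for name, pred in [("index only", pred_a), ("index + n_C", pred_b)]:
    print(name, r2(y_obs, pred), rmsd(y_obs, pred), mae(y_obs, pred),
          kl_index(y_obs, pred))
```

Reporting several indices side by side, as the loop does, reflects the abstract's point that each parameter evaluates the equations from a different perspective. The two models mirror the two regression types described: one descriptor alone, and the same descriptor combined with the carbon count.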