Predictive performance of count regression models versus machine learning techniques: A comparative analysis using an automobile insurance claims frequency dataset.

Accurate forecasting of claim frequency in automobile insurance is essential for insurers to assess risks effectively and establish appropriate pricing policies. Traditional methods typically rely on a Poisson distribution for modeling claim counts; however, this approach can be inadequate due to fr...

Bibliographic Details
Main Author:	Gadir Alomair
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2024-01-01
Series:	PLoS ONE
Online Access:	https://doi.org/10.1371/journal.pone.0314975

_version_	1841555644686008320
author	Gadir Alomair
collection	DOAJ
description	Accurate forecasting of claim frequency in automobile insurance is essential for insurers to assess risks effectively and establish appropriate pricing policies. Traditional methods typically rely on a Poisson distribution for modeling claim counts; however, this approach can be inadequate due to frequent zero-claim periods, leading to zero inflation in the data. Zero inflation occurs when more zeros are observed than expected under standard Poisson or negative binomial (NB) models. While machine learning (ML) techniques have been explored for predictive analytics in other contexts, their application to zero-inflated insurance data remains limited. This study investigates the utility of ML in improving forecast accuracy under conditions of zero-inflation, a data characteristic common in automobile insurance. The research involved a comparative evaluation of several models, including Poisson, NB, zero-inflated Poisson (ZIP), hurdle Poisson, zero-inflated negative binomial (ZINB), hurdle negative binomial, random forest (RF), support vector machine (SVM), and artificial neural network (ANN) on an insurance dataset. The performance of these models was assessed using mean absolute error. The results reveal that the SVM model outperforms others in predictive accuracy, particularly in handling zero-inflation, followed by the ZIP and ZINB models. In contrast, the traditional Poisson and NB models showed lower predictive capabilities. By addressing the challenge of zero-inflation in automobile claim data, this study offers insights into improving the accuracy of claim frequency predictions. Although this study is based on a single dataset, the findings provide valuable perspectives on enhancing prediction accuracy and improving risk management practices in the insurance industry.
format	Article
id	doaj-art-f2bec89aafa74e7688e583bbbd3dd9a2
institution	Kabale University
issn	1932-6203
language	English
publishDate	2024-01-01
publisher	Public Library of Science (PLoS)
record_format	Article
series	PLoS ONE
spelling	doaj-art-f2bec89aafa74e7688e583bbbd3dd9a2 | 2025-01-08T05:32:02Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2024-01-01 | 19 | 12 | e0314975 | 10.1371/journal.pone.0314975 | Predictive performance of count regression models versus machine learning techniques: A comparative analysis using an automobile insurance claims frequency dataset. | Gadir Alomair | https://doi.org/10.1371/journal.pone.0314975
title	Predictive performance of count regression models versus machine learning techniques: A comparative analysis using an automobile insurance claims frequency dataset.
url	https://doi.org/10.1371/journal.pone.0314975
work_keys_str_mv	AT gadiralomair predictiveperformanceofcountregressionmodelsversusmachinelearningtechniquesacomparativeanalysisusinganautomobileinsuranceclaimsfrequencydataset
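
The description above defines zero inflation as observing more zeros than a standard Poisson model would expect, and states that the models were compared by mean absolute error. The following is a minimal sketch of both ideas in Python, not the author's code: the claim counts are made up for illustration, the function names are hypothetical, and the "model" being scored is simply a constant prediction equal to the sample mean.

```python
import math


def poisson_zero_probability(mean_count: float) -> float:
    """P(Y = 0) for a Poisson variable with the given mean: exp(-mean)."""
    return math.exp(-mean_count)


def zip_zero_probability(pi: float, lam: float) -> float:
    """P(Y = 0) under a zero-inflated Poisson (ZIP) model.

    pi is the probability of a structural zero and lam is the Poisson mean
    of the count component, so P(Y = 0) = pi + (1 - pi) * exp(-lam).
    """
    return pi + (1.0 - pi) * math.exp(-lam)


def mean_absolute_error(actual, predicted) -> float:
    """Average of |actual - predicted| over all observations."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical claim counts for ten policies (illustrative only, not study data).
claims = [0, 0, 0, 0, 0, 0, 0, 1, 2, 3]

# The sample mean is the maximum-likelihood estimate of the Poisson mean
# when there are no covariates.
sample_mean = sum(claims) / len(claims)

# Share of zeros actually observed versus the share a Poisson model expects.
observed_zero_share = claims.count(0) / len(claims)
expected_zero_share = poisson_zero_probability(sample_mean)

# More observed zeros than expected is the symptom the description calls
# zero inflation.
excess_zeros = observed_zero_share > expected_zero_share

# Mean absolute error of the constant prediction "every policy has the
# sample-mean number of claims".
mae = mean_absolute_error(claims, [sample_mean] * len(claims))

print(f"observed zero share: {observed_zero_share:.3f}")
print(f"Poisson-expected zero share: {expected_zero_share:.3f}")
print(f"excess zeros: {excess_zeros}")
print(f"MAE of constant-mean prediction: {mae:.3f}")
```

In this toy sample the observed share of zeros exceeds what a Poisson model with the same mean would predict, which is the situation ZIP and hurdle models are designed for. Setting `pi` to 0 in `zip_zero_probability` recovers the plain Poisson zero probability.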


Similar Items