On Systems of Neural ODEs with Generalized Power Activation Functions
When constructing neural network-based models, it is common practice to use time-tested activation functions such as the hyperbolic tangent, the sigmoid, or the ReLU. These choices, however, may be suboptimal. The hyperbolic tangent and the sigmoid are differentiable but bounded,...
| Main Authors: | Vasiliy Ye. Belozyorov, Yevhen V. Koshel |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Oles Honchar Dnipro National University, 2024-08-01 |
| Series: | Journal of Optimization, Differential Equations and Their Applications |
| Subjects: | system of ordinary autonomous differential equations; limit cycle; chaotic attractor; logistic mapping; residual neural network; activation function; time series |
| Online Access: | https://model-dnu.dp.ua/index.php/SM/article/view/201 |
| _version_ | 1846139647130861568 |
|---|---|
| author | Vasiliy Ye. Belozyorov; Yevhen V. Koshel |
| author_facet | Vasiliy Ye. Belozyorov; Yevhen V. Koshel |
| author_sort | Vasiliy Ye. Belozyorov |
| collection | DOAJ |
| description | When constructing neural network-based models, it is common practice to use time-tested activation functions such as the hyperbolic tangent, the sigmoid, or the ReLU. These choices, however, may be suboptimal. The hyperbolic tangent and the sigmoid are differentiable but bounded, which can lead to the vanishing-gradient problem. The ReLU is unbounded but not differentiable at 0, which may lead to suboptimal training with some optimizers. One can try sigmoid-like functions such as the cubic root, but it is likewise not differentiable at 0. One activation function that is often overlooked is the identity. Although it does not by itself introduce nonlinear behavior into the model, it can help build more explainable models more quickly because its evaluation costs nothing, while the nonlinearities can be supplied by the model's evaluation rule. In this article, we explore the use of a specially designed unbounded, differentiable generalized power activation function, the identity function, and their combinations for approximating univariate time-series data with neural ordinary differential equations. Examples are given. |
| format | Article |
| id | doaj-art-df0568d3f85647d198511ee1dea4fb5a |
| institution | Kabale University |
| issn | 2617-0108; 2663-6824 |
| language | English |
| publishDate | 2024-08-01 |
| publisher | Oles Honchar Dnipro National University |
| record_format | Article |
| series | Journal of Optimization, Differential Equations and Their Applications |
| spelling | doaj-art-df0568d3f85647d198511ee1dea4fb5a; indexed 2024-12-06T09:13:59Z; doi:10.15421/142409; affiliations: Vasiliy Ye. Belozyorov and Yevhen V. Koshel, Faculty of Applied Mathematics and Information Technologies, Oles Honchar Dnipro National University |
| spellingShingle | Vasiliy Ye. Belozyorov; Yevhen V. Koshel; On Systems of Neural ODEs with Generalized Power Activation Functions; Journal of Optimization, Differential Equations and Their Applications; system of ordinary autonomous differential equations; limit cycle; chaotic attractor; logistic mapping; residual neural network; activation function; time series |
| title | On Systems of Neural ODEs with Generalized Power Activation Functions |
| title_full | On Systems of Neural ODEs with Generalized Power Activation Functions |
| title_fullStr | On Systems of Neural ODEs with Generalized Power Activation Functions |
| title_full_unstemmed | On Systems of Neural ODEs with Generalized Power Activation Functions |
| title_short | On Systems of Neural ODEs with Generalized Power Activation Functions |
| title_sort | on systems of neural odes with generalized power activation functions |
| topic | system of ordinary autonomous differential equations; limit cycle; chaotic attractor; logistic mapping; residual neural network; activation function; time series |
| url | https://model-dnu.dp.ua/index.php/SM/article/view/201 |
| work_keys_str_mv | AT vasiliyyebelozyorov onsystemsofneuralodeswithgeneralizedpoweractivationfunctions AT yevhenvkoshel onsystemsofneuralodeswithgeneralizedpoweractivationfunctions |
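The abstract's requirements (unbounded, differentiable everywhere, cheap to evaluate) can be illustrated with a small sketch. One plausible power-type form is phi_p(x) = sign(x)|x|^p with p > 1; the article's exact generalized power activation is defined behind the link above, so this form, and every name and parameter below, is an illustrative assumption rather than the authors' definition. For p > 1 the function is unbounded yet differentiable at 0 (its derivative p|x|^(p-1) vanishes there), while the cubic root (p = 1/3) has an unbounded derivative at 0, matching the abstract's objection to root-like activations.

```python
import numpy as np
from scipy.integrate import solve_ivp

def power_act(x, p=1.5):
    """Power-type activation (assumed form): sign(x) * |x|**p.

    For p > 1 it is unbounded and differentiable everywhere, with
    derivative p*|x|**(p-1) -> 0 as x -> 0. For p = 1/3 (the cubic
    root) the derivative blows up at 0, which is the objection the
    abstract raises against sigmoid-like root functions.
    """
    return np.sign(x) * np.abs(x) ** p

# A toy neural-ODE right-hand side mixing the power activation with a
# linear (identity-activation) part, in the spirit of the abstract:
#   dx/dt = W2 @ power_act(W1 @ x) + A @ x
rng = np.random.default_rng(0)
n, h = 2, 8                       # state and hidden dimensions (arbitrary)
W1 = rng.normal(scale=0.5, size=(h, n))
W2 = rng.normal(scale=0.5, size=(n, h))
A = -0.5 * np.eye(n)              # identity-path term, costs nothing to "activate"

def rhs(t, x):
    return W2 @ power_act(W1 @ x) + A @ x

sol = solve_ivp(rhs, (0.0, 5.0), np.array([0.1, -0.2]), dense_output=True)
print(sol.y[:, -1])               # state at t = 5
```

The linear term `A @ x` plays the role of the identity activation from the abstract: it adds no nonlinear evaluation cost, while the nonlinearity enters the model only through the power term, whose contribution near the origin is sublinear for p > 1 and therefore dominated by the linear damping.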