On Systems of Neural ODEs with Generalized Power Activation Functions

Bibliographic Details
Main Authors: Vasiliy Ye. Belozyorov, Yevhen V. Koshel
Format: Article
Language: English
Published: Oles Honchar Dnipro National University 2024-08-01
Series: Journal of Optimization, Differential Equations and Their Applications
Subjects:
Online Access: https://model-dnu.dp.ua/index.php/SM/article/view/201
_version_ 1846139647130861568
author Vasiliy Ye. Belozyorov
Yevhen V. Koshel
author_facet Vasiliy Ye. Belozyorov
Yevhen V. Koshel
author_sort Vasiliy Ye. Belozyorov
collection DOAJ
description When constructing neural network-based models, it is common practice to use time-tested activation functions such as the hyperbolic tangent, the sigmoid, or the ReLU. These choices, however, may be suboptimal. The hyperbolic tangent and the sigmoid are differentiable but bounded, which can lead to the vanishing gradient problem. The ReLU is unbounded but not differentiable at 0, which may lead to suboptimal training with some optimizers. One can try sigmoid-like functions such as the cube root, but it is likewise not differentiable at 0. One activation function that is often overlooked is the identity. Although it does not by itself introduce nonlinear behavior into the model, its evaluation costs nothing, so it can help build more explainable models more quickly, while the nonlinearities can be supplied by the model's evaluation rule. In this article, we explore the use of a specially designed unbounded, differentiable generalized power activation function, the identity function, and their combinations for approximating univariate time series data with neural ordinary differential equations. Examples are given.
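The record does not reproduce the article's exact definition of the generalized power activation, so the following minimal sketch is an illustration of the idea only, not the authors' construction. It assumes the form phi(x) = sign(x) * |x|^p with p > 1, which is unbounded (unlike tanh or the sigmoid) and differentiable everywhere including 0 (unlike ReLU or the cube root), and places it inside a small autonomous neural ODE fitted to a univariate series. The gen_power function, the two-layer vector field, the synthetic sine data, and the torchdiffeq solver are all assumptions for the sketch.

    # Illustrative sketch only -- not the authors' implementation.
    import torch
    import torch.nn as nn
    from torchdiffeq import odeint  # assumes torchdiffeq is installed

    def gen_power(x: torch.Tensor, p: float = 1.5) -> torch.Tensor:
        """Hypothetical generalized power activation: sign(x) * |x|**p.

        For p > 1 the derivative p * |x|**(p - 1) is continuous at 0,
        unlike ReLU or the cube root; unlike tanh, it is unbounded.
        """
        return torch.sign(x) * torch.abs(x) ** p

    class ODEFunc(nn.Module):
        """Right-hand side f(y) of the autonomous neural ODE dy/dt = f(y)."""

        def __init__(self, dim: int = 2, hidden: int = 16):
            super().__init__()
            self.lin1 = nn.Linear(dim, hidden)
            self.lin2 = nn.Linear(hidden, dim)  # identity activation on output

        def forward(self, t, y):
            return self.lin2(gen_power(self.lin1(y)))

    # Fit the first state component to a univariate series; the second
    # component is latent. The data here is a synthetic placeholder.
    t = torch.linspace(0.0, 1.0, 50)
    series = torch.sin(4.0 * t)  # stand-in for an observed time series
    func = ODEFunc()
    y0 = torch.tensor([float(series[0]), 0.0])
    opt = torch.optim.Adam(func.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        traj = odeint(func, y0, t)  # trajectory of shape (len(t), dim)
        loss = ((traj[:, 0] - series) ** 2).mean()
        loss.backward()
        opt.step()

The identity activation mentioned in the abstract appears here as the bare output layer: nonlinearity enters only through gen_power and through the ODE integration itself.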
format Article
id doaj-art-df0568d3f85647d198511ee1dea4fb5a
institution Kabale University
issn 2617-0108
2663-6824
language English
publishDate 2024-08-01
publisher Oles Honchar Dnipro National University
record_format Article
series Journal of Optimization, Differential Equations and Their Applications
spelling doaj-art-df0568d3f85647d198511ee1dea4fb5a
2024-12-06T09:13:59Z
eng
Oles Honchar Dnipro National University
Journal of Optimization, Differential Equations and Their Applications
2617-0108
2663-6824
2024-08-01
3225691
10.15421/142409193
On Systems of Neural ODEs with Generalized Power Activation Functions
Vasiliy Ye. Belozyorov (0)
Yevhen V. Koshel (1)
(0) Faculty of Applied Mathematics and Information Technologies, Oles Honchar Dnipro National University
(1) Faculty of Applied Mathematics and Information Technologies, Oles Honchar Dnipro National University
When constructing neural network-based models, it is common practice to use time-tested activation functions such as the hyperbolic tangent, the sigmoid, or the ReLU. These choices, however, may be suboptimal. The hyperbolic tangent and the sigmoid are differentiable but bounded, which can lead to the vanishing gradient problem. The ReLU is unbounded but not differentiable at 0, which may lead to suboptimal training with some optimizers. One can try sigmoid-like functions such as the cube root, but it is likewise not differentiable at 0. One activation function that is often overlooked is the identity. Although it does not by itself introduce nonlinear behavior into the model, its evaluation costs nothing, so it can help build more explainable models more quickly, while the nonlinearities can be supplied by the model's evaluation rule. In this article, we explore the use of a specially designed unbounded, differentiable generalized power activation function, the identity function, and their combinations for approximating univariate time series data with neural ordinary differential equations. Examples are given.
https://model-dnu.dp.ua/index.php/SM/article/view/201
system of ordinary autonomous differential equations
limit cycle
chaotic attractor
logistic mapping
residual neural network
activation function
time series
spellingShingle Vasiliy Ye. Belozyorov
Yevhen V. Koshel
On Systems of Neural ODEs with Generalized Power Activation Functions
Journal of Optimization, Differential Equations and Their Applications
system of ordinary autonomous differential equations
limit cycle
chaotic attractor
logistic mapping
residual neural network
activation function
time series
title On Systems of Neural ODEs with Generalized Power Activation Functions
title_full On Systems of Neural ODEs with Generalized Power Activation Functions
title_fullStr On Systems of Neural ODEs with Generalized Power Activation Functions
title_full_unstemmed On Systems of Neural ODEs with Generalized Power Activation Functions
title_short On Systems of Neural ODEs with Generalized Power Activation Functions
title_sort on systems of neural odes with generalized power activation functions
topic system of ordinary autonomous differential equations
limit cycle
chaotic attractor
logistic mapping
residual neural network
activation function
time series
url https://model-dnu.dp.ua/index.php/SM/article/view/201
work_keys_str_mv AT vasiliyyebelozyorov onsystemsofneuralodeswithgeneralizedpoweractivationfunctions
AT yevhenvkoshel onsystemsofneuralodeswithgeneralizedpoweractivationfunctions