On Systems of Neural ODEs with Generalized Power Activation Functions
When constructing neural network-based models, it is common practice to use time-tested activation functions such as the hyperbolic tangent, the sigmoid, or the ReLU. These choices, however, may be suboptimal. The hyperbolic tangent and the sigmoid are differentiable but bounded,...
| Main Authors: | Vasiliy Ye. Belozyorov, Yevhen V. Koshel |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Oles Honchar Dnipro National University, 2024-08-01 |
| Series: | Journal of Optimization, Differential Equations and Their Applications |
| Subjects: | system of ordinary autonomous differential equations; limit cycle; chaotic attractor; logistic mapping; residual neural network; activation function; time series |
| Online Access: | https://model-dnu.dp.ua/index.php/SM/article/view/201 |
| _version_ | 1846139647130861568 |
|---|---|
| author | Vasiliy Ye. Belozyorov; Yevhen V. Koshel |
| author_facet | Vasiliy Ye. Belozyorov; Yevhen V. Koshel |
| author_sort | Vasiliy Ye. Belozyorov |
| collection | DOAJ |
| description | When constructing neural network-based models, it is common practice to use time-tested activation functions such as the hyperbolic tangent, the sigmoid, or the ReLU. These choices, however, may be suboptimal. The hyperbolic tangent and the sigmoid are differentiable but bounded, which can lead to the vanishing-gradient problem. The ReLU is unbounded but not differentiable at 0, which may lead to suboptimal training with some optimizers. One can try sigmoid-like functions such as the cubic root, but it is likewise not differentiable at 0. One activation function that is often overlooked is the identity. Although it does not by itself introduce nonlinear behavior into the model, it can help build more explainable models more quickly because its evaluation costs nothing, while the nonlinearities can be supplied by the model's evaluation rule. In this article, we explore the use of a specially designed unbounded, differentiable generalized power activation function, the identity function, and their combinations for approximating univariate time-series data with neural ordinary differential equations. Examples are given. |
| format | Article |
| id | doaj-art-df0568d3f85647d198511ee1dea4fb5a |
| institution | Kabale University |
| issn | 2617-0108; 2663-6824 |
| language | English |
| publishDate | 2024-08-01 |
| publisher | Oles Honchar Dnipro National University |
| record_format | Article |
| series | Journal of Optimization, Differential Equations and Their Applications |
| spelling | doaj-art-df0568d3f85647d198511ee1dea4fb5a; indexed 2024-12-06T09:13:59Z; doi:10.15421/142409; affiliations: Vasiliy Ye. Belozyorov and Yevhen V. Koshel, Faculty of Applied Mathematics and Information Technologies, Oles Honchar Dnipro National University |
| spellingShingle | Vasiliy Ye. Belozyorov; Yevhen V. Koshel; On Systems of Neural ODEs with Generalized Power Activation Functions; Journal of Optimization, Differential Equations and Their Applications; system of ordinary autonomous differential equations; limit cycle; chaotic attractor; logistic mapping; residual neural network; activation function; time series |
| title | On Systems of Neural ODEs with Generalized Power Activation Functions |
| title_full | On Systems of Neural ODEs with Generalized Power Activation Functions |
| title_fullStr | On Systems of Neural ODEs with Generalized Power Activation Functions |
| title_full_unstemmed | On Systems of Neural ODEs with Generalized Power Activation Functions |
| title_short | On Systems of Neural ODEs with Generalized Power Activation Functions |
| title_sort | on systems of neural odes with generalized power activation functions |
| topic | system of ordinary autonomous differential equations; limit cycle; chaotic attractor; logistic mapping; residual neural network; activation function; time series |
| url | https://model-dnu.dp.ua/index.php/SM/article/view/201 |
| work_keys_str_mv | AT vasiliyyebelozyorov onsystemsofneuralodeswithgeneralizedpoweractivationfunctions AT yevhenvkoshel onsystemsofneuralodeswithgeneralizedpoweractivationfunctions |
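The abstract's requirements (unbounded, differentiable everywhere, cheap to evaluate) can be illustrated with a small sketch. One plausible power-type form is phi_p(x) = sign(x)|x|^p with p > 1; the article's exact generalized power activation is defined behind the link above, so this form, and every name and parameter below, is an illustrative assumption rather than the authors' definition. For p > 1 the function is unbounded yet differentiable at 0 (its derivative p|x|^(p-1) vanishes there), while the cubic root (p = 1/3) has an unbounded derivative at 0, matching the abstract's objection to root-like activations.

```python
import numpy as np
from scipy.integrate import solve_ivp

def power_act(x, p=1.5):
    """Power-type activation (assumed form): sign(x) * |x|**p.

    For p > 1 it is unbounded and differentiable everywhere, with
    derivative p*|x|**(p-1) -> 0 as x -> 0. For p = 1/3 (the cubic
    root) the derivative blows up at 0, which is the objection the
    abstract raises against sigmoid-like root functions.
    """
    return np.sign(x) * np.abs(x) ** p

# A toy neural-ODE right-hand side mixing the power activation with a
# linear (identity-activation) part, in the spirit of the abstract:
#   dx/dt = W2 @ power_act(W1 @ x) + A @ x
rng = np.random.default_rng(0)
n, h = 2, 8                       # state and hidden dimensions (arbitrary)
W1 = rng.normal(scale=0.5, size=(h, n))
W2 = rng.normal(scale=0.5, size=(n, h))
A = -0.5 * np.eye(n)              # identity-path term, costs nothing to "activate"

def rhs(t, x):
    return W2 @ power_act(W1 @ x) + A @ x

sol = solve_ivp(rhs, (0.0, 5.0), np.array([0.1, -0.2]), dense_output=True)
print(sol.y[:, -1])               # state at t = 5
```

The linear term `A @ x` plays the role of the identity activation from the abstract: it adds no nonlinear evaluation cost, while the nonlinearity enters the model only through the power term, whose contribution near the origin is sublinear for p > 1 and therefore dominated by the linear damping.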