Training Fully Convolutional Neural Networks for Lightweight, Non-Critical Instance Segmentation Applications

Bibliographic Details
Main Authors: Miguel Veganzones, Ana Cisnal, Eusebio de la Fuente, Juan Carlos Fraile
Format: Article
Language: English
Published: MDPI AG 2024-12-01
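Series: Applied Sciences
Subjects: computer vision; convolutional neural networks; deep learning; hand segmentation; semantic segmentation
Online Access: https://www.mdpi.com/2076-3417/14/23/11357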
collection DOAJ
description Augmented reality applications involving human interaction with virtual objects often rely on segmentation-based hand detection techniques. Semantic segmentation can then be enhanced with instance-specific information to model complex interactions between objects, but extracting such information typically increases the computational load significantly. This study proposes a training strategy that enables conventional semantic segmentation networks to preserve some instance information during inference. This is accomplished by introducing pixel weight maps into the loss calculation, increasing the importance of boundary pixels between instances. We compare two common fully convolutional network (FCN) architectures, U-Net and ResNet, and fine-tune the better-performing one to improve segmentation results. Although the resulting model does not reach state-of-the-art segmentation performance on the EgoHands dataset, it preserves some instance information with no computational overhead. As expected, degraded segmentations are a necessary trade-off to preserve boundaries when instances are close together. This strategy allows approximating instance segmentation in real time on non-specialized hardware, yielding a single blob with an intersection over union (IoU) greater than 50% for 79% of the instances in our test set. For lightweight applications, a simple FCN of the kind typically used for semantic segmentation shows promising instance segmentation results when per-pixel weight maps are introduced during training.
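The abstract does not say how the pixel weight maps are built or which loss they enter, so the following is only a minimal sketch under stated assumptions: background pixels lying within a small window of two or more distinct instances receive a larger weight, and the weights scale a per-pixel binary cross-entropy. The function names (`dilate`, `boundary_weight_map`, `weighted_bce`) and the `radius` and `boost` parameters are hypothetical and are not taken from the article.

```python
import numpy as np

def dilate(mask: np.ndarray, radius: int) -> np.ndarray:
    """Binary dilation with a square (2*radius+1) window, using only NumPy."""
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def boundary_weight_map(instances: np.ndarray, radius: int = 1,
                        boost: float = 10.0) -> np.ndarray:
    """Assumed scheme: weight 1 everywhere, `boost` on background pixels
    that lie within `radius` of at least two different instances.
    `instances` is an integer label image (0 = background, k > 0 = instance id)."""
    near_count = np.zeros(instances.shape, dtype=int)
    for k in np.unique(instances):
        if k != 0:
            near_count += dilate(instances == k, radius)
    weights = np.ones(instances.shape, dtype=float)
    weights[(instances == 0) & (near_count >= 2)] = boost
    return weights

def weighted_bce(prob: np.ndarray, target: np.ndarray,
                 weights: np.ndarray, eps: float = 1e-7) -> float:
    """Per-pixel binary cross-entropy, averaged with the given weights."""
    prob = np.clip(prob, eps, 1.0 - eps)
    per_pixel = -(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    return float((weights * per_pixel).sum() / weights.sum())

# Toy label image: two instances separated by a one-pixel background gap (column 3).
instances = np.zeros((5, 7), dtype=int)
instances[1:4, 1:3] = 1
instances[1:4, 4:6] = 2
weights = boundary_weight_map(instances, radius=1, boost=10.0)

# A prediction that wrongly fills the gap, merging both instances into one blob.
target = (instances > 0).astype(float)
merged = np.where(target == 1.0, 0.9, 0.1)
merged[:, 3] = 0.9

loss_uniform = weighted_bce(merged, target, np.ones_like(weights))
loss_weighted = weighted_bce(merged, target, weights)
```

In this toy case the weighted loss penalizes the merged prediction more heavily than the uniform loss does, which is the intended effect of emphasizing boundary pixels between instances. Because the weights are only used when computing the training loss, a scheme of this kind adds no cost at inference time.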
id doaj-art-f7579c42e32545ad91b12a5fb3881ac2
institution Kabale University
issn 2076-3417
Affiliation (all authors): Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Escuela de Ingenierías Industriales, Universidad de Valladolid, Paseo Prado de la Magdalena 3-5, 47011 Valladolid, Spain
Volume / Issue / Article number: 14 / 23 / 11357
DOI: 10.3390/app142311357
topic computer vision
convolutional neural networks
deep learning
hand segmentation
semantic segmentation