Searching for the Ideal Recipe for Preparing Synthetic Data in the Multi-Object Detection Problem

The advancement of deep learning methods across various applications has forced the creation of enormous training datasets. However, obtaining suitable real-world datasets is often challenging for various reasons. Consequently, numerous studies have emerged focusing on the generation and utilization...

Bibliographic Details
Main Authors: Michał Staniszewski, Aleksander Kempski, Michał Marczyk, Marek Socha, Paweł Foszner, Mateusz Cebula, Agnieszka Labus, Michał Cogiel, Dominik Golba
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Applied Sciences
Subjects: multi-object detection; synthetic data generation; deep and transfer learning
Online Access: https://www.mdpi.com/2076-3417/15/1/354
_version_ 1841549369996738560
author Michał Staniszewski
Aleksander Kempski
Michał Marczyk
Marek Socha
Paweł Foszner
Mateusz Cebula
Agnieszka Labus
Michał Cogiel
Dominik Golba
author_sort Michał Staniszewski
collection DOAJ
description The advancement of deep learning methods across many applications has driven the creation of enormous training datasets. However, obtaining suitable real-world datasets is often challenging for various reasons. Consequently, numerous studies have focused on generating and using synthetic data in the training process. Even so, there is no universal formula for preparing synthetic data and using it in network training to maximize the effectiveness of various detection methods. This work provides an overview of several synthetic data generation techniques, followed by an investigation into the impact of training methods and of the quantity of synthetic data selected. The outcomes support conclusions about a recipe for developing synthetic data that is highly effective at enhancing detection methods. The main conclusion for synthetic data generation is to ensure maximum diversity at a high level of photorealism, which improves classification quality by more than 5% and up to 19%, depending on the detection metric.
format Article
id doaj-art-af3e615a5b7a4cd3a8f6f4fc12b499d3
institution Kabale University
issn 2076-3417
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-af3e615a5b7a4cd3a8f6f4fc12b499d3
Timestamp: 2025-01-10T13:15:17Z
Language: eng
Publisher: MDPI AG
Series: Applied Sciences
ISSN: 2076-3417
Published: 2025-01-01
Volume: 15
Issue: 1
Article: 354
DOI: 10.3390/app15010354
Title: Searching for the Ideal Recipe for Preparing Synthetic Data in the Multi-Object Detection Problem
Authors and affiliations:
Michał Staniszewski, Department of Computer Graphics, Vision and Digital Systems, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 2A, 44-100 Gliwice, Poland
Aleksander Kempski, Department of Computer Graphics, Vision and Digital Systems, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 2A, 44-100 Gliwice, Poland
Michał Marczyk, Department of Data Science and Engineering, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
Marek Socha, Department of Data Science and Engineering, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
Paweł Foszner, Department of Computer Graphics, Vision and Digital Systems, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 2A, 44-100 Gliwice, Poland
Mateusz Cebula, Department of Computer Graphics, Vision and Digital Systems, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 2A, 44-100 Gliwice, Poland
Agnieszka Labus, Department of Urban and Spatial Planning, Faculty of Architecture, Silesian University of Technology, Akademicka 7, 44-100 Gliwice, Poland
Michał Cogiel, QSystems.pro sp. z o.o. Mochnackiego 34, 41-907 Bytom, Poland
Dominik Golba, QSystems.pro sp. z o.o. Mochnackiego 34, 41-907 Bytom, Poland
URL: https://www.mdpi.com/2076-3417/15/1/354
Keywords: multi-object detection; synthetic data generation; deep and transfer learning
title Searching for the Ideal Recipe for Preparing Synthetic Data in the Multi-Object Detection Problem
topic multi-object detection
synthetic data generation
deep and transfer learning
url https://www.mdpi.com/2076-3417/15/1/354