Searching for the Ideal Recipe for Preparing Synthetic Data in the Multi-Object Detection Problem

Bibliographic Details
Main Authors: Michał Staniszewski, Aleksander Kempski, Michał Marczyk, Marek Socha, Paweł Foszner, Mateusz Cebula, Agnieszka Labus, Michał Cogiel, Dominik Golba
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/1/354
Description
Summary: The advancement of deep learning methods across various applications has driven the creation of enormous training datasets. However, obtaining suitable real-world datasets is often challenging. Consequently, numerous studies have focused on generating synthetic data and using it in the training process. Even so, there is no universal formula for preparing synthetic data and leveraging it in network training to maximize the effectiveness of various detection methods. This work provides a comprehensive overview of several synthetic data generation techniques, followed by a thorough investigation into the impact of training methods and the quantity of synthetic data used. The results support conclusions about a recipe for developing synthetic data that is highly effective in enhancing detection methods. The main conclusion for synthetic data generation methods is to ensure maximum diversity at a high level of photorealism, which improves classification quality by more than 5% and up to 19%, depending on the detection metric.
ISSN: 2076-3417