Synthetic dataset of ID and Travel Documents

Abstract This paper presents a new synthetic dataset of ID and travel documents, called SIDTD. The SIDTD dataset is created to help train and evaluate forged ID document detection systems. Such a dataset has become a necessity as ID documents contain personal information and a public dataset o...

Bibliographic Details
Main Authors: Carlos Boned, Maxime Talarmain, Nabil Ghanmi, Guillaume Chiron, Sanket Biswas, Ahmad Montaser Awal, Oriol Ramos Terrades
Format: Article
Language: English
Published: Nature Portfolio 2024-12-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-024-04160-9
author Carlos Boned
Maxime Talarmain
Nabil Ghanmi
Guillaume Chiron
Sanket Biswas
Ahmad Montaser Awal
Oriol Ramos Terrades
author_sort Carlos Boned
collection DOAJ
description Abstract This paper presents a new synthetic dataset of ID and travel documents, called SIDTD. The SIDTD dataset is created to help train and evaluate forged ID document detection systems. Such a dataset has become a necessity because ID documents contain personal information, so a public dataset of real documents cannot be released. Moreover, forged documents are scarce compared to legitimate ones, and the way they are generated varies from one fraudster to another, resulting in a class with high intra-class variability. In this paper we introduce a synthetically generated dataset that simulates the most common and easiest forgeries that ordinary users can make of ID and travel documents. This dataset will help the document image analysis community make progress on automatic ID document verification in online onboarding systems.
format Article
id doaj-art-a2e14e4f8c5e48ca99a91803ffa374d7
institution Kabale University
issn 2052-4463
language English
publishDate 2024-12-01
publisher Nature Portfolio
record_format Article
series Scientific Data
spelling doaj-art-a2e14e4f8c5e48ca99a91803ffa374d7
2024-12-22T12:14:35Z
eng
Nature Portfolio
Scientific Data
2052-4463
2024-12-01
111110
10.1038/s41597-024-04160-9
Synthetic dataset of ID and Travel Documents
Carlos Boned (0)
Maxime Talarmain (1)
Nabil Ghanmi (2)
Guillaume Chiron (3)
Sanket Biswas (4)
Ahmad Montaser Awal (5)
Oriol Ramos Terrades (6)
(0) Computer Vision Centre
(1) Computer Vision Centre
(2) IDNow
(3) IDNow
(4) Computer Vision Centre
(5) IDNow
(6) Computer Vision Centre
Abstract This paper presents a new synthetic dataset of ID and travel documents, called SIDTD. The SIDTD dataset is created to help train and evaluate forged ID document detection systems. Such a dataset has become a necessity because ID documents contain personal information, so a public dataset of real documents cannot be released. Moreover, forged documents are scarce compared to legitimate ones, and the way they are generated varies from one fraudster to another, resulting in a class with high intra-class variability. In this paper we introduce a synthetically generated dataset that simulates the most common and easiest forgeries that ordinary users can make of ID and travel documents. This dataset will help the document image analysis community make progress on automatic ID document verification in online onboarding systems.
https://doi.org/10.1038/s41597-024-04160-9
title Synthetic dataset of ID and Travel Documents
url https://doi.org/10.1038/s41597-024-04160-9