A scaling law to model the effectiveness of identification techniques

Abstract: AI techniques are increasingly being used to identify individuals both offline and online. However, quantifying their effectiveness at scale and, by extension, the risks they pose remains a significant challenge. Here, we propose a two-parameter Bayesian model for exact matching techniques...

Bibliographic Details
Main Authors: Luc Rocher, Julien M. Hendrickx, Yves-Alexandre de Montjoye
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-024-55296-6
collection DOAJ
description Abstract: AI techniques are increasingly being used to identify individuals both offline and online. However, quantifying their effectiveness at scale and, by extension, the risks they pose remains a significant challenge. Here, we propose a two-parameter Bayesian model for exact matching techniques and derive an analytical expression for correctness (κ), the fraction of people accurately identified in a population. We then generalize the model to forecast how κ scales from small-scale experiments to the real world, for exact, sparse, and machine learning-based robust identification techniques. Despite having only two degrees of freedom, our method closely fits 476 correctness curves and strongly outperforms curve-fitting methods and entropy-based rules of thumb. Our work provides a principled framework for forecasting the privacy risks posed by identification techniques, while also supporting independent accountability efforts for AI-based biometric systems.
id doaj-art-cd9c3f8b56e942c29c5f0e3f03cfa3b5
institution Kabale University
issn 2041-1723
affiliations Luc Rocher: Oxford Internet Institute, University of Oxford
Julien M. Hendrickx: Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), Université catholique de Louvain
Yves-Alexandre de Montjoye: Data Science Institute, Imperial College London
citation Nature Communications, vol. 16, no. 1, pp. 1-11 (2025-01-01)