Classifier surrogates: sharing AI-based searches with the world

Abstract In recent years, neural network-based classification has been used to improve data analysis at collider experiments. While this strategy proves to be hugely successful, the underlying models are not commonly shared with the public and rely on experiment-internal data as well as full detecto...

Bibliographic Details
Main Authors:	Sebastian Bieringer, Gregor Kasieczka, Jan Kieseler, Mathias Trabs
Format:	Article
Language:	English
Published:	SpringerOpen 2024-09-01
Series:	European Physical Journal C: Particles and Fields
Online Access:	https://doi.org/10.1140/epjc/s10052-024-13353-w

collection	DOAJ
description	Abstract In recent years, neural network-based classification has been used to improve data analysis at collider experiments. While this strategy proves to be hugely successful, the underlying models are not commonly shared with the public and rely on experiment-internal data as well as full detector simulations. We show a concrete implementation of a newly proposed strategy, so-called Classifier Surrogates, to be trained inside the experiments, that only utilise publicly accessible features and truth information. These surrogates approximate the original classifier distribution, and can be shared with the public. Subsequently, such a model can be evaluated by sampling the classification output from high-level information without requiring a sophisticated detector simulation. Technically, we show that continuous normalizing flows are a suitable generative architecture that can be efficiently trained to sample classification results using conditional flow matching. We further demonstrate that these models can be easily extended by Bayesian uncertainties to indicate their degree of validity when confronted with unknown inputs by the user. For a concrete example of tagging jets from hadronically decaying top quarks, we demonstrate the application of flows in combination with uncertainty estimation through either inference of a mean-field Gaussian weight posterior, or Monte Carlo sampling network weights.
id	doaj-art-252ff79fd37144de9e72d5e535a4fcbd
institution	Kabale University
issn	1434-6052
spelling	European Physical Journal C: Particles and Fields, vol. 84, no. 9, pp. 1-10 (SpringerOpen, 2024-09-01); record last updated 2024-11-17T12:45:22Z
	Sebastian Bieringer: Institut für Experimentalphysik, Universität Hamburg
	Gregor Kasieczka: Institut für Experimentalphysik, Universität Hamburg
	Jan Kieseler: Institut für Experimentelle Teilchenphysik, Karlsruher Institut für Technologie
	Mathias Trabs: Institut für Stochastik, Karlsruher Institut für Technologie

Similar Items