Application of Machine Learning to Background Rejection in Very-high-energy Gamma-Ray Observation

Bibliographic Details
Main Authors: Jie Li, Hongkui Lv, Yang Liu, Jiajun Huang, Yu Wang, Wenbin Lin
Format: Article
Language: English
Published: IOP Publishing 2025-01-01
Series: The Astrophysical Journal Supplement Series
Online Access: https://doi.org/10.3847/1538-4365/ad9581
collection DOAJ
description Identifying gamma rays and rejecting the cosmic-ray hadron background are crucial for very-high-energy gamma-ray observations and the science that depends on them. Eight high-level features were extracted from simulated data of the square kilometer array (KM2A) of LHAASO for gamma/hadron classification. Machine learning (ML) models, including logistic regression, support vector machines, decision trees, random forests, XGBoost, CatBoost, and deep neural networks (DNN), were constructed and trained on data sets in four energy bands spanning 10^12 to 10^16 eV, and were finally fused using the stacking ensemble algorithm. The classification ability of each model was assessed with the accuracy, F1 score, precision, recall, and area under the receiver operating characteristic curve (AUC). The results show that the ML methods significantly improve particle classification in LHAASO-KM2A, particularly in the low-energy range. Among these methods, XGBoost, CatBoost, and DNN demonstrate stronger classification capabilities than decision trees and random forests, while the fusion model exhibits the best discriminatory ability. The ML methods thus provide a useful alternative for gamma/hadron identification. The code used in this paper is available on Zenodo at doi: http://dx.doi.org/10.5281/zenodo.13623261 .
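The five evaluation metrics named in the description can be illustrated with a short, standard-library-only sketch. This is not the authors' code (that is available from the Zenodo record above); the function names, the toy labels and scores, the 0.5 decision threshold, and the choice of gamma rays as the positive class (label 1) are all assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels; 1 = gamma (assumed positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, precision, recall and F1 after thresholding classifier scores."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against empty denominators (no predicted or no true positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive event scores higher
    than a random negative event (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy sample: 4 gamma events (1) and 4 hadron events (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

metrics = classification_metrics(y_true, scores)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise AUC used here is quadratic in the number of events, which is fine for a toy sample; for realistically sized simulation sets a rank-based computation or a library routine would be the more practical choice.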
id doaj-art-75a0d63c738f4a65bb52c1b5c7c76378
institution Kabale University
issn 0067-0049
spelling doaj-art-75a0d63c738f4a65bb52c1b5c7c76378 (indexed 2025-01-09T06:48:40Z)
volume 276, issue 1, article 24
author_affiliations Jie Li: School of Mathematics and Physics, University of South China, Hengyang 421001, People’s Republic of China; lwb@usc.edu.cn
Hongkui Lv (https://orcid.org/0000-0002-7779-3630): Key Laboratory of Particle Astrophysics, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, People’s Republic of China; lvhk@ihep.ac.cn; TIANFU Cosmic Ray Research Center, Chengdu, Sichuan, People’s Republic of China
Yang Liu: School of Computer Science, University of South China, Hengyang 421001, People’s Republic of China
Jiajun Huang: Key Laboratory of Particle Astrophysics, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, People’s Republic of China; lvhk@ihep.ac.cn; University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
Yu Wang (https://orcid.org/0000-0001-7959-3387): ICRA-Dip. di Fisica, University of Rome, P.le Aldo Moro, 5, Rome 00185, Italy; yu.wang@icranet.org; INAF-Osservatorio Astronomico d’Abruzzo, Via M. Maggini snc, Teramo I-64100, Italy; International Center for Relativistic Astrophysics Network (ICRANet), Pescara I-65122, Italy
Wenbin Lin (https://orcid.org/0000-0002-4282-066X): School of Mathematics and Physics, University of South China, Hengyang 421001, People’s Republic of China; lwb@usc.edu.cn; School of Computer Science, University of South China, Hengyang 421001, People’s Republic of China; International Center for Relativistic Astrophysics Network (ICRANet), Pescara I-65122, Italy; School of Physical Science and Technology, Southwest Jiaotong University, Chengdu 610031, People’s Republic of China
topic High-energy cosmic radiation
Cosmic rays
Classification
Astronomy data analysis
Interdisciplinary astronomy
Astronomy software