RaViT-AE: Unsupervised Anomaly Detection for Intelligent Cultural Heritage Monitoring Using Region-Attentive ViT Autoencoder

Bibliographic Details
Main Authors: Dohyung Kwon, Jeongmin Yu
Format: Article
Language: English
Published: IEEE 2024-01-01
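Series: IEEE Access
Subjects: Unsupervised anomaly detection; Cultural heritage; Intelligent monitoring; Vision transformer; Bangudae petroglyph
Online Access: https://ieeexplore.ieee.org/document/10772234/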
author Dohyung Kwon
Jeongmin Yu
collection DOAJ
description Unsupervised anomaly detection is well known for its ability to effectively identify and discern anomalies in data containing rare anomalies or diverse patterns, leading to broad applications across various research fields. However, this technology has not yet been extensively applied in the field of cultural heritage monitoring. In response, this paper proposes the RaViT-AE model, a new vision transformer-based autoencoder that implements region-attentive patch projection to perform anomaly detection in images using unsupervised learning techniques. Region-attentive patch projection enhances detection by applying higher-dimensional embeddings to regions of petroglyph images that show a higher likelihood of anomalies, effectively extracting features and recognizing complex patterns. Additionally, the introduction of F-SSIM loss facilitates effective model learning by considering both structural similarities and high-level semantic differences between original and reconstructed images. This study is conducted on a dataset of petroglyph images from Bangudae Terrace in Daegok-ri, Ulju, South Korea, collected from a fixed CCTV camera over more than one year. The results reveal that the proposed RaViT-AE model outperforms previous unsupervised anomaly detection models, including GAN and CNN-based autoencoders, achieving an AUC of 0.976, accuracy of 0.944, and F1-score of 0.936. This study demonstrates that the RaViT-AE model can significantly contribute to the continuous monitoring and protection of cultural heritage by robustly reconstructing images and accurately detecting anomalies.
format Article
id doaj-art-efe7d48aa543431a9bd53ee44e7ddb12
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-efe7d48aa543431a9bd53ee44e7ddb12
2024-12-10T00:02:03Z
eng
IEEE
IEEE Access
2169-3536
2024-01-01
vol. 12, pp. 180767-180780
doi:10.1109/ACCESS.2024.3509988
IEEE Xplore document 10772234
RaViT-AE: Unsupervised Anomaly Detection for Intelligent Cultural Heritage Monitoring Using Region-Attentive ViT Autoencoder
Dohyung Kwon (https://orcid.org/0009-0002-7333-9948), Department of Digital Heritage, Korea National University of Heritage, Buyeo, Republic of Korea
Jeongmin Yu (https://orcid.org/0000-0002-9034-5234), Department of Digital Heritage, Korea National University of Heritage, Buyeo, Republic of Korea
https://ieeexplore.ieee.org/document/10772234/
Unsupervised anomaly detection
Cultural heritage
Intelligent monitoring
Vision transformer
Bangudae petroglyph
title RaViT-AE: Unsupervised Anomaly Detection for Intelligent Cultural Heritage Monitoring Using Region-Attentive ViT Autoencoder
topic Unsupervised anomaly detection
Cultural heritage
Intelligent monitoring
Vision transformer
Bangudae petroglyph
url https://ieeexplore.ieee.org/document/10772234/