RaViT-AE: Unsupervised Anomaly Detection for Intelligent Cultural Heritage Monitoring Using Region-Attentive ViT Autoencoder
Unsupervised anomaly detection is well known for its ability to effectively identify and discern anomalies in data containing rare anomalies or diverse patterns, leading to broad applications across various research fields. However, this technology has not yet been extensively applied in the field o...
| Main Authors: | Dohyung Kwon, Jeongmin Yu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE 2024-01-01 |
| Series: | IEEE Access |
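| Subjects: | Unsupervised anomaly detection; Cultural heritage; Intelligent monitoring; Vision transformer; Bangudae petroglyph |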
| Online Access: | https://ieeexplore.ieee.org/document/10772234/ |
| _version_ | 1846129649560584192 |
|---|---|
| author | Dohyung Kwon; Jeongmin Yu |
| author_facet | Dohyung Kwon; Jeongmin Yu |
| author_sort | Dohyung Kwon |
| collection | DOAJ |
| description | Unsupervised anomaly detection is well known for its ability to effectively identify and discern anomalies in data containing rare anomalies or diverse patterns, leading to broad applications across various research fields. However, this technology has not yet been extensively applied in the field of cultural heritage monitoring. In response, this paper proposes the RaViT-AE model, a new vision transformer-based autoencoder that implements region-attentive patch projection to perform anomaly detection in images using unsupervised learning techniques. Region-attentive patch projection enhances detection by applying higher-dimensional embeddings to regions of petroglyph images that show a higher likelihood of anomalies, effectively extracting features and recognizing complex patterns. Additionally, the introduction of F-SSIM loss facilitates effective model learning by considering both structural similarities and high-level semantic differences between original and reconstructed images. This study is conducted on a dataset of petroglyph images from Bangudae Terrace in Daegok-ri, Ulju, South Korea, collected from a fixed CCTV camera over more than one year. The results reveal that the proposed RaViT-AE model outperforms previous unsupervised anomaly detection models, including GAN and CNN-based autoencoders, achieving an AUC of 0.976, accuracy of 0.944, and F1-score of 0.936. This study demonstrates that the RaViT-AE model can significantly contribute to the continuous monitoring and protection of cultural heritage by robustly reconstructing images and accurately detecting anomalies. |
| format | Article |
| id | doaj-art-efe7d48aa543431a9bd53ee44e7ddb12 |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-efe7d48aa543431a9bd53ee44e7ddb12; 2024-12-10T00:02:03Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 180767; 180780; 10.1109/ACCESS.2024.3509988; 10772234; RaViT-AE: Unsupervised Anomaly Detection for Intelligent Cultural Heritage Monitoring Using Region-Attentive ViT Autoencoder; Dohyung Kwon [0] https://orcid.org/0009-0002-7333-9948; Jeongmin Yu [1] https://orcid.org/0000-0002-9034-5234; [0] Department of Digital Heritage, Korea National University of Heritage, Buyeo, Republic of Korea; [1] Department of Digital Heritage, Korea National University of Heritage, Buyeo, Republic of Korea; https://ieeexplore.ieee.org/document/10772234/; Unsupervised anomaly detection; Cultural heritage; Intelligent monitoring; Vision transformer; Bangudae petroglyph |
| title | RaViT-AE: Unsupervised Anomaly Detection for Intelligent Cultural Heritage Monitoring Using Region-Attentive ViT Autoencoder |
| title_sort | ravit ae unsupervised anomaly detection for intelligent cultural heritage monitoring using region attentive vit autoencoder |
| topic | Unsupervised anomaly detection; Cultural heritage; Intelligent monitoring; Vision transformer; Bangudae petroglyph |
| url | https://ieeexplore.ieee.org/document/10772234/ |
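
The description above reports an AUC, an accuracy, and an F1-score for the detector. As a rough illustration of how such figures are usually derived from per-image anomaly scores (for example, reconstruction errors of an autoencoder), here is a minimal sketch in plain Python. The function names, the toy scores, and the threshold of 0.5 are all hypothetical and chosen for illustration only; this record does not say how the paper scores images or picks its threshold.

```python
from typing import Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (anomalous, normal) pairs in which the
    anomalous sample has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_and_f1(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Flag a sample as anomalous when its score reaches the threshold,
    then compute accuracy and F1 against the ground-truth labels."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    accuracy = (tp + tn) / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical anomaly scores (higher = more anomalous); label 1 = anomaly.
toy_scores = [0.10, 0.20, 0.15, 0.80, 0.90, 0.18]
toy_labels = [0, 0, 0, 1, 1, 1]

auc = roc_auc(toy_scores, toy_labels)
acc, f1 = accuracy_and_f1(toy_scores, toy_labels, threshold=0.5)
print(f"AUC={auc:.3f} accuracy={acc:.3f} F1={f1:.3f}")
```

AUC is threshold-free, while accuracy and F1 depend on the chosen cut-off, which is why the three numbers are normally reported together.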