Adversarial Defense on Harmony: Reverse Attack for Robust AI Models Against Adversarial Attacks
Main Authors:
Format: Article
Language: English
Published: IEEE, 2024-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10766602/
Summary: Deep neural networks (DNNs) are crucial in safety-critical applications but vulnerable to adversarial attacks, where subtle perturbations cause misclassification. Existing defense mechanisms struggle with small perturbations and face accuracy-robustness trade-offs. This study introduces the “Reverse Attack” method to address these challenges. Our approach uniquely reconstructs and classifies images by applying perturbations opposite to the attack direction, using a complementary “Revenant” classifier to maintain original image accuracy. The proposed method significantly outperforms existing strategies, maintaining clean image accuracy with only a 2.92% decrease while achieving over 70% robust accuracy against all benchmarked adversarial attacks. This contrasts with current mechanisms, which typically suffer an 18% reduction in clean image accuracy and achieve only 36% robustness against adversarial examples. We evaluate our method on the CIFAR-10 dataset using ResNet50, testing against various attacks including PGD and the components of AutoAttack. Although our approach incurs additional computational cost during reconstruction, it represents a significant advance in robust defense against adversarial attacks while preserving clean input performance. This balanced approach paves the way for more reliable DNNs in critical applications. Future work will focus on optimization and on exploring applicability to larger datasets and more complex architectures.
ISSN: 2169-3536
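The abstract's core idea, reconstructing an input by stepping it opposite to the estimated attack direction before classification, can be sketched roughly as follows. This is a minimal illustration only, assuming a PyTorch setup: the function names (`reverse_perturb`, `defended_predict`), the PGD-style sign-gradient steps with pseudo-labels, and the step/budget hyperparameters are assumptions, since this record does not describe the paper's actual Reverse Attack procedure or the companion “Revenant” classifier.

```python
# Minimal sketch of the reverse-perturbation idea described in the abstract.
# Assumptions (not taken from the paper): PyTorch, PGD-style sign-gradient
# steps, model predictions used as pseudo-labels, and the hyperparameter values.
import torch
import torch.nn.functional as F


def reverse_perturb(model, x, steps=10, alpha=2 / 255, eps=8 / 255):
    """Step the input against the loss-increasing (attack) direction."""
    x = x.detach()
    x_rec = x.clone()
    for _ in range(steps):
        x_rec.requires_grad_(True)
        logits = model(x_rec)
        # True labels are unavailable at inference time, so use the model's
        # current predictions as pseudo-labels.
        pseudo = logits.argmax(dim=1)
        loss = F.cross_entropy(logits, pseudo)
        grad = torch.autograd.grad(loss, x_rec)[0]
        # A PGD attacker steps along +sign(grad); step the opposite way.
        x_rec = x_rec.detach() - alpha * grad.sign()
        # Keep the reconstruction within an eps-ball of the original input
        # and inside the valid pixel range.
        x_rec = torch.clamp(x + torch.clamp(x_rec - x, -eps, eps), 0.0, 1.0)
    return x_rec.detach()


def defended_predict(model, x):
    """Classify the reverse-perturbed reconstruction of x."""
    x_rec = reverse_perturb(model, x)
    with torch.no_grad():
        return model(x_rec).argmax(dim=1)
```

According to the abstract, a separate “Revenant” classifier handles clean inputs to preserve original accuracy; how inputs are routed between the two classifiers is not specified in this record, so that step is omitted from the sketch.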