ENGDM: Enhanced Non-Isotropic Gaussian Diffusion Model for Progressive Image Editing
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Sensors |
| Subjects: | |
| Online Access: | https://www.mdpi.com/1424-8220/25/10/2970 |
| Summary: | Diffusion models have made remarkable progress in image generation, leading to advances in image editing. However, balancing editability with faithfulness remains a significant challenge. Motivated by the fact that more novel content is generated when larger-variance noise is applied to an image, in this paper we propose an Enhanced Non-isotropic Gaussian Diffusion Model (ENGDM) for progressive image editing, which introduces independent Gaussian noise with a different variance for each pixel according to its editing needs. To enable efficient inference without retraining, ENGDM is rectified into an isotropic Gaussian diffusion model (IGDM) by assigning a different total diffusion time to each pixel. Furthermore, we introduce reinforced text embeddings, optimized with a novel editing reinforcement loss in the latent space for enhanced editability, and optimized noise variances, obtained with a structural consistency loss that dynamically adjusts each pixel's denoising time steps for better faithfulness. Experimental results on multiple datasets demonstrate that ENGDM achieves state-of-the-art performance on image-editing tasks, effectively balancing faithfulness to the source image with alignment to the desired editing target. |
| ISSN: | 1424-8220 |
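
The summary's central idea, realizing per-pixel (non-isotropic) noise strengths with an ordinary isotropic diffusion schedule by giving each pixel its own total diffusion time, can be illustrated with a short sketch. The linear beta schedule, the `per_pixel_timesteps` and `noise_image` helpers, and the direct mapping from editing strength to target noise variance below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Hypothetical sketch of the rectification idea from the summary: a standard
# (isotropic) DDPM forward process is reused, but each pixel is assigned its
# own total diffusion time, so pixels that need heavier edits receive noise
# of larger variance. The schedule and mappings here are assumptions.

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # assumed linear DDPM beta schedule
alpha_bars = np.cumprod(1.0 - betas)      # cumulative \bar{alpha}_t

def per_pixel_timesteps(edit_strength):
    """Map an editing-strength map in [0, 1] (1 = edit heavily) to per-pixel
    timesteps: larger strength -> larger target noise variance -> larger t."""
    target_var = np.clip(edit_strength, 0.0, 1.0)   # assumed variance target
    sched_var = 1.0 - alpha_bars                    # noise variance at each t
    diffs = np.abs(sched_var[None, None, :] - target_var[..., None])
    return diffs.argmin(axis=-1)                    # (H, W) integer timesteps

def noise_image(x0, t_map, rng=np.random.default_rng(0)):
    """Forward-diffuse x0 with a per-pixel timestep map, i.e. non-isotropic
    noise in effect, built entirely from the isotropic schedule."""
    ab = alpha_bars[t_map]                          # per-pixel \bar{alpha}_{t_i}
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps

# Toy usage: noise the right half of a 64x64 image far more than the left.
x0 = np.zeros((64, 64))
strength = np.zeros((64, 64))
strength[:, 32:] = 0.9
xt = noise_image(x0, per_pixel_timesteps(strength))
```

Under these assumptions, regions marked for heavy editing end up close to pure noise (so the denoiser can synthesize novel content there), while lightly marked regions retain most of the source image, which mirrors the editability-versus-faithfulness trade-off the abstract describes.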