Afaq et al. (2026) ViTs-based Dual Metric Deep Learning Technique for change detection from high-resolution satellite images
Identification
- Journal: Remote Sensing Applications: Society and Environment
- Year: 2026
- Date: 2026-01-01
- Authors: Yasir Afaq, Nouhaila El Koufi
- DOI: 10.1016/j.rsase.2026.101956
Research Groups
- Department of Computer Science and Engineering, SRM University-AP, Amaravati, Andhra Pradesh, India
- Laboratory of Mathematical Modeling and Economic Calculations, FEG, Morocco
Short Summary
This paper proposes ViT-DMDLT, a deep learning framework that combines Vision Transformers, Convolutional Neural Networks, and a Dual Metric Network to detect small-scale land-use and land-cover changes in super-resolution satellite imagery. It reports overall accuracies of 96.78% on SYSU, 90.77% on Cropland, and 98.12% on LEVIR-CD.
Objective
- To develop a robust deep learning framework capable of effectively detecting small-scale and complex land-use and land-cover changes from super-resolution satellite images, overcoming limitations in capturing both spatial and temporal variations.
Study Configuration
- Spatial Scale: High-resolution and super-resolution satellite images, focusing on small-scale and complex land-use and land-cover changes.
- Temporal Scale: Change detection between multi-temporal images of the same area.
Methodology and Data
- Models used: Vision Transformer-based Dual Metric Deep Learning Technique (ViT-DMDLT), which integrates Vision Transformers (ViTs), Convolutional Neural Networks (CNNs), and a Dual Metric Network (DMN).
- Data sources: High-resolution and low-resolution satellite images (integrated to obtain super-resolution images). Validated on publicly available datasets: SYSU, Cropland, and LEVIR-CD.
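The summary does not state the patch size, the number of bands, or how the CNN and ViT branches are connected, so the following is only a minimal sketch of the standard ViT tokenization step applied to a bi-temporal pair. The names `patchify` and `bitemporal_tokens`, the single-band input, and the patch size are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Tuple

Image = List[List[float]]  # single-band image stored as rows of pixel values (assumption)


def patchify(image: Image, patch: int) -> List[List[float]]:
    """Split an image into non-overlapping patch x patch tiles, each flattened to one token."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten the tile row by row into a single token vector.
            token = [image[r][c]
                     for r in range(top, top + patch)
                     for c in range(left, left + patch)]
            tokens.append(token)
    return tokens


def bitemporal_tokens(before: Image, after: Image, patch: int) -> Tuple[list, list]:
    """Tokenize both dates identically so token i covers the same ground area in each image."""
    return patchify(before, patch), patchify(after, patch)
```

In a real ViT each flattened token would then be linearly projected and given a position embedding before entering the encoder; those steps are omitted here.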
Main Results
- The ViT-DMDLT framework reported the following overall accuracies on the three public datasets:
- Overall accuracy of 96.78% on the SYSU dataset.
- Overall accuracy of 90.77% on the Cropland dataset.
- Overall accuracy of 98.12% on the LEVIR-CD dataset.
- Compared with other state-of-the-art models, the framework was more robust, detecting land-cover changes accurately even for complex and small variations in super-resolution images.
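For readers unfamiliar with the metric, overall accuracy is the share of pixels whose changed/unchanged label matches the reference mask. The sketch below shows the usual definition on a tiny made-up example; the function names and the toy masks are hypothetical and are not data from the paper.

```python
from typing import Sequence, Tuple


def confusion_counts(predicted: Sequence[int], reference: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count TP, TN, FP, FN for flat binary masks (1 = changed, 0 = unchanged)."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, r in zip(predicted, reference) if p == 1 and r == 1)
    tn = sum(1 for p, r in zip(predicted, reference) if p == 0 and r == 0)
    fp = sum(1 for p, r in zip(predicted, reference) if p == 1 and r == 0)
    fn = sum(1 for p, r in zip(predicted, reference) if p == 0 and r == 1)
    return tp, tn, fp, fn


def overall_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all pixels that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("empty confusion matrix")
    return (tp + tn) / total


# Toy masks (hypothetical): 8 pixels, 6 of them labelled correctly.
predicted = [1, 0, 0, 1, 0, 0, 0, 1]
reference = [1, 0, 0, 0, 0, 0, 1, 1]
print(overall_accuracy(*confusion_counts(predicted, reference)))  # 0.75
```

Because unchanged pixels usually dominate change detection datasets, overall accuracy can stay high even when some changed pixels are missed, which is why it is commonly read alongside class-sensitive metrics such as F1 or IoU.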
Contributions
- Introduction of ViT-DMDLT, a novel deep learning framework for change detection in super-resolution satellite images, combining the strengths of Vision Transformers, Convolutional Neural Networks, and a Dual Metric Network.
- Addresses the significant challenge of monitoring small-scale and complex land-use and land-cover changes, which is crucial for various remote sensing applications.
- Integrates low-resolution and high-resolution satellite images to generate super-resolution imagery, enhancing the input quality for change detection.
- Utilizes ViT encoders to capture global spatial dependencies and a dual metric to learn robust features by reducing intra-class variation and enhancing inter-class separability among temporal images.
- Achieves superior and robust performance compared to existing state-of-the-art models across multiple public datasets, demonstrating its effectiveness in complex scenarios.
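The summary does not give the formula behind the dual metric. As a stand-in, the sketch below uses a standard contrastive loss, a common choice in metric-based change detection, to illustrate the idea of pulling features of unchanged locations together and pushing features of changed locations apart. The function names, the margin value, and the feature vectors are assumptions for illustration, not the authors' loss.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(feat_before: Sequence[float],
                     feat_after: Sequence[float],
                     changed: bool,
                     margin: float = 2.0) -> float:
    """Standard contrastive loss for one location (stand-in for the paper's dual metric).

    Unchanged pairs are penalized by their squared distance (lower intra-class variation).
    Changed pairs are penalized only while closer than the margin (inter-class separability).
    """
    distance = euclidean(feat_before, feat_after)
    if changed:
        return max(0.0, margin - distance) ** 2
    return distance ** 2


# Hypothetical features for the same location at two dates.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], changed=False))  # 25.0
print(contrastive_loss([0.0, 0.0], [1.0, 0.0], changed=True))   # 1.0
```

At inference time, a model trained this way can label a location as changed when the feature distance exceeds a chosen threshold.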
Funding
- Not specified in the provided text.
Citation
@article{Afaq2026ViTsbased,
author = {Afaq, Yasir and El Koufi, Nouhaila},
title = {ViTs-based Dual Metric Deep Learning Technique for change detection from high-resolution satellite images},
journal = {Remote Sensing Applications: Society and Environment},
year = {2026},
doi = {10.1016/j.rsase.2026.101956},
url = {https://doi.org/10.1016/j.rsase.2026.101956}
}
Original Source: https://doi.org/10.1016/j.rsase.2026.101956