Transformer-Based Change Detection in Remote Sensing Imagery
This project aims to detect, localize, and interpret changes in the Earth’s surface with a transformer-based system. It addresses a core challenge in remote sensing: automatically identifying regions that changed between two images taken at different times (bitemporal imagery), such as areas affected by urban expansion, deforestation, or natural disasters. Traditionally, change detection has relied on convolutional methods, which struggle to capture long-range context and subtle temporal differences. Transformer-based architectures address this by comparing images not just pixel by pixel but by modeling the spatial and temporal relationships between regions across time. The approach promises faster, more accurate change mapping, which is key for environmental monitoring, urban planning, and disaster response.

The pipeline is built in PyTorch and implements transformer models for pixel-level change detection. We adapt and apply models such as the Bitemporal Image Transformer (BIT), which first compresses the features of the two images into a small set of semantic tokens, then uses a transformer encoder to model spatial-temporal context among those tokens and a transformer decoder to project that context back onto the pixel features for refined predictions. Because attention operates on a handful of tokens rather than on every pixel, this approach reduces compute relative to convolution-heavy systems while matching or surpassing their accuracy: BIT-based models often outperform standard convolutional baselines with roughly a third of the parameters and compute cost. The code supports widely used remote sensing change detection datasets (LEVIR-CD, WHU-CD, DSIFN-CD) and provides scripts for training, evaluation, and demos. Users can reproduce the entire workflow, from preprocessing paired images to generating visual change maps.
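
To make the tokenize, encode, decode flow described above concrete, here is a minimal sketch of a BIT-style model in PyTorch. It is an illustration under stated assumptions, not the project's actual code: the class name TinyBIT, the two-layer convolutional stem, the layer sizes, and the token count are all hypothetical choices made to keep the example short.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyBIT(nn.Module):
    """Illustrative BIT-style change detector (hypothetical, not the reference model)."""

    def __init__(self, in_ch=3, dim=32, num_tokens=4, heads=4):
        super().__init__()
        # Shared (siamese) CNN stem; downsamples each image by 4x. Assumed design.
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Tokenizer: one spatial attention map per semantic token.
        self.token_attn = nn.Conv2d(dim, num_tokens, 1)
        # Encoder models context jointly over the tokens of both dates.
        enc_layer = nn.TransformerEncoderLayer(dim, heads, dim * 2, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=1)
        # Decoder lets each pixel feature query the context-rich tokens.
        dec_layer = nn.TransformerDecoderLayer(dim, heads, dim * 2, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=1)
        # Prediction head: two classes (no change / change).
        self.head = nn.Conv2d(dim, 2, 3, padding=1)

    def tokenize(self, feat):
        # Softmax over spatial positions, then attention-weighted pooling.
        attn = self.token_attn(feat).flatten(2).softmax(dim=-1)   # (B, L, HW)
        return attn @ feat.flatten(2).transpose(1, 2)             # (B, L, C)

    def refine(self, feat, tokens):
        b, c, h, w = feat.shape
        pixels = feat.flatten(2).transpose(1, 2)                  # (B, HW, C)
        out = self.decoder(pixels, tokens)                        # (B, HW, C)
        return out.transpose(1, 2).reshape(b, c, h, w)

    def forward(self, img_a, img_b):
        fa, fb = self.backbone(img_a), self.backbone(img_b)
        # Concatenate both token sets so the encoder sees both dates at once.
        tokens = torch.cat([self.tokenize(fa), self.tokenize(fb)], dim=1)
        ta, tb = self.encoder(tokens).chunk(2, dim=1)
        # Refine each date's features with its tokens, then compare.
        diff = torch.abs(self.refine(fa, ta) - self.refine(fb, tb))
        logits = self.head(diff)
        # Upsample back to the input resolution.
        return F.interpolate(logits, size=img_a.shape[-2:],
                             mode="bilinear", align_corners=False)


# Usage with random tensors standing in for a co-registered image pair.
model = TinyBIT().eval()
before = torch.rand(1, 3, 64, 64)
after = torch.rand(1, 3, 64, 64)
with torch.no_grad():
    change_map = model(before, after).argmax(dim=1)  # (1, 64, 64) class indices
```

With untrained weights the resulting map is meaningless; the sketch only shows how the data flows. Note that attention in the encoder runs over 2 x 4 tokens rather than over every pixel, which is where the compute savings described above come from.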

The modular structure of the pipeline lets developers experiment with new transformer variants, attention mechanisms, or dataset formats. Because the system takes any pair of images of the same area at two dates, it can be customized for specific domains: detecting building damage after storms, monitoring vegetation loss in wildfire zones, or tracking glacier retreat. The use of publicly available datasets and the permissive MIT license means the project can be extended and adapted freely. For practitioners, the ability to produce and act on accurate change maps without massive compute or complex pipelines makes the system practical to deploy. This work shows how current AI research can move beyond lab benchmarks into settings where timely, reliable change detection makes a real difference.