Fine-tuning the Geospatial Foundation Model for Land Cover Mapping
Abstract
Land use and land cover (LULC) play pivotal roles in achieving several sustainable development goals established by United Nations member states for the social, economic, and environmental advancement of our planet and its inhabitants. Understanding LULC and its dynamics is crucial for gaining insight into the changing composition and spatial distribution of land-surface features across diverse landscapes. Although researchers have explored various AI/ML approaches using remote sensing imagery spanning several decades, existing LULC mapping techniques face challenges related to accuracy, the need for substantial labeled training data, and adaptability to different geographical regions, among others. Foundation models have recently gained significant traction because of their ability to alleviate labeled-data scarcity. In this study, we therefore propose a fine-tuning strategy built on Prithvi, a state-of-the-art geospatial foundation model jointly developed by IBM and NASA, to address these challenges by reducing labeled-data requirements while improving accuracy. This paper presents the results achieved using Prithvi for LULC mapping with a relatively small training dataset (565 total images of $224 \times 224$ pixels). We compare the performance of the Prithvi model against a traditional deep-learning-based U-Net model and a large Vision Transformer (ViT) model. The results demonstrate that Prithvi surpasses both U-Net and ViT in mean Intersection over Union (IoU) across several LULC classes.
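For clarity, mean IoU here refers to the standard semantic-segmentation metric; a minimal statement of it (not specific to this paper's evaluation code) is, for $C$ classes with per-class true positives $TP_c$, false positives $FP_c$, and false negatives $FN_c$:

$$\mathrm{mIoU} = \frac{1}{C} \sum_{c=1}^{C} \frac{TP_c}{TP_c + FP_c + FN_c}$$

Each summand is the per-class IoU, i.e., the overlap between predicted and ground-truth pixels for class $c$ divided by their union, averaged uniformly over the $C$ LULC classes.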