Publication
IGARSS 2024
Conference paper
Geospatial Sampling by Maximizing Information Entropy
Abstract
To improve unsupervised training of geospatial models, we introduce a sampling method that emphasizes diverse and clean datasets. We extract finer-resolution metrics such as land use, temperature, and precipitation, then cluster samples with similar statistics to characterize the data distribution. Weighted sampling based on cluster size keeps the selected data points representative, while a down-weighting strategy favors less frequent data to increase diversity. The result is a balanced dataset representation that significantly improves the accuracy of the geospatial foundation model. Our study underscores the potential of optimized geospatial data sampling to enhance model accuracy and broaden practical applications.
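The abstract does not state the clustering algorithm or the exact weighting formula, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes cluster labels have already been computed, weights each sample by the inverse of its cluster size raised to a hypothetical exponent `alpha`, draws samples with replacement, and compares the Shannon entropy of the cluster distribution before and after sampling. The cluster names and counts are illustrative only.

```python
import math
import random
from collections import Counter


def cluster_entropy(labels):
    """Shannon entropy (in bits) of the empirical cluster distribution."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def inverse_size_weights(labels, alpha=1.0):
    """Per-sample weights proportional to 1 / (cluster size ** alpha).

    alpha is a hypothetical knob, not taken from the paper:
    alpha = 0 is plain uniform sampling over samples, and
    alpha = 1 gives every cluster the same total probability mass.
    """
    counts = Counter(labels)
    return [1.0 / (counts[label] ** alpha) for label in labels]


def sample_indices(labels, k, alpha=1.0, seed=0):
    """Draw k sample indices (with replacement) using the weights above."""
    rng = random.Random(seed)
    weights = inverse_size_weights(labels, alpha)
    return rng.choices(range(len(labels)), weights=weights, k=k)


# Illustrative, made-up cluster assignment with a strong imbalance.
labels = ["forest"] * 900 + ["urban"] * 90 + ["wetland"] * 10

picked = sample_indices(labels, k=300, alpha=1.0)
picked_labels = [labels[i] for i in picked]

print("entropy before sampling:", cluster_entropy(labels))
print("entropy after sampling: ", cluster_entropy(picked_labels))
print("cluster counts after sampling:", Counter(picked_labels))
```

With `alpha = 1.0`, rare clusters are drawn about as often as common ones, which pushes the entropy of the sampled cluster distribution toward its maximum; values between 0 and 1 would trade diversity against fidelity to the original distribution.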