Publication
AGU Fall 2022
Poster
Towards Accelerated Discovery Services for Geospatio-temporal Foundation Models
Abstract
Recent large successes with foundation models (FMs) in natural language processing are attracting earth-monitoring scientists to apply the idea to scientific studies based on remote sensing data, such as climate impact modeling. FMs are large artificial intelligence models trained on vast quantities of data at scale, usually by self-supervised learning, so that they can be adapted to a variety of downstream tasks. In addition to standardized benchmarks for measuring performance on downstream tasks, support for easy-to-set-up, scalable training of FMs is key to stimulating their development. We propose an orchestration service that provides reproducible benchmark experimentation with custom FM models and scalable model training. The framework allows any Python code written in a distributed-data-parallel manner to run at arbitrary scale, from a single notebook for debugging to a cluster of massive GPU servers on any cloud or on premises for a full run, with a large-scale dataset service coordinated to work with the framework. We have made several downstream tasks, including precipitation-observation interpolation, flood mapping, and super-resolution, readily available for anyone to attempt, while providing reference state-of-the-art machine-learning solution implementations for these tasks, adapted to run in a distributed-data-parallel manner. We compare their performance across different pre-trained artificial neural networks and different transfer-learning methods. These downstream tasks, scalable state-of-the-art solution implementations, and FM training implementations will be open-sourced so that any scientist or engineer can reproduce and customize the experiments we conducted.
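To make the distributed-data-parallel style mentioned in the abstract concrete, the sketch below shows a minimal training loop in which the same script runs unchanged on a single debugging process or across a multi-GPU cluster. This is an illustrative sketch only, assuming PyTorch's `DistributedDataParallel` and the `torchrun` launcher; the model, dataset, and hyperparameters are placeholders, not part of the authors' framework.

```python
# Minimal sketch of a distributed-data-parallel training loop (PyTorch DDP).
# SimpleNet-style model, random data, and hyperparameters are placeholders.
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE, so this same script
    # scales from one notebook process to many GPU servers unchanged.
    dist.init_process_group(backend="nccl" if torch.cuda.is_available() else "gloo")
    rank = dist.get_rank()
    device = torch.device(f"cuda:{os.environ.get('LOCAL_RANK', 0)}"
                          if torch.cuda.is_available() else "cpu")

    # Placeholder tensors standing in for remote sensing tiles.
    data = TensorDataset(torch.randn(1024, 16), torch.randn(1024, 1))
    sampler = DistributedSampler(data)  # shards the dataset across ranks
    loader = DataLoader(data, batch_size=32, sampler=sampler)

    model = DDP(nn.Linear(16, 1).to(device))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x.to(device)), y.to(device))
            loss.backward()  # gradients are all-reduced across ranks here
            opt.step()
        if rank == 0:
            print(f"epoch {epoch}: loss {loss.item():.4f}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with, for example, `torchrun --nproc_per_node=4 train.py`, the identical code runs on one process for debugging or on many for a full run, which is the property the proposed orchestration service builds on.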