IBM and ESA open-source AI models trained on a new dataset for analyzing extreme floods and wildfires
IBM and ESA fine-tuned their multi-modal TerraMind models on a first-of-its-kind dataset designed to improve how we prepare for and respond to natural disasters.
Record-setting wildfires across Bolivia last year scorched an area the size of Greece, displacing thousands of people and leading to widespread loss of crops and livestock. The cause of the fires was attributed to land clearing, pasture burning, and a severe drought during what was Earth’s warmest year on record.
The Bolivia wildfires are just one of hundreds of extreme flood and wildfire events captured in a new global, multi-modal dataset called ImpactMesh, open-sourced this week by IBM Research in Europe and the European Space Agency (ESA). The dataset is also multi-temporal, meaning it features before-and-after snapshots of flooded or fire-scorched areas. The imagery was captured by the Copernicus Sentinel-1 and Sentinel-2 Earth-orbiting satellites over the last decade.
To provide a clearer picture of landscape-level changes, each of the extreme events in the dataset is represented by three types of observations: optical images, radar images, and an elevation map of the impacted area. When storm clouds or wildfire smoke block optical sensors from seeing the extent of floods and wildfires from space, radar images and the elevation of the terrain can help reveal the severity of what just happened.
Today’s geospatial foundation models are pre-trained on raw satellite imagery for a given location and time stamp as an abstraction of the physical world. They are typically trained either on multi-temporal before-and-after images, as IBM and NASA’s Prithvi models were, or on data of different modalities, like IBM and ESA’s TerraMind model released earlier this year as part of ESA’s Future EO program. The ImpactMesh dataset was designed to fuse these approaches, bringing the impact of flooding and fire events into sharper focus.
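One simple way to combine multi-temporal and multi-modal observations for a segmentation model is to stack them along the channel axis. The sketch below is purely illustrative and not ImpactMesh's actual schema; the tile size and band counts are assumptions chosen for the example.

```python
import numpy as np

# Illustrative only: one event tile with two time steps of two modalities,
# plus a static elevation map. Shapes are (channels, height, width).
H, W = 64, 64                            # tile size (hypothetical)
optical_pre  = np.random.rand(4, H, W)   # e.g., optical bands before the event
optical_post = np.random.rand(4, H, W)   # same bands after the event
radar_pre    = np.random.rand(2, H, W)   # e.g., radar backscatter, before
radar_post   = np.random.rand(2, H, W)   # radar backscatter, after
elevation    = np.random.rand(1, H, W)   # terrain elevation (single channel)

# Channel stacking: the model sees before/after change in both modalities
# together with terrain context in a single input tensor.
fused = np.concatenate(
    [optical_pre, optical_post, radar_pre, radar_post, elevation], axis=0
)
print(fused.shape)  # (13, 64, 64): 4 + 4 + 2 + 2 + 1 channels
```

Channel stacking is only one fusion strategy; models like TerraMind can also process each modality through separate encoders before combining them.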
To demonstrate its potential, IBM and ESA researchers used the dataset to customize their pre-trained TerraMind model for wildfire analysis. In early experiments, they found that the before-and-after optical and radar images of each event helped the tuned model produce burn scar maps at least 5% more accurate than maps produced by models trained on single optical images.
Floods and wildfires together account for nearly half of the natural disasters recorded in the last decade, and evidence suggests these events are becoming more severe as Earth’s climate gets hotter. AI models trained on ImpactMesh could be used for a range of applications, from planning the immediate response after a disaster to assessing the damage and figuring out where (and where not) to rebuild. The dataset’s unique pre- and post-disaster coverage could also be useful in drawing up more accurate risk maps.
“Our goal is to empower researchers and responders to harness Earth observation data for faster, more accurate disaster mapping,” said Giuseppe Borghi, head of ESA’s Φ-lab division. “This is a step toward building resilience in the face of a changing planet.”
The ImpactMesh dataset and customized TerraMind models are part of an ongoing collaboration between IBM and ESA. In April, researchers released their multi-modal TerraMind model, which at the time outperformed a dozen other geospatial models on common mapping tasks on the community benchmark, PANGAEA.
It is part of a broader effort at IBM Research to develop open-source, industry-leading AI models, tools, and benchmarks to study our planet.
“ImpactMesh could set a new standard for applying geospatial AI to natural disasters,” said Juan Bernabe-Moreno, director of IBM Research Europe, Ireland, and UK. “Through advanced model architectures, rich Earth observation data, and open collaboration, we can improve our preparedness and response to extreme events.”
In addition to ImpactMesh, IBM and ESA are releasing TerraKit, an open-source package that makes it easier to build geospatial datasets and tune AI models on the most up-to-date information. TerraKit can be used to expand on ImpactMesh’s collection of curated flood and wildfire data, or to create a new dataset from scratch.
The newly tuned TerraMind models, along with ImpactMesh, are available on Hugging Face under a permissive Apache 2.0 license. “We hope that researchers can expand on this work and improve how we track and respond to natural disasters,” said Benedikt Blumenstiel, a software engineer at IBM who helped build the dataset and tuned models.