
IBM and ESA open-source AI models trained on a new dataset for analyzing extreme floods and wildfires

IBM and ESA fine-tuned their multi-modal TerraMind models on a first-of-its-kind dataset designed to improve how we prepare for and respond to natural disasters.

In this AI analysis of Sentinel-1 imagery from 2018, flooding in a Paris suburb is shown in magenta. IBM and ESA fine-tuned their TerraMind AI model on their new ImpactMesh dataset to distinguish flooding from pre-existing water.

Record-setting wildfires across Bolivia last year scorched an area the size of Greece, displacing thousands of people and leading to widespread loss of crops and livestock. The fires were attributed to land clearing, pasture burning, and a severe drought during what was Earth’s warmest year on record.

The Bolivia wildfires are just one of hundreds of extreme flood and wildfire events captured in a new global, multi-modal dataset called ImpactMesh, open-sourced this week by IBM Research in Europe and the European Space Agency (ESA). The dataset is also multi-temporal, meaning it features before-and-after snapshots of flooded or fire-scorched areas. The imagery was captured by the Copernicus Sentinel-1 and Sentinel-2 Earth-observing satellites over the last decade.

To provide a clearer picture of landscape-level changes, each extreme event in the dataset is represented by three types of observations — optical images, radar images, and an elevation map of the impacted area. When storm clouds and wildfire smoke block optical sensors from seeing the extent of floods and wildfires from space, radar images and the elevation of the terrain can help reveal the severity of what just happened.
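To make that structure concrete, here is a minimal sketch — with made-up array shapes and random values, not the actual ImpactMesh file layout — of how the three observation types for one event could be stacked into a single multi-temporal, multi-modal input for a model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative placeholders for one event (shapes are assumptions):
# 4 optical bands (Sentinel-2-style), 2 radar polarizations
# (Sentinel-1-style), and a single-band elevation map.
def observation():
    optical = rng.random((4, 256, 256))
    radar = rng.random((2, 256, 256))
    elevation = rng.random((1, 256, 256))
    # Concatenate modalities along the channel axis: 4 + 2 + 1 = 7 channels.
    return np.concatenate([optical, radar, elevation], axis=0)

# Pair a pre-disaster and post-disaster observation to make the
# sample multi-temporal.
before = observation()
after = observation()
sample = np.stack([before, after], axis=0)

print(sample.shape)  # (2, 7, 256, 256): (time, channels, height, width)
```

Models like the fine-tuned TerraMind described here can then learn change signals by comparing the two time steps across all channels at once.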

An analysis of satellite imagery using IBM and ESA's fine-tuned TerraMind model shows the extent of burn scars left by a 2023 wildfire in Corfu, Greece.

Today’s geospatial foundation models are pre-trained on raw satellite imagery for a given location and time stamp, forming an abstraction of the physical world. They’re typically trained either on multi-temporal before-and-after images, as IBM and NASA’s Prithvi models were, or on data of different modalities, like IBM and ESA’s TerraMind model, released earlier this year as part of ESA’s Future EO program. The ImpactMesh dataset was designed to fuse these approaches to bring the impact of flood and fire events into sharper focus.

To demonstrate its potential, IBM and ESA researchers used the dataset to customize their pre-trained TerraMind model for wildfire analysis. In early experiments, they found that the before-and-after optical and radar images of each event helped the tuned model produce burn scar maps at least 5% more accurate than maps produced by models trained on single optical images.

IBM and ESA's ImpactMesh dataset is the first global multi-modal, multi-temporal collection of images covering extreme floods and wildfires over the last decade.

Floods and wildfires together account for nearly half of the natural disasters recorded in the last decade, and evidence suggests these events are becoming more severe as Earth’s climate gets hotter. AI models trained on ImpactMesh could be used for a range of applications, from planning the immediate response after a disaster to assessing the damage and figuring out where (and where not) to rebuild. The dataset’s unique pre- and post-disaster coverage could also be useful in drawing up more accurate risk maps.

“Our goal is to empower researchers and responders to harness Earth observation data for faster, more accurate disaster mapping,” said Giuseppe Borghi, head of ESA’s Φ-lab division. “This is a step toward building resilience in the face of a changing planet.”

Despite heavy cloud cover, IBM and ESA's fine-tuned TerraMind model was able to identify flooded areas in Queensland, Australia, in 2022 by drawing on radar imagery (left) included in the ImpactMesh dataset.

The ImpactMesh dataset and the customized TerraMind models are part of an ongoing collaboration between IBM and ESA. In April, researchers released their multi-modal TerraMind model, which at the time outperformed a dozen other geospatial models on common mapping tasks in the community benchmark PANGAEA.

The release is part of an ongoing effort at IBM Research to develop open-source, industry-leading AI models, tools, and benchmarks for studying our planet.

“ImpactMesh could set a new standard for applying geospatial AI to natural disasters,” said Juan Bernabe-Moreno, director of IBM Research Europe, Ireland, and UK. “Through advanced model architectures, rich Earth observation data, and open collaboration, we can improve our preparedness and response to extreme events.”

In addition to ImpactMesh, IBM and ESA are releasing TerraKit, an open-source package that makes it easier to build geospatial datasets and tune AI models on the most up-to-date information. TerraKit can be used to expand on ImpactMesh’s collection of curated flood and wildfire data, or to create a new dataset from scratch.

The newly tuned TerraMind models, along with ImpactMesh, are available on Hugging Face under a permissive Apache 2.0 license. “We hope that researchers can expand on this work and improve how we track and respond to natural disasters,” said Benedikt Blumenstiel, a software engineer at IBM who helped build the dataset and tuned models.
