Published July 5, 2025 | Version v1
Journal article Open

A large-scale dataset for training deep learning segmentation and tracking of extreme weather

  • 1. University of Chicago
  • 2. ETH Zurich
  • 3. NVIDIA Corporation

Description

As Earth's climate continues to change, it is imperative to understand how high-impact, extreme weather events will change with it. Researchers are increasingly relying on data-driven, learning-based approaches for the detection and tracking of extreme weather events. While several attempts have been made to generate datasets of hand-labeled weather or climate data, a significant challenge has been gathering a sufficient number of expert-annotated samples. To address this challenge, we introduce the largest dataset of expert-guided, hand-labeled segmentation masks of extreme weather events. It contains global annotations for atmospheric rivers, tropical cyclones, and atmospheric blocking events from the European Centre for Medium-Range Weather Forecasts' reanalysis version 5 (ERA5). Every timestep for each event is annotated by two separate annotators, bringing the total number of labeled timesteps to 49,184. Professional annotators were trained and guided by domain experts to identify these features, and event-specific experts were consulted for each of the annotation guides. The resulting annotations are shown to have characteristics similar to those produced by other methods and to those generated directly by domain experts.

Data availability

The code used to convert ERA5 data from NetCDF into a format suitable for webKnossos, and to upload it to the web interface, is available. Although our dataset is provided in packaged form, the pre-processing code used to download webKnossos annotations and package them into NetCDF format is also available. Both are available on GitHub at the following URL: https://github.com/andregraubner/ClimateNetLarge

Files

Large-scale-dataset-for-training-deep-learning-segmentation-and-tracking-of-extreme-weather.pdf

Additional details

Identifiers

DOI
10.1038/s41597-025-05480-0
Other
oai:uchicago.tind.io:15617

Funding

European Space Agency
Open Space Innovation Platform
Swiss Federal Institute of Technology Zurich

UChicago Information

Division(s)
Social Sciences Division
Center(s) or Institute(s)
Urban Theory Lab