
s2s-ai-challenge legacy

The s2s-ai-competition is officially over. However, the organizers encourage everyone to post new contributions on renku in the future and to take advantage of the software framework developed for the competition. This website and the renku projects will be kept available until December 2023 (the end of the S2S project). In particular, if your contribution beats the current leader, please contact Andy Robertson and Frederic Vitart. The s2s-ai-challenge-scorer that fed the leaderboard has stopped working, but you can still easily score your contributions manually.

Competition Overview

WMO logo WWRP logo WCRP logo S2S logo SDSC logo ECMWF logo

Table of Contents

  1. Announcements
  2. Description
  3. Timeline
  4. Prize
  5. Predictions
  6. Evaluation
  7. Data
  8. Training
  9. Discussions
  10. Leaderboard
  11. Rules
  12. Organizers


2022-02-04: prize announcement

The expert peer review gave a pass on all five submissions. Based on the leaderboard the prizes go to:

🥇(CHF 15,000): David Landry, Jordan Gierschendorf, Arlan Dirkson, Bertrand Denis (Centre de recherche informatique de Montréal and ECCC)

🥈 (CHF 10,000): Llorenç Lledó, Sergi Bech, Lluís Palma, Andrea Manrique, Carlos Gomez (Barcelona Supercomputing Center)

🥉 (CHF 5,000): Shanglin Zhou and Adam Bienkowski (University of Connecticut)

The organizers thank Florian Pinault (ECMWF), Jesper Dramsch (ECMWF), Stephan Rasp, and Kenneth Nowak (USBR) for their expert reviews.

2022-01-26: prize announcement on 4 Feb 2022

2021-11-23: date for conference session: 26 Jan 2022

2021-11-04: Announcement of the score RPSS and beginning of open review

2021-10-22: Small modifications to s2s-ai-challenge-scorer

2021-09-30: Last month for submissions




2021-05-31: Template repository and scorer bot ready

2021-05-27: Town hall dates and EWC compute deadline changes

The organizers slightly adapted the rules:

The organizers invite everyone to join two town hall meetings:

The meetings will include a 15-minute presentation on the competition rules and technical aspects, followed by a 45-minute discussion for Q&A.

A first version of the s2s-ai-challenge-template repository is released. Please fork again or rebase.

The deadline to apply for EWC compute access is shifted to 15 June 2021. Please use the competition registration form to explain why you need compute resources. Please note that ECMWF only provides access to EWC, not detailed support for setting up your environments, etc.

The RPSS formula has changed to incorporate the RPSS with respect to climatology, see evaluation.

2021-05-10: Rules adapted to discourage overfitting

The organizers modified the rules:

The organizers are aware that overfitting is an issue if the ground truth is accessible. A more robust verification would require predicting future states with weekly submissions over a year, which would take much more time until one year of new observations is available. Therefore, we decided against a real-time competition to shorten the project length and keep momentum high. Over time we will all see which methods genuinely have skill and which overfitted their available data.


The World Meteorological Organization (WMO) is launching an open prize challenge to improve current forecasts of precipitation and temperature from today’s best computational fluid dynamical models 3 to 6 weeks into the future using Artificial Intelligence and/or Machine Learning techniques. The challenge is part of the Subseasonal-to-Seasonal Prediction Project (S2S Project), coordinated by the World Weather Research Programme (WWRP)/World Climate Research Programme (WCRP), in collaboration with the Swiss Data Science Center (SDSC) and the European Centre for Medium-Range Weather Forecasts (ECMWF).

Improved sub-seasonal to seasonal (S2S) forecast skill would benefit multiple user sectors immensely, including water, energy, health, agriculture and disaster risk reduction. The creation of an extensive database of S2S model forecasts has provided a new opportunity to apply the latest developments in machine learning to improve S2S prediction of temperature and total precipitation forecasts up to 6 weeks ahead, with focus on biweekly averaged conditions around the globe.

The competition will be implemented on the platform of Renkulab at the Swiss Data Science Center (SDSC), which hosts all the codes and scripts. The training and verification data will be easily accessible from the European Weather Cloud and relevant access scripts will be provided to the participants. All the codes and forecasts of the challenge will be made open access after the end of the competition.

This is the landing page of the competition presenting static information about the competition. For code examples and how to contribute, please visit the contribution template repository



Prizes will be awarded to the top three submissions as evaluated by RPSS and peer-review scores; submissions must beat the calibrated ECMWF benchmark and climatology. The ECMWF recalibration has been performed by using the tercile boundaries from the model climatology rather than from observations:

The 3rd prize is reserved for the top submission from a developing country, least developed country, or small island state as per the UN list (see tables C, F, H p.166ff). If such a submission is already among the top 2, the third submission will get the 3rd prize.


The organizers envisage two different approaches for Machine Learning-based predictions, which may be combined. Predict the week 3-4 & 5-6 state based on:

ML-based predictions schematic

For the exact valid_times to predict, see timings. For the data to use for training, see data sources. Please comply with the rules.


The objective of the competition is to improve week 3-4 (weeks 3 plus 4) and 5-6 (weeks 5 plus 6) subseasonal global probabilistic 2m temperature and total precipitation tercile forecasts issued in the year 2020 by using Machine Learning/Artificial Intelligence.

The evaluation will be continuously performed by the s2s-ai-challenge-scorer bot, following the verification notebook. Submissions are evaluated with the Ranked Probability Score (RPS) between the ML-based forecasts and the ground truth CPC temperature and accumulated precipitation observations, based on pre-computed observation-based terciles calculated in renku_datasets_biweekly.ipynb. This RPS is compared to the climatology forecast in the Ranked Probability Skill Score (RPSS). The ML-based forecasts must beat the re-calibrated real-time 2020 ECMWF and climatology forecasts to be able to win prizes, see the end of the verification notebook.

RPS is calculated with the open-source package xskillscore over all 2020 forecast_times. For probabilistic forecasts:

xs.rps(observations, probabilistic_forecasts, category_edges=None, input_distributions='p', dim='forecast_time')

See the xskillscore.rps API for details.
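To make the score concrete, here is a small illustrative sketch (not the competition scorer) of what the RPS computes for a single probabilistic forecast: the sum over categories of the squared difference between cumulative forecast probabilities and the cumulative one-hot observation.

```python
import numpy as np

def rps_single(p_forecast, obs_category):
    """RPS for one forecast: sum over categories of the squared
    difference between cumulative forecast probabilities and the
    cumulative (one-hot) observation."""
    obs = np.zeros_like(p_forecast)
    obs[obs_category] = 1.0
    return float(np.sum((np.cumsum(p_forecast) - np.cumsum(obs)) ** 2))

# climatological tercile forecast, observed outcome "above normal"
rps_clim = rps_single(np.array([1 / 3, 1 / 3, 1 / 3]), obs_category=2)  # 5/9
# a sharper forecast leaning toward the observed category scores lower (better)
rps_ml = rps_single(np.array([0.1, 0.2, 0.7]), obs_category=2)  # 0.1
```

A lower RPS is better, so in this toy case the sharper forecast would yield a positive RPSS against climatology.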

def RPSS(RPS_ML, RPS_clim):
    """Ranked Probability Skill Score with respect to climatology.

    |  Score  | Description                               |
    |    1    | maximum, perfect improvement              |
    |  (0, 1) | positive: ML better than climatology      |
    |    0    | equal performance                         |
    | (-∞, 0) | negative: ML worse than climatology       |
    """
    return 1 - RPS_ML / RPS_clim

The RPS_ML and RPS_clim are first calculated on each grid cell over land globally on a 1.5 degree grid. In grid cells where numerical values are expected but NaNs are provided, the RPS is penalized with a value of 2. The gridded RPSS = 1 - RPS_ML/RPS_clim is calculated from the ML-based RPS averaged over all forecast_times and the climatological RPS averaged over all forecast_times. The RPSS values are clipped to the interval [-10, 1]. This gridded RPSS is then spatially averaged, weighted with np.cos(np.deg2rad(ds.latitude)), over [90N-60S] land points and further averaged over both variables and both lead_times. Please note that a dry mask is applied to the total precipitation tp evaluation as in Vigaud et al. 2017, i.e. we exclude grid cells where the observation-based lower tercile edge is below 1 mm/day. Please find the ground truth compared against here.
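The clipping and cosine-latitude weighting described above can be sketched in a few lines of numpy. This is an illustrative sketch under stated assumptions (random placeholder RPSS values, no land mask), not the official scorer:

```python
import numpy as np

rng = np.random.default_rng(0)
# 1.5 degree latitudes over the evaluated [90N-60S] band
latitude = np.arange(90.0, -60.1, -1.5)
# placeholder gridded RPSS values standing in for real results
rpss_grid = rng.normal(0.0, 1.0, size=latitude.size)
# clip to [-10, 1] as in the evaluation rules
rpss_clipped = np.clip(rpss_grid, -10, 1)
# cosine-latitude weighted spatial mean
weights = np.cos(np.deg2rad(latitude))
score = np.average(rpss_clipped, weights=weights)
```

In the real evaluation the same averaging is applied over land points only and then further averaged over both variables and both lead_times.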

For diagnostics, we will further host leaderboards for the two variables in three regions in November 2021:

Please find more details in the verification notebook.


We expect submissions to cover all bi-weekly week 3-4 and week 5-6 forecasts issued in 2020, see timings. We expect one submission netcdf file covering all 53 forecasts issued on Thursdays in 2020. Submissions must be gridded on a global 1.5 degree grid.

Each submission has to be a netcdf file with the following dimension sizes and coordinates:

Dimensions:        (category: 3, forecast_time: 53, latitude: 121, lead_time: 2, longitude: 240)
  * forecast_time  (forecast_time) datetime64[ns] 2020-01-02 ... 2020-12-31
  * latitude       (latitude) float64 90.0 88.5 87.0 85.5 ... -87.0 -88.5 -90.0
  * lead_time      (lead_time) timedelta64[ns] 14 days 28 days
  * longitude      (longitude) float64 0.0 1.5 3.0 4.5 ... 355.5 357.0 358.5
    valid_time     (lead_time, forecast_time) datetime64[ns] 2020-01-16 ... 2...
  * category       (category) object 'below normal' 'near normal' 'above normal'
Data variables:
    t2m            (category, lead_time, forecast_time, latitude, longitude) float32 ...
    tp             (category, lead_time, forecast_time, latitude, longitude) float32 ...
Attributes:
    author:        Aaron Spring
    comment:       created for the s2s-ai-challenge as a template for the web...

This template submission file is available here.

Click on 📄 to see the metadata for the coordinates and variables.

We deal with two fundamentally different variables here: total precipitation is precipitation flux pr accumulated over lead_time until valid_time and therefore describes a point observation, while 2m temperature is averaged over lead_time(valid_time) and therefore describes an average observation. The submission file data model unifies both approaches and assigns lead_time values of 14 days for week 3-4 and 28 days for week 5-6, marking the first day of each biweekly aggregate.
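A skeleton with the required dimensions and coordinates can be sketched with xarray. This is a minimal illustration (not the official template) filled with hypothetical uniform 1/3 probabilities:

```python
import numpy as np
import pandas as pd
import xarray as xr

# required coordinates: 53 Thursdays in 2020, 2 biweekly lead_times,
# a global 1.5 degree grid, and 3 tercile categories
forecast_time = pd.date_range("2020-01-02", periods=53, freq="7D")
lead_time = pd.to_timedelta([14, 28], unit="D")
latitude = np.arange(90.0, -91.5, -1.5)   # 121 points
longitude = np.arange(0.0, 360.0, 1.5)    # 240 points
category = ["below normal", "near normal", "above normal"]

dims = ("category", "lead_time", "forecast_time", "latitude", "longitude")
shape = (len(category), len(lead_time), len(forecast_time),
         len(latitude), len(longitude))
ds = xr.Dataset(
    {v: (dims, np.full(shape, 1 / 3, dtype="float32")) for v in ("t2m", "tp")},
    coords={"category": category, "lead_time": lead_time,
            "forecast_time": forecast_time,
            "latitude": latitude, "longitude": longitude},
)
# valid_time marks the first day of each biweekly aggregate
ds = ds.assign_coords(valid_time=ds.forecast_time + ds.lead_time)
```

A real submission would replace the uniform probabilities with the ML-based tercile forecasts before writing the file with `ds.to_netcdf(...)`.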

Submissions have to be committed in git with git lfs in a repository hosted by

After the competition, the code for training together with the gridded results must be made public, so the organizers and peer reviewers can check adherence to the rules. Please indicate the resources used (number of CPUs/GPUs, memory, platform; see safeguards in examples) in your scripts/notebooks to allow reproducibility, and document them fully to enable easy interpretation of the code. Submissions that cannot be independently reproduced by the organizers after the competition ends cannot win prizes; please see the rules.



The organizers explicitly chose to run this competition on past 2020 data instead of as a real-time competition, to enable a much shorter competition period and to keep momentum high. We are aware of the dangers of overfitting (see rules) when the ground truth data is accessible.

Please find here an explicit list of the forecast dates required.

1) Which forecast start/target periods (weeks 3-4 & 5-6) are required to be submitted?

Please find a list of the dates when forecasts are issued (forecast_time with CF standard_name forecast_reference_time) and the corresponding start and end valid_times for week 3-4 and week 5-6.

forecast_time week 3-4 start week 3-4 end week 5-6 start week 5-6 end
2020-01-02 2020-01-16 2020-01-29 2020-01-30 2020-02-12
2020-01-09 2020-01-23 2020-02-05 2020-02-06 2020-02-19
2020-01-16 2020-01-30 2020-02-12 2020-02-13 2020-02-26
2020-01-23 2020-02-06 2020-02-19 2020-02-20 2020-03-04
2020-01-30 2020-02-13 2020-02-26 2020-02-27 2020-03-11
2020-02-06 2020-02-20 2020-03-04 2020-03-05 2020-03-18
2020-02-13 2020-02-27 2020-03-11 2020-03-12 2020-03-25
2020-02-20 2020-03-05 2020-03-18 2020-03-19 2020-04-01
2020-02-27 2020-03-12 2020-03-25 2020-03-26 2020-04-08
2020-03-05 2020-03-19 2020-04-01 2020-04-02 2020-04-15
2020-03-12 2020-03-26 2020-04-08 2020-04-09 2020-04-22
2020-03-19 2020-04-02 2020-04-15 2020-04-16 2020-04-29
2020-03-26 2020-04-09 2020-04-22 2020-04-23 2020-05-06
2020-04-02 2020-04-16 2020-04-29 2020-04-30 2020-05-13
2020-04-09 2020-04-23 2020-05-06 2020-05-07 2020-05-20
2020-04-16 2020-04-30 2020-05-13 2020-05-14 2020-05-27
2020-04-23 2020-05-07 2020-05-20 2020-05-21 2020-06-03
2020-04-30 2020-05-14 2020-05-27 2020-05-28 2020-06-10
2020-05-07 2020-05-21 2020-06-03 2020-06-04 2020-06-17
2020-05-14 2020-05-28 2020-06-10 2020-06-11 2020-06-24
2020-05-21 2020-06-04 2020-06-17 2020-06-18 2020-07-01
2020-05-28 2020-06-11 2020-06-24 2020-06-25 2020-07-08
2020-06-04 2020-06-18 2020-07-01 2020-07-02 2020-07-15
2020-06-11 2020-06-25 2020-07-08 2020-07-09 2020-07-22
2020-06-18 2020-07-02 2020-07-15 2020-07-16 2020-07-29
2020-06-25 2020-07-09 2020-07-22 2020-07-23 2020-08-05
2020-07-02 2020-07-16 2020-07-29 2020-07-30 2020-08-12
2020-07-09 2020-07-23 2020-08-05 2020-08-06 2020-08-19
2020-07-16 2020-07-30 2020-08-12 2020-08-13 2020-08-26
2020-07-23 2020-08-06 2020-08-19 2020-08-20 2020-09-02
2020-07-30 2020-08-13 2020-08-26 2020-08-27 2020-09-09
2020-08-06 2020-08-20 2020-09-02 2020-09-03 2020-09-16
2020-08-13 2020-08-27 2020-09-09 2020-09-10 2020-09-23
2020-08-20 2020-09-03 2020-09-16 2020-09-17 2020-09-30
2020-08-27 2020-09-10 2020-09-23 2020-09-24 2020-10-07
2020-09-03 2020-09-17 2020-09-30 2020-10-01 2020-10-14
2020-09-10 2020-09-24 2020-10-07 2020-10-08 2020-10-21
2020-09-17 2020-10-01 2020-10-14 2020-10-15 2020-10-28
2020-09-24 2020-10-08 2020-10-21 2020-10-22 2020-11-04
2020-10-01 2020-10-15 2020-10-28 2020-10-29 2020-11-11
2020-10-08 2020-10-22 2020-11-04 2020-11-05 2020-11-18
2020-10-15 2020-10-29 2020-11-11 2020-11-12 2020-11-25
2020-10-22 2020-11-05 2020-11-18 2020-11-19 2020-12-02
2020-10-29 2020-11-12 2020-11-25 2020-11-26 2020-12-09
2020-11-05 2020-11-19 2020-12-02 2020-12-03 2020-12-16
2020-11-12 2020-11-26 2020-12-09 2020-12-10 2020-12-23
2020-11-19 2020-12-03 2020-12-16 2020-12-17 2020-12-30
2020-11-26 2020-12-10 2020-12-23 2020-12-24 2021-01-06
2020-12-03 2020-12-17 2020-12-30 2020-12-31 2021-01-13
2020-12-10 2020-12-24 2021-01-06 2021-01-07 2021-01-20
2020-12-17 2020-12-31 2021-01-13 2021-01-14 2021-01-27
2020-12-24 2021-01-07 2021-01-20 2021-01-21 2021-02-03
2020-12-31 2021-01-14 2021-01-27 2021-01-28 2021-02-10
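The dates above follow directly from the lead_time convention: the week 3-4 aggregate starts 14 days after the forecast issue date, the week 5-6 aggregate 28 days after, and each aggregate spans 14 days. A minimal sketch reproducing the first row of the table:

```python
from datetime import date, timedelta

forecast_time = date(2020, 1, 2)                  # a Thursday in 2020
week34_start = forecast_time + timedelta(days=14)  # 2020-01-16
week34_end = week34_start + timedelta(days=13)     # 2020-01-29 (14-day span)
week56_start = forecast_time + timedelta(days=28)  # 2020-01-30
week56_end = week56_start + timedelta(days=13)     # 2020-02-12
```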

2) Which data are allowed to be used to make a specific ML forecast?


The main datasets for this competition are already available as renku datasets and in climetlab for both variables, temperature and total precipitation. In climetlab, there is one set of dataset tags for the Machine Learning community and one for the S2S forecasting community, which both lead to the same datasets:

| tag in climetlab (ML community) | tag in climetlab (S2S community) | Description | renku dataset(s) |
|---|---|---|---|
| training-input | hindcast-input | deterministic daily lead_time reforecasts/hindcasts, initialized once per week 2000 to 2019 on the dates of 2020 Thursdays, from models ECMWF, ECCC, NCEP | biweekly lead_time: {model}_hindcast-input_2000-2019_biweekly_deterministic.zarr |
| test-input | forecast-input | deterministic daily lead_time real-time forecasts, initialized on Thursdays 2020, from models ECMWF, ECCC, NCEP | biweekly lead_time: {model}_forecast-input_2020_biweekly_deterministic.zarr |
| training-output-reference | hindcast-like-observations | CPC daily observations formatted as 2000-2019 hindcasts with forecast_time and lead_time | biweekly lead_time deterministic: hindcast-like-observations_2000-2019_biweekly_deterministic.zarr; probabilistic in 3 categories: hindcast-like-observations_2000-2019_biweekly_terciled.zarr |
| test-output-reference | forecast-like-observations | CPC daily observations formatted as 2020 forecasts with forecast_time and lead_time | biweekly lead_time: forecast-like-observations_2020_biweekly_deterministic.zarr; binary in 3 categories: |
| training-output-benchmark | hindcast-benchmark | ECMWF week 3-4 & 5-6 re-calibrated probabilistic 2000-2019 hindcasts in 3 categories | - |
| test-output-benchmark | forecast-benchmark | ECMWF week 3-4 & 5-6 re-calibrated probabilistic real-time 2020 forecasts in 3 categories | - |
| - | - | Observations-based tercile category edges calculated from 2000-2019 | - |

Note that the tercile_edges separating observations into the categories "below normal" [0, 0.33), "near normal" [0.33, 0.67), and "above normal" [0.67, 1] depend on longitude (240), latitude (121), lead_time (46 days or 2 bi-weekly), forecast_time.weekofyear (53) and category_edge (2).
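Given the two tercile edges for a grid cell, classifying an observed value reduces to a binning step. A hypothetical sketch (the edge values and helper name are made up for illustration; the real edges come from the 2000-2019 observations dataset):

```python
import numpy as np

categories = ["below normal", "near normal", "above normal"]

def tercile_category(value, edges):
    """Assign a value to a tercile category.

    edges = (lower tercile edge, upper tercile edge) for one grid cell,
    lead_time and week of year.
    """
    return categories[int(np.digitize(value, edges))]

edges = np.array([-0.4, 0.5])  # made-up edges for one grid cell
```

For example, `tercile_category(0.0, edges)` falls between the two edges and is classified as "near normal".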

We encourage participants to use subseasonal forecasts from the S2S and SubX projects:

However, any other publicly available data sources (like CMIP, NMME, DCPP, etc.) from dates prior to the forecast_time can be used in addition to training-input and forecast-input. Purely empirical methods like persistence or climatology can also be used. The only essential data requirement concerns forecast times and dates, see timings.

Ground truth sources are NOAA CPC temperature and total precipitation from IRIDL:



Follow the steps in the template renku project.


Where to train?

We are looking for your smart solutions here. Find a quick start template here.


Please use the issue tracker in the renkulab s2s-ai-challenge gitlab repository for questions to the organizers, discussions, and bug reports. We have set up a Gitter chat room for informal communication.


Answered questions from the issue tracker are regularly transferred to the FAQ.



Please find below all submissions with an RPSS > -1:

| # | Project | User | Submission tag | RPSS | Visibility | Notebooks | Review | Prediction netcdf | Docs | Description | Country |
|---|---------|------|----------------|------|------------|-----------|--------|-------------------|------|-------------|---------|
| 1 | s2s-ai-challenge-template | Bertrand Denis, David Landry, Jordan Gierschendorf, Arlan Dirkson | submission-methods-7 | 0.0459475 | visibility badge | notebook badge | review badge | netcdf badge | docs badge | climatology, EMOS, CNN ECMWF and CNN weighting | 🇨🇦 |
| 2 | s2s-ai-challenge-BSC | Sergi Bech, Llorenç Lledó, Lluís Palma, Carlos Gomez, Andrea Manrique | submission-ML_models | 0.0288718 | visibility badge | notebook badge | review badge | netcdf badge | docs badge | climo + raw ECMWF + logistic regression + random forest | 🇪🇸 |
| 3 | s2s-ai-challenge-uconn | Shanglin Zhou, Adam Bienkowski | submission-tmp3-0.0.1 | 0.00620681 | visibility badge | notebook badge | review badge | netcdf badge | docs badge | RandomForestClassifier | 🇺🇸 |
| 4 | s2s-ai-challenge-kit-eth-ubern | Nina Horat, Sebastian Lerch, Julian Quinting, Daniel Steinfeld, Pavel Zwerschke | submission_local_CNN_final | 0.00236371 | visibility badge | notebook badge | review badge | netcdf badge | docs badge | CNN | 🇩🇪🇨🇭🇺🇸 |
| 5 | s2s-ai-challenge-template | Damien Specq | submission-damien-specq | 0.000467425 | visibility badge | notebook badge | review badge | netcdf badge | docs badge | Bayesian statistical-dynamical post-processing | 🇫🇷 |
| 6 | s2s-ai-challenge-ECMWF-internal-testers | Matthew Chantry, Florian Pinault, Jesper Dramsch, Mihai Alexe | submission-pplnn-0.0.1 | -0.0301027 | visibility badge | notebook badge | review badge | netcdf badge | | CNN, resnet-CNN or Unet | 🇬🇧🇫🇷🇩🇪 |
| 7 | s2s-ai-challenge-alpin | Julius Polz, Christian Chwala, Tanja Portele, Christof Lorenz, Brian Boeker | submission-resubmit-alpine-0.0.1 | -0.597942 | visibility badge | notebook badge | review badge | netcdf badge | | CNN | 🇩🇪 |
| 8 | s2s-ai-challenge-ylabaiers | Ryo Kaneko, Kei Yoshimura, Gen Hayakawa, Gaohong Yin, Wenchao MA, Kinya Toride | submission_last | -0.756598 | visibility badge | notebook badge | review badge | netcdf badge | | Unet | 🇯🇵🇺🇸 |
| 9 | s2s-ai-challenge-kjhall01 | Kyle Hall & Nachiketa Acharya | submission-kyle_nachi_poelm-0.0.6 | -0.9918 | visibility badge | notebook badge | review badge | netcdf badge | | Probabilistic Output Extreme Learning Machine (POELM) | 🇺🇸 |


Extended overview

Submissions have to beat the ECMWF re-calibrated benchmark and climatology while following the rules to qualify for prizes.

We will also publish RPSS subleaderboards, which are purely diagnostic and show the RPSS for the two variables (t2m, tp), the two lead_times (weeks 3-4 & 5-6) and three subregions ([90N-30N], (30N-30S), [30S-60S]).

Peer review

From November 2021 to January 2022, there will be two peer review processes:

Peer review will evaluate:

Open peer review

One goal of this challenge is to foster a conversation about how AI/ML can improve S2S forecasts. Therefore, we will open the floor for discussion and evaluation of all submitted methods in an open peer review process. The organizers will create a table of all submissions, and everyone is invited to comment on submissions, as in the EGU’s public interactive discussions. This open peer review will be hosted on renku’s gitlab.

Expert peer review

The organizers decided that the top four submissions will be evaluated by expert peer review. This will include 2-3 reviews by experts from the fields of S2S and AI/ML. Additionally, the organizers will host a public showcase session in January 2022, in which these top four submissions can present their methods in a 10-minute talk followed by a 15-minute Q&A. The reviewers will give their review grades after an internal discussion moderated by Andrew Robertson and Frederic Vitart acting as editors.

Based on the criteria above, the expert peer reviewers will give a peer review grade. Comments from the open peer review can be taken into account by the expert peer review grades.


The review grades will be ranked, and the RPSS leaderboard will also be ranked. The final leaderboard is determined from the average of both rankings. If two submissions have the same mean ranking, the review ranking decides. The top three submissions based on the combined RPSS and expert peer-review score will receive prizes.
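The combination rule can be sketched in a few lines. This is a hypothetical illustration with made-up ranks, not the organizers' scoring code:

```python
# (rpss_rank, review_rank) for three hypothetical submissions
submissions = {"A": (1, 3), "B": (2, 2), "C": (3, 1)}

# rank by the mean of both rankings; on ties, the review rank decides
final_order = sorted(
    submissions,
    key=lambda s: ((submissions[s][0] + submissions[s][1]) / 2,
                   submissions[s][1]),
)
```

Here all three submissions tie on a mean rank of 2, so the review ranking alone determines the final order C, B, A.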


