Weather Model Bias Study — NL (Open-Meteo)

Abstract

Lightweight bias study comparing three weather forecast models against two references
(ERA5 reanalysis and IFS025 as model baseline) across 10 Dutch locations over an 89-day
window. Goal: inform which model to use as weather input features in an ML electricity
load forecasting pipeline, minimising train/serve skew.

Recommendation: train on ERA5, deploy with HRES (ecmwf_ifs on Open-Meteo).


Models

Slug                             What it is                Native grid  Latency           Historical depth
ecmwf_ifs025                     IFS at 0.25° (open-data)  ~28 km       ~3–4 h after run  Years via archive API
ecmwf_ifs                        IFS HRES at 9 km          ~9 km        ~3–4 h after run  ~89 days via past_days
knmi_harmonie_arome_europe       KNMI HARMONIE-AROME EU    ~2.5 km      ~3–4 h after run  ~89 days via past_days
knmi_harmonie_arome_netherlands  KNMI HARMONIE-AROME NL    ~1 km        ~1–3 h after run  Single runs from Sep 2025; ~60h horizon
era5                             ECMWF reanalysis          0.25°        ~5–6 day lag      Back to 1940

IFS runs 4× daily (00, 06, 12, 18 UTC). ERA5's latency means it cannot be used
for real-time inference — training/ground-truth reference only.

ecmwf_ifs is confirmed as IFS HRES 9km per Open-Meteo docs. ecmwf_hres is an
invalid slug (returns a data corruption error) — wrong name, not a broken dataset.


Methodology

Sampling

Variable choices for load forecasting relevance:
- wind_speed_10m_mean not max — mean wind drives heating load / wind chill
- relative_humidity_2m_mean not max — RH max peaks at dawn when AC demand is zero
- surface_pressure excluded — causes HTTP 400 for HARMONIE

API calls

Forecast models via Open-Meteo forecast API with past_days=89 — this returns
a seamless blended timeseries (most recent run for each timestep), not a specific
forecast horizon. Sufficient for systematic bias detection; for lead-time specific
bias (e.g. day+2 error) use the Single Runs API (single-runs-api.open-meteo.com).

ERA5 via archive API with explicit date range.
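
The two request shapes described above can be sketched as URL builders. This is a minimal sketch, not the study's actual code. The endpoint URLs and the latitude, longitude, daily, models, past_days, start_date and end_date parameters follow Open-Meteo's public documentation; the daily variable list is an assumption apart from the two *_mean names given under Sampling, and the function names and example coordinates are illustrative.

```python
from urllib.parse import urlencode

FORECAST_URL = "https://api.open-meteo.com/v1/forecast"
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"

# Only the two *_mean names appear in the study; the others are assumed.
DAILY_VARS = [
    "temperature_2m_max",
    "temperature_2m_min",
    "precipitation_sum",
    "wind_speed_10m_mean",
    "relative_humidity_2m_mean",
    "shortwave_radiation_sum",
]


def forecast_url(lat, lon, model, past_days=89):
    """Build a forecast-API request for one model and one location."""
    params = {
        "latitude": lat,
        "longitude": lon,
        "daily": ",".join(DAILY_VARS),
        "models": model,
        "past_days": past_days,
        "timezone": "UTC",
    }
    # safe="," keeps the variable list readable in the query string.
    return f"{FORECAST_URL}?{urlencode(params, safe=',')}"


def archive_url(lat, lon, start_date, end_date, model="era5"):
    """Build an archive-API request with an explicit ISO date range."""
    params = {
        "latitude": lat,
        "longitude": lon,
        "daily": ",".join(DAILY_VARS),
        "models": model,
        "start_date": start_date,
        "end_date": end_date,
        "timezone": "UTC",
    }
    return f"{ARCHIVE_URL}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    # Illustrative coordinates near De Bilt.
    print(forecast_url(52.10, 5.18, "ecmwf_ifs"))
    print(archive_url(52.10, 5.18, "2026-01-01", "2026-03-30"))
```

Keeping URL construction separate from the HTTP call makes it easy to log exactly what was requested for each model and location.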

Grid handling

Open-Meteo interpolates all models to the requested lat/lon, so no manual grid
alignment is needed. HARMONIE is regional (Europe only), so an out-of-domain check
was applied automatically: if >95% of a location's HARMONIE daily values matched the
HRES values, the location was flagged as a fallback and excluded. All 10 NL locations
were in-domain.
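
The >95% match rule can be sketched as follows. This is an illustration under assumptions, not the study's implementation: the function name, the floating-point tolerance, the handling of missing values, and the choice to treat a location with no usable pairs as out of domain are all hypothetical.

```python
import math


def is_fallback(harmonie_daily, hres_daily, threshold=0.95, abs_tol=1e-9):
    """Return True if HARMONIE values are suspiciously identical to HRES.

    Both inputs are equal-length sequences of daily values for one location
    and one variable; None marks a missing day.
    """
    pairs = [
        (a, b)
        for a, b in zip(harmonie_daily, hres_daily)
        if a is not None and b is not None
    ]
    if not pairs:
        # Assumption: no usable data is treated the same as out of domain.
        return True
    matches = sum(math.isclose(a, b, abs_tol=abs_tol) for a, b in pairs)
    return matches / len(pairs) > threshold


if __name__ == "__main__":
    print(is_fallback([8.1, 9.4, 7.7], [8.1, 9.4, 7.7]))  # identical series
    print(is_fallback([8.1, 9.9, 7.2], [8.1, 9.4, 7.7]))  # one of three matches
```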


Results

Bias vs ERA5 (reanalysis)

Variable          IFS025   HRES     HARMONIE
Tmax (°C)         -0.37    +0.05    +1.30
Tmin (°C)         +0.03    +0.18    +0.97
Precip (mm)       +0.52    +0.34    -0.15
Wind mean (km/h)  -2.23    -2.22    -2.31
RH mean (%)       +1.44    -0.38    -5.75
SW rad (MJ/m²)    +1.00    +0.98    -0.96

Bias vs IFS025

Variable          HRES     HARMONIE
Tmax (°C)         +0.40    +1.70
Tmin (°C)         +0.10    +0.97
Precip (mm)       -0.11    -0.67
Wind mean (km/h)  -0.02    -0.06
RH mean (%)       -1.72    -7.56
SW rad (MJ/m²)    +0.05    -1.93

Key findings

HRES is closest to ERA5. Tmax bias of +0.05°C — essentially unbiased. IFS025 runs
cold (−0.37°C), so HRES's apparent +0.40°C warm offset vs IFS025 is mostly IFS025
being wrong, not HRES.

HARMONIE has large systematic biases. Tmax +1.30°C and RH −5.75% vs ERA5.
Systematic, not noise — shifts every prediction in the same direction.

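Wind: model choice doesn't matter. All three models underestimate mean wind vs
ERA5 by a near-identical ~2.2–2.3 km/h. Likely a reference artefact rather than a
model defect: the offset is shared by all three, and ERA5 is itself a gridded
product, not a point observation.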

SW radiation: IFS025 and HRES overestimate vs ERA5 (~+1 MJ/m²), while HARMONIE
undershoots by a similar amount (−0.96). Marginal.

Reference frame matters. HARMONIE looks +1.70°C warm vs IFS025 but only +1.30°C
warm vs ERA5 — because IFS025 itself runs cold. Always compare vs ERA5 for real-world accuracy.
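
The reference-frame point is plain arithmetic: a bias against IFS025 plus IFS025's own bias against ERA5 should give the bias against ERA5. A small check using the Tmax values from the two tables above (the function name is illustrative; the identity is exact only when both comparisons use the same days and locations, which is assumed here):

```python
# Tmax biases in °C, copied from the two results tables.
IFS025_VS_ERA5 = -0.37
VS_IFS025 = {"HRES": 0.40, "HARMONIE": 1.70}
VS_ERA5 = {"HRES": 0.05, "HARMONIE": 1.30}


def rebase(bias_vs_baseline, baseline_vs_reference):
    """Convert a bias measured against a baseline into a bias against the reference."""
    return bias_vs_baseline + baseline_vs_reference


if __name__ == "__main__":
    for model, bias in VS_IFS025.items():
        implied = rebase(bias, IFS025_VS_ERA5)
        residual = implied - VS_ERA5[model]
        print(f"{model}: implied {implied:+.2f}, tabulated {VS_ERA5[model]:+.2f}, "
              f"residual {residual:+.2f}")
```

For HARMONIE the implied value is +1.33°C against a tabulated +1.30°C. The small residual is plausibly rounding or slightly different valid samples, so the two tables are mutually consistent.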

Texel (island) anomaly. HARMONIE Tmax is +3.05°C vs ERA5 but Tmin is −2.32°C, i.e. a
wider diurnal range than ERA5; the 2.5 km grid resolves coastal effects in both
directions. RH flips to +5.89% (vs the inland dry bias), consistent with sea
moisture being captured at high resolution.


Lead time analysis — daily aggregates

Single Runs API: 17 initialization dates (Jan–Jun 2026, every 10 days), 3 NL locations.
HARMONIE here is knmi_harmonie_arome_europe; its runs yielded daily aggregates for the
+1d horizon only, so the all-model comparison is limited to +1d.

Bias vs ERA5 at lead +1d — all models

Variable          IFS025   HRES     HARMONIE
Tmax (°C)         -0.50    -0.34    +0.35
Tmin (°C)         +0.13    +0.17    +0.95
Precip (mm)       +0.35    +0.53    +0.95
Wind (km/h)       -1.17    -2.44    -0.74
RH mean (%)       +1.20    +0.24    -1.90

HRES bias vs ERA5 by lead time (+1 to +5 days)

Variable          +1d      +2d      +3d      +4d      +5d
Tmax (°C)         -0.34    -0.48    -0.85    -0.57    -1.87
Tmin (°C)         +0.17    -0.06    -0.25    -0.63    -0.40
Precip (mm)       +0.53    -0.29    +0.45    +0.63    +0.31
Wind (km/h)       -2.44    -2.59    -2.04    -3.24    -1.75
RH mean (%)       +0.24    +0.41    -0.20    +0.65    -0.21

Does bias grow with lead time? For temperature, broadly yes. HRES Tmax goes from
−0.34°C at +1d to −1.87°C at +5d, although not monotonically (+4d is −0.57°C), and
Tmin drifts from +0.17°C to −0.40°C, so HRES becomes more cold-biased at longer
horizons. Precipitation and wind show no consistent trend (noisy). RH is relatively
stable for HRES across lead times.
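
One way to put a number on "grows with lead time" is a least-squares slope of bias against lead day. The sketch below uses the HRES values from the table above; the choice of an ordinary least-squares line is an illustration, not the study's method, and with only five points per variable the slope is a rough summary rather than a significance test.

```python
LEAD_DAYS = [1, 2, 3, 4, 5]

# HRES bias vs ERA5 by lead day, copied from the table above.
HRES_BIAS = {
    "tmax": [-0.34, -0.48, -0.85, -0.57, -1.87],
    "tmin": [0.17, -0.06, -0.25, -0.63, -0.40],
    "precip": [0.53, -0.29, 0.45, 0.63, 0.31],
    "wind": [-2.44, -2.59, -2.04, -3.24, -1.75],
    "rh": [0.24, 0.41, -0.20, 0.65, -0.21],
}


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (change in bias per lead day)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


slopes = {name: ols_slope(LEAD_DAYS, values) for name, values in HRES_BIAS.items()}

if __name__ == "__main__":
    for name, slope in slopes.items():
        print(f"{name}: {slope:+.3f} per day")
```

Tmax comes out at about −0.32°C per day and Tmin at about −0.17°C per day, while the precipitation, wind and RH slopes are all below 0.1 in magnitude per day. The Tmax slope is driven largely by the +5d value.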


Intraday (hourly) lead time analysis

Single Runs API — 20 initialization dates (Sep 2025–Jun 2026, every 14 days),
3 NL locations (Amsterdam, De Bilt, Maastricht). Reference: ERA5 hourly.

Used knmi_harmonie_arome_netherlands (higher res, NL-specific; ~60h horizon
confirmed from single runs). HRES and IFS025 extended to +48h for fair comparison.

Temperature (°C)

First panel (vs ERA5):

Lead    HRES     HARMONIE
+1h     +0.01    +1.08
+2h     -0.09    +1.17
+6h     -0.34    +0.96
+12h    -0.18    +0.22
+18h    -0.38    +0.66
+24h    -0.00    +1.13
+36h    +0.03    +0.24
+48h    +0.01    +0.93

Second panel (reference not stated):

Lead    HRES     HARMONIE
+1h     -0.01    +1.06
+2h     +0.05    +1.31
+6h     +0.03    +1.33
+12h    +0.03    +0.43
+18h    +0.09    +1.13
+24h    +0.19    +1.32
+36h    +0.10    +0.31
+48h    +0.06    +0.98

Wind speed (km/h)

First panel (vs ERA5):

Lead    HRES     HARMONIE
+1h     -3.06    -1.53
+2h     -3.32    -1.67
+6h     -2.99    -1.84
+12h    -3.39    -2.64
+18h    -2.64    -0.85
+24h    -2.20    -0.52
+36h    -2.71    -2.54
+48h    -2.43    -0.44

Second panel (reference not stated):

Lead    HRES     HARMONIE
+1h     -1.41    +0.12
+2h     -1.29    +0.36
+6h     -1.25    -0.10
+12h    -1.65    -0.90
+18h    -0.91    +0.88
+24h    -0.86    +0.82
+36h    -1.51    -1.34
+48h    -1.12    +0.87

Relative humidity (%)

First panel (vs ERA5):

Lead    HRES     HARMONIE
+1h     -0.42    -5.47
+2h     -0.13    -5.25
+6h     +1.32    -5.15
+12h    -0.92    -1.98
+18h    +1.58    -1.60
+24h    +0.03    -4.67
+36h    -2.82    -1.92
+48h    -0.03    -3.32

Second panel (reference not stated):

Lead    HRES     HARMONIE
+1h     +0.06    -4.99
+2h     -0.15    -5.27
+6h     -0.38    -6.85
+12h    -1.95    -3.01
+18h    -1.20    -4.38
+24h    -0.79    -5.49
+36h    -1.85    -0.95
+48h    -0.43    -3.72

HARMONIE has a strong diurnal bias pattern. Its temperature warm bias is large at
nighttime hours (+1h=+1.08°C, +2h=+1.17°C, +24h=+1.13°C, +48h=+0.93°C) but
nearly vanishes at noon (+12h=+0.22°C, +36h=+0.24°C). This is consistent with a
nocturnal boundary layer issue: HARMONIE overestimates overnight temperatures.
For load forecasting this is significant, because nighttime demand falls in
precisely the hours when HARMONIE's temperature is most wrong.
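
The night-versus-noon contrast can be reproduced by mapping each lead hour to a valid hour and averaging. The sketch assumes 00 UTC initialisations (implied by the text equating +12h and +36h with noon, but not stated outright) and uses the HARMONIE temperature values cited above; the function names and the choice of hours counted as "night" are illustrative.

```python
# HARMONIE temperature bias vs ERA5 (°C) by lead hour, as cited in the text.
HARMONIE_TEMP_BIAS = {
    1: 1.08, 2: 1.17, 6: 0.96, 12: 0.22,
    18: 0.66, 24: 1.13, 36: 0.24, 48: 0.93,
}


def valid_hour(lead_h, init_hour_utc=0):
    """Hour of day (UTC) at which a lead time is valid. Assumes a 00 UTC run."""
    return (init_hour_utc + lead_h) % 24


def mean_bias(biases, hours):
    """Average the bias over all lead times whose valid hour is in `hours`."""
    values = [b for lead, b in biases.items() if valid_hour(lead) in hours]
    return sum(values) / len(values)


night = mean_bias(HARMONIE_TEMP_BIAS, {0, 1, 2})   # leads +1h, +2h, +24h, +48h
midday = mean_bias(HARMONIE_TEMP_BIAS, {12})       # leads +12h, +36h

if __name__ == "__main__":
    print(f"night  {night:+.2f} °C")
    print(f"midday {midday:+.2f} °C")
```

Under these assumptions the night-time mean is about +1.08°C against +0.23°C at midday, roughly a factor of five.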

HRES temperature is stable across all lead times — stays within ±0.4°C from
+1h to +48h. Small cold bias at evening hours (−0.38°C at +18h), near-zero at
+1h and +24h.

Wind: HARMONIE is better than HRES here. HRES consistently underestimates
wind by 2.2 to 3.4 km/h. HARMONIE is closer to ERA5 at every tabulated lead time,
most clearly at +18h, +24h and +48h. IFS025 in between.

RH: HARMONIE's dry bias is worst at nighttime: −5.5% at +1h and −4.7% at +24h,
but only −2.0% at +12h. HRES RH is noisy but small in magnitude (within ±3%). For
heat-index based load forecasting this nighttime dry bias coincides with the
warm bias; the two partially offset in apparent temperature, but both inputs are wrong.


ML load forecasting recommendation

Train on  Deploy with  Tmax skew  RH skew  Verdict
ERA5      HRES         +0.05°C    −0.38%   ✅ Best
IFS025    HRES         +0.40°C    −1.72%   ✓ Fine
ERA5      HARMONIE     +1.30°C    −5.75%   ✗ Avoid
IFS025    HARMONIE     +1.70°C    −7.56%   ✗ Worst

Train on ERA5 (years of history, full seasonal coverage), deploy with HRES (near-zero
bias on Tmax and RH vs ERA5). HARMONIE's RH dry bias is particularly harmful — humidity
drives heat index which correlates strongly with cooling load.

For day-ahead forecasts specifically: lead +1d biases are modest for HRES (−0.34°C Tmax,
+0.24% RH vs ERA5), so the train/serve skew stays small at that horizon. At longer
horizons the Tmax cold bias grows (−1.87°C at +5d).