
Comparing precipitation estimates, model forecasts, and random forest based predictions for excessive rainfall


Flash flooding is an important societal challenge, and improved tools are needed for both real-time analysis and short-range forecasts. We present an evaluation of threshold exceedances of quantitative precipitation estimate (QPE) and forecast (QPF) datasets in terms of their degree of correspondence with observed flash flood events over a seven-year period. We find that major uncertainties persist in QPE for heavy rainfall. In general, comparison with flash flood guidance (FFG) thresholds provides the best correspondence, but fixed thresholds and average recurrence interval thresholds provide the best correspondence in certain regions of the contiguous US (CONUS). QPF threshold exceedances from the High-Resolution Rapid Refresh (HRRR) generally do not correspond as well as QPE exceedances with observed flash floods, except for the 1-h duration in the southwestern CONUS; this suggests that high-resolution model QPF may be a better indicator of flash flooding than QPE in some poorly observed regions. Subsequently, we describe a new random forest (RF) based excessive rainfall forecast system using predictor information from the 3-km operational HRRR. Experiments exploring the use of spatial predictor information reveal the importance of averaging HRRR predictor fields across a spatial radius rather than using only information from sparse input grid points for regimes with small-scale excessive rain events. Tree interpreter results indicate that the forecast benefits of spatial aggregation stem from greater contributions provided by storm attribute predictors. Forecasts are slightly degraded when there is a mismatch between the trained RF model and the daily HRRR forecasts to which the model is applied, both in terms of initialization time and HRRR model version. Use of FFG as an additional predictor leads to forecast improvements, highlighting the potential of hydrologic information to contribute to forecast skill. 
In addition, averaging predictor information across several HRRR initializations leads to a statistically significant improvement in forecasts relative to using predictor fields from a single HRRR initialization. The HRRR-based RF has been evaluated at the annual Flash Flood and Intense Rainfall Experiment (FFaIR) over the past three years, with year-over-year improvements stemming from the results of sensitivity experiments. The HRRR-based RF represents an important baseline for future machine learning-based excessive rainfall forecasts built on convection-allowing models.
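
The two aggregation ideas named in the abstract, averaging a predictor field across a spatial radius and averaging the same field across several model initializations, can be illustrated with a minimal sketch. This is not the author's implementation: the function names, the radius expressed in grid cells, and the handling of domain edges (the disk is simply truncated) are all assumptions made for illustration.

```python
import numpy as np


def radius_mean(field, row, col, radius_cells):
    """Mean of a 2-D predictor field over grid points within a radius.

    Hypothetical helper: `radius_cells` is a radius in grid cells, and
    points outside the domain are ignored (the disk is truncated at edges).
    """
    rows, cols = np.ogrid[:field.shape[0], :field.shape[1]]
    # Boolean disk centered on (row, col)
    mask = (rows - row) ** 2 + (cols - col) ** 2 <= radius_cells ** 2
    return float(field[mask].mean())


def mean_over_initializations(fields):
    """Point-wise mean of the same predictor field from several
    initializations (hypothetical helper; all fields must share a shape)."""
    return np.mean(np.stack(fields, axis=0), axis=0)


# Toy 5x5 field standing in for one predictor on the model grid
field = np.arange(25, dtype=float).reshape(5, 5)
center_value = radius_mean(field, 2, 2, 0)  # radius 0: the single grid point
disk_value = radius_mean(field, 2, 2, 1)    # center plus its four neighbors

# Two toy "initializations" of the same predictor field
blended = mean_over_initializations([np.ones((5, 5)), 3.0 * np.ones((5, 5))])
```

In a setup like this, the aggregated values rather than single grid-point values would be passed to the random forest as predictors. The radius and the number of initializations to average are tuning choices that the abstract does not specify.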


Includes bibliographical references.
2023 Fall.



flash flooding
random forests
machine learning
excessive rainfall

