Dataset associated with "Estimation of the state-value function for optimal reservoir operations using continuous action deep reinforcement learning"

Date

2020

Authors

Peacock, Matthew E.
Labadie, John W.

Abstract

The state-value function of a reservoir system provides information about the long-term rewards that can be accrued from any state which the system can occupy. This function can be used to determine optimal decisions and is also a key piece of information needed when reservoir operators wish to incorporate real-time forecast information. Dynamic programming is the most popular method for calculating the state-value function but has well-known limitations. The "curse of dimensionality," which can lead to computational intractability, arises from the discrete nature of the formulation and the backwards recursive solution process precluding consideration of delayed rewards. Continuous action deep reinforcement learning (CADRL) is a recent development for estimating the state-value function when delayed rewards are present and avoids the difficulties associated with use of discrete methods. Since application of this technique to reservoir operation problems is not without its own challenges, presented herein is a computational implementation with refinements needed to provide a stable and reliable learning process. CADRL is applied to the development of optimal operational strategies for Lake Mendocino in the Russian River basin of Northern California using two single-objective reward functions, along with a multi-objective reward function for verification purposes. Performance of the optimal policy functions developed from the learning process is evaluated through simulation, with results showing that the system is able to learn far-sighted strategies that outperform idealized policies with foresight.
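
As a rough illustration of what "long-term rewards that can be accrued from any state" means, the sketch below estimates a state value as the average discounted return over sampled reward sequences that all start in the same state. This is a plain Monte Carlo estimate, not the CADRL method in the dataset; the function names, the discount factor `gamma`, and the reward values are hypothetical and chosen only for illustration.

```python
def discounted_return(rewards, gamma=0.99):
    """Discounted sum of a reward sequence, accumulated from the last step backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def monte_carlo_state_value(reward_sequences, gamma=0.99):
    """Average discounted return over trajectories that start in the same state.

    Assumption: each inner list holds the per-step rewards of one simulated
    trajectory; the reservoir simulation that would produce them is not shown.
    """
    returns = [discounted_return(seq, gamma) for seq in reward_sequences]
    return sum(returns) / len(returns)


if __name__ == "__main__":
    # Two made-up trajectories: steady small rewards vs. one delayed large reward.
    samples = [[1.0, 1.0, 1.0], [0.0, 0.0, 4.0]]
    print(monte_carlo_state_value(samples, gamma=0.5))
```

The second sample trajectory pays nothing until its final step, which is the kind of delayed reward the abstract refers to; the discount factor controls how much such late rewards contribute to the estimate.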

Description

This dataset includes the source code of an implementation of the deep deterministic policy gradients algorithm applied to a reservoir operations problem. Also included are the input time series data of inflow and withdrawal at each node in the network and the evaporation table.
Department of Civil and Environmental Engineering

Subject

reservoir operations
reinforcement learning
deep deterministic policy gradients
continuous action deep reinforcement learning
ensemble forecast

Associated Publications

in review: Estimation of the State-Value Function for Optimal Reservoir Operations using Continuous Action Deep Reinforcement Learning