Title: A value-function based method for incorporating ensemble forecasts in real-time optimal reservoir operations
Author: Peacock, Matthew E.
Advisor: Labadie, John W.
Committee members: Ramirez, Jorge; Anderson, Chuck; Johnson, Lynn
Date issued: 2020 (made available 2020-08-31)
URL: https://hdl.handle.net/10217/211761
Type: Text; doctoral dissertation; born digital
Language: English
Subjects: ensemble forecasts; reservoir operation; optimization; deep reinforcement learning
Rights: Copyright and other restrictions may apply. The user is responsible for compliance with all applicable laws. For information about copyright law, see https://libguides.colostate.edu/copyright.

Abstract:
Increasing stress on water resource systems has led to a desire for methods of improving the performance of reservoir operations. Water managers face many challenges, including changes in demand, variable hydrological input, and new environmental pressures. These issues have generated interest in using ensemble streamflow forecasts to improve the performance of a reservoir system. Currently available methods for using ensemble forecasts encounter difficulties as the resolution of the analysis increases to accurately model a real-world system. One difficulty is the "curse of dimensionality": computing time increases exponentially as the discretization of the state and action spaces becomes finer or as more state or action variables are considered. Another difficulty is the problem of delayed rewards: when the time step of the analysis becomes shorter than the travel time due to routing, rewards may not be realized in the same time step as the action that caused them. Current methods such as dynamic programming or scenario-tree-based methods are not able to handle delayed rewards. This research presents a value-function based method that separates the problem into two subproblems: computing the state-value function under the no-forecast condition, and finding optimal sequences of decisions given the ensemble forecast, with the state-value function supplying the value of any state reached at the end of the forecast horizon. A continuous-action deep reinforcement learning algorithm is used to overcome the problems of dimensionality and delayed rewards, and a particle swarm method is used to find optimal decisions during the forecast horizon. The method is applied to a case study in the Russian River basin and compared to an idealized operating rule. The results show that the reinforcement learning process is able to generate policies that skillfully operate the reservoir without forecasts. When forecasts are used, the method is able to produce non-dominated performance measures. When water stress on the system is increased by removing a transbasin diversion, the method outperforms the idealized operations.
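
A minimal formalization of the two-subproblem decomposition described above, under assumed notation (storage state s_t, release decision a_t, per-step reward r, forecast horizon H, ensemble member ω, and learned state-value function V):

\max_{a_1,\dots,a_H} \; \mathbb{E}_{\omega}\!\left[ \sum_{t=1}^{H} r\big(s_t^{\omega}, a_t\big) + V\big(s_{H+1}^{\omega}\big) \right]

Here V is trained offline by the continuous-action deep reinforcement learning algorithm under the no-forecast condition, and the outer maximization over the decision sequence is carried out by the particle swarm method during the forecast horizon; the terminal term V(s_{H+1}) supplies the value of whatever state is reached when the forecast runs out.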
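
The sketch below shows how such a forecast-horizon search could be structured in Python with NumPy. Every name and piece of dynamics in it (value_fn, transition, step_reward, the synthetic inflow ensemble, bounds, and PSO coefficients) is an illustrative assumption, not the dissertation's implementation.

# Hypothetical sketch: particle swarm search over a sequence of release
# decisions, scored by simulated ensemble rewards plus a learned terminal
# value, per the decomposition above. Toy dynamics throughout.
import numpy as np

H = 7                        # forecast horizon (time steps) -- assumed
N_PARTICLES, N_ITERS = 30, 200
A_MIN, A_MAX = 0.0, 100.0    # release bounds -- assumed units

def value_fn(s):
    """Stand-in for the state-value function learned by deep RL."""
    return -abs(s - 50.0)    # toy: prefer mid-range storage

def transition(s, a, inflow):
    """Toy mass balance: storage + inflow - release, clipped to capacity."""
    return np.clip(s + inflow - a, 0.0, 100.0)

def step_reward(s, a):
    """Toy per-step reward penalizing large releases and near-spill states."""
    return -0.01 * a**2 - max(s - 90.0, 0.0)

def score(decisions, s0, ensemble):
    """Mean over ensemble members of path reward plus terminal value."""
    total = 0.0
    for inflows in ensemble:             # one inflow trace per member
        s, ret = s0, 0.0
        for a, q in zip(decisions, inflows):
            ret += step_reward(s, a)
            s = transition(s, a, q)
        total += ret + value_fn(s)       # V closes out the horizon
    return total / len(ensemble)

rng = np.random.default_rng(0)
ensemble = rng.gamma(2.0, 5.0, size=(20, H))   # synthetic forecast ensemble
s0 = 40.0                                      # assumed initial storage

# Standard global-best PSO over the H-dimensional decision vector.
x = rng.uniform(A_MIN, A_MAX, size=(N_PARTICLES, H))
v = np.zeros_like(x)
pbest = x.copy()
pbest_val = np.array([score(p, s0, ensemble) for p in x])
gbest = pbest[pbest_val.argmax()].copy()

for _ in range(N_ITERS):
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
    x = np.clip(x + v, A_MIN, A_MAX)
    vals = np.array([score(p, s0, ensemble) for p in x])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = x[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()].copy()

print("best first-period release:", gbest[0])

In a receding-horizon setting, only the first-period release (gbest[0] here) would typically be implemented before re-optimizing when the next ensemble forecast arrives.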