Online transfer learning strategy for enhancing the scalability and deployment of deep reinforcement learning control in smart buildings

Date Published	01/2023
Publication Type	Journal Article

Authors	Davide Coraci, Silvio Brandi, Tianzhen Hong, Alfonso Capozzoli
DOI	10.1016/j.apenergy.2022.120598
Abstract	In recent years, advanced control strategies based on Deep Reinforcement Learning (DRL) have proved effective in optimizing the management of integrated energy systems in buildings, reducing energy costs and improving indoor comfort conditions compared to traditional reactive controllers. However, the scalability and implementation of DRL controllers are still limited, since they require a considerable amount of time before converging to a near-optimal solution. This issue is currently addressed in the literature through offline pre-training of the DRL agent. However, this solution raises two critical issues: (1) the need to develop a building surrogate model to perform the training task, and (2) the need to perform a fine-tuning process over several training episodes to obtain a near-optimal control policy. In this context, this paper introduces an Online Transfer Learning (OTL) strategy that exploits two knowledge-sharing techniques, weight-initialization and imitation learning, to transfer a DRL control policy from a source office building to various target buildings in a simulation environment coupling EnergyPlus and Python. A DRL controller based on discrete Soft Actor–Critic (SAC) is trained on the source building to manage the operation of a cooling system consisting of a chiller and a thermal storage. Several target buildings are defined to benchmark the performance of the OTL strategy against that of a Rule-Based Controller (RBC) and two DRL-based control strategies, one deployed offline and one deployed online. The OTL strategy emulates real-world implementation in simulation by deploying the transferred DRL agent for a single episode in the target buildings. Target buildings have the same geometrical features and are served by the same energy system as the source building, but differ in terms of weather conditions, electricity price schedules, occupancy patterns, and building envelope efficiency levels.
The results show that the OTL strategy can reduce the cumulative sum of temperature violations on average by 50% and 80%, respectively, when compared to RBC and online DRL, while enhancing the energy system operation with electricity cost savings ranging between 20% and 40%. The OTL agent performs slightly worse than the offline DRL controller, but it does not require any modeling effort and can be implemented directly on target buildings, emulating a real-world implementation.
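As a rough illustration of the two knowledge-sharing techniques named in the abstract, the sketch below shows weight-initialization (a target agent starting from a copy of the source policy's weights) and imitation learning (a fresh policy trained to reproduce the source policy's actions). It is not the paper's implementation: the toy linear softmax policy, the names `LinearPolicy`, `init_from_source`, and `imitation_step`, the observation and action sizes, and the learning rate are all assumptions made for illustration, and the paper itself uses a discrete SAC agent coupled with EnergyPlus.

```python
import copy
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class LinearPolicy:
    """Toy discrete policy (hypothetical): probs = softmax(obs @ W + b)."""

    def __init__(self, n_obs, n_actions, rng):
        self.W = rng.normal(scale=0.1, size=(n_obs, n_actions))
        self.b = np.zeros(n_actions)

    def probs(self, obs):
        return softmax(obs @ self.W + self.b)


def init_from_source(source):
    """Weight-initialization: the target agent starts from a copy of the source weights."""
    return copy.deepcopy(source)


def imitation_step(student, teacher, obs, lr=0.5):
    """One behaviour-cloning step toward the teacher's greedy actions.

    Returns the cross-entropy loss measured before the update.
    """
    p = student.probs(obs)
    targets = np.eye(p.shape[1])[teacher.probs(obs).argmax(axis=1)]
    grad_logits = (p - targets) / len(obs)  # gradient of mean cross-entropy
    student.W -= lr * obs.T @ grad_logits
    student.b -= lr * grad_logits.sum(axis=0)
    return float(-(targets * np.log(p + 1e-12)).sum(axis=1).mean())


rng = np.random.default_rng(0)
source = LinearPolicy(n_obs=4, n_actions=3, rng=rng)   # stands in for the trained source agent
target = init_from_source(source)                      # weight-initialization
student = LinearPolicy(n_obs=4, n_actions=3, rng=rng)  # fresh agent for imitation learning
obs = rng.normal(size=(64, 4))                         # synthetic observations, not building data
losses = [imitation_step(student, source, obs) for _ in range(200)]
```

In this toy setting, `target` reproduces the source policy exactly before any interaction with a new environment, while `student` only approaches it as the imitation loss decreases; in an online deployment either agent would then keep learning on the target building.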
Journal	Applied Energy
Volume	333
Year of Publication	2023
Pagination	120598
ISSN Number	0306-2619
URL	https://linkinghub.elsevier.com/retrieve/pii/S0306261922018554
Short Title	Applied Energy
Keywords	Energy efficiency, Deep reinforcement learning, Online transfer learning, Homogeneous transfer learning, Intra-agent transfer learning, Building adaptive control
Organizations	Building Technology and Urban Systems Division, Building Technologies Department, Simulation Research
Research Areas	Building Technology and Urban Systems Division, BTUS Modeling and Simulation