Reinforcement Learning for Robust Low-Thrust Trajectory Design, with Application to Micro-spacecraft Interplanetary Missions

Year
2020
Proposer -
Department
ERC subsector of the project proposer
PE8_1
Research group members
Member Category
Alessandro Zavoli Reference tutor
Abstract

This project proposes to investigate the use of Reinforcement Learning for the robust design of low-thrust trajectories in the presence of severe state and control uncertainties, as in the case of micro-spacecraft interplanetary missions.
Recent developments in on-board component miniaturization are opening up the possibility of carrying out deep-space exploration missions with small or micro-spacecraft, greatly reducing design cost and time. Unlike standard spacecraft, micro-spacecraft are characterized by a reduced orbit-control capability, larger uncertainties in state knowledge (limited radio links with ground stations on Earth) and in command execution (low-reliability components), as well as limited propellant margins and system redundancy, because of tight size and cost budgets. Therefore, the trajectory design for this kind of mission is mainly driven by its robustness to uncertainties.
Unlike traditional optimization methods, reinforcement learning provides a systematic framework for dealing with stochastic optimal control problems, where the system dynamics, or environment, can feature any kind of uncertainty and dynamical model. In reinforcement learning, a deep neural network maps the spacecraft state to the optimal control policy and the expected value function, which measures the actual trajectory performance on the basis of mission objectives and requirements. The network is trained by repeatedly interacting with a number of realizations of the environment, progressively refining the control policy so as to maximize the expected cumulative reward.
At the end of the training process, besides a reference robust trajectory, the network provides an optimal state-feedback control law. For this reason, the trained network can be deployed on board the spacecraft to provide it with autonomous guidance and control capabilities during actual orbital operations.
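The training-and-deployment loop described above can be illustrated with a minimal sketch. The example below is purely hypothetical and not part of the proposal: it uses a toy one-dimensional dynamics with actuation noise (standing in for command-execution uncertainty), a softmax policy over discretized thrust levels in place of a deep network, and a plain REINFORCE policy-gradient update in place of a modern actor-critic method. All function names (`step`, `train`, `feedback_control`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
ACTIONS = np.array([-1.0, 0.0, 1.0])  # discretized thrust levels

def step(state, thrust, exec_noise=0.1):
    """Toy 1D dynamics; additive noise mimics command-execution uncertainty."""
    pos, vel = state
    applied = thrust + exec_noise * rng.normal()
    vel = vel + 0.1 * applied
    pos = pos + 0.1 * vel
    reward = -(pos ** 2 + 0.01 * applied ** 2)  # track pos = 0, penalize effort
    return (pos, vel), reward

def features(state):
    pos, vel = state
    return np.array([pos, vel, 1.0])

def policy_probs(theta, state):
    """Softmax policy: maps state features to action probabilities."""
    logits = theta @ features(state)
    z = np.exp(logits - logits.max())
    return z / z.sum()

def rollout(theta, horizon=40):
    state = (1.0 + 0.1 * rng.normal(), 0.0)  # uncertain initial state
    traj, ret = [], 0.0
    for _ in range(horizon):
        p = policy_probs(theta, state)
        a = rng.choice(len(ACTIONS), p=p)
        state_next, r = step(state, ACTIONS[a])
        traj.append((state, a, r))
        state, ret = state_next, ret + r
    return traj, ret

def train(episodes=200, lr=0.02, gamma=0.99):
    """REINFORCE: refine the policy over many environment realizations."""
    theta = np.zeros((len(ACTIONS), 3))
    baseline = 0.0
    for _ in range(episodes):
        traj, ret = rollout(theta)
        baseline = 0.95 * baseline + 0.05 * ret  # crude running baseline
        G = 0.0
        for state, a, r in reversed(traj):
            G = r + gamma * G
            p = policy_probs(theta, state)
            phi = features(state)
            grad = -np.outer(p, phi)  # d log pi / d theta for softmax
            grad[a] += phi
            theta += lr * (G - baseline) * grad
    return theta

def feedback_control(theta, state):
    """On-board use: the trained policy acts as a state-feedback law."""
    return ACTIONS[int(np.argmax(policy_probs(theta, state)))]
```

After training, `feedback_control` can be queried in closed loop with the current state estimate, which is the sense in which a trained policy network yields autonomous guidance; a real study would replace the linear-softmax policy with a deep network and the toy dynamics with an interplanetary low-thrust model.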

ERC
PE8_1, PE1_19, PE6_7
Keywords:
AEROSPACE ENGINEERING, FLIGHT MECHANICS, OPTIMIZATION, MACHINE LEARNING, CONTROL THEORY AND OPTIMAL CONTROL

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma