Abstract: |
Reinforcement learning (RL) is a machine learning technique in which an autonomous agent uses the rewards received from its interactions with an initially unknown Markov decision process (MDP) to converge to an optimal policy, i.e., a mapping from MDP states to actions that maximises the rewards obtained. Although successfully used in applications ranging from gaming to robotics, standard RL is not applicable to problems where the policies learned by the agent must satisfy strict constraints associated with safety, reliability, performance and other critical aspects of the problem. Our project addresses this significant limitation of standard RL by integrating it with probabilistic model checking, thus extending the applicability of the technique to mission-critical and safety-critical systems.