Abstract: |
This paper addresses reinforcement learning problems with constant control delay, in both the known-delay and unknown-delay cases. First, we propose an algorithm for the known-delay case that is a simple extension of the model-free learning algorithm introduced by Schuitema et al. (2010). We extend it to predict the current state explicitly, and show empirically that it is more efficient than existing algorithms. Next, we consider the case in which the delay is unknown but bounded above. For this case, we propose an algorithm that uses the accuracy of state prediction. We show that this algorithm performs as efficiently as one that knows the true delay.
|