Abstract: |
Data-driven agents for virtual character animation control offer great potential for both recreational and
serious games and applications. For the characters to be most effective in these settings, the behaviour
portrayed by the output animation needs to be realistic, dynamic, and responsive to live events and cues from
the user. Current state-of-the-art work in the area has shown impressive results using supervised learning to
output selected behaviours, provided sufficient motion clips are available for training, but these methods
allow for only limited dynamism. Separately, deep reinforcement learning (RL) methods have been shown to
leverage physics-engine-derived feedback and signals to generate animation portraying behaviours such as
locomotion. RL-based approaches have the potential for greater generalisation, enabling agents to learn a
wide range of behaviours efficiently and to generate dynamic animation in real time. However, current
state-of-the-art RL agents depend heavily on feedback derived from physics engines. Animation portraying
social, interactive behaviours does not elicit the physics-driven signals that would allow a policy to be
shaped effectively, making these behaviours incompatible with a physics-driven paradigm for RL agents.
In this work, we use gazing and pointing as exploratory tasks to investigate the feasibility of a paradigm
suitable for RL-based animation agents that learn latent dynamics applicable, from a modelling perspective,
to a wider range of behaviours. We introduce a framework in which agents learn the generalisable animation
dynamics required to portray different behaviours, and we propose a novel method for animation generation
based on online planning in a beta-distribution-parameterised space. Agents learn a latent dynamics model
that enables them to predict the character state and generate animation via online planning, using a
corresponding objective state to discern which candidate sequences represent the optimal animation trajectory.
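As a rough illustration of such a planning loop (a minimal sketch, not the paper's implementation: the
function names, fixed Beta parameters, and distance-based scoring below are illustrative assumptions),
candidate action sequences can be sampled from Beta distributions, rolled out through the learned latent
dynamics model, and scored against the latent objective state:

```python
import numpy as np

def plan_next_action(dynamics_model, encode, character_state, objective_state,
                     horizon=10, n_candidates=64, action_dim=4,
                     alpha=2.0, beta=2.0):
    """Sketch of online planning with beta-distributed candidate actions.
    `dynamics_model` and `encode` stand in for the learned latent dynamics
    model and state encoder; all names and defaults are illustrative."""
    z = encode(character_state)          # latent character state
    z_goal = encode(objective_state)     # latent objective state
    # Candidate action sequences in [0, 1]^action_dim, drawn from a Beta prior.
    candidates = np.random.beta(alpha, beta,
                                size=(n_candidates, horizon, action_dim))
    scores = np.empty(n_candidates)
    for i, seq in enumerate(candidates):
        z_pred = z
        for action in seq:               # roll the sequence through the model
            z_pred = dynamics_model(z_pred, action)
        # A predicted end state closer to the objective state scores higher.
        scores[i] = -np.linalg.norm(z_pred - z_goal)
    best = candidates[np.argmax(scores)]
    return best[0]                       # execute only the first action
```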
Purely through self-exploration and learned dynamics, agents created within our framework are able to output
animation that robustly completes gaze and pointing tasks while maintaining smoothness of motion, after only
a small number of training epochs. In our experimental validation, we compared the output of a trained agent
against an Inverse Kinematics (IK)-based control implementation. We found that agents created using our
method were significantly more computationally efficient, taking approximately 40 nanoseconds per frame to
generate animation, compared to 2 milliseconds for the IK-based control (roughly a 50,000-fold speed-up).
In future work, we plan to develop methodologies that use motion capture data as an external source of
information, in co-ordination with our model-based reinforcement learning training algorithm, to steer
agents towards realism during training. A video containing an overview of our work and examples of animation
output can be found at https://virtualcharacters.github.io/links/ICAART2021