
Factored Conditional Restricted Boltzmann Machines
for Modeling Motion Style

Graham W. Taylor and Geoffrey E. Hinton

The Conditional Restricted Boltzmann Machine (CRBM) is a recently proposed model for time series that has a rich, distributed hidden state and permits simple, exact inference. We present a new model, based on the CRBM, that preserves its most important computational properties and includes multiplicative three-way interactions that allow the effective interaction weight between two units to be modulated by the dynamic state of a third unit. We factorize the three-way weight tensor implied by the multiplicative model, reducing the number of parameters from O(N^3) to O(N^2). The result is an efficient, compact model whose effectiveness we demonstrate by modeling human motion. Like the CRBM, our model can capture diverse styles of motion with a single set of parameters, and the three-way interactions greatly improve the model's ability to blend motion styles or to transition smoothly between them.
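To make the parameter reduction concrete, here is a minimal sketch in plain Python. It assumes the three-way tensor is written as a sum over factors of products of three factor matrices, which is one common way to factor such a tensor; the names A, B, C, the layer sizes, and the choice of N factors are illustrative assumptions, not values from the paper or the Matlab code.

```python
# Illustrative sketch only: names and sizes are hypothetical.

def full_param_count(n_i, n_j, n_k):
    """Number of entries in an unfactored three-way tensor W[i][j][k]."""
    return n_i * n_j * n_k


def factored_param_count(n_i, n_j, n_k, n_f):
    """Entries in three factor matrices: A (n_i x n_f), B (n_j x n_f), C (n_k x n_f)."""
    return (n_i + n_j + n_k) * n_f


def reconstruct_weight(A, B, C, i, j, k):
    """Implied three-way weight: W[i][j][k] = sum_f A[i][f] * B[j][f] * C[k][f]."""
    return sum(a * b * c for a, b, c in zip(A[i], B[j], C[k]))


# With N units per group and N factors (an assumption), the count drops
# from N^3 to 3 * N^2.
N = 100
full = full_param_count(N, N, N)
factored = factored_param_count(N, N, N, N)

# Tiny worked example with 2 units per group and 2 factors.
A = [[1.0, 2.0], [0.5, -1.0]]
B = [[1.0, 0.0], [2.0, 1.0]]
C = [[1.0, 1.0], [0.0, 3.0]]
w_011 = reconstruct_weight(A, B, C, 0, 1, 1)  # 1*2*0 + 2*1*3 = 6.0
```

The full tensor is never stored. Any single weight can be recovered from the three factor matrices when needed.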

Download paper in PDF format. (Pre-print)
Download supplementary material. (Details on learning and free energy derivation)
Matlab source code

Videos

These videos show sample motion synthesized by the models reported in the Experiments section of the paper.

The videos were encoded using H.264, so viewing them with the integrated player requires a relatively recent version of Adobe Flash Player: version 9 Update 3 (v9.0.115.0, released on December 3, 2007) or later.

Experiments Section 4.1.1 Standard 1-layer CRBM

Here we show that a single CRBM can capture many different styles of walking and generate each of them, depending on how it is initialized. We use 1200 hidden units, a sparseness penalty on the hidden units, and 12 delay taps for this 60fps motion. The model was trained on data from subject 137 in the CMU Motion Capture Database.
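As a rough illustration of what 12 delay taps mean at 60fps, the sketch below pairs each frame with the concatenation of its 12 predecessors, which amounts to 0.2 seconds of history. The function name, the oldest-first ordering, and the toy 3-dimensional frames are assumptions made for the example. The real data representation is described in the paper.

```python
# Illustrative sketch only: toy data, hypothetical names.

FPS = 60
N_TAPS = 12
history_seconds = N_TAPS / FPS  # 12 frames at 60fps cover 0.2 s


def history_windows(frames, n_taps):
    """Pair each frame t >= n_taps with its n_taps previous frames, flattened oldest first."""
    pairs = []
    for t in range(n_taps, len(frames)):
        history = []
        for past in frames[t - n_taps:t]:
            history.extend(past)
        pairs.append((history, frames[t]))
    return pairs


# Toy sequence: 20 frames, each a 3-dimensional pose vector.
frames = [[float(t), float(t) + 0.1, float(t) + 0.2] for t in range(20)]
pairs = history_windows(frames, N_TAPS)
first_history, first_target = pairs[0]
```

Each pair holds a conditioning vector of length 12 x 3 = 36 and the frame it precedes. The first 12 frames have no complete history and serve only as context.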


Experiments Section 4.1.2 Standard 2-layer CRBM

A standard 2-layer CRBM model with 600 hidden units per layer can capture the same styles as the 1-layer CRBM and, in addition, generate "old man" style walking. The style that is generated is determined purely by initialization. Note that there is more variety in the movements compared to the 1-layer CRBM (this is apparent in the "cat" sequence), but there is still a spurious (and non-smooth) transition, this time in the "gangly" sequence.


Experiments Section 4.2 Style-gated Factored CRBM (discrete labels)

We use a factored CRBM with 3-way, multiplicative interactions to synthesize transitions and blends between two styles. Note that the training data contains neither transitions nor blends.
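One plausible way to request a blend or a transition from a label-gated model is to interpolate between two one-hot style labels. The sketch below assumes exactly that. The one-hot encoding, the linear schedule, and all names are illustrative assumptions rather than the procedure confirmed on this page.

```python
# Illustrative sketch only: assumes one-hot style labels and linear interpolation.

def one_hot(index, n_styles):
    """Discrete style label as a one-hot vector."""
    return [1.0 if s == index else 0.0 for s in range(n_styles)]


def blend_labels(label_a, label_b, alpha):
    """Convex combination: alpha = 0 gives label_a, alpha = 1 gives label_b."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(label_a, label_b)]


def transition_schedule(label_a, label_b, n_frames):
    """One label vector per generated frame, moving linearly from label_a to label_b."""
    if n_frames < 2:
        raise ValueError("a transition needs at least two frames")
    return [blend_labels(label_a, label_b, t / (n_frames - 1))
            for t in range(n_frames)]


style_a = one_hot(0, 4)
style_b = one_hot(2, 4)
half_blend = blend_labels(style_a, style_b, 0.5)       # fixed 50/50 blend
schedule = transition_schedule(style_a, style_b, 5)    # 5-frame transition
```

Holding the blended label fixed would ask for a blend of the two styles. Stepping through the schedule during generation would ask for a transition.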


Experiments Section 4.3 Style-gated Factored CRBM (real-valued labels)

Here the architecture is identical to that in Section 4.2, except that instead of gating interactions by real-valued features connected to discrete labels, we gate directly by real-valued style variables. In our experiment, we use two style variables: stride length and speed. This video shows real-time online generation at 30fps (the video itself is only 10fps). We can manipulate the stride length and speed variables during generation, which changes the effective weights in the model. The hidden units and motion respond to the changing of these variables, producing smooth interpolation and extrapolation beyond the 9 discrete settings of the variables with which the model was trained.
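The statement that the style variables change the effective weights can be illustrated with a small sketch. It assumes a factored form in which each factor is scaled by a linear function of the style vector z = [stride length, speed]. The matrices A, B, C and their values are invented for the example and are not the trained parameters.

```python
# Illustrative sketch only: toy factor matrices, hypothetical names.

def style_gains(C, z):
    """Per-factor gain g[f] = sum_k C[k][f] * z[k], where z holds the style variables."""
    n_factors = len(C[0])
    return [sum(C[k][f] * z[k] for k in range(len(z))) for f in range(n_factors)]


def effective_weights(A, B, C, z):
    """Effective pairwise weights W[i][j] = sum_f A[i][f] * B[j][f] * g[f] for style z."""
    g = style_gains(C, z)
    return [[sum(A[i][f] * B[j][f] * g[f] for f in range(len(g)))
             for j in range(len(B))]
            for i in range(len(A))]


A = [[1.0, 0.0], [0.0, 1.0]]   # unit group 1 to factors
B = [[1.0, 1.0], [2.0, 0.0]]   # unit group 2 to factors
C = [[1.0, 0.0], [0.0, 1.0]]   # style variables to factors

# Same parameters, two settings of (stride length, speed).
W_short = effective_weights(A, B, C, [0.5, 2.0])  # [[0.5, 1.0], [2.0, 0.0]]
W_long = effective_weights(A, B, C, [1.0, 2.0])   # [[1.0, 2.0], [2.0, 0.0]]
```

Because z enters continuously, values between or outside the training settings still define a valid set of effective weights. This is consistent with the interpolation and extrapolation described above.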
