Articulated-Pose Estimation using Brightness- and Depth-Constancy Constraints

Michele Covell, Ali Rahimi, Michael Harville, Trevor Darrell

This paper explores articulated-pose estimation, assuming that video-rate depth information is available, from either stereo cameras or other sensors. We use these depth measurements in the traditional linear brightness constraint equation, as well as in a similar constraint equation on depth, called the depth constraint equation. We introduce "shifted" constraint equations that allow for larger motions without requiring iterative estimation. The resulting constraint equations are linear in a modified parameter set. After solving these linear constraints, a single closed-form non-linear transformation returns the updates to the original pose parameters.
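
For orientation, the standard first-order forms of the two constraints can be sketched as follows. The notation is an assumption made for this illustration and is not taken from the paper: I is brightness, Z is depth, subscripts denote partial derivatives, (v_x, v_y) is the image-plane velocity, (V_X, V_Y, V_Z) is the 3-D velocity, and f is the focal length. The paper's articulated-pose parameterization and its "shifted" variants are not shown.

```latex
\begin{align*}
% Brightness constancy, linearized to first order (brightness constraint):
I(x, y, t) &= I(x + v_x,\; y + v_y,\; t + 1)
  &&\Rightarrow\quad I_x v_x + I_y v_y + I_t = 0 \\
% Depth analogue: depth changes only by the motion V_Z along the optical axis:
Z(x, y, t) + V_Z &= Z(x + v_x,\; y + v_y,\; t + 1)
  &&\Rightarrow\quad Z_x v_x + Z_y v_y + Z_t = V_Z \\
% Perspective projection x = fX/Z, y = fY/Z ties image velocity to 3-D velocity
% through the measured depth Z:
v_x &= \frac{f V_X - x V_Z}{Z}, &
v_y &= \frac{f V_Y - y V_Z}{Z}
\end{align*}
```

Substituting the last line into the first two gives constraints that are linear in the 3-D velocity once Z is measured, which is one way to see why measured depth matters for the brightness constraint.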

Our tracking results, on both synthetic and real data (see below), argue strongly for using video-rate depth information, from either stereo cameras or other sensors. When we estimated depths instead of using the true depth information, tracking became unstable and failed catastrophically within the duration of our test sequences.

Table 1: Movies of real-data tracking results

This table contains links to movies showing our tracking results on real data. The position and pose of the articulated model are shown superimposed on the input sequence. All of these real-data results use shifted FOEs.
Constraint equations:
BCCE without depth  BCCE (with depth)   ZCCE    BCCE + ZCCE
BCCE wo/depth       BCCE w/depth        ZCCE    BCCE+ZCCE

Table 2: Movies of synthetic-data tracking results

This table contains links to movies showing our tracking results on synthetic data. The input brightness image is in the lower-right quadrant. The limb motions, both "true" and "estimated", are shown in the other three quadrants, with the true pose in cyan and the estimated pose in red. The lower-left quadrant shows the limb motions as seen from the camera's location, the upper-left quadrant shows them from overhead, and the upper-right quadrant shows them from the side.
Constraint equations (columns 3 to 6); a hyphen marks a combination with no movie listed:
Img. seq.  FOE        BCCE without depth  BCCE (with depth)   ZCCE    BCCE + ZCCE
slow jump  unshifted  BCCE wo/depth       BCCE w/depth        ZCCE    BCCE+ZCCE
slow walk  unshifted  BCCE wo/depth       BCCE w/depth        ZCCE    BCCE+ZCCE
fast jump  unshifted  -                   BCCE w/depth        ZCCE    BCCE+ZCCE
fast walk  unshifted  -                   BCCE w/depth        ZCCE    BCCE+ZCCE
fast jump  shifted    -                   BCCE w/depth        ZCCE    BCCE+ZCCE
fast walk  shifted    -                   BCCE w/depth        ZCCE    BCCE+ZCCE

Table 3: Summary of tracking errors for body orientation and for body, wrist and ankle position on synthetic-data sequences.

Body rotation errors are in degrees. All position errors are in percentage of one body height. Wrist and ankle position errors reflect the combined errors in torso, upper-limb, and lower-limb orientation only: the effects of body-position errors have been removed from them. For each of these, both the mean and the maximum errors are shown.
B wo/Z = BCCE without true depth; B w/Z = BCCE with true depth; Z = ZCCE only; B+Z = BCCE and ZCCE
Error in:                      Body rotation Body position   Right wrist    Left wrist   Right ankle    Left ankle
Img. seq.  FOE        Eqs       mean   max    mean   max    mean   max    mean   max    mean   max    mean   max
slow jump  unshifted  B wo/Z     131   197      44    73      68    90      72    92      64    91      12    19
slow jump  unshifted  B w/Z        5    15       2    15       4    15       4    11       4    16       2     5
slow jump  unshifted  Z            4     7       3    14      14    22       7    11       6    12       1     2
slow jump  unshifted  B+Z          2     4     0.2   0.5       7    14       4     5       4     9     0.4     1
slow walk  unshifted  B wo/Z      71   102      73   135      25    42      27    41      44    76      22    34
slow walk  unshifted  B w/Z        2     5       3     8       2     7      37    57       6    10       3     7
slow walk  unshifted  Z            2     5       1     2       4     8      17    36       4    10      27    32
slow walk  unshifted  B+Z        0.3     1     0.2   0.4       3     6       5    10       6    13       7    11
fast jump  unshifted  B w/Z       23    57       9    20      35    61      27    43      30    56      10    23
fast jump  unshifted  Z           14    27       7    18      36    52      34    45      28    46       4    13
fast jump  unshifted  B+Z         12    27       6    14      37    58      33    53      31    48       5    13
fast jump  shifted    B w/Z       11    23      11    29      11    26      11    20      17    39       6    13
fast jump  shifted    Z            2     4       5    13      14    28       8    16      14    27       2     9
fast jump  shifted    B+Z          2     6       2     5      10    20       7    13      18    33       2     4
fast walk  unshifted  B w/Z       17    41      20    48      10    19      46    67      39    63      27    37
fast walk  unshifted  Z            2     6       2     5      32    57      35    60      31    54      33    45
fast walk  unshifted  B+Z          2     4       3     5      28    51      32    58      37    57      38    55
fast walk  shifted    B w/Z        5    13      13    28       6    15      15    29      13    29      26    34
fast walk  shifted    Z            1     3       6    11       1     6       5    13       3     9       6    13
fast walk  shifted    B+Z          2     4       2     3       1     2       4    11       3     7      10    15
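
The position-error statistics described for Table 3 could be computed along the following lines. This is a minimal sketch, not the authors' evaluation code: it assumes per-frame 3-D positions, Euclidean distance as the error measure, and that "removing the effects of body-position errors" means comparing joint positions relative to the body position in each frame. The function names (`error_summary`, `relative`, `limb_error_summary`) are hypothetical.

```python
import math
from statistics import mean


def error_summary(true_pts, est_pts, body_height):
    """Return (mean, max) per-frame Euclidean error as a percentage of body height.

    Assumption: Euclidean distance is the error measure; the source does not say.
    """
    pct = [100.0 * math.dist(t, e) / body_height
           for t, e in zip(true_pts, est_pts)]
    return mean(pct), max(pct)


def relative(points, origins):
    """Express each frame's point relative to that frame's origin (body position)."""
    return [tuple(p - o for p, o in zip(pt, org))
            for pt, org in zip(points, origins)]


def limb_error_summary(true_joint, est_joint, true_body, est_body, body_height):
    """Wrist or ankle error with the body-position error subtracted out.

    Assumption: subtracting each pose's own body position is one plausible way
    to isolate the error caused by torso and limb orientation.
    """
    return error_summary(relative(true_joint, true_body),
                         relative(est_joint, est_body),
                         body_height)


if __name__ == "__main__":
    # Made-up three-frame example, for illustration only.
    true_body = [(0.0, 0.0, 0.0)] * 3
    est_body = [(0.1, 0.0, 0.0)] * 3
    true_wrist = [(0.0, 1.0, 0.0)] * 3
    est_wrist = [(0.1, 1.0, 0.0), (0.1, 1.03, 0.04), (0.1, 1.0, 0.1)]
    print(error_summary(true_body, est_body, 2.0))
    print(limb_error_summary(true_wrist, est_wrist, true_body, est_body, 2.0))
```

Working in body-relative coordinates means a constant body-position offset contributes nothing to the wrist and ankle figures, which matches the table's statement that those columns reflect orientation errors only.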