A SOLUTION FOR LEFT-RIGHT CONFUSION, OCCLUSION, AND LOST TRACKING IN REAL-TIME 3D HUMAN MOTION ESTIMATION


Abstract

  In this paper, we discuss the left-right confusion, self-occlusion, and lost-tracking phenomena in multi-Kinect-based 3D skeleton tracking. Left-right confusion appears when a Kinect captures the rear view or the side view of a user; occlusion means that a joint is hidden by an object or by other body parts; and lost tracking means that some joints cannot be detected for a while. To address these challenges in a unified framework, we first correct the confusion in the SDK skeleton by using an OpenPose skeleton. Since OpenPose extracts the joints by analyzing image content, it can differentiate the front side from the back side and thus lets us correct the left-right confusion. Next, we reconstruct a universal 3D OpenPose skeleton from multiple cameras. The joints reconstructed from OpenPose are treated as robust 3D anchors for multiple-skeleton fusion. Because the reconstruction is based on back-projection, the universal skeleton is not affected by occlusion. Last but not least, we introduce inter-joint constraints into our skeleton tracking framework so that we can trace all joints simultaneously. Unlike conventional methods that treat each joint individually, our method uses neighboring joints to predict the next joint position. Therefore, in the lost-tracking scenario, the constraint keeps the skeleton movement consistent and maintains the length between neighboring joints. We evaluate our method on challenging actions. A practical system is also built for demonstration. The experimental results show that the system tracks the skeleton stably, without error propagation or vibration. The average localization error is also smaller than that of conventional methods.

Demonstrated actions: BEND, RELOAD, SEAT, T_POSE, THROW



Challenge

kinect_3D_Challenge_1

The left-right problem (yellow marks the right side, green marks the left side)

kinect_3D_Challenge_2

The self-occlusion problem (green/yellow nodes indicate high-/low-confidence joints)



Proposed Method

kinect_3D_proposed_method
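
The left-right correction step described in the abstract can be illustrated with a minimal sketch. It assumes the SDK skeleton has already been projected into the same image plane as the OpenPose detections; the joint names, the dictionary layout, and the function `correct_left_right` are hypothetical and are not taken from the authors' implementation. The sketch simply swaps the left/right labels of the SDK skeleton when the swapped assignment agrees better with OpenPose.

```python
import math

# Hypothetical left/right joint name pairs; a real system would use its own joint IDs.
PAIRS = [
    ("l_shoulder", "r_shoulder"), ("l_elbow", "r_elbow"), ("l_wrist", "r_wrist"),
    ("l_hip", "r_hip"), ("l_knee", "r_knee"), ("l_ankle", "r_ankle"),
]


def correct_left_right(sdk_2d, openpose_2d):
    """Return a copy of sdk_2d whose left/right labels are swapped
    if the swapped labelling is closer to the OpenPose joints."""
    keep_cost = 0.0  # total distance with the SDK labels kept as they are
    swap_cost = 0.0  # total distance with left and right exchanged
    for left, right in PAIRS:
        # Only compare pairs that both skeletons actually contain.
        if not all(n in sdk_2d and n in openpose_2d for n in (left, right)):
            continue
        keep_cost += (math.dist(sdk_2d[left], openpose_2d[left])
                      + math.dist(sdk_2d[right], openpose_2d[right]))
        swap_cost += (math.dist(sdk_2d[left], openpose_2d[right])
                      + math.dist(sdk_2d[right], openpose_2d[left]))

    corrected = dict(sdk_2d)
    if swap_cost < keep_cost:
        for left, right in PAIRS:
            if left in sdk_2d and right in sdk_2d:
                corrected[left], corrected[right] = sdk_2d[right], sdk_2d[left]
    return corrected
```

Deciding on the whole skeleton at once, rather than per joint pair, is a design choice of this sketch: it avoids a skeleton in which some limbs are swapped and others are not.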


Experiment

Qualitative Evaluation

kinect_3D_QualitativeEvaluation1.jpg

The evaluation of the data correction process


kinect_3D_QualitativeEvaluation2.jpg

With the inter-joint constraint, the occluded elbow is localized well (yellow skeleton). Without the constraint, the error accumulates (red skeleton).
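
One simple way to picture an inter-joint constraint is a bone-length projection, sketched below. This is an illustrative stand-in, not the authors' tracker: the functions `enforce_bone_length` and `predict_lost_joint`, and the idea of carrying over the previous parent-to-child offset, are assumptions made for this example.

```python
import math


def enforce_bone_length(parent, child, bone_length):
    """Move `child` along the parent-to-child direction so that the
    distance between the two joints equals `bone_length`."""
    offset = [c - p for c, p in zip(child, parent)]
    norm = math.sqrt(sum(v * v for v in offset))
    if norm == 0.0:
        return tuple(child)  # direction undefined; leave the joint unchanged
    scale = bone_length / norm
    return tuple(p + v * scale for p, v in zip(parent, offset))


def predict_lost_joint(parent_now, parent_prev, child_prev, bone_length):
    """Predict a lost child joint from its tracked neighbor (the parent):
    reuse the previous parent-to-child offset, then restore the bone length."""
    guess = tuple(pn + (cp - pp)
                  for pn, cp, pp in zip(parent_now, child_prev, parent_prev))
    return enforce_bone_length(parent_now, guess, bone_length)
```

For example, `predict_lost_joint(shoulder_now, shoulder_prev, elbow_prev, upper_arm_length)` would keep an occluded elbow attached to the tracked shoulder instead of letting it drift.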

kinect_3D_QualitativeEvaluation3.jpg

The key joints of a skeleton and their IDs



Quantitative Evaluation

kinect_3D_five_actions.jpg

Five military training actions. Dark points are the point cloud; red points are the ground truth of the 3D joints.


kinect_key_joints.jpg

The key joints of a skeleton and their IDs


MEAN LOCALIZATION ERROR FOR DIFFERENT JOINTS OVER THE FIVE TESTED ACTIONS (UNIT: CENTIMETERS, CM).

kinect_3D_table_1.jpg


PERFORMANCE COMPARISON IN TERMS OF THE PERCENTAGE OF CORRECTLY-TRACKED JOINTS (P-CTJ) AND THE MEAN LOCALIZATION ERROR (MLE).

kinect_3D_table_2.jpg
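
For readers who want to reproduce the two metrics named above, the following is a minimal sketch of how MLE and P-CTJ could be computed from estimated and ground-truth 3D joints given in centimeters. The page does not state the distance threshold used to count a joint as correctly tracked, so `threshold_cm=10.0` is a placeholder assumption, and both function names are hypothetical.

```python
import math


def mean_localization_error(estimated, ground_truth):
    """Average Euclidean distance between matching joints (same unit as input)."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("need two non-empty joint lists of equal length")
    errors = [math.dist(e, g) for e, g in zip(estimated, ground_truth)]
    return sum(errors) / len(errors)


def percent_correctly_tracked(estimated, ground_truth, threshold_cm=10.0):
    """Percentage of joints whose error is within `threshold_cm`.
    The default threshold is an assumption, not a value from the paper."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("need two non-empty joint lists of equal length")
    hits = sum(1 for e, g in zip(estimated, ground_truth)
               if math.dist(e, g) <= threshold_cm)
    return 100.0 * hits / len(estimated)
```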