In this article, we propose a method for estimating the pose of a human body using a human model. The human body whose pose is estimated is represented in the computer as voxel data.
There have been attempts to construct virtual worlds for applications such as distance education and teleconferencing. To create a virtual world, the shapes of real objects have to be represented in a computer so that they can be recognized easily. Here, the shapes of objects are represented as voxel data.
Voxel data can be obtained with a multi-stereo method that takes images from multiple cameras as input. However, the voxel data obtained this way are in many cases of low resolution, owing to limited computing performance and camera calibration errors.
We propose to estimate the pose of a human body from low-resolution voxel data using a predefined human model.
We represent a human body as an articulated object in order to describe its pose. The human model consists of several nodes arranged in a tree structure. Each node holds information about the shape of the corresponding body part, the joint that connects the part to its parent part, and its axes of rotation. By modeling a human body in this way, we can describe its pose uniquely by determining the scale, position, and direction of the body and the rotation angle of each joint. Estimating the pose of a human body is therefore equivalent to estimating these parameters from the voxel data. The parameters are estimated by the method described below.
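The tree-structured model described above can be pictured with a small data structure. The following is only an illustrative sketch: the class name `Node`, its field names, the box-shaped parts, and the three example body parts are all assumptions made for this example, not the model definition used in the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One body part in a tree-structured articulated model (hypothetical layout)."""
    name: str
    shape: tuple            # assumed: box size (width, depth, height) of the part
    joint_offset: tuple     # assumed: joint position relative to the parent part
    rotation_axes: tuple    # axes the joint can rotate about, e.g. ("x", "y")
    children: list = field(default_factory=list)


def joint_parameters(node):
    """Yield (part name, axis) for every rotation angle in the tree, parent first."""
    for axis in node.rotation_axes:
        yield (node.name, axis)
    for child in node.children:
        yield from joint_parameters(child)


# A tiny example tree: torso -> left upper arm -> left forearm.
torso = Node("torso", (0.4, 0.2, 0.6), (0.0, 0.0, 0.0), ())
upper_arm = Node("left_upper_arm", (0.1, 0.1, 0.3), (0.2, 0.0, 0.5), ("x", "y", "z"))
forearm = Node("left_forearm", (0.08, 0.08, 0.25), (0.0, 0.0, -0.3), ("x",))
upper_arm.children.append(forearm)
torso.children.append(upper_arm)

params = list(joint_parameters(torso))
```

In this sketch the pose would be fully described by the scale, position, and direction of the root part plus one angle for each entry in `params`.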
The scale of the human body is determined from its stature, that is, the height from the top of the head to the feet. This height can be estimated from the area occupied by the body on each horizontal slice of the voxel data, by comparing these areas with the distribution of slice areas for a typical human body.
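The slice-area idea can be sketched as follows. This is a simplified illustration, not the procedure of the thesis: it assumes voxels are given as integer `(x, y, z)` triples with `z` vertical, and it replaces the comparison against a typical area distribution with a plain minimum-area threshold. The names `slice_areas`, `estimate_stature`, and `min_area` are hypothetical.

```python
from collections import Counter


def slice_areas(voxels):
    """Count occupied voxels on each horizontal slice (z is assumed vertical)."""
    return Counter(z for _, _, z in voxels)


def estimate_stature(voxels, voxel_size, min_area=1):
    """Estimate body height as the span of slices whose area reaches min_area.

    Simplification: a fixed threshold stands in for the comparison with the
    area distribution of a typical human body.
    """
    areas = slice_areas(voxels)
    occupied = [z for z, area in areas.items() if area >= min_area]
    if not occupied:
        raise ValueError("no slice reaches min_area")
    return (max(occupied) - min(occupied) + 1) * voxel_size


# A column of 10 voxels with one extra voxel on slice 3.
voxels = {(0, 0, z) for z in range(10)} | {(1, 0, 3)}
areas = slice_areas(voxels)
stature = estimate_stature(voxels, voxel_size=0.05)
```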
The position and direction of a human body can be estimated from those of its upper body. The cross-section of the upper body on each slice is approximately elliptical. The slices that belong to the upper body are determined in the same way as the head and the feet are located. We fit an ellipse to the cross-section of the upper body, and take the center of the ellipse as the position of the upper body and the direction of its major axis as its orientation.
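One common way to obtain the center and major-axis direction of an elliptical region is from its first and second moments. The text does not say which fitting method is used, so the moment-based approach below is an assumption, as are the function name `fit_ellipse_moments` and the representation of a slice as a list of occupied `(x, y)` cells.

```python
import math


def fit_ellipse_moments(cells):
    """Return (center, major-axis angle in radians) of a 2D set of cells.

    The center is the centroid; the angle is the direction of the principal
    axis of the second-moment (covariance) matrix.
    """
    n = len(cells)
    if n == 0:
        raise ValueError("empty slice")
    cx = sum(x for x, _ in cells) / n
    cy = sum(y for _, y in cells) / n
    sxx = sum((x - cx) ** 2 for x, _ in cells) / n
    syy = sum((y - cy) ** 2 for _, y in cells) / n
    sxy = sum((x - cx) * (y - cy) for x, y in cells) / n
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (cx, cy), theta


# A 9 x 3 block of cells, elongated along the x axis.
wide = [(x, y) for x in range(9) for y in range(3)]
center, angle = fit_ellipse_moments(wide)

# The same block turned by 90 degrees, elongated along the y axis.
tall = [(y, x) for x, y in wide]
_, angle_tall = fit_ellipse_moments(tall)
```

The major-axis direction is only defined up to 180 degrees, so deciding which way the body faces would need additional information.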
The rotation angle of each joint is estimated by matching the voxel data against a voxelized human model. We sample each rotation angle within its possible range and change the pose of the human model according to the sampled angles. We then voxelize the model at the same resolution as the voxel data and compare the two.
The amount of overlap between the voxel data and the voxelized model is used as the criterion for evaluating a match: the larger the overlap, the more similar we consider the pose of the human model to be to that of the voxel data.
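The sample, voxelize, and compare loop can be illustrated with a deliberately small 2D toy, in which a single limb is a line segment rotating about one joint. Everything here is an assumption for illustration: the functions `voxelize_limb`, `overlap`, and `best_angle`, the 2D grid, and the line-segment limb are not taken from the thesis.

```python
import math


def voxelize_limb(origin, angle_deg, length, resolution):
    """Return the set of grid cells touched by a limb drawn as a line segment."""
    ox, oy = origin
    a = math.radians(angle_deg)
    # Oversample along the segment so that no crossed cell is skipped.
    steps = max(1, int(length / resolution) * 4)
    cells = set()
    for i in range(steps + 1):
        t = length * i / steps
        x = ox + t * math.cos(a)
        y = oy + t * math.sin(a)
        cells.add((math.floor(x / resolution), math.floor(y / resolution)))
    return cells


def overlap(a, b):
    """Matching criterion: number of cells occupied in both sets."""
    return len(a & b)


def best_angle(observed, origin, length, resolution, candidates):
    """Try every sampled angle and keep the one with the largest overlap."""
    return max(
        candidates,
        key=lambda ang: overlap(observed, voxelize_limb(origin, ang, length, resolution)),
    )


# "Observed" data: a limb pointing straight up from the joint.
observed = voxelize_limb((0.5, 0.5), 90, 5.0, 1.0)
estimate = best_angle(observed, (0.5, 0.5), 5.0, 1.0, range(0, 181, 30))
```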
When we voxelize the human model, we take into account a characteristic of the voxel data that is related to its resolution: a human body occupies a larger volume when it is represented as voxel data. We therefore voxelize the human model so that it shares this characteristic.
To reduce the total number of voxel matching operations, we estimate the rotation angles node by node in sequence instead of estimating all the rotation angles at once. The order in which the nodes are estimated is chosen so that the estimation result for each node does not interfere with the others.
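The saving from sequential estimation can be seen in a short sketch: with k samples for each of n angles, an exhaustive search needs k^n matching operations, while a sequential search needs only k times n. The function `estimate_sequentially`, the joint names, and the toy score below are assumptions for illustration; in particular, the score is assumed to be meaningful for a partially specified pose.

```python
def estimate_sequentially(joint_order, candidates, score):
    """Fix one joint angle at a time, keeping earlier estimates unchanged.

    Returns the estimated pose and the number of score evaluations used.
    """
    pose = {}
    evaluations = 0
    for joint in joint_order:
        best_angle, best_score = None, float("-inf")
        for angle in candidates[joint]:
            trial = {**pose, joint: angle}
            s = score(trial)
            evaluations += 1
            if s > best_score:
                best_angle, best_score = angle, s
        pose[joint] = best_angle
    return pose, evaluations


# Toy score: closeness to a known target pose, counting only the joints set so far.
target = {"shoulder": 30, "elbow": 60}


def toy_score(pose):
    return -sum(abs(pose[j] - target[j]) for j in pose)


samples = {"shoulder": range(0, 91, 30), "elbow": range(0, 91, 30)}
pose, evaluations = estimate_sequentially(["shoulder", "elbow"], samples, toy_score)
```

Here four samples per joint give 8 evaluations instead of the 16 an exhaustive search over both angles would need. The sketch also shows the weakness of the approach: a wrong choice for an early joint is never revisited.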
We conducted an experiment to show the validity of our method, using voxel data prepared at several resolutions. The results show that the method estimates the correct pose in most cases but fails in some. One of the main causes of failure is that an error in estimating one rotation angle influences the estimation of the other rotation angles.