This paper proposes a camera-work method for teacher tracking that continuously captures a zoomed-in image of the teacher in a remote lecture system.
The purpose of this research is to capture a zoomed-in image of the teacher. Because the teacher walks around the lecture room, the teacher's position must be measured so that pan-tilt cameras can follow the teacher by rotating toward where the teacher stands. The position is measured by detecting the 2-D coordinates of the teacher in two camera images and then computing the 3-D coordinates from the pan-tilt angles of the cameras by triangulation.
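The triangulation step can be sketched as follows. This is a minimal illustration, not the thesis's implementation: each pan-tilt camera at a known position yields a viewing ray from its pan and tilt angles, and the teacher's 3-D position is estimated as the midpoint of the shortest segment between the two rays. The function names and the angle convention (pan about the vertical axis, tilt as elevation, both in radians) are assumptions.

```python
import numpy as np

def ray_direction(pan, tilt):
    """Unit viewing direction from pan (azimuth) and tilt (elevation)."""
    return np.array([
        np.cos(tilt) * np.cos(pan),
        np.cos(tilt) * np.sin(pan),
        np.sin(tilt),
    ])

def triangulate(c1, pan1, tilt1, c2, pan2, tilt2):
    """Estimate a 3-D point from two viewing rays.

    c1, c2 are the 3-D camera positions; the result is the midpoint of
    the shortest segment between the rays (they rarely intersect exactly).
    """
    d1 = ray_direction(pan1, tilt1)
    d2 = ray_direction(pan2, tilt2)
    # Solve for ray parameters s, t minimizing |(c1 + s*d1) - (c2 + t*d2)|:
    # the connecting segment must be orthogonal to both ray directions.
    a = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    b = np.array([(c2 - c1) @ d1, (c2 - c1) @ d2])
    s, t = np.linalg.solve(a, b)
    return ((c1 + s * d1) + (c2 + t * d2)) / 2.0
```

With two cameras on the x-axis both looking at a point two meters in front of the line between them, the midpoint recovers that point.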
We propose a new method that computes a difference image using two cameras to detect the 2-D teacher position. Our method extracts the teacher region correctly even under varying luminance and while a camera rotates.
Multiple pan-tilt cameras are used to track the teacher. We assume that exactly one teacher walks around the lecture room and that the students remain seated.
A conventional background-subtraction method could be used to detect the teacher position in the camera image: the teacher region is extracted as the difference region. However, when the luminance varies or the camera rotates, the difference region contains not only the teacher region but also other regions, and the teacher region cannot be extracted. To extract the teacher region from the camera image at time t, the background image B(t), i.e. the image that would be observed at time t if the teacher were not in the lecture room, is essential. The conventional method uses a background image B(t0), captured at a time t0 when the teacher is not in the room, in place of B(t). Thus, when B(t0) differs from B(t) because of luminance variation, the teacher region fails to be extracted. If the background image at time t itself were available, the teacher region could be extracted correctly. Our key idea is to generate the background image at time t from another camera at a distant viewpoint.
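For concreteness, the conventional method can be sketched as below. This is a minimal array-based illustration, not the thesis's code; the fixed `threshold` is an assumption. The usage note shows exactly the failure mode described above: a uniform luminance change makes every pixel differ from the stored background, so the whole frame is flagged as "teacher".

```python
import numpy as np

def background_difference(frame, background, threshold=30):
    """Classic background subtraction: a pixel is foreground when it
    differs from the stored background image by more than `threshold`.
    Both inputs are 8-bit grayscale arrays of the same shape."""
    # Widen to int16 so the subtraction cannot wrap around in uint8.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold
```

If the whole room brightens (e.g. lights switched on) after the background B(t0) was stored, every pixel exceeds the threshold and the method no longer isolates the teacher.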
Our proposed method converts the image of one camera (call it camera B) into the image seen from the viewpoint of the other (camera A), and computes the difference image between the converted image and camera A's image. The method uses a 3-D model of the still objects in the lecture room.
To convert camera B's image into camera A's viewpoint, the method computes, for each pixel p_B of camera B's image, the point P on a still object that is mapped into p_B, and then computes the pixel of camera A's viewpoint that represents P. Assigning the value of p_B to that pixel yields the converted image.
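The conversion can be sketched as a forward warp over camera B's pixels. This is only an illustrative skeleton: `backproject_b` (pixel of camera B to a 3-D point on the still-object model) and `project_a` (3-D point to a pixel of camera A) are hypothetical callbacks standing in for the thesis's calibrated 3-D model and camera geometry.

```python
import numpy as np

def convert_viewpoint(image_b, backproject_b, project_a, shape_a):
    """Warp camera B's image into camera A's viewpoint.

    backproject_b(u, v) -> 3-D point P on the still-object model that is
        mapped into pixel (u, v) of camera B's image (hypothetical).
    project_a(P) -> pixel (u, v) of camera A's image representing P
        (hypothetical).
    """
    converted = np.zeros(shape_a, dtype=image_b.dtype)
    h, w = image_b.shape[:2]
    for v in range(h):
        for u in range(w):
            point = backproject_b(u, v)   # point on a still object
            ua, va = project_a(point)     # its pixel in camera A's view
            if 0 <= va < shape_a[0] and 0 <= ua < shape_a[1]:
                converted[va, ua] = image_b[v, u]
    return converted
```

The difference image is then the per-pixel comparison of `converted` with camera A's actual image; a real implementation would also handle holes left by the forward warp, which this sketch ignores.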
Suppose a pixel p_B of camera B's image actually shows the teacher. It is nevertheless converted under the assumption that p_B shows a point P on a still object: P is the point that would be mapped into p_B if the teacher were absent, and it is hidden by the teacher from camera B's viewpoint. Because camera A's viewpoint differs from camera B's, P is not hidden from camera A and corresponds to some pixel p_A of camera A's image. The difference region therefore contains p_A, since the converted image shows the teacher there while camera A's own image shows a point on a still object. Conversely, when a pixel of camera A's image shows the teacher, the corresponding pixel of the converted image shows a point on a still object, so the difference region again contains that pixel. The teacher region is thus extracted as the difference region.
To extract the teacher region correctly, the same still objects must be visible from both cameras. Based on this condition, we propose a method to select the camera pair dynamically from among the multiple cameras.
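One plausible realization of this selection, sketched below, is to pick the pair of cameras that share the most visible still-object model points, so the viewpoint conversion is defined over as much of the image as possible. This is an assumption for illustration; the abstract does not give the exact selection criterion, and `visible` is a hypothetical predicate derived from the 3-D model and the current pan-tilt angles.

```python
def select_camera_pair(cameras, model_points, visible):
    """Return the indices (i, j) of the camera pair that sees the most
    still-object model points in common.

    visible(cam, p) -> True when point p of the still-object model is in
    cam's current field of view (hypothetical predicate).
    """
    best, best_shared = None, -1
    for i in range(len(cameras)):
        for j in range(i + 1, len(cameras)):
            shared = sum(1 for p in model_points
                         if visible(cameras[i], p) and visible(cameras[j], p))
            if shared > best_shared:
                best, best_shared = (i, j), shared
    return best
```

In a real system the pair would be re-evaluated as the cameras rotate to follow the teacher, since the shared field of view changes with the pan-tilt angles.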
We evaluated this method experimentally. For the conversion to be valid, the same point on the model must have the same pixel value in both camera A and camera B; in our lecture room, the pixel values observed by the two cameras do not differ greatly. The method extracts the teacher region even when the luminance varies or the camera rotates, and measures the teacher position correctly. As future work, we must address the problem that the same point may have different pixel values when sunlight enters through the windows and reflects off the still objects.