Multiple People Tracking across Multiple Views via Partial Relaxation of Spatio-Temporal Cue and Utilization of Route Cue

In shopping malls, it is beneficial to find similar spots and reallocate store positions to increase sales. For this purpose, the "movement histories" of many pedestrians are valuable information. A movement history is the sequence of locations a person visited, together with the time of each visit. Acquiring movement histories manually requires enormous time, labor, and cost, so estimating them automatically is desirable.

We present a method for estimating movement histories from videos taken by existing fixed cameras. We assume that these cameras are typically installed with non-overlapping views. Therefore, to estimate movement histories, we need to re-identify pedestrians across adjacent views by matching tracklets, which are obtained by tracking pedestrians within each view. Here, adjacent views are camera views connected by a direct path that does not pass through any other camera view. Once re-identification between all adjacent views is achieved correctly, the movement history of each pedestrian can be estimated by traversing, in time order, the camera views in which that person was observed.
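As a minimal sketch of the last step, assuming re-identification has already assigned an identity label to every tracklet, a movement history can be assembled by sorting one person's tracklets by time. The `Tracklet` fields and the `movement_history` function are hypothetical names introduced only for illustration; they are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tracklet:
    """One pedestrian track inside a single camera view (hypothetical layout)."""
    person_id: int   # identity label assumed to come from re-identification
    camera: str      # name of the camera view
    t_enter: float   # time the pedestrian entered the view
    t_exit: float    # time the pedestrian left the view


def movement_history(tracklets, person_id):
    """Return (camera, t_enter, t_exit) tuples for one person in time order."""
    own = [t for t in tracklets if t.person_id == person_id]
    own.sort(key=lambda t: t.t_enter)
    return [(t.camera, t.t_enter, t.t_exit) for t in own]
```

The quality of the resulting history depends entirely on the identity labels, which is why the matching between adjacent views is the central problem.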

Two kinds of information have conventionally been used for this matching: the traveling time between two adjacent views and the appearance features of a tracklet. Two tracklets are matched when the similarity computed from these two kinds of information exceeds a given threshold. This approach rests on the hypotheses that the traveling time between a pair of adjacent views varies little among pedestrians, and that each pedestrian's appearance varies little among cameras. Additionally, some existing methods introduce the spatio-temporal cue, which reflects that the traveling time between a pair of adjacent views can be bounded by a given time span. This cue reduces the number of matching candidates and increases matching accuracy. However, these methods have the following two drawbacks.
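The threshold-based matching with a time span described above can be sketched as follows. This is only an illustration under stated assumptions: the cosine similarity for appearance, the Gaussian-shaped traveling-time score, and the product used to combine them are choices made here, not formulas given in the text, and all function and parameter names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Appearance similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def time_score(dt, mean, std):
    """Score in (0, 1] that peaks when dt equals the typical traveling time."""
    return math.exp(-0.5 * ((dt - mean) / std) ** 2)


def is_match(exit_time, feat_a, enter_time, feat_b, *,
             mean, std, t_min, t_max, threshold):
    """Decide whether two tracklets in adjacent views belong to one person."""
    dt = enter_time - exit_time
    # Spatio-temporal cue: discard candidates outside the allowed time span.
    if not (t_min <= dt <= t_max):
        return False
    # Combined similarity (assumed to be a simple product of both scores).
    similarity = cosine_similarity(feat_a, feat_b) * time_score(dt, mean, std)
    return similarity > threshold
```

With a hard time span like this, a pedestrian who stops at a store between the two views is rejected before appearance is even compared, which is the first drawback discussed next.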

Firstly, when an observed pedestrian is significantly delayed between a pair of adjacent views, the traveling time falls outside the given time span. Such delays often happen when there are places to visit between the adjacent views, for example stores, toilets, rest spaces, signboards, and exhibitions. This makes it difficult to reduce the matching candidates with the spatio-temporal cue.

Secondly, except in very simple environments such as a straight road, it is sometimes difficult to install cameras so that they observe pedestrians from the same direction. Thus, each pedestrian's appearance varies from camera to camera. Additionally, lighting conditions vary with observation time because of the weather, the intensity of the light, and the presence of sunlight. These factors change how pedestrians appear in the camera views, which makes it difficult to match tracklets by appearance features.

The proposed method addresses these problems with the following two ideas. Firstly, we tackle the variation in traveling time by selectively relaxing the spatio-temporal cue when matching tracklets. This makes it possible to match people correctly regardless of whether a delay occurs. Secondly, we deal with the variation in appearance by introducing a route cue. The route cue is the constraint that when a pedestrian is observed by two different cameras, the pedestrian should also be observed by the other cameras that lie on a route between those two cameras. This constraint is an enhanced version of the spatio-temporal cue. It helps reduce the matching candidates by excluding different pedestrians with similar appearance.
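The two ideas can be illustrated with a small sketch. The text does not specify how the spatio-temporal cue is relaxed, so the version below simply drops the upper time bound for camera pairs flagged by the caller; that rule, the adjacency dictionary, and both function names are assumptions made for illustration only. The route cue is approximated as a check that every consecutive pair of cameras in a candidate history is adjacent.

```python
def time_window_ok(dt, t_min, t_max, relax_upper=False):
    """Spatio-temporal cue with an optional relaxation of the upper bound.

    relax_upper is a hypothetical flag, e.g. set for camera pairs that have
    stores or rest spaces between them, where long delays are plausible.
    """
    if dt < t_min:
        return False  # nobody can travel faster than the minimum time
    return relax_upper or dt <= t_max


def satisfies_route_cue(camera_sequence, adjacency):
    """Check that consecutive cameras in a candidate history are adjacent.

    adjacency maps each camera to the set of its adjacent cameras. A jump
    between non-adjacent cameras means an intermediate camera on the route
    did not observe the pedestrian, so the candidate is rejected.
    """
    return all(
        nxt in adjacency.get(cur, set())
        for cur, nxt in zip(camera_sequence, camera_sequence[1:])
    )
```

For example, with cameras A, B, and C arranged in a line, a candidate history A, B, C passes the check, while A, C is rejected because B lies on the route and should have observed the pedestrian.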

To show the effectiveness of the proposed method, we evaluate it on simulated datasets generated from a public dataset collected by 16 cameras in a shopping mall. The simulated datasets cover several settings of camera adjacency and of the number of pedestrians who are delayed. Under each setting, we conducted an experiment to estimate movement histories. The results show that the matching accuracy of the proposed method was higher than that of the existing method under all settings considered, surpassing it by up to 13.1%. These experiments demonstrate that the proposed method can match tracklets with higher accuracy.