Specific Person Image Retrieval from Surveillance Videos via Condition-Separating Relevance Feedback

In recent years, the number of fixed cameras (surveillance cameras) has been increasing in places such as banks, airports and streets. Surveillance videos can be used for tracking criminals and missing persons, which requires analysing the videos exhaustively. Traditionally, this analysis has been done manually, which takes a large amount of time and money.

Recently, specific person image retrieval has been proposed to support the tracking of criminals and missing persons. A general specific person image retrieval system works as follows. First, the system detects and tracks people in the videos. For each detected person, it builds a record consisting of the person's entry and exit times, the sequence of images, the sequence of image features and the positions of the person, and stores all records in a database. Second, a user issues a query with an image of the person to be searched for. Finally, the system shows the records ranked in ascending order of distance to the query.
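The record structure and distance-based ranking described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the feature type, the distance measure, or how a distance to a whole record is derived from its sequence of features. The sketch assumes plain numeric vectors, Euclidean distance, and the minimum distance over the record's features; the names `Record`, `record_distance` and `rank_records` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Record:
    """One tracked person: entry/exit times plus per-image data (hypothetical layout)."""
    record_id: str
    enter_time: float
    exit_time: float
    features: list                                   # one feature vector per tracked image
    positions: list = field(default_factory=list)    # one (x, y) position per image


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def record_distance(query_feature, record):
    """Assumption: a record is as close as its closest image feature."""
    return min(euclidean(query_feature, f) for f in record.features)


def rank_records(query_feature, records):
    """Return records sorted by ascending distance to the query."""
    return sorted(records, key=lambda r: record_distance(query_feature, r))


if __name__ == "__main__":
    database = [
        Record("a", 0.0, 4.0, [[0.0, 0.0]]),
        Record("b", 5.0, 9.0, [[5.0, 5.0], [1.0, 1.0]]),
        Record("c", 2.0, 3.0, [[3.0, 3.0]]),
    ]
    for rec in rank_records([1.0, 1.0], database):
        print(rec.record_id, round(record_distance([1.0, 1.0], rec), 3))
```

Taking the minimum over a record's features is one of several plausible choices; an average or a set-to-set distance would fit the same ranking step.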

Features of person images vary with observation conditions such as illumination, occlusion, pose and resolution. It is therefore difficult to rank the records properly, that is, so that the wanted person's records (records containing an image of the wanted person) are ranked higher than other people's records.

The first problem is that the image features of the wanted person under one observation condition can be completely different from those under another. The system therefore cannot raise to higher ranks those records of the wanted person that contain only image features under conditions different from that of the query image.

The second problem is that an image feature of the wanted person under one observation condition can be similar to an image feature of another person under a different observation condition. The system may therefore raise that other person's records to higher ranks.

Metternich et al. and Fischer et al. introduced Relevance Feedback (RF) to improve the accuracy of specific person image retrieval. RF can handle the first problem, but it cannot handle the second.

We introduce Condition-Separating Search to tackle the second problem, and propose Condition-Separating Relevance Feedback (CSRF), which combines Condition-Separating Search with RF in a single framework to tackle both problems.

To realize Condition-Separating Search, a user prepares query images for predefined observation conditions. Condition-Separating Search then measures the distance between the feature of the query image under an observation condition and only those image features observed under the same condition. Even if the image feature distribution of the wanted person under one observation condition overlaps with that of another person under a different observation condition, Condition-Separating Search can reject the other person's image features in the overlap. Condition-Separating Search can therefore overcome the second problem.
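The effect of comparing only within the same observation condition can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the condition names `"bright"` and `"dark"`, the two-dimensional feature vectors, the Euclidean metric, and the function names. The abstract does not give the actual conditions, features or distance.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def plain_distance(queries, labeled_features):
    """Baseline: ignore condition labels and compare every query with every feature."""
    return min(euclidean(q, f) for q in queries.values() for _, f in labeled_features)


def condition_separated_distance(queries, labeled_features):
    """Compare a feature only with the query prepared for the same condition.

    Features whose condition has no query are skipped; a record with no
    comparable feature gets an infinite distance.
    """
    dists = [euclidean(queries[c], f) for c, f in labeled_features if c in queries]
    return min(dists) if dists else math.inf


# One query feature per predefined condition (toy values).
queries = {"bright": [1.0, 0.0], "dark": [0.0, 1.0]}

# Records as lists of (condition label, feature) pairs.
wanted = [("dark", [0.1, 1.0])]
# Another person who, under "bright", happens to look like the wanted person under "dark".
other = [("bright", [0.0, 0.95])]

if __name__ == "__main__":
    # Without separation the other person looks closer than the wanted person.
    print(plain_distance(queries, other) < plain_distance(queries, wanted))
    # With separation the wanted person is closer.
    print(condition_separated_distance(queries, wanted)
          < condition_separated_distance(queries, other))
```

In this toy setup the other person's feature is only compared with the "bright" query, so its accidental similarity to the "dark" query no longer affects the ranking.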

Condition-Separating Search requires each image in the database to be labeled with its observation condition. Furthermore, a user must prepare images of the wanted person under the predefined observation conditions.

CSRF labels each image using information obtained from tracking, and uses RF to accumulate images of the wanted person under each observation condition.
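One way to picture how feedback could accumulate per-condition query images is the loop below. It is a rough sketch, not the method of the thesis: the abstract does not describe the feedback procedure, so the sketch assumes that the user confirms records among the top results and that every labeled feature of a confirmed record is added to the query set for its condition. The condition labels are taken as given, `is_wanted` is a hypothetical stand-in for the user's judgment, and the data are toy values.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance(queries, labeled_features):
    """Minimum distance between same-condition query/feature pairs; inf if none."""
    dists = [euclidean(q, f)
             for c, f in labeled_features
             for q in queries.get(c, [])]
    return min(dists) if dists else math.inf


def feedback_round(queries, database, is_wanted, top_k):
    """Rank all records, then add features of user-confirmed top records to the queries."""
    ranked = sorted(database, key=lambda rid: distance(queries, database[rid]))
    for rid in ranked[:top_k]:
        if is_wanted(rid):
            for condition, feature in database[rid]:
                bucket = queries.setdefault(condition, [])
                if feature not in bucket:
                    bucket.append(feature)
    return ranked


# Toy database: record id -> list of (condition label, feature) pairs.
database = {
    "w1": [("bright", [1.0, 0.0])],
    "w2": [("bright", [0.9, 0.1]), ("dark", [0.0, 1.0])],
    "w3": [("dark", [0.1, 0.9])],
    "o1": [("bright", [0.0, 0.95])],
}
wanted_ids = {"w1", "w2", "w3"}

if __name__ == "__main__":
    queries = {"bright": [[1.0, 0.0]]}          # the user starts with one condition only
    first = feedback_round(queries, database, wanted_ids.__contains__, top_k=2)
    second = feedback_round(queries, database, wanted_ids.__contains__, top_k=2)
    print(first)    # w3 is last: no "dark" query exists yet
    print(second)   # w3 now ranks above o1
```

In the first round the record seen only under "dark" cannot be matched at all. After the user confirms a record that spans both conditions, a "dark" query becomes available and that record moves above the other person's record, which is still compared only with "bright" queries.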

We performed an experiment on videos from multiple surveillance cameras in a shopping mall to confirm the efficacy of CSRF for specific person image retrieval. Comparing CSRF with RF by recall at every rank, the results show that CSRF improves recall at rank 1000 from 58.5% to 72.2%, a gain of 13.7 points.
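For readers unfamiliar with the measure, recall at a given rank can be computed as in the sketch below. It assumes the usual definition (the fraction of the wanted person's records that appear among the top k results), which the abstract does not spell out. The record identifiers are made up, and only the two percentages in the last lines come from the text.

```python
def recall_at_rank(ranked_ids, relevant_ids, k):
    """Fraction of relevant records found within the top k of the ranking."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("recall is undefined without relevant records")
    hits = sum(1 for rid in ranked_ids[:k] if rid in relevant)
    return hits / len(relevant)


def recall_curve(ranked_ids, relevant_ids):
    """Recall at every rank from 1 to the length of the ranking."""
    return [recall_at_rank(ranked_ids, relevant_ids, k)
            for k in range(1, len(ranked_ids) + 1)]


if __name__ == "__main__":
    ranked = ["r3", "r1", "r7", "r2", "r9"]     # hypothetical ranking
    relevant = {"r1", "r2", "r4"}               # hypothetical wanted-person records
    print(recall_curve(ranked, relevant))

    # The reported gain is a difference in percentage points, not a relative change.
    gain_points = round(72.2 - 58.5, 1)
    print(gain_points)
```

The relevant record `r4` never appears in the toy ranking, so recall cannot reach 1.0 there, which mirrors how recall at rank 1000 can stay below 100% on a large database.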