05 Jun 2017 | Research article | Information and Communications Technologies

Better Face Recognition in Surveillance Environments

This article describes a new adaptive multi-classifier system proposed for video-to-video face recognition in changing environments, as found in person re-identification applications. System performance is affected by many conditions, and the proposed system is designed to overcome the various limitations of such complex recognition environments.
About two months ago, we published on Substance ÉTS an article introducing the video surveillance research done at the ÉTS LIVIA laboratory. This article focuses on the design of robust face classification systems for video-to-video face recognition in changing surveillance environments, as required in person re-identification or search and retrieval. In such applications, an operator may isolate a facial trajectory (a set of regions of interest corresponding to the same individual across consecutive frames) for an individual over a network of cameras, and enrol a face model in the system. Then, during operations, facial regions captured in live or archived video streams are matched against the facial models of target individuals of interest to be followed.
Conditions affecting system performance
It is assumed that holistic facial models are estimated by training a neural network or statistical classifier on reference captures extracted from operational videos using a face detector. In this context, the performance of state-of-the-art commercial and academic systems is limited by the difficulty of capturing high-quality facial regions from video streams under semi-controlled (e.g., at inspection lanes, portals and checkpoint entries) and uncontrolled (e.g., in cluttered free-flow scenes at airports or casinos) capture conditions. Performance is severely affected by variations in pose, scale, orientation, expression, illumination, blur, occlusion and ageing.
More precisely, given a face classifier, the various conditions under which a face can be captured by video cameras are representative of different concepts, i.e. different data distributions in the input feature space (see example in Figure 1).
![Figure 1: Example of variations in the facial appearance of two individuals from the Faces in Action database [1]](https://substance.carrousel-staging.com/wp-content/uploads/2015/04/facial-appearance.jpg)
Figure 1: Example of variations in the facial appearance of two individuals from the Faces in Action database [1]
These concepts contribute to the diversity of an individual’s face model, and the underlying class distributions are composed of information from all possible capture conditions (e.g., pose orientations and facial expressions that could be encountered during operations). However, in practice, regions of interest extracted from videos are matched against facial models designed a priori, using a limited number of reference captures collected during enrolment. Incomplete design data and changing distributions contribute to a growing divergence between the facial model and the underlying class distribution of an individual. In person re-identification applications, reference videos containing an individual of interest may become available during operations or through some re-enrolment process.
Under semi-controlled or uncontrolled capture conditions, the corresponding captures may be sampled from various concepts (e.g., with different facial orientations), but the presence of all possible concepts inside a single reference sequence cannot be guaranteed. For this reason, a system for video-to-video face recognition should be able to assimilate new reference trajectories over time (as they become available) in order to add newly available concepts to the individuals’ facial models, as they may be relevant for face recognition under future observation conditions. Therefore, adapting facial models to assimilate new concepts without corrupting previously-learned knowledge is an important feature for face recognition in changing real-world video surveillance environments.
New adaptive multi-classifier system
In the research carried out at the LIVIA laboratory, a new adaptive multi-classifier system is proposed for video-to-video face recognition in changing environments, as found in person re-identification applications. This modular system comprises a classifier ensemble per individual, allowing the facial model of each target individual to be adapted in response to new reference videos, through either incremental learning or ensemble generation (Figure 2). When a new video trajectory is provided by the operator, a change detection mechanism is used to balance plasticity and stability.
If the new data exhibits an abrupt change from previously-learned knowledge (representative of a new concept, see Figure 3), a new classifier is trained on the data and combined into the ensemble. Otherwise, previously-trained classifiers are incrementally updated.

Figure 3: Example of gradual and abrupt changes in facial appearance
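As a rough illustration of this adapt-or-add logic, the sketch below maintains one ensemble per enrolled individual and either adds a new member or incrementally updates the existing ones, depending on the verdict of a change detector. It is not the authors' implementation: the class and method names are hypothetical, and a simple incremental logistic classifier (scikit-learn's SGDClassifier) stands in for the classifiers used in the actual system.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier


class AdaptiveFaceEnsemble:
    """Illustrative sketch: one classifier ensemble per enrolled individual."""

    def __init__(self, change_detector):
        self.members = []                # ensemble members for this individual
        self.detector = change_detector  # object exposing is_abrupt(X) -> bool

    def assimilate(self, X, y):
        """Learn a new reference trajectory X (feature vectors) with labels y,
        where y marks target (1) vs non-target (0) captures."""
        if not self.members or self.detector.is_abrupt(X):
            # Abrupt change -> new concept: train a new member on this
            # trajectory and add it, leaving previous members untouched.
            clf = SGDClassifier(loss="log_loss")  # "log" in older scikit-learn
            self.members.append(clf.fit(X, y))
        else:
            # Gradual change -> incrementally update the existing members.
            for clf in self.members:
                clf.partial_fit(X, y)

    def score(self, x):
        """Average positive-class score of all members for one face ROI."""
        probs = [clf.predict_proba(x.reshape(1, -1))[0, 1] for clf in self.members]
        return float(np.mean(probs))
```

In the actual system described below, each member is a 2-class Probabilistic Fuzzy-ARTMAP network trained with a dynamic Particle Swarm Optimization strategy rather than a plain logistic classifier.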
During operations, the faces of each individual are tracked and grouped over time, allowing positive predictions to be accumulated for robust spatio-temporal recognition. A particular implementation of this framework has been proposed for validation. It consists of an ensemble of 2-class Probabilistic Fuzzy-ARTMAP classifiers [2] for each enrolled individual, where each ensemble is generated and evolved using an incremental training strategy based on dynamic Particle Swarm Optimization [3], with the Hellinger Drift Detection Method [4] used to detect concept changes. The accuracy and resource requirements of this system are assessed using facial trajectories extracted from video surveillance streams of the Faces in Action database [1], which comprises over 200 individuals captured over several months, exhibiting gradual (e.g., expression, ageing) and abrupt (e.g., orientation, illumination) changes.
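The change detection step can be pictured with the simplified sketch below, which flags an abrupt change when the average per-feature Hellinger distance between previously learned data and a new trajectory exceeds a threshold. It is only loosely inspired by the Hellinger Drift Detection Method [4]: the real method adapts its threshold over time, whereas the fixed threshold, bin count and class name used here are illustrative assumptions.

```python
import numpy as np


class HellingerChangeDetector:
    """Simplified sketch of Hellinger-distance change detection.

    Stores the feature vectors of previously learned trajectories and reports
    an abrupt change when the mean per-feature Hellinger distance between old
    and new data exceeds a fixed threshold (the real HDDM adapts this threshold).
    """

    def __init__(self, n_bins=20, threshold=0.3):
        self.n_bins = n_bins
        self.threshold = threshold
        self.reference = None  # previously learned feature vectors

    @staticmethod
    def _hellinger(p, q):
        # Hellinger distance between two normalized histograms p and q.
        return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2.0)

    def is_abrupt(self, X):
        X = np.asarray(X, dtype=float)
        if self.reference is None:
            self.reference = X
            return True  # the first trajectory always starts a new concept

        dists = []
        for j in range(X.shape[1]):
            lo = min(self.reference[:, j].min(), X[:, j].min())
            hi = max(self.reference[:, j].max(), X[:, j].max()) + 1e-12
            bins = np.linspace(lo, hi, self.n_bins + 1)
            p, _ = np.histogram(self.reference[:, j], bins=bins)
            q, _ = np.histogram(X[:, j], bins=bins)
            dists.append(self._hellinger(p / max(p.sum(), 1), q / max(q.sum(), 1)))

        abrupt = float(np.mean(dists)) > self.threshold
        self.reference = np.vstack([self.reference, X])  # grow the reference set
        return abrupt
```

Such a detector could be plugged into the earlier ensemble sketch as its change_detector, so that each new reference trajectory either spawns a new ensemble member or refines the existing ones.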
Results
Simulation results indicate that the proposed system is able to maintain a high level of performance when significantly different reference videos are learned for an individual. It exhibits higher classification performance than a probabilistic kNN system [5] adapted to video-to-video face recognition, as well as a reference open-set TCM-kNN system [6], with a significantly lower complexity. The scalable architecture thus employs the change detection mechanism to mitigate the effects of knowledge corruption while bounding its computational complexity.
Research article
For more information on the design of robust face classification systems for video-to-video face recognition in changing surveillance environments, we invite you to read the following research article:
Pagano, C., E. Granger, R. Sabourin, G.L. Marcialis and F. Roli (2014). Adaptive ensembles for face recognition in changing video surveillance environments. Information Sciences, vol. 286, pp. 75-101. Article available via Espace ÉTS.
LIVIA laboratory
For more information about the Laboratory for Imagery, Vision and Artificial Intelligence (LIVIA), use this link. LIVIA is looking for students for many research projects.

Christophe Pagano
Christophe Pagano is a PhD student in the Automated Manufacturing Engineering Department at ÉTS. He specializes in spatio-temporal systems for face recognition in video sequences.
Program : Automated Manufacturing Engineering
Research laboratories : LIVIA – Imaging, Vision and Artificial Intelligence Laboratory

Robert Sabourin
Robert Sabourin is a professor in the Automated Manufacturing Engineering Department at ÉTS. His research includes pattern recognition and inspection, neural networks, machine learning, genetic programming and bank cheque processing.
Program : Automated Manufacturing Engineering
Research laboratories : LIVIA – Imaging, Vision and Artificial Intelligence Laboratory

Éric Granger
Eric Granger is a professor in the Systems Engineering Department at ÉTS. His research focuses on machine learning, pattern recognition, computer vision, information fusion, and adaptive and intelligent systems.
Program : Automated Manufacturing Engineering
Research chair : Research Chair in Artificial Intelligence and Digital Health for Health Behaviour Change
Research laboratories : LiNCS – Cognitive and Semantic Interpretation Engineering Laboratory; LIVIA – Imaging, Vision and Artificial Intelligence Laboratory

Gian Luca Marcialis
Dr. Gian Luca Marcialis is currently an Assistant Professor at the University of Cagliari and a member of the PRA Lab. His research interests include the fusion of multiple classifiers for person recognition by biometrics.

Fabio Roli
Fabio Roli is a professor of computer engineering and Director of the Pattern Recognition and Applications Lab (PRA) at the University of Cagliari, Italy. His research activity is focused on the design of pattern recognition systems.
