Contributions to the segmentation of moving objects in video sequences

  1. de Castro Lopes Martins Pinto Ferreira, Maria Isabel
Supervised by:
  1. José Luis Alba Castro (Director)
  2. Luis Corte Real (Director)

University of defense: Universidade de Vigo

Date of defense: 24 January 2020

Committee:
  1. Ana Maria Rodrigues de Sousa Faria Mendonça (Chair)
  2. Julio Martín Herrero (Secretary)
  3. Jesús Cid Sueiro (Member)
Department:
  1. Teoría do sinal e comunicacións (Signal Theory and Communications)

Type: Thesis

Abstract

The intrinsic motivation for this PhD stems from the author's desire to explore novel alternatives for the problem of unsupervised foreground/background segmentation in video sequences, seeking fast, well-balanced solutions as well as solutions to problems that have not yet been solved adequately, such as the foreground/background segmentation of nighttime video sequences. There was also a long-standing idea of experimenting with new approaches, inspired by biological models of the human visual system, applied to the segmentation of moving objects in video sequences.

The area of research to which this thesis belongs is change detection in videos. It is closely linked to high-level inference tasks such as detection, localization, tracking and classification of moving objects, and is often considered a pre-processing step. Consequently, segmenting moving objects in video sequences is a fundamental step in many computer vision applications and a critical factor for the success of the system as a whole. However, there is no universal solution that successfully addresses all of the challenges that may be present in real-world scenarios, such as poor lighting conditions, sudden illumination changes, nighttime scenes, irrelevant motion in the background scene, cast shadows, and camouflaged objects (photometric similarity between the object and the background). Moving object segmentation, frequently referred to as foreground/background segmentation, has been a very active research topic for several decades, and many methods have been proposed in the literature; several papers with comprehensive reviews of current approaches have been published. Recent research has shown that the existing methods appear to be complementary in nature and that better performance can be achieved when several methods are combined.
However, in general, this leads to more complex and computationally heavy solutions, making them unfeasible for real-time applications. In this context, this document describes the work undertaken in the scope of the author's PhD, whose main contributions may be summarized as follows:

  1. The introduction of a bio-inspired hybrid method, based on the fusion of low-level information from the modeling of the human visual system with state-of-the-art background subtraction methods. The method improves well-known and widely used state-of-the-art algorithms in complex situations where these fail, such as challenging illumination conditions or cast shadows, by greatly reducing the number of false positives; the combination of the two approaches thus boosts overall detection accuracy. A detailed analysis of the results showed that this method is quite computationally heavy, and we concluded that a better compromise between performance and complexity should be pursued through a different approach. This work was presented at ICIAR 2016 and published in the proceedings (LNCS, 2016).

  2. The proposal of a computationally efficient boosted GMM method, BMOG ("Boosted+MOG"). BMOG uses a background model based on a Mixture of Gaussians and explores a novel classification mechanism that combines the discrimination capabilities of the color space with pixel classification with hysteresis, preventing noisy pixels from incorrectly changing classification and leading to a more robust algorithm. It also introduces a selective mechanism for background model update using a dynamic learning rate, adapted independently for each pixel, so that the model adapts faster in dynamic areas of the scene and slower in static ones. The combination of these features proved to boost overall detection accuracy in different scenarios while keeping complexity low, making BMOG a good choice for real-time applications with processing-time constraints.
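The two mechanisms described for BMOG, classification with hysteresis and a per-pixel dynamic learning rate, can be sketched in a few lines. This is an illustrative sketch, not the published implementation: the function names, the threshold values `t_low` and `t_high`, and the activity measure used for the learning rate are assumptions made for the example.

```python
# Hypothetical sketch of two BMOG ideas; thresholds and names are
# assumptions for illustration, not the authors' actual code.

def classify_with_hysteresis(distance, was_foreground,
                             t_low=2.5, t_high=4.0):
    """Classify a pixel as foreground using two thresholds.

    A pixel must exceed the high threshold to *become* foreground, but
    only needs to drop below the low threshold to *return* to background,
    so noisy distances hovering near one threshold cannot flip the label
    back and forth.
    """
    if was_foreground:
        return distance > t_low    # stays foreground until clearly background
    return distance > t_high       # becomes foreground only when clearly so


def dynamic_learning_rate(recent_label_changes,
                          alpha_min=0.001, alpha_max=0.05):
    """Per-pixel update rate: pixels in dynamic areas (many recent label
    changes) adapt faster; pixels in static areas adapt slowly."""
    activity = min(1.0, recent_label_changes / 10.0)
    return alpha_min + activity * (alpha_max - alpha_min)
```

With these illustrative thresholds, a distance of 3.0 keeps a foreground pixel in the foreground but does not pull a background pixel out of the background, which is exactly the stabilizing effect that hysteresis provides.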
The proposed method was validated using the ChangeDetection.net (CDnet) benchmark. The publicly available results show that, in terms of segmentation quality, BMOG consistently outperforms MOG2 and approaches, sometimes even exceeding, the performance of much more complex state-of-the-art methods such as SuBSENSE and WeSamBE. The evaluation of computational efficiency led us to conclude that BMOG achieves an excellent compromise between performance and complexity compared to MOG2 and SuBSENSE, and may be considered a serious candidate for real-world applications where processing time is a critical feature. This method was presented at IbPRIA 2017 and published in the proceedings (LNCS, 2017). After the conference, the author was invited to submit a revised and extended version, which underwent the regular review process of the Springer journal Pattern Analysis and Applications and was accepted and published in 2018.

  3. The proposal of a new method for the unsupervised segmentation of moving objects devised specifically for nighttime video sequences, the COLBMOG algorithm ("COLlinearity+BMOG"). The method is based on a novel local texture feature integrated with a parametric background model (BMOG). The information obtained from the texture-based change detection method, using local texture modeling, complemented by a color-based change detection method (BMOG), greatly increases overall detection accuracy in these scenarios. COLBMOG ranks first among unsupervised methods in the CDnet NightVideos benchmark (as of the date of this document). It outperforms the best state-of-the-art algorithms in the most complex situations found in this kind of video: not only is the average F-measure higher, but the standard deviation is also significantly lower, indicating more consistent performance across the different challenges.
The corresponding paper was submitted to Computer Vision and Image Understanding on April 3, 2019, and is currently under review.
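As a rough illustration of the collinearity idea behind COLBMOG, the sketch below measures how well a local patch of the current frame aligns, viewed as a vector, with the corresponding patch of the background model, and shows one possible way to combine that texture cue with a color-based decision. The patch representation, the threshold `tau`, and the fusion rule are assumptions made for this example, not the published method.

```python
# Hypothetical sketch of a collinearity-based texture cue; the fusion
# rule and threshold are illustrative assumptions, not COLBMOG itself.
import math

def collinearity(patch_a, patch_b, eps=1e-9):
    """Cosine of the angle between two patches flattened to vectors.

    Values near 1.0 mean the local texture is unchanged (background);
    lower values indicate a structural change (foreground candidate).
    """
    dot = sum(a * b for a, b in zip(patch_a, patch_b))
    norm_a = math.sqrt(sum(a * a for a in patch_a))
    norm_b = math.sqrt(sum(b * b for b in patch_b))
    return dot / (norm_a * norm_b + eps)


def is_foreground(patch_now, patch_bg, color_says_fg, tau=0.98):
    """Toy fusion rule: a texture change (low collinearity) must confirm
    the color-based foreground decision before a pixel is accepted."""
    texture_changed = collinearity(patch_now, patch_bg) < tau
    return color_says_fg and texture_changed
```

Collinearity is insensitive to a uniform scaling of intensities, which is what makes this kind of cue attractive under the weak, fluctuating illumination of nighttime scenes: the patches `[1, 2, 3]` and `[2, 4, 6]` are perfectly collinear even though their brightness differs.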