Prominent Moving Object Segmentation from Moving Camera Video Shots Using Iterative Energy Minimization

Chiranjoy Chattopadhyay and Sukhendu Das
Visualization and Perception Lab
Department of Computer Science and Engineering, Indian Institute of Technology, Madras, India

Accepted with minor revision in Signal, Image and Video Processing (SIViP), Springer

Abstract

Extracting the moving foreground object from a given video shot is an important task for spatio-temporal analysis and content representation in many computer vision and digital video processing applications. We propose an iterative framework, based on energy minimization, for efficiently segmenting the prominent moving foreground object from moving camera video (MCV) shots. The solution obtained using graph cut for figure-ground classification is enhanced using features extracted over a set of neighboring frames. This is used to iteratively update the foreground and background probability (tri-)maps, and hence the graph weights. The segmentation results from neighboring frames are integrated as constraints that iteratively guide the energy minimization process toward an efficient solution. The proposed framework is automatic and requires no human interaction, for either initialization or refinement. Our method outperforms recent state-of-the-art moving object segmentation techniques on benchmark datasets with MCV shots.
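
For orientation, graph-cut segmentation is commonly posed as minimizing a binary labeling energy of the following generic form. This is the textbook formulation, not the specific energy of the proposed framework, whose terms are not given on this page. The symbols are assumptions introduced here: P is the set of pixels, N the set of neighboring pixel pairs, D_p a data term (for example, derived from foreground and background probability maps), V_pq a smoothness term, and lambda a weighting constant.

```latex
E(L) = \sum_{p \in \mathcal{P}} D_p(l_p)
     + \lambda \sum_{(p,q) \in \mathcal{N}} V_{pq}(l_p, l_q),
\qquad l_p \in \{\mathrm{fg}, \mathrm{bg}\}
```

In an iterative scheme of the kind the abstract describes, the data term would be recomputed from the updated probability maps before each new minimization.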

 

Qualitative Segmentation Results (Videos). Click the links below to view and download the videos (each opens in a new page).

Set1 (Bike, Boat)
Set2 (Cheetah, People) (Multiple moving objects)
Set3 (Birdfall, Parachute)
Set4 (Tennis, Girl)
Set5 (Soldier, Frog) (SegTrack V2)
Set6 (Car, Aeroplane)
Set7 (Train)

The segmented foreground object is demarcated with a unique color. In selected videos, the background pixels are overlaid with a suitable color while the foreground pixels are left unaltered, to highlight the segmentation results.
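
The second visualization style can be sketched as follows. This is a minimal illustration, not the code used to produce the videos: the function name `overlay_background`, the default tint color, and the blending weight `alpha` are all hypothetical, and frames are assumed to be nested lists of RGB tuples with a matching binary foreground mask.

```python
def overlay_background(frame, mask, color=(0, 0, 255), alpha=0.5):
    """Tint background pixels with `color`; leave foreground pixels unaltered.

    frame: rows of (R, G, B) tuples; mask: rows of 0/1 (1 = foreground).
    `color` and `alpha` are illustrative defaults, not values from the paper.
    """
    out = []
    for frame_row, mask_row in zip(frame, mask):
        out_row = []
        for pixel, is_fg in zip(frame_row, mask_row):
            if is_fg:
                # Foreground: keep the original pixel.
                out_row.append(tuple(pixel))
            else:
                # Background: alpha-blend the pixel with the tint color.
                out_row.append(tuple(
                    round((1 - alpha) * c + alpha * k)
                    for c, k in zip(pixel, color)
                ))
        out.append(out_row)
    return out
```

With `alpha=1.0` the background is replaced outright by the tint color; smaller values keep some scene context visible behind the tint.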


Quantitative evaluation of segmentation performance against other methods, using average F-measure (higher is better).

| Video | Proposed | Elqursh et al. (ECCV 2012) | Kwak et al. (ICCV 2011) | Sheikh et al. (ICCV 2009) | Yong et al. (ICCV 2011) | Papazoglou et al. (ICCV 2013) |
|---|---|---|---|---|---|---|
| Cars1 | 0.95 | 0.91 | 0.88 | 0.77 | 0.7 | 0.94 |
| People 1 | 0.94 | 0.89 | 0.9 | 0.7 | 0.75 | 0.94 |
| People 2 | 0.88 | 0.77 | 0.87 | 0.78 | 0.59 | 0.85 |
| Tennis | 0.92 | 0.89 | - | 0.41 | 0.72 | 0.9 |
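
As a reminder of how this metric is typically computed, the sketch below evaluates the F-measure (harmonic mean of precision and recall) of a predicted binary mask against a ground-truth mask and averages it over frames. The function names are hypothetical, and averaging per frame is an assumption; this page does not state the exact evaluation protocol.

```python
def f_measure(pred, truth):
    """F-measure of one predicted mask against its ground truth.

    pred, truth: rows of 0/1 values (1 = foreground), same shape.
    Returns 0.0 when there are no true positives (a convention chosen here).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # foreground correctly detected
            elif p and not t:
                fp += 1      # background labeled as foreground
            elif t and not p:
                fn += 1      # foreground missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f_measure(pred_masks, truth_masks):
    """Mean per-frame F-measure over a video shot (assumed protocol)."""
    scores = [f_measure(p, t) for p, t in zip(pred_masks, truth_masks)]
    return sum(scores) / len(scores)
```

For example, a mask that marks two pixels as foreground when only one of them truly is has precision 0.5 and recall 1, giving an F-measure of 2/3.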


Comparison of segmentation error, using the average number of erroneously labeled pixels per frame as the metric (lower is better).

| Video | Proposed | Papazoglou et al. (ICCV 2013) | Yong et al. (ICCV 2011) | Zhang et al. (CVPR 2013) | Ma et al. (CVPR 2012) | Tsai et al. (IJCV 2012) |
|---|---|---|---|---|---|---|
| Birdfall | 152 | 217 | 288 | 155 | 189 | 252 |
| Girl | 1300 | 3859 | 1785 | 1488 | 1698 | 1304 |
| Parachute | 195 | 855 | 201 | 220 | 221 | 235 |
| Cheetah | 625 | 890 | 905 | 633 | 806 | 1142 |
| Monkey | 265 | 284 | 521 | 365 | 472 | 563 |
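
The error metric in this comparison can be sketched as a per-frame count of pixels whose predicted label differs from the ground truth, averaged over the frames of a shot. The function names are hypothetical and the masks are assumed to be nested lists of 0/1 values; this is an illustration of the metric as described in the caption, not the evaluation code behind the reported numbers.

```python
def mislabeled_pixels(pred, truth):
    """Count pixels where the predicted label differs from ground truth."""
    return sum(
        bool(p) != bool(t)
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )


def average_pixel_error(pred_masks, truth_masks):
    """Average number of erroneously labeled pixels per frame."""
    errors = [mislabeled_pixels(p, t) for p, t in zip(pred_masks, truth_masks)]
    return sum(errors) / len(errors)
```

Because the count is absolute rather than normalized, values are only comparable between methods on the same video at the same resolution.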