Robust multi-object tracking via cross-domain contextual information for sports video analysis

Tianzhu Zhang, Bernard Ghanem, and Narendra Ahuja
"Robust Multi-Object Tracking Via Cross-Domain Contextual Information For Sports Video Analysis"
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012)

Tracking, Particle Filter, Cross-Domain, Contextual Information
2012
Multiple player tracking is one of the main building blocks of a sports video analysis system. In an uncalibrated camera setting, robust multi-object tracking can be very difficult for a number of reasons, including noise, occlusion, fast camera motion, low-resolution image capture, varying viewpoints, and illumination changes. To address the problem of multi-object tracking in sports videos, we go beyond the video frame domain and make use of information in a homography transform domain, which we call the homography field domain. We propose a novel particle-filter-based tracking algorithm that uses both object appearance information (e.g., color and shape) in the image domain and cross-domain contextual information in the field domain to improve object tracking. In the field domain, the effect of fast camera motion is significantly alleviated, since the underlying homography transform from each frame to the field domain can be accurately estimated. We use contextual trajectory information (intra-trajectory and inter-trajectory context) to further improve the prediction of object states within a particle filter framework. Here, intra-trajectory contextual information is based on past tracking results in the field domain, while inter-trajectory contextual information is extracted from a compiled trajectory dataset based on tracks computed from videos depicting the same sport. Experimental results on real-world sports data show that our system is able to effectively and robustly track a variable number of targets despite background clutter, camera motion, and frequent mutual occlusion between targets.
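The two ingredients named in the abstract, mapping image positions into the field domain with a homography and predicting object states with a particle filter, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `to_field` and `particle_filter_step` are hypothetical, the constant `velocity` term only stands in for intra-trajectory context, and the Gaussian likelihood is an assumed placeholder for the appearance model (color and shape) described above.

```python
import numpy as np

def to_field(H, pts):
    """Map Nx2 image points to field coordinates with a 3x3 homography H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # Nx3 homogeneous coords
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # divide by the scale term

def particle_filter_step(particles, weights, velocity, observation, rng,
                         motion_std=0.5, obs_std=1.0):
    """One predict/update/resample step of a generic bootstrap particle filter.

    Assumption: `velocity` is a field-domain displacement estimated from the
    object's own past track (a stand-in for intra-trajectory context).
    """
    # Predict: shift each particle by the velocity and add Gaussian noise.
    particles = particles + velocity + rng.normal(0.0, motion_std, particles.shape)
    # Update: weight by a Gaussian likelihood of the observed position
    # (placeholder for an appearance-based likelihood).
    d2 = np.sum((particles - observation) ** 2, axis=1)
    weights = weights * np.exp(-0.5 * d2 / obs_std ** 2) + 1e-300
    weights = weights / weights.sum()
    # Resample: multinomial resampling, then reset to uniform weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx]
    weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H = np.diag([2.0, 2.0, 1.0])                      # toy homography: 2x scaling
    obs = to_field(H, [[5.0, 3.0]])[0]                # observation in field domain
    particles = rng.normal(obs, 2.0, size=(200, 2))
    weights = np.full(200, 1.0 / 200)
    particles, weights = particle_filter_step(
        particles, weights, np.zeros(2), obs, rng)
    print("state estimate:", particles.mean(axis=0))
```

Running the filter on field-domain coordinates rather than pixel coordinates reflects the abstract's point that camera motion is largely factored out once the per-frame homography is known, so a simple motion model can suffice in that domain.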