Camera Motion and Surrounding Scene Appearance as Context for Action Recognition

Fabian Caba Heilbron, Ali Thabet, Juan Carlos Niebles, Bernard Ghanem
"Camera Motion and Surrounding Scene Appearance as Context for Action Recognition"
Asian Conference on Computer Vision (ACCV 2014)

Action Recognition
2014
Abstract. RGB-D sensors are popular in the computer vision community, especially for scene understanding, semantic scene labeling, and segmentation. However, most methods for these problems depend on reliable input depth measurements, and the reliability of depth values deteriorates significantly with distance. In practice, unreliable depth measurements are discarded, which limits the performance of methods that use RGB-D data. This paper studies how reliable depth values can be used to correct the unreliable ones, and how to complete (or extend) the available depth data beyond the raw measurements of the sensor (i.e., infer depth at pixels with unknown depth values), given a prior model of the 3D scene. We consider piecewise planar environments, since many indoor scenes with man-made objects can be modeled as such. We propose a framework that uses the RGB-D sensor's noise profile to adaptively and robustly fit plane segments (e.g., floor and ceiling) and to iteratively complete the depth map, when possible. Depth completion is formulated as a discrete labeling problem (MRF) with hard constraints and solved efficiently using graph cuts. To regularize this problem, we exploit 3D and appearance cues that encourage pixels to take on depth values that are compatible in 3D with the piecewise planar assumption. Extensive experiments on a new, large-scale, and challenging dataset show that our approach yields more accurate depth maps (with 20% more depth values) than those recorded by the RGB-D sensor. Additional experiments on the NYUv2 dataset show that our method generates more 3D-aware depth maps. The generated depth maps can also be used to improve the performance of a state-of-the-art RGB-D SLAM method.
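
As a rough illustration of the idea of using a sensor noise profile when fitting a plane, the Python sketch below down-weights distant (less reliable) depth samples in a weighted least-squares plane fit and then evaluates the fitted plane at pixels with missing depth. It is only a sketch under stated assumptions: the quadratic noise model, the constant `k`, and the function names `depth_sigma`, `fit_plane_weighted`, and `complete_depth` are hypothetical and do not come from the paper, and the code does not implement the MRF and graph-cut formulation described in the abstract.

```python
import numpy as np


def depth_sigma(z, k=0.0012, floor=1e-3):
    """Assumed noise profile: depth std. dev. grows quadratically with depth z (meters).

    Both k and floor are illustrative values, not taken from the paper.
    """
    z = np.asarray(z, dtype=float)
    return np.maximum(k * z ** 2, floor)


def fit_plane_weighted(points):
    """Fit z = a*x + b*y + c by weighted least squares.

    points: (N, 3) array of 3D points; each point is weighted by 1 / sigma(z),
    so samples far from the sensor influence the fit less.
    Returns the coefficients (a, b, c).
    """
    points = np.asarray(points, dtype=float)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    w = 1.0 / depth_sigma(z)
    A = np.column_stack([x, y, np.ones_like(x)])
    # Scaling the rows of A and z by w solves the weighted problem.
    coeffs, *_ = np.linalg.lstsq(A * w[:, None], z * w, rcond=None)
    return coeffs


def complete_depth(coeffs, x, y):
    """Predict depth at (x, y) locations from the fitted plane."""
    a, b, c = coeffs
    return a * np.asarray(x, dtype=float) + b * np.asarray(y, dtype=float) + c
```

A typical use would be to fit the plane on pixels with reliable depth and call `complete_depth` on the pixels where the sensor returned no value. A robust estimator (for example, RANSAC) could replace the plain least-squares step when outliers are expected.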