Abstract
RGB-D
sensors are popular in the computer vision community, especially for problems
of scene understanding, semantic scene labeling, and segmentation. However,
most methods for these tasks depend on reliable input depth measurements, whose
reliability deteriorates significantly with distance. In
practice, unreliable depth measurements are discarded, thus limiting the
performance of methods that use RGB-D data. This paper studies how reliable
depth values can be used to correct the unreliable ones, and how to complete
(or extend) the available depth data beyond the raw measurements of the sensor
(i.e. infer depth at pixels with unknown depth values), given a prior model of
the 3D scene. We consider piecewise planar environments in this paper, since
many indoor scenes with man-made objects can be modeled as such. We propose a
framework that uses the RGB-D sensor’s noise profile to adaptively and robustly
fit plane segments (e.g. floor and ceiling) and iteratively complete the depth
map, when possible. Depth completion is formulated as a discrete labeling
problem (MRF) with hard constraints and solved efficiently using graph cuts. To
regularize this problem, we exploit 3D and appearance cues that encourage
pixels to take on depth values consistent in 3D with the piecewise
planar assumption. Extensive experiments on a new large-scale and challenging
dataset show that our approach produces more accurate depth maps (with 20% more
depth values) than those recorded by the RGB-D sensor. Additional experiments
on the NYUv2 dataset show that our method generates more 3D-aware depth maps,
which can also be used to improve the performance of a state-of-the-art RGB-D
SLAM method.
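
To make the noise-adaptive plane fitting above concrete, the following is a minimal Python sketch, not the paper's implementation: it runs RANSAC with a per-point inlier threshold scaled by an assumed quadratic axial-noise model for a structured-light RGB-D sensor. The function axial_noise_sigma, its constants, and the sigma_mult factor are illustrative assumptions, not values taken from the paper.

import numpy as np

def axial_noise_sigma(z, a=0.0012, b=0.0019, z0=0.4):
    # Illustrative axial noise model: sigma (meters) grows roughly
    # quadratically with depth z (meters); constants are assumptions.
    return a + b * (z - z0) ** 2

def ransac_plane_adaptive(points, iters=500, sigma_mult=2.5, seed=None):
    # RANSAC plane fit whose inlier threshold adapts to each point's
    # expected depth noise, so distant (noisy) points are not rejected
    # as aggressively as near (reliable) ones.
    # points: (N, 3) array of 3D points with z = depth in meters.
    # Returns (normal, d) for the plane n.x + d = 0 and the inlier mask.
    rng = np.random.default_rng(seed)
    thresholds = sigma_mult * axial_noise_sigma(points[:, 2])
    best_n, best_d, best_mask, best_count = None, None, None, -1
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:  # degenerate (collinear) sample, skip
            continue
        n /= norm
        d = -n @ sample[0]
        dist = np.abs(points @ n + d)  # point-to-plane distances
        mask = dist < thresholds       # per-point adaptive threshold
        if mask.sum() > best_count:
            best_count = mask.sum()
            best_n, best_d, best_mask = n, d, mask
    return best_n, best_d, best_mask

A least-squares refit over the recovered inliers (e.g. via SVD of the centered inlier coordinates) would typically follow; the paper's adaptive fitting and iterative completion are more involved than this single binary inlier test.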