Towards segmenting consumer stereo videos: benchmark, baselines and ensembles
Are we ready to segment consumer stereo videos? The amount of this data type is rapidly increasing and encompasses rich information of appearance, motion and depth cues. However, the segmentation of such data is still largely unexplored. First, we propose therefore a new benchmark: videos, annotations and metrics to measure progress on this emerging challenge. Second, we evaluate several state of the art segmentation methods and propose a novel ensemble method based on recent spectral theory. This combines existing image and video segmentation techniques in an efficient scheme. Finally, we propose and integrate into this model a novel regressor, learnt to optimize the stereo segmentation performance directly via a differentiable proxy. The regressor makes our segmentation ensemble adaptive to each stereo video and outperforms the segmentations of the ensemble as well as a most recent RGB-D segmentation technique.