The paper presents a new Convolutional Neural Network (CNN) architecture, called stacked stereo CNN, for computing disparity maps from stereo images. In the stacked stereo CNN, left and right image patches are stacked back-to-back and fed to a single-tower CNN. This contrasts with a Siamese network, which uses two towers, one for the left patch and the other for the right patch. The proposed network is trained on a large set of similar and dissimilar image patch pairs, generated from stereo images and their ground-truth images in the Middlebury stereo datasets. The network returns a dissimilarity score for each pair of image patches, which is used to compute the cost volume. The cost volume is then refined using post-processing steps before the final disparity map is generated. The proposed network is evaluated on the Middlebury datasets and achieves results comparable to state-of-the-art algorithms.
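
The pipeline described above (stack a left and right patch, score the pair, fill a cost volume, pick a disparity) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `placeholder_score` is a hypothetical stand-in for the trained network (it simply returns the mean absolute difference), the names `stack_patches`, `cost_volume`, `winner_take_all`, `max_disp` and `half` are invented for this example, and the post-processing refinement is omitted in favor of a plain winner-take-all argmin.

```python
import numpy as np


def stack_patches(left_patch, right_patch):
    """Stack two (H, W) patches into one (2, H, W) array, as a single-tower input."""
    return np.stack([left_patch, right_patch], axis=0)


def placeholder_score(stacked):
    """Hypothetical stand-in for the trained CNN: mean absolute difference.

    Lower values mean the two stacked patches are more similar.
    """
    return float(np.mean(np.abs(stacked[0] - stacked[1])))


def cost_volume(left, right, max_disp, half=1, score=placeholder_score):
    """Build a (max_disp + 1, H, W) volume of dissimilarity scores.

    Entry [d, y, x] compares the left patch centered at (y, x) with the
    right patch centered at (y, x - d). Positions with x < d stay at +inf.
    """
    h, w = left.shape
    size = 2 * half + 1
    pad_l = np.pad(left, half, mode="edge")
    pad_r = np.pad(right, half, mode="edge")
    volume = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        for y in range(h):
            for x in range(d, w):
                lp = pad_l[y:y + size, x:x + size]
                rp = pad_r[y:y + size, x - d:x - d + size]
                volume[d, y, x] = score(stack_patches(lp, rp))
    return volume


def winner_take_all(volume):
    """Pick, per pixel, the disparity with the lowest dissimilarity score."""
    return np.argmin(volume, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    right = rng.random((8, 16))
    left = np.roll(right, 3, axis=1)  # synthetic pair with a true disparity of 3
    disparity = winner_take_all(cost_volume(left, right, max_disp=5))
    print(disparity[1:-1, 4:15])
```

In a real system the `score` argument would be replaced by a forward pass of the trained network, and the per-pixel Python loops would be batched for speed.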