MVMO: A Multi-Object Dataset for Wide Baseline Multi-View Semantic Segmentation


Aitor Alvarez-Gila 1,2
Joost van de Weijer 2
Yaxing Wang 3
Estibaliz Garrote 1

1 Tecnalia - Basque Research and Technology Alliance (BRTA)
2 Computer Vision Center - Universitat Autònoma de Barcelona
3 Nankai University, Tianjin, China


IEEE International Conference on Image Processing (ICIP), 2022

[Paper]
[Data (coming soon!)]
[Code (coming soon!)]


Abstract

We present MVMO (Multi-View, Multi-Object dataset): a synthetic dataset of 116,000 scenes containing randomly placed objects of 10 distinct classes, captured from 25 camera locations in the upper hemisphere. MVMO comprises photorealistic, path-traced image renders, together with semantic segmentation ground truth for every view. Unlike existing multi-view datasets, MVMO features wide baselines between cameras and a high density of objects, which lead to large disparities, heavy occlusions and view-dependent object appearance. Single-view semantic segmentation is hindered by self- and inter-object occlusions, which additional viewpoints can help resolve. We therefore expect MVMO to propel research in multi-view semantic segmentation and cross-view semantic transfer. We also provide baselines showing that new research is needed in these fields to exploit the complementary information of multi-view setups.
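Since the data release is still pending, the exact file format is not yet public. The Python sketch below only illustrates the multi-view structure described above (25 RGB renders per scene plus a per-view semantic segmentation mask); the directory layout, file names and integer class-index mask encoding are assumptions for illustration, not the official MVMO format.

```python
# Minimal sketch of a multi-view scene loader for MVMO-style data.
# The layout root/<scene_id>/{rgb,seg}/view_XX.png and the class-index
# mask encoding are assumptions, not the official dataset format.
from pathlib import Path
from typing import Dict, List

import numpy as np
from PIL import Image

NUM_VIEWS = 25    # camera locations in the upper hemisphere
NUM_CLASSES = 10  # distinct object classes


def load_scene(root: Path, scene_id: str) -> Dict[str, List[np.ndarray]]:
    """Load all views of one scene: RGB renders and per-view semantic masks."""
    images, masks = [], []
    for v in range(NUM_VIEWS):
        rgb_path = root / scene_id / "rgb" / f"view_{v:02d}.png"
        seg_path = root / scene_id / "seg" / f"view_{v:02d}.png"
        images.append(np.asarray(Image.open(rgb_path).convert("RGB")))
        masks.append(np.asarray(Image.open(seg_path)))  # one class index per pixel
    return {"images": images, "masks": masks}


if __name__ == "__main__":
    scene = load_scene(Path("MVMO"), "scene_000000")
    print(len(scene["images"]), "views,", scene["images"][0].shape)
```

A multi-view segmentation baseline would consume all 25 views of a scene jointly, whereas a cross-view transfer setup would pair one labelled view with the remaining unlabelled ones.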



Comparison of datasets suitable for multi/cross-view semantic segmentation (updated 2022-03-01)



Sample scenes


Paper

Aitor Alvarez-Gila, Joost van de Weijer, Yaxing Wang, Estibaliz Garrote

MVMO: A Multi-Object Dataset for Wide Baseline Multi-View Semantic Segmentation

IEEE International Conference on Image Processing (ICIP), 2022

[bibtex]

Contact: Aitor Alvarez-Gila