DDD++: Exploiting Density map consistency for Deep Depth estimation in indoor environments

Giovanni Pintore, Marco Agus, Alberto Signoroni, and Enrico Gobbetti

August 2025

Abstract

We introduce a novel deep neural network designed for fast and structurally consistent monocular 360-degree depth estimation in indoor settings. Our model generates a spherical depth map from a single gravity-aligned or gravity-rectified equirectangular image, ensuring the predicted depth aligns with the typical depth distribution and structural features of cluttered indoor spaces, which are generally enclosed by walls, floors, and ceilings. By leveraging the distinctive vertical and horizontal patterns found in man-made indoor environments, we propose a streamlined network architecture that incorporates gravity-aligned feature flattening and specialized vision transformers. Through flattening, these transformers fully exploit the omnidirectional nature of the input without requiring patch segmentation or positional encoding. To further enhance structural consistency, we introduce a novel loss function that assesses density map consistency by projecting points from the predicted depth map onto a horizontal plane and a cylindrical proxy. This lightweight architecture requires fewer tunable parameters and computational resources than competing methods. Our comparative evaluation shows that our approach improves depth estimation accuracy while ensuring greater structural consistency compared to existing methods. For these reasons, it promises to be suitable for incorporation in real-time solutions, as well as a building block in more complex structural analysis and segmentation methods.
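The density map consistency idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal, standard-library-only illustration that assumes a gravity-aligned equirectangular depth map with z pointing up, uses a hard 2D histogram on the horizontal plane, and compares predicted and reference densities with an L1 difference. The function names, the `extent` and `bins` parameters, and the choice of L1 are all assumptions made for illustration. The cylindrical proxy is omitted, and a loss used for training would need a differentiable (soft) binning rather than the hard counts shown here.

```python
import math


def depth_to_points(depth):
    """Back-project an equirectangular depth map (list of rows) to 3D points.

    Assumption: rows span latitude +pi/2 (top) to -pi/2 (bottom),
    columns span longitude -pi to +pi, and z is the gravity (up) axis.
    """
    h, w = len(depth), len(depth[0])
    points = []
    for i in range(h):
        lat = math.pi / 2 - (i + 0.5) / h * math.pi
        for j in range(w):
            lon = (j + 0.5) / w * 2 * math.pi - math.pi
            d = depth[i][j]
            points.append((d * math.cos(lat) * math.cos(lon),
                           d * math.cos(lat) * math.sin(lon),
                           d * math.sin(lat)))
    return points


def floor_density(points, extent, bins):
    """Count points per cell of a bins x bins grid on the horizontal plane.

    The grid covers [-extent, extent) in x and y; points outside are ignored.
    """
    grid = [[0.0] * bins for _ in range(bins)]
    cell = 2.0 * extent / bins
    for x, y, _ in points:
        u = math.floor((x + extent) / cell)
        v = math.floor((y + extent) / cell)
        if 0 <= u < bins and 0 <= v < bins:
            grid[v][u] += 1.0
    return grid


def density_l1(pred_depth, ref_depth, extent=5.0, bins=16):
    """Hypothetical consistency measure: L1 gap between the two density maps,
    normalized by the number of pixels in the depth map."""
    pred = floor_density(depth_to_points(pred_depth), extent, bins)
    ref = floor_density(depth_to_points(ref_depth), extent, bins)
    n = len(pred_depth) * len(pred_depth[0])
    total = sum(abs(a - b)
                for row_p, row_r in zip(pred, ref)
                for a, b in zip(row_p, row_r))
    return total / n
```

Because walls project to dense, thin traces on the horizontal plane, a depth map that bends or displaces a wall changes this density noticeably even when the per-pixel depth error is small, which is the intuition behind comparing densities rather than only depths.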

Reference and download information

Giovanni Pintore, Marco Agus, Alberto Signoroni, and Enrico Gobbetti. DDD++: Exploiting Density map consistency for Deep Depth estimation in indoor environments. Graphical Models, 140: 101281, August 2025. DOI: 10.1016/j.gmod.2025.101281.

Bibtex citation record

@article{Pintore:2025:DED,
    author = {Giovanni Pintore and Marco Agus and Alberto Signoroni and Enrico Gobbetti},
    title = {DDD++: Exploiting Density map consistency for Deep Depth estimation in indoor environments},
    journal = {Graphical Models},
    volume = {140},
    pages = {101281},
    publisher = {Elsevier},
    month = {August},
    year = {2025},
    abstract = { We introduce a novel deep neural network designed for fast and structurally consistent monocular 360-degree depth estimation in indoor settings. Our model generates a spherical depth map from a single gravity-aligned or gravity-rectified equirectangular image, ensuring the predicted depth aligns with the typical depth distribution and structural features of cluttered indoor spaces, which are generally enclosed by walls, floors, and ceilings. By leveraging the distinctive vertical and horizontal patterns found in man-made indoor environments, we propose a streamlined network architecture that incorporates gravity-aligned feature flattening and specialized vision transformers. Through flattening, these transformers fully exploit the omnidirectional nature of the input without requiring patch segmentation or positional encoding. To further enhance structural consistency, we introduce a novel loss function that assesses density map consistency by projecting points from the predicted depth map onto a horizontal plane and a cylindrical proxy. This lightweight architecture requires fewer tunable parameters and computational resources than competing methods. Our comparative evaluation shows that our approach improves depth estimation accuracy while ensuring greater structural consistency compared to existing methods. For these reasons, it promises to be suitable for incorporation in real-time solutions, as well as a building block in more complex structural analysis and segmentation methods. },
    doi = {10.1016/j.gmod.2025.101281},
    url = {http://vic.crs4.it/vic/cgi-bin/bib-page.cgi?id='Pintore:2025:DED'},
}