PanoFloor: reconstruction and immersive exploration of large multi-room scenes from a minimal set of registered panoramic images using denoised density maps
Giovanni Pintore, Sara Jashari, Marco Agus, and Enrico Gobbetti
October 2025
Abstract
We introduce a deep learning approach to automatically generate 3D floor plans and immersive multi-room virtual visit experiences from a small set of co-registered 360-degree panoramas - down to just one per room. We integrate novel neural networks that leverage the broad context of panoramic images and large annotated room datasets to build a geometric and visual graph. Nodes represent stereo-viewable multiple-center-of-projection (MCOP) 360-degree images at the capture locations, while arcs connect them with paths through doors, avoiding clutter and minimizing disocclusions to maximize visual quality. The process starts with depth prediction and floor-plan projection to create a comprehensive but noisy global density map, which is refined via a latent diffusion model. A segmentation network then extracts room layouts, openings, and clutter. This structured representation is lifted to a visual one by creating a 360-degree stereo-explorable MCOP representation at each node, produced using a view-synthesis network from the original image and its predicted depth map. Arc paths are then computed using an optimization process that takes into account structural constraints, including openings and obstacles, while minimizing visual discontinuities, occlusions, and disocclusions. Finally, 360-degree video transitions are synthesized using a specialized view-synthesis network to obtain a fully precomputed WebXR-ready explorable representation that can be efficiently experienced on head-mounted displays with limited graphics capabilities. The extracted floor plan not only aids in documenting the captured building but can also enhance the immersive experience by serving as a live map of the building. Our experiments show that the method achieves state-of-the-art reconstruction from sparse inputs and supports compelling immersive visits.
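The structural stages described above can be summarized as a pipeline sketch. The following Python fragment is purely illustrative: every function and class name is a hypothetical placeholder (not the authors' actual API), and each neural stage is stubbed with trivial logic so that only the data flow of the method - depth prediction, density-map projection, diffusion-based denoising, segmentation, and graph construction - is conveyed.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the pipeline described in the abstract.
# All names are illustrative assumptions; neural stages are stubbed.

@dataclass
class Node:
    pano_id: int        # index of the input 360-degree panorama
    position: tuple     # capture location on the floor plan (x, y)

@dataclass
class SceneGraph:
    nodes: list = field(default_factory=list)
    arcs: list = field(default_factory=list)   # (node_a, node_b, path)

def predict_depth(pano):
    # Stage 1a: monocular panoramic depth prediction (stubbed).
    return [1.0] * len(pano)

def project_to_density_map(depths):
    # Stage 1b: project per-panorama depths onto a shared floor plane,
    # accumulating a global (noisy) occupancy/density map.
    grid = {}
    for i, d in enumerate(depths):
        for j, _ in enumerate(d):
            grid[(i, j)] = grid.get((i, j), 0) + 1
    return grid

def denoise_density_map(density):
    # Stage 2: latent-diffusion refinement of the density map (stubbed).
    return density

def segment_floor_plan(density):
    # Stage 3: segmentation into rooms, openings, and clutter (stubbed).
    return {"rooms": [0, 1], "openings": [(0, 1)], "clutter": []}

def build_graph(panos, layout):
    # Stage 4: one MCOP node per capture location; arcs route through
    # openings (path optimization and view synthesis omitted here).
    g = SceneGraph()
    g.nodes = [Node(i, (float(i), 0.0)) for i in range(len(panos))]
    for a, b in layout["openings"]:
        g.arcs.append((a, b, [g.nodes[a].position, g.nodes[b].position]))
    return g

def panofloor_pipeline(panos):
    depths = [predict_depth(p) for p in panos]
    density = denoise_density_map(project_to_density_map(depths))
    layout = segment_floor_plan(density)
    return build_graph(panos, layout)
```

For two toy panoramas, `panofloor_pipeline([[0, 0], [0, 0]])` yields a graph with two nodes and a single arc through the one detected opening; in the actual method these stages are realized by the trained networks and the path optimizer described above.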
Reference and download information
Giovanni Pintore, Sara Jashari, Marco Agus, and Enrico Gobbetti. PanoFloor: reconstruction and immersive exploration of large multi-room scenes from a minimal set of registered panoramic images using denoised density maps. In Proc. IEEE ISMAR, October 2025. To appear.
Bibtex citation record
@inproceedings{Pintore:2025:PRI,
  author    = {Giovanni Pintore and Sara Jashari and Marco Agus and Enrico Gobbetti},
  title     = {{PanoFloor}: reconstruction and immersive exploration of large multi-room scenes from a minimal set of registered panoramic images using denoised density maps},
  booktitle = {Proc. IEEE ISMAR},
  month     = {October},
  year      = {2025},
  note      = {To appear},
  url       = {http://vic.crs4.it/vic/cgi-bin/bib-page.cgi?id='Pintore:2025:PRI'},
}
The publications listed here are included as a means to ensure timely
dissemination of scholarly and technical work on a non-commercial basis.
Copyright and all rights therein are maintained by the authors or by
other copyright holders, notwithstanding that they have offered their works
here electronically. It is understood that all persons copying this
information will adhere to the terms and constraints invoked by each
author's copyright. These works may not be reposted without the
explicit permission of the copyright holder.
Please contact the authors if you wish to republish this work in
a book, journal, on the Web, or elsewhere. Thank you in advance.