Scale-Invariant Monocular Depth Estimation via SSI Depth


S. Mahdi H. Miangoleh, Mahesh Reddy, and Yağız Aksoy

Proc. SIGGRAPH, 2024

(top) We propose a framework to generate high-resolution scale-invariant (SI) depth from a single image that can be projected to geometrically accurate point clouds of complex scenes. Our generalization ability comes from formulating SI depth estimation with SSI inputs. (bottom) For this purpose, we introduce a novel scale- and shift-invariant (SSI) depth estimation formulation that excels in generating intricate details.

Abstract

Existing methods for scale-invariant monocular depth estimation (SI MDE) often struggle due to the complexity of the task and the limited, non-diverse datasets available, which hinders generalizability in real-world scenarios. Meanwhile, shift-and-scale-invariant (SSI) depth estimation, which simplifies the task and enables training with abundant stereo datasets, achieves high performance. We present a novel approach that leverages SSI inputs to enhance SI depth estimation, streamlining the network's role and facilitating in-the-wild generalization for SI depth estimation while only using a synthetic dataset for training. Emphasizing the generation of high-resolution details, we introduce a novel sparse ordinal loss that substantially improves detail generation in SSI MDE, addressing critical limitations in existing approaches. Through in-the-wild qualitative examples and zero-shot evaluation, we substantiate the practical utility of our approach in computational photography applications, showcasing its ability to generate highly detailed SI depth maps and achieve generalization in diverse scenarios.
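
To make the SI/SSI distinction concrete, the following is a minimal sketch of the two ambiguities, not the paper's implementation. It assumes an SI estimate matches the reference up to one unknown scale, while an SSI estimate matches it up to an unknown scale and an unknown shift, and it recovers those unknowns by ordinary least squares. The function names `align_scale` and `align_scale_shift` are hypothetical, and the inputs are plain lists of per-pixel values.

```python
def align_scale(pred, target):
    """Least-squares scale s minimizing sum((s * p - t) ** 2).

    Models the scale-invariant (SI) case: one unknown factor.
    """
    num = sum(p * t for p, t in zip(pred, target))
    den = sum(p * p for p in pred)
    return num / den


def align_scale_shift(pred, target):
    """Least-squares (s, b) minimizing sum((s * p + b - t) ** 2).

    Models the scale-and-shift-invariant (SSI) case: two unknowns.
    """
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var = sum((p - mean_p) ** 2 for p in pred)
    s = cov / var
    b = mean_t - s * mean_p
    return s, b


# Illustrative values only (not data from the paper).
target = [1.0, 2.0, 4.0]
si_pred = [0.5, 1.0, 2.0]    # target scaled by 0.5
ssi_pred = [3.5, 4.0, 5.0]   # target scaled by 0.5, then shifted by 3

scale = align_scale(si_pred, target)
scale2, shift2 = align_scale_shift(ssi_pred, target)
```

The SI estimate needs only a single multiplier to match the reference, whereas the SSI estimate also needs an offset. That extra unknown is why an SSI output cannot be projected directly to a geometrically meaningful point cloud.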

Implementation

GitHub Repository

Video

SIGGRAPH Presentation

Paper

Poster

BibTeX

@INPROCEEDINGS{miangolehSIDepth,
author={S. Mahdi H. Miangoleh and Mahesh Reddy and Ya\u{g}{\i}z Aksoy},
title={Scale-Invariant Monocular Depth Estimation via SSI Depth},
booktitle={Proc. SIGGRAPH},
year={2024},
}

Related Publications

Boosting Monocular Depth Estimation Models to High Resolution

S. Mahdi H. Miangoleh*, Sebastian Dille*, Long Mai, Sylvain Paris, and Yağız Aksoy

CVPR, 2021

Abstract

Neural networks have shown great abilities in estimating depth from a single image. However, the inferred depth maps are well below one-megapixel resolution and often lack fine-grained details, which limits their practicality. Our method builds on our analysis of how the input resolution and the scene structure affect depth estimation performance. We demonstrate that there is a trade-off between a consistent scene structure and the high-frequency details, and merge low- and high-resolution estimations to take advantage of this duality using a simple depth merging network. We present a double estimation method that improves the whole-image depth estimation and a patch selection method that adds local details to the final result. We demonstrate that by merging estimations at different resolutions with changing context, we can generate multi-megapixel depth maps with a high level of detail using a pre-trained model.

Manuscript & more

BibTeX

@INPROCEEDINGS{Miangoleh2021Boosting,
author={S. Mahdi H. Miangoleh and Sebastian Dille and Long Mai and Sylvain Paris and Ya\u{g}{\i}z Aksoy},
title={Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging},
booktitle={Proc. CVPR},
year={2021},
}

Interactive Editing of Monocular Depth

Obumneme Stanley Dukor, S. Mahdi H. Miangoleh, Mahesh Kumar Krishna Reddy, Long Mai, and Yağız Aksoy

SIGGRAPH Posters, 2022

Abstract

Recent advances in computer vision have made 3D structure-aware editing of still photographs a reality. Such computational photography applications use a depth map that is automatically generated by monocular depth estimation methods to represent the scene structure. In this work, we present a lightweight, web-based interactive depth editing and visualization tool that adapts low-level conventional image editing operations for geometric manipulation to enable artistic control in the 3D photography workflow. Our tool provides real-time feedback on the geometry through a 3D scene visualization to make the depth map editing process more intuitive for artists. Our web-based tool is open-source and platform-independent to support wider adoption of 3D photography techniques in everyday digital photography.

Manuscript & more

BibTeX

@INPROCEEDINGS{interactiveDepth,
author={Obumneme Stanley Dukor and S. Mahdi H. Miangoleh and Mahesh Kumar Krishna Reddy and Long Mai and Ya\u{g}{\i}z Aksoy},
title={Interactive Editing of Monocular Depth},
booktitle={SIGGRAPH Posters},
year={2022},
}

Metric Monocular Reconstruction through Ordinal Depth

Mahesh Kumar Krishna Reddy

MSc Thesis, Simon Fraser University, 2022

Abstract

Training a single network for high-resolution and geometrically consistent monocular depth estimation is challenging due to varying scene complexities in the real world. To address this, we present a dual depth estimation setup to decompose the estimations into ordinal and metric depth. The goal of ordinal depth estimation is to leverage novel ordinal losses with relaxed geometric constraints to model local and global ordinal relations for capturing better high-frequency depth details and scene structure. However, ordinal depth inherently lacks geometric structure, and to resolve this, we introduce a metric depth estimation method to enforce geometric constraints on the prior ordinal depth estimations. The estimated scale-invariant metric depth achieves high resolution and is geometrically consistent in generating a meaningful 3D point cloud representation for scene reconstruction. We demonstrate the effectiveness of our ordinal and metric networks by performing zero-shot and in-the-wild depth evaluations with state-of-the-art depth estimation networks.
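
As a rough illustration of what an ordinal loss over sparsely sampled pixel pairs can look like, here is a generic pairwise ranking loss. It is a sketch of the general idea and should not be read as the loss formulation used in the thesis. The function names, the tolerance `tol`, and the uniform random pair sampling are all assumptions made for this example.

```python
import math
import random


def sample_pairs(n, k, seed=0):
    """Draw k random pairs of distinct pixel indices from range(n)."""
    rng = random.Random(seed)
    return [tuple(rng.sample(range(n), 2)) for _ in range(k)]


def pairwise_ordinal_loss(pred, target, pairs, tol=0.02):
    """Generic ranking loss: penalize pairs whose predicted order
    disagrees with the reference depth order.

    pred, target: flat lists of per-pixel values (target must be > 0).
    pairs: list of (i, j) index pairs.
    tol: relative tolerance under which two depths count as equal
         (a hypothetical choice for this sketch).
    """
    total = 0.0
    for i, j in pairs:
        ratio = target[i] / target[j]
        diff = pred[i] - pred[j]
        if ratio > 1.0 + tol:
            # Pixel i is farther: push pred[i] above pred[j].
            total += math.log1p(math.exp(-diff))
        elif ratio < 1.0 / (1.0 + tol):
            # Pixel i is closer: push pred[i] below pred[j].
            total += math.log1p(math.exp(diff))
        else:
            # Roughly equal depth: pull the predictions together.
            total += diff * diff
    return total / len(pairs)


# Illustrative values only.
target = [1.0, 2.0, 3.0, 4.0]
pairs = [(0, 1), (1, 2), (2, 3), (0, 3)]
loss_correct_order = pairwise_ordinal_loss([1.0, 2.0, 3.0, 4.0], target, pairs)
loss_reversed_order = pairwise_ordinal_loss([4.0, 3.0, 2.0, 1.0], target, pairs)
```

Because the loss depends only on the order within each pair, it constrains relative depth relations without enforcing absolute geometry. This matches the abstract's point that ordinal depth needs a separate metric stage to become geometrically consistent.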

Thesis and Presentation

BibTeX

@MASTERSTHESIS{mmd-msc,
author={Mahesh Kumar Krishna Reddy},
title={Metric Monocular Reconstruction through Ordinal Depth},
year={2022},
school={Simon Fraser University},
}