Metric Monocular Reconstruction through Ordinal Depth

Mahesh Kumar Krishna Reddy
MSc Thesis
Simon Fraser University, 2022

Abstract

Training a single network for high-resolution and geometrically consistent monocular depth estimation is challenging due to the varying scene complexities of the real world. To address this, we present a dual depth estimation setup that decomposes the estimation into ordinal and metric depth. The goal of ordinal depth estimation is to leverage novel ordinal losses with relaxed geometric constraints to model local and global ordinal relations, capturing better high-frequency depth details and scene structure. However, ordinal depth inherently lacks geometric structure. To resolve this, we introduce a metric depth estimation method that enforces geometric constraints on the prior ordinal depth estimations. The estimated scale-invariant metric depth achieves high resolution and is geometrically consistent, generating a meaningful 3D point cloud representation for scene reconstruction. We demonstrate the effectiveness of our ordinal and metric networks through zero-shot and in-the-wild depth evaluations against state-of-the-art depth estimation networks.

Dissertation

Video Presentation

BibTeX

@MASTERSTHESIS{mmd-msc,
author={Mahesh Kumar Krishna Reddy},
title={Metric Monocular Reconstruction through Ordinal Depth},
year={2022},
school={Simon Fraser University},
}

Publications in the context of this thesis


Scale-Invariant Monocular Depth Estimation via SSI Depth
S. Mahdi H. Miangoleh, Mahesh Reddy, and Yağız Aksoy
SIGGRAPH, 2024
Existing methods for scale-invariant monocular depth estimation (SI MDE) often struggle due to the complexity of the task and to limited, non-diverse datasets, which hinders generalizability in real-world scenarios. Meanwhile, shift-and-scale-invariant (SSI) depth estimation, which simplifies the task and enables training with abundant stereo datasets, achieves high performance. We present a novel approach that leverages SSI inputs to enhance SI depth estimation, streamlining the network's role and facilitating in-the-wild generalization for SI depth estimation while only using a synthetic dataset for training. Emphasizing the generation of high-resolution details, we introduce a novel sparse ordinal loss that substantially improves detail generation in SSI MDE, addressing critical limitations in existing approaches. Through in-the-wild qualitative examples and zero-shot evaluation, we substantiate the practical utility of our approach in computational photography applications, showcasing its ability to generate highly detailed SI depth maps and achieve generalization in diverse scenarios.
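To make the SI versus SSI distinction concrete, the sketch below shows one common way such predictions are compared against a reference depth map: an SSI prediction is aligned with a least-squares scale and shift, while an SI prediction is aligned with a scale only. This is a generic illustration and not the paper's training or evaluation code; the function names `align_scale_shift` and `align_scale` are hypothetical.

```python
import numpy as np

def align_scale_shift(pred, target):
    """Least-squares scale s and shift t such that s * pred + t ~ target.

    Illustrative helper (hypothetical name): an SSI prediction is only
    meaningful up to both an unknown scale and an unknown shift.
    """
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    # Design matrix with columns [pred, 1] for the affine fit.
    A = np.stack([pred, np.ones_like(pred)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, target, rcond=None)
    return s, t

def align_scale(pred, target):
    """Least-squares scale s such that s * pred ~ target.

    Illustrative helper (hypothetical name): an SI prediction is
    meaningful up to an unknown scale only, with no shift term.
    """
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    return float(np.dot(pred, target) / np.dot(pred, pred))

# Synthetic example: a reference map and two transformed copies of it.
reference = np.linspace(1.0, 5.0, 16).reshape(4, 4)
ssi_pred = (reference - 0.5) / 2.0   # unknown scale and shift
si_pred = reference / 3.0            # unknown scale only

s, t = align_scale_shift(ssi_pred, reference)
k = align_scale(si_pred, reference)
```

The scale-only case keeps depth ratios intact, which is why an SI map can be projected to a geometrically plausible point cloud, whereas an unknown shift distorts those ratios.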
@INPROCEEDINGS{miangolehSIDepth,
author={S. Mahdi H. Miangoleh and Mahesh Reddy and Ya\u{g}{\i}z Aksoy},
title={Scale-Invariant Monocular Depth Estimation via SSI Depth},
booktitle={Proc. SIGGRAPH},
year={2024},
}