lunduniversity.lu.se

Digit@LTH

Faculty of Engineering, LTH


Digit@LTH: Events

Halfway Seminar Patrik Persson

Seminar

From: 2021-12-13 13:15 to 15:00
Place: MH:309A
Contact: carl [dot] olsson [at] math [dot] lth [dot] se


A mixture of traditional and deep learning methods for structure from motion

In this presentation I will describe several methods for solving different parts of the structure from motion (SfM) problem. Structure from motion has been an active research field for several decades; the goal is to estimate camera motion and 3D structure from images. This can be formalized as a large non-linear optimization problem in which the parameters are adjusted to fit the data. Because of its size, the problem can be infeasible to solve on smaller platforms, so I will present a fast approximate method that uses the trifocal constraints to reduce the problem size by orders of magnitude. Furthermore, the optimization needs a good initial solution in order to converge, and such a solution can be obtained using minimal solvers. I will therefore also present our work on minimal solvers for relative pose estimation, where we assume that the camera motion can be partially observed using inertial measurement data, which allows us to find very efficient solvers for pose, focal length and radial distortion parameters.
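
For readers unfamiliar with the formulation, the large non-linear optimization problem mentioned above is commonly written as a bundle adjustment objective that minimizes reprojection error. The notation below is an assumption made for illustration and is not taken from the talk:

```latex
% Illustrative notation only (assumed, not from the talk):
%   P_i        parameters of camera i (pose and, where applicable, intrinsics)
%   X_j        3D point j
%   x_{ij}     observed image position of point j in camera i
%   \pi        projection of a 3D point into a camera
%   \mathcal{V} set of (camera, point) pairs for which an observation exists
\min_{\{P_i\},\,\{X_j\}} \;
  \sum_{(i,j) \in \mathcal{V}}
  \left\| x_{ij} - \pi(P_i, X_j) \right\|^2
```

The number of unknowns grows with both the number of cameras and the number of 3D points, which is why the problem becomes expensive on small platforms and why a good starting point matters for convergence.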

The methods above are directed towards finding camera motion and a sparse 3D structure. In the final work to be presented, we look at dense 3D reconstruction using a combination of deep learning and traditional photometric methods. We present a self-supervised deep learning method that learns a low-dimensional latent space representation of depth, parameterizing a family of plausible depth maps given an image. Given additional images, we can search this low-dimensional space for the depth map that gives a photo-consistent solution while remaining plausible based on the experience encoded in the network.
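
One way to picture the search in the latent space is as a photometric minimization over the latent code. This is only a sketch under assumed notation; the actual loss and parameterization used in the work may differ:

```latex
% Illustrative notation only (assumed, not from the talk):
%   I_0        reference image, I_k additional images
%   z          low-dimensional latent code
%   D_\theta   trained network mapping (z, I_0) to a depth map
%   w_k        warp of pixel u into image k, given its depth and the relative pose
%   \rho       a robust penalty on the photometric residual
z^{\ast} = \arg\min_{z} \;
  \sum_{k} \sum_{u}
  \rho\!\left( I_0(u) - I_k\!\big( w_k(u,\, D_\theta(z, I_0)(u)) \big) \right)
```

Because the depth map is always produced by the network, the solution stays within the family of depth maps the network considers plausible, while the photometric term selects the member of that family that best agrees with the additional images.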