What Sparse Light Field Coding Reveals about Scene Structure

Ole Johannsen, Antonin Sulc, and Bastian Goldluecke


Sparse coding scheme for accurate light field depth estimation with multiple depth layers.

Idea: explain EPI patches as sparse linear combinations of atoms with known disparity:

  • estimate set of lower dimensional generators from center view
  • lift each generator element to EPI space according to a set of disparities
  • sparsely encode every EPI patch with the resulting set of atoms
  • analyze sparse coding coefficients to estimate final patch disparities
  • Robust depth estimation
  • Depth estimation for multiple layers at different depths

Light fields and epipolar plane images (EPIs)

Light field: regular grid of views, identical cameras with parallel optical axes, parametrized with view coordinates (s, t) and image coordinates (x, y).
2D EPIs: horizontal slices in (x, s) with (y, t) held fixed, or vertical slices in (y, t) with (x, s) held fixed. The projection of a 3D point is a line on an EPI whose orientation corresponds to disparity.
Superimposed layers (reflections or transparent objects): two superimposed orientations.
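
A minimal sketch of the EPI slicing described above. The array layout `lf[t, s, y, x]`, the helper names, and the sign convention for disparity are assumptions made for illustration, not taken from the poster.

```python
import numpy as np

def horizontal_epi(lf, t, y):
    """Return the (s, x) slice of a light field stored as lf[t, s, y, x]."""
    return lf[t, :, y, :]

def vertical_epi(lf, s, x):
    """Return the (t, y) slice of a light field stored as lf[t, s, y, x]."""
    return lf[:, s, :, x]

def synthetic_light_field(n_views=5, size=16, x0=8, disparity=1):
    """Toy light field containing a single bright point.

    The point moves by `disparity` pixels per view step (assumed sign convention).
    """
    lf = np.zeros((n_views, n_views, size, size))
    c = n_views // 2
    for t in range(n_views):
        for s in range(n_views):
            lf[t, s, x0 + disparity * (t - c), x0 + disparity * (s - c)] = 1.0
    return lf

lf = synthetic_light_field()
epi = horizontal_epi(lf, t=2, y=8)
# Column of the bright pixel in each view row s: a straight line in the EPI
trace = epi.argmax(axis=1)
```

In this toy example the point traces a line whose slope (pixels per view) equals the disparity used to generate it.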

Dictionary construction

Key idea: each atom corresponds to a unique disparity

Generators (blue)

Generators are patches in the center view, obtained from dictionary learning or PCA.
Choice of different formats: 1D, crosshair and 2D patches, resulting in 2D and 4D dictionary
elements. In our experience, 1D generators (i.e. 2D atoms) work best.
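
A minimal sketch of obtaining 1D generators by PCA, assuming a grayscale center view and horizontal 1D patches. The function names, patch width, and number of generators are illustrative assumptions; the poster does not specify them.

```python
import numpy as np

def extract_1d_patches(image, width):
    """All horizontal 1D patches of the given width from a grayscale image."""
    windows = np.lib.stride_tricks.sliding_window_view(image, width, axis=1)
    return windows.reshape(-1, width)

def pca_generators(patches, n_generators):
    """Leading principal components of the mean-free patches, one generator per row."""
    X = patches - patches.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal principal directions, sorted by explained variance
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:n_generators]

rng = np.random.default_rng(0)
center_view = rng.random((32, 32))   # stand-in for a real center view
patches = extract_1d_patches(center_view, width=13)
generators = pca_generators(patches, n_generators=6)
```

Dictionary learning could replace the PCA step here; PCA is shown only because it needs nothing beyond an SVD.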

Dictionary atoms (red)
The generators are shifted according to a discrete set of disparities. Because the generators
are larger than the atoms, the shifted generator (yellow) must be cropped to yield the final
atom (red).

The complete dictionary D

For each combination of disparity label and generator we generate one dictionary element.
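
The shift-and-crop construction can be sketched as follows for 1D generators and 2D atoms. Linear interpolation for sub-pixel shifts, unit-norm atoms, the function names, and the sign convention are assumptions made for this sketch, not details confirmed by the poster.

```python
import numpy as np

def lift_generator(generator, disparity, n_views, atom_width):
    """Lift a 1D generator to a 2D EPI atom of shape (n_views, atom_width).

    The generator should be at least atom_width + 2 * |disparity| * (n_views // 2)
    samples long; np.interp repeats the edge values otherwise.
    """
    center = (len(generator) - 1) / 2.0
    half = (atom_width - 1) / 2.0
    crop = np.arange(atom_width) - half + center   # crop window in generator coordinates
    grid = np.arange(len(generator))
    rows = []
    for k in range(n_views):
        offset = disparity * (k - n_views // 2)    # shift grows linearly with view index
        rows.append(np.interp(crop - offset, grid, generator))
    return np.stack(rows)

def build_dictionary(generators, disparities, n_views, atom_width):
    """One unit-norm column per (disparity, generator) pair, plus each column's disparity label."""
    atoms, labels = [], []
    for d in disparities:
        for g in generators:
            a = lift_generator(g, d, n_views, atom_width).ravel()
            norm = np.linalg.norm(a)
            if norm > 0:
                atoms.append(a / norm)
                labels.append(d)
    return np.stack(atoms, axis=1), np.array(labels)

rng = np.random.default_rng(0)
generators = rng.standard_normal((3, 21))
disparities = np.linspace(-2.0, 2.0, 9)
D, labels = build_dictionary(generators, disparities, n_views=5, atom_width=9)
```

Keeping the `labels` array alongside D is what later allows coefficients to be grouped by disparity.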

Sparse coding for disparity estimation

Each EPI patch is encoded as a sparse linear combination of the atoms of D.
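
One common way to compute such a sparse code is the Lasso, solved here with plain ISTA. This is a sketch under the assumption of an L1-regularised least-squares objective; the solver, the value of `lam`, and the function names are illustrative and not taken from the poster.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def sparse_code(D, patch, lam=0.05, n_iter=500):
    """Minimise 0.5 * ||patch - D a||^2 + lam * ||a||_1 with ISTA."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - patch)          # gradient of the quadratic data term
        a = soft_threshold(a - step * grad, step * lam)
    return a
```

Here `D` holds one atom per column and `patch` is the flattened EPI patch; larger `lam` yields fewer non-zero coefficients.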

Idea: analysis of the coefficient distribution should give information about depth layers.

Lambertian surfaces: one disparity layer

Sparse coding coefficients are pooled into groups with the same disparity. A single Gaussian is
fitted to the data, after which we perform variational smoothing on the mean values, weighted
with the standard deviation of the Gaussian.
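
A minimal sketch of the pooling and single-Gaussian fit, assuming that coefficients are pooled by summing absolute values and that the Gaussian is fitted by weighted moments. Both choices, and the function names, are assumptions; the variational smoothing step is not shown.

```python
import numpy as np

def pool_by_disparity(coeffs, labels, disparities):
    """Sum of absolute coefficient magnitudes for each disparity label."""
    return np.array([np.abs(coeffs[labels == d]).sum() for d in disparities])

def fit_gaussian(disparities, weights):
    """Weighted mean and standard deviation of the pooled distribution.

    Assumes at least one non-zero weight.
    """
    w = weights / weights.sum()
    mean = np.sum(w * disparities)
    std = np.sqrt(np.sum(w * (disparities - mean) ** 2))
    return mean, std
```

The mean serves as the per-patch disparity estimate, and a small standard deviation indicates a confident estimate.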

We also use a statistical test for two-peakedness of a distribution, and embed the score
into a binary segmentation problem to obtain a mask for possible two-layered regions.

Twin peaks: two disparity layers

In the candidate regions for the presence of two layers, we fit a two-component GMM to
the distribution of coefficients. Starting from the two means and their midpoint, we iteratively
optimise over the separation point and the two disparities in a variational framework.
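
A minimal sketch of fitting a two-component 1D Gaussian mixture to the pooled coefficients, treating each disparity label as a sample weighted by its pooled coefficient mass. Using EM, the initialisation, and the function name are assumptions; the poster does not state how the GMM is fitted, and the subsequent variational optimisation is not shown.

```python
import numpy as np

def fit_two_gaussians(x, w, n_iter=50, min_std=1e-3):
    """EM for a two-component 1D Gaussian mixture on weighted samples (x_i, w_i)."""
    w = w / w.sum()
    mu = np.array([x.min(), x.max()], dtype=float)          # crude start at the extremes
    sigma = np.full(2, max((x.max() - x.min()) / 4.0, min_std))
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample
        z = (x - mu[:, None]) / sigma[:, None]
        dens = pi[:, None] * np.exp(-0.5 * z ** 2) / sigma[:, None]
        resp = dens / np.maximum(dens.sum(axis=0), 1e-300)
        # M-step: weighted updates of means, standard deviations, and mixing weights
        mass = np.maximum((resp * w).sum(axis=1), 1e-12)
        mu = (resp * w * x).sum(axis=1) / mass
        var = (resp * w * (x - mu[:, None]) ** 2).sum(axis=1) / mass
        sigma = np.maximum(np.sqrt(var), min_std)
        pi = mass / mass.sum()
    return mu, sigma, pi

x = np.linspace(-2.0, 2.0, 41)                               # disparity labels
w = np.exp(-0.5 * ((x + 1) / 0.2) ** 2) + np.exp(-0.5 * ((x - 1) / 0.2) ** 2)
mu, sigma, pi = fit_two_gaussians(x, w)
split = mu.mean()                                            # midpoint between the two means
```

The two means and `split` correspond to the starting values (two disparities and a separation point) mentioned above.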

This work was supported by the ERC Starting Grant "Light Field Imaging and Analysis" (LIA 336978, FP7-2014).
Presented at the Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, June 2016.