Soft Segmentation of Images

Yagiz Aksoy
PhD Thesis
ETH Zurich, 2019

We approach the soft segmentation problem through two complementary properties of a photograph: color and semantics. Each part features an interactive technique commonly used in movie post-production and a fully automatic formulation that generates intermediate representations of photographs for easy and realistic manipulation.

Abstract

Realistic editing of photographs requires careful treatment of color mixtures that commonly occur in natural scenes. These color mixtures are typically modeled using soft selection of objects or scene colors. Hence, accurate representation of these soft transitions between image regions is essential for high-quality image editing and compositing. Current techniques for generating such representations depend heavily on interaction by a skilled visual artist, as creating such accurate object selections is a tedious task.

In this thesis, we approach the soft segmentation problem through two complementary properties of a photograph. Our first focus is representing images as a mixture of the main colors in the scene, by estimating soft segments of homogeneous colors. We present a robust per-pixel nonlinear optimization formulation that simultaneously targets computational efficiency and high accuracy. We then turn our attention to semantics in a photograph and present our work on soft segmentation of particular objects in a given scene. This work features graph-based formulations that specifically target the accurate representation of soft transitions in linear systems. Each part first presents an interactive segmentation scheme that targets applications popular in professional compositing and movie post-production. The interactive formulations are then generalized to the automatic estimation of generic image representations that can be used to perform a number of otherwise complex image editing tasks effortlessly.

The first problem studied is green-screen keying: the interactive estimation of a clean foreground layer with accurate opacities in a studio setup with a controlled background, typically set to be green. We present a simple two-step interaction scheme to determine the main scene colors and their locations. The soft segmentation of the foreground layer is done via the novel color unmixing formulation, which can effectively represent a pixel color as a mixture of many colors characterized by statistical distributions. We show that our formulation is robust against many challenges in green-screen keying and can be used to achieve production-quality keying results in a fraction of the time required by commercial software.
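
One way to write down the mixture model described above is sketched below. The symbols are assumptions introduced for illustration: $c$ is the observed pixel color, $\alpha_i$ and $u_i$ are the opacity and color of layer $i$, and $\mu_i$, $\Sigma_i$ are the mean and covariance of the $i$-th color distribution. The exact energy and constraints used in the thesis should be taken from the dissertation itself.

```latex
\begin{gather}
  c = \sum_{i=1}^{N} \alpha_i u_i, \qquad
  \sum_{i=1}^{N} \alpha_i = 1, \qquad
  \alpha_i \in [0, 1], \\
  \min_{\{\alpha_i,\, u_i\}} \; \sum_{i=1}^{N} \alpha_i \, D_i(u_i), \qquad
  D_i(u) = (u - \mu_i)^{\top} \Sigma_i^{-1} (u - \mu_i).
\end{gather}
```

Under this reading, a layer is cheap to use when its color $u_i$ lies close to the center of its distribution, so the minimizer favors explaining a pixel with colors that are typical for each layer.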

We then study soft color segmentation: the estimation of layers with homogeneous colors and corresponding opacities. The soft color segments can be overlaid to give the original image, providing an effective intermediate representation of an image. We decompose the global energy optimization formulation that typically models the soft color segmentation task into three sub-problems that can be implemented with computational efficiency and scalability. Our formulation gets its strength from the color unmixing energy, which is essential in ensuring homogeneous layer colors and accurate opacities. We show that our method achieves a segmentation quality that allows realistic manipulation of colors in natural photographs.
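
As a minimal illustration of the overlay property described above (not the thesis implementation), the Python sketch below builds two hypothetical layers whose opacities sum to one at every pixel and recombines them into an image by an opacity-weighted sum. The function name `recombine_layers`, the array shapes, and the layer colors are assumptions made for this example.

```python
import numpy as np

def recombine_layers(alphas, colors):
    """Overlay soft color layers as the sum over layers of alpha_i * color_i.

    alphas: (K, H, W) opacities, assumed to sum to 1 at every pixel.
    colors: (K, H, W, 3) per-layer RGB colors.
    """
    return np.sum(alphas[..., None] * colors, axis=0)

# Hypothetical 2-layer decomposition of a 1x2 image.
alphas = np.array([[[1.0, 0.25]],
                   [[0.0, 0.75]]])   # shape (2, 1, 2)
colors = np.zeros((2, 1, 2, 3))
colors[0] = [1.0, 0.0, 0.0]          # red layer
colors[1] = [0.0, 0.0, 1.0]          # blue layer

image = recombine_layers(alphas, colors)
```

Editing a single layer's colors before recombining changes only the pixels where that layer has nonzero opacity, which is what makes such a decomposition convenient for color manipulation.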

Natural image matting is the generalized version of green-screen keying, where an accurate estimation of foreground opacities is targeted in an unconstrained setting. We take a graph-based approach to this problem, modeling the connections in the graph as forms of information flow that distribute the information from the user input into the whole image. By carefully defining information flows to target challenging regions in complex foreground structures, we show that high-quality soft segmentation of objects can be estimated through a closed-form solution of a linear system. We extend our approach to related problems in natural image matting such as matte refinement and layer color estimation and demonstrate the effectiveness of our formulation through quantitative, qualitative and theoretical analysis.
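
The general idea of propagating user-provided opacities through a graph by solving one linear system can be sketched as follows. This is a generic Laplacian-based propagation on a toy 5-pixel chain, not the information-flow definitions of the thesis; the function name `propagate_alpha`, the uniform affinities, and the constraint weight `lam` are all assumptions made for this example.

```python
import numpy as np

def propagate_alpha(weights, known_mask, known_alpha, lam=100.0):
    """Solve (L + lam*T) a = lam*T*k for the opacities a.

    weights:     (N, N) symmetric, non-negative pixel affinities.
    known_mask:  (N,) boolean, True where the user supplied an opacity.
    known_alpha: (N,) user-supplied opacities (ignored where mask is False).
    lam:         weight of the soft constraint on known pixels.
    """
    W = np.asarray(weights, dtype=float)
    L = np.diag(W.sum(axis=1)) - W            # graph Laplacian
    T = np.diag(known_mask.astype(float))     # selects constrained pixels
    return np.linalg.solve(L + lam * T, lam * (T @ known_alpha))

# Toy example: a chain of 5 pixels with unit affinity between neighbors.
n = 5
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

known_mask = np.array([True, False, False, False, True])
known_alpha = np.array([0.0, 0.0, 0.0, 0.0, 1.0])  # background ... foreground

alpha = propagate_alpha(W, known_mask, known_alpha)
```

In this toy case the solution ramps smoothly from the background end to the foreground end. In practice the affinities would be derived from image content, and a sparse solver would replace the dense `np.linalg.solve` call.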

Finally, we introduce semantic soft segments, a set of layers that correspond to semantically meaningful regions in an image with accurate soft transitions between different objects. We approach this problem from a spectral segmentation angle and propose a graph structure that embeds texture and color features from the image as well as higher-level semantic information generated by a neural network. The soft segments are generated fully automatically via eigendecomposition of a carefully constructed Laplacian matrix. We demonstrate that compositing and targeted image editing tasks can be done with little effort using semantic soft segments.
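
The spectral principle behind this approach can be illustrated with a small, hand-made graph. The sketch below is only a generic example of splitting a graph using an eigenvector of its Laplacian; it does not use image or semantic features and does not produce the soft layers described above. The affinity values and variable names are assumptions made for this example.

```python
import numpy as np

# Hypothetical affinities: two tightly connected groups of three nodes,
# joined by a single weak edge between node 2 and node 3.
n = 6
W = np.zeros((n, n))
for group in (range(0, 3), range(3, 6)):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.01

# Graph Laplacian L = D - W, where D holds the node degrees.
L = np.diag(W.sum(axis=1)) - W

# eigh returns eigenvalues in ascending order for symmetric matrices.
eigenvalues, eigenvectors = np.linalg.eigh(L)

# The eigenvector of the second-smallest eigenvalue (the Fiedler vector)
# takes opposite signs on the two weakly connected groups.
fiedler = eigenvectors[:, 1]
labels = fiedler > 0
```

Here the sign of the Fiedler vector gives a hard two-way split; spectral methods for soft segmentation instead combine several low-eigenvalue eigenvectors into layers with fractional opacities.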

Dissertation

SIGGRAPH Thesis Fast Forward

A high-level description of the doctoral work aimed at a general audience, presented at SIGGRAPH 2019.

BibTeX

@phdthesis{ssi,
author={Ya\u{g}{\i}z Aksoy},
title={Soft Segmentation of Images},
year={2019},
school={ETH Zurich},
}

Publications in the context of this thesis


Semantic Soft Segmentation
Yağız Aksoy, Tae-Hyun Oh, Sylvain Paris, Marc Pollefeys and Wojciech Matusik
ACM Transactions on Graphics (Proc. SIGGRAPH), 2018
Accurate representation of soft transitions between image regions is essential for high-quality image editing and compositing. Current techniques for generating such representations depend heavily on interaction by a skilled visual artist, as creating such accurate object selections is a tedious task. In this work, we introduce semantic soft segments, a set of layers that correspond to semantically meaningful regions in an image with accurate soft transitions between different objects. We approach this problem from a spectral segmentation angle and propose a graph structure that embeds texture and color features from the image as well as higher-level semantic information generated by a neural network. The soft segments are generated fully automatically via eigendecomposition of a carefully constructed Laplacian matrix. We demonstrate that otherwise complex image editing tasks can be done with little effort using semantic soft segments.
@ARTICLE{sss,
author={Ya\u{g}{\i}z Aksoy and Tae-Hyun Oh and Sylvain Paris and Marc Pollefeys and Wojciech Matusik},
title={Semantic Soft Segmentation},
journal={ACM Trans. Graph. (Proc. SIGGRAPH)},
year={2018},
pages = {72:1--72:13},
volume = {37},
number = {4}
}

Unmixing-Based Soft Color Segmentation for Image Manipulation
Yağız Aksoy, Tunç Ozan Aydın, Aljoša Smolić and Marc Pollefeys
ACM Transactions on Graphics, 2017
We present a new method for decomposing an image into a set of soft color segments, which are analogous to color layers with alpha channels that have been commonly utilized in modern image manipulation software. We show that the resulting decomposition serves as an effective intermediate image representation, which can be utilized for performing various, seemingly unrelated image manipulation tasks. We identify a set of requirements that soft color segmentation methods have to fulfill, and present an in-depth theoretical analysis of prior work. We propose an energy formulation for producing compact layers of homogeneous colors and a color refinement procedure, as well as a method for automatically estimating a statistical color model from an image. This results in a novel framework for automatic and high-quality soft color segmentation, which is efficient, parallelizable, and scalable. We show that our technique is superior in quality compared to previous methods through quantitative analysis as well as visually through an extensive set of examples. We demonstrate that our soft color segments can easily be exported to familiar image manipulation software packages and used to produce compelling results for numerous image manipulation applications without forcing the user to learn new tools and workflows.
@ARTICLE{scs,
author={Ya\u{g}{\i}z Aksoy and Tun\c{c} Ozan Ayd{\i}n and Aljo\v{s}a Smoli\'{c} and Marc Pollefeys},
title={Unmixing-Based Soft Color Segmentation for Image Manipulation},
journal={ACM Trans. Graph.},
year={2017},
pages = {19:1--19:19},
volume = {36},
number = {2}
}

Designing Effective Inter-Pixel Information Flow for Natural Image Matting
Yağız Aksoy, Tunç Ozan Aydın and Marc Pollefeys
CVPR, 2017 (spotlight)
We present a novel, purely affinity-based natural image matting algorithm. Our method relies on carefully defined pixel-to-pixel connections that enable effective use of information available in the image and the trimap. We control the information flow from the known-opacity regions into the unknown region, as well as within the unknown region itself, by utilizing multiple definitions of pixel affinities. This way we achieve significant improvements in matte quality near challenging regions of the foreground object. Among other forms of information flow, we introduce color-mixture flow, which builds upon local linear embedding and effectively encapsulates the relation between different pixel opacities. Our resulting novel linear system formulation can be solved in closed form and is robust against several fundamental challenges in natural matting such as holes and remote intricate structures. While our method is primarily designed as a standalone natural matting tool, we show that it can also be used for regularizing mattes obtained by various sampling-based methods. Our evaluation using the public alpha matting benchmark suggests a significant performance improvement over the state-of-the-art.
@INPROCEEDINGS{ifm,
author={Aksoy, Ya\u{g}{\i}z and Ayd{\i}n, Tun\c{c} Ozan and Pollefeys, Marc},
booktitle={Proc. CVPR},
title={Designing Effective Inter-Pixel Information Flow for Natural Image Matting},
year={2017},
}

Interactive High-Quality Green-Screen Keying via Color Unmixing
Yağız Aksoy, Tunç Ozan Aydın, Marc Pollefeys and Aljoša Smolić
ACM Transactions on Graphics, 2016
Due to the widespread use of compositing in contemporary feature films, green-screen keying has become an essential part of post-production workflows. To comply with the ever-increasing quality requirements of the industry, specialized compositing artists spend countless hours using multiple commercial software tools, while eventually having to resort to manual painting because of the many shortcomings of these tools. Due to the sheer amount of manual labor involved in the process, new green-screen keying approaches that produce better keying results with less user interaction are welcome additions to the compositing artist's arsenal. We found that --- contrary to the common belief in the research community --- production-quality green-screen keying is still an unresolved problem with its unique challenges. In this paper, we propose a novel green-screen keying method utilizing a new energy minimization-based color unmixing algorithm. We present comprehensive comparisons with commercial software packages and relevant methods in literature, which show that the quality of our results is superior to any other currently available green-screen keying solution. Importantly, using the proposed method, these high-quality results can be generated using only one-tenth of the manual editing time that a professional compositing artist requires to process the same content having all previous state-of-the-art tools at his disposal.
@ARTICLE{keying,
author={Ya\u{g}{\i}z Aksoy and Tun\c{c} Ozan Ayd{\i}n and Marc Pollefeys and Aljo\v{s}a Smoli\'{c}},
title={Interactive High-Quality Green-Screen Keying via Color Unmixing},
journal={ACM Trans. Graph.},
year={2016},
volume = {35},
number = {5},
pages = {152:1--152:12},
}