TL;DR
This paper introduces a computational method to synthesize shallow depth-of-field images on mobile phones using single-camera input, combining segmentation and depth estimation for realistic defocus effects.
Contribution
It presents a fully automatic system that synthesizes shallow depth-of-field effects on a mobile phone from a single camera and a single button press. Dual-pixel auto-focus hardware is used to compute depth when available, but is not required.
Findings
Processes 5.4 MP images in 4 seconds on a mobile device
Works with or without dual-pixel hardware, and on scenes with or without a person
Produces realistic defocused images suitable for non-expert users
Abstract
Shallow depth-of-field is commonly used by photographers to isolate a subject from a distracting background. However, standard cell phone cameras cannot produce such images optically, as their short focal lengths and small apertures capture nearly all-in-focus images. We present a system to computationally synthesize shallow depth-of-field images with a single mobile camera and a single button press. If the image is of a person, we use a person segmentation network to separate the person and their accessories from the background. If available, we also use dense dual-pixel auto-focus hardware, effectively a 2-sample light field with an approximately 1 millimeter baseline, to compute a dense depth map. These two signals are combined and used to render a defocused image. Our system can process a 5.4 megapixel image in 4 seconds on a mobile phone, is fully automatic, and is robust enough to…
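The abstract's final step, combining a person mask with a depth map to render defocus, can be illustrated with a minimal sketch. This is not the paper's renderer: it assumes a grayscale image, uses a simple box blur chosen per pixel from a small stack of pre-blurred layers, and every name and parameter here (`box_blur`, `synthetic_defocus`, `blur_per_disparity`, `max_radius`, the median-based focus choice) is hypothetical and introduced only for illustration.

```python
import numpy as np


def box_blur(img, radius):
    """Blur a 2-D float image with a (2r+1) x (2r+1) box filter, edge-padded."""
    if radius == 0:
        return img.astype(np.float64)
    k = 2 * radius + 1
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    acc = np.zeros((h, w), dtype=np.float64)
    # Sum every shifted copy of the image that falls inside the box window.
    for dy in range(k):
        for dx in range(k):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / (k * k)


def synthetic_defocus(image, disparity, person_mask,
                      focus_disparity=None,
                      blur_per_disparity=1.0, max_radius=3):
    """Illustrative defocus: blur grows with distance from the focus plane.

    image        : 2-D float array (grayscale, an assumption for brevity)
    disparity    : 2-D float array, same shape (stand-in for a depth map)
    person_mask  : 2-D bool array, True where the subject is
    """
    if focus_disparity is None:
        if person_mask.any():
            # Assumption: focus on the subject's median disparity.
            focus_disparity = np.median(disparity[person_mask])
        else:
            # Assumption: with no person, fall back to the scene median.
            focus_disparity = np.median(disparity)

    # Per-pixel blur radius from the disparity difference to the focus plane.
    radius = np.rint(blur_per_disparity * np.abs(disparity - focus_disparity))
    radius = np.clip(radius, 0, max_radius).astype(int)

    # The segmentation signal overrides depth: the subject stays sharp.
    radius[person_mask] = 0

    # Pick each output pixel from the layer blurred at that pixel's radius.
    out = np.empty(image.shape, dtype=np.float64)
    for r in range(max_radius + 1):
        selected = radius == r
        if selected.any():
            out[selected] = box_blur(image, r)[selected]
    return out
```

In this sketch the mask takes priority over the depth signal, so pixels labeled as the person are copied through unchanged even where the disparity estimate is noisy. A production renderer would need to handle occlusion boundaries and use a more realistic blur kernel, which this example deliberately omits.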
