Contents
  1. 1. Abstract

##Deep Interactive Video Inpainting: an Invisibility Cloak for Harry Potter

Abstract

In this paper, we propose a new task of deep interactive video inpainting and an application for users interact with the machine. To our knowledge, this is the first deep learning based interactive video inpainting work that only uses a free form user input as guidance (i.e. scribbles) instead of mask annotations for each frame, which has academic, entertainment, and commercial value. With users’ scribbles on a certain frame, it can simultaneously perform interactive video object segmentation and video inpainting tasks throughout the whole video. We utilize a shared spatial-temporal memory module, which combines the interactive video object segmentation and video inpainting tasks into an end-to-end pipeline. In our framework, the past frames with object masks(either the user’s scribbles or the predicted masks) form an external memory, and the current frame as the query is segmented and inpainted using the information in the shared memory. Furthermore, our method allows users to iteratively refine the segmentation results, which can effectively improve the inpainting results where the video object segmentation fails, thus allowing users to obtain high-quality video inpainting results even on challenging sequences. Qualitative and quantitative experimental results demonstrate the superiority of our approach.

Contents
  1. 1. Abstract