Location
Event Description
Face Editing with Machine Learning presented by Zhixin Shu
ABSTRACT: The face is the most informative feature of humans and has been a long-standing research topic in Computer Vision and Graphics. Images of faces are also ubiquitous in photography and social media, and people have devoted significant resources to capturing and editing them. Face editing can be broadly viewed as the encoding, manipulation, and decoding of a representation of face images. The challenge is to manipulate an image in a controllable way while generating results that are both desirable and as realistic as possible. This thesis explores different machine-learning-based approaches to face editing. I discuss the role of machine learning in achieving desirable edits by learning both the physical aspects and the statistical manifold of human faces. In my work on eye editing, I discuss the importance of understanding the multiple physical elements of a face image, such as shape, illumination, and pose. In a deep-learning-based approach, I introduce domain knowledge of image formation into the construction and training of a neural network. This network provides transparent access to disentangled representations of the aforementioned physical properties, so that various face-editing tasks can be achieved by manipulating these representations. After that, I introduce Deforming Autoencoders, a network that learns to disentangle shape and appearance in an unsupervised manner. This disentanglement benefits the learning of other factors of variation, such as illumination and facial expression. In an extension of Deforming Autoencoders, we incorporate non-rigid structure-from-motion to learn a 3D morphable model for faces that requires only an image set for training. Finally, I describe an image-to-image network for 3D face reconstruction, which also utilizes structure-from-motion in deep learning.
Trained with real face images, this network not only reconstructs 3D faces more accurately than prior art but also generalizes better to real-world test cases.