Ke Ma Thesis Proposal: Data-Driven Document Unwarping

Event Description


Time: Jan 26, 2021 03:00 PM Eastern Time (US and Canada)

All are welcome!

Zoom Meeting:
https://stonybrook.zoom.us/j/93818552212?pwd=ajZkT2x4a2tiaDJUL1h3VFhLZEgwQT09

Meeting ID: 938 1855 2212
Passcode: 802722

Title: Data-Driven Document Unwarping

Abstract: Because mobile cameras are ubiquitous, capturing document images has become a common way to digitize and record physical documents. To make text recognition easier, it is often desirable to digitally flatten a document image when the physical sheet is folded or curved. However, unwarping a document from a single image in natural scenes is very challenging due to the complexity of sheet deformation, document texture, and environmental conditions. Previous model-driven approaches struggle with inefficiency and limited generalizability. In this thesis, I investigate several data-driven approaches to the document unwarping problem.

Data acquisition is the central challenge in data-driven methods. I first design an efficient data synthesis pipeline based on 2D image warping and train DocUNet, the pioneering data-driven document unwarping model, on the synthetic data. A benchmark dataset is also created to facilitate comprehensive evaluation and comparison. To improve unwarping performance by training on more realistic data, I introduce the Doc3D dataset and DewarpNet. Supervised by the 3D shape ground truth in Doc3D, DewarpNet significantly outperforms DocUNet. Both DocUNet and DewarpNet depend on synthetic data for ground-truth deformation annotations. To exploit real-world images, I propose PaperEdge, a weakly supervised model trained on in-the-wild document images with easy-to-obtain boundary information. PaperEdge surpasses DewarpNet by utilizing both the synthetic data and the weakly annotated real data in the Document In the Wild (DIW) dataset. Finally, I propose to incorporate 3D physical constraints into the training of DewarpNet and PaperEdge. These constraints regulate the deformations that are possible on document paper. I also propose to augment the Doc3D and DIW datasets by introducing an online document segmentation model and better hardware.
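
To give a rough idea of what synthesizing training data by 2D image warping can look like, the sketch below perturbs a flat page image with a smooth random displacement field and keeps the per-pixel source coordinates as a ground-truth annotation. It is only an illustration under stated assumptions, not the pipeline used for DocUNet: the function name `synthesize_warp`, its parameters, the distance-based falloff, and the nearest-neighbour sampling are all hypothetical choices made for brevity.

```python
import numpy as np


def synthesize_warp(image, num_points=3, strength=12.0, falloff=40.0, seed=0):
    """Warp a flat page image with a smooth random 2D displacement field.

    Hypothetical illustration only. Returns the warped image and a backward
    map holding, for every warped pixel, the (row, col) it was sampled from
    in the flat image. That map can serve as a ground-truth annotation.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)

    dy = np.zeros((h, w))
    dx = np.zeros((h, w))
    for _ in range(num_points):
        # Pick a random control point and a random displacement vector for it.
        cy, cx = rng.uniform(0, h), rng.uniform(0, w)
        vy, vx = rng.uniform(-strength, strength, size=2)
        # The displacement fades smoothly with distance from the control point.
        dist = np.hypot(ys - cy, xs - cx)
        weight = falloff / (dist + falloff)
        dy += vy * weight
        dx += vx * weight

    # Source coordinates in the flat image, clamped to the image bounds.
    src_y = np.clip(ys + dy, 0, h - 1)
    src_x = np.clip(xs + dx, 0, w - 1)

    # Nearest-neighbour sampling keeps the sketch short.
    warped = image[np.rint(src_y).astype(int), np.rint(src_x).astype(int)]
    backward_map = np.stack([src_y, src_x], axis=-1)
    return warped, backward_map


# Usage: warp a small synthetic checkerboard standing in for a document page.
page = (np.indices((64, 48)).sum(axis=0) // 8 % 2 * 255).astype(np.uint8)
warped, backward_map = synthesize_warp(page)
```

Because the deformation is generated rather than measured, every warped image comes with an exact annotation for free. This is the general appeal of synthetic data for this task, and also why such data may look less realistic than 3D-rendered or real captures.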
