Event Description
Time:
Sep 7, Tue, 11:00am EDT
Place:
NCS 220 or on Zoom (info below)
Title: Data-Driven Document Unwarping
Abstract:
Capturing document images is a common way to digitize and record physical documents, thanks to the ubiquity of mobile cameras. To make text recognition easier, it is often desirable to digitally flatten a document image when the physical sheet is folded or curved. However, unwarping a document from a single image in natural scenes is very challenging due to the complexity of sheet deformation, document texture, and environmental conditions. Previous model-driven approaches struggle with inefficiency and limited generalizability. In this thesis, I investigate several data-driven approaches to tackle the document unwarping problem.
Data acquisition is the central challenge in data-driven methods. I first design an efficient data synthesis pipeline based on 2D image warping and train DocUNet, the pioneering data-driven document unwarping model, on the synthetic data. A benchmark dataset is also created to facilitate comprehensive evaluation and comparison. To improve unwarping performance by training on more realistic data, I introduce the Doc3D dataset and DewarpNet. Supervised by the 3D shape ground truth in Doc3D, DewarpNet significantly outperforms DocUNet. Both DocUNet and DewarpNet depend on synthetic data for ground-truth deformation annotations. To exploit real-world images, I propose PaperEdge, a weakly supervised model trained on in-the-wild document images with easy-to-obtain boundary information. PaperEdge surpasses DewarpNet by utilizing both the synthetic data and the weakly annotated real data in the Document In the Wild (DIW) dataset. Finally, I propose directly predicting the $uv$-parameterized 3D mesh of the document with 3D constraints, using accessible 3D representations such as depth maps as training targets. Predicting the 3D mesh of the document solves the unwarping task and also benefits VR/AR applications.
Join Zoom Meeting
https://stonybrook.zoom.us/j/
Meeting ID: 964 4059 2912
Passcode: 793149
One tap mobile
+16468769923,,96440592912# US (New York)
+13017158592,,96440592912# US (Washington DC)
Dial by your location
+1 646 876 9923 US (New York)
+1 301 715 8592 US (Washington DC)
+1 312 626 6799 US (Chicago)
+1 253 215 8782 US (Tacoma)
+1 346 248 7799 US (Houston)
+1 408 638 0968 US (San Jose)
+1 669 900 6833 US (San Jose)
Find your local number: https://stonybrook.