Emergent Mind

Abstract

Arbitrary-shaped text detection is an important and challenging task in computer vision. Most existing methods require heavy data labeling efforts to produce polygon-level text region labels for supervised training. In order to reduce the cost in data labeling, we study weakly-supervised arbitrary-shaped text detection for combining various weak supervision forms (e.g., image-level tags, coarse, loose and tight bounding boxes), which are far easier for annotation. We propose an Expectation-Maximization (EM) based weakly-supervised learning framework to train an accurate arbitrary-shaped text detector using only a small amount of polygon-level annotated data combined with a large amount of weakly annotated data. Meanwhile, we propose a contour-based arbitrary-shaped text detector, which is suitable for incorporating weakly-supervised learning. Extensive experiments on three arbitrary-shaped text benchmarks (CTW1500, Total-Text and ICDAR-ArT) show that (1) using only 10% strongly annotated data and 90% weakly annotated data, our method yields comparable performance to state-of-the-art methods, (2) with 100% strongly annotated data, our method outperforms existing methods on all three benchmarks. We will make the weakly annotated datasets publicly available in the future.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.