Localizing and Segmenting Text in Images and Videos


Lienhart, Rainer ; Wernicke, Axel


Document Type: Article
Year of publication: 2002
Journal or publication series: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 12
Issue number: 4
Page range: 256-258
Place of publication: New York, NY
Publishing house: IEEE
ISSN: 1051-8215
Publication language: English
Institution: School of Business Informatics and Mathematics > Praktische Informatik IV (Effelsberg -2017)
Subject: 004 Computer science, internet
Abstract: Many images, especially those used for page design on web pages, as well as videos contain visible text. If these text occurrences could be detected, segmented, and recognized automatically, they would be a valuable source of high-level semantics for indexing and retrieval. In this paper, we propose a novel method for localizing and segmenting text in complex images and videos. Text lines are identified by using a complex-valued multi-layer feed-forward network trained to detect text at a fixed scale and position. The network's output at all scales and positions is integrated into a single text-saliency map, serving as a starting point for candidate text lines. In the case of video, these candidate text lines are refined by exploiting the temporal redundancy of text in video. Localized text lines are then scaled to a fixed height of 100 pixels and segmented into a binary image with black characters on white background. For videos, temporal redundancy is exploited to improve segmentation performance. Input images and videos can be of any size due to a true multiresolution approach. Moreover, the system is not only able to locate and segment text occurrences into large binary images, but is also able to track each text line with sub-pixel accuracy over the entire occurrence in a video, so that one text bitmap is created for all instances of that text line. Therefore, our text segmentation results can also be used for object-based video encoding such as that enabled by MPEG-4.
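
Illustrative sketch (not taken from the paper): the abstract describes a detector that works at one fixed scale, whose outputs over all scales and positions are merged into a single text-saliency map, after which each text line is scaled to a height of 100 pixels and binarized. The Python code below shows one plausible reading of that data flow. Everything in it is an assumption made for illustration: `toy_detector` is a hypothetical contrast-based stand-in for the trained network, the scale factors, window size, and step are arbitrary, scores are merged by simple accumulation, resizing uses nearest-neighbour sampling, and binarization uses a mean threshold that assumes dark characters on a lighter background.

```python
import numpy as np


def toy_detector(window):
    """Hypothetical stand-in for the trained network: contrast-based score in [0, 1]."""
    return float(min(1.0, window.std() / 64.0))


def downscale(img, factor):
    """Shrink a grayscale image by an integer factor using block averaging."""
    h, w = img.shape
    h2, w2 = h // factor, w // factor
    blocks = img[:h2 * factor, :w2 * factor].reshape(h2, factor, w2, factor)
    return blocks.mean(axis=(1, 3))


def saliency_map(img, win=(8, 16), step=4, factors=(1, 2, 4), detector=toy_detector):
    """Run a fixed-size detector at several scales and accumulate the scores
    into one map at the original resolution (assumed integration rule)."""
    img = np.asarray(img, dtype=np.float64)
    H, W = img.shape
    wh, ww = win
    sal = np.zeros((H, W))
    for f in factors:
        small = downscale(img, f)
        h, w = small.shape
        for y in range(0, h - wh + 1, step):
            for x in range(0, w - ww + 1, step):
                score = detector(small[y:y + wh, x:x + ww])
                # Project the window back to original coordinates and add its score.
                y1, x1 = min(H, (y + wh) * f), min(W, (x + ww) * f)
                sal[y * f:y1, x * f:x1] += score
    return sal


def scale_to_height(img, height=100):
    """Resize a text-line crop to a fixed height (nearest neighbour, aspect ratio kept)."""
    h, w = img.shape
    new_w = max(1, round(w * height / h))
    rows = np.minimum((np.arange(height) * h / height).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) * w / new_w).astype(int), w - 1)
    return img[np.ix_(rows, cols)]


def binarize(img):
    """Mean threshold: darker pixels become black (0), the rest white (255)."""
    return np.where(img < img.mean(), 0, 255).astype(np.uint8)
```

A usage example on a synthetic image: a light background with a band of dark vertical strokes yields higher saliency inside the band than in an empty corner, and the cropped band becomes a 100-pixel-high binary bitmap.

```python
img = np.full((32, 64), 200.0)
img[8:16, 16:48:2] = 0.0             # synthetic "text" strokes
sal = saliency_map(img)
line = scale_to_height(img[8:16, 16:48])
bitmap = binarize(line)
```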

This entry is part of the university bibliography.




Citation example

Lienhart, Rainer ; Wernicke, Axel (2002) Localizing and Segmenting Text in Images and Videos. IEEE Transactions on Circuits and Systems for Video Technology, 12 (4), 256-258. New York, NY: IEEE. [Article]

