Multi-Spectral Fusion Based Approach for Arbitrarily Oriented Scene Text Detection in Video Images

Scene text detection from video as well as natural scene images is challenging due to the variations in background, contrast, text type, font type, font size, and so on. Besides, arbitrary orientations of texts with multi-scripts add more complexity to the problem. The proposed approach introduces a...


Bibliographic details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 24(2015), no. 11 (18 Nov.), pp. 4488-501
Main author:	Liang, Guozhu (author)
Other authors:	Shivakumara, Palaiahnakote, Lu, Tong, Tan, Chew Lim
Format:	Online article
Language:	English
Published:	2015
Access to parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article; Research Support, Non-U.S. Gov't


LEADER	01000naa a22002652 4500
001	NLM251695735
003	DE-627
005	20231224162547.0
007	cr uuu---uuuuu
008	231224s2015 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TIP.2015.2465169 \|2 doi
028	5	2	\|a pubmed24n0839.xml
035			\|a (DE-627)NLM251695735
035			\|a (NLM)26259083
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Liang, Guozhu \|e verfasserin \|4 aut
245	1	0	\|a Multi-Spectral Fusion Based Approach for Arbitrarily Oriented Scene Text Detection in Video Images
264		1	\|c 2015
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 16.09.2015
500			\|a Date Revised 10.09.2015
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Scene text detection from video as well as natural scene images is challenging due to the variations in background, contrast, text type, font type, font size, and so on. Besides, arbitrary orientations of texts with multi-scripts add more complexity to the problem. The proposed approach introduces a new idea of convolving Laplacian with wavelet sub-bands at different levels in the frequency domain for enhancing low resolution text pixels. Then, the results obtained from different sub-bands (spectral) are fused for detecting candidate text pixels. We explore maxima stable extreme regions along with stroke width transform for detecting candidate text regions. Text alignment is done based on the distance between the nearest neighbor clusters of candidate text regions. In addition, the approach presents a new symmetry driven nearest neighbor for restoring full text lines. We conduct experiments on our collected video data as well as several benchmark data sets, such as ICDAR 2011, ICDAR 2013, and MSRA-TD500 to evaluate the proposed method. The proposed approach is compared with the state-of-the-art methods to show its superiority to the existing methods
650		4	\|a Journal Article
650		4	\|a Research Support, Non-U.S. Gov't
700	1		\|a Shivakumara, Palaiahnakote \|e verfasserin \|4 aut
700	1		\|a Lu, Tong \|e verfasserin \|4 aut
700	1		\|a Tan, Chew Lim \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society \|d 1992 \|g 24(2015), 11 vom: 18. Nov., Seite 4488-501 \|w (DE-627)NLM09821456X \|x 1941-0042 \|7 nnns
773	1	8	\|g volume:24 \|g year:2015 \|g number:11 \|g day:18 \|g month:11 \|g pages:4488-501
856	4	0	\|u http://dx.doi.org/10.1109/TIP.2015.2465169 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 24 \|j 2015 \|e 11 \|b 18 \|c 11 \|h 4488-501