Please use this identifier to cite or link to this item: http://gukir.inflibnet.ac.in:8080/jspui/handle/123456789/3701
Full metadata record
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Hangarge M | |
| dc.contributor.author | Dhandra B.V. | |
| dc.date.accessioned | 2020-06-12T15:01:07Z | - |
| dc.date.available | 2020-06-12T15:01:07Z | - |
| dc.date.issued | 2008 | |
| dc.identifier.citation | Proceedings - 1st International Conference on Emerging Trends in Engineering and Technology, ICETET 2008, p. 1175-1180 | en_US |
| dc.identifier.uri | 10.1109/ICETET.2008.177 | |
| dc.identifier.uri | http://gukir.inflibnet.ac.in:8080/jspui/handle/123456789/3701 | - |
| dc.description.abstract | In this paper, a technique for language identification in document images is described that discriminates five major Indian languages: Hindi, Marathi, Sanskrit, Assamese and Bengali, which belong to the Devnagari and Bangla scripts. A text block of each language containing at least two text lines is selected and characterized by employing global and local features. Morphological transformations are used to decompose a text block in two directions at three levels, to capture fine texture primitives. Shape features of connected components are used to retain the local properties of the text block. Further, a combination of these features is used to classify 500 text blocks of the proposed languages using a binary decision tree and a KNN classifier. The proposed method is quite different from reported methods for non-Indian languages, which are based on shape coding of characters, words and document vectorization. This method directly captures word shapes without segmentation and is tolerant to variations in font style and size. The language identification results are encouraging. © 2008 IEEE. | en_US |
| dc.subject | And Binary decision tree | |
| dc.subject | Document images | |
| dc.subject | Language identification | |
| dc.subject | Morphological transformations | |
| dc.title | Shape and morphological transformation based features for language identification in Indian document images | en_US |
| dc.type | Conference Paper | |
Appears in Collections:2. Conference Papers
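
The abstract describes decomposing a text block with morphological transformations in two directions at three levels and classifying the resulting features with a KNN classifier. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: the two directions are taken to be horizontal and vertical, the three levels are taken to be line structuring elements of 2, 3 and 4 pixels, and each feature is the fraction of foreground pixels that survives a morphological opening. The names `erode`, `dilate`, `morph_features` and `knn_predict` are hypothetical, the class labels are placeholders, and the connected-component shape features and the binary decision tree mentioned in the abstract are not shown.

```python
import math
from collections import Counter


def erode(img, length, horizontal):
    """Binary erosion by a straight line of `length` pixels (assumed structuring element)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            cells = [(r, c + k) if horizontal else (r + k, c) for k in range(length)]
            out[r][c] = int(all(rr < h and cc < w and img[rr][cc] for rr, cc in cells))
    return out


def dilate(img, length, horizontal):
    """Binary dilation by the same line; applied after erosion it completes an opening."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            cells = [(r, c - k) if horizontal else (r - k, c) for k in range(length)]
            out[r][c] = int(any(rr >= 0 and cc >= 0 and img[rr][cc] for rr, cc in cells))
    return out


def morph_features(img, lengths=(2, 3, 4)):
    """Fraction of foreground pixels kept by an opening, per direction and length.

    Order: horizontal openings for each length, then vertical openings.
    The lengths are illustrative assumptions, not values from the paper.
    """
    total = sum(map(sum, img))
    feats = []
    for horizontal in (True, False):
        for length in lengths:
            opened = dilate(erode(img, length, horizontal), length, horizontal)
            feats.append(sum(map(sum, opened)) / total if total else 0.0)
    return feats


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query` (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy binary block: a horizontal stroke with one vertical stroke hanging from it.
    block = [
        [1, 1, 1, 1, 1, 1],
        [0, 0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
    ]
    print(morph_features(block))
```

In this sketch a block dominated by long horizontal strokes keeps more pixels under the horizontal openings than under the vertical ones, which is the kind of directional texture cue the abstract refers to. Real document images would need binarization first, and a library such as SciPy or OpenCV would be a more practical choice than these pure-Python loops.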


