Please use this identifier to cite or link to this item: http://theses.iitj.ac.in:8080/jspui/handle/123456789/75
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Harit, Gaurav | -
dc.date.accessioned | 2016-09-05T13:19:09Z | -
dc.date.available | 2016-09-05T13:19:09Z | -
dc.date.issued | 2015-05 | -
dc.identifier.citation | Rupali. (2015). Robust Font Adaptive Word Recognition from Printed Document Images (Master's thesis). Indian Institute of Technology Jodhpur, Jodhpur. | en_US
dc.identifier.uri | http://theses.iitj.ac.in:8080/jspui/handle/123456789/75 | -
dc.description.abstract | A large amount of the data in books and ancient manuscripts is available as scanned images. Content search in machine-readable documents such as .txt and .pdf files is performed by string-matching operations, which cannot be applied to document images. The contents of a word image in a particular language can be extracted using Optical Character Recognition for that language. This method strictly requires clear document images so that the character segmentation process does not produce ambiguous characters. However, document images extracted from scanned books and ancient manuscripts are not clear, because most of their characters are merged or broken, although they remain human readable. This limitation of word recognition using Optical Character Recognition gives rise to a new method of word indexing in document images, called Keyword Spotting. This method avoids the ambiguous character segmentation and recognition step: it extracts features from the complete word image and compares images by those features. The focus of our research is to find robust Keyword Spotting methods that can search for similar words in a data set containing word images in widely varying fonts. Word images in different fonts are indexed such that the images whose content is similar to that of the query image are retrieved as the top results. We have used two techniques to accomplish this task. The first technique is Self-Organizing Maps, in which characters with the same content, irrespective of font, are mapped to neurons that are close to each other in the two-dimensional neuron map. The second method extracts a few interest points from each word image by applying k-means to its ink pixels; the interest points are represented by Scale Invariant Feature Transform descriptors. The results obtained with these techniques are found to be comparable to those of existing approaches. | en_US
dc.description.statementofresponsibility | by Rupali | en_US
dc.format.extent | xii, 55p. | en_US
dc.language.iso | en | en_US
dc.publisher | Indian Institute of Technology Jodhpur | en_US
dc.rights | IIT Jodhpur | en_US
dc.subject.ddc | Robust Font Adaptive | en_US
dc.title | Robust Font Adaptive Word Recognition from Printed Document Images | en_US
dc.type | Thesis | en_US
dc.creator.researcher | Rupali | -
dc.date.registered | 2013 | -
dc.date.awarded | 2015 | -
dc.publisher.place | Jodhpur | en_US
dc.publisher.department | Center for Information Communication and Technology | en_US
dc.type.degree | Master of Technology (M.Tech.) | en_US
dc.format.accompanyingmaterial | CD | en_US
dc.description.note | col. ill.; including bibliography | en_US
dc.identifier.accession | TM00070 | -
Appears in Collections: M. Tech. Theses

Files in This Item:
File | Description | Size | Format
TM00070.pdf | | 1.62 MB | Adobe PDF


Items in IIT Jodhpur Theses Repository are protected by copyright, with all rights reserved, unless otherwise indicated.