Word sense disambiguation in Hindi applied to Hindi-English machine translation

S Mall, U C Jaiswal
COMPUTER MODELLING & NEW TECHNOLOGIES 2017 21(2) 58-68

Madan Mohan Malaviya University of Technology Gorakhpur, India

Word sense disambiguation for the Hindi language is one of the biggest challenges in Natural Language Processing. In this paper we discuss issues in reducing ambiguity in word sense disambiguation for Hindi. The concepts are implemented in two modules: parsing and word sense disambiguation. The parsing module extends our previous work on a shallow parser, which creates the word groups (chunks) that are essential for machine translation. Monolingual Hindi and English corpora are used. We then apply machine learning techniques, namely supervised and unsupervised approaches, together with domain-specific senses obtained with the help of knowledge-based methods. The knowledge-based method uses Hindi and English WordNet tools. The supervised method disambiguates words that carry multiple tags in context, labelling each with the correct tag. The unsupervised method updates the sentence with the correct sense and part-of-speech tag. Several websites, such as Google Translator and Babelfish Translator, translate Hindi into English, but these translators fail to resolve polysemous words in Hindi sentences; these results are discussed in this paper. Our system achieves a part-of-speech tagging accuracy of 92.09%. Its chunking accuracies for window-3, window-2, and window-1 are 94.45%, 81.23%, and 81.11%, respectively. We modify and extend the Lesk algorithm, which uses WordNet tools, for word sense disambiguation. We compare the system's performance with that of Google Translator and examine the errors Google Translator makes on a given input Hindi sentence. Our system generates the correct translation, with word sense disambiguation, for the given input Hindi sentence, as shown in Figure 12.
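To illustrate the general idea behind Lesk-style disambiguation, the following is a minimal sketch of the simplified Lesk overlap heuristic, not the modified algorithm developed in this paper. The sense inventory `SENSES`, its romanized Hindi signature words, and the function name `simplified_lesk` are hypothetical stand-ins for the glosses a WordNet resource would supply. The example uses the polysemous Hindi word "sona" (gold / to sleep).

```python
from typing import Dict, List, Set

# Hypothetical toy sense inventory (an assumption for illustration only).
# A real system would draw these signature words from Hindi WordNet glosses.
SENSES: Dict[str, Dict[str, Set[str]]] = {
    "sona": {
        "gold": {"dhatu", "gehna", "keemti", "bazaar", "mehnga"},
        "sleep": {"neend", "raat", "bistar", "aaram"},
    }
}


def simplified_lesk(word: str, context: List[str]) -> str:
    """Pick the sense whose signature shares the most words with the context."""
    # Exclude the target word itself so it cannot contribute to the overlap.
    context_words = set(context) - {word}
    best_sense = ""
    best_overlap = -1
    for sense, signature in SENSES[word].items():
        overlap = len(context_words & signature)
        # Strict comparison: on a tie, the sense listed first is kept.
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("sona", "raat ko bistar par sona".split()))
    print(simplified_lesk("sona", "bazaar mein sona mehnga hai".split()))
```

In this sketch the first sentence shares "raat" and "bistar" with the "sleep" signature, and the second shares "bazaar" and "mehnga" with the "gold" signature, so each context selects a different sense of the same word. The tie-breaking rule (first listed sense wins) is a design choice of the sketch.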