DIWAKAR MISHRA

diwakarmishra@gmail.com

Language Engineer

Amazon Alexa, Bengaluru (Karnataka)

www.diwakarji.com

sanskrit.jnu.ac.in/samvacaka/

 

Research Interest:

  • Artificial Intelligence
  • Natural language processing
  • Speech Technology
  • Sanskrit and other Indian languages
  • Languages and Linguistics
  • Heritage Multimedia

 

Educational Qualification:

  • Ph.D. (2016) --- Ph.D. at Special Centre for Sanskrit Studies, Jawaharlal Nehru University on “A Speech Synthesis System for Sanskrit Prose” in consultation with Prof. Girish Nath Jha & Dr. Kalika Bali
  • M.Phil. (2010) --- M.Phil. in the area of Computational Linguistics from Special Centre for Sanskrit Studies, Jawaharlal Nehru University (topic: “Issues and Challenges in Computational Processing of Vyanjana Sandhi”) under the supervision of Prof. Girish Nath Jha. Final grade (CGPA) 8.16/9
  • Post MA Diploma in Applied Hindi Linguistics (2007) --- from Kendriya Hindi Sansthan, Agra, Delhi Centre, in 2007, with marks 208/400 (52%)
  • M.A. (2006) --- M.A. in Sanskrit from Special Centre for Sanskrit Studies, Jawaharlal Nehru University, New Delhi in 2006, with final grades (CGPA) 8.18/9
  • B.A. (2004) --- B.A. in Sanskrit, Economics and Education from Lucknow University, Lucknow, Uttar Pradesh in 2004, with marks 615/900 (68.33%)
  • Intermediate (2001) --- R.M.P. Inter College, Sitapur (U.P. board of Secondary Education, Allahabad) in 2001, with marks 320/500 (64%)
  • Secondary School Examination (1999) --- Kendriya Vidyalaya, Sitapur (C.B.S.E.) in 1999, with marks 309/500 (61.8%)

 

Work History:

  • Amazon Alexa, Bangalore - Language Engineer for Amazon Alexa, Bangalore, since January 28, 2019
  • ezDI, Ahmedabad - NLP Research Engineer at ezDI - Healthcare Data Intelligence, Ahmedabad, November 2, 2015 to January 25, 2019
  • C-DAC, Pune - Consultant at Applied Artificial Intelligence (AAI) Group, Centre for Development of Advanced Computing (C-DAC), Pune, July 10, 2013 to October 9, 2015
  • Microsoft, Bangalore - Research Intern in Multilingual Systems Research group at Microsoft Research Lab India, Bangalore for 14 weeks, February 6, 2012 to May 11, 2012
  • JNU, New Delhi - Held the honorary post of Senior Linguist in a DIT, MHRD-sponsored project at the Special Centre for Sanskrit Studies, JNU, New Delhi, April 5, 2010 to January 5, 2011
  • Microsoft, Bangalore - Research Intern in Multilingual Systems Research group at Microsoft Research Lab India, Bangalore for 16 weeks, November 16, 2009 to March 5, 2010
  • JNU, New Delhi - Worked as DEO in the project on Multilingual, Multimedia Encyclopedic Dictionary of Intellectual Terms of Indian Culture (under UPOE) at the Special Centre for Sanskrit Studies, JNU, New Delhi, from February 1, 2006 to March 31, 2007
  • Data entry and editing of base words and synonyms of Amarakosha in the Online Multilingual Amarakosha (OMA) project funded by UPOE at the Special Centre for Sanskrit Studies, Jawaharlal Nehru University, New Delhi (in 2006)
  • Recorded audio and guided recording sessions for developing Multimedia Stories in a DIT, MHRD-sponsored project at the Special Centre for Sanskrit Studies, JNU, New Delhi, from April 2008 to May 2009 and in April 2010

 

Teaching Experience

  • B.A. optional course Language Technology in India in Monsoon semester 2008, in Special Centre for Sanskrit Studies, JNU.
  • B.A. optional courses Language Technology in India-II in winter semester 2008 and winter 2009, in Special Centre for Sanskrit Studies, JNU.

 

Publication: Books

  • Paninian Morphophonemics: A Computational Approach to Analysis and Interpretation (co-authored with Girish Nath Jha), Lambert Academic Publishing, Germany, 2011, ISBN: 978-3-8465-2953-9
  • Indian Language Parts of Speech Tagset: Sanskrit (co-authored with Girish Nath Jha, Madhav Gopal), Linguistic Data Consortium (LDC), February 2011, ISBN: 1-58563-575-8, LDC Catalogue number: LDC2011T04
  • Science And Technology In Ancient Indian Texts (ed. jointly with Bal Ram Singh, Girish Nath Jha, Umesh Kumar Singh), 2011, DK Printworld, Delhi, ISBN: 978-81-246-0632-2
  • Modern Perspectives in Vedanta (ed. jointly with Girish Nath Jha, Bal Ram Singh, R. P. Singh), DK Printworld, Delhi, 2012, ISBN: 978-81-246-0639-1

 

Publication: Research Papers and Articles

  • Online Indexing for Ashtadhyayi in 28th All India Conference of Linguists, on November 2-4, 2006, at Banaras Hindu University, Varanasi (abstract published).
  • Strategies for Metrical Analysis of Sanskrit Text in National Seminar on Perspectives in Linguistics, at University of Kashmir, Srinagar on November 8-9, 2006. (published)
  • Sanskrit Lexicographic Tradition and its Adaptation for Language Technology in 30th All India Conference of Linguists, on November 26-28, 2008, at Deccan College, Pune. (abstract published)
  • An Algorithm for Morphophonemic Processing of Sanskrit in 3rd National Seminar in Methods and Models in Computing on December 8-9, 2008, at School of Computer and System Sciences, JNU. (published by Allied)
  • Inflected Morphology Analyzer for Sanskrit in First International Sanskrit Computational Linguistics Symposium, on October 29-31, 2007, at INRIA, Paris, France. (published by Springer)
  • Anaphors in Sanskrit in Second Workshop on Anaphora Resolution (WAR-II), 2008, at University of Bergen, Norway. (published)
  • भारत में भाषा प्रौद्योगिकी : एक सर्वेक्षण [Language Technology in India: A Survey] in Gaveshana, Vol 90, a journal of Kendriya Hindi Sansthan, Agra.
  • Discourse Anaphora and Resolution Techniques in Sanskrit in 7th Discourse Anaphora and Anaphora Resolution Colloquium, Goa, organized by AU-KBC Research Centre on 5-6 November, 2009 (poster presentation, published)
  • Annotating Sanskrit Adapting IL-POST in 4th Language and Technology Conference, Poznan, Poland on 5-8 November, 2009 (published)
  • Hindi Dialects Phonological Transfer Rules for Verb Root cala in O-COCOSDA 2010, Kathmandu, Nepal on 24-25 November, 2010. (published in CD proceedings)
  • Evaluating Tagsets for Sanskrit in 4th International Sanskrit Computational Linguistics Symposium, 10-12 December, 2010 (published by Springer)
  • Challenges in Developing a TTS for Sanskrit in International Conference on Information Systems for Indian Languages (ICISIL 2011), Punjabi University, Patiala, March 9-11, 2011. (poster presentation, published by Springer)
  • अद्वैत वेदान्त के मत में तार्किकता : ब्रह्मसूत्र शांकर भाष्य और वेदान्तसार के आधार पर [Rationality in the View of Advaita Vedanta: Based on the Brahmasutra Shankara Bhashya and the Vedantasara], in 20th International Congress of Vedanta, organized by Special Centre for Sanskrit Studies & Centre for Philosophy, JNU, New Delhi, December 28-31, 2011 (proceedings by DK Printworld, Delhi)
  • Theories of Anaphora Resolution in Traditional Sanskrit Texts, in 15th World Sanskrit Conference, New Delhi, January 5-10, 2012
  • Grapheme to Phoneme Converter for Sanskrit Speech Synthesis, in First Workshop on Indian Languages Data: Resources and Evaluation (WILDRE 2012) under LREC, Istanbul, Turkey, May 21, 2012
  • Text Normalizer for Sanskrit, in 5th International Sanskrit Computational Linguistics Symposium, IIT Bombay, January 4-6, 2013 (proceedings by DK Printworld, Delhi)
  • Syllabification and Stress Assignment in Phonetic Sanskrit Text, in O-COCOSDA 2013, KIIT, Gurgaon, November 25-27, 2013 (CD proceedings by IEEE)
  • Editing Unicode Devanagari Text through Digital Braille Typewriter, in 22nd International Congress of Vedanta, organized by Special Centre for Sanskrit Studies & Centre for Philosophy, JNU, New Delhi, and Center for Indic Studies, University of Massachusetts, Dartmouth, at New Delhi December 27-30, 2015

 

Computer Proficiency:

Programming languages: Java, JSP, C++, Android, Prolog, Python (basic)

Scripting languages: HTML, JavaScript, Scheme, XML, LaTeX

Database systems: MS SQL Server, MySQL

TTS/ASR engines: Festival, Sphinx

 

Technical Developments:

  • Developed Samvacaka - Sanskrit TTS as part of PhD research
  • Contributed to the program code of the sandhi splitter program (Java, JSP) with Prof. Girish Nath Jha
  • Developed Unicode to Digital Braille and Digital Braille to Unicode Devanagari Converter in Java, applicable with computerized Braille mode
  • Developed vowel sandhi generator in Prolog as course project in MA, I semester
  • Data design for Online Indexing of Aṣṭādhyāyī as course project in MA, III semester
  • Patterns for transfer grammar for Sanskrit-Hindi verbs’ translation
  • Contributed to the code of the Phoneme Splitting program (Java, JSP) for Devanagari Unicode text with Prof. Girish Nath Jha
  • Contributed to the program code and rule base of the Subanta generator program (Java, JSP) with Prof. Girish Nath Jha and Dr. Subhash Chandra
  • Contributed to the program code of the Marathi Nominal Word Segmenter (Java, JSP) with Prof. Girish Nath Jha and Avanti Nioding
  • Grapheme to Phoneme (G2P) converter for Sanskrit (in Java) used in Sanskrit Speech Synthesis System
  • Text Normalizer for Sanskrit (in Java) used in Sanskrit Speech Synthesis System

 

Participation:

  • As trainer in Three-day National Workshop on Computational Sanskrit (under the Restructuring Sanskrit Scheme) jointly organized by Department of Sanskrit and Computer Science, Sanatan Dharma College (Lahore), Ambala Cantt and Haryana Sanskrit Academy, Panchkula, November 11-13, 2011.
  • Fourth International Sanskrit Computational Linguistics Symposium (4i-SCLS), Jawaharlal Nehru University, New Delhi, December 10-12, 2010.
  • Third International Sanskrit Computational Linguistics Symposium, University of Hyderabad, January 15-17, 2009.
  • Residential week on Science and Spiritual Heritage of India, organized by Cortona India in Hyderabad, November 20-27, 2010.
  • As trainer in Computational Sanskrit Workshop at Kanya Maha-vidyalaya, Jalandhar, October 27-29, 2010.
  • Short term course on Automatic Speech Recognition (ASR-10), Research and Training Unit for Navigational Electronics, Osmania University, Hyderabad, September 6-9, 2010.
  • Short course on RedHat Server Configuration, CIS, Jawaharlal Nehru University, New Delhi, August 25-28, 2010.
  • International Seminar on Science and Technology in Ancient Indian Texts (STAIT), Jawaharlal Nehru University, New Delhi, January 9-10, 2010.
  • International Workshop on Spoken Language Prosody, Centre for Development of Advanced Computing, Kolkata, November 25-27, 2009.
  • As trainer in a Workshop on Sanskrit E-learning and Multimedia, for Sanskrit teachers and students of secondary schools of Delhi, at SCSS, JNU, New Delhi, May 6, 2009.
  • Ten-Day Summer Course on “Language and Knowledge Representation: The Method of Indian Logic”, sponsored by Indian Council of Philosophical Research, New Delhi at Shri Mata Vaishno Devi University, Kakryal, Katra, Jammu & Kashmir, May 21-30, 2008.
  • As trainer in “Workshop on Computational Sanskrit” at BPS College of Under Graduate Studies, Sonepat, February 29, 2008.
  • As trainer in “Workshop on Computational Sanskrit” for college teachers of Haryana, Punjab and Chandigarh at Sanatan Dharma College (Lahore), Ambala Cantt, February 2, 2008.
  • “Seminar on Restructuring Sanskrit” conducted by Haryana Sanskrit Academy at S.D. College (Lahore), Ambala Cantt, July 21, 2007.
  • One day training program for Cataloguing of Manuscripts, National Manuscripts Mission, New Delhi on August 14, 2006.
  • National Workshop on Manuscriptology and Paleography, SCSS, JNU, New Delhi, February 2-10, 2006.
  • Workshop on Hindi Support & Compatibility for Computers, JNU, New Delhi, March 31 to April 1, 2005.
  • National Seminar on Veda as Word, SCSS, JNU, New Delhi, February 11-13, 2005.

 

Scholarships:

  • Junior Research Fellowship (JRF) from UGC since October 2007.
  • Two-year D.S. Gardi Sanskrit Scholarship from Jawaharlal Nehru University, New Delhi, from July 2004 to July 2006.
  • Two-year Merit-cum-Means Scholarship from Jawaharlal Nehru University.