Computational Linguistics R&D at the Special Centre for Sanskrit Studies, J.N.U., began in 2002 under the supervision of Dr. Girish Nath Jha. We carry out R&D in several areas of language technology for Sanskrit and other Indian languages. Our current focus is on developing Sanskrit analysis tools for building a Sanskrit-Hindi Translator (SaHiT). So far, the following tools and resources have been developed:

Upcoming Seminar
1st International Workshop on Indian Language Data: Resources & Evaluation, May 21, 2012, Istanbul, Turkey

Lexical Resources
Online Multilingual Amarakosha (बहुभाषीय अमरकोश)
Mahabharata Search (महाभारत अनुक्रमणी)
Ayurveda Search (आयुर्वेद अनुक्रमणी)
Bhava-Prakasha Nighantu Search
Brhadaranyaka Upanishad Search (बृहदारण्यकोपनिषद् अनुक्रमणी)
Indexing System for Mainstream Texts (वेदांत अनुक्रमणी)

Language Analyzers
Indian Language Transliterator (भारतीय-भाषा-लिप्यन्तरक)
Sanskrit Morphological Analyzer (पद-विश्लेषक)
Marathi Noun Stemmer (मराठी संज्ञापद विश्लेषक)
Vowel Sandhi Splitter (स्वर-संधि विश्लेषक)
Subanta Analyzer (सुबंत विश्लेषक)
Tinanta Analyzer (तिङन्त विश्लेषक)
Kridanta Analyzer (कृदन्त विश्लेषक)
POS Tagger (पद-परिचायक)
Sanskrit Consortium POS Tagger (संस्कृतसंकाय पद-परिचायक)
Sanskrit Anaphora Resolution System (संस्कृत पद-सम्मेलक)
Karaka Analyzer (कारक-विश्लेषक)
Gender Recognizer and Analyzer for Sanjna Pada (GRASP) (संस्कृत-संज्ञापद-लिङ्ग-विश्लेषण)
Sanskrit Homonym Analyzer (संस्कृत अनेकार्थ-विश्लेषक)
Hindi Homonym Marker (हिन्दी अनेकार्थ-विश्लेषक)
Great Andamanese Verb Analyzer (अंदमानी क्रिया-विश्लेषक)
Russian-English Divergence Marker (रूसी-अंग्रेज़ी-भाषांतर-अंकक)

Recently organized seminars
20th International Vedanta Congress, Dec 28-31, 2011
4th International Sanskrit Computational Linguistics Symposium, Dec 10-12, 2010
Science & Technology in Ancient Indian Texts (STAIT), Jan 9-10, 2010

Multimedia & E-learning
Sanskrit Multimedia & e-learning


Corpora and Tagsets
Tagsets/annotated corpora
Raw corpus - current Sanskrit prose
ILCI Annotation Tool

Language Generators
Sandhi Generator (संधि)
Subanta Generator (सुबंत)
Tinanta Generator (तिङन्त)

Course projects by MA students
Student Projects
Downloadable Prolog projects


Currently funded projects
  • Development of Sanskrit-Hindi Machine Translation (DIT sponsored)
  • Indian Languages Corpora Initiative (DIT sponsored)