
Senior research scientist with expertise in natural language processing and computational linguistics, focusing on multilingual NLP model optimization and data-centric methodologies. Successfully architected synthetic data generation pipelines that enhance intent classification and improve entity extraction across multiple languages. Skilled in designing annotation quality assurance frameworks and conducting linguistic analysis to support robust AI systems. Proven leadership in mentoring teams and collaborating across ASR, NLU, and product systems to optimize speech-to-intent pipelines.
Development of a Rule-based Multi-lingual Parser and Machine Translation System
Synthesis of Data for Conversational AI
Gold Data Creation for Question Data Disambiguation
Creation of Synthesized and Gold Question-Answer Data for Banking Chatbots
Technical Skills
Python, pandas, Bash, Regex, Git, JIRA, JavaScript, SQL
ServiceNow API, Translation memory tools (TDS, MemoQ)
Unix, MS Office
Complex Predicate Analysis in Oriya
Description: Conducted in-depth research based on Beth Levin's verb classification and Talmy's lexicalization patterns. Explored the syntactic and semantic properties of psych verbs and motion verbs in Oriya.
Enhancement of Anusaaraka Machine Translation System (English-Hindi)
Description: Focused on identifying templates and patterns in Hindi and English using text simplification algorithms and tokenization. Developed linguistic rules using Python, NLTK, and CLIPS to enhance translation accuracy.
Development of Multi-lingual Machine Translation (MT) Tools
Languages Covered: English, Hindi, Oriya, Punjabi, Sanskrit, Marathi, Japanese, German
Description: Developed MT tools to handle multiple languages, addressing idiomatic expressions and semantic structures.
Mapped semantic representations across languages using morpho-syntactic and grammar transformation rules.
Utilized Grammatical Framework (GF), Python, HTML, and CLIPS to create robust language processing solutions.
Authoring Tool Development for Hindi-English Translation
Description: Designed and developed a tool to enhance Hindi-English translation by enabling users to verify semantically disambiguated data through an interactive question-answer system, improving machine translation output.
Tools used: Python, HTML, CLIPS.
Field Linguistics Research
Description: Led field research on the Kui language (spoken near Visakhapatnam, Andhra Pradesh), conducting comprehensive linguistic analysis covering morpho-syntactic, semantic, and phonological aspects.
Research (Computational Linguistics & NLP), learning new languages, reading, and creative writing