Commentator: A Code-mixed Multilingual Text Annotation Framework

Source
EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of System Demonstrations
Date Issued
2024-01-01
Author(s)
Sheth, Rajvee
Nisar, Shubh
Prajapati, Heenaben
Beniwal, Himanshu
Singh, Mayank  
DOI
10.18653/v1/2024.emnlp-demo.11
Abstract
As the NLP community increasingly addresses challenges associated with multilingualism, robust annotation tools are essential to handle multilingual datasets efficiently. In this paper, we introduce a code-mixed multilingual text annotation framework, Commentator, specifically designed for annotating code-mixed text. The tool demonstrates its effectiveness in token-level and sentence-level language annotation tasks for Hinglish text. We perform robust qualitative human-based evaluations to show that Commentator led to 5x faster annotations than the best baseline. Our code is publicly available at https://github.com/lingo-iitgn/commentator. The demonstration video is available at https://bit.ly/commentator_video.
URI
https://d8.irins.org/handle/IITG2025/28482