CEFR-Based Sentence Difficulty Annotation and Assessment

Yuki Arase, Satoru Uchida, Tomoyuki Kajiwara

Research output: Contribution to conference › Paper › Peer-reviewed

5 Citations (Scopus)

Abstract

Controllable text simplification is a crucial assistive technique for language learning and teaching. One of the primary factors hindering its advancement is the lack of a corpus annotated with sentence difficulty levels based on language ability descriptions. To address this problem, we created the CEFR-based Sentence Profile (CEFR-SP) corpus, containing 17k English sentences annotated with the levels based on the Common European Framework of Reference for Languages assigned by English-education professionals. In addition, we propose a sentence-level assessment model to handle unbalanced level distribution because the most basic and highly proficient sentences are naturally scarce. In the experiments in this study, our method achieved a macro-F1 score of 84.5% in the level assessment, thus outperforming strong baselines employed in readability assessment.
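The abstract reports macro-F1 and notes that the lowest and highest levels are scarce. As a minimal sketch of why macro-F1 suits such unbalanced data, the following computes it over the six CEFR levels. The labels are made-up toy data, not drawn from CEFR-SP, and the function name `macro_f1` and the zero-score convention for absent levels are assumptions for illustration, not the authors' evaluation code.

```python
# Toy illustration of macro-F1 over CEFR levels (hypothetical data).
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def macro_f1(gold, pred, labels=CEFR_LEVELS):
    """Return the unweighted mean of per-level F1 scores."""
    scores = []
    for level in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == level and p == level)
        fp = sum(1 for g, p in zip(gold, pred) if g != level and p == level)
        fn = sum(1 for g, p in zip(gold, pred) if g == level and p != level)
        denom = 2 * tp + fp + fn
        # Assumption: a level absent from both gold and pred scores 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up level distribution: A1 and C2 are rare, B1 and B2 are common.
gold = ["A1"] + ["A2"] * 2 + ["B1"] * 4 + ["B2"] * 4 + ["C1"] * 2 + ["C2"]

# A degenerate predictor that always outputs one of the common levels.
pred_majority = ["B1"] * len(gold)

accuracy = sum(g == p for g, p in zip(gold, pred_majority)) / len(gold)
print(f"accuracy = {accuracy:.3f}")
print(f"macro-F1 = {macro_f1(gold, pred_majority):.3f}")
```

Because every level contributes equally to the average, a predictor that ignores the rare levels is penalized much more heavily under macro-F1 than under accuracy.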

Original language: English
Pages: 6206-6219
Number of pages: 14
Publication status: Published - 2022
Event: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Abu Dhabi, United Arab Emirates
Duration: Dec 7 2022 to Dec 11 2022

Conference

Conference: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 12/7/22 to 12/11/22

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
