Towards Automated Lexicography: Generating and Evaluating Definitions for Learner's Dictionaries
By: Yusuke Ide, Adam Nohejl, Joshua Tanner and more
Potential Business Impact:
Makes dictionaries easier to understand for learners.
We study dictionary definition generation (DDG), i.e., the generation of non-contextualized definitions for given headwords. Dictionary definitions are an essential resource for learning word senses, but manually creating them is costly, which motivates us to automate the process. Specifically, we address learner's dictionary definition generation (LDDG), where definitions should consist of simple words. First, we introduce a reliable evaluation approach for DDG, based on our new evaluation criteria and powered by an LLM-as-a-judge. To provide reference definitions for the evaluation, we also construct a Japanese dataset in collaboration with a professional lexicographer. Validation results demonstrate that our evaluation approach agrees reasonably well with human annotators. Second, we propose an LDDG approach via iterative simplification with an LLM. Experimental results indicate that definitions generated by our approach achieve high scores on our criteria while maintaining lexical simplicity.
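The abstract does not spell out how the iterative simplification works, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical `rewrite` callable standing in for an LLM call, a hypothetical `easy_vocab` set standing in for a learner vocabulary list, and whitespace tokenization (which would not work for Japanese text, where a morphological analyzer would be needed). The loop asks for a rewrite until no out-of-vocabulary words remain or a round limit is reached.

```python
from typing import Callable, List, Set

def difficult_words(definition: str, easy_vocab: Set[str]) -> List[str]:
    """Return tokens of `definition` that are not in the easy vocabulary.

    Assumption: whitespace tokenization with basic punctuation stripping.
    """
    tokens = [t.strip(".,;:!?()\"'").lower() for t in definition.split()]
    return [t for t in tokens if t and t not in easy_vocab]

def iterative_simplify(
    definition: str,
    easy_vocab: Set[str],
    rewrite: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Repeatedly rewrite `definition` until it uses only easy words.

    `rewrite` is a hypothetical stand-in for an LLM call that receives the
    current definition and the words flagged as difficult.
    """
    current = definition
    for _ in range(max_rounds):
        hard = difficult_words(current, easy_vocab)
        if not hard:
            break  # already lexically simple; stop early
        current = rewrite(current, hard)
    return current

# Toy stand-in for the LLM: a fixed substitution table (illustrative only).
SUBSTITUTIONS = {"mammal": "animal", "proboscis": "long nose"}

def toy_rewrite(text: str, hard: List[str]) -> str:
    """Replace each flagged word with a simpler phrase when one is known."""
    for word in hard:
        text = text.replace(word, SUBSTITUTIONS.get(word, word))
    return text

EASY_VOCAB = {"a", "large", "animal", "with", "long", "nose"}
simplified = iterative_simplify(
    "a large mammal with a proboscis", EASY_VOCAB, toy_rewrite
)
```

In a real setting, `rewrite` would prompt a model with the flagged words, and the round limit guards against a model that keeps introducing new difficult words.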
Similar Papers
AutoDDG: Automated Dataset Description Generation using Large Language Models
Databases
Makes finding data easier by writing descriptions.
DVAGen: Dynamic Vocabulary Augmented Generation
Computation and Language
Helps computers understand new words better.