Can Large Language Models Help Augment English Psycholinguistic Datasets?
Research on language and cognition relies extensively on large psycholinguistic datasets, sometimes called “norms”. These datasets contain judgments of lexical properties like concreteness and age of acquisition, and can be used to norm experimental stimuli, discover empirical relationships in the lexicon, and stress-test computational models. However, collecting human judgments at scale is both time-consuming and expensive. The problem of scale is harder still for norms containing multiple semantic dimensions, and especially for norms that incorporate linguistic context. In the current work, I explore whether advances in Large Language Models (LLMs) can be leveraged to augment the creation of large psycholinguistic datasets in English. I use GPT-4 to collect multiple kinds of semantic judgments (e.g., word similarity, contextualized sensorimotor associations, iconicity) for English words and compare these judgments against the human “gold standard”. For each dataset, I find that GPT-4’s judgments are positively correlated with human judgments, in some cases rivaling or even exceeding the average inter-annotator agreement displayed by humans. I then explore whether and how LLM-generated norms differ systematically from human-generated norms. I also perform several “substitution analyses”, which demonstrate that replacing human-generated norms with LLM-generated norms in a statistical model does not change the sign of parameter estimates (though in select cases, there are significant changes to their magnitude). Finally, I conclude by discussing the limitations of this approach and the conditions under which LLM-generated norms could be useful to researchers.
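As a rough illustration of the two comparisons described in the abstract, the sketch below computes a rank correlation between human and LLM ratings and then checks whether a regression slope keeps its sign when the LLM ratings are substituted for the human ones. It is a minimal sketch, not the paper's analysis: the choice of Spearman correlation and a one-predictor least-squares fit is an assumption, the function names are hypothetical, and the numbers are invented placeholders rather than data from any norm set.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def ols_slope(predictor, outcome):
    """Slope of a one-predictor least-squares regression."""
    mx, my = mean(predictor), mean(outcome)
    cov = sum((a - mx) * (b - my) for a, b in zip(predictor, outcome))
    var = sum((a - mx) ** 2 for a in predictor)
    return cov / var


# Invented placeholder values for six words (NOT real norms or results).
human_ratings = [4.8, 1.5, 3.9, 2.2, 4.5, 1.9]
llm_ratings = [5.0, 1.0, 4.0, 2.5, 4.0, 2.0]
outcome = [520, 640, 560, 610, 530, 630]  # hypothetical dependent variable

rho = spearman(human_ratings, llm_ratings)
slope_human = ols_slope(human_ratings, outcome)
slope_llm = ols_slope(llm_ratings, outcome)
same_sign = (slope_human > 0) == (slope_llm > 0)

print("Spearman rho (human vs. LLM):", round(rho, 3))
print("Slope with human ratings:", round(slope_human, 2))
print("Slope with LLM ratings:", round(slope_llm, 2))
print("Sign preserved under substitution:", same_sign)
```

In this framing, agreement with the human “gold standard” corresponds to the correlation, while the substitution check asks only whether the direction of an estimated effect survives the swap; differences in slope magnitude can still be substantial.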

