
Cost of ungrammatical predictions during online sentence processing: evidence against surprisal

The surprisal metric (Hale, 2001; Levy, 2008) successfully predicts syntactic complexity in a large number of online studies (e.g., Demberg and Keller, 2009; Levy and Keller, 2013). Surprisal assumes a probabilistic grammar that drives expectations about upcoming linguistic material. Consequently, wrong predictions lead to a processing cost, presumably due to reranking-related computations (Levy, 2013). Critically, surprisal assumes that the predicted parses generated by the probabilistic grammar are grammatical. However, syntactic predictions have been found to be ungrammatical in some cases (e.g., Apurva & Husain, 2018). By analogy with the reranking costs incurred by incorrect (but grammatical) predictions, a cost should therefore also appear for ungrammatical predictions. Evidence for such a cost during comprehension would not be explained by the surprisal metric. To test the ecological validity of the surprisal metric, it is therefore critical to investigate whether ungrammatical predictions incur a cost.

In this study, we investigate this issue in Hindi (a verb-final language) using a cloze task followed by a self-paced reading (SPR) study. All analyses were carried out in R using linear mixed models. Log-transformed reading times (RTs) were used for the RT analyses.

In the cloze study (N=30), participants were asked to complete sentences (such as 1a, 1b) meaningfully using the SPR paradigm. The two conditions differed in the case markers on the three nouns. 12 sets of experimental items along with 64 fillers were used. Participants’ responses were coded for the predicted verb class and the overall grammaticality of the completion (grammatical prediction vs ungrammatical prediction).

1a. hari-ne geeta-se umesh-ko …
    Hari-ERG Geeta-ABL Umesh-ACC
1b. hari-ko geeta-ne umesh-ko …
    Hari-ACC Geeta-ERG Umesh-ACC

Grammaticality analysis of the completion data showed that participants made more ungrammatical completions in condition (b) than in (a) (z=5.25). Overall, 96% of completions in condition (a) were grammatical, compared with 60% in (b). In addition, the verb class analysis showed that in both conditions participants most frequently completed the sentences with a transitive non-finite verb followed by a ditransitive matrix verb (hereafter T.NF-DT.M). T.NF-DT.M was predicted in 33% of instances in condition (a) and 34% in condition (b) (z=0.18). Given the similar cloze probabilities, the surprisal metric predicts no difference in RT at T.NF-DT.M between the two conditions during online processing (cloze probabilities can be used to compute surprisal; see Levy and Keller, 2013). If RTs at T.NF-DT.M are instead shorter in condition (a) than in (b), this would be better explained by a higher cost due to ungrammatical predictions.

To test this, we conducted an SPR study (N=50) using items similar to those used in the cloze study (see 2a and 2b). The critical region was T.NF-DT.M. 24 sets of items along with 72 fillers were constructed.

2a. hari-ne geeta-se umesh-ko milne ko kaha, ...
    Hari-ERG Geeta-ABL Umesh-ACC meet-inf(T.NF) told(DT.M)
2b. hari-ko geeta-ne umesh-ko milne ko kaha, ...
    Hari-ACC Geeta-ERG Umesh-ACC meet-inf(T.NF) told(DT.M)

While the prediction of T.NF-DT.M is equally likely in the two conditions, the percentage of ungrammatical predictions is higher in (b) than in (a). Results show that RTs at the critical region were shorter in (a) than in (b) (t=2.32). This goes against the surprisal metric and reflects the cost incurred due to ungrammatical predictions.

Our work establishes that the cost of ungrammatical predictions does appear during online processing. This processing cost is not predicted by a metric like surprisal, which highlights the metric's limitations. This study also provides evidence against robust prediction in head-final languages. It suggests that the prediction mechanism in such languages is more nuanced and points to the need to study the nature of ungrammatical predictions during processing.
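The claim that surprisal predicts no RT difference at T.NF-DT.M can be illustrated with a minimal sketch. It assumes the standard definition of surprisal as the negative log probability of the continuation, uses base-2 logs (bits) as a convenience, and plugs in the cloze percentages reported above; the function name `surprisal_bits` is hypothetical and this is not the authors' analysis code.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Return surprisal in bits, -log2(p), for a probability in (0, 1]."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


# Cloze probabilities of the T.NF-DT.M continuation, taken from the abstract.
cloze = {"a": 0.33, "b": 0.34}

# Surprisal per condition; the base of the log is an assumption.
surprisal = {cond: surprisal_bits(p) for cond, p in cloze.items()}
difference = surprisal["a"] - surprisal["b"]

for cond, value in surprisal.items():
    print(f"condition ({cond}): {value:.2f} bits")
print(f"difference (a) - (b): {difference:.3f} bits")
```

Both conditions come out at roughly 1.6 bits, with a difference of under 0.05 bits, so a surprisal-only account has essentially nothing with which to explain a reliable RT difference at the critical region.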
Center for Open Science

Related Results

Pola Fungsi Kalimat pada Novel “Pulang” Karya Tere Liye dan Kelayakannya sebagai Materi Pengayaan Siswa Kelas Xll SMA
Understanding sentence function patterns plays a major role in reading a novel, especially in class XII. By studying the understanding of sentence function patterns, class XII stud...
Lexical predictability during natural reading: Effects of surprisal and entropy reduction
What are the effects of word-by-word predictability on sentence processing times during the natural reading of a text? Although information-complexity metrics such as surprisal and...
Study on Electromagnetic Shielding of Infrared /Visible Optical Window
In allusion to electromagnetic radiation damage that existed in daily life, social safety and military field, electromagnetic shielding technology of infrared and infrared optical ...
KALIMAT TANYA DALAM BAHASA INDONESIA
Interrogative sentence is one kind of sentences in Indonesian, which formed as proposition that required answer from hearer. It also called as requesting question. The difference w...
STRUKTUR KALIMAT TUNGGAL BAHASA KANUM SOTA THE STRUCTURE OF THE SIMPLE SENTENCE OF KANUM SOTA LANGUAGE
Abstract Kanum Sota language is spoken by speaker aroun Sota District, Merauke, Papua Province. This study uses descriptive method to describe the structure of the simple sen...
An Analysis on Students’ Errors in Writing Sentence Patterns
Simple, compound, complex, and compound-complex sentences are sentence patterns which each sentence has different pattern, and it is strongly related to grammatical structure, punc...
Do evidence summaries increase health policy‐makers' use of evidence from systematic reviews? A systematic review
This review summarizes the evidence from six randomized controlled trials that judged the effectiveness of systematic review summaries on policymakers' decision making, or the most...
Parsing errors in Hindi: Investigating limits to verbal prediction in an SOV language
The role of prediction during sentence comprehension is widely acknowledged to be very critical in SOV languages. Robust clause-final verbal prediction and its maintenance have be...
