
Language Magazine is a monthly print and online publication that provides cutting-edge information for language learners, educators, and professionals around the world.


The Promise of Automated Writing Evaluation for English Learners

Corey Palermo and Joshua Wilson share their study suggesting that automated writing assessment software may promote equity


Automated writing evaluation (AWE) encompasses a range of educational technology tools that facilitate the teaching and learning of writing. These tools offer immediate automated feedback, including improvement suggestions and quality ratings, which are generated by algorithms trained to simulate human feedback and ratings. Findings from several studies, including research syntheses, indicate that AWE helps students improve their writing skills, their ability to identify and fix problems in their writing, and their confidence as writers (Li, 2022; Palermo and Thomson, 2018; Palermo and Wilson, 2020; Wilson and Roscoe, 2020; Zhai and Ma, 2022).

Moreover, both teachers and students report generally positive perceptions of AWE (Grimes and Warschauer, 2010; Wilson et al., 2021a; Wilson et al., 2021b). Perhaps for these reasons, AWE has been increasingly adopted in recent years (Deeva et al., 2021; Huang et al., 2023). Little research, however, has examined AWE use by English learners (ELs) in the US, in particular the youngest ELs, including those in the upper elementary grades. Although AWE generally shows promise, it is important to determine whether AWE is a viable option for elementary-aged ELs, who may experience greater difficulty interpreting and applying automated feedback and may thereby realize less benefit from AWE than their L1 English-speaking peers.

Therefore, we conducted a study investigating this very topic. Specifically, we investigated whether elementary-aged ELs access and benefit from AWE’s automated feedback as much as non-ELs do, whether they make productive revisions to their writing, whether they focus on similar writing features during revision, and whether they perceive the automated feedback as beneficial to the same extent as non-ELs. We examined ELs’ responses to the MI Write AWE system.

MI Write (www.miwrite.com) is an AWE tool developed by Measurement Incorporated (of which co-author Corey Palermo is chief strategy officer). It provides immediate feedback and scores to students based on their writing, utilizing the automated essay scoring system Project Essay Grade (PEG). MI Write offers trait-specific feedback, metacognitive prompts, grammar and spelling error feedback, multimedia lessons, and peer review functionality, and it allows teachers to create custom prompts and provide additional feedback.

A Focal Study: Do ELs Access and Benefit from MI Write to a Similar Extent as Non-ELs?

Our study included 3,459 students in grades 3–5 from a school district in the mid-Atlantic region of the US. Of these students, 24% were ELs, and the majority of those (90%) were Spanish-speaking ELs. We used EL student scores on the ACCESS for ELLs English language proficiency assessment to classify ELs as English language proficient or nonproficient. Students in the sample were racially and ethnically diverse (60% non-White) and represented 14 different elementary schools, several of which received Title I funding.

Students completed district benchmark writing assessments, administered in the fall and spring of the 2017–18 school year, the first year in which the whole district had adopted MI Write. Students were asked to plan, draft, and revise a response to a source-based informative writing prompt. The prompts were created by the research team in partnership with the school district and were embedded within the program, which then gave students automated feedback and automated quality ratings for each draft submitted.

Findings from our analyses indicate that AWE shows promise as a viable tool to support the growth of ELs’ writing skills. Specifically, after accounting for other demographic differences between ELs and non-ELs like grade level, race, and gender, as well as reading ability, we learned that:

For the number of drafts produced: There were no differences in the number of drafts students completed on the fall benchmark writing test, but nonproficient ELs produced fewer drafts than non-ELs in the spring; however, the difference (-0.079 drafts) was not practically meaningful. We can thus conclude that ELs accessed automated feedback to virtually the same extent as non-ELs.

For the gain in writing quality rating from first draft to final draft: There were no differences in gains in writing quality over consecutive drafts between ELs and non-ELs for either the fall or spring benchmark assessments. ELs benefited from automated feedback as much as non-ELs did. This was true for both proficient and nonproficient ELs.

For the extent to which students made substantive rather than surface-level changes to their texts: There were no differences between ELs and non-ELs regarding the degree of substantive vs. surface-level changes to their texts in response to automated feedback. This finding indicates that ELs applied the AWE feedback similarly to non-ELs, assuaging concerns that ELs might derive less educational benefit from AWE feedback than their L1 English-speaking peers.

For the extent to which ELs and non-ELs focused on revising the same features of their writing: There were no differences between ELs and non-ELs with respect to which features of their writing they revised. This is encouraging because it suggests that ELs are not, for example, just using AWE feedback to edit their writing and ignoring feedback on other dimensions like cohesion, syntax, and vocabulary.

For the extent to which ELs and non-ELs agreed that automated feedback was beneficial: Students tended to agree that AWE was a beneficial learning tool overall. Proficient ELs expressed significantly stronger agreement than non-ELs that it was beneficial. There were no differences between nonproficient ELs and non-ELs, though nonproficient ELs trended toward stronger agreement. This finding indicates that both ELs and non-ELs tend to agree that MI Write is beneficial, but proficient ELs tend to agree more strongly.

Implications and Recommendations for Using AWE with ELs

Our study was the first of its kind to examine the response to and benefit from AWE for an important and burgeoning population of elementary-aged ELs. Collectively, our findings indicate that there is value in using AWE tools with this population of language learners. ELs accessed and benefited from AWE feedback to the same extent as their non-EL peers. They revised similar aspects of their writing, and they ultimately endorsed the software to the same extent as their non-EL peers or, in the case of proficient ELs, to a greater extent. Future research should continue to explore the implications of using AWE with ELs and adopt research designs that involve longer time frames and scenarios with a more elaborated writing process that might more closely mirror classroom formative assessment.

The study demonstrates that ELs accessed and benefited from AWE feedback to a similar extent as non-ELs. These findings should assuage possible concerns among teachers that AWE might not be appropriate or accessible for elementary-aged ELs.

Thus, teachers should ensure equitable access to AWE tools for all students, regardless of their language backgrounds. Providing ELs with the opportunity to use AWE tools can contribute to their writing improvement and boost their confidence as writers. Teachers can consider integrating AWE into their instruction to provide immediate and automated feedback to ELs, just as they would for their non-EL peers. This technology can help bridge the gap in writing proficiency and provide targeted support for ELs’ writing development.

While the study focused on independent student access to AWE within the context of district-administered benchmark writing assessments, it is crucial for teachers to consider how to incorporate AWE into their classroom instruction effectively. Teachers can provide differentiated support to ELs during writing assignments that involve more elaborate planning, drafting, and revision cycles. This support might include explicit instruction on using AWE, scaffolded writing tasks, and individualized feedback to address specific language needs.

Finally, it is important to remember that AWE feedback should complement but never replace teacher feedback. Teachers best understand the needs of their students and how to meet those needs through high-quality language instruction. AWE is an excellent tool to support teachers’ efforts; it cannot replace those efforts. Thus, a final recommendation is that teachers may benefit from professional development opportunities focused on effectively integrating AWE into their writing instruction for ELs. Training sessions can provide guidance on leveraging AWE features, interpreting feedback, and tailoring instruction to meet the unique needs of ELs. Ongoing support and collaboration among educators can foster best practices and the exchange of ideas for optimizing the use of AWE with ELs.

By considering these study implications and recommendations, teachers can harness the potential of AWE tools to enhance writing instruction for English learners, fostering their language development and overall success in learning to write.

References

Deeva, G., Bogdanova, D., Serral, E., Snoeck, M., and De Weerdt, J. (2021). “A Review of Automated Feedback Systems for Learners: Classification framework, challenges and opportunities.” Computers and Education, 162, 104094. https://doi.org/10.1016/j.compedu.2020.104094
Grimes, D., and Warschauer, M. (2010). “Utility in a Fallible Tool: A multi-site case study of automated writing evaluation.” Journal of Technology, Learning, and Assessment, 8 (6). www.jtla.org
Huang, X., Zou, D., Cheng, G., Chen, X., and Xie, H. (2023). “Trends, Research Issues and Applications of Artificial Intelligence in Language Education.” Educational Technology and Society, 26 (1), 112–131. www.jstor.org/stable/48707971
Li, R. (2022). “Still a Fallible Tool? Revisiting effects of automated writing evaluation from activity theory perspective.” British Journal of Educational Technology, 00, 1–17. https://doi.org/10.1111/bjet.13294
Palermo, C., and Thomson, M. M. (2018). “Teacher Implementation of Self-Regulated Strategy Development with an Automated Writing Evaluation System: Effects on the argumentative writing performance of middle school students.” Contemporary Educational Psychology, 54, 255–270. https://doi.org/10.1016/j.cedpsych.2018.07.002
Palermo, C., and Wilson, J. (2020). “Implementing Automated Writing Evaluation in Different Instructional Contexts: A mixed-methods study.” Journal of Writing Research, 12 (1), 63–108. https://doi.org/10.17239/jowr-2020.12.01.04
Wilson, J., and Roscoe, R. D. (2020). “Automated Writing Evaluation and Feedback: Multiple metrics of efficacy.” Journal of Educational Computing Research, 58, 87–125. https://doi.org/10.1177/0735633119830764
Wilson, J., Huang, Y., Palermo, C., Beard, G., and MacArthur, C. A. (2021a). “Automated Feedback and Automated Scoring in the Elementary Grades: Usage, attitudes, and associations with writing outcomes in a districtwide implementation of MI Write.” International Journal of Artificial Intelligence in Education, 31 (2), 234–276. https://doi.org/10.1007/s40593-020-00236-w
Wilson, J., Ahrendt, C., Fudge, E., Raiche, A., Beard, G., and MacArthur, C. A. (2021b). “Elementary Teachers’ Perceptions of Automated Feedback and Automated Scoring: Transforming the teaching and learning of writing using automated writing evaluation.” Computers and Education, 168, 104208. https://doi.org/10.1016/j.compedu.2021.104208
Zhai, N., and Ma, X. (2022). “The Effectiveness of Automated Writing Evaluation on Writing Quality: A meta-analysis.” Journal of Educational Computing Research, 0 (0). https://doi.org/10.1177/07356331221127300

Corey Palermo, PhD, is chief strategy officer at Measurement Incorporated. His research examines rater effects in large‐scale assessment contexts, automated scoring, and automated writing evaluation.

Joshua Wilson, PhD, is an associate professor in the School of Education at the University of Delaware. His research broadly focuses on ways to improve the teaching and learning of writing and specifically focuses on ways that automated writing evaluation systems can facilitate those improvements.
