Academic Journal
Who grades best?: Comparing ChatGPT, peer, and instructor evaluations across varying levels of student project quality
| Title: | Who grades best?: Comparing ChatGPT, peer, and instructor evaluations across varying levels of student project quality |
|---|---|
| Authors: | Usher, Maya; Faraon, Montathar |
| Affiliations: | Kristianstad University, Faculty of Business; Kristianstad University, Faculty of Business, Design A_ Research Collaboration; Kristianstad University, Faculty of Business, Department of Design |
| Source: | Assessment and Evaluation in Higher Education. |
| Subject terms: | Social Sciences (5), Educational Sciences (503), Pedagogy (50301), Natural Sciences (1), Computer and Information Sciences (102), Human Computer Interaction (10204) |
| Abstract: | As generative AI (GenAI) services like ChatGPT become more prevalent in higher education, their role in assessment raises questions about grading reliability and alignment with human evaluators. This mixed-methods study examined grading alignment across three evaluators of student group projects: ChatGPT, peers, and the course instructor. It further investigated whether evaluator agreement varied by project quality. The study addressed both assessment patterns and student perceptions by integrating quantitative grade comparisons with qualitative analysis of students' reflections. A total of 184 undergraduate students participated, and each project was evaluated by all three sources. The quantitative analyses revealed that alignment with instructor grading varied systematically by both grading source and project quality: ChatGPT's alignment with instructor grading improved as project quality increased, with the largest overestimation occurring for low-quality projects, whereas peer-instructor alignment was strongest for lower-quality work. The qualitative analyses further revealed that students actively interpreted key differences between grading sources, particularly regarding perceived leniency, alignment between grades and feedback, and the contrasting evaluative logics of GenAI and human assessors. While ChatGPT may offer structured and consistent grading, especially for higher-quality projects, human evaluations contribute contextual and interpersonal insights that are crucial to assessment practices. |
| File description: | electronic |
| Access link: | https://researchportal.hkr.se/ws/files/98409671/Who_grades_best.pdf |
| Database: | SwePub |
| ISSN: | 0260-2938 (print); 1469-297X (online) |
| DOI: | 10.1080/02602938.2025.2588682 |