Mirjam Sophia Glessmer

Currently reading Corbin et al. (2025) on “Talk is cheap: why structural assessment changes are needed for a time of GenAI”

Some reading about AI and assessment on the bus this morning… Quick summary so I don’t forget!

In “Talk is cheap: why structural assessment changes are needed for a time of GenAI”, Corbin et al. (2025) explain the difference between discursive and structural changes to assessment. Discursive changes are changes in instructions, which “students remain free to ignore”, for example traffic-light approaches (everything permitted for tasks marked green, nothing permitted for tasks marked red), or the declare-exactly-what-you-did strategy. Structural changes, on the other hand, are changes to the assessment itself.

Analysing how GenAI in assessment of learning is most commonly addressed, they find that people almost always rely on discursive changes. There are several problems with this approach:

  • The underlying assumption — that students will understand what exactly is permitted and what is not — is really not a valid assumption, because the lines between, for example, drafting and editing are blurry (as are having a question explained vs getting hints for how to approach it; brainstorming-yes-idea-generation-no; “refining” an argument; correcting grammar but not altering meaning; finding errors but not helping towards a solution; and many more).
  • Students need to voluntarily follow the rules, even when there are disagreements (or misunderstandings, as above) about what is legitimate use of GenAI for learning vs in assessment.
  • Compliance cannot be enforced because (other than a student leaving in parts of a response like “Sure, here is the response to the question …”, the teacher directly observing a student using GenAI, or the student admitting to it) there is no way to reliably detect GenAI. And this is where the traffic lights in the analogy are substantially different from traffic lights in real traffic, which exist within much larger structures with speed cameras, police patrols, fines, etc — that is, structures that can actually detect rule violations and enforce compliance much more effectively.

This leads to the “‘discursive paradox’: The more detailed and specific our instructions become about ‘acceptable’ AI use, the more we highlight the gap between what we can specify and what we can verify”. And the problem with that is that “By attempting to control through communication what can only be ensured through structure, these approaches create assessment environments where compliance becomes optional and potentially disadvantageous to students who follow the rules.”

Structural changes, on the other hand, are “Modifications that directly alter the nature, format, or mechanics of how a task must be completed, such that the success of these changes is not reliant on the student’s understanding, interpretation, or compliance with instructions. Instead, these changes reshape the underlying framework of the task, constraining or opening the student’s approach in ways that are built into the assessment itself”, thus increasing assessment validity. Ways this can be done (and is being done in many places) include, for example, going back to pen-and-paper assessment, assuming that if students don’t have access to technology, they also cannot use GenAI to cheat (but then they might be using technology that isn’t as easy to spot, like smart glasses…); adding oral assessment components on questions/topics picked randomly out of a large pool covering the desired learning outcomes; or having tutors sign off on data generated in labs that students will continue working with on their own. Typically, the focus shifts from product to process.

Corbin et al. (2025) share a “two-lane approach”, where lane 1 is controlled, in-person assessment, and lane 2 is open in the sense that there is no attempt to control how students use GenAI (but of course there can be structural changes there that discourage misuse, like having someone sign off on data produced in a lab so that students have to continue with that data set rather than generating synthetic data; or checkpoints where students discuss their work and then asynchronously build on the ideas discussed with the instructor). They conclude that “The path forward through this increasingly challenging terrain [of increasing AI capabilities] lies not in more sophisticated rules about AI use, but in fundamentally redesigning how we structure assessments to demonstrate student capability. This will require significant effort and creativity from educators but has the advantage of allowing for genuine solutions to maintaining assessment validity in an AI-enabled world. These must be solutions based not on what we say, but on what we do.”

Corbin et al. (2025) make it very clear that while assessment for learning is important, the focus of their article is on assessment of learning, and that therefore the focus is on the validity of said assessment; fair enough. But it is important to remember (and I am saying this to the regular readers of my blog, not the authors of the article, who I am sure keep that in mind) that the way we approach students and the kind of relationships we have with them influence what happens in the classroom. Trying to police students, trying to create cheat-proof assessments, might just undermine our relationships with them and make students want to find ways around obstacles that are put in their way (and especially at a faculty of engineering, I am sure they can find ways that teachers cannot dream up). And another point that I think we need to mention in any conversation about making assessments cheat-proof is accessibility. Making people write pen-on-paper under time pressure does not just assess learning; it also assesses, for example, who deals well with time pressure, and who can write pen-on-paper quickly and legibly. Adding more pressure to assessment situations is likely to affect students’ test anxiety unequally, thus reinforcing inequitable structures. So proceed with caution!

Ok, so far, so good. Two more interesting things that caught my eye this morning on the bus:

  • Stokel-Walker (2026) reports, in a Nature News Feature, on a researcher who put two obviously fake papers about a made-up medical condition on a preprint server (so obviously fake that the papers repeatedly state that the condition is not real, that the participants are made up, …) and then found GenAI models referring to that condition later. This is cool and fun, and at the same time obviously concerning and ethically questionable, and that makes it such an interesting case to discuss! (And Rachel then reminded me of the post by Germain (2026), who convinced GenAI that he’s the tech journalist who has eaten the most hot dogs. Similar idea, but less potentially harmful impact…)
  • Munoz (2026) presents a tool (which runs locally to protect data) to test quizzes against AI tools. It is intended as a tool for instructors to test the quizzes they have developed themselves, and I think it could be really useful and interesting! I will definitely need to try it next chance I get!

Corbin, T., Dawson, P., & Liu, D. (2025). Talk is cheap: why structural assessment changes are needed for a time of GenAI. Assessment & Evaluation in Higher Education, 50(7), 1087–1097. https://doi.org/10.1080/02602938.2025.2503964

Germain, T. (2026). I hacked ChatGPT and Google’s AI — and it took only 20 minutes. https://www.bbc.com/future/article/20260218-i-hacked-chatgpt-and-googles-ai-and-it-only-took-20-minutes

Munoz, A. (2026). Assessing and addressing online quiz vulnerability with AI. https://herdsa.org.au/herdsa-connect/assessing-and-addressing-online-quiz-vulnerability-ai

Stokel-Walker, C. (2026). Scientists invented a fake disease. AI told people it was real. Nature News Feature. https://doi.org/10.1038/d41586-026-01100-y


Featured image and pics below from when my parents were visiting over Easter!

It looks warm but it was not. And now the trees are starting to get green!


Adventures in Oceanography and Teaching © 2013-2026 by Mirjam Sophia Glessmer is licensed under CC BY-NC-ND 4.0
