It is really difficult to know whether students have learned what we wanted them to learn, especially when we have to collect evidence of that learning for a valid assessment. The best we can do is look for proxies that we take to stand for bigger and more complex outcomes. But “[t]he proxies we use determine what is made visible, what is foregrounded and what remains hidden”. So in their paper “Identifying what our students have learned: a framework for practical assessment validation”, Fawns et al. (2026) present the 4Ps framework, which is really helpful to consider for exactly that purpose!
The 4Ps framework considers the 4 main types of evidence used in assessment:
- Product: Product assessment focuses on a (typically finished) artifact created by students, for example a report, a prototype, or a video presentation. Since artifacts are typically portable and accessible once submitted (especially if submitted electronically), they can be assessed anywhere, at any time, without the student being present. Since they can also be stored, the assessment can be revisited by other assessors if necessary. But a finished product does not tell us anything about the process by which it was produced: the choices that were made, the tools that were used, the communication with others about it and, importantly in group work, who contributed what. Assessment of products can also easily be influenced by skills or resources that were not supposed to be assessed because they are not part of the intended learning outcomes, like design skills or craft supplies for posters that are meant to display research skills.
- Performance: Performance assessment is about observing students directly while they are doing something, for example demonstrating teaching or participating in a role-play. A recording submitted by students would count as a product, not a performance, and might be rehearsed or doctored, so recordings only count if they are made officially, in a controlled way, while the live performance is also being observed. A performance is also only a performance if the focus is on the direct observation of ability, not on the quality of products like pre-prepared slides or scripts. But “performance is volatile and context-dependent”, and therefore needs to be assessed repeatedly over time and in different contexts.
- Process: Process assessment is about how a product or performance was prepared, focusing on decision making, problem solving, etc. When the process is observed through products like lab notes or reflection papers, this introduces the problem that the product might misrepresent the actual process, because students write what they think the teacher wants to read: “Such retrospective accounts collapse into product assessment, privileging polished narratives or hero’s journeys over authentic traces of learning”. Process assessment is most useful in formative settings, and most powerful when combined with proxies of performance and products.
- Practices: “Practice assessments focus on the patterned yet situated ways students carry out their work, whether studying, engaging in professional activities, or enacting assessment performances”, so it is about observing patterns unfolding in authentic situations: “Practice assessments illuminate how learning is enacted not only in tasks but in the ongoing cultivation of ways of being and working”.
There are different ways in which what we want to assess is not actually assessed by the method we choose, for example:
- mismatch, for example assessing teaching skill through reports on a theoretical approach to teaching
- mismanagement, for example assessing knowledge about a topic via how well someone could facilitate a group discussion on it
- misinterpretation, for example of a GenAI-generated text as a proxy for student learning
- slippage, for example “evaluating group contributions based on final products rather than collaborative processes”
- spillage, for example “what might seem like a process-oriented assessment, such as observing a student’s work in a laboratory, can sometimes also involve inferences made from a product (e.g. the accuracy of the results) or a performance (e.g. a presentation of the results).”
- over-saturation, which is when “one type of assessment proxy is repeatedly employed across a curriculum, implicitly communicating to students that this form of evidence defines academic success”. We see this with the focus on products: “Products can serve only as partial indicators of capability. Without adequate observation of or evidence about production processes, educators cannot confidently determine a student’s authentic contribution to the work”, nor does a product alone tell us anything about the student’s learning.
So what could we do to minimize those problems?
- aggregate: this is what is commonly done, where several smaller assessments are lumped together into a final grade or pass/fail decision. But averaging is problematic because it conceals a more nuanced picture of learning (two students can end up with the same average even if one improved steadily while the other declined)
- triangulate: using different proxies to look at competence from different perspectives and see whether they strengthen each other or point to gaps
- holistically assess: “Holistic approaches rely on assessors’ evaluative judgement to make sense of multiple pieces of evidence in relation to the intended learning outcomes”
But who synthesizes the different pieces of evidence? Software that calculates an average, students involved in self- or peer-assessment, or teams of assessors? All of these need to be handled with care.
Also, Fawns et al. (2026) suggest considering whether every small unit needs to be assessed, or whether students could be assessed at program level to “support balanced sampling, sustainable resource use and more meaningful feedback”.
I found reading this article super helpful for putting my finger on some things that have bugged me about assessment but that I couldn’t really articulate. For example, how the form of written artifacts, the language capabilities, and the formatting often get conflated with the underlying understanding that the assessment is supposed to measure. The 4Ps framework is really valuable for disentangling the different types of assessment and checking whether they actually evaluate what they were intended to.
Fawns, T., Boud, D., and Dawson, P. (2026). Identifying what our students have learned: a framework for practical assessment validation, Assessment & Evaluation in Higher Education, DOI: 10.1080/02602938.2026.2620053
Featured image and the ones below from a cooold walk on Sunday! My favorite picnic spot…

And a view of Öresund bridge and Malmö on the horizon!
