One of my pet peeves are student evaluations that are interpreted way beyond what they can actually tell us. It might be people not considering sample sizes when looking at statistics (“66,6% of students hated your class!”, “Yes, 2 out of 3 responses out of 20 students said something negative”), or not understanding that student responses to certain questions don’t tell us “objective truths” (“I learned much more from the instructor who let me just sit and listen rather than actively engaging me” (see here)). I blogged previously about a couple of articles on the subject of biases in student evaluations, which were then basically a collection of all the scary things I had read, but in no way a comprehensive overview. Therefore I was super excited when I came a sytematic review of the literature this morning. And let me tell you, looking at the literature systematically did not improve things!
In the article “Sexism, racism, prejudice, and bias: a literature review and synthesis of research surrounding student evaluations of courses and teaching.” (2021), Troy Heffernan reports on a systematic analysis of the existing literature of the last 30 years represented in the major databases, published in peer-reviewed English journals or books, and containing relevant terms like “student evaluations” in their titles, abstracts or keywords. This resulted in 136 publications being included in the study, plus an initial 47 that were found in the references of the other articles and deemed relevant.
The conclusion of the article is clear: Student evaluations of teaching are biased depending on who the students are that are evaluating, depending on the instructor’s person and prejudices that are related to characteristics they display, depending on the actual course being evaluated, and depending on many more factors not related to the instructor or what is going on in their class. Student evaluations of teaching are therefore not a tool that should be used to determine teaching quality, or to base hiring or promotion decisions on. Additionally, those groups that are already disadvantaged in their evaluation results because of personal characteristics that students are biased against, also receive abusive comments in student evaluations that are harmful to their mental health and wellbeing, which should be reason enough to change the system.
Here is a brief overview over what I consider the main points of the article:
It matters who the evaluating students are, what course you teach and what setting you are teaching in.
According to the studies compiled in the article, your course is evaluated differently depending on who the students are that are evaluating it. Female students evaluate on average 2% more positively than male students. The average evaluation improves by up to 6% when given by international students, older students, external students or students with better grades.
It also depends on what course you are teaching: STEM courses are on average evaluated less positively than courses in the social sciences and humanities. And comparing quantitative and qualitative subjects, it turns out that subjects that have a right or wrong answer are also evaluated less positively than courses where the grades are more subjective, e.g. using essays for assessment.
Additionally, student evaluations of teaching depend on even more factors beside course content and effectiveness, for example class size and general campus-related things like how clean the university is, whether there are good food options available to students, what the room setup is like, how easy to use course websites and admission processes are.
It matters who you are as a person
Many studies show that gender, ethnicity, sexual identity, and other factors have a large influence on student evaluations of teaching.
Women (or instructors wrongly perceived as female, for example by a name or avatar) are rated more negatively than men and, no matter the factual basis, receive worse ratings at objective measures like turnaround time of essays. Also the way students react to their grades depends on their instructor’s gender: When students get the grades they expected, male instructors get rewarded with better scores, when their expectations are not met, men get punished less than women. The bias is so strong that for young (under 35 years old) women teaching in male-dominated subjects, this has been shown to have an effect of up to 37% lower ratings for women.
These biases in student evaluations result in strengthening the position of an already privileged group: white, able-bodied, heterosexual, men of a certain age (ca 35-50 years old), who the students believe to be heterosexual and who are teaching in their (and their students’) first language get evaluated a lot more favourable than anybody who does not meet one or several of the criteria.
Abuse disguised as “evaluation”
Sometimes evaluations are also used by students to express anger or frustration, and this can lead to abusive comments. Those comments are not distributed equally between all instructors, though, they are a lot more likely to be directed at women and other minorities, and they are cummulative. The more minority characteristics an instructor shows, the more abusive comments they will receive. This racist, sexist, ageist, homophobic abuse is obviously hurtful and harmful to an already disadvantaged population.
My 2 cents
Reading the article, I can’t say I was surprised by the findings — unfortunately my impression of the general literature landscape on the matter was only confirmed by this systematic analysis. However, I was positively surprised to read the very direct way in which problematic aspects are called out in many places: “For example, women receive abusive comments, and academics of colour receive abusive comments, thus, a woman of colour is more likely to receive abuse because of her gender and her skin colour“. This is really disheartening to read on the one hand because it becomes so tangible and real, especially since in addition to being harmful to instructors’ mental health and well-being when they contain abuse, student evaluations are also still an important tool in determining people’s careers via hiring and promotion decisions. But on the other hand it really drives home the message and call to action to change these practices, which I really appreciate very much: “These practices not only harm the sector’s women and most underrepresented and vulnerable, it cannot be denied that [student evaluations of teaching] also actively contribute to further marginalising the groups universities declare to protect and value in their workforces.”.
So let’s get going and change evaluation practices!
Heffernan, T. (2021). Sexism, racism, prejudice, and bias: a literature review and synthesis of research surrounding student evaluations of courses and teaching. Assessment & Evaluation in Higher Education, 1-11.