Student evaluations of teaching are not only unreliable but also significantly biased against female instructors

To test this claim, we apply nonparametric permutation tests to data from a natural experiment at a French university (original study by Anne Boring) and from a randomized, controlled, blind experiment in the US (original study by Lillian MacNell, Adam Driscoll and Andrea N. Hunt). We confirm and extend the studies’ main conclusion: student evaluations of teaching (SET) are strongly associated with the gender of the instructor, with female instructors receiving lower scores than male instructors. SET are also significantly correlated with students’ grade expectations: students who expect higher grades give higher SET, on average. But SET are not strongly associated with learning outcomes.
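For readers unfamiliar with permutation tests, the general idea can be sketched in a few lines of Python. This is a minimal, generic two-sample version and not the analysis used in either study: the function name `permutation_test`, the difference-in-means statistic, the unstratified shuffling, and the scores below are all assumptions made for illustration. The scores are invented and do not come from the French or US data.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Hypothetical helper for illustration only. It shuffles the pooled
    scores, so it assumes group labels are exchangeable under the null
    hypothesis of no association between label and score.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the group labels by shuffling the pooled scores.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Count relabelings at least as extreme as the observed difference
        # (small tolerance guards against floating-point ties).
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Invented SET scores on a 5-point scale, NOT data from either study.
scores_perceived_male = [4.5, 4.0, 4.2, 3.9, 4.4, 4.1]
scores_perceived_female = [3.6, 3.9, 3.5, 4.0, 3.4, 3.7]

observed_diff, p_value = permutation_test(
    scores_perceived_male, scores_perceived_female
)
print(f"observed difference in means: {observed_diff:.3f}")
print(f"estimated two-sided p-value: {p_value:.4f}")
```

The test makes no assumption that scores follow a normal distribution: the reference distribution is built from the data by relabeling. A real analysis of course evaluations would likely need to restrict the shuffling, for example to within a course or section, which this sketch does not attempt.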