The authors have declared that no competing interests exist.

Conceived and designed the experiments: DZG BLW SLE AJC. Performed the experiments: DZG BLW SLE AJC. Analyzed the data: DZG SMG. Contributed reagents/materials/analysis tools: DZG SMG. Wrote the paper: DZG SLE BLW AJC SMG SEB.

Women who start college in one of the natural or physical sciences leave in greater proportions than their male peers. The reasons for this difference are complex, and one possible contributing factor is the social environment women experience in the classroom. Using social network analysis, we explore how gender influences the confidence that college-level biology students have in each other’s mastery of biology. Results reveal that males are more likely than females to be named by peers as being knowledgeable about the course content. This effect increases as the term progresses, and persists even after controlling for class performance and outspokenness. The bias in nominations is specifically due to males over-nominating their male peers relative to their performance. The over-nomination of male peers is commensurate with an overestimation of male grades by 0.57 points on a 4 point grade scale, indicating a strong male bias among males when assessing their classmates. Females, in contrast, nominated equitably based on student performance rather than gender, suggesting they lacked gender biases in filling out these surveys. These trends persist across eleven surveys taken in three different iterations of the same Biology course. In every class, the most renowned students are always male. This favoring of males by peers could influence student self-confidence, and thus persistence in this STEM discipline.

Male faculty members outnumber female faculty members in every science, technology, engineering, and math (STEM) discipline [

STEM faculty members provide some of the first professional feedback and interactions that students receive in their disciplines. Unfortunately, both male and female faculty members behave in ways that subtly favor males in STEM disciplines: (a) they are more likely to spend time mentoring males [

In addition to interactions with faculty members, interactions with other students could impact a student’s sense of belonging and confidence in her discipline. In contrast to the work on gender biases among faculty, only limited research has been performed on the disposition of current college-age students (the “millennial” generation) towards women in STEM and how this disposition may impact their female peers (but see [

In this paper we focus on the formative experience of nascent STEM professionals during an introductory college science course, a key transition period for the development of a STEM identity [

We explore the impact of gender on how students perceive their peers, as well as how students are perceived by their peers. It is important to note that the gender data used in this study come from the school registrar, and are thus defined by information given during student enrollment. The registrar constrains gender identification to ‘male’ or ‘female’. Given these complications, we choose to refer to student genders, but recognize that in some cases the data may not accurately reflect the true gender identity of each student.

To investigate how gender impacts peer perception, undergraduate students were asked to anonymously list class peers who they felt were “strong in their understanding of classroom material” at multiple time points throughout three iterations of a large introductory biology class. We employ longitudinal social network analyses of these data to (1) describe the distribution of nominations received between males and females, and (2) identify the factors that predict who a student will nominate as having mastered the content in their field. Finally, (3) we examine the characteristics of students receiving the most nominations in each class (whom we refer to as “celebrities”). We focus on these students given our assumption that their ability to draw widespread acknowledgment of their excellence makes them among the most likely in the class to continue in the field beyond the undergraduate level.

We obtained human subjects approval from the University of Washington Institutional Review Board (#44438). Because students were not asked to do anything outside of the normal class curriculum, an altered consent process was approved for use in this study. Subjects were informed that a research study was taking place and that their data would be analyzed as part of this study. Students were informed that they could opt out of the study at any time by filling out a form in a centralized office.

Data come from three different iterations of the same large introductory undergraduate biology class (

All three iterations of this course included a lab section with a maximum capacity of 24 students that met once a week for several hours. Classes A, B, and C contained 9, 33, and 33 lab sections, respectively. The gender distribution within lab sections is approximately normal and mirrors that of the overall class (Mean = 57.4% female, SD = 0.11). The lecture portion of the course met for 50 minutes a day four days out of the week, and employed active learning techniques in all three iterations of the course. In all three cases, lectures were split into two sections with approximately 100 students in each for Class A, and approximately 375 in each for Classes B and C; the instructor stayed consistent between lectures each class day to ensure minimal differences between the two sections. Classes A and B were both taught by a male instructor, while Class C had three total instructors: two male instructors teaching 75% of class days and one female instructor teaching 25% of class days. All three iterations of the class included three exams spaced throughout the quarter, and a non-cumulative final exam that took place one week after the end of the quarter. Grades were not publicly posted in any of the three classes.

A measurement of student outspokenness was collected by polling the course instructor of record immediately after the end of each course, and thus represents active participation as perceived by the instructor who was blind to the hypotheses being tested. Thus, a student who frequently offers an incorrect answer in class is considered equally outspoken as students who frequently offer the correct answers. Because measurements come from instructors, the list may be subject to each instructor’s own implicit biases.

All three classes consisted primarily of white and Asian students (40.5% and 29.9% of entire population across the three classes, respectively). Student ethnicity is not included in these analyses for two reasons. First, the diversity in each classroom is such that statistical power to understand the perception of minority students is lacking. Second, this issue is substantial enough to warrant its own separate analysis.

All network surveys were administered via a confidential online survey. For Class A, students were given a class roster after the first and second exams and were asked to mark students they felt were particularly strong with class material. In Class B, students were asked at the beginning of the class to list students by name who they felt would do particularly well in the course. After the first, second, and third exams, they were asked to list students they felt were particularly strong with class material. The same collection method was used in Class C as in Class B, but in addition students were surveyed again before the final exam of the course. Surveys in Class C distinguished between students who responded but did not know anyone they felt was knowledgeable and students who did not list anybody because they did not respond to the survey. Thus, Class C offers the most accurate means to calculate response rates. An average of 81.4% (SD = 0.02) of students responded across the five surveys in this class, with 82.8% of female students responding (SD = 0.02) and 79.9% of males responding (SD = 0.01). We have no reason to believe that Classes A or B differed in response rates, or that response rates were skewed by gender in any manner.

To assess the hypotheses about nomination structure, we used exponential-family random graph models (ERGMs). This approach can be thought of as a kind of generalization of logistic regression to social networks–with the log-odds of a tie (here, a nomination) between two actors being dependent on a set of predictors of interest [

We specify two models, both of the general form:
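$$\operatorname{logit}\left(P\left(Y_{ij} = 1 \mid Y^{c}_{ij}\right)\right) = \theta^{T}\,\delta\left(y_{ij}\right)$$

where Y_{ij} represents the value of the tie from i to j, and Y^{c}_{ij} represents the complement of Y_{ij}, i.e. the state of all of the ties in the network other than Y_{ij}. The δ vector represents the amount by which the model statistics change when Y_{ij} is toggled from 0 to 1, and the θ vector represents the coefficients on these statistics.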
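As a minimal sketch of how the general form described above turns change statistics into a nomination probability, the Python snippet below applies the inverse logit to the product of θ and δ. The three-term model and every coefficient value here are hypothetical placeholders chosen for illustration; they are not estimates from this study.

```python
import math

def tie_probability(theta, delta):
    """Conditional probability that the tie i -> j is present, given the
    rest of the network, when the log-odds equal the dot product of
    the coefficients (theta) and the change statistics (delta)."""
    if len(theta) != len(delta):
        raise ValueError("theta and delta must have the same length")
    log_odds = sum(t * d for t, d in zip(theta, delta))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients (NOT fitted values): an intercept, one binary
# dyad attribute, and one continuous attribute of the nominee.
theta = [-6.0, 0.5, 1.0]

# Change statistics for one hypothetical dyad: the intercept term is
# always 1, the binary attribute is present, the continuous one is 3.0.
delta = [1.0, 1.0, 3.0]

p = tie_probability(theta, delta)
print(f"probability of a nomination: {p:.3f}")
```

Because the model is linear on the log-odds scale, a one-unit change in any statistic multiplies the odds of a nomination by the exponential of its coefficient, regardless of the other terms.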

The first model contains seven model statistics (δ_{1} through δ_{7}) and the second model contains nine (δ_{1} through δ_{9}):

δ _{1} = 1 for all dyads [the main effect or intercept];

δ _{2} = 1 if

δ _{3} = 1 if

δ _{4} = 1 if both

δ _{5} = 1 if both

δ _{6} = 1 if

δ _{7} = -1 if

δ _{8} = j’s final grade in the class [grade of nominee];

δ _{9} = 1 if

We use the R package

A summary of student data stratified by gender can be found in

Proportionately, more males than females were listed as outspoken (p = 0.0258; Mantel-Haenszel test). While instructor bias causing this gender difference in outspoken status is something we cannot check, it is worth noting that any male bias in the assignment of outspoken status would make our estimates of male bias in peer perception more conservative than they actually are.

Classes are majority female in all three cases. Males performed slightly better than females in each class, and also tended to be more outspoken. Numerical counts are accompanied by total percentage in the class in parentheses. Means are accompanied by standard deviations in parentheses.

Measure | Class A, female | Class A, male | Class B, female | Class B, male | Class C, female | Class C, male |
---|---|---|---|---|---|---|
Total students | 110 (56%) | 86 (44%) | 431 (55.4%) | 328 (44.6%) | 444 (58.4%) | 316 (41.6%) |
Mean class grade (out of 4.0) | 2.68 (1.01) | 2.93 (0.82) | 2.74 (0.83) | 2.86 (0.84) | 2.75 (0.82) | 2.89 (0.76) |
Number of students listed as outspoken | 16 (14.5%) | 16 (16.3%) | 64 (14.8%) | 52 (15.8%) | 98 (22.1%) | 95 (30.1%) |
Mean number of nominations at S_{1} | - | - | 1.14 (1.50) | 1.20 (1.73) | 1.19 (1.52) | 1.13 (1.52) |
Mean number of nominations at S_{2} | 1.05 (1.39) | 1.60 (2.81) | 0.98 (1.45) | 1.16 (2.25) | 1.01 (1.41) | 1.08 (1.58) |
Mean number of nominations at S_{3} | 1.06 (1.55) | 1.69 (2.95) | 1.22 (1.55) | 1.48 (2.44) | 1.02 (1.43) | 1.17 (1.78) |
Mean number of nominations at S_{4} | - | - | 1.12 (1.64) | 1.55 (3.63) | 1.23 (1.60) | 1.44 (1.92) |
Mean number of nominations at S_{5} | - | - | - | - | 1.21 (1.55) | 1.36 (1.87) |

Across the 11 peer perception surveys, students received an average of 1.20 nominations with a standard deviation of 1.85; males averaged 1.31 nominations with a standard deviation of 2.23, while females averaged 1.12 nominations with a standard deviation of 1.51. Males consistently received more nominations than females in every survey, with the first survey in Class C as the only exception.
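The per-survey comparison can be checked directly from the summary table. The sketch below, which assumes that the first column for each class reports females and the second reports males (an ordering inferred from the table caption rather than stated in it), computes the male-minus-female gap in mean nominations for each of the 11 surveys and lists the surveys in which males did not receive more.

```python
# Mean nominations received per student, copied from the summary table.
# Each value is (female mean, male mean); the female-first ordering is an
# assumption inferred from the caption (classes are majority female).
mean_nominations = {
    ("A", "S2"): (1.05, 1.60),
    ("A", "S3"): (1.06, 1.69),
    ("B", "S1"): (1.14, 1.20),
    ("B", "S2"): (0.98, 1.16),
    ("B", "S3"): (1.22, 1.48),
    ("B", "S4"): (1.12, 1.55),
    ("C", "S1"): (1.19, 1.13),
    ("C", "S2"): (1.01, 1.08),
    ("C", "S3"): (1.02, 1.17),
    ("C", "S4"): (1.23, 1.44),
    ("C", "S5"): (1.21, 1.36),
}

# Positive gap: males received more nominations on average.
gaps = {
    key: round(male - female, 2)
    for key, (female, male) in mean_nominations.items()
}

# Surveys where males did not out-receive females.
exceptions = [key for key, gap in gaps.items() if gap <= 0]

for (cls, survey), gap in gaps.items():
    print(f"Class {cls} {survey}: male - female = {gap:+.2f}")
print("Exceptions:", exceptions)
```

Under that column assumption, the only non-positive gap is the first survey in Class C, and the gap in the last survey of each class is larger than in its first.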

In all three classes, the number of nominations given to males increased throughout the course. This pattern was particularly strong in Classes B and C, where data were collected across a longer time span. No consistent longitudinal trend for females is visible in any of the three classes. Combined, these patterns result in a growing gender gap in the number of nominations received between males and females when comparing data from surveys early in the class to those taken later in the class (

Sociographs at the beginning of course (S1) and after exam 3 (S4) in class B. Male students are represented by green circles and females by orange circles. The size of nodes correlates with how many nominations each student received. Arrows show direction from the nominator to the nominee.

To determine the significance of these results, we use Exponential Random Graph Models (ERGM). Our base model does not include grade or outspokenness in order to give an absolute sense of the gender differences in receiving nominations (

Over-representation of males in received nominations could be explained either by the higher frequency of outspokenness in males, or the higher average grades achieved by males compared to females, as both of these measures may indicate that, on average, males indeed know the material better or at least make their knowledge more visible to their peers. To test these explanations, we expanded our ERGM to include the class grade and outspokenness of the nominee as mediating factors (

Each column represents coefficients from a different survey (S): S1 surveys were taken the first week, S2 after the first exam, S3 after the second exam, and so on. Coefficients represent the influence on the log-odds of a nomination for each predictor; each is formally defined in the Methods section. Bolded coefficients indicate significance at α of 0.05. Positive coefficients indicate that ties are more likely to occur, while negative coefficients indicate that ties are less likely to occur. Values in parentheses represent 95% confidence intervals.

Coefficient name | Course A, S2 | Course A, S3 | Course B, S1 | Course B, S2 | Course B, S3 | Course B, S4 | Course C, S1 | Course C, S2 | Course C, S3 | Course C, S4 | Course C, S5 |
---|---|---|---|---|---|---|---|---|---|---|---|
Intercept | |||||||||||
Mutuality | |||||||||||
Grade of nominee | |||||||||||
Outspokenness of nominee | 0.02 (-0.11, 0.15) | ||||||||||
Homophily on lab section | |||||||||||
0-indegree | |||||||||||
Female nominator | 0.16 (-0.05, 0.37) | ||||||||||
Female-female bias | -0.04 (-0.33, 0.25) | -0.21 (-0.49, 0.07) | 0.08 (-0.06, 0.22) | 0.01 (-0.15, 0.18) | -0.04 (-0.20, 0.11) | -0.11 (-0.26, 0.03) | 0.14 (-0.04, 0.32) | 0.12 (-0.05, 0.29) | 0.09 (-0.07, 0.26) | 0.11 (-0.04, 0.26) | 0.15 (0.00, 0.31) |
Male-male bias | |||||||||||

Cell entry = Point estimate (95% CI); S = survey number; Bold = significant (p < 0.05).

Note: for Course C Survey S5, the lower bound of the confidence interval is positive for female nominator and negative for female-female bias, although both round to 0 with only two decimal places.
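To make the log-odds scale of the table more concrete, the following sketch converts two of the reported female-female bias entries into odds ratios and checks whether each 95% confidence interval excludes zero. The helper names are illustrative, and the check uses the rounded values as printed, so it cannot resolve bounds that round to 0.00.

```python
import math

# Female-female bias entries copied from the coefficient table:
# (point estimate, lower 95% bound, upper 95% bound), on the log-odds scale.
female_female_bias = {
    "Course A, S2": (-0.04, -0.33, 0.25),
    "Course C, S5": (0.15, 0.00, 0.31),
}

def to_odds_ratio(estimate, lower, upper):
    """Exponentiate a log-odds estimate and its interval bounds."""
    return tuple(math.exp(value) for value in (estimate, lower, upper))

def excludes_zero(lower, upper):
    """True when the interval lies strictly on one side of zero."""
    return lower > 0 or upper < 0

for survey, (est, lo, hi) in female_female_bias.items():
    odds, odds_lo, odds_hi = to_odds_ratio(est, lo, hi)
    print(
        f"{survey}: odds ratio {odds:.2f} ({odds_lo:.2f}, {odds_hi:.2f}); "
        f"interval excludes zero: {excludes_zero(lo, hi)}"
    )
```

An odds ratio near 1 corresponds to a coefficient near 0, that is, no detectable tendency in either direction.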

Performance is a strong and significant predictor of receiving a nomination in every survey, indicating that students have an accurate sense of other students’ performance, despite not having any public way to view their peers’ grades. In addition, outspokenness has a significant effect in all but one case, indicating that students also nominate based on this trait. Being in the same lab section is also universally predictive of a nomination from one student to another. There is a significant tendency for there to be more students with no nominations than expected by chance given the overall nomination rates and the other terms in the model. The female nominator coefficient indicates that females make more nominations overall than males do, without considering the gender of those they nominate.

With performance and outspokenness in the model, females no longer show a bias toward nominating males in any of the 11 surveys; their nominations do not diverge from gender expectations in either direction in any survey. Males, on the other hand, continue to show a significant bias towards males in all 11 surveys; in each case the magnitude of the effect declines but remains significant.

Model-based predictions for a hypothetical class comprising 50% males and 50% females. To isolate the effect of gender bias, this class was also modeled as having an equal grade distribution and level of outspokenness across genders. We plot the results from 100 simulations for each of the models; the main bars represent the mean, and the whiskers reflect the range in which the central 95% of the simulations fall. Even with equal performance and outspokenness in this hypothetical class across all three model predictions, the longitudinal increase in bias of male students to nominate males remains. Female students also demonstrate a pattern of moving from female to male nominations over the course of each class.
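The summary used for the bars and whiskers (a mean plus the central 95% range over 100 simulations) can be sketched as follows. The simulated values here are synthetic stand-ins, drawn as simple coin flips with no gender bias, and are not output from the fitted network models; the number of nominations per simulation is likewise a hypothetical choice.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

N_SIMULATIONS = 100
NOMINATIONS_PER_SIM = 200  # hypothetical number of nominations per simulated class
P_MALE_NOMINEE = 0.5       # hypothetical: unbiased nominations in a 50/50 class

def simulate_share_male():
    """Share of nominations going to males in one synthetic simulation."""
    male = sum(random.random() < P_MALE_NOMINEE for _ in range(NOMINATIONS_PER_SIM))
    return male / NOMINATIONS_PER_SIM

shares = [simulate_share_male() for _ in range(N_SIMULATIONS)]

# Bar height: mean across simulations.
bar_height = statistics.mean(shares)

# Whiskers: 2.5th and 97.5th percentiles (cut points every 2.5%).
cuts = statistics.quantiles(shares, n=40)
whisker_low, whisker_high = cuts[0], cuts[-1]

print(f"mean = {bar_height:.3f}, central 95% range = ({whisker_low:.3f}, {whisker_high:.3f})")
```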

Another way to understand the magnitude of the gender bias is to compare its coefficient to that for class grade point average (GPA), our best proxy for actual mastery of course material scored on a 4 point scale. Averaged across the 11 surveys, females give a boost to fellow females relative to males that is equivalent to an increase in GPA of 0.040; i.e. they would be equally likely to nominate an outspoken female with a 3.00 and an outspoken male with a 3.04. On the other hand, males give a boost to fellow males that is equivalent to a GPA increase of 0.765; for an outspoken female to be nominated by males at the same level as an outspoken male her performance would need to be over three-quarters of a GPA point higher than the male’s. On this scale, the male nominators’ gender bias is 19 times the size of the female nominators’.
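The conversion described above amounts to dividing a gender-bias coefficient by the grade coefficient, since both act on the same log-odds scale. The sketch below shows that calculation with hypothetical coefficient values (the fitted coefficients are not reproduced here), then uses the two GPA-equivalents reported in the text to recover the stated ratio.

```python
def gpa_equivalent(bias_coef, grade_coef):
    """GPA points that shift the log-odds of a nomination by the same
    amount as the gender-bias term (both coefficients on the log-odds scale)."""
    if grade_coef == 0:
        raise ValueError("grade coefficient must be non-zero")
    return bias_coef / grade_coef

# Hypothetical coefficients for illustration only (NOT fitted values).
example = gpa_equivalent(bias_coef=0.6, grade_coef=1.2)
print(f"hypothetical bias is worth {example:.2f} GPA points")

# GPA-equivalents reported in the text, averaged across the 11 surveys.
female_boost = 0.040  # females nominating females
male_boost = 0.765    # males nominating males

ratio = male_boost / female_boost
print(f"male bias is about {ratio:.1f} times the female bias")
```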

The three or four most nominated students in all classes examined were male. In each class, most students received very few nominations, while several students emerged over the course of the class as exceptionally well known; we refer to these students as “celebrities”. Several patterns are evident in the distribution of nominations in these classes (^{th} in two classes, and are 5^{th} most well-known in the other. Third, male students at the top of the distribution tend to be considerably better known than any other student in the course. This is especially pronounced in Class B, where the most renowned male (52 nominations) received 5.78 times as many nominations as the most renowned female (9 nominations). The most renowned male in Class A (16 nominations) has twice as many nominations as the most renowned female (8 nominations), while in Class C the most renowned male (13 nominations) has 1.63 times as many nominations as the most renowned female (8 nominations). These high nomination counts are notable, given the low average number of nominations seen across all 11 surveys (1.20). While the number of nominations achieved by celebrities in each class varies, the male-biased pattern among the most frequently nominated peers holds.
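The ratios quoted for the most renowned students follow directly from the nomination counts. The short check below reproduces them from the counts given in the text; it uses decimal arithmetic with half-up rounding, an implementation choice made here so that 13/8 = 1.625 rounds to 1.63 as reported.

```python
from decimal import Decimal, ROUND_HALF_UP

# Nominations for the most renowned male and female in each class,
# as (male, female), taken from the counts quoted in the text.
top_nominations = {"A": (16, 8), "B": (52, 9), "C": (13, 8)}

def nomination_ratio(male, female):
    """Male-to-female ratio rounded half-up to two decimal places."""
    return (Decimal(male) / Decimal(female)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )

ratios = {cls: nomination_ratio(m, f) for cls, (m, f) in top_nominations.items()}

for cls, value in ratios.items():
    print(f"Class {cls}: top male has {value} times the nominations of the top female")
```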

Students with the five highest numbers of nominations are depicted for each class. The numbers above each student represent how many nominations that student received, while the numbers below each student represent their grade point average earned in the course out of 4 points. These data come from the last surveys administered in Classes A, B, and C, and represent our best estimate for the perceptions developed by the end of each class.

The male majority among classroom celebrities could be explained if males were the only students who both achieved high grades and spoke up frequently in class. However, this is not the case. While male students on average scored slightly higher than female students and were more likely to be outspoken in every class, outspoken females with grades as high as these most renowned male students exist in every class (

The underrepresentation of women in STEM is a complex and daunting problem. Increasing gender equity requires tackling both inequalities in students’ initial interest in STEM and the retention of women who have expressed that interest. While there is strong evidence that precollege factors influence a student’s initial decision to major in a STEM field [

In three iterations of an undergraduate biology class, we found that even after controlling for actual course performance and outspokenness, male peers still disproportionately nominate males as being knowledgeable about biology while females nominate males and females equally. This indicates that males hold a bias against their female peers’ competence in biology. Our finding of peers as a second source of differential treatment by gender, beyond known biases of faculty, contributes to a more complete picture of the experiences of undergraduate women in STEM fields. The coalescence of subtle messages about their STEM abilities from both faculty and peers may undermine the self-confidence females have to persist in STEM fields beyond their undergraduate education [

The finding that a gender bias impacts the perception of millennial students may at first seem surprising, but is supported by work on implicit biases. Implicit biases are unconscious associations that people hold related to certain groups. Across many cultures, STEM is associated with males and not females [

One potential analytical concern for the current study is multiple comparisons. This occurs when statistical analyses involve multiple outcome measures, test for an effect of multiple independent variables on a single outcome measure, or repeat the research design across several populations. In each case, the chance of finding a false positive is increased by adding another test. Because we repeated our study design three times and include multiple independent variables in our models, we are performing multiple tests, and thus have increased chances of a false positive. However, the repeated significance of our main result (that males over-nominate their male peers) across every survey gives us no reason to suspect that it is spurious due to multiple comparisons. It appears that males consistently hold a bias against their female peers’ competence in biology.
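A rough sense of the numbers behind this argument can be had from the sketch below. It treats the 11 surveys as independent tests at α = 0.05, which is a simplifying assumption made only for illustration (surveys of the same students over time are not independent), and compares the chance of at least one false positive with the chance that all 11 tests are false positives.

```python
ALPHA = 0.05    # per-test significance level used in the coefficient table
N_SURVEYS = 11  # number of peer perception surveys

# Chance of at least one false positive across the surveys,
# ASSUMING independent tests and a true null in every survey.
family_wise_error = 1 - (1 - ALPHA) ** N_SURVEYS

# Chance that every one of the surveys yields a false positive,
# under the same independence assumption.
all_false_positive = ALPHA ** N_SURVEYS

print(f"P(at least one false positive) = {family_wise_error:.3f}")
print(f"P(all {N_SURVEYS} are false positives) = {all_false_positive:.2e}")
```

Under these assumptions a single spurious result is fairly likely, whereas a spurious result in every survey is vanishingly unlikely, which is the intuition behind relying on the consistency of the main finding.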

Our work suggests that processes in the classroom may either be reinforcing pre-existing implicit biases over the quarter, or at least facilitating behaviors based on these biases. The end of every class term shows a stronger male bias than the beginning. This pattern is mediated by two class-related factors: 1) whether or not a student is outspoken in class and 2) level of achievement in the class. These factors, which seem to influence the opinions of both male and female peers, have previously been found to differ by gender in biology: males are more likely to be heard speaking in class and males slightly, but systematically, outperform women [

We propose that the specific classroom environment can influence the effect size of the male bias, with some support for this hypothesis from Class C. In this term males did not behave differently than in previous years. Females, however, developed a stronger bias towards nominating other females than in the other two classes. Though this bias was not significant, it effectively lessens the overall magnitude of bias towards male students. Although we cannot specifically pinpoint why this was the case, this class differed from the other two in two critical ways. First, one of the three instructors in this course was female, whereas all instructors were male in the other two classes. Female instructors, when they are considered role models, have been shown to reduce the science-gender biases of female students, and this may have impacted the latter’s nomination patterns [

The context of this research on peer perceptions was an introductory biology classroom. We can only speculate on the peer biases present in other STEM fields, but we predict that the male bias observed in this study may be conservative relative to other STEM fields for three reasons. First, biology is thought to be the STEM field with the most gender equity: undergraduate enrollment is nearly equal in terms of males and females [

Our findings have strong implications regarding the effectiveness of existing strategies to increase women in STEM fields. Without addressing social dynamics that perpetuate gender biases in the college classroom, simply increasing the number of young women entering STEM majors may not be enough. The patterns of uneven peer perceptions by gender shown in our student population suggest that future populations of academics may perpetuate the same gender stereotypes that have been illuminated among current faculty. This may not only be the case because the male students receiving high celebrity are reaffirmed in their abilities and are better able to advance through the STEM pipeline than women who do not receive this affirmation, but also because the existence of “celebrity” males and other individuals with distinction can impact and reaffirm the stereotypes held by others [

Male students are represented by green circles and females by orange circles. The size of nodes correlates with how many nominations each student received in the corresponding survey. Arrows show direction from the nominator to the nominee.

(EPS)

Even though outspoken females with extremely high scores exist, they fail to reach the same “celebrity” status as their male counterparts.

(TIFF)

Plots compare the in-degree distribution across students in the observed data to that for 10 network simulations from the model. Plots cover the six networks from the first two classes in consecutive order (top row: Course A, S2 and S3; middle row: Course B, S1 and S2; bottom row: Course B, S3 and S4). The x-axis is defined by number of nominations (“in-degree”), and the y-axis by the proportion of students displaying that in-degree. Thick black lines represent the observed distribution. Boxplots represent the simulations, with boxes representing the median and interquartile range, whiskers representing the minimum and maximum, and circles and gray lines representing the 95% support intervals.

(TIFF)

Data points are represented as numbers (1–4) and colors (black, red, green, and blue) corresponding to the first, second, third, and fourth exams. In each class, exam scores correlate strongly with overall course grades. Due to this correlation, we chose to simplify our analyses by using course grade as a predictor across all models, as opposed to using a unique contemporaneous exam score at each time point.

(PNG)

This model controls for mutuality, and thus takes into account the increased likelihood of a nomination from student A to student B, given a nomination from B to A. This model shows the gender bias in nominations before taking into account outspokenness and class performance.

(DOCX)

We thank Mary Pat Wenderoth, Scott Freeman, Erin Shortlidge, and Jim Collins for their thoughtful comments on this manuscript. We thank the many students for their participation in this study. Lastly, we thank Arielle Desure, Carrie Sjogren, Katherine Cook, and Sarah Davis for their help compiling the network data used in these analyses. We also thank Nicholas Horrocks and two anonymous reviewers for their useful comments.