The authors have declared that no competing interests exist.

Discrimination and prejudice against overweight people are common in Western societies. In this article we aim to understand whether these attitudes reverberate into the school setting, by investigating whether teachers grade overweight students more severely than comparable normal weight students. Relying on the Attribution-Value Model of Prejudice (AVMP) and previous studies, we test a series of hypotheses using data from the German National Educational Panel Study (NEPS SC3) on a sample of students enrolled in the 7th grade (lower secondary education). We used hierarchical ordered logit regression to assess whether overweight and obese students receive systematically lower grades from their teachers in German and mathematics, adjusting for subject-specific competences measured with a standardized test, and a rich set of students’ socio-demographic and socio-psychological characteristics (e.g. the “big five”). Results suggested that overweight and obese students were more severely graded in both subjects. The penalty for overweight students, and especially for obese students, was slightly larger in German and in the lowest part of the grade distribution. There was also an indication of heterogeneous penalties by gender, with overweight male students being especially penalized in math. Possible ways to help teachers assign grades more fairly are discussed at the end.

Discrimination against overweight people in the labor market and in social life is well documented and has increased over time [

In this article we aim to understand whether obese and overweight people constitute stigmatized and penalized social categories in a specific context that has attracted comparatively less attention, namely the school. In particular, we are interested in how teachers evaluate the scholastic performance of overweight students compared to normal weight students in German lower secondary school.

In most previous studies, school grades are mainly intended as indicators of academic proficiency, assuming that body mass diminishes performance in school via poor health and a resulting reduction in study effectiveness [

The literature on teachers’ grading in school posits that grades are not always a fair measure of subject-specific competences because they can also reflect teachers’ attitudes about the students [

While gender and migration background are commonly considered to be unchangeable, overweight is perceived as a controllable attribute that is culturally undesirable at the same time [

Consistent with the controllability aspect of the AVMP, overweight people are perceived as lacking personal control, being lazy, less conscientious [

The cultural preference for thinness in Western countries is also well documented. In all socioeconomically highly developed societies thin bodies have the highest ratings of attractiveness [

Against this background, our research questions are: Are overweight and obese students graded more severely by their teachers than equally competent normal weight students? Moreover, does the potential grading bias vary by subject and gender? The answers to these questions are relevant for at least two reasons.

The first reason is fairness, given the effect of grades on students’ motivation to learn and on their well-being [

We investigate the potential grading bias by using data of the German National Educational Panel Study with students attending the seventh grade [

The paper is organized in the following way: We set the basis for the research hypotheses by outlining results from the previous literature about grading bias towards overweight people. After presenting the German National Educational Panel Study data, variables and our methods, we report the empirical findings of the hierarchical ordinal logistic regression models. The article concludes with a discussion of the results and possible implications for teachers and schools.

In the school context, the beliefs of teachers about students’ competences and work ethics matter, and overweight students face difficulties that can have consequences for their future careers. In a study by Neumark-Sztainer and colleagues [

Some studies reveal a relationship between student BMI and school grades: After controlling for sociodemographic factors, overweight students have a lower grade point average than normal weight students [

Other studies, using American samples, go further and partly account for competences as an alternative explanation, besides discrimination, for grade differences. Sabia [

Despite some heterogeneity in the research findings, most works found that BMI is negatively related to academic achievement. Moreover, several studies report that teachers might be prejudiced against overweight students. Following these insights, and regarding our research question of whether overweight and obese students are graded more severely, we expect that

(Hyp. 1).

The second research question asked whether teachers’ bias varies by subject. In this work, we focus on two key subjects in secondary education, namely German and mathematics. If discrimination against overweight students is just an inherent feature of the teachers in their role of graders, we could predict a similar degree of bias in both subjects. If instead the way in which students are commonly tested and the standards usually adopted by the teachers play a role as well, we could expect some differences in the grading bias across subjects. In mathematics, most exams are based on exercises with a precise solution, whereas in German teachers assess students more often with oral exams and open-ended questions, in which subjective elements have more room to affect the evaluation process. If these characteristics matter, we should expect that

(Hyp. 2).

Our third research question asks whether teachers’ grading bias differs between males and females. It has been argued that obesity reduces the perception of femininity by others more than it reduces the perception of masculinity [

(Hyp. 3).

We make use of the German National Educational Panel Study (NEPS)–Starting Cohort 3, which provides information on students attending grade 7 in 2012, who are repeatedly interviewed in the subsequent years [

We restricted the sample by excluding students without matches in the parent and institution data set. 47% of the cases from the original target data set were deleted because they had no matches (institution or parent data). After this operation, we explored the data and found that while the incidence of missing values is null or modest for several socio-demographic variables, the share of missing values is higher for the outcomes (around 7%) and especially for the main independent variable (25%) (see

Imputation phase: We filled the missing data with estimated values and created a complete dataset using the chained equations/MICE approach, which uses a separate conditional distribution for each imputed variable [

Analysis Phase: We applied the hierarchical ordinal logistic regression models (see below) to analyze teachers’ grading in each of the 50 complete data sets.

Pooling Phase: The parameter estimates (coefficients and standard errors) obtained from each analyzed data set are then combined for inference using the so-called ‘Rubin’s rules’, which properly take into account the variation between imputations [
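The pooling step can be illustrated with a minimal sketch of Rubin's rules for a single coefficient. The function name `pool_rubin` and the numbers in the usage lines are hypothetical and are not taken from the NEPS analysis; the sketch only shows how the within- and between-imputation variances are combined.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates  -- the coefficient estimated in each imputed data set
    std_errors -- the matching standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)                       # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(estimates)                  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between         # total variance of the pooled estimate
    return pooled, math.sqrt(total)


# Hypothetical values for three imputations (the study itself uses 50).
example_estimate, example_se = pool_rubin([-0.52, -0.48, -0.50], [0.10, 0.11, 0.10])
```

The pooled standard error is larger than the average standard error of the single analyses whenever the estimates differ across imputations, which is how the uncertainty due to missing data enters the inference.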

The variables come from several sub-data sets of the NEPS. The sub-data sets contain information from the student lists, the students and parents, and standardized competence tests. In the case of students, all time-variant variables were measured in the third wave while the time-invariant variables (gender, native language) were recorded in the first wave. The information on parents is measured once in the first wave and updated in subsequent waves in case of changes.

The grades in mathematics and German are self-reported by the students. Taking into consideration the highly skewed distribution of the original variable and to maintain enough statistical power for the analysis, we recoded the original grades into four categories:

Body mass index (BMI) categories were built using the BMI-for-age growth charts of the National Center for Health Statistics [
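As an illustration of how such categories can be derived, the sketch below computes BMI and maps a BMI-for-age percentile to a weight category. The cutoffs (5th, 85th and 95th percentile) are the conventional CDC thresholds and are an assumption here, since this passage does not state which cutoffs were applied. The function names are hypothetical, and the percentile itself would have to be looked up in the age- and sex-specific growth charts, which the sketch does not include.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by squared height in meters."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def weight_category(bmi_percentile: float) -> str:
    """Map a BMI-for-age percentile (0-100) to a weight category.

    ASSUMPTION: conventional CDC cutoffs, not necessarily those used in the study.
    """
    if not 0 <= bmi_percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if bmi_percentile < 5:
        return "underweight"
    if bmi_percentile < 85:
        return "normal weight"
    if bmi_percentile < 95:
        return "overweight"
    return "obese"


# Hypothetical student: 50 kg, 1.60 m, at the 90th BMI-for-age percentile.
example_bmi = bmi(50.0, 1.60)
example_category = weight_category(90.0)
```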

We control for subject-specific competence as an alternative explanation for lower grades, other than teacher bias. Subject-specific competence is measured by weighted maximum likelihood estimates (WLE), which are corrected for the position of the test domain in the test book. The resulting score is the best estimate of each respondent’s competence given their observed answers to several items belonging to a domain [

Aside from competences, other variables need to be accounted for because they can influence both grades and weight. Parents’ socioeconomic status (SES) is negatively associated with overweight [

Variable | Item from NEPS | Type of Answer |
---|---|---|
Reading competence (test scores) | Booklet with 29/30 reading competence items | WLE score corrected for the test position, computed by the NEPS staff |
Mathematics competence (test scores) | Booklet with 23 mathematics competence items | WLE score corrected for the test position, computed by the NEPS staff |
Gender | School’s list: gender child | 0 = female; 1 = male |
School region | Please enter the postal code of your school. | 1 = former West Germany; 2 = former East Germany incl. Berlin |
Age in years | When were you born? | Open question |
Native language | Now let’s talk about your mother tongue: which language did you learn as a child in your family? | 0 = German only; 1 = other native language |
Parental SES | International Socio-Economic Index of occupational status (ISEI) (Highest most recent ISEI among pairs, most recent ISEI if single parent) | 16 to 90 |
Parental ISCED | International Standard Classification of Education (ISCED) (Highest most recent ISCED among pairs, most recent ISCED if single parent) | 1 = ISCED 2 or lower; 2 = ISCED 3b; 3 = ISCED 3a & 3c; 4 = ISCED 4a & 5b; 5 = ISCED 5a & 6 |
School type | Sample: type of school | 0 = Hauptschule; 1 = Realschule; 2 = Gymnasium; 3 = schools with different tracks |
Openness | I do not care much about arts. [R] & I have an active imagination, I am an imaginative person. | 9-step sum scale from 2 items: 1 = does not apply at all |
Conscientiousness | I am easy-going and tend to be a bit lazy. [R] & I am thorough. | Same scale as Openness |
Extraversion | I tend to be cautious, reserved. [R] & I am out-going and sociable. | Same scale as Openness |
Agreeableness | I tend to be critical of other people. [R] & I trust other people easily, I believe in the goodness in people. | Same scale as Openness |
Neuroticism | I am relaxed and don’t get easily stressed. [R] & I am considerate, sensitive. | Same scale as Openness |
Attachment to school | Going to school for a long time is a waste of time. | 1 = completely disagree; 5 = completely agree |
Homework duration | How much time do you normally spend on your homework and learning for school? | 1 = less than half an hour per day; 2 = about half an hour to 1 hour per day; 3 = about 1 to 2 hours per day; 4 = about 2 to 3 hours per day; 5 = about 3 to 4 hours per day; 6 = more than 4 hours per day |

To identify teachers’ grading bias against overweight and obese students with observational data we apply a so-called “grade-equation” approach [

Model 1 is intended to measure the total teacher grading bias, which we can interpret as the average deviation of teachers’ evaluations from a more objective and standardized form of competence assessment: According to the models, if students with comparable competence scores receive different grades, the grade difference identifies teachers’ bias. Model 2 includes possible mediators that can explain the total teacher grading bias. If the included mediators are sufficient to capture all the possible relevant mechanisms by which teachers’ judgements are distorted, then we can interpret the coefficients corresponding to the BMI variable as the result of (taste or statistical) discrimination processes [

In the last part of the analysis, we investigate whether the grading bias due to overweight and obesity is heterogeneous across boys and girls, by introducing an interaction term between gender (SEX) and BMI. For the sake of simplicity, we omitted the main effects of the interacted variables in the notation; they are included in all models to avoid biased estimates, as suggested by Brambor et al. [

Given the nature of our outcome variables and the data/sampling structure, we use three-level ordinal hierarchical (random-intercept) regression models in which individuals are nested in courses that are nested in schools. In this way we allow observations to be correlated within classes and schools, taking into account that students within the same cluster are more similar to each other than students in different clusters, as a result of exposure to similar contexts. The statistical models are estimated separately for German and mathematics, and in each model, the subject-specific test score is included as a control variable. Accounting for potential non-linearities in the relationship between the competence variables and the outcomes does not substantially alter the results. As described above, each model was estimated on each of the 50 datasets generated from our imputation model, and the estimates were then combined in order to take into account the variability across the imputed data.
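In generic notation, a three-level random-intercept ordered logit of this kind can be sketched as follows. The symbols are illustrative assumptions and not the original notation: $G_{ics}$ is the grade category of student $i$ in course $c$ and school $s$, $\tau_k$ are the thresholds between the four grade categories, $\mathrm{TEST}_{ics}$ is the subject-specific competence score, and $\mathbf{X}_{ics}$ collects the remaining covariates.

```latex
\begin{aligned}
\operatorname{logit}\,\Pr(G_{ics} \le k)
  &= \tau_k - \left( \boldsymbol{\beta}^{\top}\mathrm{BMI}_{ics}
     + \gamma\,\mathrm{TEST}_{ics}
     + \boldsymbol{\delta}^{\top}\mathbf{X}_{ics}
     + u_{cs} + v_{s} \right), \qquad k = 1, 2, 3, \\
u_{cs} &\sim \mathcal{N}(0, \sigma_u^2), \qquad
v_{s} \sim \mathcal{N}(0, \sigma_v^2).
\end{aligned}
```

Under this parameterization, and with grades ordered from low to high, a negative element of $\boldsymbol{\beta}$ shifts probability toward the lower grade categories for students with the same competence score and covariates. The random intercepts $u_{cs}$ and $v_{s}$ capture what is shared by students in the same course and the same school.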

The analytical sample consists of 51% males, and 20% of the students speak a native language other than German. Among the types of secondary schools, 61% of the students were enrolled in the Gymnasium (high-level track), 20% in the Realschule (medium-level track), 5% in the Hauptschule (lowest-level track), and 14% in schools with different tracks (comprehensive). Regarding BMI, 80% (n = 1719) of the students have a normal weight, 11% (n = 230) are underweight, 7% (n = 149) are overweight, and 2% (n = 44) are obese.

Looking at the socio-demographic characteristics, we see that overweight and obese students are more often males, come more often from socio-economically disadvantaged families and are more likely to have a non-native background. Moreover, the share of students attending the Gymnasium is much lower among overweight (44%) and obese (36%) students compared to normal weight students (63%) (see

We now proceed by describing the bivariate relationship between students’ BMI and academic performance in lower secondary school in more detail. In the upper panel of

The upper-left graph shows Kernel density estimates of the distribution of test scores in German, the upper-right graph in mathematics. The lower-left graph shows the distribution of teachers’ grades by students’ weight category in German, and the lower-right graph in mathematics.

The patterns are similar across the two subjects: The incidence of children who received a low grade is largest among obese students, decreases but stays rather high among overweight students and is lower among normal weight students. The opposite pattern is found when looking at high grades, but in this case the difference between normal weight and obese students is more pronounced than when comparing normal weight to overweight students. Underweight students usually perform very similarly to normal weight students, whether academic performance is measured by teachers’ grades or standardized tests.

To assess whether overweight and obese students are less generously graded by their teachers compared to equally performing normal-weight students, we estimated a series of hierarchical ordered logistic regression models in which teachers’ grades are modelled as a function of students’ BMI categories, subject-specific performance in standardized tests, and a set of relevant individual characteristics. To correct for the missing values on relevant covariates, we conducted the analysis on 50 multiply imputed datasets. In

The graph shows log-odds ratios from the 3-level hierarchical ordered logit model on mathematics grades (left) and German (right). Bars represent 95% confidence intervals. Model 1 adjusts for students’ competence level and socio-demographic variables, while model 2 also includes students’ psychological traits, and school-related attitudes and behavior. N = 3,754. Analysis conducted on 50 multiply imputed datasets.

The results indicate that underweight students are graded by their teachers very similarly to normal weight students: The estimates are close to zero and not statistically significant at the 95% confidence level. Conversely, overweight and obese students receive on average lower grades by their teachers than normal weight students with the same level of subject-specific competence. The disadvantages of obese students are larger than those of overweight students and are slightly more pronounced in German than in mathematics.

While relying on the log-odds gives us a synthetic picture of the overall relationship between BMI and teachers’ grading, it prevents us from quantifying the strength of this relationship in a meaningful way. For this purpose, we present average partial effects of being underweight, overweight, or obese on the four levels of teachers’ grades in German (upper panel) and mathematics (lower panel) in

The graph shows average partial effects from the 3-level hierarchical ordered logit model on mathematics grades (upper part) and German (lower part). Bars represent 95% confidence intervals. The models are adjusted for students’ competence level, socio-demographic characteristics, psychological traits, and school-related attitudes and behavior. N = 3,754. Analysis conducted on 50 multiply imputed datasets.

The largest differences among social groups are found at the lower end of the grade distribution and for medium-high grades. For instance, obese students have on average an 8 (mathematics) to 9 (German) percentage points larger probability of receiving a low grade from their teachers than equally competent normal weight students. They also have a lower probability of receiving medium-high (7–12 percentage points) and high (3–4 percentage points) grades than comparable normal weight students. Overweight is also associated with a grading penalty, but its magnitude is smaller, ranging between 2 and 5 percentage points. From a qualitative point of view, the estimated differences appear to be larger in German than in mathematics, although the confidence intervals around the point estimates overlap to a large extent, making it difficult to state that the effect size actually differs across subjects in the reference population.
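To make the notion of an average partial effect concrete, the following minimal sketch computes it for a dummy variable in an ordered logit. It uses only the fixed part of the model and ignores the random intercepts, which is a simplification. The function names, thresholds, linear predictors and coefficient are hypothetical and are not estimates from the study.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probs(eta: float, thresholds: list) -> list:
    """Probabilities of each ordered category for linear predictor eta.

    Uses P(Y <= k) = logistic(threshold_k - eta); the last category closes at 1.
    """
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


def average_partial_effect(etas_without: list, beta: float, thresholds: list) -> list:
    """Average change in each category probability when a dummy goes from 0 to 1.

    etas_without holds each student's linear predictor with the dummy set to 0.
    """
    n_categories = len(thresholds) + 1
    totals = [0.0] * n_categories
    for eta in etas_without:
        p0 = category_probs(eta, thresholds)
        p1 = category_probs(eta + beta, thresholds)
        for j in range(n_categories):
            totals[j] += p1[j] - p0[j]
    return [t / len(etas_without) for t in totals]


# Hypothetical inputs: three thresholds (four grade levels), three students,
# and a negative coefficient for the weight dummy.
example_ape = average_partial_effect([-0.5, 0.0, 0.8], beta=-0.6, thresholds=[-1.0, 0.0, 1.5])
```

Because the category probabilities always sum to one, the partial effects across the four grade levels sum to zero: a higher probability of a low grade is offset by lower probabilities elsewhere in the distribution.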

We estimated additional models in which we included an interaction term between the BMI categories and test scores, with the aim of assessing whether the grading bias differs among students with higher or lower levels of academic proficiency (see

As a last step, we investigate whether the penalty in grading associated with being overweight differs by gender, as found by a previous study in the US [

Effect of being overweight/obese vs. normal weight | German: Low | German: Medium-Low | German: Medium-High | German: High | Mathematics: Low | Mathematics: Medium-Low | Mathematics: Medium-High | Mathematics: High |
---|---|---|---|---|---|---|---|---|
Females | -0.006 (0.015) | -0.012 (0.029) | 0.010 (0.025) | 0.008 (0.019) | 0.054 (0.032) | 0.026 (0.012) | -0.051 (0.029) | -0.030 (0.015) |
Males | 0.105*** (0.025) | 0.064*** (0.009) | -0.129*** (0.026) | -0.040*** (0.007) | 0.046 (0.023) | 0.029* (0.012) | -0.044* (0.022) | -0.030* (0.013) |
Δ (Male − Female) | 0.110*** (0.029) | 0.075* (0.030) | -0.139*** (0.035) | -0.048* (0.020) | -0.009 (0.039) | 0.003 (0.016) | 0.007 (0.036) | -0.000 (0.019) |

Note: The first two rows present the average partial effect of being overweight/obese by gender, and the last row reports a formal statistical test of the interaction between BMI and gender. Standard errors in parentheses, statistical significance levels (* p<0.05; ** p<0.01; *** p<0.001). N = 2,194; underweight students are excluded from this analysis. 50 multiply imputed datasets.

Predicted probabilities (and 95% confidence intervals) derived from hierarchical ordinal logistic regression models of teachers’ grading in German (left graph) and mathematics (right graph), adjusted for students’ competence level, socio-demographic characteristics, psychological traits, and school-related attitudes and behavior. N = 3,754. Analysis conducted on 50 multiply imputed datasets.

The penalization in German due to being overweight/obese among males is substantially larger than among females, ranging from 5 to 14 percentage points, and statistically significant across all four grade levels. In mathematics, the qualitative pattern differs to some extent: Although we detected statistically significant effects of being overweight only among males (ranging between 2 and 4 percentage points), the effect size among females is very similar. The differences in weight penalty by gender are not statistically significant in mathematics. Thus, given the empirical evidence, we cannot reject the null hypothesis of no difference in the mathematics grading penalization against overweight/obese students between boys and girls in the reference population.

From previous studies we knew that overweight students get, on average, lower grades than their peers in school [

By means of hierarchical ordinal logistic regression models we have shown that both overweight and obese students receive on average lower grades than comparable normal weight students, thus corroborating our first hypothesis. This pattern is found irrespective of whether we adjust for basic socio-demographic characteristics alone or for a more extensive set of individual features including students’ personality traits, school-related attitudes and behavior.

While it is difficult to strictly prove “discrimination” within an observational setting, the rich set of control variables and the rather large size of the estimated differences seem to suggest that overweight, and especially obese students, are subject to some forms of penalty in the evaluations they receive by their teachers. The penalization is not only statistically significant, but also substantial. Indeed, the effect size of being obese rather than normal weight is larger than that of being male against being female, a comparison that has attracted much more attention among scholars [e.g.,

Empirical evidence is instead less clear in supporting our second hypothesis, which states that the grading penalty should be larger in German than in mathematics. According to common wisdom, math teachers are expected to base grades on a score that is linked to pre-determined correct solutions. Consequently, one might argue that math exams leave less room for interpretation regarding which grades the students should receive. By contrast, teachers of German may set a variety of exams that differ in how much room they leave for interpreting whether an answer is right or wrong. While dictation exercises are solved either correctly or incorrectly, there is no unique correct answer in essays and text comprehension exercises, and therefore the grade could incorporate subjective sources of bias. We found that, from a qualitative point of view, the effect sizes associated with being obese or overweight across subjects follow the expected pattern. From a statistical inference point of view, however, the confidence intervals around the point estimates are rather large and overlap to a large extent, thus not supporting the hypothesis of differential effects across subjects. Nevertheless, our findings suggest that, although mathematics is usually considered a more objective subject, this discipline is also not exempt from teachers’ grading bias linked to body appearance.

Additional insights on the heterogeneity in the penalization of overweight and obese students come from the second analysis, in which we investigate whether gender moderates the effect of BMI. Following theoretical arguments on the role of femininity in contemporary Western societies and previous findings from the US, we expected a larger detrimental effect of obesity on grades among females. In contrast, we found that the more severe grading to which overweight/obese students are exposed in lower secondary school is mainly concentrated among male students, a pattern that is particularly pronounced in German and less so in mathematics. Unlike what was observed in the US, in Germany it seems that body appearance linked to negative stereotypes amplifies the stricter grading standards to which males are already exposed in comparison to females. While this might be due to processes of statistical discrimination by teachers, part of this result might be linked to individuals’ behavior in classrooms that we are not able to capture with our variables. If the two stereotypes of boys being less diligent than girls [

Before concluding with policy recommendations, it is important to discuss some limitations. First, the grades are self-reported which means that the outcome variable can suffer from measurement error if the students intentionally or unintentionally reported wrong grades. Yet, we do not believe this issue should be a major concern, for two reasons. First, although especially students with unfavorable grades might have social desirability biases and report a better grade to increase their self-esteem, self-reported grades were found to be very similar to grades in the report cards in previous studies [

The second limitation lies within the cross-sectional nature of the study, which does not allow us to disentangle whether overweight causes lower grades or if lower grades cause overweight by leading to stress eating [

The last limitation refers to the main assumption behind the grade-equation models, which posits that standardized test scores act as a good proxy for latent subject-specific competences. This assumption could be questioned if some unobserved factors associated with student BMI also affect the idiosyncratic performance in the standardized test. We are not aware of specific studies that try to tackle this aspect in-depth. However, results from previous studies on teachers’ grading bias against immigrants or boys using a grade-equation model [

Despite these limitations, we believe our study contributed to the literature by showing that overweight students have lower grades in German which can neither be attributed to lower subject-specific competence nor to psychological characteristics or key school-related attitudes and behavior. If institutions use grades as information on the competence of an applicant [
