The authors have declared that no competing interests exist.
Conceived and designed the experiments: NA MI. Performed the experiments: NA. Analyzed the data: NA MI. Wrote the paper: NA MI.
Since the time of Plato, philosophers and educational policymakers have assumed that the study of mathematics improves one's general ‘thinking skills’. Today, this argument, known as the ‘Theory of Formal Discipline’, is used in policy debates to prioritize mathematics in school curricula. But there is no strong research evidence which justifies it. We tested the Theory of Formal Discipline by tracking the development of conditional reasoning behavior in students studying post-compulsory mathematics compared to post-compulsory English literature. In line with the Theory of Formal Discipline, the mathematics students did develop their conditional reasoning to a greater extent than the literature students, despite having received no explicit tuition in conditional logic. However, this development appeared to be towards the so-called defective conditional understanding, rather than the logically normative material conditional understanding. We conclude by arguing that Plato may have been correct to claim that studying advanced mathematics is associated with the development of logical reasoning skills, but that the nature of this development may be more complex than previously thought.
“Those who have a natural talent for calculation are generally quick at every other kind of knowledge; and even the dull, if they have had an arithmetical training […] become much quicker than they would otherwise have been.” (Plato
For millennia it has been assumed that people can be taught to think more logically, and in particular, that mathematics is a useful tool for doing so. This idea is known as the Theory of Formal Discipline (TFD) and dates from the time of Plato. It is exemplified by the philosopher John Locke's suggestion that mathematics ought to be taught to “all those who have time and opportunity, not so much to make them mathematicians as to make them reasonable creatures”
In view of its intellectual pedigree and clear policy implications, variants of the TFD are regularly cited in educational policy debates and curricula reform documents
Society's views on the TFD have important practical implications. Stanic
Psychological evidence relating to the TFD is inconclusive. Thorndike
Cheng, Holyoak, Nisbett and Oliver
Despite these negative findings, there has been some support for the idea that studying mathematics might develop conditional reasoning ability. Lehman and Nisbett
Inglis and Simpson
Abstract conditional reasoning consists of drawing conclusions from a conditional statement ‘if
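Conditional  MP  DA  AC  MT
Conditional  Prem  Con  Prem  Con  Prem  Con  Prem  Con
if p then q  p  q  not p  not q  q  p  not q  not p
if p then not q  p  not q  not p  q  not q  p  q  not p
if not p then q  not p  q  p  not q  q  not p  not q  p
if not p then not q  not p  not q  p  q  not q  not p  q  p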
Validity  
Material Conditional  Valid  Invalid  Invalid  Valid 
Defective Conditional  Valid  Invalid  Invalid  Invalid 
Biconditional  Valid  Valid  Valid  Valid 
Conjunction  Valid  Invalid  Valid  Invalid 
The validity of each inference is shown for the material, defective, biconditional and conjunction interpretations of the conditional.
The validity of these four inferences depends upon how the reasoner interprets ‘if
p  q  if p then q  
Material  Defective  Biconditional  Conjunction  
T  T  T  T  T  T 
T  F  F  F  F  F 
F  T  T  I  F  F 
F  F  T  I  T  F 
The material conditional ‘if
Under the biconditional interpretation all four inferences are valid: ‘if
Some reasoners believe that ‘if
Theories of reasoning differ on the causes of the different interpretations. For example, the mental models theory
Some high-ability reasoners may flesh out the implicit model (a cognitively demanding task), giving them access to the material conditional and the MT inference. But reasoners who forget about the implicit model, or who lack the working memory capacity to flesh it out, are left with their initial explicit model, leading to either the defective or conjunctive interpretation.
In contrast, Evans et al.'s
Our goal here was to determine whether, as predicted by TFD, studying mathematics impacts upon students' conditional reasoning. In particular, we investigated whether the extent to which students adopted the material, biconditional, defective and conjunctive interpretations of the conditional changed following a year of mathematical study.
While the TFD claims that studying mathematics develops one's reasoning skills, it does not suggest any cognitive mechanisms for the change. Reasoning performance is related to measures of cognitive capacity (i.e., general intelligence;
Of course, it is neither practical nor ethical to randomly assign participants to courses when high-stakes qualifications are at stake. However, our inclusion of a comparison group who were studying English literature allowed us to attenuate the non-random assignment to conditions to some extent. The comparison group allowed us to distinguish changes that occur simply due to age or education from those specifically related to some aspect of (or related to a factor correlated with) studying mathematics.
In sum, we asked two main questions. First, does studying post-compulsory mathematics influence how one reasons with conditionals? Second, if there is development of conditional reasoning skills, is this the result of a domain-general change in cognitive capacity or thinking disposition?
One hundred and twenty-four participants (aged 15 years 4 months–17 years 8 months, M = 16 years 6 months, at Time 1) were recruited from five schools in Leicestershire, Hampshire and Derbyshire, UK. Seventy-seven (41 male) were studying mathematics alongside any other subjects and 47 (17 male) were studying English literature and not mathematics. The literature students served as a comparison group. To avoid factors such as stereotype threat
The study followed a longitudinal quasi-experimental design. Participants were recruited after they had chosen their post-compulsory subjects and were tested at the beginning (during the first term and as close to the start of term as possible) and end (after teaching had finished) of their first year of post-compulsory study. They completed the same set of tasks at both time points.
Participants in the mathematics group were all studying the first year of Advanced-Level mathematics. Although there are three different versions of this course available to students in England, all have similar content. Among other topics, the syllabus contained sections on algebra, geometry, calculus, trigonometry, probability, mathematical modeling, kinematics and forces (e.g.,
Participants completed Evans et al.'s
The problems ask about a) the
An 18-item subset of RAPM with a 15-minute time limit was used as a measure of cognitive capacity
As suggested by Toplak et al.
We asked participants to report their General Certificate of Secondary Education (GCSE, the examinations taken by 16-year-old school leavers in England) grades. Each grade was converted to an 8-point scale (A* = 8, A = 7, etc.) and the converted grades were summed to produce a total score.
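The grade conversion described above can be sketched as follows. This is only an illustration: the text states the top of the scale (A* = 8, A = 7), so the point values for the lower grades are an assumed continuation of that pattern, and the function name and the example grades are hypothetical.

```python
# Grade-to-points mapping on the 8-point scale.
# The top of the scale (A* = 8, A = 7) comes from the text; the values for
# the lower grades are an ASSUMED continuation of that pattern.
GCSE_POINTS = {"A*": 8, "A": 7, "B": 6, "C": 5, "D": 4, "E": 3, "F": 2, "G": 1}


def prior_attainment(grades):
    """Sum the point values of a participant's self-reported GCSE grades."""
    return sum(GCSE_POINTS[grade] for grade in grades)


# Hypothetical participant with nine GCSEs (not data from the study).
example_grades = ["A*", "A", "A", "B", "B", "B", "C", "C", "C"]
example_total = prior_attainment(example_grades)
```

Because the score is a sum rather than a mean, it has no fixed theoretical maximum: it depends on how many GCSEs a participant took.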
A 15-item mathematics test was included as a manipulation check. Twelve items were taken from the Woodcock-Johnson III Calculation subtest. Nine had shown an average accuracy of less than 55% and correlated with performance on the whole test at .86 in a previous dataset with mixed-discipline undergraduate students
Participants took part in groups (5–34) during the school day under examination conditions. All tasks were given in a single paper booklet. The RAPM task was always completed first with a 15-minute time limit, and the order of the subsequent tasks was counterbalanced between participants following a Latin Square design. Participants were instructed to work at their own pace until they had completed all tasks, and the sessions lasted approximately 45 minutes.
Forty-four mathematics students and thirty-eight literature students took part at both time points and were included in the analysis. Those who dropped out of the study had typically moved schools or changed courses; there were no significant differences in Time 1 scores on any of the measures between those who took part at Time 2 and those who dropped out (
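One common way to build such an ordering is a cyclic Latin square, in which each task occupies each serial position exactly once across the set of orders. The sketch below assumes that construction; the text does not say which Latin square was used, and the task labels and function names are hypothetical placeholders.

```python
def latin_square_orders(tasks):
    """Return a cyclic Latin square: row i is the task list rotated left by i."""
    n = len(tasks)
    return [[tasks[(i + j) % n] for j in range(n)] for i in range(n)]


def booklet_order(participant_index, first_task, tasks):
    """Fixed first task, then one row of the square chosen by participant index."""
    orders = latin_square_orders(tasks)
    return [first_task] + orders[participant_index % len(orders)]


# Hypothetical labels for the tasks that follow RAPM (an assumption).
TASKS = ["conditional inference", "CRT", "mathematics test"]
orders = latin_square_orders(TASKS)
```

Cycling through the rows by participant index spreads the orders evenly across a testing session, so no task is systematically completed last.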
Descriptive statistics for the various covariates are shown in
Measure  Theoretical maximum  Mathematics Mean  Mathematics Std Dev  Literature Mean  Literature Std Dev
Time 1 RAPM  18  9.64  3.32  6.94  3.54 
Time 1 CRT intuitive (reverse scored)  3  1.79  1.14  .89  .85 
Time 1 Mathematics  15  4.86  1.59  3.50  .97 
Prior academic attainment  –  66.45  9.78  61.53  14.03 
Time 2 RAPM  18  10.64  2.93  7.32  3.15 
Time 2 CRT intuitive (reverse scored)  3  1.98  1.02  1.11  1.02 
Time 2 Mathematics  15  6.95  1.94  3.19  .57 
Units are number of correct responses except for prior academic attainment, which is sum of grades for all GCSEs where A* = 8, A = 7, B = 6 etc.
The mathematics group showed significantly greater improvement on the mathematics test than the literature students,
Endorsement rates of each inference type were analysed with a 2×4×2 ANOVA with two within-subjects factors: Time (start and end of the year) and Inference Type (MP, DA, AC, MT), and one between-subjects factor: Group (mathematics and literature). This revealed a significant three-way interaction,
Error bars show ±1 SE of the mean.
These responses appear most consistent with an increased tendency for the mathematics students to adopt a defective conditional interpretation (more MP inferences and fewer DA, AC and MT inferences were made at Time 2 compared to Time 1). To test for this we calculated four indices for each participant (at each time point) giving the proportion of responses consistent with each of the four interpretations of the conditional. For example, a person responding entirely in line with the material conditional would respond ‘yes’ to all MP and MT inferences and ‘no’ to all DA and AC inferences. A Material Conditional Index was therefore computed as: number of MP inferences endorsed + number of MT inferences endorsed + (8 – number of DA inferences endorsed) + (8 – number of AC inferences endorsed), giving a material conditional index score out of 32 for each participant at each time point. The consistency scores for each interpretation are shown in
Error bars show ±1 SE of the mean.
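The index computation described above can be sketched in code. The Material Conditional Index follows the formula given in the text. The other three indices are an assumed analogue built from the validity table (endorsements count for inferences an interpretation treats as valid, rejections count otherwise). All names are illustrative, and the example participant is hypothetical.

```python
INFERENCES = ("MP", "DA", "AC", "MT")
ITEMS_PER_INFERENCE = 8  # eight items per inference type, 32 items in total

# Inferences treated as valid under each interpretation (from the validity table).
VALID = {
    "material": {"MP", "MT"},
    "defective": {"MP"},
    "biconditional": {"MP", "DA", "AC", "MT"},
    "conjunction": {"MP", "AC"},
}


def interpretation_index(endorsed, interpretation):
    """Count responses (out of 32) consistent with one interpretation.

    `endorsed` maps each inference type to the number of items endorsed (0-8).
    An endorsement is consistent when the inference is valid under the
    interpretation; a rejection is consistent when it is invalid.
    """
    valid = VALID[interpretation]
    score = 0
    for inference in INFERENCES:
        n = endorsed[inference]
        score += n if inference in valid else ITEMS_PER_INFERENCE - n
    return score


# Hypothetical participant (not data from the study).
example = {"MP": 8, "DA": 3, "AC": 4, "MT": 5}
example_indices = {name: interpretation_index(example, name) for name in VALID}
```

The four indices are not independent: they are computed from the same 32 responses, so a rise in one necessarily constrains the others.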
On the material conditional analysis Group and Time interacted,
On the defective analysis, Group and Time interacted,
Comparing the effect sizes of these analyses confirms that the change in the mathematics group is best understood as an increased tendency to adopt the defective interpretation of the conditional. In other words, over time the mathematics group became more likely to endorse the MP inference, but less likely to endorse the DA, AC and MT inferences. Next we considered whether changes in either cognitive capacity or thinking disposition could represent domain-general mechanisms for this change in conditional reasoning behavior.
To investigate whether changes in the domain-general reasoning measures could account for the changes in the mathematics group's conditional reasoning behavior, we regressed participants' defective conditional change scores (Time 2 defective conditional index minus Time 1 defective conditional index) against their Time 1 RAPM and CRT scores, their prior academic attainment, their RAPM and CRT change scores (the difference between their Time 2 and Time 1 scores), the group they were in, and the two group-by-change-score interaction terms. If the increased defective conditional indices of the mathematics students could be accounted for by changes in domain-general factors, we would expect that some of the change scores or the group-by-change-score interactions would be significant predictors. However, if the primary factor was the experience of studying mathematics, we would expect the group factor to be the only significant predictor.
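As a rough illustration of how the outcome and predictors for this regression could be assembled for one participant, consider the sketch below. The dictionary keys and function names are hypothetical, the example values are invented for illustration only, and the group coding (0 = literature, 1 = mathematics) follows the regression table. Fitting the model itself is left to a statistics package.

```python
def outcome(p):
    """Defective conditional change score: Time 2 index minus Time 1 index."""
    return p["defective_t2"] - p["defective_t1"]


def predictor_row(p):
    """Build the eight predictors described in the text for one participant."""
    rapm_change = p["rapm_t2"] - p["rapm_t1"]
    crt_change = p["crt_t2"] - p["crt_t1"]
    group = 1 if p["group"] == "mathematics" else 0  # 0 = literature
    return {
        "initial_rapm": p["rapm_t1"],
        "initial_crt": p["crt_t1"],
        "prior_attainment": p["prior_attainment"],
        "rapm_change": rapm_change,
        "crt_change": crt_change,
        "group": group,
        # Interaction terms: change score multiplied by the group dummy.
        "rapm_change_x_group": rapm_change * group,
        "crt_change_x_group": crt_change * group,
    }


# Hypothetical participant (values invented for illustration only).
participant = {
    "group": "mathematics",
    "rapm_t1": 9, "rapm_t2": 11,
    "crt_t1": 2, "crt_t2": 3,
    "prior_attainment": 66,
    "defective_t1": 20, "defective_t2": 24,
}
row = predictor_row(participant)
```

With this dummy coding the interaction terms are zero for every literature student, so their coefficients capture how the effect of a change score differs in the mathematics group.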
The regression model is presented in
Predictors  B  Std. Error  β
Initial RAPM  .003  .005  .093
Initial CRT  .012  .017  .107
Prior attainment  .000  .001  −.019
RAPM change  .010  .008  .217
CRT change  .012  .021  .084
Group (0 = literature, 1 = mathematics)  .082  .033  .337
RAPM change × Group  .004  .010  .067
CRT change × Group  −.023  −.032  −.106
R^{2} = .253
Since Plato asserted that studying mathematics improves one's ‘quickness’ of thought, philosophers, educational policymakers and the employment market have placed a high value upon having an advanced education in mathematics. Here we asked whether Plato's position is reasonable; in particular, we asked whether studying post-compulsory mathematics is associated with a development in conditional reasoning behavior, even if that study contained no explicit reference to conditional logic. We found that students studying post-compulsory mathematics did change their reasoning behavior to a greater extent than a comparison group over the course of a year of post-compulsory mathematical study. Further, we found that this change appeared to be best described as development away from a biconditional understanding of the conditional, and towards a defective understanding: at the end of their studies, the mathematics group endorsed more MP inferences and fewer DA, AC and MT inferences. Finally, we demonstrated that this effect was not the result of a domain-general change in cognitive capacity or thinking disposition, but rather seems most likely to be associated with the domain-specific study of mathematics.
Inglis and Simpson
Given that Cheng et al.
An alternative possibility is that the study of mathematics influences conditional reasoning behavior in a different way to the study of formal logic and that this, in some cases at least, is more educationally effective. This possibility is plausible for two reasons. First, we found that development in conditional inference was not related to changes in intelligence or thinking disposition, suggesting that studying mathematics could provide some specific experiences of manipulating concepts logically, which may not be provided by studying logic (we speculate below on what these experiences could be). Second, we found that the mathematics students in our sample did not become consistently more material across inference types, as we would expect if they had simply developed a more normative understanding of conditional statements (which presumably would be the aim of an education in formal logic). In fact, a defective interpretation is unlikely to lead to normative responses to the Wason Selection Tasks used by Cheng et al. (one might expect that reasoners adopting such an interpretation would choose the true antecedent card and no others, rather than selecting the normatively correct true antecedent and false consequent cards).
What then could be the nature of the experiences provided by mathematical study that could develop a defective interpretation of the conditional? Mathematics as a discipline is concerned with deducing the consequences of assumptions. Even before a student begins to study advanced-level mathematical proofs and axiomatic systems, their day-to-day activity consists of making modus ponens deductions from assumptions. Consider, for example, the activity of solving an equation. One starts with an assumption,
Our findings also have implications for the debate between those who favor the suppositional account of conditional reasoning (e.g.,
Finally, it is important to consider the limitation that results from the quasi-experimental design of our study: we cannot infer that if all students were compelled to study advanced mathematics there would be a society-wide change in conditional reasoning behavior. It remains a possibility that the TFD only applies to those who have
To summarize, our study has provided evidence that the claims made by Plato