The authors have declared that no competing interests exist.

Figures in scientific publications are critically important because they often show the data supporting key findings. Our systematic review of research articles published in top physiology journals (

A systematic review of research articles reveals widespread poor practice in the presentation of continuous data. The authors recommend training for investigators and supply templates for easy use.

Data presentation is the foundation of our collective scientific knowledge, as readers’ understanding of a dataset is generally limited to what the authors present in their publications. Figures are critically important because they often show the data that support key findings. However, studies of the Journal of the American Medical Association [

Bar graphs are designed for categorical variables; yet they are commonly used to present continuous data in laboratory research, animal studies, and human studies with small sample sizes. Bar and line graphs of continuous data are “visual tables” that typically show the mean and standard error (SE) or standard deviation (SD). This is problematic for three reasons. First, many different data distributions can lead to the same bar or line graph (

The full data may suggest different conclusions from the summary statistics. The means and SEs for the four example datasets shown in Panels B–E are all within 0.5 units of the means and SEs shown in the bar graph (Panel A).
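The point that very different datasets can share the same summary statistics can be checked numerically. The sketch below is only an illustration, not the authors' data: the three raw shapes, the target mean of 10, and the target SD of 3 are hypothetical values, and `rescale` and `mean_and_se` are helper names introduced here.

```python
import math
import statistics


def mean_and_se(values):
    """Return (mean, standard error of the mean) for a sample."""
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return statistics.mean(values), sd / math.sqrt(len(values))


def rescale(values, target_mean, target_sd):
    """Shift and scale a sample so it has the requested mean and SD.

    The shape of the distribution (symmetry, outliers, clusters) is unchanged.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [target_mean + (v - mean) * target_sd / sd for v in values]


# Hypothetical raw shapes, eight observations each (illustrative values only).
shapes = {
    "symmetric": [1, 2, 3, 4, 5, 6, 7, 8],
    "outlier": [1, 1, 2, 2, 3, 3, 4, 20],
    "bimodal": [1, 1, 2, 2, 9, 9, 10, 10],
}

# Force every shape onto the same mean (10) and SD (3).
datasets = {name: rescale(vals, 10.0, 3.0) for name, vals in shapes.items()}

for name, vals in datasets.items():
    mean, se = mean_and_se(vals)
    print(f"{name:9s} mean={mean:.2f} SE={se:.2f} max={max(vals):.2f}")
```

All three datasets would produce the same bar and error bar, yet their largest values differ, which only a plot of the individual points would reveal.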

The bar graph (mean ± SE) suggests that the groups are independent and provides no information about whether changes are consistent across individuals (Panel A). The scatterplots shown in Panels B–D clearly demonstrate that the data are paired. Each scatterplot reveals very different patterns of change, even though the means and SEs differ by less than 0.3 units. The lower scatterplots showing the differences between measurements allow readers to quickly assess the direction, magnitude, and distribution of the changes. The solid lines show the median difference. In Panel B, values for every subject are higher in the second condition. In Panel C, there are no consistent differences between the two conditions. Panel D suggests that there may be distinct subgroups of “responders” and “nonresponders.”
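The per-subject differences and their median, as plotted in the lower scatterplots, are simple to compute. The following is a minimal sketch with hypothetical measurements for six subjects; `paired_differences` is a helper name introduced here, and the values are not taken from the figure.

```python
import statistics


def paired_differences(first, second):
    """Return second - first for each subject; inputs must be in subject order."""
    if len(first) != len(second):
        raise ValueError("paired data need one value per subject per condition")
    return [b - a for a, b in zip(first, second)]


# Hypothetical measurements for six subjects under two conditions.
condition_1 = [10, 12, 14, 16, 18, 20]
consistent = [13, 15, 17, 19, 21, 23]  # every subject rises by 3
mixed = [20, 6, 23, 13, 27, 19]        # same group mean, no consistent change

for label, condition_2 in (("consistent", consistent), ("mixed", mixed)):
    diffs = paired_differences(condition_1, condition_2)
    print(label, "differences:", diffs, "median:", statistics.median(diffs))
```

Both second-condition datasets have the same group mean, so a bar graph of the two conditions could not separate them, whereas the differences show a uniform increase in one case and changes in both directions in the other.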

In contrast, univariate scatterplots, box plots, and histograms allow readers to examine the data distribution. This approach enhances readers’ understanding of published data, while allowing readers to detect gross violations of any statistical assumptions. The increased flexibility of univariate scatterplots also allows authors to convey study design information. In small sample size studies, scatterplots can easily be modified to differentiate between datasets that include independent groups (

We conducted a systematic review of standard practices for data presentation in scientific papers, contrasting the use of bar graphs versus figures that provide detailed information about the distribution of the data (scatterplots, box plots, and histograms). We focused on physiology because physiologists perform a wide range of studies, including human studies, animal studies, and in vitro laboratory experiments. We systematically reviewed all full-length, original research articles published in the top 25% of physiology journals between January 1 and March 31, 2014 (

In addition to showing data for key findings, figures are important because they give authors the opportunity to display a large amount of data very quickly. However, most figures provided little more information than a table (Panel A in

Our data show that most bar and line graphs present mean ± SE.
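For readers comparing error bars, it may help to recall that the SE equals the SD divided by the square root of the sample size, so it shrinks as n grows even when the spread of the data does not. The sketch below uses a hypothetical sample and a helper name (`sd_and_se`) introduced here.

```python
import math
import statistics


def sd_and_se(values):
    """Return (sample SD, standard error of the mean)."""
    sd = statistics.stdev(values)
    return sd, sd / math.sqrt(len(values))


# Hypothetical sample; repeating it changes n but leaves the spread nearly the same.
small = [4, 6, 8, 10, 12]
large = small * 4  # 20 observations with the same values repeated

for label, sample in (("n=5", small), ("n=20", large)):
    sd, se = sd_and_se(sample)
    print(f"{label}: SD={sd:.2f} SE={se:.2f}")
```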

While scatterplots prompt the reader to critically evaluate the statistical tests and the authors’ interpretation of the data, bar graphs discourage the reader from thinking about these issues. Placental endothelin 1 (

The infrequent use of univariate scatterplots, boxplots, and histograms is a missed opportunity. The ability to independently evaluate the work of other scientists is a pillar of the scientific method. These figures facilitate this process by immediately conveying key information needed to understand the authors’ statistical analyses and interpretation of the data. This promotes critical thinking and discussion, enhances the readers’ understanding of the data, and makes the reader an active partner in the scientific process. In contrast, bar and line graphs are “visual tables” that transform the reader from an active participant into a passive consumer of statistical information. Without the opportunity for independent appraisal, the reader must rely on the authors’ statistical analyses and interpretation of the data.

Sample size is an important consideration when designing figures and selecting statistical analysis procedures (

The distribution of the data and the sample size are critical considerations when selecting statistical tests. Univariate scatterplots immediately convey this important information.
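A univariate scatterplot places every observation at its group's position on the x-axis, with small horizontal offsets so that tied values remain visible. The sketch below computes such coordinates, which could then be passed to any plotting tool. It is an assumption-laden illustration, not the method used in the authors' templates: the function name `scatter_positions`, the `step` spacing, and the data are all hypothetical.

```python
from collections import defaultdict


def scatter_positions(values, center, step=0.05):
    """Return (x, y) pairs for one group of a univariate scatterplot.

    Points with identical y values are spread symmetrically around `center`
    so that ties do not hide each other. `step` is a hypothetical spacing
    in x-axis units.
    """
    by_value = defaultdict(list)
    for index, value in enumerate(values):
        by_value[value].append(index)

    points = [None] * len(values)
    for value, indices in by_value.items():
        width = (len(indices) - 1) * step
        for k, index in enumerate(indices):
            points[index] = (center - width / 2 + k * step, value)
    return points


# Hypothetical data: two independent groups plotted at x = 1 and x = 2.
control = [3.1, 3.1, 3.4, 4.0, 4.0, 4.0]
treated = [4.2, 5.0, 5.0, 5.9]

for x_center, group in ((1, control), (2, treated)):
    for x, y in scatter_positions(group, x_center):
        print(f"{x:.3f}\t{y}")
```

Because every observation appears as its own point, the sample size and the shape of the distribution are visible at a glance.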

T-tests and analysis of variance (ANOVA) are examples of parametric tests. These tests compare means and assume that the data are normally distributed with no outliers. In small samples, these tests are prone to errors if the data contain outliers or are not normally distributed.

The Wilcoxon rank sum test is an example of a nonparametric test. Nonparametric tests don’t make assumptions about the distribution of the variables that are being assessed. These tests often compare the ranks of the observations or the medians across groups. Nonparametric statistics are often preferred to parametric tests when the sample size is small and the data are skewed or contain outliers.
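To illustrate why rank-based tests are robust to outliers, the sketch below computes an exact two-sided p-value for a rank sum statistic by enumerating every possible assignment of ranks to the first group. This is a teaching sketch for very small samples, written with the standard library only; it is not the analysis used in the review, and the function names and data are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """Return 1-based ranks, averaging ranks across tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_rank_sum_p(group_a, group_b):
    """Exact two-sided permutation p-value for the rank sum of group_a."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    expected = n_a * (len(ranks) + 1) / 2  # rank sum expected under the null
    observed_dev = abs(sum(ranks[:n_a]) - expected)
    total = extreme = 0
    for combo in combinations(ranks, n_a):
        total += 1
        if abs(sum(combo) - expected) >= observed_dev - 1e-9:
            extreme += 1
    return extreme / total


# Hypothetical groups; the second version replaces the value 10 with an outlier.
group_a = [1, 2, 3, 4, 5]
print(exact_rank_sum_p(group_a, [6, 7, 8, 9, 10]))
print(exact_rank_sum_p(group_a, [6, 7, 8, 9, 100]))
```

The two calls give the same p-value because only the ordering of the observations enters the test, whereas the mean of the second group changes substantially when the outlier is introduced.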

Some statisticians recommend nonparametric tests for small sample size studies. Others argue that these tests are underpowered, especially if the data distribution appears symmetric.

Our data suggest that most authors assume that their data are normally distributed, use parametric statistical analysis techniques, and select figures that show parametric summary statistics (Table B in

More than half of the authors who performed nonparametric analyses showed means when presenting their data. Investigators should show medians whenever they use nonparametric statistical tests. Medians are often used in situations where the mean is misleading due to outliers or a skewed distribution.
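A small numerical example shows how a single outlier can make the mean misleading while leaving the median unchanged. The values below are hypothetical.

```python
import statistics

# Hypothetical small sample with one extreme value.
values = [2, 3, 3, 4, 30]

print("mean:", statistics.mean(values))      # pulled toward the outlier
print("median:", statistics.median(values))  # stays with the bulk of the data
```

Here the mean is larger than four of the five observations, so it describes none of the subjects well, whereas the median sits in the middle of the cluster.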

Investigators who use nonparametric statistics for paired or matched data should report the median difference instead of the median values for each condition (

Scientists and statisticians continue to debate many statistical practices that are commonly used in basic science research. These include whether to test the assumptions underlying parametric analyses [

These results suggest that, as scientists, we urgently need to change our standard practices for presenting and analyzing continuous data in small sample size studies. We recommend three changes to resolve the problems identified in this systematic review.

While Microsoft Excel allows scientists to quickly and efficiently create bar graphs, univariate scatterplots are more challenging. We created free Excel templates that are available in the supplemental files for the manuscript (

Presenting data in scientific publications is a critical skill for scientists [

Our systematic review identified several critical problems with the presentation of continuous data in small sample size studies. A coordinated effort among investigators, medical journals, and statistics instructors is recommended to address these problems. We created free Excel templates (

This file contains the methods and results for the systematic review, including Table A in S1 Text, Table B in S1 Text, Table C in S1 Text and Table D in S1 Text.

Table A in S1 Text: The number of articles examined by journal. Values are n, or n (% of articles reviewed that were eligible and included in the analysis). Journals are organized by 2012 impact factor. Articles that were not full-length original research articles (e.g., reviews, editorials, perspectives, commentaries, letters to the editor, and short communications) were excluded after screening. Abbreviations: AJP, American Journal of Physiology; APS, American Physiological Society. *APS Journal.

Table B in S1 Text: Most studies performed parametric analyses. Values are n (%). *n (%) of 493 articles that performed parametric analyses. The remaining articles did not specifically state whether these assumptions were tested.

Table C in S1 Text: Relationship between journal affiliation and the use of bar graphs and univariate scatterplots. Abbreviations: APS, American Physiological Society. Seven of the top 20 physiology journals are published by the American Physiological Society (APS), which specifies that outcome data should be presented in figures rather than in tables whenever possible. Nonhuman studies did not include human participants, tissues, cells or cell lines. Human studies included human participants, tissues, cells or cell lines.

Table D in S1 Text: Relationship between journal affiliation and the use of histograms and line graphs/point and error bars plots. Abbreviations: APS, American Physiological Society. Seven of the top 20 physiology journals are published by the American Physiological Society (APS), which specifies that outcome data should be presented in figures rather than in tables whenever possible.

(DOCX)

Use this template to create scatterplots for independent data in two to five groups. Independent data means that the variable of interest is measured one time in each subject, and subjects are not related to each other. If your data do not meet these criteria, see the spreadsheet for paired or nonindependent data.

(XLSX)

Use this template to create scatterplots for paired or matched data. Paired data are when you measure the variable of interest more than one time in each participant. Matched data are when participants in groups one and two are matched for important characteristics. If your data are independent, please see the template for independent data. The template will allow you to create scatterplots for one group with two conditions, or two groups with two conditions.

(XLS)

Use these instructions to create univariate scatterplots for independent data in one or more groups of subjects using GraphPad PRISM 6.0. Independent data means that the variable of interest is measured one time in each participant or specimen and participants or specimens are not related to each other. If your data are paired or matched, please see the instructions for paired or matched data.

(PDF)

Use these instructions to create univariate scatterplots for paired or matched data (two or more conditions) in one group of participants or specimens using GraphPad PRISM 6.0. Paired data are when you measure the variable of interest more than one time in each participant. Matched data are when participants in group one and group two are matched for important characteristics. If your data are independent, please see the instructions for independent data.

(PDF)

Use these instructions to create scatterplots for paired data (two conditions) in two groups of participants or specimens using GraphPad PRISM 6.0. Paired data are when you measure the variable of interest more than one time in each participant. If your data are independent, please see the instructions for independent data.

(PDF)

(TIF)

Panel a: Bar graphs and other figures that typically show mean and SE or mean and SD were strongly preferred to figures that provide detailed information about the distribution of the data (scatterplots, box plots, and histograms). Panel b: Most bar graphs show mean ± SE. Panel c: Box plots show the minimum and maximum sample sizes for any group presented in a figure. The box shows the median and interquartile range. Whiskers show the furthest point that is within 1.5 times the interquartile range. Note that a few very high outliers are not shown (

(TIF)

ANOVA, analysis of variance

ARRIVE, Animal Research: Reporting of In Vivo Experiments

SD, standard deviation

SE, standard error