The authors have declared that no competing interests exist.
Estimating divergence times on phylogenies is critical in paleontological and neontological studies. Chronostratigraphically constrained fossils are the only direct evidence of absolute timing of species divergence. Strict temporal calibration of fossil-only phylogenies provides minimum divergence estimates, and various methods have been proposed to estimate divergences beyond these minimum values. We explore the utility of simultaneous estimation of tree topology and divergence times using BEAST tip-dating on datasets consisting only of fossils by using relaxed morphological clocks and birth-death tree priors that include serial sampling (BDSS) at a constant rate through time. We compare BEAST results to those from the traditional maximum parsimony (MP) and undated Bayesian inference (BI) methods. Three overlapping datasets were used that span 250 million years of archosauromorph evolution leading to crocodylians. The first dataset focuses on early Sauria (31 taxa, 240 chars.), the second on early Archosauria (76 taxa, 400 chars.) and the third on Crocodyliformes (101 taxa, 340 chars.). For each dataset three time-calibrated trees (timetrees) were calculated: a minimum-age timetree with node ages based on earliest occurrences in the fossil record; a ‘smoothed’ timetree using a range of time added to the root that is then averaged over zero-length internodes; and a tip-dated timetree. Comparisons within datasets show that the smoothed and tip-dated timetrees provide similar estimates. Only near the root node do BEAST estimates fall outside the smoothed timetree range. The BEAST model is not able to overcome limited sampling to correctly estimate divergences considerably older than sampled fossil occurrence dates. Conversely, the smoothed timetrees consistently provide node ages far older than the strict dates or BEAST estimates for morphologically conservative sister taxa when they sit on long ghost lineages.
In this latter case, the relaxed-clock model appears to be correctly moderating the node-age estimate based on the limited morphological divergence. Topologies are generally similar across analyses, but BEAST trees for crocodyliforms differ when clades are deeply nested but contain very old taxa. It appears that the constant-rate sampling assumption of the BDSS tree prior influences topology inference by disfavoring long, unsampled branches.
Biologists and paleontologists need dated phylogenies to test a host of evolutionary questions ranging from global phenomena like climatic-biotic interactions through time and intercontinental historical biogeography, to more local or taxon-specific processes, such as estimating rates of morphological change, origination, and extinction. Fossils and the chronostratigraphic data associated with them are the only direct source of absolute timing for the Tree of Life. For neontological studies focused primarily on estimating dated phylogenies for extant taxa, the most common method for including absolute timing from fossils has been via prior probability distributions applied to internal nodes (node date calibrations). The problem of how to most objectively and effectively translate fossil specimens into node calibrations is difficult and has received treatment in general [
These advances are useful contributions to the scientific project of dating a tree of life. However, most of the tree of life is now extinct. Most extinct lineages do not have extant members from which genomic data can be collected, and their relationships can only be estimated from fossil morphological data. Dating these phylogenies is as important as dating trees of extant taxa for reconstructing the timetree of life.
Advances in node-calibration methods do not translate into advances in time-scaling fossil-only phylogenies. Node calibration methods have no analog in fossil-only trees, and it is non-contemporaneous fossil tips that possess the chronostratigraphic data necessary to directly time-scale the tree. Thus the question with fossil-only trees is how best to use these tip ages to inform the node ages of the tree.
Any attempt to incorporate fossil data in timetrees should be cognizant of the various types of uncertainties inherent to the fossil record. Fossil tip ages have an associated uncertainty arising from the stratigraphic uncertainty of the fossil age estimates [
Pre-phylogenetic approaches to dating the origins and durations of extinct species and taxa relied on a literal reading of first and last occurrence data from the fossil record [
The occurrence of a fossil in the rock record represents a temporal minimum for species or clade age [
Methods for time-scaling zero-length branches are needed to utilize ghost lineage analysis (GLA) trees in comparative analyses. A common approach is to assign an arbitrary time length to zero-length branches [
Using fossils to inform node-date calibrations remains, by far, the most common use of paleontological data in model-based dating procedures. This approach does not consider the wealth of other data fossils provide. Of particular importance is the fact that node calibration does not allow fossils to inform the phylogenetic hypothesis, relying instead on ad hoc placement of fossils. Furthermore, as discussed above, it is not a viable time-scaling method for fossil-only datasets, or any dataset that contains extinct taxa lacking molecular data in the absence of a morphological model and unwarranted assumption of direct ancestry.
If the tips of a phylogeny are not all contemporaneous (as in fossil-only datasets), then an alternative to node-dating is available to calibrate the tree prior during divergence estimation. The variation in tip ages combined with estimates of evolutionary change (from either sequence or morphological characters) allows phylogenies to be calibrated to generate estimates of absolute rates and times. Tip-dating methods have been adapted to include fossil data and models of morphological evolution [
The choice of time scaling approach can have a strong effect on macroevolutionary and biogeographic analyses [
Three phenotypic datasets were selected that span the 270 million year history of Sauria, with a focus on basal saurians [
Clade  # of characters  # of taxa  % missing data  Author  MorphoBank project

basal Sauria  246  31  31%  Pritchard et al., 2015  P854
basal Archosauria  412  76  37%  Nesbitt, 2011  P198
Crocodyliformes  318  101  39%  Turner, 2015  P1200
The datasets vary in both the amount of ingroup and character sampling. All datasets include a mixture of ordered and unordered characters and have a similar amount of missing entries (
Our choice of tree estimation procedure was motivated by our interest in comparing traditional empirical paleontological estimates to model-based tip-dating methods on fossil-only datasets. Phylogenetic relationships were estimated for each dataset using three different criteria: maximum parsimony (MP), undated Bayesian inference (BI), and simultaneous estimation of topology and divergence times in BEAST.
MP remains the most frequently employed tree building method for phenotypic datasets and its behavior is well known. MP trees were reconstructed with equally weighted parsimony in TNT v1.5 [
Bayesian inference (BI) trees were estimated using MrBayes v3.2 [
Tree topologies were estimated simultaneously with divergence times using a relaxed clock model implemented in BEAST 2 [
The strict consensus MP tree and the maximum clade credibility BI tree obtained from each dataset were used to generate an empirically time-calibrated phylogeny using GLA. Two time-calibrated trees per dataset were generated corresponding to the strict GLA procedure and the “temporally smoothed” GLA procedure (sGLA). This was implemented using the R [
We estimated mean node age and the 95% Bayesian credible interval (CI) using BEAST 2 [
The NEXUS-formatted phylogenetic datasets and BEAST model run parameters were converted into BEAST’s unique XML format using
Parsimony analysis resulted in two most parsimonious trees 667 steps long (CI = 0.385; RI = 0.607) (
Strict consensus of two most parsimonious trees (TL = 667, CI = 0.385; RI = 0.607). Major clades discussed in the main text are highlighted for ease of comparison across figures.
The maximum clade credibility tree derived from the Bayesian analysis is well resolved except near the base (
Maximum clade credibility tree with posterior probabilities displayed at the nodes. Major clades discussed in the main text are highlighted for ease of comparison across figures.
MP analysis for the archosaur dataset resulted in 360 most parsimonious trees with a length of 1285 (CI = 0.375; RI = 0.782). The strict consensus tree (
Strict consensus of 360 most parsimonious trees (TL = 1285, CI = 0.375; RI = 0.782). Major clades discussed in the main text are highlighted for ease of comparison across figures.
The maximum clade credibility tree from the Bayesian analysis is almost completely resolved and most nodes have posterior probabilities >0.90 (
Maximum clade credibility tree with posterior probabilities displayed at the nodes. Major clades discussed in the main text are highlighted for ease of comparison across figures.
MP analysis of the crocodyliform dataset resulted in 108 optimal trees with a length of 1662 steps (CI = 0.239; RI = 0.700). A reduced strict consensus tree is presented in
Strict consensus of 108 most parsimonious trees (TL = 1662, CI = 0.239; RI = 0.700). Major clades discussed in the main text are highlighted for ease of comparison across figures.
Bayesian analysis resulted in a well-resolved maximum clade credibility tree depicting the monophyly of nearly all generally accepted higher-level clades (
Maximum clade credibility tree with posterior probabilities displayed at the nodes. Major clades discussed in the main text are highlighted for ease of comparison across figures.
Because of the overall topological similarity between trees estimated from BI and MP, we present ghost lineage analysis (GLA) results here using only the BI topologies. Selection of 1, 5, or 10 million years for the vartime value for sGLA had little to no effect on reconstructed node ages (Supplementary Material); we therefore present only vartime = 1 results. The MP trees are available in the supplementary material.
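The minimum-age logic behind strict GLA can be illustrated with a minimal Python sketch: each internal node is assigned the age of its oldest descendant occurrence. The toy tree, node names, and ages below are hypothetical, tip-age uncertainty is ignored, and this is not the R routine used for the analyses.

```python
def strict_gla_ages(children, tip_age, root):
    """Assign each internal node the age (Ma) of its oldest descendant tip."""
    ages = {}

    def visit(node):
        if node not in children:            # a tip: return its first occurrence
            return tip_age[node]
        age = max(visit(child) for child in children[node])
        ages[node] = age                    # minimum age implied by the fossils
        return age

    visit(root)
    return ages


# Hypothetical four-taxon tree: tip C is old but deeply nested.
children = {"root": ["A", "n1"], "n1": ["B", "n2"], "n2": ["C", "D"]}
tip_age = {"A": 240.0, "B": 235.0, "C": 250.0, "D": 200.0}

ages = strict_gla_ages(children, tip_age, "root")

# Internodes whose two ends share the same age have zero length.
zero_length = [
    (parent, child)
    for parent, kids in children.items()
    for child in kids
    if child in ages and ages[parent] == ages[child]
]
print(ages)
print(zero_length)
```

In this toy case the single old, nested occurrence (tip C) fixes the age of every node above it, which produces the zero-length internodes that smoothing procedures such as sGLA are meant to address.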
A. Chronogram and minimum node ages for the Sauria dataset (BI results) obtained from strict GLA (light purple dashed lines) superimposed on the sGLA chronogram (solid dark blue lines). B. Chronogram based on the maximum clade credibility tree for the saurian dataset with branch lengths drawn to reflect BEAST divergence time estimations. Error bars reflect the 95% highest probability density. All trees scaled to geologic time scale above.
A. Chronogram and minimum node ages for the Archosauria dataset (BI results) obtained from strict GLA (light purple dashed lines) superimposed on the sGLA chronogram (solid dark blue lines). B. Chronogram based on the maximum clade credibility tree for the archosaur dataset with branch lengths drawn to reflect BEAST divergence time estimations. Error bars reflect the 95% highest probability density. All trees scaled to geologic time scale above.
A. Chronogram and minimum node ages for the Crocodyliformes dataset (BI results) obtained from strict GLA (light purple dashed lines) superimposed on the sGLA chronogram (solid dark blue lines). B. Chronogram based on the maximum clade credibility tree for the crocodyliform dataset with branch lengths drawn to reflect BEAST divergence time estimations. Error bars reflect the 95% highest probability density. All trees scaled to geologic time scale above.
The archosaur node ages show a similar pattern to those of the saurian dataset. Many of the basal divergences in Archosauria are constrained by multiple taxa occurring in a small time window in the Early Triassic. Occurrences among the crocodylomorphs and avemetatarsalians are less dominated by a single occurrence of old, deeply nested taxa and therefore show larger changes in estimated node ages.
The maximum clade credibility tree derived from the BEAST analysis is very well resolved and similar to the BI and MP topologies in most respects (
The maximum clade credibility tree is well resolved and similar to the MP and BI topologies (
Among the three reptile datasets examined, the maximum clade credibility tree for Crocodyliformes deviates the most from the topology recovered by BI and MP. Major crocodyliform clades are recovered with high posterior probabilities, including Protosuchidae, Thalattosuchia, Mesoeucrocodylia, Notosuchia, Neosuchia, and Crocodylia (
Tip-dated node ages are presented in Figs
Nineteen key nodes were chosen for direct comparison across the three methods examined. Time is scaled to the geologic time scale above. Clade name abbreviations are on the right. For each clade name, the depicted node estimates from top to bottom are as follows: GLA estimate with solid bar showing the minimum to maximum stratigraphic error, sGLA estimate, and BEAST estimate with error bar representing the 95% highest probability density. The key for these node labels is in the upper left-hand corner. A summary cladogram is provided at the bottom.
Node  GLA  sGLA  BEAST2  Dataset 

CROC  77.1  101.7  95.6  T15 
EUSUCHIA  154.7  172.8  145.1  T15 
NOTOSUCHIA  119.5  155.2  171.9  T15 
GOBIOSUCHIDAE  77.9  134.4  83.7  T15 
Sich + Zosuchus  122.5  156.8  144.3  T15 
CROCODYLIFORMES  218.5  228.9  237.2  T15 
CROCODYLIFORMES  199.2  199.9  201.7  N11 
CROCODYLOMORPHA  218.5  239.5  250.4  T15 
CROCODYLOMORPHA  222.3  223.3  227.3  N11 
PSEUDOSUCHIA  249  249.3  256.3  N11 
DINOSAURIA  231.5  238.1  243.6  N11 
THEROPODA  231  231.3  235.9  N11 
AVEMETATARSALIA  244.6  249  255  N11 
ARCHOSAURIA  249  249.7  258.1  N11 
ARCHOSAURIA  238.5  241.5  239.1  PEA15 
TANYSTROPHIDAE  244.6  250.9  255.8  PEA15 
ALLOKOTOSAURIA  240  245.9  254.1  PEA15 
ARCHOSAURIFORMES  251.8  253  257.7  PEA15 
ARCHOSAUROMORPHA  258  258.1  272.8  PEA15 
LEPIDOSAUROMORPHA  195.6  226.8  200.7  PEA15 
SAURIA  258  258.3  282.3  PEA15 
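The node ages in the table above can be compared directly. The following Python sketch copies the tabulated values (assumed here to be in Ma) and lists the nodes for which the BEAST2 estimate is younger than the sGLA estimate; the function name is illustrative only.

```python
# (node, GLA, sGLA, BEAST2, dataset) rows copied from the table above.
ROWS = [
    ("CROC", 77.1, 101.7, 95.6, "T15"),
    ("EUSUCHIA", 154.7, 172.8, 145.1, "T15"),
    ("NOTOSUCHIA", 119.5, 155.2, 171.9, "T15"),
    ("GOBIOSUCHIDAE", 77.9, 134.4, 83.7, "T15"),
    ("Sich + Zosuchus", 122.5, 156.8, 144.3, "T15"),
    ("CROCODYLIFORMES", 218.5, 228.9, 237.2, "T15"),
    ("CROCODYLIFORMES", 199.2, 199.9, 201.7, "N11"),
    ("CROCODYLOMORPHA", 218.5, 239.5, 250.4, "T15"),
    ("CROCODYLOMORPHA", 222.3, 223.3, 227.3, "N11"),
    ("PSEUDOSUCHIA", 249.0, 249.3, 256.3, "N11"),
    ("DINOSAURIA", 231.5, 238.1, 243.6, "N11"),
    ("THEROPODA", 231.0, 231.3, 235.9, "N11"),
    ("AVEMETATARSALIA", 244.6, 249.0, 255.0, "N11"),
    ("ARCHOSAURIA", 249.0, 249.7, 258.1, "N11"),
    ("ARCHOSAURIA", 238.5, 241.5, 239.1, "PEA15"),
    ("TANYSTROPHIDAE", 244.6, 250.9, 255.8, "PEA15"),
    ("ALLOKOTOSAURIA", 240.0, 245.9, 254.1, "PEA15"),
    ("ARCHOSAURIFORMES", 251.8, 253.0, 257.7, "PEA15"),
    ("ARCHOSAUROMORPHA", 258.0, 258.1, 272.8, "PEA15"),
    ("LEPIDOSAUROMORPHA", 195.6, 226.8, 200.7, "PEA15"),
    ("SAURIA", 258.0, 258.3, 282.3, "PEA15"),
]


def beast_younger_than_sgla(rows):
    """Return (node, dataset) pairs where the BEAST2 age is younger than sGLA."""
    return [(node, ds) for node, _gla, sgla, beast, ds in rows if beast < sgla]


younger = beast_younger_than_sgla(ROWS)
for node, ds in younger:
    print(node, ds)
```

For the remaining rows the BEAST2 estimate is the older of the two.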
One possible hurdle to adoption of model-based divergence dating methods in fossil-only datasets is the need to use probabilistic models for phenotypic change. Parsimony remains the most commonly employed method to estimate trees using phenotype-only datasets and the use of probabilistic models for morphology has been questioned on both theoretical and practical grounds [
For the three datasets we examined, the BI tree topologies differed only slightly from the MP trees. For the crocodylomorphs, 26 tips require moves to match the two topologies. Most of these moves are one-internode moves or resolve polytomies. For the archosaur tree, moving 11 tips matches the two trees; 9 of those moves resolve polytomies and the remaining 2 are one-node jumps. The saurian trees are even better matched; five tip moves match the topologies, with all but one resolving a polytomy. Although both MP and BI trees possessed polytomies, the BI trees for all datasets had more polytomies than the MP tree. This finding is in line with the recent simulation study of O’Reilly et al. [
This suggests that, at least for the datasets we have examined, the utilization of the M
Results of divergence estimation across methods proved to be dataset dependent, but some general patterns are present. GLA and sGLA returned similar node estimates for the saurian and archosaur datasets with the sGLA values typically within the stratigraphic error of the GLA estimates (
Dataset  Statistic  sGLA  BEAST2  % correction difference

Sauria  mean  10.8  7.6  70%
Sauria  median  3.3  7.6  230%
Crocodyliformes  mean  13.8  11.5  80%
Crocodyliformes  median  9.8  10.3  100%
Archosauria  mean  3.7  5.0  130%
Archosauria  median  2.8  4.3  150%
This relationship did not hold for the crocodyliform dataset, where sGLA estimates tended to be considerably older than the GLA values and, at least for the nodes of interest, the sGLA lie outside the range of stratigraphic uncertainty for the oldest fossils establishing the GLA estimate (
The BEAST estimates are generally older than the sGLA estimates, but the 95% highest probability density (HPD) typically overlaps with the sGLA estimate (
The most notable deviations from the pattern of slightly older BEAST node estimates involve clades that consist of morphologically similar species that sit on long unsampled ghost lineages (see Gobiosuchidae node in
Because the three datasets we examined overlap taxonomically, we can see how the choice of data (both characters and taxa) affects divergence estimates.
These node ages show the common pattern that when the node is deeply nested the sGLA and BEAST estimates do not differ much and the sGLA is included in the 95% HPD (Archosauria in the basal saurian dataset; Crocodylomorpha in the basal archosaur dataset; Crocodyliformes in the basal archosaur dataset). When the node is near the root of the tree, the BEAST estimates are much older than the sGLA and have very large 95% HPD. Thus it appears that sGLA and BEAST converge on a node age when there are a large number of nested taxa influencing the estimate so long as the estimated node is not near the root. It is difficult to know which method is underperforming. It could be that sGLA significantly underestimates node age because there are no long internodes to divide the divergence time on. Conversely, BEAST may be overestimating the node age because there are not enough taxa near the root to constrain the morphological rate. This points to the notion that taxon sampling within a dataset is a more important factor for node divergence estimation than is character sampling.
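Whether a point estimate such as an sGLA node age is "included in the 95% HPD" can be checked against a set of posterior samples. The sketch below uses the generic shortest-interval definition of a highest-density interval; it is not the routine BEAST or its companion tools use, and the sample values and point estimate are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of samples the interval must cover (epsilon guards float rounding).
    k = min(n, max(1, math.ceil(mass * n - 1e-9)))
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Hypothetical posterior node ages (Ma): 19 clustered draws plus one extreme draw.
posterior_ages = list(range(250, 269)) + [400]
low, high = hpd_interval(posterior_ages)

point_estimate = 258.3  # hypothetical point estimate to compare against
inside = low <= point_estimate <= high
print(low, high, inside)
```

A wide interval, as reported for nodes near the root, makes such inclusion more likely without implying that the two methods agree closely.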
For any given branch length “correction”, there is a linear relationship between the amount of “correction” that sGLA provides and the length of the supporting branch, given that the method pulls nodes halfway down that supporting branch. Even for the zero-length branches this relationship holds when one considers that the time of the previous non-zero supporting branch is being divided between the number of zero-length internodes between it and the next non-zero-length branch. Per internode, half of that allotted time is assigned to the internode and the other half to the adjacent pendent edge. Considered tree-wide, sGLA “corrections” follow the hyperbolic paraboloid equation z = x/(y+1), where z = the amount of time added to the zero-length branch, x = length of the prior non-zero supporting branch, and y = number of intervening zero-length internodes. Thus, short supporting branches still lead to shorter corrections and longer supporting branches lead to longer corrections. As a result of this procedure, sGLA is strictly sensitive to the distribution of sampled taxa in geologic time and the stratigraphic congruence of that sample. More precisely, the procedure is sensitive to how tightly clustered the occurrence data are in time.
This observation is borne out in the three reptile analyses we examined, which indeed show this to be a strong driver of node age correction. Both the saurian and archosaur datasets are more temporally evenly sampled tree-wide than the crocodyliform taxa. Their occurrences are more tightly clustered in time and proportionally more nodes are constrained by one old occurrence horizon. The crocodyliform dataset differs from this pattern with less tightly sampled species through time but with much less even sampling over the entirety of the tree (see average ghost lineage length in
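The relationship z = x/(y+1) can be made concrete with a short Python sketch. It evaluates only the equation as stated in the text, with hypothetical branch lengths, and does not reproduce the smoothing routine itself.

```python
def sgla_correction(x, y):
    """Time added to a zero-length branch, z = x / (y + 1).

    x: length (Myr) of the prior non-zero supporting branch
    y: number of intervening zero-length internodes
    """
    if x < 0 or y < 0:
        raise ValueError("x and y must be non-negative")
    return x / (y + 1)


# Hypothetical supporting branches of 12 and 24 Myr.
for x in (12.0, 24.0):
    for y in (1, 2, 3):
        print(f"x={x:5.1f} Myr, y={y}: z={sgla_correction(x, y):.1f} Myr")
```

For a fixed y the correction scales linearly with x (doubling the supporting branch doubles z), whereas for a fixed x it shrinks hyperbolically as more zero-length internodes share the same supporting branch.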
Strict GLA chronogram of crocodyliform phylogeny illustrating the prevalence of extremely long unsampled lineages. All branches longer than 25 Ma have been highlighted. Blue branches are 25 Ma to 50 Ma, purple branches are 50 Ma to 75 Ma, and red branches are 75 Ma or longer.
One potential criterion that would lead one to favor the tip-dating approach over sGLA would be if the method escaped the linear (hyperbolic tree-wide) node correction relationship. Another would be if it could ameliorate the effects of uneven temporal sampling of taxa. Tight temporal clustering of fossil occurrences often results in numerous basally diverging clades having their divergence estimate constrained by a single old occurrence point or by occurrence from a single old locality. This is evident in the three datasets we have analyzed. Basal archosauromorph nodes in the saurian dataset, basal archosauriform and archosaurian nodes in the basal archosaur dataset, and basal mesoeucrocodylian nodes in the crocodyliform dataset, all show the clear signs of minimal sGLA correction driven by a single old occurrence point (Figs
As discussed above, Bayesian inference using the M
A chronogram for crocodylomorphs using GLA shows the striking number of very long ghost lineages for the group (
Tip-dating in BEAST generally performs well when compared to sGLA. Both recover reasonable divergence estimates for most nodes in densely sampled regions of the phylogeny. Tip-dating appears to be correctly moderating the node-age estimate (otherwise overestimated by sGLA) based on the limited morphological divergence of closely related taxa sitting on long unsampled lineages. Tip-dating also is not constrained by the linear relationship between branch length and divergence “correction” that otherwise constrains the sGLA method. However, tip-dating provides very old node estimates near the root.
Long unsampled lineages confound both methods. In the crocodyliform dataset the sGLA values differ greatly from the GLA estimates and typically lie outside the stratigraphic error for the oldest occurrence driving the GLA estimate. Long unsampled lineages violate the constant-sampling-rate assumption of the BDSS tree prior and result in estimated tree topologies considerably different from the BI or MP trees.
These results suggest that when choosing a suitable method for estimating fossil-only node ages, researchers should take note of the potential biases in their data, such as unevenness of geologic sampling within their study taxa, and the prevalence of long unsampled lineages. These two factors will influence the results from a BEAST or sGLA analysis, perhaps more than choices about character data and relaxed clock models. In datasets with long unsampled lineages and a limited amount of character data, strong violation of the assumptions of the BDSS prior might make the sGLA approach a reasonable, more conservative option. In cases where sampling through time is more even, the BEAST approach has the advantage of non-linear node corrections and quantification of uncertainty in the form of 95% highest probability densities.
The BDSS prior we have used here is a special case of a birth-death skyline model (BDSKY; Stadler et al. [
We thank April Wright and Graeme Lloyd (co-conveners with NJM) for organizing and providing an opportunity to participate in the 2014 SVP Berlin “Putting Fossils in Trees” workshop, which served as the genesis of this paper. We are grateful to Chris Brochu, Paul Gignac, Sarah Werning, and the Stony Brook Evolutionary Biology Discussion Group for providing valuable feedback and criticism of an earlier version of this work. Reviews from Mike Lee and one anonymous reviewer greatly improved the manuscript.