
DM is affiliated with Proteus Wildlife Research Consultants. There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.

Conceived and designed the experiments: GGA JJLM DM. Analyzed the data: GGA JJLM. Wrote the paper: GGA JJLM DM BW MM. Interpreted the results: GGA JJLM DM BW MM.

In a recent paper, Welsh, Lindenmayer and Donnelly (WLD) question the usefulness of models that estimate species occupancy while accounting for detectability. WLD claim that these models are difficult to fit and argue that disregarding detectability can be better than trying to adjust for it. We think that this conclusion and the subsequent recommendations are not well founded and may negatively impact the quality of statistical inference in ecology and related management decisions. Here we respond to WLD's claims, evaluating their arguments in detail and using simulations and/or theory to support our points. In particular, WLD argue that disregarding and accounting for imperfect detection lead to the same estimator performance, regardless of sample size, when detectability is a function of abundance. We show that this, the key result of their paper, only holds for cases of extreme heterogeneity like the single scenario they considered. Our results illustrate the dangers of disregarding imperfect detection. When ignored, occupancy and detection are confounded: the same naïve occupancy estimates can be obtained for very different true levels of occupancy, so the size of the bias is unknowable. Hierarchical occupancy models separate occupancy and detection, and imprecise estimates simply indicate that more data are required for robust inference about the system in question. As for any statistical method, when the underlying assumptions of simple hierarchical models are violated, their reliability is reduced. Resorting to the naïve occupancy estimator in those instances where hierarchical occupancy models do not perform well does not provide a satisfactory solution. The aim should instead be to achieve better estimation by minimizing the effect of these issues during design, data collection and analysis: ensuring that the right amount of data is collected, that model assumptions are met, and considering model extensions where appropriate.

Species occupancy is a state variable widely used in ecology. It can be defined as the proportion of sites where the target species is present (or in terms of the underlying probability), and is relevant to monitoring programs and the study of species distributions. Models that allow its estimation while simultaneously accounting for imperfect detection are available and have become increasingly used over the past decade

In a paper in this journal

WLD support their criticisms of hierarchical occupancy modelling by stating that these models lead to boundary estimates, “multiple solutions” and imprecise estimators of occupancy and detectability if the sample size is small. While we agree that estimator quality deteriorates with decreasing sample size, which is true for any type of statistical model, this does not justify general claims about lack of utility of hierarchical occupancy models. WLD select a few scenarios to justify their argument that disregarding detectability can be a better approach than explicitly modelling it. Using a more comprehensive analysis, including additional parameter values and methods of assessing the performance of the two approaches, we will demonstrate the true value of hierarchical occupancy models and how they outperform estimators that ignore detectability.

Despite suggestions by WLD to the contrary, the performance of single-species single-season occupancy models has been previously evaluated in the literature. For instance, in presenting the model,

Before moving to more specific comments in the next section, we clarify that, contrary to the assertion by WLD,

We make five main points, which are supported by evidence from simulation results and mathematical derivations. Following WLD, we ran simulations for a scenario (hereafter Scenario A1) where occupancy was

We simulated 5000 data sets per scenario, and fitted hierarchical occupancy models with the package

Following WLD, we ran a set of simulations in which detectability at each site within a covariate category was a random variable rather than constant (but the hierarchical model fitted still assumed that detectability was constant within each category, following a logistic regression as above, i.e.
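
The data-generating scheme just described can be sketched in a few lines of code. The following Python fragment (parameter values are placeholders, not those of our scenarios) simulates detection histories with either constant or site-varying detectability:

```python
import random

def simulate_histories(n_sites, n_visits, psi, p_draw, rng):
    """Simulate detection/non-detection histories for n_sites sites,
    each surveyed n_visits times.

    psi    : occupancy probability (constant across sites)
    p_draw : callable returning a site's per-visit detection
             probability (constant, or random to mimic heterogeneity)
    """
    histories = []
    for _ in range(n_sites):
        occupied = rng.random() < psi
        p = p_draw(rng)
        histories.append([int(occupied and rng.random() < p)
                          for _ in range(n_visits)])
    return histories

rng = random.Random(42)
# Constant detectability within a category (placeholder values)
const = simulate_histories(200, 3, 0.8, lambda r: 0.5, rng)
# Site-varying detectability, e.g. drawn from a Beta distribution
hetero = simulate_histories(200, 3, 0.8, lambda r: r.betavariate(2, 2), rng)
# Naive occupancy: share of sites with at least one detection
naive = sum(any(h) for h in const) / len(const)
```

The same histories can then be analysed both naïvely (collapsing each row to detected/not detected) and with the hierarchical model, which is how the two approaches are compared throughout.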

Scenario | Occupancy probability | Detection probability
Scenario A1* | |
Scenario A2 | |
Scenario B1* | |
Scenario B2 | |
Scenario B3 | |

WLD state that hierarchical occupancy models often lead to boundary estimates (i.e. estimates that take value 0 or 1) and suffer from multiple solutions. Boundary estimates are only a problem when the sample size is small (relative to the sparseness of the data) and occupancy estimates of 1 can be obtained even if the true underlying occupancy is low

WLD present the system of equations that the maximum-likelihood estimates (MLEs) satisfy and support their claim that boundary estimates are often a problem in hierarchical occupancy models by observing that all sites being occupied (

Consider for simplicity the model without covariates. Let

This example corresponds to a constant hierarchical occupancy model and a data set in which each site is summarized by its number of detections out of T survey visits.
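
To make the boundary problem concrete, consider a small invented data set and the likelihood of the constant model. The sketch below locates the maximum of the log-likelihood on a (ψ, p) grid; for these toy data, with a single detection across ten sites, the maximum sits at the ψ boundary even though the observed proportion of sites with detections is only 0.1:

```python
import math

def loglik(psi, p, detections, T):
    """Log-likelihood of the constant occupancy model: a site with
    d > 0 detections out of T visits contributes
    psi * C(T, d) * p**d * (1 - p)**(T - d); a site never detected
    contributes psi * (1 - p)**T + (1 - psi)."""
    ll = 0.0
    for d in detections:
        if d > 0:
            ll += (math.log(psi) + math.log(math.comb(T, d))
                   + d * math.log(p) + (T - d) * math.log(1 - p))
        else:
            ll += math.log(psi * (1 - p) ** T + 1 - psi)
    return ll

def grid_mle(detections, T, grid=200):
    """Locate the likelihood maximum on a (psi, p) grid
    (exact boundary values excluded)."""
    best_psi, best_p, best_ll = None, None, -math.inf
    for i in range(1, grid):
        psi = i / grid
        for j in range(1, grid):
            p = j / grid
            ll = loglik(psi, p, detections, T)
            if ll > best_ll:
                best_psi, best_p, best_ll = psi, p, ll
    return best_psi, best_p, best_ll

# Sparse toy data: 10 sites, T = 2 visits, a single detection overall.
# The maximum lies at the psi boundary (psi_hat near 1, small p_hat),
# although only 1 of 10 sites produced any detection.
sparse = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
psi_hat, p_hat, _ = grid_mle(sparse, T=2)
```

This is exactly the pathology described in the text: with very sparse data the fitted model can attribute all the zeros to non-detection rather than absence, which disappears as the amount of data grows.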

(a) Results from WLD

all 0 | some 0 | some 0&1 | some 1 | all 1 | interior | total

all 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |

some 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |

some 0&1 | 0 | 0 | 9 | 0 | 0 | 0 | 9 | |

some 1 | 48 | 1 | 11 | 0 | 0 | 0 | 60 | |

all 1 | 62 | 0 | 0 | 0 | 0 | 0 | 62 | |

interior | 10 | 0 | 21 | 57 | 12 | 4769 | 4869 | |

Total | 120 | 1 | 41 | 57 | 12 | 4769 | 5000 |

(b) Our results

all 0 | some 0 | some 0&1 | some 1 | all 1 | interior | total

all 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |

some 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |

some 0&1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |

some 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |

all 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |

interior | 0 | 0 | 4 | 104 | 0 | 4892 | 5000 | |

Total | 0 | 0 | 4 | 104 | 0 | 4892 | 5000

The simulation results presented by WLD for Scenario A1 show occupancy estimates

WLD claim that obtaining multiple solutions to the system of likelihood equations is a problem in hierarchical occupancy models. To evaluate the extent to which multiple solutions are indeed a problem for model fitting, we reran our Scenario A1 simulations for the smaller sample size (

In 98.5% of the simulations,

The optimization algorithm used by
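
A simple practical safeguard against multiple solutions is to re-fit the model from widely dispersed starting values and compare the optima reached. In a real analysis one would use a proper optimizer (e.g. `optim` in R); the crude random-walk climber below, applied to an invented data set, is only meant to illustrate the multi-start check itself:

```python
import math, random

def loglik(psi, p, detections, T):
    """Constant-model log-likelihood (combinatorial constants dropped,
    as they do not depend on the parameters)."""
    ll = 0.0
    for d in detections:
        if d > 0:
            ll += math.log(psi) + d * math.log(p) + (T - d) * math.log(1 - p)
        else:
            ll += math.log(psi * (1 - p) ** T + 1 - psi)
    return ll

def climb(detections, T, start, seed, iters=4000):
    """Random-walk hill climb from a given starting point; accepts
    only proposals that improve the log-likelihood."""
    rng = random.Random(seed)
    psi, p = start
    best = loglik(psi, p, detections, T)
    step = 0.05
    for _ in range(iters):
        cand_psi = min(0.999, max(0.001, psi + rng.uniform(-step, step)))
        cand_p = min(0.999, max(0.001, p + rng.uniform(-step, step)))
        ll = loglik(cand_psi, cand_p, detections, T)
        if ll > best:
            psi, p, best = cand_psi, cand_p, ll
        step = max(0.002, step * 0.999)  # slowly shrink the proposal window
    return psi, p, best

data = [2, 1, 0, 3, 0, 1, 2, 0, 0, 1, 3, 0]  # detections per site, T = 3
starts = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.9), (0.2, 0.8), (0.8, 0.2)]
fits = [climb(data, 3, s, seed=i) for i, s in enumerate(starts)]
# If all starts reach (nearly) the same log-likelihood, multiple
# solutions are not a practical concern for this data set
ll_spread = max(f[2] for f in fits) - min(f[2] for f in fits)
```

For this data set all five starts converge to essentially the same optimum, which is the typical outcome when the data are not extremely sparse.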

When imperfect detection is disregarded, the metric being estimated is no longer species occupancy (

The first three columns correspond to the hierarchical model: in column 1 estimates of occupancy probability

For details in figure arrangement see

When interpreted as an estimator of species occupancy, the naïve model is biased whenever overall detection is imperfect (i.e.
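
The confounding can be made explicit. Under the model assumptions the expectation of the naïve estimator is ψ(1 − (1 − p)^T), the probability that a site is both occupied and detected at least once, so very different (ψ, p) pairs are indistinguishable to the naïve model. A quick numeric check (values chosen purely for illustration):

```python
def expected_naive(psi, p, T):
    """Expected naive 'occupancy': probability that a site is occupied
    AND detected at least once in T visits."""
    return psi * (1 - (1 - p) ** T)

T = 3
# Choose p so that psi = 0.9 with imperfect detection matches
# psi = 0.6 with perfect detection: 0.9 * (1 - (1 - p)**3) = 0.6
p = 1 - (1 / 3) ** (1 / 3)           # makes 1 - (1 - p)**3 equal 2/3
a = expected_naive(0.9, p, T)        # high occupancy, poor detection
b = expected_naive(0.6, 1.0, T)      # lower occupancy, perfect detection
# a == b == 0.6: the naive model cannot tell these two states apart
```

The naïve estimate of 0.6 is consistent with true occupancy anywhere between 0.6 and 1, which is why the size of the bias is unknowable without modelling detection.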

As we increase the number of survey sites

(a) Scenario A1

NA/0.042 | NA/0.039 | NA/0.038 | |

0.047/0.017 = 2.82 | 0.019/0.013 = 1.44 | 0.011/0.011 = 0.94 | |

0.017/0.011 = 1.62 | 0.007/0.007 = 1.08 | 0.004/0.005 = 0.84 | |

0.011/0.009 = 1.22 | 0.005/0.005 = 1.04 | 0.003/0.004 = 0.97 | |

0.010/0.009 = 1.10 | 0.005/0.005 = 1.04 | 0.003/0.003 = 1.02 |

(b) Scenario A2

NA/0.413 | NA/0.411 | NA/0.411 | |

0.075/0.271 = 0.28 | 0.052/0.267 = 0.19 | 0.043/0.268 = 0.16 | |

0.052/0.182 = 0.28 | 0.034/0.177 = 0.19 | 0.027/0.177 = 0.16 | |

0.038/0.124 = 0.31 | 0.025/0.119 = 0.21 | 0.019/0.118 = 0.16 | |

0.030/0.085 = 0.36 | 0.019/0.081 = 0.24 | 0.014/0.080 = 0.18 |

WLD present Scenario A1 (

Accounting for imperfect detection does not necessarily require increased survey effort. However, it is necessary to collect survey data in such a way that the detection process can be modelled

WLD present as a difficulty the need for “extra data collection […] to adjust for non-detection”. However, inconsistently, when fitting the naïve logistic model in their simulations, WLD use the data corresponding to the full sampling effort (collapsing the replicate records as we do). If WLD regard the additional replicate surveys as a complication introduced by modelling detectability, then a fair comparison would have fitted the naïve model to the data from a single replicate survey per site. We include these results (denoted “

WLD's stated key result is that “when the detection process depends on abundance, the bias in the fitted probabilities can be of similar magnitude to the bias when the detection process is ignored, and this is very difficult to overcome”. They also point out that increasing the sample size (

A close inspection of the scenario simulated by WLD (hereafter Scenario B1) reveals the root of this contradiction. As in the previous example, occupancy was set constant for all sites (
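
One standard construction for abundance-driven detectability (not necessarily the exact one WLD used) sets the per-visit detection probability at site i to p_i = 1 − (1 − r)^{N_i}, where N_i is the site's abundance and r a per-individual detection probability; a site is truly occupied when N_i > 0. The sketch below simulates data this way with placeholder values (λ and r are our assumptions) and shows the naïve proportion falling short of the true occupancy P(N_i > 0):

```python
import math, random

def rpois(lam, rng):
    """Poisson draw via Knuth's multiplication method (fine for small lam)."""
    L, k, prod = math.exp(-lam), 0, rng.random()
    while prod > L:
        k += 1
        prod *= rng.random()
    return k

def simulate_abundance(n_sites, n_visits, lam, r, rng):
    """Detection driven by abundance: N_i ~ Poisson(lam) and
    p_i = 1 - (1 - r)**N_i, so N_i = 0 gives p_i = 0 (truly absent)."""
    hist = []
    for _ in range(n_sites):
        N = rpois(lam, rng)
        p = 1 - (1 - r) ** N
        hist.append([int(rng.random() < p) for _ in range(n_visits)])
    return hist

rng = random.Random(7)
hist = simulate_abundance(1000, 3, lam=1.0, r=0.3, rng=rng)
naive = sum(any(h) for h in hist) / len(hist)
true_occ = 1 - math.exp(-1.0)   # P(N > 0), about 0.632
```

With these placeholder values heterogeneity is moderate: low-abundance sites are hard but not impossible to detect, unlike the near-undetectable occupied sites in Scenario B1.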

Lines correspond to

(a) Scenario B1

NA/0.046 | NA/0.040 | NA/0.038 | |

0.028/0.032 = 0.87 | 0.018/0.025 = 0.72 | 0.017/0.023 = 0.73 | |

0.021/0.026 = 0.82 | 0.016/0.020 = 0.82 | 0.015/0.017 = 0.84 | |

0.019/0.023 = 0.86 | 0.015/0.017 = 0.88 | 0.013/0.014 = 0.90 | |

0.018/0.020 = 0.89 | 0.013/0.015 = 0.92 | 0.012/0.013 = 0.93 |

(b) Scenario B2

NA/0.041 | NA/0.041 | NA/0.041 | |

0.022/0.022 = 0.97 | 0.013/0.021 = 0.60 | 0.009/0.020 = 0.44 | |

0.014/0.016 = 0.93 | 0.008/0.013 = 0.63 | 0.006/0.012 = 0.51 | |

0.012/0.012 = 0.96 | 0.007/0.009 = 0.71 | 0.005/0.008 = 0.61 | |

0.011/0.011 = 0.98 | 0.006/0.007 = 0.80 | 0.004/0.006 = 0.70 |

(c) Scenario B3

NA/0.148 | NA/0.156 | NA/0.158 | |

0.029/0.068 = 0.42 | 0.024/0.072 = 0.34 | 0.020/0.071 = 0.28 | |

0.019/0.038 = 0.49 | 0.016/0.040 = 0.39 | 0.013/0.038 = 0.34 | |

0.014/0.025 = 0.58 | 0.011/0.024 = 0.46 | 0.009/0.023 = 0.40 | |

0.011/0.017 = 0.66 | 0.009/0.016 = 0.54 | 0.007/0.015 = 0.46 |

So, what would happen if we consider a different and plausible scenario? Let us consider an example where

For details in figure arrangement see

The same conclusions can be drawn from our detailed exploration of a wide range of heterogeneity scenarios (from none to extreme), assuming a single covariate category (

In the data-generating model, occupancy is constant and detectability at each site is drawn from a single distribution

In summary, based on our simulations and theoretical results, we can conclude that the hierarchical model is also less biased than the naïve model when detectability varies across sites, for instance as a result of variation in abundance. We also note that there are extensions of the hierarchical occupancy model that explicitly allow heterogeneous detection probabilities
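
The following sketch illustrates this point under one invented heterogeneity scenario: detectability at occupied sites is drawn from a Beta(2, 2) distribution (our assumption, for illustration only), and the deliberately misspecified constant-p hierarchical model is fitted by a grid search. In runs like this the hierarchical estimate typically lands closer to the true occupancy than the naïve proportion, although some bias remains:

```python
import math, random
from collections import Counter

def simulate(n_sites, T, psi, rng):
    """Heterogeneous detection: each occupied site's per-visit p is
    drawn from Beta(2, 2); returns detection counts per site."""
    det = []
    for _ in range(n_sites):
        if rng.random() < psi:
            p = rng.betavariate(2, 2)
            det.append(sum(rng.random() < p for _ in range(T)))
        else:
            det.append(0)
    return det

def fit_constant(det, T, grid=100):
    """Grid MLE of the constant-p hierarchical model (misspecified
    here, since the data have site-varying detectability)."""
    counts = Counter(det)
    best, best_ll = (None, None), -math.inf
    for i in range(1, grid):
        psi = i / grid
        for j in range(1, grid):
            p = j / grid
            ll = 0.0
            for d, c in counts.items():
                if d > 0:
                    ll += c * (math.log(psi) + d * math.log(p)
                               + (T - d) * math.log(1 - p))
                else:
                    ll += c * math.log(psi * (1 - p) ** T + 1 - psi)
            if ll > best_ll:
                best, best_ll = (psi, p), ll
    return best

rng = random.Random(3)
true_psi, T = 0.7, 4
det = simulate(2000, T, true_psi, rng)
naive = sum(d > 0 for d in det) / len(det)   # biased low by non-detection
psi_hat, p_hat = fit_constant(det, T)        # partially corrects the bias
```

Even though the constant-p model is wrong for these data, its occupancy estimate sits between the naïve proportion and the truth, in line with the asymptotic-bias discussion above.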

WLD present an overly negative picture of the performance of hierarchical occupancy models, questioning their value to the extent of suggesting that in general “ignoring non-detection can actually be better than trying to adjust for it”. Disregarding detectability implies modelling ‘where the species is

WLD claim that “the extra data collection and modelling effort to try to adjust for non-detection is simply not worthwhile”. However, modelling detectability is not as hard as WLD would have readers believe. One does not necessarily need more sampling effort; instead the data need simply be collected and recorded in a way that is informative about the detection process

WLD partly support their argument by pointing out that hierarchical occupancy models can produce estimates that are imprecise or at the boundary of the parameter space and that they can have problems with multiple solutions when the sample size is small. We have shown why we believe that WLD have overstated the severity of these issues. It is undeniable that, as with any type of statistical model, estimator performance will degrade as the sample size decreases, but in itself this does not justify discarding a method (this would suggest abandoning all statistical inference). Sample sizes can be too small to robustly infer species occupancy but disregarding detectability does not solve this situation.

We believe that accounting for detectability is important as otherwise it is impossible to know whether the “occupancy” estimates (even if precise) are accurate or not. In the naïve model, the occupancy and detection processes are confounded. One can find examples where disregarding detectability leads to estimators with better properties (in terms of MSE) but, since the same data can be produced by very different occupancy-detection scenarios, as shown in our simulations, we can never be confident that the naïve estimates reflect true occupancy unless detection is known to be perfect.

In contrast, the hierarchical occupancy model separates the occupancy and detection processes. If overall detection is nearly perfect (i.e., at occupied sites the probability of at least one detection for

WLD's key result is that hierarchical occupancy models do not perform any better than the naïve model when detectability depends on abundance, regardless of the amount of survey data, and that this “undermines the rationale for occupancy modelling”. We have shown that their result arises from a particular choice and limited interpretation of a specific scenario, which involves occupied sites where the species is virtually undetectable while detectability is relatively high at other sites. We have demonstrated that the hierarchical model clearly outperforms the naïve model in other scenarios where heterogeneity is still substantial. We have also shown that this difference in performance becomes more apparent as the number of sites increases, even if some bias remains when the number of sites is large. The basic hierarchical model is asymptotically biased when there is unaccounted heterogeneity in detection, as already pointed out by

In conclusion, although we fully agree with WLD about the need to be honest about the limitations of statistical procedures, we do not share their opinion that accounting for detectability is “very difficult” in general, nor that it is better to disregard the fact that detection can, and usually will, be imperfect. The difficulty lies not so much in the modelling of detectability as in imperfect detection itself. We do not claim that the modelling stage is straightforward; indeed, coming up with useful models for real data can be highly challenging. There will be cases for which meaningful parameter estimates cannot be obtained with the available data, regardless of one's statistical skills. Unfortunately, and as much as one may desire it, naïve estimates are not a solution to this problem.


The authors thank Byron Morgan, Martin Ridout and Marc Kéry for useful comments.