The authors have declared that no competing interests exist.

We apply standard demographic principles of inflows and outflows to estimate the number of undocumented immigrants in the United States, using the best available data, including some that have only recently become available. Our analysis covers the years 1990 to 2016. We develop an estimate of the number of undocumented immigrants based on parameter values that tend to underestimate undocumented immigrant inflows and overstate outflows; we also show the probability distribution for the number of undocumented immigrants based on simulating our model over parameter value ranges. Our conservative estimate is 16.7 million for 2016, nearly fifty percent higher than the most prominent current estimate of 11.3 million, which is based on survey data and thus different sources and methods. The mean estimate based on our simulation analysis is 22.1 million, essentially double the current widely accepted estimate. Our model predicts a similar trajectory of growth in the number of undocumented immigrants over the years of our analysis, but at a higher level. While our analysis delivers different results, we note that it is based on many assumptions. The most critical of these concern border apprehension rates and voluntary emigration rates of undocumented immigrants in the U.S. These rates are uncertain, especially in the 1990s and early 2000s, which is when the number of undocumented immigrants increases most significantly, according to both our modeling and the very different survey data approach. Our results, while based on a number of assumptions and uncertainties, could help frame debates about policies whose consequences depend on the number of undocumented immigrants in the United States.

Immigration policy remains a hotly debated issue in the United States, with perhaps no aspect more controversial than how to address undocumented immigrants. Policy debates about the amount of resources to devote to this issue, and the merits of alternative policies, including deportation, amnesty, and border control, depend critically on estimates of the number of undocumented immigrants in the U.S., which sets the scale of the issue. The most widely accepted estimate of this number currently is approximately 11.3 million [

An alternative approach to estimating the size of the undocumented population follows directly from basic demographic principles. Starting from a known population size at a given date, the population size at a future date equals the starting value plus the cumulative inflows minus the cumulative outflows. We employ this approach to estimate the number of undocumented immigrants in the U.S. for each year from 1990 to 2016, using the best available data and parameter values from the academic literature and government sources. Some of the information we use has been collected and made available only recently, so our approach is timely.
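The accounting identity described above can be illustrated with a minimal sketch. The function name `project_population` and the numbers in the example are hypothetical and chosen only for illustration; they are not the data or parameter values used in our analysis.

```python
def project_population(initial, inflows, outflows):
    """Return the population at the end of each year.

    Each year's value is the previous value plus that year's inflow
    minus that year's outflow, so the final value equals the initial
    population plus cumulative inflows minus cumulative outflows.
    """
    if len(inflows) != len(outflows):
        raise ValueError("inflows and outflows must cover the same years")
    sizes = []
    current = initial
    for inflow, outflow in zip(inflows, outflows):
        current = current + inflow - outflow
        sizes.append(current)
    return sizes


# Hypothetical illustrative values in millions (not the paper's data).
trajectory = project_population(3.5, [1.0, 1.2, 0.9], [0.3, 0.4, 0.5])
```

In this toy example the population moves from 3.5 million to roughly 5.4 million over three years, because cumulative inflows (3.1) exceed cumulative outflows (1.2).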

Our analysis has two main outputs. First, we generate what we call our conservative estimate, using parameter values that intentionally underestimate population inflows and overestimate population outflows, leading to estimates that will tend to underestimate the number of undocumented immigrants. Our conservative estimate for 2016 is 16.7 million, well above the estimate that is most widely accepted at present, which is for 2015 but should be comparable. Our model as well as most work in the literature indicates that the population size has been relatively stable since 2008; thus 2015 and 2016 are quite comparable. For our second step, recognizing that there is significant uncertainty about population flows, we simulate our model over a wide range of values for key parameters. These parameter values range from very conservative estimates to standard values in the literature. We sample values for each key parameter from uniform distributions over the ranges we establish. In our simulations, we also include Poisson population uncertainty conditional on parameter values, thus addressing the inherent variability in population flows. Our simulation results produce probability distributions over the number of undocumented immigrants for each year from 1990 to 2016. The results demonstrate that our conservative estimate falls towards the bottom of the probability distribution, at approximately the 2.5th percentile. The mean of the 2016 distribution is 22.1 million, which we take as the best overall estimate of the number of undocumented immigrants based on our modeling approach and current data. We also show the variability in our model based on the simulations for each year from 1990 through 2016.

The model works as follows (mathematical formulation, parameter values, and data sources underlying this model are detailed in the Supporting Information). For our conservative estimate we begin with a starting 1990 population of 3.5 million undocumented immigrants, in agreement with the standard estimate [

Population inflows are decomposed into two streams: (I) undocumented immigrants who initially entered the country legally but have overstayed their visas; and (II) immigrants who have illegally crossed the border without being apprehended. We describe our approach for each source, explain the basis for our assumptions and why they are conservative, and list parameter ranges for the simulation.

DHS [

Most experts agree that the apprehension rate was significantly lower in earlier years [

Additional facts support the view that the apprehension rate has increased in recent years. The number of border agents has increased dramatically over the timespan of our analysis [

Notwithstanding our view that we make conservative choices in setting up our model and parameter values, we acknowledge that border apprehension rates for the 1990s are not based on as well-developed data sources as estimates for more recent years. Thus, it remains possible that these rates are higher than we believe. One aspect of this uncertainty concerns deterrence. When deterrence is higher, border crossings fall. Most researchers believe deterrence has increased in recent years [

Population outflows are broken into four categories: (I) voluntary emigration; (II) mortality; (III) deportation; and (IV) change of status from unauthorized to lawful.

For our simulation analysis we divide first-year voluntary emigration into two categories, visa overstayers and illegal border crossers. For visa overstayers we assume the first-year rate falls in the range [0.25, 0.50] (uniform) for each year; based on the discussion in the preceding paragraph and literature cited there, this is a relatively conservative range whose midpoint of 37.5% lies above nearly all accepted estimates. For illegal border crossers there are data indicating that first-year voluntary emigration rates vary across cohorts [

An important issue is circular flow of migrants, which refers to individuals who enter the country, then exit temporarily and re-enter a short time later. There is limited numerical data for circular flow rates. However, it is logical and recognized in the literature [

Lastly,

Our simulation is designed to evaluate the range of outcomes the model produces, thus taking into account important sources of variability. There are two main sources of uncertainty: parameter uncertainty, and inherent population variability conditional upon a set of parameter values. We take both sources into account, but note that the first source is the main factor contributing to the variability of the population distribution in the model.

We address parameter uncertainty by establishing ranges for key parameters. As documented above, these key parameters are (i) the visa overstay rate; (ii) the border apprehension rate for individuals attempting to cross the border illegally; (iii) the voluntary emigration rate, which is set separately for illegal border crossers and visa overstayers for the first year and then jointly for years 2-10 and years 10 and above, and for which we establish a cohort-specific range for each annual cohort for the first-year rate for illegal border crossers; and (iv) the mortality rate. For each parameter, we establish a uniform distribution over the set range (and impose a negative correlation between the border apprehension rate and first-year voluntary emigration rate for illegal border crossers). Then, in each simulation run we sample a value for each parameter from its underlying distribution. All of the ranges for the parameter distributions have been specified in the preceding sections. We also sample a value for the initial population of undocumented immigrants in 1990 from a Poisson distribution with a mean of 3.5 million, the most widely accepted estimate of the population of undocumented immigrants as of that date. See the Supporting Information for further details.
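The uniform sampling step can be sketched as follows. This is only an illustration under stated assumptions: the range for the first-year emigration rate of visa overstayers is the one given in the text, while the other two ranges are hypothetical placeholders, and the sketch omits both the negative correlation between the apprehension rate and first-year emigration and the Poisson draw of the 1990 population. The names `PARAMETER_RANGES` and `sample_parameters` are ours.

```python
import random

# Only the first entry is taken from the text; the remaining ranges are
# hypothetical placeholders, not the values used in the analysis.
PARAMETER_RANGES = {
    "first_year_emigration_overstayers": (0.25, 0.50),  # from the text
    "border_apprehension_rate": (0.30, 0.50),           # hypothetical
    "mortality_rate": (0.005, 0.010),                   # hypothetical
}


def sample_parameters(rng, ranges=PARAMETER_RANGES):
    """Draw one value per parameter, each uniform over its own range.

    Draws are independent here; the correlation structure described in
    the text is deliberately left out of this sketch.
    """
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = [sample_parameters(rng) for _ in range(1000)]
```

Each element of `draws` plays the role of the parameter set for one simulation run.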

To model inherent population uncertainty given a set of parameter values, we impose a Poisson structure on our model. Specifically, the population in a particular year, conditional on a set of parameter values, is represented as the sum of all individuals who have entered the country in previous years and have remained in the country from their year of arrival until the particular year in question. The number of entries (in Poisson terminology, arrivals) in any year is drawn from a Poisson distribution with mean dependent upon the underlying parameter values governing apprehension probabilities and visa overstays for that year, while the probability that a new immigrant remains in the country from entry until the particular year in question is determined based on the parameters governing voluntary emigration, mortality, deportation and change-of-status rates. It follows (see the Supporting Information for mathematical details) that the number of individuals who enter the country in any given year and are still in the country at some future date will also follow a Poisson distribution. Further, the number of individuals who enter in any given year and remain in the country at a future time can be considered to be statistically independent given the underlying parameter values (see the Supporting Information for details). Thus, the population of undocumented immigrants in a particular year, which is the sum of those who have entered in past years and are still in the country in the particular year in question, also follows a Poisson distribution, for the sum of independent Poisson random variables is itself Poisson distributed.
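The argument in the preceding paragraph can be written compactly. The notation here is assumed for illustration and may differ from that of the Supporting Information: $N_s$ is the number of entries in year $s$, $\lambda_s$ its mean, $p_{s,t}$ the probability that an individual entering in year $s$ is still in the country in year $t$, $R_{s,t}$ the number of such individuals, and $P_t$ the population in year $t$. All statements are conditional on the parameter values.

```latex
% Assumed notation (not necessarily that of the Supporting Information).
\begin{align}
N_s &\sim \mathrm{Poisson}(\lambda_s), \\
R_{s,t} \mid N_s &\sim \mathrm{Binomial}(N_s,\, p_{s,t})
  \;\Longrightarrow\;
  R_{s,t} \sim \mathrm{Poisson}(\lambda_s\, p_{s,t}), \\
P_t = \sum_{s \le t} R_{s,t}
  &\sim \mathrm{Poisson}\Big(\sum_{s \le t} \lambda_s\, p_{s,t}\Big).
\end{align}
```

The second line is the thinning property of the Poisson distribution, assuming each entrant remains independently of the others. The third line uses the independence of the $R_{s,t}$ across entry years. The initial 1990 population can be treated as one additional cohort in the sum.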

We ran 1,000,000 trials simulating the model. For each trial we recorded the total number of undocumented immigrants predicted to be in the U.S. in each year from 1990 through 2016 for that trial.
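Given the recorded outcomes for a single year, the mean and a 95% probability interval can be obtained from the empirical distribution of the trials. The following is a minimal sketch of one way to do so, not necessarily the procedure we used; the function name `summarize_trials` and the stand-in data are hypothetical.

```python
import statistics


def summarize_trials(values):
    """Return the mean and the 2.5th and 97.5th percentiles of trial outcomes."""
    data = list(values)
    # With n=40 the cut points fall at 2.5%, 5%, ..., 97.5%, so the
    # first and last cut points bound the central 95% of the trials.
    cuts = statistics.quantiles(data, n=40)
    return {
        "mean": statistics.mean(data),
        "lower_95": cuts[0],
        "upper_95": cuts[-1],
    }


# Stand-in data: the integers 1..1000, not simulation output.
example = summarize_trials(range(1, 1001))
```

Applying the same function to each year from 1990 through 2016 would give the year-by-year summary of the simulated distribution.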

Following suggestions made by the Academic Editor based on comments made by a reviewer, we performed an additional set of simulations making even more conservative assumptions about net inflows over the period 1990-98. This is the period for which there is significant uncertainty about net inflows of undocumented immigrants. Specifically, we calibrated the model such that the net inflows are half a million per year over this period (in line with the residual method’s estimates during this period) and computed the pooled number of undocumented immigrants at the end of 1998 based on this approach. We then simulated our model forward from that point using the same framework described above.

The results of our analysis are clear: the number of undocumented immigrants in the United States is estimated to be substantially larger than widely accepted previous estimates indicate. Even an estimate based on what we view as conservative assumptions, in some cases unrealistically so, yields 16.7 million, well above the conventional estimate of 11.3 million. The mean of our simulations, which range over more standard but still conservative parameter values, is 22.1 million, essentially twice the current widely accepted estimate; the ninety-five percent probability interval is [16.2, 29.5] million.

Even for the scenario presuming net inflows of 0.5 million per year for 1990-98 our results still exceed the current estimates substantially. The mean estimate is 17.0 million with a 95% probability interval of 13.5 million to 21.1 million. The conservative estimate for this scenario is 14.0 million, still significantly above the widely accepted estimate of 11.3 million.

It is currently fairly widely accepted that there are approximately 11 million undocumented immigrants in the United States. This estimate, derived from population surveys and legal immigration records, has formed the backdrop for the immigration policy debate in the United States. Using a different approach grounded in operational data, and demographic and mathematical modeling, we have arrived at higher estimates of the undocumented immigrant population.

A possible explanation for the discrepancy in these results is that the survey-based approach taken in [

Our approach, summarized above and detailed in the Supporting Information, is grounded in fundamental principles of demographic flows. The size of any population can be represented as its initial value plus cumulative inflows minus cumulative outflows. We have specialized this approach to the number of undocumented immigrants in the United States, and have drawn upon previously unavailable data. From border apprehensions and visa overstays, it is possible to infer the number of new undocumented arrivals by reversing the flow: how many new arrivals are necessary in order to see the number of apprehensions and visa overstayers observed? Similarly, consideration of deportations, voluntary emigration, mortality and change-of-status enables one to infer the duration of stay in the country from the time of arrival. Together, this logic enables reconstructing the arrival and departure processes governing population inflows and outflows that result in the population of undocumented immigrants in the country.
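The idea of reversing the flow can be illustrated for border crossings with a deliberately simplified sketch. It assumes that every crossing attempt is apprehended independently with the same probability and ignores repeat attempts and deterrence, which the full model in the Supporting Information treats in more detail. The function name and the numbers are hypothetical.

```python
def implied_successful_entries(apprehensions, apprehension_rate):
    """Infer successful entries from observed apprehensions.

    Simplifying assumption: each attempt is apprehended with probability
    `apprehension_rate`, so total attempts = apprehensions / rate and
    successful entries = attempts - apprehensions.
    """
    if not 0 < apprehension_rate <= 1:
        raise ValueError("apprehension_rate must lie in (0, 1]")
    attempts = apprehensions / apprehension_rate
    return attempts - apprehensions


# Hypothetical numbers: 1,000,000 apprehensions at a 40% apprehension rate.
entries = implied_successful_entries(1_000_000, 0.4)
```

Under these assumptions, 1,000,000 apprehensions at a 40% rate imply about 2.5 million attempts and about 1.5 million successful entries. The sketch also shows why the result is sensitive to the assumed rate: a lower apprehension rate implies more successful entries for the same number of apprehensions.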

In developing estimates we have attempted to utilize parameter values that understate inflows and overstate outflows. Our results are most sensitive to the assumptions we make about the probability of border apprehension and the voluntary emigration rates of undocumented immigrants leaving the United States. Further research could explore in greater detail the impact of assumptions about these parameters on estimates of the number of undocumented immigrants. To explore the uncertainty of our estimates we have conducted extensive simulations over parameters, simulating 1 million different population trajectories; further research could widen the ranges of parameters and consider additional parameter uncertainty. Further research could also analyze inflows and outflows based on country of origin.

Our results lead us to the conclusion that the widely accepted estimate of 11.3 million undocumented immigrants in the United States is too small. Our model estimates indicate that the true number is likely to be larger, with an estimated ninety-five percent probability interval ranging from 16.2 to 29.5 million undocumented immigrants.

Contains the mathematical model, parameter values, and data sources underlying the model.

(PDF)

The spreadsheet used to calculate the conservative estimate.

(XLSX)