The authors declare receipt of funding from The Boeing Company. There are no other declarations relating to employment, consultancy, patents, products in development or marketed products. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.

Developed the concept of the study: JET JDF. Conceived and designed the experiments: JET JDF BN. Performed the experiments: BN. Analyzed the data: QMB. Wrote the paper: JET JDF BN.

Forecasting technological progress is of great interest to engineers, policy makers, and private investors. Several models have been proposed for predicting technological improvement, but how well do these models perform? An early hypothesis made by Theodore Wright in 1936 is that cost decreases as a power law of cumulative production. An alternative hypothesis is Moore's law, which can be generalized to say that technologies improve exponentially with time. Other alternatives were proposed by Goddard, Sinclair et al., and Nordhaus. These hypotheses have not previously been rigorously tested. Using a new database on the cost and production of 62 different technologies, which is the most expansive of its kind, we test the ability of six different postulated laws to predict future costs. Our approach involves hindcasting and developing a statistical model to rank the performance of the postulated laws. Wright's law produces the best forecasts, but Moore's law is not far behind. We discover a previously unobserved regularity that production tends to increase exponentially. A combination of an exponential decrease in cost and an exponential increase in production would make Moore's law and Wright's law indistinguishable, as originally pointed out by Sahal. We show for the first time that these regularities are observed in data to such a degree that the performance of these two laws is nearly the same. Our results show that technological progress is forecastable, with the square root of the logarithmic error growing linearly with the forecasting horizon at a typical rate of 2.5% per year. These results have implications for theories of technological change, and assessments of candidate technologies and policies for climate change mitigation.

Innovation is by definition new and unexpected, and might therefore seem inherently unpredictable. But if there is a degree of predictability in technological innovation, understanding it could have profound implications. Such knowledge could result in better theories of economic growth, and enable more effective strategies for engineering design, public policy design, and private investment. In the area of climate change mitigation, the estimated cost of achieving a given greenhouse gas concentration stabilization target is highly sensitive to assumptions about future technological progress.

There are many hypotheses about technological progress, but are they any good? Which, if any, hypothesis provides good forecasts? In this paper, we present the first statistically rigorous comparison of competing proposals.

When we think about progress in technologies, the first product that comes to mind for many is a computer, or more generally, an information technology. The following quote by Bill Gates captures a commonly held view: “Exponential improvement – that is rare – we've all been spoiled and deeply confused by the IT model”.

It is not possible to quantify the performance of a technology with a single number.

We test six different hypotheses that have appeared in the literature

The dependent variable

Another hypothesis is due to Goddard

We also consider the three multi-variable hypotheses in Eq. (1): Nordhaus

We test these hypotheses on historical data consisting of 62 different technologies that can be broadly grouped into four categories: Chemical, Hardware, Energy, and Other. All data can be found in the online Performance Curve Database at pcdb.santafe.edu. The data are sampled at annual intervals with timespans ranging from 10 to 39 years. The choice of these particular technologies was driven by availability – we included all available data, with minimal constraints applied, to assemble the largest database of its kind.

The data were collected from research articles, government reports, market research publications, and other published sources. Data on technological improvement were used in the analysis if they satisfied the following constraints: they retained a functional unit over the time period sampled, and they included both a performance metric (price or cost per unit of production) and production data for a period of at least 10 years, with no missing years in between. This inclusive approach to data gathering was required to construct a large dataset, which was necessary to obtain statistically significant results. The resulting 62 datasets are described in detail in File S1.
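The inclusion constraints on coverage can be illustrated with a short check. This is a minimal sketch, not the procedure used to build the database: the function name `is_eligible`, the record layout `(year, cost, production)`, and the use of `None` for a missing value are assumptions made for illustration, and the requirement that a dataset retain a functional unit is a judgment that is not captured here.

```python
def is_eligible(records, min_years=10):
    """Check the coverage constraint on one candidate dataset.

    `records` is an iterable of (year, cost, production) tuples
    (hypothetical layout). A year counts only if both cost and
    production are present. Returns True if some run of consecutive
    years is at least `min_years` long.
    """
    complete = sorted(
        year for year, cost, production in records
        if cost is not None and production is not None
    )
    if not complete:
        return False
    best = run = 1
    for prev, cur in zip(complete, complete[1:]):
        # Extend the run only when the next year follows immediately.
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best >= min_years
```

A series with a gap in the middle fails the check unless one of the gap-free stretches on either side is itself at least 10 years long.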

These datasets almost certainly contain significant measurement and estimation errors, which cannot be directly quantified and are likely to increase the error in forecasts. Including many independent datasets helps to ensure that any biases in the database as a whole are random rather than systematic, minimizing their effects on the results of our analysis of the pooled data.

To compare the performance of each hypothesis we use hindcasting, which is a form of cross-validation. We pretend to be at time

The quality of forecasts is examined for all datasets and all hypotheses (and visualized as a three-dimensional error mountain, as shown in File S1). For Wright's law, an illustration of the growth of forecasting errors as a function of the forecasting horizon is given in

The mean value of the logarithmic hindcasting error for each dataset is plotted against the hindcasting horizon
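The hindcasting procedure and the logarithmic error it produces can be sketched as follows for Moore's law and Wright's law. This is an illustrative sketch rather than the exact procedure of the paper. Several choices are assumptions made here: the fitting window of 5 points, the use of base-10 logarithms, the approximation of cumulative production by a running sum that starts at the first observation, and the conditioning of the Wright forecast on the realized future cumulative production. The function name `hindcast_log_errors` is hypothetical.

```python
import numpy as np


def hindcast_log_errors(years, cost, production, origin, window=5):
    """Pretend to stand at index `origin`, fit on the last `window`
    points, and forecast every later year.

    Returns (horizon, errors), where errors[name] holds
    log10(forecast) - log10(actual) for each later year.
    """
    years = np.asarray(years, dtype=float)
    log_cost = np.log10(np.asarray(cost, dtype=float))
    # Assumption: cumulative production starts at the first observation.
    log_cum = np.log10(np.cumsum(np.asarray(production, dtype=float)))

    start = origin - window + 1
    if start < 0 or origin >= len(years) - 1:
        raise ValueError("need `window` points up to origin and a later point")
    fit = slice(start, origin + 1)
    future = slice(origin + 1, None)

    predictors = {
        "Moore": years - years[origin],  # log cost linear in time
        "Wright": log_cum,               # log cost linear in log cumulative production
    }
    errors = {}
    for name, x in predictors.items():
        slope, intercept = np.polyfit(x[fit], log_cost[fit], 1)
        errors[name] = intercept + slope * x[future] - log_cost[future]

    horizon = years[future] - years[origin]
    return horizon, errors
```

Repeating the call for every admissible `origin` yields one error for each pair of origin and target years, which is the raw material for plotting error against horizon.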

An alternative to our approach is to adjust the intercepts to match the last point. For example, for Moore's law this corresponds to using a log random walk of the form

Developing a statistical model to compare the competing hypotheses is complicated by the fact that errors observed at longer horizons tend to be larger than those at shorter horizons, and errors are correlated across time and across functional forms. After comparing many different possibilities (as discussed in detail in File S1), we settled on the following approach. Based on a search of the family of power transformations, which is known for its ability to accommodate a range of variance structures, we take as a response the square root transformation of the logarithmic error. This response was chosen to maximize likelihood when modeled as a linear function of the hindcasting horizon

Specifically, we use the following functional form to model the response:

The parameters

In order to avoid adding 62

Finally, we add an

We also define an exponential correlation structure within each error mountain (corresponding to each combination of dataset and hypothesis, see File S1), as a function of the differences of the two time coordinates with a positive range parameter

Using this statistical model, we compared five different hypotheses. (We removed the Nordhaus model from the sample because of poor forecasting performance

We fit the error model to the

The plot shows the predicted root absolute log error

The error model allows us to compare each hypothesis pairwise to determine whether it is possible to reject one in favor of another at statistically significant levels. The comparisons are based on the intercept and slope of the error model of Eq. (6). The parameter estimates are listed in Tables S1 and S3 in File S1 and the corresponding

We thus have the surprising result that most of the methods are quite similar in their performance. Although the difference is not large, the fact that we can eliminate Goddard for short term forecasts indicates that there is information in the cumulative production not contained in the annual production, and suggests that there is a learning effect in addition to economies of scale. But the fact that Goddard is not that much worse indicates that much of the predictability comes from annual production, suggesting that economies of scale are important. (In our database, technologies rarely decrease significantly in annual production; examples of this would provide a better test of Goddard's theory.) We believe the SKC model performs worse at long times because it has an extra parameter, making it prone to overfitting.

Although Moore performs slightly worse than Wright, given the clear difference in their economic interpretation, it is surprising that their performance is so similar. A simple explanation for Wright's law in terms of Moore's law was originally put forward by Sahal
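The reasoning behind this equivalence can be written out in a few lines. The following is a sketch under idealized assumptions, not a reproduction of Sahal's original argument. The symbols $y_t$ (unit cost), $q_t$ (annual production), $x_t$ (cumulative production), $A$, $B$, $m$, $g$ and $w$ are notation introduced here for illustration, with $m > 0$ and $g > 0$, and production is assumed to have grown at the constant rate $g$ for all past time.

```latex
\begin{align}
y_t &= B\,e^{-m t}
  && \text{(Moore: cost falls exponentially in time)} \\
q_t &= A\,e^{g t}
  && \text{(annual production grows exponentially)} \\
x_t &= \int_{-\infty}^{t} q_s \, ds = \frac{A}{g}\,e^{g t}
  && \text{(cumulative production)} \\
y_t &= B \left( \frac{g\,x_t}{A} \right)^{-m/g} \propto x_t^{-w},
  \qquad w = \frac{m}{g}
  && \text{(Wright: power law in } x_t \text{)}
\end{align}
```

Under these assumptions the two laws make identical predictions, and they can be told apart only when production departs from exponential growth.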

We have chosen these examples to be representative: The top row contains an example with one of the worst fits, the second row an example with an intermediate goodness of fit, and the third row one of the best examples. The fourth row of the figure shows histograms of

We test this in

The value of the Wright parameter

The differences in the data sets can be visualized by plotting

The data-specific contribution to the slope,

To illustrate the practical usefulness of our approach we make a forecast of the cost of electricity for residential scale photovoltaic solar systems (PV).

The solid line is the expected forecast and the dashed line is the expected error.

The expected PV cost in 2020, shown in _{2} emissions

The costs of other technologies can be forecasted in a similar way, using historical data on the cost evolution to project future performance. The expected error in this forecast is calculated using our error model (Eq. (6)). The error is determined for each future year

Our primary goal in this paper is to compare the performance of proposed models in the literature for describing the cost evolution of technologies. Our objective is not to construct the best possible forecasting model. Nonetheless we outline above the steps one would take in making a forecast in order to demonstrate the utility of the general approach we develop, which centers on analyzing a large, pooled database, and estimating the expected, time horizon-dependent error associated with a given forecasting model. This approach can be applied to other forecasting models in the future.
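The horizon-dependent error estimate can be illustrated with a small sketch. It assumes only the summary relationship stated in the abstract, namely that the square root of the logarithmic error grows linearly with the forecasting horizon at a typical rate of 2.5% per year. The intercept has no default because its fitted value is not given here, the use of base-10 logarithms is an assumption, and the function names are hypothetical. The full error model with its dataset-specific terms is not reproduced.

```python
import numpy as np


def expected_log_error(horizon_years, intercept, slope=0.025):
    """Expected |log error| when its square root is linear in horizon.

    `slope` defaults to the typical rate quoted in the abstract;
    `intercept` must be supplied by the caller (hypothetical value).
    """
    horizon = np.asarray(horizon_years, dtype=float)
    return (intercept + slope * horizon) ** 2


def forecast_band(point_forecast, horizon_years, intercept, slope=0.025):
    """Return (lower, upper) multiplicative bounds around a forecast.

    Assumes the error is measured in base-10 logarithms, so an error
    of e corresponds to a factor of 10**e in cost.
    """
    err = expected_log_error(horizon_years, intercept, slope)
    factor = 10.0 ** err
    point = np.asarray(point_forecast, dtype=float)
    return point / factor, point * factor
```

Because the square root of the error is linear in the horizon, the error itself grows quadratically, so the band widens faster at long horizons than a constant percentage rate would suggest.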

The key postulate that we have made in this paper is that the processes generating the costs of technologies through time are generic except for technology-specific differences in parameters. This hypothesis is powerful in allowing us to view any given technology as being drawn from an ensemble. This means that we can pool data from different technologies to make better forecasts, and most importantly, make error estimates. This is particularly useful for studying technology trends, where available data are limited. Of course we must add the usual caveats about making forecasts – as Niels Bohr reputedly said, prediction is very difficult, especially of the future. Our analysis reveals that decreasing costs and increasing production are closely related, and that the hypotheses of Wright and Moore are more similar than they might appear. We should stress, though, that they are not the same. For example, consider a scenario in which the exponential rate of growth of PV production suddenly increased, which would decrease the current production doubling time of roughly 3 years. In this case, Wright predicts that the rate at which costs fall would increase, whereas Moore predicts that it would be unaffected. Distinguishing between the two hypotheses requires a sufficient number of examples where production does not increase exponentially, which our current database does not contain. The historical data show a strong tendency, across different types of technologies, toward constant exponential growth rates. Recent work, however, has demonstrated super-exponential improvement for information technologies over long time spans.

We thank all contributors to the Performance Curve Database (pcdb.santafe.edu).