The authors have declared that no competing interests exist.
Analyzed the data: DWC. Contributed reagents/materials/analysis tools: DWC SC CL. Wrote the paper: DWC SC CL. Assembled data: DWC.
Estimates of recreational fishing harvest are often unavailable until after a fishing season has ended. This lag in information complicates efforts to stay within the quota. The simplest way to monitor quota within the season is to use harvest information from the previous year. This works well when fishery conditions are stable, but is inaccurate when fishery conditions are changing. We develop regression-based models to “nowcast” intraseasonal recreational fishing harvest in the presence of changing fishery conditions. Our basic model accounts for seasonality, changes in the fishing season, and important events in the fishery. Our extended model uses Google Trends data on the internet search volume relevant to the fishery of interest. We demonstrate the model with the Gulf of Mexico red snapper fishery where the recreational sector has exceeded the quota nearly every year since 2007. Our results confirm that data for the previous year works well to predict intraseasonal harvest for a year (2012) where fishery conditions are consistent with historic patterns. However, for a year (2013) of unprecedented harvest and management activity our regression model using search volume for the term “red snapper season” generates intraseasonal nowcasts that are 27% more accurate than the basic model without the internet search information and 29% more accurate than the prediction based on the previous year. Reliable nowcasts of intraseasonal harvest could make in-season (or in-year) management feasible and increase the likelihood of staying within quota. Our nowcasting approach using internet search volume might have the potential to improve quota management in other fisheries where conditions change year-to-year.
All federally managed saltwater fisheries in the United States have a “hard” cap or quota on harvest and measures to prevent the quota from being exceeded. The commercial and recreational fishing sectors are held accountable for exceeding their share of the quota with adjustments in the following year [
We introduce a regression approach that uses information on fishery-related internet search volume to provide more timely intraseasonal predictions of recreational harvest. There is a growing literature showing that the internet search volume on a particular topic (e.g., unemployment insurance) can be used to predict current levels of policy-relevant variables (e.g., unemployment rates) [
We demonstrate the harvest nowcasting approach with the Gulf of Mexico recreational red snapper fishery. This fishery has been particularly difficult to manage with progressively shortening seasons in the presence of changes in effort and an increasing average fish size. The recreational sector has overharvested red snapper in every year from 2007 to 2013 with the exception of 2010, when the Deepwater Horizon (DWH) oil spill forced a mandatory closure of prime fishing grounds during the busy summer season.
NOAA Fisheries forecasts recreational harvest of red snapper and other key species several months in advance in order to set fishing seasons [
Our recreational fishing harvest data series comes from the Annual Catch Limit (ACL) database assembled by the NOAA Southeast Fisheries Science Center. This database is used to monitor recreational harvest as the season proceeds in two-month waves and is compiled from three separate surveys. The NOAA MRIP provides bimonthly (wave) estimates of fish harvested by for-hire charter and private boats fishing in the marine waters of Florida, Alabama, Mississippi, and Louisiana. Preliminary wave estimates from the MRIP are typically available 45 days after the end of the wave and final estimates for the previous year are usually released during April. Harvest estimates from private and charter boats fishing in the marine waters of Texas are provided by the Texas Parks and Wildlife Division (TPWD) Creel Survey. The TPWD estimates can be delayed anywhere from six months to a year or more. The NOAA Southeast Headboat Survey (HBS) provides estimates of harvest from head boats in the marine waters of all states in the Gulf of Mexico. The HBS estimates are also often delayed by more than a year.
Our analysis uses bimonthly (wave) estimates of aggregate recreational fishing harvest (whole weight) of red snapper in the Gulf of Mexico between 2004 and 2013. In 2013, MRIP implemented an improved sampling design for the onsite Access Point Angler Intercept Survey (APAIS) that is used to estimate the mean harvest per angler fishing trip. Revised harvest estimates for 2013 were released in December of 2014 along with recalibrated estimates of harvest for 2004 to 2012. However, the nowcasting thought experiment used in this analysis to test competing forecast models requires that we develop models using only the information available at the time of the forecast.
The seasonal and trend Loess decomposition of the harvest series is displayed in
The historic red snapper fishing seasons and bag limits for the federal waters of the Gulf of Mexico are shown in
Previous projections estimated landings per day (in numbers) during summer when tourism is high and weather tends to be better than fall. Because the Gulf Council is proposing to reopen in September or October, landings per day are not expected to be as high as summer due to lower tourism, more severe weather, and children returning to school. Because the season has not been open in fall since 2010, and prior to then not since 2007, it is difficult to predict fall landings per day…Season lengths presented herein are contingent on previous projections accurately estimating the length of the fishing season under a 4.145 mp quota. Given inconsistent regulations and historical overages of the quota, there is potential that the summer season length (Jun 1-Jun 28) was too long and may result in a quota overage. Similarly, given the short duration of the season, it may have been too short and the quota might not have been met. If landings during summer are greater than the current 4.145 mp quota, then season lengths presented herein will be overestimated. If landings during summer are less than the current 4.145 mp quota, then average weights and/or landings per day have been overestimated, and season lengths presented herein would be underestimated.
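| Year | Season Start | Season End | Bag Limit | Days Open |
|---|---|---|---|---|
| 2004 | 04-21 | 10-31 | 4 | 190 |
| 2005 | 04-21 | 10-31 | 4 | 190 |
| 2006 | 04-21 | 10-31 | 4 | 190 |
| 2007 | 04-26 | 10-31 | 2 | 185 |
| 2008 | 06-01 | 08-05 | 2 | 64 |
| 2009 | 06-01 | 08-15 | 2 | 74 |
| 2010 | 06-01 | 07-24 | 2 | 52 |
| | 10-01 | 11-21 (Fri, Sat, Sun only) | 2 | 24 |
| 2011 | 06-01 | 07-19 | 2 | 47 |
| 2012 | 06-01 | 07-17 | 2 | 45 |
| 2013 | 06-01 | 06-29 | 2 | 28 |
| | 10-01 | 10-14 | 2 | 14 |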
Note that reopening a season is a lengthy administrative process. Thus, the harvest estimates for the May-Jun wave of 2013 were not available when the decision to reopen the fall season was made. It was assumed that the quota was harvested exactly during the summer season. Data from 2011 and 2012 were used to project the harvest information necessary to set the fall season.
Though not shown in
Google provides two tools to examine the periodic volume of queries that users enter into the Google internet search engine. The first, Google Trends, allows users to download monthly or weekly indices of the web search volume for a particular term. The index value for each period is calculated by dividing the query count for the term by the total count for all queries in that period from the U.S.; the result is then scaled such that the highest volume in the series is assigned 100 and the lowest volume is assigned 0. Importantly, the normalization controls for the growth in all Internet search use over time.
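As a rough illustration of the normalization just described, the sketch below computes a Trends-style index from raw counts. The counts, the function name `trends_style_index`, and the exact min-max scaling are assumptions made for illustration; this is not Google's actual implementation.

```python
def trends_style_index(term_counts, total_counts):
    """Scale a term's share of all queries to a 0-100 index (illustrative only)."""
    if len(term_counts) != len(total_counts):
        raise ValueError("series must have the same length")
    # Share of all queries in each period; this removes growth in overall search use.
    shares = [t / n for t, n in zip(term_counts, total_counts)]
    lo, hi = min(shares), max(shares)
    if hi == lo:
        return [0.0 for _ in shares]  # a flat series carries no information
    # Highest share maps to 100 and lowest to 0, as described in the text.
    return [100.0 * (s - lo) / (hi - lo) for s in shares]


# Hypothetical counts: raw counts for the term grow, but so does total search use.
term = [10, 40, 20, 80]
total = [1000, 2000, 2000, 4000]
index = trends_style_index(term, total)
```

In this made-up example the raw count in the last period is eight times the first, yet the index only distinguishes the two share levels (1% and 2% of all queries), which is the point of dividing by total search volume.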
The second Google tool, Google Correlate, returns the search volume for any term along with the volume of searches for the top correlated terms. The tool will also return the top internet search terms that are correlated with any data series you enter. All series from the Google Correlate tool are further standardized to have a mean value of zero and a variance of one [
We downloaded the standardized monthly series from 2004 to 2013 for the term “red snapper season” in the U.S. from the Google Correlate tool. This series contains more information than the Google Trends series because Google Trends sets the search volume to 0 for periods where the volume is below a certain (unspecified) threshold. In this case, the index for the term “red snapper season” was 0 in most months prior to 2011 in the Google Trends series, but not the Google Correlate series.
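The standardization applied to the Google Correlate series (mean zero, variance one) can be sketched as follows. The input values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variance definition the tool uses.

```python
from statistics import fmean, pstdev


def standardize(series):
    """Return the series rescaled to mean 0 and (population) variance 1."""
    mu = fmean(series)
    sigma = pstdev(series)  # assumption: population, not sample, standard deviation
    if sigma == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(x - mu) / sigma for x in series]


# Hypothetical monthly search volumes with a single seasonal peak.
raw = [0.0, 0.0, 3.0, 12.0, 5.0, 1.0]
z = standardize(raw)
```

Unlike a thresholded index, the standardized series keeps small but nonzero values distinct from one another, which is why low-volume months remain informative.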
There are many other search terms that might be correlated with red snapper harvest. Some are too general (e.g., red snapper) because the terms refer to behavior not related to fishing (e.g., recipes). Other terms (e.g., red snapper bag limit) are too specific because there simply is not enough search volume. We started with the most obvious term, “red snapper season”, and it worked reasonably well. As noted above, the Google Correlate tool will return the top internet search terms correlated with any data series you enter.
| Correlation | Search Term |
|---|---|
| 0.8006 | coleslaw recipes |
| 0.7932 | stray kittens |
| 0.7918 | six flags over ga |
| 0.7880 | nebraska softball |
| 0.7857 | behr deck stain |
| 0.7732 | steel cooler |
| 0.7731 | kill poison ivy |
| 0.7695 | poisonous snakes |
| 0.7635 | sand filter |
| 0.7612 | king island |
| 0.7607 | red spider |
| 0.7547 | injured bird |
| 0.7541 | sandusky weather |
| 0.7539 | deck coating |
| 0.7539 | delonghi pinguino |
| 0.7533 | deadhead |
| 0.7532 | citronella |
| 0.7529 | cookout ideas |
| 0.7519 | ice chest |
| 0.7510 | hayward super pump |
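A ranking like the one tabulated above can be reproduced in spirit with a plain Pearson correlation between a target series and each candidate search series. The sketch below is only an illustration: the series, the term names, and the helper `rank_terms` are hypothetical, and Google Correlate's internal matching method is not described in the text.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def rank_terms(target, candidates):
    """Return (correlation, term) pairs sorted from most to least correlated."""
    scored = [(pearson(target, series), term) for term, series in candidates.items()]
    return sorted(scored, reverse=True)


# Hypothetical bimonthly harvest and three made-up search series.
harvest = [0.1, 0.3, 2.0, 1.8, 0.4, 0.1]
candidates = {
    "term_a": [1, 2, 9, 8, 2, 1],  # peaks in the same waves as harvest
    "term_b": [5, 5, 4, 5, 5, 4],  # nearly flat
    "term_c": [9, 8, 1, 2, 8, 9],  # peaks in the off-season
}
ranking = rank_terms(harvest, candidates)
```

Any series that shares the same seasonal shape will score highly under this measure, whether or not it has anything to do with fishing.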
Our harvest data is in bimonthly periods and the internet search volume data is monthly. Rather than aggregating the monthly internet search volume data to bimonthly observations, we chose to consider the search volume of both months in a two-month period separately in our discussion and models.
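One way to line up the monthly search series with the bimonthly harvest waves, keeping the two months separate rather than averaging them, is sketched below. The function name and the example values are hypothetical; the only assumption about the data is that the monthly series starts in January and covers whole waves.

```python
def split_monthly_by_wave(monthly):
    """Pair consecutive months into (first_month, second_month) tuples, one per wave."""
    if len(monthly) % 2 != 0:
        raise ValueError("monthly series must cover whole two-month waves")
    return [(monthly[i], monthly[i + 1]) for i in range(0, len(monthly), 2)]


# Hypothetical standardized search volumes for one calendar year (Jan..Dec).
search_2013 = [-0.5, -0.4, 0.1, 0.6, 2.1, 3.0, 0.2, -0.3, 0.4, 0.9, -0.6, -0.7]
waves = split_monthly_by_wave(search_2013)
first_month, second_month = waves[2]  # third wave of the year (May and June)
```

Each wave then contributes two regressors, one per month, instead of a single aggregated search value.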
There is approximately a two-month lag in the preliminary estimates of the recreational harvest of red snapper in the Gulf of Mexico; that is, the earliest estimated harvest for the two-month period just completed is available at the end of the following two-month period. Therefore, managers do not know whether the quota has been exceeded at the end of a period until two months later. We consider three different approaches to nowcast the harvest level of the period just completed. The approaches are based on three different forecasting models. The first model is a naive prediction that assumes the current period harvest is the same as the harvest level in the same period of the previous year. This nowcasting approach typically works well for series with stable seasonality [
Our next model addresses the seasonal pattern evident in the harvest time series (see
We evaluate the fit of each of the three model specifications and the suitability for forecasting using data from 2004 through 2011. The data for 2012 and 2013 are used later to evaluate the ability of the best fitting specification to nowcast the harvest in each two-month period of 2012 and 2013 with days open to red snapper fishing. Model specification goodness-of-fit is examined using the adjusted
The performance of each model in nowcasting harvest is examined by comparing the actual harvest in periods of 2012 and 2013 that have some days open for red snapper fishing with the model predictions for these periods. For any given model the harvest for each period is predicted by estimating the model with data up to the period with open days and then calculating a one-step-ahead forecast. For example, the harvest nowcast for May-Jun of 2012 is based on the models estimated using data up to Mar-Apr of 2012. Note that in this example the third (complete) model would use the internet search volume information for May-Jun to help predict May-Jun harvest. We compare the nowcasting ability of the second and third models to evaluate the potential difference in nowcasts with and without the internet search volume information. This follows the approach used by others who have examined whether search volume can improve the best forecasts available using the data available at the time of forecast [
As a check on the policy relevance of the modeling results we examine the ability of a model to determine whether the harvest has exceeded the quota during a period with open days. Two performance measures are examined. For each model, the first performance measure takes the cumulative harvest from the previous periods and adds the one-step-ahead forecast for the current period, i.e.,
The parameter estimates and diagnostics for the three potential forecasting models using the data through 2011 are shown in
| | Naive | Without Search | With Search |
|---|---|---|---|
| Intercept | 0.000 (0.000) | 0.454 (0.241) | 1.354 |
| Mar-Apr | | 0.315 (0.168) | 0.185 (0.156) |
| May-June | | 1.815 | 1.157 |
| Jul-Aug | | 1.799 | 0.919 |
| Sep-Oct | | 0.621 | 0.119 (0.227) |
| Nov-Dec | | 0.129 (0.172) | 0.048 (0.131) |
| Harvest(t-1) | | 0.098 (0.127) | 0.021 (0.092) |
| Harvest(t-6) | | 0.212 (0.165) | 0.057 (0.148) |
| Deepwater Horizon | | 0.990 | 1.360 |
| Days Closed in Wave | | 0.007 (0.004) | 0.023 |
| Search in 1st Month in Wave | | | 0.148 (0.434) |
| Search in 2nd Month in Wave | | | 0.890 |
| Search in 1st Month x Days Closed | | | 0.001 (0.010) |
| Search in 2nd Month x Days Closed | | | 0.018 |
| R^{2} | 0.581 | 0.863 | 0.942 |
| Adj. R^{2} | 0.581 | 0.830 | 0.920 |
| Num. obs. | 48 | 48 | 48 |
| RMSE | 0.778 | 0.321 | 0.220 |
| Residual mean | 0.029 | 0.000 | 0.000 |
| Box-Ljung p-value | 0.000 | 0.418 | 0.508 |
| Shapiro-Wilk p-value | 0.000 | 0.000 | 0.006 |
| Breusch-Pagan p-value | 0.000 | 0.473 | 0.190 |
***
**
*
The coefficients are scaled to million pounds.
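To make the structure of the complete specification concrete, the sketch below assembles the regressors named in the table for a single two-month wave. It is a minimal illustration, not the authors' code: the function `design_row`, the 0/1 coding of the Deepwater Horizon indicator, and the treatment of Jan-Feb as the omitted baseline wave (inferred from the absence of a Jan-Feb row) are all assumptions, and the example inputs are hypothetical.

```python
WAVE_DUMMIES = {2: "Mar-Apr", 3: "May-June", 4: "Jul-Aug", 5: "Sep-Oct", 6: "Nov-Dec"}


def design_row(wave, harvest_lag1, harvest_lag6, dwh, days_closed, search_m1, search_m2):
    """Build the regressors for one two-month wave (wave runs from 1 to 6)."""
    if wave not in range(1, 7):
        raise ValueError("wave must be between 1 and 6")
    row = {"Intercept": 1.0}
    # Assumption: Jan-Feb (wave 1) is the omitted baseline, so it gets no dummy.
    for k, label in WAVE_DUMMIES.items():
        row[label] = 1.0 if wave == k else 0.0
    row["Harvest(t-1)"] = harvest_lag1   # harvest in the previous wave
    row["Harvest(t-6)"] = harvest_lag6   # harvest in the same wave one year earlier
    row["Deepwater Horizon"] = 1.0 if dwh else 0.0  # assumed 0/1 indicator
    row["Days Closed in Wave"] = float(days_closed)
    row["Search in 1st Month in Wave"] = search_m1
    row["Search in 2nd Month in Wave"] = search_m2
    # Interactions let the effect of search volume depend on how long the wave was closed.
    row["Search in 1st Month x Days Closed"] = search_m1 * days_closed
    row["Search in 2nd Month x Days Closed"] = search_m2 * days_closed
    return row


# Hypothetical May-June wave with 31 closed days and high search in the second month.
example = design_row(wave=3, harvest_lag1=0.2, harvest_lag6=1.5, dwh=False,
                     days_closed=31, search_m1=0.4, search_m2=2.0)
```

Because the search series is standardized, a value of zero corresponds to its mean, so both interaction columns are zero in a period with average search volume.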
The residual mean in all models, except the naive, is zero, indicating that the models can produce unbiased forecasts. The null of independence in the Ljung-Box test cannot be rejected at the 95% level in all models, except the naive. The null of normality in the Shapiro-Wilk test is rejected at the 95% level in all models. The null of a constant residual variance (homoscedasticity) in the Breusch-Pagan test cannot be rejected at the 95% level in any model, except the naive. Thus, the complete model comes close to satisfying all of the tests for a good forecast model, passing the tests for constant residual variance and residual independence, but not the test for residual normality. The normality of the residuals is not essential, but does make calculating confidence intervals easier. In any case, the complete model specification including the seasonality, the fishery closures, and the internet search volume appears to fit the historic data best and provides the best forecasting properties. Based on this finding we proceed with the complete model for the one-step-ahead nowcast of the harvest in each two-month period of 2012 and 2013 with days open to red snapper fishing. However, we also generate the nowcasts using the specification (model 2) without the internet search volume so that we can compare the nowcasts with and without the internet search volume.
Before discussing the results of the nowcasts for 2012 and 2013, we briefly review the estimated parameters from the complete specification (the last column) in
The effects of the number of days closed to red snapper fishing in a wave and the internet search volume are more complicated because of the interaction terms. At the average level of internet search volume in a period (i.e.,
| | 2012.2 | 2012.3 | 2013.2 | 2013.4 |
|---|---|---|---|---|
| Intercept | 1.355 | 1.295 | 1.366 | 1.680 |
| Mar-Apr | 0.180 (0.139) | 0.229 (0.128) | 0.129 (0.115) | 0.047 (0.193) |
| May-June | 1.152 | 1.284 | 1.026 | 0.932 |
| Jul-Aug | 0.913 | 1.075 | 0.829 | 0.828 |
| Sep-Oct | 0.115 (0.215) | 0.186 (0.201) | 0.010 (0.187) | 0.127 (0.285) |
| Nov-Dec | 0.047 (0.123) | 0.038 (0.122) | 0.015 (0.115) | 0.132 (0.195) |
| Harvest(t-1) | 0.021 (0.089) | 0.031 (0.088) | 0.001 (0.086) | 0.157 (0.106) |
| Harvest(t-6) | 0.059 (0.141) | 0.031 (0.102) | 0.078 (0.095) | 0.177 (0.153) |
| Deepwater Horizon | 1.362 | 1.233*** (0.205) | 1.377 | 1.639 |
| Days Closed in Wave | 0.023 | 0.022 | 0.022 | 0.027 |
| Search in 1st Month in Wave | 0.144 (0.418) | 0.025 (0.397) | 0.223 (0.370) | 0.102 (0.617) |
| Search in 2nd Month in Wave | 0.892 | 0.912 | 0.733 | 1.531 |
| Search in 1st Month x Days Closed | 0.000 (0.009) | 0.001 (0.009) | 0.001 (0.008) | 0.007 (0.014) |
| Search in 2nd Month x Days Closed | 0.018 | 0.018 | 0.017 | 0.030 |
| R^{2} | 0.944 | 0.950 | 0.947 | 0.920 |
| Adj. R^{2} | 0.923 | 0.932 | 0.931 | 0.896 |
| Num. obs. | 50 | 51 | 56 | 58 |
| RMSE | 0.214 | 0.213 | 0.218 | 0.372 |
***
**
*
The coefficients are scaled to million pounds.
We use
The cumulative harvest in waves 3 and 5 of 2013 is nowcast using
The mean absolute error (MAE) of the nowcasts over the four periods considered in 2012 and 2013 is 1.88, 2.14, and 1.48, respectively, for the naive approach, the model without search volume, and the model with search volume. The model with internet search volume performs best, followed by the naive approach and then closely by the model without internet search volume. Overall, the model with internet search volume nowcasts 31% better than the model without internet search volume and 22% better than the naive prediction.
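The relative comparison can be worked through from the three MAE values quoted above. The sketch below is illustrative: the helper names are hypothetical, and the inputs are the rounded figures from the text, so the percentages it produces are approximate. In particular, the rounded inputs give an improvement of a little over 21% against the naive prediction, slightly below the 22% reported, which may reflect unrounded values in the original calculation.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty series of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pct_improvement(mae_baseline, mae_model):
    """Percentage reduction in MAE of a model relative to a baseline."""
    return 100.0 * (mae_baseline - mae_model) / mae_baseline


# Rounded MAE values as quoted in the text.
mae_naive, mae_without_search, mae_with_search = 1.88, 2.14, 1.48

vs_without_search = pct_improvement(mae_without_search, mae_with_search)
vs_naive = pct_improvement(mae_naive, mae_with_search)
```

The same `mean_absolute_error` helper would be applied to the actual and nowcast harvest in each open period to obtain the MAE values in the first place.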
Fisheries in the United States are managed with hard caps on harvest for any given year. Staying under the cap throughout the year can be difficult, especially for recreational fishing where preliminary harvest estimates are usually unavailable until more than a month after the monitoring period is completed. This can be long after the season ends. We examined the potential for using nowcasting with internet search information to generate predictions of the recreational harvest before official estimates are available. The Gulf of Mexico recreational red snapper fishery has been managed for over ten years using effort-based controls such as bag limits and seasonal closures. During this time, the bag limit has been cut in half, and the season has been reduced from nearly 200 days to less than 20 days. Despite these measures, harvest has exceeded the allowable catch in recent years, and the lag in reporting harvest estimates is a significant impediment to successfully managing the fishery.
Estimates of the recreational harvest of red snapper in the Gulf of Mexico are generated in bimonthly waves and preliminary estimates are available around 45 days after the completion of each wave. Harvest estimates from the same wave in the previous year are a simple, intuitive proxy for the estimate of harvest in the wave just completed. This “naive” approach works well when fishery conditions (seasons, angler effort, and catch rates) are relatively stable. Indeed, the naive approach produced the best nowcasts for the open waves of 2012 when the season was similar to the previous year. However, we show that the naive approach can be problematic when current fishery conditions are considerably different than conditions during the previous year. For instance, a harvest prediction for the current period will not be available if the same period was closed to fishing during the previous year. In these cases, a more flexible modeling approach can be helpful.
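The naive approach, and the case in which it breaks down, can be sketched in a few lines. The function name, the convention of returning `None` when the same wave was closed in the previous year, and the example values are assumptions made for illustration.

```python
def naive_nowcast(harvest, days_open, t, waves_per_year=6):
    """Nowcast wave t with the harvest from the same wave one year earlier.

    Returns None when no usable value exists, e.g. when the same wave was
    closed to fishing in the previous year.
    """
    prev = t - waves_per_year
    if prev < 0 or prev >= len(harvest):
        return None  # no observation one year earlier
    if days_open[prev] == 0:
        return None  # season was closed in that wave last year
    return harvest[prev]


# Hypothetical previous-year data (million pounds) for waves 1..6 (indices 0..5).
harvest = [0.0, 0.0, 3.1, 1.2, 0.0, 0.0]
days_open = [0, 0, 30, 17, 0, 0]

summer = naive_nowcast(harvest, days_open, t=8)   # May-Jun of the current year
fall = naive_nowcast(harvest, days_open, t=10)    # Sep-Oct, closed last year
```

Here the summer wave inherits last year's value, while a fall reopening has no usable naive prediction because the corresponding wave was closed a year earlier.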
Our results support a more flexible modeling approach for nowcasting harvest during 2013 when harvest levels were unprecedented due to changes in seasons, angler effort, and average fish size. A regression model using internet search information generates nowcasts that are nearly 30% better than the naive prediction and the model without search information. This result is consistent with improvements in nowcasts with internet search information found in other (nonfishery) applications [
The regression modeling approach we propose is also useful because it makes uncertainty explicit. In the face of missing or delayed information on current harvest, fishery managers need to decide on the amount of overharvesting risk they are willing to tolerate. We show how our approach can be used to quantify this risk in terms of a probability of exceeding the quota. Understanding the risk of exceeding the harvest quota is very important when managers consider whether there is enough quota remaining to reopen the season for some period in the later part of the year as was done in the red snapper fishery during 2013. Due to data reporting delays in 2013, managers were not aware that the quota had already been exceeded by a considerable margin during the summer season. Our nowcasting model with internet search information would have predicted a nearly 20% probability that the quota had been exceeded during the summer season of 2013. Fishery managers could have considered this information when deciding to reopen the fishery in the fall of 2013. Staying within quotas has become more important recently for U.S. recreational fisheries where accountability measures have been introduced that require overages in one year to be “repaid” in the following year with tighter regulations on harvest.
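One simple way to turn a nowcast into a probability of exceeding the quota is to treat the forecast error as normally distributed. The sketch below makes that assumption explicitly; it is an approximation rather than the authors' calculation, the function name is hypothetical, the cumulative harvest from earlier periods is treated as known, and all numbers are made up for illustration.

```python
from statistics import NormalDist


def prob_quota_exceeded(cumulative_harvest, nowcast, forecast_se, quota):
    """P(cumulative harvest + current-period harvest > quota) under normal errors."""
    if forecast_se <= 0:
        raise ValueError("forecast standard error must be positive")
    # Assumption: forecast error is normal with the given standard error.
    total = NormalDist(mu=cumulative_harvest + nowcast, sigma=forecast_se)
    return 1.0 - total.cdf(quota)


# Hypothetical values in million pounds.
p = prob_quota_exceeded(cumulative_harvest=0.5, nowcast=3.0, forecast_se=0.5, quota=4.0)
```

In this example the predicted total sits one standard error below the quota, so the probability of an overage is roughly 16%. A manager with a lower risk tolerance could shorten or decline to reopen the season on that basis.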
It is important to note that the relationship between the “red snapper season” search term and red snapper harvest might change over time. This could occur if, for example, there are changes in the number of people using Google to find information about fishing relative to the population of potential anglers in the fishery of interest. In our case study, we attempted to keep the forecast model current by reestimating the model using all of the data available up to the point of the nowcast. In other cases, though, if the relationship between internet search activity and harvest is suspected to have changed significantly, then it may be necessary to drop some of the data from early years when reestimating the model.
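The two re-estimation strategies mentioned here, using all data up to the nowcast or dropping early years, differ only in the training window. The sketch below illustrates the difference with a deliberately simple stand-in model (the average of past values from the same wave); it is not the regression model of this paper, and the function names and data are hypothetical.

```python
def training_window(n_available, max_obs=None):
    """Indices used for estimation: all past data, or only the most recent max_obs."""
    start = 0 if max_obs is None else max(0, n_available - max_obs)
    return range(start, n_available)


def same_wave_mean(history, period=6):
    """Stand-in model: mean of past values in the same wave as the next observation."""
    same_wave = history[-period::-period]  # one, two, ... years before the target
    if not same_wave:
        raise ValueError("need at least one full year of history")
    return sum(same_wave) / len(same_wave)


def one_step_nowcasts(series, first_target, fit_predict, max_obs=None):
    """Re-estimate before every target period and predict one step ahead."""
    nowcasts = {}
    for t in range(first_target, len(series)):
        history = [series[i] for i in training_window(t, max_obs)]
        nowcasts[t] = fit_predict(history)
    return nowcasts


# Hypothetical bimonthly harvest with a summer peak that grows each year.
series = [0, 0, 2, 1, 0, 0,
          0, 0, 3, 2, 0, 0,
          0, 0, 4, 3, 0, 0]
expanding = one_step_nowcasts(series, first_target=12, fit_predict=same_wave_mean)
rolling = one_step_nowcasts(series, first_target=12, fit_predict=same_wave_mean, max_obs=6)
```

With a relationship that drifts over time, the rolling window tracks recent conditions more closely, at the cost of estimating from fewer observations.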
There are several ways the nowcasting approach presented in this paper could be extended. First, as already noted, it would be useful to see if the harvest nowcasting improvements we found with internet search volume hold for other fisheries or other natural resource management situations. Second, the regression model used to predict harvest could include the internet search volume measured at different times during each two-month period. Google Trends provides weekly estimates of search volume which gives eight potential estimates that could be used in our model of bimonthly harvest. Lastly, the regression approach could potentially be improved with other timely indicators of harvest behavior. This could include the volume of other internet search terms or multiple search terms combined in an index of sportfishing interest. In addition, the early work using social media activity to nowcast economic phenomena is promising [