Survey Statistics Overview
Sampling populations, calculating weighted estimates, and accounting for errors, outliers, and uncertainty are important steps in our work to produce recreational catch statistics.
The Marine Recreational Information Program (MRIP) is transparent about its work to collect recreational fishing data and produce estimates of total recreational catch. While our surveys use standard, proven, and peer-reviewed data collection and estimation methods, our catch statistics should not be viewed as actual population values. Instead, they are estimates, with a level of uncertainty reported as percent standard error.
The exact number and species of finfish caught in saltwater by recreational anglers fishing from shore, private boats, and for-hire vessels cannot be determined. A complete marine recreational fishing census, in which all saltwater recreational anglers report every fish they catch on every trip they take, would be impossible to administer and verify. We are confident in our estimates, however, because they are derived from sound survey methods that have been developed, tested, and independently reviewed by experts in statistical survey design.
Here, we’ve outlined some fundamental concepts behind the survey statistics that inform our work.
Our suite of recreational fishing surveys includes both census and sample designs. A census collects information from all members of a target population, and is feasible when the target population is known (e.g., all federally permitted for-hire vessels in a particular geographic region). A sample collects information from a randomly selected and representative part of the population, and is an effective method of collecting information when it’s not possible or practical to conduct a census. By constructing a comprehensive sample frame and using probability sampling methods, we ensure our samples reflect the characteristics of the larger group.
There are two broad categories of sampling methods: probability sampling and non-probability sampling.
In a probability sample survey, each member of the target population has a known, nonzero chance of being sampled. When properly designed and implemented, probability samples are representative of the target population and provide data that can be used to produce unbiased estimates.
In a non-probability sample survey, the chance that any given member of the target population will be sampled is unknown. For this reason, there is no guarantee that a non-probability sample will produce unbiased estimates. Examples of non-probability samples include:
- Convenience sampling, in which members of a population are sampled based on the relative ease of reaching them.
- Snowball sampling, in which each individual who is sampled refers an acquaintance to be sampled next.
- Volunteer, or opt-in, sampling, in which the sample member self-selects into the survey (for instance, through the use of an angler reporting app).
A sample frame is the list of population members from which a sample is drawn. The sample frame for the Access Point Angler Intercept Survey, for example, is a list of public fishing access sites where anglers can be interviewed after they complete their fishing trips. The Fishing Effort Survey samples from a list of residential mailing addresses supplemented with information from state-based recreational fishing license and registration programs.
Sample size describes the number of units measured in a sample survey. If, for example, you draw 10 marbles at random from a bag of 100 black and white marbles to estimate the number of each color in the bag, your sample size would be 10.
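The marble example can be sketched in a few lines of Python. This is only an illustration: the 60/40 split of black and white marbles, the seed, and the function name are assumptions made for the sketch, not values from our surveys.

```python
import random

def estimate_black_count(bag, sample_size, seed=None):
    """Estimate the number of black marbles in the bag from a simple random sample."""
    rng = random.Random(seed)
    sample = rng.sample(bag, sample_size)  # draw without replacement
    # Scale the count seen in the sample up to the size of the whole bag.
    return sample.count("black") * len(bag) / sample_size

# Hypothetical bag of 100 marbles; the true split is made up for illustration.
bag = ["black"] * 60 + ["white"] * 40

estimate = estimate_black_count(bag, sample_size=10, seed=1)
print(f"Estimated black marbles from a sample of 10: {estimate}")
```

Running the function with different seeds gives different estimates, which is the sample-to-sample variation that the sample size helps control.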
Different factors can inform the “ideal” sample size, from the size of your target population to the margin of error you’re willing to allow. While increasing sample size is one way to improve the precision of your estimates, actual sample size will ultimately depend on the resources you have and the cost of surveying each member of the target population.
Stratification is a sampling technique that divides a population into sub-populations, or strata, before samples are drawn. This improves the precision of our estimates, and allows us to produce estimates specific to each stratum. Sampling for the Access Point Angler Intercept Survey is stratified across time (year, month, time of week, and time of day), geographic area, and the predominant fishing mode at a particular sample site. Sampling for the Fishing Effort Survey is stratified across state, sub-state region (defined by how close a household is to the coast), and the fishing license or registration status of each residential household.
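As a rough sketch of how stratified data can be combined, the Python example below estimates a total for each stratum and then adds the stratum totals together. It assumes the number of units in each stratum is known and that units were sampled at random within each stratum. The strata names, sizes, and sample values are hypothetical.

```python
from statistics import mean

def stratified_total(strata):
    """Return per-stratum totals and their sum.

    Each stratum total is the stratum's population size times its sample mean.
    """
    per_stratum = {
        name: info["population_size"] * mean(info["sample"])
        for name, info in strata.items()
    }
    return per_stratum, sum(per_stratum.values())

# Hypothetical strata: fish caught per trip on sampled weekday and weekend trips.
strata = {
    "weekday": {"population_size": 200, "sample": [1, 0, 2, 1]},
    "weekend": {"population_size": 100, "sample": [3, 4, 2, 3]},
}

per_stratum, total = stratified_total(strata)
print(per_stratum, total)
```

Because each stratum is estimated separately, the same calculation yields both stratum-specific estimates and an overall estimate.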
Clustering is a sampling technique that places population members into logical groups before samples are drawn. Public fishing access sites on the Access Point Angler Intercept Survey sample frame are clustered by anticipated fishing pressure and geographic location, forming groups of either one site with high fishing activity or two geographically close sites with less fishing activity. This increases the efficiency of our shoreside intercepts.
Multi-stage sampling is a sampling technique in which samples are drawn in nested stages, using progressively smaller sampling units. The Access Point Angler Intercept Survey involves four sampling stages: a site-day-time interval; the duration of the sampling assignment; the angler trip that is intercepted; and catch, by species, on an individual angler trip, which is subsampled for individual fish length and weight measurements. Accounting for each of these stages is a necessary part of weighting.
Weighting is a statistical method that ensures each sampled unit is properly represented in a final estimate. It allows us to account for the fact that some members of a sample frame are more likely than others to be drawn in a sample or participate in a survey.
In basic weighting, the assigned weight of a sample unit equals the inverse of the probability that the unit will be included in a sample. If, for instance, you sample 10 anglers at random from a frame of 100, each angler would have a one in 10 chance of being selected for the sample. Each angler’s assigned weight, therefore, would be 10. If you sample 20 anglers from the same frame, each angler would have a two in 10 chance of being selected, and would have a weight of five. To estimate the number of fish caught by our target population in the second scenario, we would multiply the catch from each sampled angler by five and add the results together.
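The arithmetic above can be written as a short Python sketch. The function names and the list of catches are hypothetical and chosen only to mirror the 100-angler example; this is not the estimation code used for our surveys.

```python
def design_weight(frame_size, sample_size):
    """Return the basic weight: the inverse of the selection probability."""
    if not 0 < sample_size <= frame_size:
        raise ValueError("sample_size must be between 1 and frame_size")
    # The selection probability is sample_size / frame_size, so its inverse is:
    return frame_size / sample_size

def expanded_total(sampled_catches, weight):
    """Multiply each sampled angler's catch by the weight and sum the results."""
    return sum(catch * weight for catch in sampled_catches)

weight_10 = design_weight(frame_size=100, sample_size=10)  # one in 10 chance
weight_20 = design_weight(frame_size=100, sample_size=20)  # two in 10 chance

# Hypothetical catches reported by 20 sampled anglers (made-up values).
catches = [0, 1, 2, 0, 3, 1, 0, 0, 2, 1, 4, 0, 1, 2, 0, 1, 3, 0, 2, 1]
estimated_catch = expanded_total(catches, weight_20)
print(weight_10, weight_20, estimated_catch)
```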
More complicated survey designs may require different sub-populations to have different weights. To optimize the time interviewers spend in the field, the Access Point Angler Intercept Survey samples high-activity sites more often than low-activity sites, and the data collected from the two types of sites are weighted differently. Strata that have a higher chance of being selected for a sample are generally “down-weighted,” while strata that have a lower chance of being selected are generally “up-weighted.”
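To illustrate unequal weights, the sketch below applies inverse-probability weighting to site visits with different chances of selection. The selection probabilities and trip counts are invented for the example and do not reflect actual Access Point Angler Intercept Survey values.

```python
# Hypothetical site visits: (site type, probability of selection, angler trips observed).
site_visits = [
    ("high-activity", 0.5, 40),
    ("high-activity", 0.5, 36),
    ("low-activity", 0.125, 6),
]

def weighted_trip_total(visits):
    """Sum observed trips, weighting each visit by the inverse of its selection probability."""
    total = 0.0
    for _site_type, selection_probability, trips in visits:
        weight = 1 / selection_probability  # rarely selected sites get larger weights
        total += weight * trips
    return total

estimated_trips = weighted_trip_total(site_visits)
unweighted_sum = sum(trips for _, _, trips in site_visits)
print(estimated_trips, unweighted_sum)
```

In this made-up case the low-activity visit carries a weight of eight while each high-activity visit carries a weight of two, so the sites that were sampled less often are not under-represented in the total.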
Estimating total recreational catch is a multi-step process. First, we calculate catch rate, or the average number of fish caught per angler trip. Then, we calculate fishing effort, or the total number of angler trips taken by residents of sampled states. Finally, we multiply catch rate by effort to estimate total recreational catch. We produce estimates for all modes of fishing (including shore, private boat, and charter boat), all species caught, and three types of catch: observed fish that are caught, brought back to the dock, and identified by field interviewers; reported fish that are caught, released dead, used for bait, filleted, or otherwise unobservable by field interviewers; and reported fish released alive.
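The three steps can be sketched as follows. This simplified Python example uses an unweighted average for catch rate and treats the effort figure as given; in practice both quantities come from weighted survey data. All numbers and names are hypothetical.

```python
def estimate_total_catch(fish_per_sampled_trip, estimated_trips):
    """Return (catch rate, total catch) where total catch = catch rate x effort."""
    if not fish_per_sampled_trip:
        raise ValueError("at least one sampled trip is required")
    # Step 1: catch rate, the average number of fish caught per angler trip.
    catch_rate = sum(fish_per_sampled_trip) / len(fish_per_sampled_trip)
    # Steps 2 and 3: multiply catch rate by effort (total angler trips).
    return catch_rate, catch_rate * estimated_trips

# Hypothetical inputs: fish counted on five intercepted trips, and an effort estimate.
catch_rate, total_catch = estimate_total_catch([2, 0, 1, 3, 4], estimated_trips=10_000)
print(catch_rate, total_catch)
```

The same calculation would be repeated for each fishing mode, species, and catch type.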
Accounting for Errors, Outliers, and Uncertainty
All surveys include some amount of error. Non-sampling errors arise from sources other than the act of sampling, and can occur in both census and sample surveys. Variable non-sampling errors can impact estimate precision; systematic non-sampling errors can result in bias. Common non-sampling errors include:
- Coverage error, which occurs when members of a target population are omitted, duplicated, or wrongly included in a sample frame.
- Measurement error, which occurs when a respondent provides an incorrect response to a survey question (e.g., because the question is ambiguous, poorly worded, or inconsistently asked; because the respondent can’t recall an activity or event; or because the respondent has intentionally misreported their response, which can occur if a question is controversial or asks about a sensitive topic).
- Non-response error, which occurs when a sampled member of the population is unable or unwilling to respond to a survey.
- Data processing error, which can occur while entering, coding, editing, or otherwise preparing survey data.
Sampling errors are inherent in sample surveys, and can impact estimate precision. The size of the sampling error can depend on the size of the sample, the design of the sample, and natural variability within the population being sampled. (Increasing sample size, for example, generally decreases sampling error.)
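The relationship between sample size and sampling error can be illustrated with the textbook standard error of a proportion. The sketch assumes simple random sampling from a large population and ignores the stratification, clustering, and weighting used in real survey designs; the proportion of 0.5 is a hypothetical value.

```python
import math

def standard_error_of_proportion(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    if n <= 0 or not 0 <= p <= 1:
        raise ValueError("n must be positive and p must be between 0 and 1")
    return math.sqrt(p * (1 - p) / n)

# Hypothetical proportion of 0.5, evaluated at three sample sizes.
errors_by_sample_size = {
    n: standard_error_of_proportion(0.5, n) for n in (10, 100, 1000)
}
for n, se in errors_by_sample_size.items():
    print(f"n = {n:>4}: standard error = {se:.4f}")
```

Under these assumptions the standard error shrinks with the square root of the sample size, so a tenfold increase in sample size cuts it by roughly a factor of three rather than ten.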
We measure the precision of each point estimate we produce through its percent standard error, or PSE. This value indicates how far the point estimate is likely to deviate from the actual population value, expressed as a percentage of that estimate. The lower the PSE, the more precise the estimate.
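As a minimal sketch, PSE can be computed by dividing an estimate's standard error by the estimate and multiplying by 100. The function name and the example numbers below are hypothetical.

```python
def percent_standard_error(estimate, standard_error):
    """Express the standard error as a percentage of the point estimate."""
    if estimate == 0:
        raise ValueError("PSE is undefined when the estimate is zero")
    return 100 * standard_error / estimate

# Hypothetical catch estimate of 20,000 fish with a standard error of 3,000 fish.
pse = percent_standard_error(estimate=20_000, standard_error=3_000)
print(f"PSE = {pse:.1f}%")
```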
We work to reduce the potential for bias in our recreational fishing surveys by testing our survey designs; weighting our samples; including incentives to increase response rates; applying correction factors to account for undercoverage; validating responses, where possible, to account for measurement error; and following quality assurance and quality control procedures.
Outliers and Uncertainty
Because larger sample sizes generally produce more precise estimates, our estimates are best viewed annually at the state or regional scale. Viewing estimates at too small a scale—by two-month sampling wave, for instance, or by individual fishing mode—is more likely to produce outliers. In these circumstances, a small number of catch surveys is expanded to produce estimates that seem unrealistically high or low. Such estimates are almost always accompanied by high PSEs, which indicate they should be used with caution or considered highly imprecise. In such cases, fisheries scientists and managers may use statistically sound techniques to “smooth” high and low outliers.
Outliers can also arise when we attempt to produce catch estimates for species that are rarely encountered by field interviewers. By collecting more data at fishing sites that see higher levels of offshore fishing activity, we are addressing the need for more precise estimates of rare-event species. By helping our partners implement specialized methods of monitoring red snapper catch in the Gulf of Mexico, we are addressing the need for more precise estimates of a short-season species.