Survey Statistics Overview
Sampling populations, calculating weighted estimates, and accounting for errors, outliers, and uncertainty are important steps in our work to produce recreational catch statistics.
NOAA Fisheries' Marine Recreational Information Program is committed to implementing carefully designed surveys, collecting high-quality data, and producing sound fisheries statistics that meet science and management needs. Here, we’ve outlined some of the fundamental concepts behind the survey statistics that guide our work.
Sampling
Our suite of recreational fishing surveys includes both census and sample designs. A census collects information from all members of a target population, and is feasible when the target population is known (e.g., all federally permitted for-hire vessels in a particular geographic region). A sample collects information from a randomly selected and representative part of the population, and is an effective method of collecting information when it’s not possible or practical to conduct a census.
There are two broad categories of sampling methods: probability sampling and non-probability sampling.
In a probability sample survey, each member of the target population has a known chance of being sampled. When properly designed and implemented, probability samples are representative of the target population and provide data that can be used to produce unbiased estimates.
In a non-probability sample survey, the chance that any given member of the target population will be sampled is unknown. For this reason, there is no guarantee that a non-probability sample will produce unbiased estimates. Examples of non-probability samples include:
- Convenience sampling, in which members of a population are sampled based on the relative ease of reaching them.
- Snowball sampling, in which each individual who is sampled refers an acquaintance to be sampled next.
- Volunteer, or opt-in, sampling, in which the sample member self-selects into the survey (for instance, through the use of a volunteer angler reporting app).
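To make the idea of a "known chance of being sampled" concrete, here is a minimal Python sketch of a simple random sample. The frame of 1,000 anglers, their trip counts, and the sample size are all invented for illustration; this is not the program's estimation code. Because every member of the frame has the same known inclusion probability (n / N), each sampled value can be weighted by the inverse of that probability to estimate a population total.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical frame: 1,000 anglers, each with a made-up number of trips.
population = [random.randint(0, 12) for _ in range(1000)]
true_total = sum(population)

# Simple random sample without replacement: every member of the frame
# has the same, known inclusion probability n / N.
n = 100
N = len(population)
inclusion_prob = n / N
sample = random.sample(population, n)

# Weight each sampled value by the inverse of its inclusion probability
# (a Horvitz-Thompson style estimate of the population total).
weight = 1 / inclusion_prob
estimated_total = sum(weight * y for y in sample)

print(f"true total: {true_total}, estimated total: {estimated_total:.0f}")
```

In a convenience, snowball, or opt-in sample, `inclusion_prob` is unknown, so a weight like this cannot be calculated from the design.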
Sample Frame
A sample frame is the list of population members from which a sample is drawn. The sample frame for the Access Point Angler Intercept Survey, for example, is a list of public fishing access sites where anglers can be interviewed after they complete their fishing trips. The Fishing Effort Survey samples from a list of residential mailing addresses supplemented with information from state-based recreational fishing license and registration programs.
Sample Size
Sample size describes the number of units measured in a sample survey. If, for example, you draw 10 marbles at random from a bag of 100 black and white marbles to estimate the number of each color in the bag, your sample size would be 10.
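The marble example can be sketched in a few lines of Python. The 60/40 split between black and white marbles is an assumption made only so the sketch has something to draw from; in practice the split is the unknown being estimated.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical bag of 100 marbles; the 60/40 split is an assumption.
bag = ["black"] * 60 + ["white"] * 40

sample_size = 10
draw = random.sample(bag, sample_size)  # draw 10 marbles without replacement

# Scale the count of black marbles in the sample up to the size of the bag.
estimated_black = draw.count("black") * len(bag) / sample_size
estimated_white = len(bag) - estimated_black

print(f"estimated black: {estimated_black}, estimated white: {estimated_white}")
```

Running the sketch with different seeds gives different estimates, which is the sampling variability that a larger sample size helps reduce.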
Different factors can inform the “ideal” sample size, from the size of your target population to the margin of error you’re willing to allow. While increasing sample size is one way to improve the precision of your estimates, actual sample size will ultimately depend on the resources you have and the cost of surveying each member of the target population.
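As a rough illustration of the trade-off between sample size and margin of error, the sketch below uses the textbook normal approximation for a proportion. It assumes a simple random sample from a large population; the function name `margin_of_error` and the example values are hypothetical, and more complex survey designs need design-specific variance formulas.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated
    from a simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# Quadrupling the sample size halves the margin of error.
for n in (100, 400, 1600):
    print(n, margin_of_error(0.5, n))
```

Because precision improves with the square root of the sample size, each additional gain in precision costs more than the last, which is why available resources often set the practical limit.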
Stratification
Stratification is a sampling technique that divides a population into sub-populations, or strata, before samples are drawn. This improves the precision of our estimates, and allows us to produce estimates specific to each stratum. Sampling for the Access Point Angler Intercept Survey is stratified across time (year, month, time of week, and time of day), geographic area, and the predominant fishing mode at a particular sample site. Sampling for the Fishing Effort Survey is stratified across state, sub-state region (defined by how close a household is to the coast), and the fishing license or registration status of each residential household.
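A minimal sketch of how a stratified estimate can be assembled is shown below. The two strata, their population sizes, and the sampled trip counts are invented for illustration and do not reflect actual survey data. Each stratum is estimated separately, then the stratum means are combined in proportion to each stratum's share of the population.

```python
from statistics import mean

# Hypothetical strata: population size (N) and trips reported by
# a handful of sampled households in each stratum.
strata = {
    "coastal": {"N": 2000, "sample": [3, 5, 4, 6, 2]},
    "inland": {"N": 8000, "sample": [0, 1, 0, 2, 1]},
}

N_total = sum(s["N"] for s in strata.values())

# Estimate the mean separately within each stratum.
stratum_means = {name: mean(s["sample"]) for name, s in strata.items()}

# Combine the strata, weighting each mean by its share of the population.
overall_mean = sum(
    s["N"] / N_total * stratum_means[name] for name, s in strata.items()
)
estimated_total = overall_mean * N_total

print(stratum_means, overall_mean, estimated_total)
```

Taking a plain average of all ten sampled households would overstate the overall mean here, because the coastal stratum makes up half of the sample but only a fifth of the population.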
Clustering
Clustering is a sampling technique that places population members into logical groups before samples are drawn. Public fishing access sites on the Access Point Angler Intercept Survey sample frame are clustered by anticipated fishing pressure and geographic location, forming groups of either one site with high fishing activity or two geographically close sites with less fishing activity. This increases the efficiency of our shoreside intercepts.
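The grouping described above can be pictured with a toy example. The site list, the distances, the 5 km threshold (`MAX_GAP_KM`), and the pairing rule are all assumptions made for illustration; the actual procedure used to build clusters for the survey is not described here and may differ.

```python
# Hypothetical sites: (site id, position along the coast in km, expected pressure)
sites = [
    ("A", 0.0, "high"),
    ("B", 2.0, "low"),
    ("C", 3.5, "low"),
    ("D", 10.0, "high"),
    ("E", 12.0, "low"),
    ("F", 13.0, "low"),
]

MAX_GAP_KM = 5.0  # assumed limit for calling two sites "geographically close"

clusters = []
pending = None  # a low-pressure site waiting for a nearby partner

for site_id, km, pressure in sorted(sites, key=lambda s: s[1]):
    if pressure == "high":
        clusters.append([site_id])  # a busy site forms its own cluster
        continue
    if pending is not None and km - pending[1] <= MAX_GAP_KM:
        clusters.append([pending[0], site_id])  # pair two nearby quiet sites
        pending = None
    else:
        if pending is not None:
            clusters.append([pending[0]])  # no close partner was found
        pending = (site_id, km)

if pending is not None:
    clusters.append([pending[0]])  # leftover unpaired site

print(clusters)
```

Once clusters are formed, the cluster rather than the individual site becomes the unit that is drawn in the sample.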
Multi-stage Sampling
Multi-stage sampling is a sampling technique in which samples are drawn in nested stages, using progressively smaller sampling units. The Access Point Angler Intercept Survey involves four sampling stages: a site-day-time interval; the duration of the sampling assignment; the angler trip that is intercepted; and catch, by species, on an individual angler trip, which is subsampled for individual fish length and weight measurements. Keeping track of each stage is a necessary part of weighting, because a sampled unit's weight depends on its chance of selection at every stage.
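One common way to connect nested stages to weighting is to multiply the selection probabilities across stages and take the inverse. The sketch below shows that arithmetic with invented probabilities for the first three stages; the values are hypothetical, and the source does not confirm this as the program's exact weighting procedure.

```python
from math import prod

# Hypothetical selection probabilities at each nested stage
# for one intercepted angler trip. All values are invented.
stage_probs = {
    "site-day-time interval": 0.05,
    "duration of the sampling assignment": 0.5,
    "angler trip intercepted": 0.25,
}

# The overall chance of selection is the product across stages.
overall_prob = prod(stage_probs.values())

# The base weight is the inverse of the overall selection probability:
# the number of trips in the population this sampled trip stands in for.
base_weight = 1 / overall_prob

print(overall_prob, base_weight)
```

With these made-up values, one intercepted trip would represent roughly 160 trips in the population.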
Estimation
Estimating total recreational catch is a multi-step process. First, we calculate catch rate, or the average number of fish caught per angler trip. Then, we calculate fishing effort, or the total number of angler trips taken by residents of sampled states. Finally, we multiply catch rate by effort to estimate total recreational catch. Statistical weighting allows each sampled unit to represent both itself and the members of the broader population we weren’t able to sample. In this way, we’re able to draw conclusions about the full recreational fishing community without having to collect information from each member of that community.
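The three steps can be sketched with a few lines of arithmetic. The catch counts, weights, and effort figure below are invented, and the weighted mean is a simplification of what a production estimation system would do.

```python
# Hypothetical intercept data: fish caught on each sampled angler trip,
# and a sample weight for each trip (all values invented).
catch_per_trip = [0, 2, 1, 3, 0, 4, 2, 0]
weights = [100, 100, 200, 200, 100, 100, 100, 100]

# Step 1: catch rate, the weighted average number of fish per angler trip.
catch_rate = sum(w * y for w, y in zip(weights, catch_per_trip)) / sum(weights)

# Step 2: fishing effort, the estimated total number of angler trips
# (hypothetical value standing in for an effort survey estimate).
estimated_effort = 250_000

# Step 3: total catch is catch rate multiplied by effort.
estimated_total_catch = catch_rate * estimated_effort

print(catch_rate, estimated_total_catch)
```

Catch rate and effort come from different surveys, so the uncertainty in both carries through to the final catch estimate.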
Reducing the Potential for Errors
All surveys include some amount of error. Some errors are inherent to the act of sampling. These sampling errors are quantifiable, which means we can determine the extent of their impact on our estimates. The size of the sampling error depends on the size of the sample, the design of the sample, and the natural variability within the population being sampled. (Increasing sample size, for example, generally decreases sampling error.) Sampling errors impact the precision of our estimates.
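The link between sample size and sampling error can be illustrated with a small simulation. The population below is invented, and the helper name `spread_of_estimates` is hypothetical. The sketch draws many repeated simple random samples at each sample size and measures how much the resulting estimates vary.

```python
import random
from statistics import mean, stdev

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 anglers with a made-up catch per trip.
population = [random.randint(0, 10) for _ in range(10_000)]

def spread_of_estimates(sample_size, replicates=500):
    """Standard deviation of the sample mean across repeated
    simple random samples of the given size."""
    estimates = [
        mean(random.sample(population, sample_size)) for _ in range(replicates)
    ]
    return stdev(estimates)

# Larger samples give estimates that cluster more tightly together.
results = {n: spread_of_estimates(n) for n in (25, 100, 400)}
for n, spread in results.items():
    print(n, round(spread, 3))
```

The spread shrinks as the sample size grows, which is what it means for sampling error to be quantifiable: it can be estimated from the design and the data.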
Errors that are not due to sampling are known as non-sampling errors. These errors can occur in both census and sample surveys, and can impact the precision and accuracy of our estimates. Common non-sampling errors include:
- Coverage error, which occurs when members of a target population are omitted, duplicated, or wrongly included in a sample frame.
- Measurement error, which occurs when a respondent provides an incorrect response to a survey question (e.g., because the question is ambiguous, poorly worded, or inconsistently asked; because the respondent can’t recall an activity or event; or because the respondent has intentionally misreported their response, which can occur if a question is controversial or asks about a sensitive topic).
- Non-response error, which occurs when a respondent is unable or unwilling to respond to a survey.
- Data processing error, which can occur while entering, coding, editing, or otherwise preparing survey data.
To reduce the potential for survey errors, the Marine Recreational Information Program:
- Follows best practices in our recreational fishing data collection programs, including offering small incentives to increase response rates, applying correction factors to account for undercoverage, and validating responses, where possible, to account for measurement error.
- Conducts extensive research to improve existing surveys and test new methods.
- Uses quality assurance and quality control procedures to prevent invalid data from entering our system, detect and correct errors, and systematically identify outlier estimates.