Using the Data
Below is a primer on selecting and using the data and tools appropriate for your purposes.
Introduction to the Data
What kinds of datasets are available? Information about available datasets can be found in the Glossary and the MRIP Read Me document, which describes each of the public use survey datasets and the template programs that are available. Variable descriptions and formats can be found in the following files: Survey Data Variables and Estimate Data Variables.
Why are there non-integer catch counts? There are two ways that we end up with non-integer catch counts: 1) grouped catch and 2) incomplete shore mode trips.
In our standard estimation, type A grouped catches require a different sample weight than type B1 or B2 catches, which are always recorded for individual angler trips. However, we did not want users of the public-use datasets to have to manage two different sample weights, because that complicates calculating combined (A + B1) landings. To avoid this situation and use only one sample weight, we multiply the claim counts (A) on records with grouped catch by an adjustment factor.
For shore mode assignments, we allow samplers to intercept incomplete trips under specific conditions. In these cases, anglers are asked to estimate the amount of additional time that they will continue fishing. This additional fishing time is used to expand the catch counts recorded during our interviews with fishermen.
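As a rough illustration of why this expansion produces non-integer counts, the sketch below scales an observed count by the ratio of projected total fishing time to time already fished. The proportional formula and the names `expand_catch`, `hours_fished`, and `hours_remaining` are assumptions made for this example; the text above does not spell out the exact calculation used in MRIP estimation.

```python
def expand_catch(observed_count, hours_fished, hours_remaining):
    """Scale an observed catch count to the projected full trip length.

    Assumes catch accrues at a constant rate over the trip, which is an
    illustrative simplification and not a documented MRIP formula.
    """
    if hours_fished <= 0:
        raise ValueError("hours_fished must be positive")
    if hours_remaining < 0:
        raise ValueError("hours_remaining cannot be negative")
    return observed_count * (hours_fished + hours_remaining) / hours_fished


# An angler with 3 fish after 2 hours who expects to fish 1 more hour.
expanded = expand_catch(3, 2.0, 1.0)
print(expanded)  # 4.5
```

A completed trip (no remaining time) keeps its original count, so only incomplete intercepts pick up fractional values.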
What is the difference between PRT_CODE and LEADER? The PRT_CODE and LEADER codes often have the same values, but they provide different information. The PRT_CODE is the ID_CODE of the party leader. A fishing party includes everyone who fished during the same boat trip. Within a party, there may be multiple groups with separate grouped catches. These groups each have distinct LEADER codes. Headboat trips generally have multiple groups with grouped catch (and therefore multiple LEADER codes) within the same fishing party (PRT_CODE). In the majority of private/rental (PR) and charter (CH) trips, there is only one group in the party, so the PRT_CODE will equal LEADER.
How is grouped catch recorded? If type A catch is grouped, it will only be reported under the leader's ID_CODE. B1, B2, and all A catch that is not grouped are reported separately by individual ID_CODE. We are working on a modified public-use dataset that will eliminate the grouped catch and greatly simplify many analyses.
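To make the relationship between these codes concrete, the sketch below counts the distinct LEADER values within each PRT_CODE. The field names ID_CODE, PRT_CODE, and LEADER come from the description above; the record values and the helper `groups_per_party` are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical interview records: one small party with a single group,
# and one headboat-style party with two groups.
records = [
    {"ID_CODE": "A1", "PRT_CODE": "A1", "LEADER": "A1"},
    {"ID_CODE": "A2", "PRT_CODE": "A1", "LEADER": "A1"},
    {"ID_CODE": "H1", "PRT_CODE": "H1", "LEADER": "H1"},
    {"ID_CODE": "H2", "PRT_CODE": "H1", "LEADER": "H1"},
    {"ID_CODE": "H3", "PRT_CODE": "H1", "LEADER": "H3"},
    {"ID_CODE": "H4", "PRT_CODE": "H1", "LEADER": "H3"},
]


def groups_per_party(rows):
    """Return {PRT_CODE: number of distinct LEADER codes in that party}."""
    leaders = defaultdict(set)
    for row in rows:
        leaders[row["PRT_CODE"]].add(row["LEADER"])
    return {party: len(codes) for party, codes in leaders.items()}


group_counts = groups_per_party(records)
print(group_counts)  # {'A1': 1, 'H1': 2}
```

A party with a count of 1 is the common PR or CH case where PRT_CODE equals LEADER; a count above 1 signals separate grouped catches inside one party.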
Access to the Metadata
Access to metadata (data describing the data) is also critical for understanding and using MRIP data. You can access metadata for MRIP (and other NOAA Fisheries data) through InPort.
Limitations of the Data
When working with MRIP datasets, users should be aware of the limitations outlined below.
Revisions: All preliminary estimates will likely be revised before being posted as final. The direction and magnitude of such revisions are unpredictable.
Percent standard error: The percent standard error, or PSE, is a measure of precision presented with all estimates. It expresses the standard error as a percentage of the estimate. Estimates should be viewed with increasing caution as PSEs increase beyond 25.
Large PSEs (those above 50) indicate high variability around the estimate and therefore low precision.
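The PSE guidance can be sketched in a few lines. The calculation below uses the standard definition of PSE (standard error divided by the estimate, times 100), and the 25 and 50 cutoffs come from the text above. The function names and the wording of the returned notes are illustrative choices for this example, not official MRIP categories.

```python
def percent_standard_error(estimate, standard_error):
    """Return the standard error as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("PSE is undefined for an estimate of zero")
    return 100.0 * standard_error / abs(estimate)


def precision_note(pse):
    """Map a PSE to the caution levels described in the text."""
    if pse > 50:
        return "large PSE: low precision"
    if pse > 25:
        return "elevated PSE: use increasing caution"
    return "PSE at or below 25"


pse = percent_standard_error(12000, 4200)
print(pse, precision_note(pse))  # 35.0 elevated PSE: use increasing caution
```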
Granularity: During the year, we produce preliminary estimates by sampling wave, mode of fishing, and state. These estimates, particularly at lower levels of aggregation, may be imprecise because of small sample sizes. For this reason, MRIP estimates are best viewed in aggregate—annually and at the state or regional level.
Time series: When comparing catch estimates across an extended time series, note differences in sampling coverage through the years. Some estimates may not be comparable over long time series. For more information about changes in our sampling and coverage, see Program Evolution.
Fish weight estimates: USE CAUTION WITH WEIGHT DATA! Fish weight estimates are minimums and may not reflect the actual total fish weight landed or harvested.
Fish weight estimates before 2004: We calculated weight estimates by multiplying the estimated number harvested in a cell (year/wave/state/mode/area/species) by the mean weight of the measured fish in that cell. Sometimes we have an estimate of harvest but no mean weight, either because the harvest is all reported by the anglers (B1), or the interviewers couldn't weigh any fish (too big, already gutted and gilled, etc.).
If a cell is missing a mean weight and we have at least two fish measured in the state (all fishing areas and modes combined), we substitute the entire state’s mean for that wave. We also need two measured fish to estimate the variance.
After state substitution, if the mean weight is still missing, we use the mean from the whole subregion for that wave. In such cases, the "two fish rule" still applies.
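The pre-2004 substitution order (cell, then state, then subregion) can be sketched as a simple fallback chain. This is a minimal illustration, not the production estimation code: the function names are hypothetical, and using a cell mean whenever at least one fish was measured is an assumption, since the text states the two-fish minimum only for the state and subregion substitutions.

```python
from statistics import mean


def mean_weight_with_fallback(cell_weights, state_weights, subregion_weights):
    """Return (mean_weight, source) using cell -> state -> subregion order.

    State and subregion means require at least two measured fish
    (the "two fish rule"). The cell-level rule is an assumption.
    """
    if cell_weights:
        return mean(cell_weights), "cell"
    if len(state_weights) >= 2:
        return mean(state_weights), "state"
    if len(subregion_weights) >= 2:
        return mean(subregion_weights), "subregion"
    return None, "missing"


def harvest_weight(estimated_number, mean_weight):
    """Weight estimate = estimated number harvested x mean fish weight."""
    if mean_weight is None:
        return None
    return estimated_number * mean_weight


# No fish were weighed in the cell, but two were measured in the state.
mean_wt, source = mean_weight_with_fallback([], [1.5, 2.5], [1.0, 2.0, 3.0])
total = harvest_weight(1000, mean_wt)
print(source, total)  # state 2000.0
```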
Fish weight estimates from 2004 to present: As part of the MRIP re-estimation project, we recalculated all estimates of landings by weight (lb. or kg) using the same design-based estimation methodology used to recalculate the catch estimates in numbers of fish.
During the MRIP re-estimation project, the MRIP team also developed a new method to handle missing weights. The new method uses a mix of hot and cold deck imputation as well as length-weight modeling. In hot and cold deck imputation, we fill in—or impute—missing length or weight values by species at the individual angler trip level.
For individual fish records where lengths are present, we impute missing weights using length-weight modeling of the form: Weight = a*Length^b. In most cases, models are fit by species and two-month wave in the current year. Should a model fail to converge, models are fit by species using the most recent 10 years of data.
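To show how a model of the form Weight = a*Length^b can fill in a missing weight, the sketch below estimates a and b by ordinary least squares on the log-transformed values (log Weight = log a + b * log Length). This log-log fit is a common simplification chosen for the example; it is not necessarily the fitting procedure MRIP uses, which the text describes only as a model that may fail to converge. The data are synthetic and the function names are hypothetical.

```python
import math


def fit_length_weight(lengths, weights):
    """Estimate a and b in Weight = a * Length**b via log-log least squares."""
    xs = [math.log(length) for length in lengths]
    ys = [math.log(weight) for weight in weights]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b


def predict_weight(length, a, b):
    """Impute a weight for a fish with a known length."""
    return a * length ** b


# Synthetic complete cases generated from a = 0.01, b = 3 (not real data).
lengths = [20.0, 30.0, 40.0, 50.0]
weights = [0.01 * length ** 3 for length in lengths]

a, b = fit_length_weight(lengths, weights)
imputed = predict_weight(35.0, a, b)
print(round(a, 4), round(b, 4), round(imputed, 2))
```

Because the synthetic data follow the model exactly, the fit recovers the generating parameters; real measurements would scatter around the curve.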
For intercepted angler trips with landings but no corresponding length and weight measurements, we impute paired length and weight observations from complete cases using hot and cold deck imputation. We conduct up to five rounds of imputation in an attempt to fill in missing values. The rounds begin with imputation cells that correspond to the most detailed MRIP estimation cells. These are aggregated to higher levels in subsequent rounds to bring in more length-weight data:
- Round 1: current year, wave, subregion, state, mode, area fished, species
- Round 2: current year, half-year, subregion, state, mode, species
- Round 3: current + most recent prior year, wave, subregion, state, mode, area fished, species
- Round 4: current + most recent prior year, subregion, state, mode, species
- Round 5: current + most recent prior year, subregion, species
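The round structure above can be sketched as a loop that widens the matching cell until a donor is found. This is a simplified illustration under stated assumptions: the record layout, the field names, and the helper functions are hypothetical; "half-year" is derived here by treating waves 1 to 3 as the first half of the year; and a donor is drawn at random from the matching complete cases, which may differ from how donors are actually selected.

```python
import random

# Each round: (how many prior years donors may come from, fields to match).
ROUNDS = [
    (0, ("wave", "subregion", "state", "mode", "area", "species")),
    (0, ("half", "subregion", "state", "mode", "species")),
    (1, ("wave", "subregion", "state", "mode", "area", "species")),
    (1, ("subregion", "state", "mode", "species")),
    (1, ("subregion", "species")),
]


def with_half(record):
    """Copy a record and add a half-year field (assumes six waves per year)."""
    out = dict(record)
    out["half"] = 1 if record["wave"] <= 3 else 2
    return out


def impute(recipient, donors, rng):
    """Return (length, weight, round_number) from a donor, or None."""
    target = with_half(recipient)
    pool = [with_half(d) for d in donors]
    for number, (years_back, fields) in enumerate(ROUNDS, start=1):
        matches = [
            d for d in pool
            if 0 <= target["year"] - d["year"] <= years_back
            and all(d[f] == target[f] for f in fields)
        ]
        if matches:
            donor = rng.choice(matches)
            return donor["length"], donor["weight"], number
    return None  # still missing after all five rounds


base = {"subregion": "S1", "state": "ST", "mode": "PR",
        "area": "A1", "species": "SP1"}
donors = [
    dict(base, year=2020, wave=3, area="A2", length=40.0, weight=2.0),
    dict(base, year=2019, wave=5, state="ZZ", length=45.0, weight=2.6),
]
recipient = dict(base, year=2020, wave=2)

result = impute(recipient, donors, random.Random(0))
print(result)  # (40.0, 2.0, 2)
```

In this example no donor shares the recipient's wave and area, so Round 1 finds nothing and the match comes from the half-year cell in Round 2.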
For all years: If fish weights are still missing after all imputation methods have been applied, we leave the fish weight estimate missing. At that point, it is up to the user to determine whether to substitute a value and which substitution is most appropriate (a mean from the preceding and following waves, the whole year, the same wave across years, the whole Atlantic and Gulf Coasts, or other model-based approaches). We don't make those decisions because the information needs and sensitivity of the data vary among species.
Missing fish weights are more common for rarely caught species and for large fish (e.g., tuna). To see whether, and to what extent, weights are missing for your query, check the column labeled “Landings (no.) without Size Information” in the weight estimates query output. This column provides the number of landed (A+B1) fish that are not included in the weight estimate column, labeled “Harvest (A+B1) Total Weight (lb. or kg).” If the “Landings (no.) without Size Information” column contains a 0 value, then all landed fish are included in the weight estimate.
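As one way to gauge how much a weight estimate understates the harvest, the sketch below computes the share of landed fish covered by the weight estimate and then applies a user-chosen substitute mean weight to the uncovered fish. The substitution shown (the average of the mean weights from the preceding and following waves) is only one of the possible approaches and is not a recommendation; the numbers are made up and the function names are hypothetical.

```python
def size_coverage(landings_number, without_size):
    """Fraction of landed (A+B1) fish represented in the weight estimate."""
    if landings_number <= 0:
        raise ValueError("landings_number must be positive")
    return (landings_number - without_size) / landings_number


def filled_weight(reported_weight, without_size, substitute_mean_weight):
    """Add a user-chosen weight for the fish that had no size information."""
    return reported_weight + without_size * substitute_mean_weight


# Made-up query output: 1,000 fish landed, 200 without size information,
# and a reported harvest weight of 2,000 for the remaining fish.
neighbor_means = [2.0, 3.0]  # mean weights in the adjacent waves
substitute = sum(neighbor_means) / len(neighbor_means)

coverage = size_coverage(1000, 200)
adjusted = filled_weight(2000.0, 200, substitute)
print(coverage, adjusted)  # 0.8 2500.0
```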
Please review the Glossary for other important tips on using MRIP data.
We developed the MRIP queries to address our most common survey data and estimate requests. For further customization, statistical analysis software (SAS) template programs and public-use survey datasets are available to data customers through the download query and on the downloads page.
For more information on available template programs and survey data, please review the MRIP Read Me.