Bayesian inference from the conditional genetic stock identification model

Publication: Canadian Journal of Fisheries and Aquatic Sciences, 25 June 2018, https://doi.org/10.1139/cjfas-2018-0016

Abstract

Genetic stock identification (GSI) estimates stock proportions and individual assignments through comparison of genetic markers with reference populations. It is used widely in anadromous fisheries to estimate the impact of oceanic harvest on riverine populations. Here, we provide a formal, explicit description of Bayesian inference in the conditional GSI model, documenting an approach that has been widely used in the last 5 years, but not formally described until now. Subsequently, we describe a novel cross-validation method that permits accurate prediction of GSI accuracy when making Bayesian inference from the conditional GSI model. We use cross-validation and simulation of genetic data to confirm the occurrence of a bias in reporting-unit proportions recently reported in Hasselman et al. (2016). Then, we introduce a novel parametric bootstrap approach to reduce this bias, and we demonstrate the efficacy of our correction. Our methods have been implemented as a user-friendly R package, rubias, which makes use of Rcpp for computational efficiency. We predict rubias will be widely useful for GSI of fish populations.
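As a rough illustration of what Bayesian inference in a conditional GSI model can look like, the following Python sketch estimates mixing proportions for a toy data set. It is not the rubias implementation. It assumes biallelic diploid loci, a symmetric prior of 0.5 on allele frequencies, and a Dirichlet prior of 1/K on the mixing proportions; all function names and the toy data are hypothetical. The model is "conditional" in the sense that genotype likelihoods are computed once from the baseline alone and are not updated with fish from the mixture.

```python
import random


def genotype_likelihood(genotype, alt_count, total_count, prior=0.5):
    """P(genotype | baseline allele counts) at one biallelic locus.

    The allele frequency is integrated out against a Beta prior (assumed
    symmetric with parameter `prior`), so two allele draws follow a
    beta-binomial distribution. `genotype` is the number of alternate
    alleles carried by the fish: 0, 1, or 2.
    """
    a = alt_count + prior
    b = total_count - alt_count + prior
    denom = (a + b) * (a + b + 1)
    if genotype == 2:
        return a * (a + 1) / denom
    if genotype == 1:
        return 2 * a * b / denom
    return b * (b + 1) / denom


def fish_likelihoods(fish, baseline):
    """Likelihood of one multilocus genotype under each baseline population."""
    likelihoods = []
    for population in baseline:
        lik = 1.0
        for genotype, (alt, total) in zip(fish, population):
            lik *= genotype_likelihood(genotype, alt, total)
        likelihoods.append(lik)
    return likelihoods


def gibbs_mixing_proportions(mixture, baseline, sweeps=2000, burn_in=500, seed=1):
    """Posterior mean mixing proportions from a simple Gibbs sampler."""
    rng = random.Random(seed)
    k = len(baseline)
    pi_prior = [1.0 / k] * k  # assumed Dirichlet prior parameters
    # Computed once and held fixed: this is what makes the model conditional.
    liks = [fish_likelihoods(fish, baseline) for fish in mixture]
    pi = [1.0 / k] * k
    sums = [0.0] * k
    for sweep in range(sweeps):
        # Step 1: sample each fish's population of origin given pi.
        counts = [0] * k
        for lik in liks:
            weights = [p * l for p, l in zip(pi, lik)]
            origin = rng.choices(range(k), weights=weights)[0]
            counts[origin] += 1
        # Step 2: sample pi from its Dirichlet full conditional,
        # built from normalized gamma variates.
        gammas = [rng.gammavariate(c + a, 1.0) for c, a in zip(counts, pi_prior)]
        total = sum(gammas)
        pi = [g / total for g in gammas]
        if sweep >= burn_in:
            for j in range(k):
                sums[j] += pi[j]
    kept = sweeps - burn_in
    return [s / kept for s in sums]


# Toy baseline: per population, (alternate allele count, total alleles) per locus.
baseline = [
    [(90, 100), (85, 100), (95, 100), (88, 100)],  # hypothetical population A
    [(10, 100), (15, 100), (5, 100), (12, 100)],   # hypothetical population B
]
# Toy mixture: six fish resembling A, two resembling B.
mixture = [[2, 2, 2, 2]] * 4 + [[2, 1, 2, 2]] * 2 + [[0, 0, 0, 0]] * 2

estimate = gibbs_mixing_proportions(mixture, baseline)
print(estimate)
```

With this toy mixture, the estimated proportion for population A should be the larger of the two. A real analysis would also summarize posterior uncertainty and aggregate populations into reporting units, which this sketch omits.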

References

Anderson E.C. and Garza J.C. 2005. A description of full parental genotyping. Technical Report, Pacific Salmon Commission.
Anderson E.C. and Garza J.C. 2006. The power of single nucleotide polymorphisms for large-scale parentage inference. Genetics, 172: 2567–2582.
Anderson E.C. and Thompson E.A. 2002. A model-based method for identifying species hybrids using multilocus genetic data. Genetics, 160: 1217–1229.
Anderson E.C., Kalinowski S.T., and Waples R.S. 2008. An improved method for predicting the accuracy of genetic stock identification. Can. J. Fish. Aquat. Sci. 65(7): 1475–1486.
Araujo H.A., Candy J.R., Beacham T.D., White B., and Wallace C. 2014. Advantages and challenges of genetic stock identification in fish stocks with low genetic resolution. Trans. Am. Fish. Soc. 143: 479–488.
Beacham T.D., Candy J.R., Jonsen K.L., Supernault J., Wetklo M., and Deng L. 2006. Estimation of stock composition and individual identification of Chinook salmon across the Pacific Rim by use of microsatellite variation. Trans. Am. Fish. Soc. 135: 861–888.
Beacham T.D., Wallace C.G., MacConnachie C., Jonsen K., McIntosh B., Candy J.R., Devlin R.H., and Withler R.E. 2017. Population and individual identification of Coho Salmon in British Columbia through parentage-based tagging and genetic stock identification: an alternative to coded-wire tags. Can. J. Fish. Aquat. Sci. 74(9): 1391–1410.
Campbell N.R., Harmon S.A., and Narum S.R. 2015. Genotyping-in-Thousands by sequencing (GT-seq): a cost effective SNP genotyping method based on custom amplicon sequencing. Mol. Ecol. Resour. 15: 855–867.
Clemento A.J., Crandall E.D., Garza J.C., and Anderson E.C. 2014. Evaluation of a single nucleotide polymorphism baseline for genetic stock identification of Chinook Salmon (Oncorhynchus tshawytscha) in the California Current large marine ecosystem. Fish. Bull. 112: 112–131.
Dawson K.J. and Belkhir K. 2001. A Bayesian approach to the identification of panmictic populations and the assignment of individuals. Genet. Res. 78: 59–77.
Debevec E.M., Gates R.B., Masuda M., Pella J., Reynolds J., and Seeb L.W. 2000. SPAM (Version 3.2): Statistics program for analyzing mixtures. J. Hered. 91: 509–510.
de Finetti B. 1972. Probability, induction and statistics: the art of guessing. John Wiley & Sons, New York.
Diebolt J. and Robert C.P. 1994. Estimation of finite mixture distributions through Bayesian sampling. J. R. Stat. Soc. 56: 363–375.
Eddelbuettel D. 2013. Seamless R and C++ Integration with Rcpp. Springer, New York. ISBN 978-1-4614-6867-7.
Eddelbuettel D. and François R. 2011. Rcpp: Seamless R and C++ Integration. J. Stat. Softw. 40: 1–18.
Efron B. and Tibshirani R. 1993. An introduction to the bootstrap. Chapman & Hall, New York.
Falush D., Stephens M., and Pritchard J.K. 2003. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics, 164: 1567–1587.
Fournier D., Beacham T., Riddell B., and Busack C. 1984. Estimating stock composition in mixed stock fisheries using morphometric, meristic, and electrophoretic characteristics. Can. J. Fish. Aquat. Sci. 41(3): 400–408.
Gelman A., Carlin J.B., Stern H.S., and Rubin D.B. 2004. Bayesian Data Analysis, 2nd ed. Chapman and Hall, New York.
Grant W., Milner G., Krasnowski P., and Utter F. 1980. Use of biochemical genetic variants for identification of sockeye salmon (Oncorhynchus nerka) stocks in Cook Inlet, Alaska. Can. J. Fish. Aquat. Sci. 37(8): 1236–1247.
Hall P. and Martin M.A. 1988. On bootstrap resampling and iteration. Biometrika, 75: 661–671.
Hasselman D.J., Anderson E.C., Argo E.E., Bethoney N.D., Gephard S.R., and Post D.M. 2016. Genetic stock composition of marine bycatch reveals disproportional impacts on depleted river herring genetic stocks. Can. J. Fish. Aquat. Sci. 73(6): 951–963.
Hudson R.R. 2002. Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics (Oxford), 18: 337–338.
Johnson N.L., Kotz S., and Balakrishnan N. 1997. Discrete multivariate distributions. Wiley & Sons, New York.
Kalinowski S.T., Manlove K.R., and Taper M.L. 2007. ONCOR: a computer program for genetic stock identification. Technical report, Montana State University.
Larson W.A., Seeb L.W., Everett M.V., Waples R.K., Templin W.D., and Seeb J.E. 2014. Genotyping by sequencing resolves shallow population structure to inform conservation of Chinook salmon (Oncorhynchus tshawytscha). Evol. Appl. 7: 355–369.
Liu J.S., Wong W.H., and Kong A. 1994. Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika, 81: 27–40.
McCraney W.T., Farley E.V., Kondzela C.M., Naydenko S.V., Starovoytov A.N., and Guyon J.R. 2012. Genetic stock identification of overwintering chum salmon in the North Pacific Ocean. Environ. Biol. Fishes, 94: 663–668.
McKinney G.J., Seeb J.E., and Seeb L.W. 2017. Managing mixed-stock fisheries: genotyping multi-SNP haplotypes increases power for genetic stock identification. Can. J. Fish. Aquat. Sci. 74(4): 429–434.
Milner G.B., Teel D.J., Utter F.M., and Burley C.L. 1981. National Marine Fisheries Service, Columbia River stock identification study: validation of genetic method. Final Report to Bonneville Power Administration, Contract No. 1980BP18488, 57 electronic pages (BPA Report DOE/BP-18488-1). Technical report.
Moriya S., Sato S., Azumaya T., Suzuki O., Urawa S., Urano A., and Abe S. 2007. Genetic stock identification of chum salmon in the Bering Sea and North Pacific Ocean using mitochondrial DNA microarray. Mar. Biotechnol. 9: 179–191.
Neaves P.I., Wallace C.G., Candy J.R., and Beacham T.D. 2005. CBayes: Computer program for mixed stock analysis of allelic data. Version v3.0 [online]. Free program distributed by the authors over the internet. Available from http://www.pac.dfo-mpo.gc.ca/sci/mgl/Cbayes_e.htm. Technical report.
Pella J. and Masuda M. 2001. Bayesian methods for analysis of stock mixtures from genetic characters. Fish. Bull. (Seattle), 99: 151–167.
Piry S., Alapetite A., Cornuet J.M., Paetkau D., Baudouin L., and Estoup A. 2004. GENECLASS2: a software for genetic assignment and first-generation migrant detection. J. Hered. 95: 536–539.
Pritchard J.K., Stephens M., and Donnelly P. 2000. Inference of population structure using multilocus genotype data. Genetics, 155: 945–959.
R Core Team. 2016. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Rannala B. and Mountain J.L. 1997. Detecting immigration by using multilocus genotypes. Proc. Natl. Acad. Sci. U.S.A. 94: 9197–9201.
Sato S., Moriya S., Azumaya T., Suzuki O., Urawa S., Abe S., and Urano A. 2004. Genetic stock identification of chum salmon in the central Bering Sea and adjacent North Pacific Ocean by DNA microarray during the early falls of 2002 and 2003. NPAFC Doc. 793: 21.
Satterthwaite W.H., Ciancio J., Crandall E., Palmer-Zwahlen M.L., Grover A.M., and O’Farrell M.R. 2015. Stock composition and ocean spatial distribution inference from California recreational Chinook salmon fisheries using genetic stock identification. Fish. Res. 170: 166–178.
Seeb L.W. and Crane P.A. 1999. Allozymes and mitochondrial DNA discriminate Asian and North American populations of chum salmon in mixed-stock fisheries along the south coast of the Alaska Peninsula. Trans. Am. Fish. Soc. 128: 88–103.
Seeb L.W., Antonovich A., Banks A.A., Beacham T.D., Bellinger A.R., and Blankenship S.M. 2007. Development of a standardized DNA database for Chinook salmon. Fisheries, 32: 540–552.
Smouse P.E., Waples R.S., and Tworek J.A. 1990. A genetic mixture analysis for use with incomplete source population data. Can. J. Fish. Aquat. Sci. 47(3): 620–634.
Spielman R.S. and Smouse P.E. 1976. Multivariate classification of human populations. I. Allocation of Yanomama indians to villages. Am. J. Hum. Genet. 28: 317–331.

Supplementary Material

Supplementary data (cjfas-2018-0016suppla.pdf)

Information & Authors

Information

Published In

Canadian Journal of Fisheries and Aquatic Sciences
Volume 76, Number 4, 2019
Pages: 551 - 560

History

Received: 12 January 2018
Accepted: 31 May 2018
Published online: 25 June 2018

Authors

Affiliations

Benjamin M. Moran
Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 110 McAllister Road, Santa Cruz, CA 95060, USA.
Department of Marine and Environmental Sciences, Northeastern University, 360 Huntington Ave., Boston, MA 02115, USA.
Eric C. Anderson eric.anderson@noaa.gov
Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 110 McAllister Road, Santa Cruz, CA 95060, USA.

Cited by
1. Mixed-stock analyses of migratory, non-native Chinook salmon at sea and assignment to natal sites in fresh water at their introduced range in South America
2. Power of a dual‐use SNP panel for pedigree reconstruction and population assignment
3. A GT‐seq panel for walleye ( Sander vitreus ) provides important insights for efficient development and implementation of amplicon panels in non‐model organisms
4. Accurate estimation of conservation unit contribution to coho salmon mixed-stock fisheries in British Columbia, Canada, using direct DNA sequencing for single nucleotide polymorphisms
5. Resolving fine‐scale population structure and fishery exploitation using sequenced microsatellites in a northern fish
7. Close relatives in population samples: Evaluation of the consequences for genetic stock identification
8. Genetic assignment of individuals to source populations using network estimation tools
9. Genetic evidence of a northward range expansion in the eastern Bering Sea stock of Pacific cod
10. Comparison of coded-wire tagging with parentage-based tagging and genetic stock identification in a large-scale coho salmon fisheries application in British Columbia, Canada