TY - JOUR
T1 - Effects of single nucleotide polymorphism ascertainment on population structure inferences
AU - Dokan, Kotaro
AU - Kawamura, Sayu
AU - Teshima, Kosuke M.
N1 - Publisher Copyright: © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.
PY - 2021
Y1 - 2021
N2 - Single nucleotide polymorphism (SNP) data are widely used in research on natural populations. Although they are useful, SNP genotyping data are known to contain bias, normally referred to as ascertainment bias, because they are conditioned by already confirmed variants. This bias is introduced during the genotyping process, including the selection of populations for novel SNP discovery and the number of individuals involved in the discovery panel and selection of SNP markers. It is widely recognized that ascertainment bias can cause inaccurate inferences in population genetics and several methods to address these bias issues have been proposed. However, especially in natural populations, it is not always possible to apply an ideal ascertainment scheme because natural populations tend to have complex structures and histories. In addition, it was not fully assessed if ascertainment bias has the same effect on different types of population structure. Here, we examine the effects of bias produced during the selection of population for SNP discovery and consequent SNP marker selection processes under three demographic models: the island, stepping-stone, and population split models. Results show that site frequency spectra and summary statistics contain biases that depend on the joint effect of population structure and ascertainment schemes. Additionally, population structure inferences are also affected by ascertainment bias. Based on these results, it is recommended to evaluate the validity of the ascertainment strategy prior to the actual typing process because the direction and extent of ascertainment bias vary depending on several factors.
AB - Single nucleotide polymorphism (SNP) data are widely used in research on natural populations. Although they are useful, SNP genotyping data are known to contain bias, normally referred to as ascertainment bias, because they are conditioned by already confirmed variants. This bias is introduced during the genotyping process, including the selection of populations for novel SNP discovery and the number of individuals involved in the discovery panel and selection of SNP markers. It is widely recognized that ascertainment bias can cause inaccurate inferences in population genetics and several methods to address these bias issues have been proposed. However, especially in natural populations, it is not always possible to apply an ideal ascertainment scheme because natural populations tend to have complex structures and histories. In addition, it was not fully assessed if ascertainment bias has the same effect on different types of population structure. Here, we examine the effects of bias produced during the selection of population for SNP discovery and consequent SNP marker selection processes under three demographic models: the island, stepping-stone, and population split models. Results show that site frequency spectra and summary statistics contain biases that depend on the joint effect of population structure and ascertainment schemes. Additionally, population structure inferences are also affected by ascertainment bias. Based on these results, it is recommended to evaluate the validity of the ascertainment strategy prior to the actual typing process because the direction and extent of ascertainment bias vary depending on several factors.
UR - http://www.scopus.com/inward/record.url?scp=85114394131&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85114394131&partnerID=8YFLogxK
U2 - 10.1093/g3journal/jkab128
DO - 10.1093/g3journal/jkab128
M3 - Article
C2 - 33871576
AN - SCOPUS:85114394131
SN - 2160-1836
VL - 11
JO - G3: Genes, Genomes, Genetics
JF - G3: Genes, Genomes, Genetics
IS - 9
M1 - jkab128
ER -