
Developing a strategy for identifying genetically important animals


Livestock researchers often need to sample animals within a breed so that the sample represents the breed as a whole. Identifying the most relevant animals for genotyping, building a reference population, or inclusion in a gene bank is a complex problem. A suboptimal sampling strategy can bias results, require additional sampling, and increase costs. When using public funds (e.g., federal grants or federal appropriations) or member fees (e.g., breed association funds), we have a responsibility to spend these investments efficiently by optimizing which animals are sampled before the research, genotyping, or gene banking begins. The first objective was to develop a sampling strategy that maximizes the genetic diversity captured by the sampled animals. Simulated data are well suited to this type of study because the testing parameters are not limited by the data available. The primary benefit of simulation in this research was having known genotypes for every animal in the population. Because genotypes will almost never be available for an entire real-world population, and identifying animals to genotype may itself be the purpose of the sampling, pedigree-based sampling methods were chosen. The sampling methods tested were optimal contribution selection (OCS) and the genetic conservation index (GCI). OCS selects parents by constraining their co-ancestry rather than by minimizing inbreeding. GCI seeks to maximize the number of founders in an animal's pedigree. The sampling strategy developed in Objective 1 was used to identify subsets of 100, 50, and 25 animals from each breed, and the genetic diversity captured by each sampling method was assessed using both quantitative and molecular methods. AlphaSimR was used to simulate the population for sampling.
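To make the GCI criterion mentioned above concrete, the following is a minimal Python sketch. It assumes the commonly cited definition of GCI as the reciprocal of the sum of squared founder contributions to an animal; the abstract does not state the exact formula or software used, so this is an illustration only. The pedigree, animal names, and function names are hypothetical, and the sketch assumes every non-founder has both parents recorded.

```python
# Hypothetical toy pedigree: animal -> (sire, dam).
# Founders are recorded with (None, None). Assumes complete pedigrees.
PEDIGREE = {
    "F1": (None, None),
    "F2": (None, None),
    "F3": (None, None),
    "F4": (None, None),
    "A": ("F1", "F2"),
    "B": ("F3", "F4"),
    "C": ("A", "B"),    # four founders, equal contributions
    "D": ("F1", "A"),   # two founders, unequal contributions
}


def founder_contributions(animal, pedigree):
    """Return {founder: expected proportion of the animal's genes}."""
    sire, dam = pedigree[animal]
    if sire is None and dam is None:
        # A founder contributes all of its own genes.
        return {animal: 1.0}
    contributions = {}
    for parent in (sire, dam):
        # Each parent passes on half of its founder contributions.
        for founder, share in founder_contributions(parent, pedigree).items():
            contributions[founder] = contributions.get(founder, 0.0) + 0.5 * share
    return contributions


def gci(animal, pedigree):
    """GCI under the assumed definition: 1 / sum of squared founder shares."""
    shares = founder_contributions(animal, pedigree).values()
    return 1.0 / sum(share * share for share in shares)


def rank_by_gci(pedigree, n):
    """Return the n animals with the highest GCI (ties broken by name)."""
    return sorted(pedigree, key=lambda a: (-gci(a, pedigree), a))[:n]
```

Under this definition, an animal whose genes trace evenly to many founders scores higher than one dominated by a few founders, which matches the stated aim of maximizing the number of founders in the pedigree. A pedigree-based sample of a given size could then be drawn with something like `rank_by_gci(PEDIGREE, 2)`.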
After an initial randomly mating founder population was developed, an additional 15 years of selection for phenotypic weaning weight were simulated, resulting in a fully genotyped population with 13,662 animals per year. The simulation was designed to represent a sheep population. After the sampling strategies were applied to the simulated population, they were applied to Suffolk sheep and Simmental beef populations to further assess their ability to capture genetic diversity. To assess population structure based on molecular data, the Suffolk and Simmental populations were limited to genotyped animals and their ancestors. The simulated population represented a large purebred population (n=204,930) with a moderate number of markers (n=53,901). The Suffolk population represented a small population (n=1,565) with many markers (n=606,006). Lastly, the Simmental population represented a large, admixed population (n=54,790) with a moderate number of markers (n=29,449). For the second objective, the population structure of each full population, composed of genotyped animals, was assessed and compared with the population structure of the animals selected by each sampling strategy. Each sampling strategy selected 100, 50, and 25 animals. Success in capturing the genetic diversity of the population was measured with a molecular-based metric: the proportion of the alleles available in the population that were captured in the sample. Other population structure measures included comparisons of a phenotypic trait, breeding values, inbreeding levels, heterozygosity, minor allele frequency (MAF) category classification, runs of homozygosity (ROH), effective population size (Ne), and model-based population structure to visualize subpopulations. While both sampling strategies were effective at capturing the available alleles in the population, OCS was more successful than GCI at the same sample size. Success in capturing alleles decreased as sample size decreased from 100 to 50 to 25.
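The allele-capture measure described above can be sketched as follows. This is a hedged illustration, not the method used in the research: it assumes biallelic markers coded as 0, 1, or 2 copies of the alternate allele, and counts an allele as available if at least one genotyped animal carries it. The animal IDs, genotypes, and function names are hypothetical.

```python
def alleles_present(genotype_rows):
    """Return the set of (marker index, allele) pairs seen in the rows.

    Genotypes are assumed to be coded as the count (0, 1, or 2) of the
    alternate allele at each biallelic marker.
    """
    present = set()
    for row in genotype_rows:
        for marker, genotype in enumerate(row):
            if genotype < 2:
                present.add((marker, "ref"))  # at least one reference copy
            if genotype > 0:
                present.add((marker, "alt"))  # at least one alternate copy
    return present


def percent_alleles_captured(population, sample_ids):
    """Percent of alleles available in the population found in the sample."""
    available = alleles_present(population.values())
    captured = alleles_present(population[animal] for animal in sample_ids)
    return 100.0 * len(captured & available) / len(available)


# Hypothetical toy population: 4 animals, 3 markers.
POPULATION = {
    "a1": [0, 0, 2],
    "a2": [0, 1, 2],
    "a3": [0, 0, 2],
    "a4": [2, 0, 2],
}

print(percent_alleles_captured(POPULATION, ["a1", "a3"]))
print(percent_alleles_captured(POPULATION, ["a2", "a4"]))
```

In this toy example the sample `["a1", "a3"]` misses the alternate alleles at the first two markers, which are carried by only one animal each, while `["a2", "a4"]` captures every available allele. This mirrors why low frequency alleles are the hardest to capture with a small sample.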
Overall, OCS with a sample of 100 animals (OCS 100) was the most successful at capturing the available alleles in the population, capturing 96.5, 99.3, and 99.9 percent of the alleles for the simulated, Suffolk, and Simmental populations, respectively. For a sampling strategy to be useful, it needs to be effective across a variety of species and breeds with a variety of breed histories and population sizes. The third objective was to compare the three populations evaluated in this research and the effectiveness of the sampling strategies across them. Population structure was compared for the three populations, and then the effectiveness of OCS 100 was compared. The three populations differed in population size and in the amount of admixture present. The simulated population was characterized by a large number of low frequency alleles (n=5,339) that proved difficult to capture. The Suffolk population was small and consisted of 14 distinct subpopulations. The Simmental population had high levels of heterozygosity and less distinct subpopulation structure. Despite these disparate populations, OCS 100 was the most robust strategy, consistently capturing the highest percentage of available alleles compared to the other sampling strategies. In summary, OCS 100 was the most effective sampling strategy across three different populations. A low-cost pedigree-based sampling strategy can be used to capture the genetic diversity in a population. Researchers will need to weigh the risk of a greater loss of alleles when selecting a smaller sample, a risk that could be reduced by increasing the number of animals selected. Knowledge of the prevalence of low frequency alleles in the population, and the value of capturing them, should also be considered.




genetic diversity
Ovis aries
Bos taurus
optimal contribution selection

