Open access

Signatures of selection analysis using whole-genome sequence data reveals novel candidate genes for pony and light horse types

Publication: Genome
14 May 2020


Natural selection and domestication have shaped modern horse populations, resulting in a vast range of phenotypically diverse breeds. Horse breeds are classified into three types (pony, light, and draft) generally based on their body type. Understanding the genetic basis of horse type variation and selective pressures related to the evolutionary trend can be particularly important for current selection strategies. Whole-genome sequences were generated for 14 pony and 32 light horses to investigate the genetic signatures of selection of the horse type in pony and light horses. In the overlapping extremes of the fixation index and nucleotide diversity results, we found novel genomic signatures of selective sweeps near key genes previously implicated in body measurements including C4ORF33, CRB1, CPN1, FAM13A, and FGF12 that may influence variation in pony and light horse types. This study contributes to a better understanding of the genetic background of differences between pony and light horse types.


La sélection naturelle et la domestication ont façonné les populations modernes de chevaux, ce qui a produit une vaste gamme de races distinctes sur le plan phénotypique. Les races de chevaux sont classifiées en trois types (poney, cheval de selle et cheval de trait) selon leur morphologie. Une compréhension de l’assise génétique de la variation pour la morphologie et des pressions sélectives liées à la tendance évolutive peut s’avérer importante pour les stratégies de sélection actuelles. Des séquences génomiques ont été générées pour 14 poneys et 32 chevaux de selle pour étudier les signatures génétiques de la sélection du type chevalin chez les poneys et les chevaux de selle. Au sein des extrêmes chevauchant pour l’indice de fixation et la diversité nucléotidique, les auteurs ont trouvé des signatures génomiques inédites de balayages sélectifs à proximité de gènes clés déjà rapportés comme étant impliqués dans le déterminisme de caractères morphologiques incluant C4ORF33, CRB1, CPN1, FAM13A et FGF12, lesquels influencent possiblement la variation pour la morphologie corporelle chez les poneys et les chevaux de selle. Cette étude fournit un éclairage qui contribuera à une meilleure compréhension de l’assise génétique des différences phénotypiques entre les poneys et les chevaux de selle.


Horse domestication started in Western Asia approximately 5000–6000 years ago (Bowling and Ruvinsky 2000). Horses have played pivotal roles in human history, including in agriculture, transportation, and sport (Bowling and Ruvinsky 2000; Jun et al. 2014). Domestication, intensive selection, and various environmental conditions, as well as horse industry modernization, have shaped more than 400 horse breeds that differ in physiology, behavior, and body measurements (Brooks et al. 2010; Metzger et al. 2014; Frischknecht et al. 2016). Based on their type, horse breeds are categorized into pony, light, and draft (Dall’Olio et al. 2010; Gurgul et al. 2019). Improvements in the performance and marketability of horses are closely related to their types and body measurements in different fields (Meira et al. 2014). Ponies have a small skeletal structure and have been used for child horseback riding (Dall’Olio et al. 2010), and during the industrial revolution they were used for coal mining in the UK (Colby 1921). Their height is typically less than 14.2 hands (about 1.47 m), and short legs relative to body depth are a distinctive characteristic (Draper et al. 2014). In contrast, light horses have long, thin muscles and greater wither heights and have been used in different fields such as sports (show jumping, dressage, and eventing), endurance, harness racing, and racing. Draft horses are heavy and extremely muscular with tall stature; they have been commonly used for meat production, farm work, and pulling carriages (Dall’Olio et al. 2010; Draper et al. 2014). In general, horse type classification is based on phenotypic traits such as body shape and, specifically, body stature (Dall’Olio et al. 2010; Petersen et al. 2013; Gurgul et al. 2019).
Typically, deciphering the genetic foundation of body measurements is important for performance improvement of horses (Kader et al. 2015). Today, signatures of selection studies that identify the genomic regions that have been subjected to selective pressures have become feasible in most animal species (Wang et al. 2016; Salek Ardestani et al. 2020). Signatures of selection and genome-wide association studies have played key roles in identifying the genomic regions underlying body measurements in several animal species such as horse (Kader et al. 2015; Grilz-Seger et al. 2019), cattle (W. Zhang et al. 2016), and swine (Rubin et al. 2012) using sequencing and genotyping technologies. Several signatures of selection studies have been able to detect candidate genes associated with horse type (Gurgul et al. 2019) and body measurements, especially wither height (Petersen et al. 2013; Kader et al. 2015; Frischknecht et al. 2016; Al Abri et al. 2018). A signatures of selection study in Shetland ponies detected IGF1R and ADAMTS17 genes as selective signals related to wither height (Frischknecht et al. 2016). Similarly, a study of Debao ponies pointed to candidate genes such as TBX3 and HMGA2 underlying body size (Kader et al. 2015). Furthermore, ANKRD1 gene was found to be associated with wither height using selective sweep analysis in American Miniature horses (Al Abri et al. 2018). Gurgul et al. (2019) identified LCORL, NCAPG, TBX3, and LASP1 genes as selective signals in draft and pony horses.
Although most studies have suggested single nucleotide polymorphism (SNP) array data as a useful and economical tool for identifying selective signatures, the detection power of arrays is limited compared with whole-genome sequence data because of their low marker density. In our former study (Salek Ardestani et al. 2020), selective signals were identified using different selection signature methods in sport (German and Dutch warmblood) and non-sport horse groups (Arab, Akhal-Teke, Thoroughbred, Standardbred, draft, and pony breeds). In contrast, the main objective of the current study was to identify the genetic signatures of selection for horse type using whole-genome sequence data of pony and light horses and different population combinations. This study may help us to better understand the role of selection during the evolutionary process and recent breeding efforts in light and pony horse types. Moreover, it may help to optimize horse SNP array panels that are extensively used for breeding purposes, e.g., horse genomic evaluation.

Materials and methods


Whole-genome sequence data of 46 horses, generated on Illumina HiSeq (2000, 2500, and 3000) and NextSeq 500 platforms and comprising 32 light horses (11 breeds) and 14 ponies (6 breeds) (Table S1), were downloaded from the European Nucleotide Archive. The light group included Akhal-Teke (n = 3), Arabian (n = 2), Baden-Wurttemberg (n = 1), Dutch warmblood (n = 1), Hanoverian (n = 6), Holstein (n = 2), Oldenburg (n = 3), Standardbred (n = 6), Thoroughbred (n = 5), Trakehner (n = 1), and Westphalian (n = 2). The pony group included American Miniature (n = 2), Dülmen (n = 1), Connemara (n = 4), Jeju pony (n = 2), Shetland pony (n = 4), and Welsh pony (n = 1) (Table S2).

Alignment and variant calling

After converting the Sequence Read Archive files to paired-end FASTQ format using the fastq-dump command of SRA Toolkit (version 2.9), quality control was performed with FastQC (version 0.11.6) for each sample. Adaptors and low-quality reads were filtered using Trimmomatic 0.36 (Bolger et al. 2014). The clean reads were aligned to the equine reference genome (EquCab2.0) using Burrows-Wheeler Aligner 0.7.17-r1188 (Li and Durbin 2009) and converted to binary format with SAMtools 1.7 (Li et al. 2009). Picard 2.17.11 was used to remove potential PCR duplicates and thereby avoid systematic biases. Base quality score recalibration was performed according to the recommended workflow in Genome Analysis Toolkit 3.8 (McKenna et al. 2010).
Variant calling was performed with “HaplotypeCaller” using the “stand_emit_conf 10” and “stand_call_conf 30” options to detect insertions/deletions (indels) and SNPs in genomic variant call format. We separated SNPs and indels using the “SelectVariants” tool in Genome Analysis Toolkit and discarded the sex chromosomes. High-quality SNPs were retained by excluding variants that met any of the following stringent filtration criteria: (1) Phred-scaled quality score (QUAL) < 40.0, (2) Quality by depth (QD) < 2.0, (3) Mapping quality (MQ) < 25.0, (4) Phred-scaled p-value using Fisher’s exact test to detect strand bias (FS) > 60.0, (5) Mapping quality rank sum test (MQRankSum) < −12.50, (6) Read position rank sum test (ReadPosRankSum) < −8.0. All filtered SNPs were annotated using Variant Effect Predictor. Additionally, SNPs were filtered using PLINK 2.0 (Purcell et al. 2007) according to the following criteria: minor allele frequency (maf) 0.01, Hardy–Weinberg p-value (hwe) 0.001, individuals with more than 10% missing genotypes (mind) 0.1, and missing rate per SNP (geno) 0.1. SNP densities per 100 kb window were visualized using Circos 0.69.6 (Krzywinski et al. 2009).
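The six hard-filter thresholds listed above can be read as exclusion rules: a SNP is dropped as soon as it meets any one of them. The following is a minimal sketch of that logic in Python, not the GATK implementation used in the study. The record layout (a plain dictionary of annotation values), the function name, and the choice to let a missing annotation pass are all assumptions made for illustration.

```python
# Thresholds are taken from the text; everything else is a hypothetical stand-in
# for a parsed VCF record.
FAIL_IF_BELOW = {
    "QUAL": 40.0,
    "QD": 2.0,
    "MQ": 25.0,
    "MQRankSum": -12.5,
    "ReadPosRankSum": -8.0,
}
FAIL_IF_ABOVE = {"FS": 60.0}


def passes_hard_filters(record):
    """Return True when the record meets none of the exclusion criteria.

    Annotations absent from the record are treated as non-failing
    (an assumption of this sketch).
    """
    for key, threshold in FAIL_IF_BELOW.items():
        value = record.get(key)
        if value is not None and value < threshold:
            return False
    for key, threshold in FAIL_IF_ABOVE.items():
        value = record.get(key)
        if value is not None and value > threshold:
            return False
    return True


# Toy records with made-up annotation values.
good = {"QUAL": 812.3, "QD": 14.1, "MQ": 59.8, "FS": 1.2,
        "MQRankSum": 0.3, "ReadPosRankSum": -0.5}
strand_biased = dict(good, FS=75.0)   # fails criterion (4)
low_mapping = dict(good, MQ=20.0)     # fails criterion (3)

for name, rec in [("good", good), ("strand_biased", strand_biased),
                  ("low_mapping", low_mapping)]:
    print(name, passes_hard_filters(rec))
```

Keeping the thresholds in two small tables, one per direction of comparison, makes it easy to check them against the numbered list in the text.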

Phylogenetic analysis

Phylogenetic analysis can describe the genetic relationships between breeds based on genetic diversity, and thus it can be applied as a useful tool for managing genetic diversity (Davies et al. 2008). To decipher the genetic relationships among all individuals, a phylogenetic tree was constructed from the whole SNP data set using the neighbour-joining method in VCF-kit 0.1.6 (Cook and Andersen 2017). FigTree 1.4.3 was used to visualize the phylogenetic tree. Additionally, the neighbour-joining tree was converted to a genetic distance matrix using a Python script.
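The script used to convert the tree to a distance matrix is not described further. One plausible way to do it, sketched below under stated assumptions, is to read the tree in Newick format and sum branch lengths along the path between each pair of tips (the patristic distance). The parser handles only plain Newick with branch lengths (no quoted labels or comments), and the tip labels and branch lengths in the example are invented.

```python
from itertools import combinations


def parse_newick(text):
    """Parse a simple Newick string into nested (name, length, children) tuples."""
    s = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if s[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(parse_node())
                if s[pos] == ",":
                    pos += 1  # another sibling follows
                    continue
                pos += 1  # consume ")"
                break
        start = pos
        while pos < len(s) and s[pos] not in ",():":
            pos += 1
        name = s[start:pos]
        length = 0.0
        if pos < len(s) and s[pos] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            length = float(s[start:pos])
        return name, length, children

    return parse_node()


def root_paths(tree):
    """Map each tip to its root-to-tip path as [(node_id, depth_from_root), ...]."""
    paths = {}
    counter = [0]

    def walk(node, ancestors, depth):
        name, length, children = node
        node_id = counter[0]
        counter[0] += 1
        depth += length
        here = ancestors + [(node_id, depth)]
        if not children:
            paths[name] = here
        for child in children:
            walk(child, here, depth)

    walk(tree, [], 0.0)
    return paths


def patristic_matrix(newick):
    """Return (tip names, nested dict of pairwise path lengths)."""
    paths = root_paths(parse_newick(newick))
    names = sorted(paths)
    dist = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        pa, pb = paths[a], paths[b]
        shared_depth = 0.0  # depth of the last common ancestor
        for (ida, depth_a), (idb, _) in zip(pa, pb):
            if ida != idb:
                break
            shared_depth = depth_a
        dist[a][b] = dist[b][a] = pa[-1][1] + pb[-1][1] - 2 * shared_depth
    return names, dist


# Toy tree with invented tip labels and branch lengths.
names, dist = patristic_matrix("((tipA:1,tipB:2):0.5,tipC:3);")
for a in names:
    print(a, [dist[a][b] for b in names])
```

Other definitions of tree-derived distance exist (for example, counting nodes rather than summing branch lengths), so this should be read as one option rather than the method the authors applied.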

Selective signals and gene ontology

As cross-population and allele-frequency based methods of signatures of selection, fixation index (FST) (Weir and Cockerham 1984) and pairwise nucleotide diversity (θπ) (Nei and Li 1979) were used to detect selective signals in pony and light horse breeds. The FST and θπ values were calculated for pony and light horses using the sliding window approach (100 kb with a step size of 50 kb) in VCFtools 0.1.15 (Danecek et al. 2011). The formula of Weir and Cockerham (1984) was used to calculate the FST values:
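$$F_{ST} = \frac{\sigma^2}{\bar{p}\,(1-\bar{p})}, \qquad \bar{p} = \sum_i w_i p_i, \qquad \sigma^2 = \sum_i w_i \,(p_i - \bar{p})^2$$
where $\bar{p}$ and $\sigma^2$ are the mean and variance of allele frequencies, respectively; $p_i$ is the allele frequency in the ith population; $w_i$ is the ratio of $n_i$ to $M$, in which $n_i$ and $M$ are the number of individuals in the ith population and the total number of individuals in both populations, respectively.
The θπ values were calculated through
$$\theta_\pi = \frac{N}{N-1} \sum_{ij} p_i \, p_j \, \pi_{ij}$$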
where N is the total number of individuals in a population, πij is the number of different nucleotides per site between ith and jth sequences, and pi and pj are allele frequencies in ith and jth sequences, respectively. To normalize the FST values, Z-transformation was performed using the “scale” command in R. The θπ values were transformed using the log2 function (Yang et al. 2016). The top 1% of windows that overlapped between Z(FST) and log2-transformed θπ values were defined as strong selective signals in pony and light groups (Yang et al. 2016; Li et al. 2017). These windows were mapped to genes using the “biomaRt” package. To reveal the functional enrichment of genomic regions under selection pressure, gene ontology analysis of biological processes was performed using g:Profiler with Benjamini–Hochberg FDR correction.
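The normalization and overlap step can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' R code: the sample standard deviation is used because that is what R's `scale` does by default, the diversity statistic is assumed to be a per-window ratio of θπ between the two groups (the text does not say so explicitly), and the function names and toy values are hypothetical.

```python
import math
import statistics


def z_scores(values):
    """Centre and scale values using the sample SD (n - 1 denominator)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def top_fraction_cutoff(values, fraction=0.01):
    """Smallest value that still falls in the top `fraction` of windows."""
    k = max(1, int(len(values) * fraction))
    return sorted(values, reverse=True)[k - 1]


def candidate_windows(fst, pi_ratio, fraction=0.01):
    """Indices of windows in the top fraction of both Z(FST) and log2(pi ratio)."""
    z = z_scores(fst)
    log_pi = [math.log2(r) for r in pi_ratio]
    z_cut = top_fraction_cutoff(z, fraction)
    pi_cut = top_fraction_cutoff(log_pi, fraction)
    return [i for i in range(len(fst)) if z[i] >= z_cut and log_pi[i] >= pi_cut]


# Toy data: 200 windows with made-up background values and a few planted outliers.
fst = [0.05 + 0.001 * (i % 10) for i in range(200)]
pi_ratio = [1.0 + 0.01 * (i % 7) for i in range(200)]
fst[17], fst[120] = 0.60, 0.55          # high differentiation
pi_ratio[17], pi_ratio[60] = 4.0, 3.5   # high diversity ratio

print(candidate_windows(fst, pi_ratio))  # only window 17 is extreme in both scans
```

Requiring a window to be extreme in both statistics is what reduces the two top-1% lists to a shorter list of overlapping candidates.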

Results and discussion

Genomic variants

The whole-genome sequences of pony and light horses (1786 Gb) were aligned to 94.59%–99.84% of the equine reference genome assembly at 14.7× average depth. After removing the marked PCR duplicates and performing base quality score recalibration and variant calling, we detected 16 075 591 SNPs (Table S3). The annotation of the SNP distribution for all samples is shown in Table S4. We calculated SNP densities per 100 kb window; the highest density of SNPs was located on horse chromosome (ECA) 20: 32.7–32.8 Mb with 3343 SNPs (Fig. S1). This genomic region contains the MHC class II DQ-alpha chain and MHC class II DR-beta chain genes (Fig. S1). The MHC genes, known as highly polymorphic, recombination-rich genomic regions (Gaudieri et al. 2000), are associated with disease resistance (Van Oosterhout 2009). The highest nucleotide diversities for light and pony groups in 100 kb windows were located on ECA 2: 107.85–107.95 Mb and ECA 12: 13.05–13.15 Mb, respectively (Fig. S1). ECA 12: 13.05–13.15 Mb includes olfactory receptor 5D18-like and 4P4-like genes. ECA 12 is enriched in copy number variants (CNVs) owing to the presence of major clusters of olfactory receptor genes (Ghosh et al. 2014).
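Counting SNPs per 100 kb window, as described above, amounts to binning variant positions. The sketch below shows one way to do this in Python; it assumes non-overlapping windows and 1-based positions, and the positions themselves are invented for illustration.

```python
from collections import Counter


def snp_density(positions, window=100_000):
    """Count SNPs per non-overlapping window.

    `positions` is an iterable of (chromosome, 1-based position) pairs;
    keys of the result are (chromosome, 0-based window start).
    """
    counts = Counter()
    for chrom, pos in positions:
        start = (pos - 1) // window * window
        counts[(chrom, start)] += 1
    return counts


# Made-up positions, loosely placed near the regions named in the text.
toy_positions = [
    ("20", 32_700_001),
    ("20", 32_750_000),
    ("20", 32_800_000),
    ("20", 32_800_001),
    ("12", 13_050_500),
]

density = snp_density(toy_positions)
densest = max(density, key=density.get)
print(densest, density[densest])
```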

Phylogenetic analysis

In this study, the phylogenetic neighbour-joining tree divided pony and light groups into two separate clusters (Fig. 1). In the light cluster, similar to the study by Petersen et al. (2013), there were three main branches: Arabian–Akhal-Teke, Standardbred, and German warmblood–Thoroughbred. All German warmblood breeds including Dutch warmblood, Baden-Wurttemberg, Hanoverian, Holstein, Oldenburg, Trakehner, and Westphalian were classified in one branch, indicating genetic similarity and probable sharing of founder lines. Our phylogenetic results placed the two Westphalians (WF1, WF2) in separate sub-branches; a Hanoverian (HAN6) was located near WF2, which might be due to hybridization among German warmblood breeds. The pony cluster showed two main branches, one of which belongs to Connemara, and the other consists of Dülmen, Welsh pony, Jeju pony, Shetland pony, and American Miniature. Similar to the study by Petersen et al. (2013), there was a close genetic distance between Shetland pony and American Miniature breeds (Table S5). It should be noted that, in the phylogenetic analysis, results for breeds represented by a single individual, such as Baden-Wurttemberg, Dutch warmblood, Trakehner, Dülmen, and Welsh pony, can still be considered reliable owing to the large number of SNPs used in this study, as in previous studies (C. Zhang et al. 2018; Asadollahpour Nanaei et al. 2019).
Fig. 1. Neighbour-joining tree for light and pony breeds. The light breeds are Akhal-Teke (AKT), Dutch warmblood (KW), Baden-Wurttemberg (BW), Hanoverian (HAN), Holstein (HOL), Oldenburg (OLD), Trakehner (TRA), Westphalian (WF), Arabian (AR), Standardbred (ST), and Thoroughbred (TH). The pony breeds are Welsh pony (WP), Shetland pony (SHP), American Miniature (AMP), Dülmen (DUP), Connemara (CONP), and Jeju pony (JEP).

Genome-wide signatures of selection analysis

Previous studies have indicated that signatures of selection analyses are particularly helpful for detecting genes related to body measurements, specifically wither height, in horses (Kader et al. 2015; Frischknecht et al. 2016; Al Abri et al. 2018). Considerable evidence indicates that body morphology traits such as wither height are controlled by many genes of minor effect in naturally evolving species, whereas in domesticated animals they are controlled by a few genes of major effect (Kader et al. 2015). Therefore, given the conserved role of genes in regulating body morphology, signatures of selection analyses can potentially provide an opportunity to detect genomic regions associated with horse type. To identify genomic regions under putative selection, several statistical methods based on allele frequency variation have been developed, such as FST (Weir and Cockerham 1984) and θπ (Nei and Li 1979).
In this study, we calculated the mean of FST(pony-light) for each window of 100 kb with a step size of 50 kb. After Z-transformation, the Z(FST) values followed a normal distribution, as presented in Fig. 2. A total of 446 windows containing 377 genes were detected in the top 1% of Z(FST) values, which ranged from 3.20 to 10.44 (Table S6). The highest Z(FST) value was located at 51.25–51.35 Mb of ECA 7. This region contains the putative gustatory receptor clone (LOC100061702) gene and a newly identified horse gene (ENSECAG00000002851) whose human ortholog is L1TD1. In human, the L1TD1 gene is related to bone mineral density (Chesi et al. 2017) and lip morphology (Cha et al. 2018). Furthermore, we identified 13 genes in the top 1% of Z(FST) values that had been previously reported as candidate genes for body measurements (Fig. 3) in human (N′Diaye et al. 2011; Cousminer et al. 2013; Wood et al. 2014; Bae et al. 2016) and other species such as horse (Makvandi-Nejad et al. 2012; Metzger et al. 2013; Staiger et al. 2016), cattle (W. Zhang et al. 2016; Han et al. 2017), and swine (Rubin et al. 2012). These candidate genes are LCORL, NCAPG, DCAF16, FAM189A1, C4ORF33, BMP2, CRB1, IGFBP3, OPCML, FGF12, DDX55, FAM13A, and CPN1.
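To make the windowed scan concrete, the sketch below builds 100 kb windows with a 50 kb step and averages a per-site FST within each one. It is a simplified illustration only. The per-site statistic is the variance-based form F_ST = σ² / (p̄(1 − p̄)) with sample-size weights w_i = n_i / M, which is an assumption of this sketch; VCFtools, which the study used, implements the fuller Weir and Cockerham (1984) estimator. Allele frequencies, positions (treated as 0-based), and function names are invented.

```python
def weighted_fst(p1, p2, n1, n2):
    """Variance-based F_ST for one biallelic site in two populations."""
    m = n1 + n2
    w1, w2 = n1 / m, n2 / m
    pbar = w1 * p1 + w2 * p2
    denom = pbar * (1.0 - pbar)
    if denom <= 1e-12:
        return 0.0  # effectively monomorphic: no differentiation to measure
    var = w1 * (p1 - pbar) ** 2 + w2 * (p2 - pbar) ** 2
    return var / denom


def sliding_windows(chrom_length, size=100_000, step=50_000):
    """Yield (start, end) half-open windows along a chromosome."""
    start = 0
    while start < chrom_length:
        yield start, min(start + size, chrom_length)
        start += step


def windowed_mean_fst(sites, chrom_length, n1, n2):
    """sites: list of (position, freq_pop1, freq_pop2).

    Returns [(start, end, mean F_ST or None if the window has no sites)].
    """
    out = []
    for start, end in sliding_windows(chrom_length):
        values = [weighted_fst(p1, p2, n1, n2)
                  for pos, p1, p2 in sites if start <= pos < end]
        out.append((start, end, sum(values) / len(values) if values else None))
    return out


# Toy chromosome of 200 kb; group sizes follow the study (14 ponies, 32 light horses).
toy_sites = [(10_000, 0.5, 0.5), (60_000, 1.0, 0.0), (120_000, 0.2, 0.2)]
for row in windowed_mean_fst(toy_sites, 200_000, n1=14, n2=32):
    print(row)
```

Because the step is half the window size, each site contributes to two adjacent windows, which is why a single strongly differentiated site raises the mean of both windows that contain it.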
Fig. 2. The distribution of (A) Z-transformed fixation index, Z(FST), and (B) log2-transformed nucleotide diversity (θπ) values in 100 kb windows with a step size of 50 kb.
Fig. 3. The distribution of Z-transformed fixation index (Z(FST)) values in horse autosomes. Data points above the blue horizontal line are the top 1% of Z(FST) values. The NCAPG, LCORL, and DCAF16 genes were not overrepresented in the top 1% of transformed θπ values.
Identifying the processes that govern genome diversity has played an important role in evolutionary genetics (Ellegren and Galtier 2016) and is a focus of selective sweep studies (Moon et al. 2015; Yang et al. 2016; Li et al. 2017; Z. Zhang et al. 2018). Approaches based on nucleotide diversity, such as pairwise nucleotide diversity, have been used to detect selective signals in animal species such as sheep (Yang et al. 2016), goat (Li et al. 2017), duck (Z. Zhang et al. 2018), and horse (Moon et al. 2015). In this study, the transformed θπ values were calculated for each window of 100 kb with a step size of 50 kb (Fig. 4). These values followed a normal distribution (Fig. 2). A total of 446 windows containing 366 genes were identified in the top 1% of transformed θπ values, which ranged from 0.91 to 2.64 (Table S6). To identify more reliable selective signals, we determined the windows that overlapped between the top 1% of Z(FST) and transformed θπ values by the cut-off ratio (Fig. 5).
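Pairwise nucleotide diversity can be illustrated on a handful of aligned sequences. The sketch below computes the average number of differences per site over all pairs of sequences, which equals the Nei and Li frequency-weighted form with the N/(N − 1) correction when every sequence is counted as its own haplotype. The sequences are invented, and the function is a toy stand-in for the windowed calculation that VCFtools performs on genotype data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site among equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    differences = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return differences / (len(pairs) * length)


# Three made-up 4 bp haplotypes: one differs from the other two at the last site.
toy = ["ACGT", "ACGA", "ACGT"]
print(nucleotide_diversity(toy))
```

Here two of the three pairs differ at one of four sites, so the value is 2 / (3 × 4).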
Fig. 4. The distribution of transformed nucleotide diversity (θπ) values in horse autosomes. Data points above the blue horizontal line are the top 1% of transformed θπ values. The HMGA2 gene was not overrepresented in the top 1% of Z(FST) values.
Fig. 5. (A) Overrepresented windows of the top 1% of transformed θπ and Z(FST) values. Data points located on the right side of the vertical line (the top 1% of transformed θπ values, where the transformed θπ value is 0.91) and above the horizontal line (the top 1% of Z(FST) values, where the Z(FST) value is 3.20) were detected as selective signals. (B) Venn diagram of the number of genes overlapping between the top 1% of transformed θπ and Z(FST) values.
We detected 139 overlapping genes as selective signals in the pony and light groups (Table S6). Ten of these genes (BMP2, CPN1, IGFBP3, OPCML, DDX55, FGF12, FAM13A, FAM189A1, C4ORF33, and CRB1) have been reported to be related to body measurements (Table 1) in human (N′Diaye et al. 2011; Cousminer et al. 2013; Wood et al. 2014; Bae et al. 2016) and animals (Fang et al. 2010; Fan et al. 2011; Olivieri et al. 2016; Xia et al. 2017). The LCORL, DCAF16, and NCAPG genes were not overrepresented by the transformed θπ approach; conversely, the HMGA2 gene was detected only in the top 1% of transformed θπ values. Nevertheless, these genes were detected as functional candidate genes for body measurements in previous genome-wide association studies (Makvandi-Nejad et al. 2012; Kader et al. 2015; W. Zhang et al. 2016; Sevane et al. 2017).
Table 1. Overrepresented horse type candidate genes between Z(FST) and transformed θπ values.

Genes associated with wither height

The genetic background of horse wither height variation has been studied by Makvandi-Nejad et al. (2012), who reported one locus near the LCORL and NCAPG genes and three other loci located near the HMGA2, ZFAT, and LASP1 genes, together explaining approximately 83% of wither height variation. On the other hand, Signer-Hasler et al. (2012) identified a quantitative trait locus (QTL) on ECA 3 near the LCORL gene and another QTL on ECA 9, together explaining 18.2% of de-regressed estimated breeding values for horse wither height. A majority of former studies (Makvandi-Nejad et al. 2012; Kader et al. 2015; Metzger et al. 2018) highlighted the role of the LCORL gene in controlling horse wither height variation. Sevane et al. (2017) also revealed a strong association between one SNP in the LCORL gene and several other body measurements such as hock circumference, knee perimeter, hind cannon circumference, and fore cannon circumference. In contrast to the HMGA2, LCORL, and NCAPG genes, the CPN1 and DDX55 genes were identified in the top 1% of both Z(FST) and transformed θπ values. These genes have significant associations with human height (Allen et al. 2010) and pubertal height growth (Cousminer et al. 2013). Additionally, the DDX55 gene was detected as a selective signal in German warmbloods (Nolte et al. 2019).

Genes associated with skeletal conformation and growth

BMP2, FGF12, and IGFBP3 are known as candidate genes for growth and skeletal development. The insulin-like growth factor (IGF) axis is known as an evolutionarily conserved system (Teumer et al. 2016), and some genes in this family are associated with cellular growth regulation (Cheng et al. 2007), such as IGFBP3, which is related to skeletal development (Chan et al. 2015). The BMP2 gene plays a key role in shaping bones, growth, and skeletal development (Huang et al. 2002). This gene was found to be associated with body trunk in goat (Fang et al. 2010), and body depth, length, and width in swine (Fan et al. 2011). Both BMP2 and IGFBP3 genes were previously confirmed as selective signals in German warmbloods (Nolte et al. 2019). The fibroblast growth factor family consists of 22 members, including FGF12, which plays a pivotal role in the growth and formation of bones (F. Zhang et al. 2016). FAM189A1, CRB1, OPCML, and C4ORF33 are known as candidate genes for body mass index in human (Frischknecht et al. 2015; Locke et al. 2015; Akiyama et al. 2017; Huckins et al. 2018). Moreover, the OPCML gene was found to be associated with body weight and growth rate in broiler chickens (Gu et al. 2011). This gene was confirmed by Gurgul et al. (2019) as a selective signal between draft and light horse breeds that might have contributed to the development of different horse types. Frischknecht et al. (2016) detected FAM189A1 as a selective signal for body mass index in Shetland ponies. Also, in a genome-wide association study, Xia et al. (2017) demonstrated the relation of FAM13A with bone weight, size, and number of muscle fibers in skeletal muscles of Simmental cattle.
Our selection signatures analysis between pony and light horses allowed the detection of 10 genes including BMP2, CPN1, IGFBP3, OPCML, DDX55, FGF12, FAM13A, FAM189A1, C4ORF33, and CRB1 as direct selective signals. These genes were found to be associated with body measurements in some species, and thus they might potentially be related to pony and light horse type variation. To the best of our knowledge, except for DDX55, BMP2, IGFBP3, OPCML, and FAM189A1 genes, the other five identified selective signals have not been previously detected for horse type variation.

Gene ontology of selective signal

To further explore the biological function of the selective signal candidate genes, we classified the significant biological processes of the 139 genes under selection pressure in pony and light horses. These significant biological processes (Table S7) include insulin-like growth factor receptor signaling pathway (GO:0048009, p-value = 0.01), regulation of insulin-like growth factor receptor signaling pathway (GO:0043567, p-value = 0.04), positive regulation of presynaptic cytosolic calcium concentration (GO:0099533, p-value = 0.04), and induction of synaptic vesicle exocytosis by positive regulation of presynaptic cytosolic calcium ion concentration (GO:0099703, p-value = 0.04). A previous study revealed the role of the insulin-like growth factor receptor signaling pathway in head circumference and development of the nervous system (Yang et al. 2019). Selective pressure on this biological pathway is plausible considering the differences in head dimensions between light and pony horses. The cytosolic calcium concentration can affect skeletal muscle kinetics (Cho et al. 2017). It can be associated with the positive regulation of presynaptic cytosolic calcium concentration and induction of synaptic vesicle exocytosis by positive regulation of presynaptic cytosolic calcium ion concentration pathways. These two biological pathways could be under selective pressure particularly in light breeds, considering their athletic uses such as racing, endurance, and sports.
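The enrichment analysis relied on Benjamini–Hochberg FDR correction, as stated in the methods. For readers unfamiliar with the procedure, the following Python sketch shows how raw p-values are converted to adjusted values. It is a generic implementation of the step-up procedure, not the code run by the enrichment tool, and the p-values in the example are invented rather than taken from Table S7.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all equal or higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four hypothetical GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

The backward pass enforces monotonicity: a term can never receive a larger adjusted p-value than a term with a larger raw p-value.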


In conclusion, the comprehensive comparison of whole-genome sequences of pony and light horses identified genomic regions and biological processes under selective pressure for light and pony horse types. In pony and light groups, 139 genes were detected as signatures of selection, and these genes are enriched for four significant biological pathways. We detected five novel genes under selective pressure (CPN1, FGF12, FAM13A, C4ORF33, and CRB1) that might be related to body measurement differences between light and pony types, such as wither height, skeletal conformation, and growth. Moreover, these results can provide a better genetic perspective on phenotypic differences between pony and light groups. The small number of horses in this study is a limitation; therefore, further investigation and validation using genome-wide association and signatures of selection studies with larger populations and multiple horse breeds are required.

Author contributions

S.S.A., M.A., Y.M., and M.B.Z.M. conceived and designed this experiment. S.S.A. wrote the first draft of the manuscript and analyzed data under Y.M.’s supervision. S.S.A., M.A., Y.M., M.S., M.H.B., and M.B.Z.M. discussed the results and contributed to the final manuscript. S.S.A. and Y.M. wrote the final manuscript. All authors reviewed and approved the final manuscript.

Funding statement

Select Sires Inc. provided support in the form of salaries for author M.S., but they did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of this author are articulated in the author contributions section.

Competing interests

Author M.S. is employed at Select Sires Inc. This organization did not play any role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of M.S.’s salary. This does not alter our adherence to Genome journal policies on sharing data and materials. Other authors have declared that no competing interests exist.


Supplementary data are available with the article through the journal Web site.


Akiyama M., Okada Y., Kanai M., Takahashi A., Momozawa Y., Ikeda M., et al. 2017. Genome-wide association study identifies 112 new loci for body mass index in the Japanese population. Nat. Genet. 49(10): 1458–1467.
Al Abri M.A., Posbergh C., Palermo K., Sutter N.B., Eberth J., Hoffman G.E., and Brooks S.A. 2018. Genome-wide scans reveal a quantitative trait locus for withers height in horses near the ANKRD1 gene. J. Equine Vet. Sci. 60: 67–73.e61.
Allen H.L., Estrada K., Lettre G., Berndt S.I., Weedon M.N., Rivadeneira F., et al. 2010. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature, 467(7317): 832–838.
Asadollahpour Nanaei H., Ayatollahi Mehrgardi A., and Esmailizadeh A. 2019. Comparative population genomics unveils candidate genes for athletic performance in Hanoverians. Genome, 62(4): 279–285.
Bae S., Choi S., Kim S.M., and Park T. 2016. Prediction of quantitative traits using common genetic variants: application to body mass index. Genomics Inform. 14(4): 149–159.
Bolger A.M., Lohse M., and Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30(15): 2114–2120.
Bowling A.T. and Ruvinsky A. (Editors). 2000. The genetics of the horse. CABI. 26 pp.
Brooks S., Makvandi-Nejad S., Chu E., Allen J., Streeter C., Gu E., et al. 2010. Morphological variation in the horse: defining complex traits of body size and shape. Anim. Genet. 41(s2): 159–165.
Cha S., Lim J.E., Park A.Y., Do J.-H., Lee S.W., Shin C., et al. 2018. Identification of five novel genetic loci related to facial morphology by genome-wide association studies. BMC Genomics, 19(1): 481.
Chan Y., Salem R.M., Hsu Y.-H.H., McMahon G., Pers T.H., Vedantam S., et al. 2015. Genome-wide analysis of body proportion classifies height-associated variants by mechanism of action and implicates genes important for skeletal development. Am. J. Hum. Genet. 96(5): 695–708.
Cheng I., DeLellis Henderson K., Haiman C.A., Kolonel L.N., Henderson B.E., Freedman M.L., and Le Marchand L.C. 2007. Genetic determinants of circulating insulin-like growth factor (IGF)-I, IGF binding protein (BP)-1, and IGFBP-3 levels in a multiethnic population. J. Clin. Endocrinol. Metab. 92(9): 3660–3666.
Chesi A., Mitchell J.A., Kalkwarf H.J., Bradfield J.P., Lappe J.M., Cousminer D.L., et al. 2017. A genomewide association study identifies two sex-specific loci, at SPTB and IZUMO3, influencing pediatric bone mineral density at multiple skeletal sites. J. Bone Miner. Res. 32(6): 1274–1281.
Cho C.-H., Woo J.S., Perez C.F., and Lee E.H. 2017. A focus on extracellular Ca2+ entry into skeletal muscle. Exp. Mol. Med. 49(9): e378.
Colby F.M. 1921. New International Yearbook: A Compendium of the World’s Progress for the Year 1920. Dodd, Mead & Co., New York. 171 pp.
Cook D.E. and Andersen E.C. 2017. VCF-kit: assorted utilities for the variant call format. Bioinformatics, 33(10): 1581–1582.
Cousminer D.L., Berry D.J., Timpson N.J., Ang W., Thiering E., Byrne E.M., et al. 2013. Genome-wide association and longitudinal analyses reveal genetic loci linking pubertal height growth, pubertal timing and childhood adiposity. Hum. Mol. Genet. 22(13): 2735–2747.
Dall’Olio S., Fontanesi L., Nanni Costa L., Tassinari M., Minieri L., and Falaschini A. 2010. Analysis of horse myostatin gene and identification of single nucleotide polymorphisms in breeds of different morphological types. J. Biomed. Biotechnol. 2010: 542945.
Danecek P., Auton A., Abecasis G., Albers C.A., Banks E., DePristo M.A., et al. 2011. The variant call format and VCFtools. Bioinformatics, 27(15): 2156–2158.
Davies T.J., Fritz S.A., Grenyer R., Orme C.D.L., Bielby J., Bininda-Emonds O.R., et al. 2008. Phylogenetic trees and the future of mammalian biodiversity. Proc. Natl. Acad. Sci. U.S.A. 105(Suppl. 1): 11556–11563.
Draper J., Sly D., and Muir S. 2014. The ultimate book of the horse and rider. Lorenz Books.
Ellegren H. and Galtier N. 2016. Determinants of genetic diversity. Nat. Rev. Genet. 17(7): 422–433.
Fan B., Onteru S.K., Du Z.-Q., Garrick D.J., Stalder K.J., and Rothschild M.F. 2011. Genome-wide association study identifies loci for body composition and structural soundness traits in pigs. PLoS ONE, 6(2): e14726.
Fang X., Xu H., Zhang C., Zhang J., Lan X., Gu C., and Hong C. 2010. Polymorphisms in BMP-2 gene and their associations with growth traits in goats. Genes Genomics, 32(1): 29–35.
Frischknecht M., Jagannathan V., Plattet P., Neuditschko M., Signer-Hasler H., Bachmann I., et al. 2015. A non-synonymous HMGA2 variant decreases height in Shetland ponies and other small horses. PLoS ONE, 10(10): e0140749.
Frischknecht M., Flury C., Leeb T., Rieder S., and Neuditschko M. 2016. Selection signatures in Shetland ponies. Anim. Genet. 47(3): 370–372.
Gaudieri S., Dawkins R.L., Habara K., Kulski J.K., and Gojobori T. 2000. SNP profile within the human major histocompatibility complex reveals an extreme and interrupted level of nucleotide diversity. Genome Res. 10(10): 1579–1586.
Ghosh S., Qu Z., Das P.J., Fang E., Juras R., Cothran E.G., et al. 2014. Copy number variation in the horse genome. PLoS Genet. 10(10): e1004712.
Grilz-Seger G., Druml T., Neuditschko M., Mesarič M., Cotman M., and Brem G. 2019. Analysis of ROH patterns in the Noriker horse breed reveals signatures of selection for coat color and body size. Anim. Genet. 50(4): 334–346.
Gu X., Feng C., Ma L., Song C., Wang Y., Da Y., et al. 2011. Genome-wide association study of body weight in chicken F2 resource population. PLoS ONE, 6(7): e21872.
Gurgul A., Jasielczuk I., Semik-Gurgul E., Pawlina-Tyszko K., Stefaniuk-Szmukier M., Szmatoła T., et al. 2019. A genome-wide scan for diversifying selection signatures in selected horse breeds. PLoS ONE, 14(1): e0210751.
Han Y., Chen Y., Liu Y., and Liu X. 2017. Sequence variants of the LCORL gene and its association with growth and carcass traits in Qinchuan cattle in China. J. Genet. 96(1): 9–17.
Huang W., Rudkin G.H., Carlsen B., Ishida K., Ghasri P., Anvar B., et al. 2002. Overexpression of BMP-2 modulates morphology, growth, and gene expression in osteoblastic cells. Exp. Cell Res. 274(2): 226–234.
Huckins L., Hatzikotoulas K., Southam L., Thornton L., Steinberg J., Aguilera-McKay F., et al. 2018. Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa. Mol. Psychiatry, 23(5): 1169–1180.
Jun J., Cho Y.S., Hu H., Kim H.-M., Jho S., Gadhvi P., et al. 2014. Whole genome sequence and analysis of the Marwari horse breed and its genetic origin. BMC Genomics, 15(Suppl. 9): S4.
Kader A., Li Y., Dong K., Irwin D.M., Zhao Q., He X., et al. 2015. Population variation reveals independent selection toward small body size in Chinese Debao pony. Genome Biol. Evol. 8(1): 42–50.
Krzywinski M.I., Schein J.E., Birol I., Connors J., Gascoyne R., Horsman D., et al. 2009. Circos: an information aesthetic for comparative genomics. Genome Res. 19(9): 1639–1645.
Li H. and Durbin R. 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics, 25(14): 1754–1760.
Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., et al. 2009. The sequence alignment/map format and SAMtools. Bioinformatics, 25(16): 2078–2079.
Li X., Su R., Wan W., Zhang W., Jiang H., Qiao X., et al. 2017. Identification of selection signals by large-scale whole-genome resequencing of cashmere goats. Sci. Rep. 7(1): 15142.
Locke A.E., Kahali B., Berndt S.I., Justice A.E., Pers T.H., Day F.R., et al. 2015. Genetic studies of body mass index yield new insights for obesity biology. Nature, 518(7538): 197–206.
Makvandi-Nejad S., Hoffman G.E., Allen J.J., Chu E., Gu E., Chandler A.M., et al. 2012. Four loci explain 83% of size variation in the horse. PLoS ONE, 7(7): e39929.
McKenna A., Hanna M., Banks E., Sivachenko A., Cibulskis K., Kernytsky A., et al. 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9): 1297–1303.
Meira C.T., Farah M.M., Fortes M.R., Moore S.S., Pereira G.L., Silva J.A.I.V., et al. 2014. A genome-wide association study for morphometric traits in quarter horse. J. Equine Vet. Sci. 34(8): 1028–1031.
Metzger J., Schrimpf R., Philipp U., and Distl O. 2013. Expression levels of LCORL are associated with body size in horses. PLoS ONE, 8(2): e56497.
Metzger J., Tonda R., Beltran S., Águeda L., Gut M., and Distl O. 2014. Next generation sequencing gives an insight into the characteristics of highly selected breeds versus non-breed horses in the course of domestication. BMC Genomics, 15(1): 562.
Metzger J., Rau J., Naccache F., Conn L.B., Lindgren G., and Distl O. 2018. Genome data uncover four synergistic key regulators for extremely small body size in horses. BMC Genomics, 19(1): 492.
Moon S., Lee J.W., Shin D., Shin K.-Y., Kim J., Choi I.-Y., et al. 2015. A genome-wide scan for selective sweeps in racing horses. Asian-Australas J. Anim. Sci. 28(11): 1525–1531.
Nei M. and Li W.-H. 1979. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. U.S.A. 76(10): 5269–5273.
Nolte W., Thaller G., and Kuehn C. 2019. Selection signatures in four German warmblood horse breeds: tracing breeding history in the modern sport horse. PLoS ONE, 14(4): e0215913.
N'Diaye A., Chen G.K., Palmer C.D., Ge B., Tayo B., Mathias R.A., et al. 2011. Identification, replication, and fine-mapping of loci associated with adult height in individuals of African ancestry. PLoS Genet. 7(10): e1002298.
Olivieri B.F., Mercadante M.E.Z., Cyrillo J.N., Branco R.H., Bonilha S.F.M., de Albuquerque L.G., et al. 2016. Genomic regions associated with feed efficiency indicator traits in an experimental Nellore cattle population. PLoS ONE, 11(10): e0164390.
Petersen J.L., Mickelson J.R., Rendahl A.K., Valberg S.J., Andersson L.S., Axelsson J., et al. 2013. Genome-wide analysis reveals selection for important traits in domestic horse breeds. PLoS Genet. 9(1): e1003211.
Purcell S., Neale B., Todd-Brown K., Thomas L., Ferreira M.A., Bender D., et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81(3): 559–575.
Rubin C.-J., Megens H.-J., Barrio A.M., Maqbool K., Sayyab S., Schwochow D., et al. 2012. Strong signatures of selection in the domestic pig genome. Proc. Natl. Acad. Sci. U.S.A. 109(48): 19529–19536.
Salek Ardestani S., Aminafshar M., Zandi Baghche Maryam M.B., Banabazi M.H., Sargolzaei M., and Miar Y. 2020. Whole-genome signatures of selection in sport horses revealed selection footprints related to musculoskeletal system development processes. Animals, 10(1): E53.
Sevane N., Dunner S., Boado A., and Cañon J. 2017. Polymorphisms in ten candidate genes are associated with conformational and locomotive traits in Spanish purebred horses. J. Appl. Genet. 58(3): 355–361.
Signer-Hasler H., Flury C., Haase B., Burger D., Simianer H., Leeb T., and Rieder S. 2012. A genome-wide association study reveals loci influencing height and other conformation traits in horses. PLoS ONE, 7(5): e37282.
Staiger E.A., Al Abri M., Pflug K.M., Kalla S., Ainsworth D., Miller D., et al. 2016. Skeletal variation in Tennessee Walking Horses maps to the LCORL/NCAPG gene region. Physiol. Genomics, 48(5): 325–335.
Teumer A., Qi Q., Nethander M., Aschard H., Bandinelli S., Beekman M., et al. 2016. Genomewide meta-analysis identifies loci associated with IGF-I and IGFBP-3 levels with impact on age-related traits. Aging Cell, 15(5): 811–824.
Van Oosterhout C. 2009. A new theory of MHC evolution: beyond selection on the immune genes. Proc. R. Soc. B Biol. Sci. 276(1657): 657–665.
Wang X., Liu J., Zhou G., Guo J., Yan H., Niu Y., et al. 2016. Whole-genome sequencing of eight goat populations for the detection of selection signatures underlying production and adaptive traits. Sci. Rep. 6: 38932.
Weir B.S. and Cockerham C.C. 1984. Estimating F-statistics for the analysis of population structure. Evolution, 38(6): 1358–1370.
Wood A.R., Esko T., Yang J., Vedantam S., Pers T.H., Gustafsson S., et al. 2014. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 46(11): 1173–1186.
Xia J., Fan H., Chang T., Xu L., Zhang W., Song Y., et al. 2017. Searching for new loci and candidate genes for economically important traits through gene-based association analysis of Simmental cattle. Sci. Rep. 7: 42048.
Yang J., Li W.-R., Lv F.-H., He S.-G., Tian S.-L., Peng W.-F., et al. 2016. Whole-genome sequencing of native sheep provides insights into rapid adaptations to extreme environments. Mol. Biol. Evol. 33(10): 2576–2592.
Yang X.-L., Zhang S.Y., Zhang H., Wei X.-T., Feng G.J., Pei Y.F., and Zhang L. 2019. Three novel loci for infant head circumference identified by a joint association analysis. Front. Genet. 10: 947.
Zhang C., Ni P., Ahmad H.I., Gemingguli M., Baizilaitibei A., Gulibaheti D., et al. 2018. Detecting the population structure and scanning for signatures of selection in horses (Equus caballus) from whole-genome sequencing data. Evol. Bioinform. Online, 14: 1176934318775106.
Zhang F., Dai L., Lin W., Wang W., Liu X., Zhang J., et al. 2016. Exome sequencing identified FGF12 as a novel candidate gene for Kashin-Beck disease. Funct. Integr. Genomics, 16(1): 13–17.
Zhang W., Li J., Guo Y., Zhang L., Xu L., Gao X., et al. 2016. Multi-strategy genome-wide association studies identify the DCAF16-NCAPG region as a susceptibility locus for average daily gain in cattle. Sci. Rep. 6: 38073.
Zhang Z., Jia Y., Almeida P., Mank J.E., van Tuinen M., Wang Q., et al. 2018. Whole-genome resequencing reveals signatures of selection and timing of duck domestication. GigaScience, 7(4): giy027.


Information & Authors


Published In

Genome
Volume 63, Number 8, August 2020
Pages: 387–396


Received: 2 January 2020
Accepted: 6 May 2020
Accepted manuscript online: 14 May 2020
Version of record online: 14 May 2020

Key Words

  1. fixation index
  2. horse type
  3. nucleotide diversity
  4. signatures of selection
  5. whole-genome sequence





Siavash Salek Ardestani
Department of Animal Science, Science and Research Branch, Islamic Azad University, Tehran 1477893855, Iran.
Mehdi Aminafshar
Department of Animal Science, Science and Research Branch, Islamic Azad University, Tehran 1477893855, Iran.
Mohammad Bagher Zandi Baghche Maryam
Department of Animal Science, University of Zanjan, Zanjan 4537138791, Iran.
Mohammad Hossein Banabazi
Department of Biotechnology, Animal Science Research Institute of Iran, Agricultural Research, Education & Extension Organization, Karaj 3146618361, Iran.
Mehdi Sargolzaei
Department of Pathobiology, University of Guelph, Guelph, ON N1G 2W1, Canada.
Select Sires Inc., Plain City, OH 43064, USA.
Younes Miar
Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS B2N 5E3, Canada.


Copyright remains with the author(s) or their institution(s). This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
