REVIEW

10th WCGALP in beautiful Vancouver

R.J.C. Cantet1, O.F. Christensen2, M. Pérez-Enciso3 & J.H.J. van der Werf4

1 Universidad de Buenos Aires, Buenos Aires, Argentina
2 Aarhus University, Aarhus, Denmark
3 ICREA – Centre for Research in Agrigenomics (CSIC-IRTA-UAB-UB), Barcelona, Spain
4 University of New England, Armidale, NSW, Australia

Correspondence
Asko Mäki-Tanila, University of Helsinki, Finland
E-mail: Asko.Maki-Tanila@helsinki.fi

Received: 28 August 2014; accepted: 29 August 2014

The 10th World Congress was inaugurated by organizers Filippo Miglior and John Pollak in Vancouver at 8 pm on Sunday 17 August, preceded by a cocktail to warm up attendees’ epigenomes. We return to these congresses in greater numbers each time, now with over 1500 participants. The arrangements were very good and the weather favoured us all week, including on the boat trip out to open sea among the small hydroplanes whirling up and down around us on the water. New technology was adopted for presenting the posters (though with a rather dated layout), and the talks could now be easily found by author name and also re-listened to at the congress web site. It is not easy to itemise separate themes or avoid overlaps in reviewing the congress, where the sessions were thoroughly filled or hollowed by our extensive genome-wide studies.

From sequence to presequence (Miguel Pérez-Enciso)

In the review on Leipzig’s WCGALP, I predicted that the Vancouver meeting would be flooded by sequences: ‘At the next World Congress, Vancouver 2014, complete genomes will be as popular as SNP microarrays were at the Leipzig venue’ (Pérez-Enciso 2010, J Anim Breed Gen 127, 338). I was wrong: the Vancouver WCGALP was overwhelmingly dominated by genomic selection (GS) issues, with QTL/GWAS studies more popular than ever in animal sciences.
This is partly due to the clear practical focus of WCGALP, but that does not explain all the variance in the meeting's set of communications. Among the oral presentations, 15 were on next-generation sequencing (NGS) data versus 77 on genomic selection, 23 on population genetics topics (selection footprint, variability), 47 with GWAS/QTL approaches and 9 on systems biology (networks, pathways). For the posters, the numbers were 6, 61, 30, 59 and 6, respectively. (These numbers are mainly based on reading the titles of the communications and are subjective to an extent.) In any case, NGS talks were overrepresented, suggesting that NGS was considered a hot topic even though the animal NGS field has not yet exploded in its entirety. The same seems to be the case for systems biology. Genomic selection was actually underrepresented among oral contributions, possibly due to frequency-dependent selection.

For the observations above I have counted only genome-wide sequence: RNAseq has become a more popular tool (20 papers in the congress) than genomic NGS. The likely reasons are the lower costs and richer information. For example, RNAseq can be used not only to measure overall expression, but also allele-specific expression (Bill Muir) or to refine the annotation. If one extrapolates from ongoing human research, the epigenome and metagenome can be expected to be quite popular targets in the near future. There were fewer than 10 papers on them at the meeting.

J. Anim. Breed. Genet. ISSN 0931-2668 © 2014 Blackwell Verlag GmbH • J. Anim. Breed. Genet. 131 (2014) 409–412 doi:10.1111/jbg.12119

Most of the sequence data reported were on cattle (over 2000 genomes). It seems that the most fruitful application of NGS was in detecting monogenic, deleterious mutations. Michel Georges and Richard Spelman used Hardy–Weinberg deviation to find lethal mutations affecting embryo development.
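The idea of using Hardy–Weinberg deviation to flag recessive lethals can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in the talks: the function names and the genotype counts are hypothetical, the allele frequency is estimated from the same sample, and the test is a plain one-sided binomial probability of seeing so few homozygotes.

```python
from math import exp, lgamma, log, log1p


def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability of exactly k successes (0 < p < 1)."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log1p(-p))


def homozygote_deficit(n_ref_hom, n_het, n_alt_hom):
    """Return (q, expected_alt_hom, p_value) for a deficit of alt homozygotes.

    q is the sample frequency of the alternative allele, the expectation is
    n * q^2 under Hardy-Weinberg proportions, and p_value is P(X <= n_alt_hom)
    for X ~ Binomial(n, q^2).
    """
    n = n_ref_hom + n_het + n_alt_hom
    if n == 0:
        raise ValueError("no genotyped animals")
    q = (n_het + 2 * n_alt_hom) / (2 * n)
    if not 0.0 < q < 1.0:
        raise ValueError("need a segregating variant (0 < q < 1)")
    p_hom = q * q
    expected = n * p_hom
    # Sum the lower tail in log space so large n does not overflow.
    p_value = sum(exp(log_binom_pmf(k, n, p_hom)) for k in range(n_alt_hom + 1))
    return q, expected, min(p_value, 1.0)


# Hypothetical counts: 10,000 genotyped animals, 1,000 carriers, no homozygotes.
q, expected, p_value = homozygote_deficit(9000, 1000, 0)
print(f"q = {q:.3f}, expected homozygotes = {expected:.1f}, P = {p_value:.2e}")
```

With a carrier frequency of 10%, some 25 homozygotes would be expected among 10,000 animals, so observing none is very unlikely by chance. This also shows why very large genotyped or imputed populations are needed when the variant is rare.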
They used some 500 bull sequences (7X on average) with imputation for 10 million animals. Aurélien Capitan et al. at INRA further isolated several causal mutations affecting rare syndromes using the 1000 bull genome data, which consist of 1147 highly influential bulls from 27 breeds sequenced at an average of 11X. Using the same data, Ben Hayes found that with BayesR the accuracy of genomic selection is only some 2% higher than with a dense marker set. Imputation accuracy was 90% for minor allele frequency (MAF) > 0.1, and dropped dramatically for lower MAF. The issue, of course, is that low-MAF variants are the most frequent ones in sequence data, and this could be one of the reasons why complete sequence did not help.

After all, the subdued appearance of NGS data in the meeting is possibly caused by the difficulties in their analysis, which are much greater than anticipated. Such data require costly computer resources and are noisy for the scale needed in animal breeding. Their analysis is also complicated by limited sequencing depth (which introduces uncertainty in SNP calling). We have observed significant changes in the SNP calling process when using two different versions of the same software, such as the samtools mpileup tool. Jerry Taylor confessed he had become an NGS addict, with 500 bulls sequenced. So had I, but I am now in detoxification treatment for the reasons given. As penance, I now devote most of my time to developing analytical tools to make the most of the data already available, and to optimizing experimental designs before it is too late.

Several communications (e.g. Jerry Taylor, Vincent Ducrocq, Ben Hayes, Mike Goddard) revolved around the utilization of causal mutations in selection for complex traits, which seems a bit counterintuitive in the genomic selection paradigm. Several decades of research and sequencing have proven how difficult this is, even if the causal mutations are present in your data.
At WCGALP several authors (Peter Sørensen and Mike Goddard, among others) recognized the importance of using biological information for prediction purposes. I fully agree, but an accepted or meaningful way to do so remains to be found. Perhaps one could start by recognizing that not all SNPs are born equal and introduce annotation in the model. Tools like the Variant Effect Predictor in Ensembl classify SNPs according to their expected degree of severity. When using sequence, this information can be readily taken into account in the priors. However, this is not relevant for genotyping arrays because most chip SNPs will be intergenic and likely neutral per se. In this paradigm, sequence data could make a difference. As you can guess, I do not dare make any prediction for the New Zealand event, though.

Genomic selection matures (Ole F. Christensen)

A very interesting symposium was about industry applications of GS. The session started with dairy cattle and Esa Mäntysaari’s historical perspective on the enormous impact of GS on the sector. For poultry, Anna Wolc presented the results from multi-generation GS and experiences of implementing GS in broilers and layers. In poultry the applications arrived later than in dairy cattle, primarily because of the prohibitive expense of genotyping relative to the value of individual selection candidates. Atlantic salmon is a very different species due to its much later domestication (only some ten generations ago). The present population is an admixture. Another feature is the high fecundity in both males and females. Jørgen Ødegaard praised the high potential of GS in aquaculture and compared GS models in a two-trait context (lice resistance and fillet colour). There are clearly species-specific issues in the GS applications.

Several presentations were about methodology for single-step genomic evaluation (ssGBLUP) using a hybrid relationship matrix, with a good overview given by Andres Legarra.
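As a reminder for readers less familiar with the hybrid relationship matrix, it is usually written through its inverse. The notation here is assumed for illustration and not taken from the talks: A is the pedigree relationship matrix, A_22 its block for the genotyped animals, and G the genomic relationship matrix (in practice often blended or rescaled to be compatible with A_22).

```latex
\[
\mathbf{H}^{-1} \;=\; \mathbf{A}^{-1} \;+\;
\begin{pmatrix}
\mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1}
\end{pmatrix}
\]
```

While the inverse of A is sparse and cheap to set up, both G and A_22 are dense and must be inverted explicitly in this form, which becomes costly as the number of genotyped animals grows.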
Ismo Strandén presented an equivalent equation system for solving ssGBLUP without constructing and inverting the pedigree relationship matrix for genotyped animals. Dorian Garrick formulated ssGBLUP as a SNP effect model. From a conceptual point of view it is very important to have the two equivalent formulations. Zenting Liu said that the SNP model allows specific animals to be excluded from or included in the training data.

Prediction across breeds was a topic with many presentations, and several groups are developing useful approaches. Mario Calus concluded that predictions using information across breeds benefit from a few closely related individuals, whereas some individuals can degrade the predictions. Shared large-effect QTL improve the prediction across breeds (Mahdi Saatchi and Dorian Garrick), and they could be detected from imputed whole-genome sequence (e.g. Rasmus Brøndum).

In addition to the many sessions focusing by name on GS, there were presentations about genotyping and phenotyping strategies, and presentations where GS (or ssGBLUP) was not of primary interest but a natural part of the genetic evaluation. GS is now more mature and past the peak of uncontrolled enthusiasm. Many of the challenges seen with pedigree-based genetic selection are still present. Because the cost of genotyping is steadily coming down, there will be many more marker-based evaluation programs. By the next WCGALP, I would expect to see many studies where the main focus is not on GS but marker genotypes (or causal variants) are included in the genetic model.

Developments in quantitative genetics (Julius H. J. van der Werf)

Quantitative genetics is the foundation of much of the work in animal breeding. At the conference it was covered by the session ‘Breeding objectives, economics of selection schemes, and advances in selection theory’ but also appeared under many other topics.
The amazing genomic toolbox requires sound quantitative genetic theory to underpin models of analysis while challenging assumptions. So genomics is causing a revolution similar to the one almost a hundred years ago, when Ronald Fisher and Sewall Wright proposed to use pedigree information to enhance genetic analysis of quantitative traits.

Genome-wide association studies give us clues about the size and distribution of the gene effects that control quantitative genetic variation, and further about gene-by-gene and gene-by-environment interactions. Analyses with dense markers give information about the level of heterozygosity (or its absence), genetic diversity and signatures of selection. Molecular information also provides a tool to gain greater insight into identity by descent at the level of a single locus, and from that we can derive coefficients of covariance for a range of genetic effects. For example, dominance variation can be estimated based on variation in genomic dominance relationships, and there is no longer a requirement to have full-sib families. Theo Meuwissen pointed out how information on pedigree, linkage and linkage disequilibrium each contributes to genomic prediction. The relative importance depends on the true genetic model, with the linkage-based approach being more important for large QTL effects.

The veil over the underlying genetic model is slowly being lifted. Some believe that the missing heritability problem is mainly due to non-additive genetic effects (e.g. Zuk et al., PNAS 109: 1193). However, Asko Mäki-Tanila and Bill Hill showed that epistatic effects rarely contribute much to the observed variation, and taking them into account in either selection or GWAS strategies is unlikely to have a large impact.
These studies are good attempts to reconcile the top-down and bottom-up approaches in quantitative genetic analysis, as suggested by Eric Lander at the International Quantitative Genetics Symposium in Edinburgh in 2012. With sequence data we are able to detect more causal variants, but we are far from explaining the observed (additive) genetic variance with detected QTL effects, and terabytes of data will have to be ground through to reach an explanation. The resurgence in the hunt for QTL is a logical next step in the genomic prediction models. Hopefully, we can now make use of the lessons learned more than a decade ago. For example, selection on QTL alone is less optimal than joint selection on QTL and polygenic background. This would be difficult to achieve if genomic selection were based on just a few QTL, as suggested by Jerry Taylor for the use of sequence information.

The plain phenotypic variation could be analysed by new genetic models, e.g. in the analysis of maternal and social effects, traits measured on trajectories and genotype-by-environment interactions. These could reveal nonlinear relationships between traits, as shown by Han Mulder, and these effects could also be selected for. Epigenetic studies had not arrived in large numbers at the WCGALP in spite of being, for some years now, a hot topic in human genetic analyses. Only eight studies looked at epigenetics, with simple variance components or with gene expression or methylation patterns. I would expect that such studies will also become more common in animal genetics, with interesting phenomena as an outcome. Yaodong Hu, Guilherme Rosa and Daniel Gianola showed that imprinting could lead to a significant reduction in the GWAS heritability.

It was good to see that optimal contribution selection has now become part of the regular animal breeding toolbox.
John Woolliams considered cases where selection accuracy is equal to one and stated that ‘if accuracy does not approach one with huge numbers, then the community needs to completely overhaul the basis of its most cherished models for genetic evaluation’. We’ll have to see!

In the symposium on Breeding Objectives some excellent insights were presented that are relevant to achieving successful outcomes in animal improvement programmes. The debate continues because the assessment of utility can vary between circumstances and people. The area lacks a comprehensive theory and, if anything, we were made aware that the existing framework of (linear) selection index principles is rarely found adequate when determining relative trait emphasis in multiple-trait selection practices (Pieter Knap; Rob Banks). Jack Dekkers made it clear that the way in which breeding objectives are achieved largely depends on the information available per trait, and that, for example, genomic information may change the direction of genetic progress. It is somewhat ironic that the availability of genomic markers has sparked more interest in phenotyping, not less, with several sessions devoted to phenotyping for traits that are difficult to measure. At the end of the day, genomic information just helps us to make better inferences, but phenotypic information remains the basis of genetic improvement.

Statistical animal breeding, genomics and prediction of breeding value (Rodolfo J. C. Cantet)

Animal breeders have used a plethora of statistical methods to predict breeding values when adding genomic information to phenotypes and to pedigree data.
At the Leipzig WCGALP two different models for prediction of breeding value entered the genomic arena: the independent multiple marker (SNP) model by Theo Meuwissen and co-workers (2001 Genetics 157, 1819), and the infinitesimal animal model with the covariance matrix of independent markers defined by Paul VanRaden (2008 J Dairy Sci 91, 4414), worked out as ssGBLUP by Ignacy Misztal and co-workers (see Legarra et al. 2009 J Dairy Sci 92, 4656). In Vancouver we now witnessed (un)intentional efforts to converge from either side towards a model that takes into account the genetic architecture of the trait and is consistent with the infinitesimal model that has served so well until now. Reflecting my thoughts as somebody dealing with genetic evaluation for beef cattle: (i) I try to avoid situations where I have to compare the genomic and conventional methods, as the existence of two different predictions of breeding value affects the confidence in either method; (ii) if a bull or cow has no new phenotypic data but has genomic information (or not), it is easy to explain a change in the predicted breeding value from ssGBLUP. However, this is not easy with the SNP model, in particular if imputation and Bayesian algorithms have been used.

The basic difficulty with both models is that, under linkage and linkage disequilibrium, genome segments rather than SNPs are transmitted over generations. Hence there are neither independent SNP effects nor truly permutable markers with which to calculate genomic relationships (see Thompson 2013 Genetics 194, 301). Therefore VanRaden’s predictor of the true genomic relationship partially captures the pattern of inheritance under Hardy–Weinberg, but it does not take into account the genetic architecture (variable gene effects over the genome) behind the trait variation. The first session on Monday morning made all of this evident. Dorian Garrick and Vincent Ducrocq presented models accommodating individual markers and polygenic effects.
Gustavo de los Campos gave a clever presentation on the difficulties in equating the estimate of the additive genetic variance from the SNP model with the one from the infinitesimal model. The former works well for prediction, but it does not allow the estimation of additive variance. The search for the covariance matrix and the linear prediction model that takes genomic information into account was presented by Gregor Gorjanc, and also by Zulma Vitezica on metafounders. I showed with her how genomic information is utilized by the classic regression approach of breeding values across generations to track the Mendelian sampling effects and thereby account for more additive variance than with the conventional animal model.

When the (true) underlying genetic model is undefined, the accuracy of prediction cannot be defined in the usual way and we cannot compare the two methods of utilizing genomic information. We have been performing genetic evaluation for decades assuming the infinitesimal model, as it has proved to be consistent with the observations. Usually overlooked by animal breeders, asymptotic theory arguments formalizing the infinitesimal model have been given by Lange (1978, J Math Biol 6, 59) and by Abney et al. (2000, Am J Hum Genet 66, 629). Hopefully we will soon be able to perform genetic evaluation (as Dr. Henderson taught us in the non-genomic era) having all the benefits of genomic information and taking into account the genetic architecture of the trait.
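For readers who want to see why the marker-based relationship predictor discussed in this section ignores genetic architecture, a minimal sketch of the first method in VanRaden (2008) may help: genotypes coded 0/1/2 are centred by twice the allele frequency, and the cross-products are scaled by 2Σp(1−p). The function name `vanraden_g` and the toy genotypes are hypothetical, and allele frequencies are taken from the sample itself as a simplifying assumption (base-population frequencies are preferable when known).

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix G = ZZ' / (2 * sum p_j (1 - p_j)).

    genotypes: list of rows (one per animal) with allele counts 0, 1 or 2
    for each SNP. Returns G as a list of lists.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample allele frequencies (assumption: no base-population estimates).
    p = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Centre each marker; every SNP then enters G with the same weight.
    z = [[row[j] - 2 * p[j] for j in range(m)] for row in genotypes]
    return [[sum(z[a][j] * z[b][j] for j in range(m)) / scale
             for b in range(n)]
            for a in range(n)]


# Toy example: three animals, three SNPs (hypothetical data).
toy = [[0, 1, 2],
       [2, 1, 0],
       [1, 1, 1]]
for row in vanraden_g(toy):
    print([round(x, 3) for x in row])
```

Because every marker is treated alike, nothing in this matrix reflects where in the genome the genes affecting a given trait are located or how large their effects are. Accommodating that would require, for example, marker-specific weights or additional model terms.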