Canola pan-genome map for better, more resilient varieties
Sequencing of canola genomes a major step for breeders.
December 8, 2021 By Julienne Isaacs
An international coalition of researchers, led by an Agriculture and Agri-Food Canada (AAFC) scientist, the Global Institute for Food Security (GIFS) at the University of Saskatchewan and Israeli bioinformatics company NRGene, has mapped the canola pan-genome: the entire set of genes found across the crop's varieties.
It’s a major step forward for the canola industry and will lead to advances in breeding, says Andrew Sharpe, director of genomics and bioinformatics for GIFS, who co-led the effort with AAFC research scientist and GIFS affiliate researcher Isobel Parkin.
The initiative, called the International Canola Pan-Genome Consortium, was established in 2019 and included contributions from key players in the canola industry based in Canada, the United States, Europe and Israel, including Bayer, Corteva, NuSeed and Nutrien Ag Solutions, as well as NRGene.
NRGene uses artificial intelligence-based genomics tools to accelerate breeding programs around the world. The genomics support was necessary to help assemble a huge amount of data, Sharpe says.
The project involved sequencing 12 canola and rapeseed varieties using NRGene’s DeNovoMAGIC software. Once the varieties were sequenced, NRGene compared each variety’s chromosome-level genome sequence to the others and built the pan-genome database.
Sharpe calls this resource “foundational.” He’s engaged in similar projects in wheat and camelina. For each crop, a completed pan-genome can be used for research “in perpetuity.”
“You can start to collect sequence data from other genotypes or lines for a particular crop and then align the data you’ve got from those genotypes against the reference pan-genome,” he explains. “That allows you to identify new genetic variation from diverse sources, which ultimately leads to variation in traits, which is of interest to breeders and breeding companies.”
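The idea Sharpe describes can be pictured with a toy sketch. It assumes, purely for illustration, that each variety is reduced to a set of gene names; the variety and gene names and both helper functions are hypothetical, and real pan-genome work compares aligned DNA sequence rather than name lists.

```python
def build_pan_genome(varieties: dict[str, set[str]]) -> dict[str, set[str]]:
    """Summarize gene sets from several varieties (toy model, not a real pipeline)."""
    if not varieties:
        raise ValueError("at least one variety is required")
    gene_sets = list(varieties.values())
    pan = set().union(*gene_sets)          # every gene seen in any variety
    core = set.intersection(*gene_sets)    # genes shared by all varieties
    return {"pan": pan, "core": core, "variable": pan - core}


def novel_genes(new_line: set[str], pan: set[str]) -> set[str]:
    """Genes in a newly sequenced line that the pan-genome does not yet contain."""
    return new_line - pan


# Hypothetical example data: three varieties and one newly sequenced line.
varieties = {
    "variety_A": {"gene1", "gene2", "gene3"},
    "variety_B": {"gene1", "gene2", "gene4"},
    "variety_C": {"gene1", "gene3", "gene4"},
}
summary = build_pan_genome(varieties)
print(sorted(summary["core"]))
print(sorted(novel_genes({"gene1", "gene5"}, summary["pan"])))
```

In this simplified picture, anything a new line carries that is missing from the reference set is a candidate source of the "new genetic variation" breeders are looking for.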
Following publication of the pan-genome, the consortium will make it available to the entire canola research and breeding community, Sharpe says.
Sharpe says the consortium wanted to characterize small- and large-scale variations between the genomes.
In other words, they wanted to analyze differences between single nucleotides, where one single base in the sequence changes to another base, as well as larger differences, such as structural variation involving “chunks” of DNA in different varieties, where portions of chromosomes are duplicated, deleted, inverted, or even moved to another chromosome.
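The two scales of variation can be illustrated with a short sketch. It is not the consortium's method: the `Difference` record, the function names and the 50-base cutoff for calling an event "structural" are assumptions made here for illustration (a cutoff of that order is a common convention, but thresholds vary).

```python
from dataclasses import dataclass

# Assumed cutoff, in bases, above which an event is treated as structural.
SV_MIN_LENGTH = 50


@dataclass(frozen=True)
class Difference:
    kind: str    # e.g. "substitution", "deletion", "duplication", "inversion", "translocation"
    length: int  # number of bases affected


def scale(diff: Difference) -> str:
    """Label a difference as single-nucleotide, small, or structural."""
    if diff.kind == "substitution" and diff.length == 1:
        return "single-nucleotide"
    if diff.length >= SV_MIN_LENGTH:
        return "structural"
    return "small"


def snp_positions(reference: str, variety: str) -> list[int]:
    """Zero-based positions where two equal-length, already aligned sequences differ."""
    if len(reference) != len(variety):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(reference, variety)) if a != b]


print(snp_positions("ACGTACGT", "ACGTTCGT"))  # one base differs, at position 4
print(scale(Difference("inversion", 12_000)))  # a large inverted "chunk"
```

Single-base changes can be spotted by comparing short aligned stretches, whereas the larger events only become visible when long, contiguous assemblies are available to compare.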
“This structural variation looks to be a very important type of variation, often associated with key traits, and which we struggled to characterize previously,” he says.
“With whole genome assembly, you end up with very long contiguous bits of genome, which provides you with an excellent framework for identifying larger structural events, particularly those that have impacts on traits like disease resistance and resistance to abiotic stress. We can look to see if there are strong associations with the traits.”
GIFS’ role in the project, Sharpe says, was to use its in-house informatics analysis and data storage capabilities to identify variation between the different lines, draw out comparisons across entire genomes, and compile that information in a database. GIFS, through the Plant Phenotyping and Imaging Research Centre (P2IRC) that it also manages on behalf of the University of Saskatchewan, has genome-viewing platforms that can be used to compare variations between genomes in what he calls an “intuitive” visual format.
Breeders can use this database of genetic information to select particular genotypes that are associated with a major resistance response to a fungal pathogen or to environmental stress, to name two examples.
They’ll then select different combinations of genomes so they can find the sections that provide beneficial responses “in a single line,” he explains. This means screening potentially thousands of lines in the lab and selecting promising candidates before moving them to field trials. This kind of advance access to information about traits has the potential to dramatically accelerate breeding efforts.
Another example of how the pan-genome could be used is a project currently being led by Parkin, Sharpe says. It involves crossing 50 very diverse lines with the same reference line to produce several thousand progeny. Because everything is crossed to the constant reference line, the researchers can see more robust associations with particular genotypes. “We can use the pan-genome information to develop better tools to interrogate the generated lines,” he says.
Sharpe believes the project will soon pay dividends for producers. Climate change will result in more extreme environmental conditions and tougher growing conditions, he says, and the accelerated development of new and more resilient varieties will increase the likelihood of stable yields even in tough years.
“What we want to see is the availability of new cultivars that will provide [producers] with greater security.”