Bridging the gap: long reads enable more contiguous assembly of repeat-rich plant genomes

Larissa Ramsay, University of Saskatchewan


Because of its large size (4 Gb) and highly repetitive nature, the genome of cultivated lentil (Lens culinaris) yielded highly collapsed and fragmented assemblies from short reads. Only 2.7 Gb could be assembled (scaffold N50 54 kb), and only 1.1 Gb anchored, despite using a combination of long mate-pair data, chromosome flow sorting, a consensus genetic map, and an optical map. Long-read technology has brought dramatic improvements, with 3.7 Gb (contig N50 1.4 Mb) now assembled from 50-fold coverage of the genome. The updated assembly uncovers an additional 25% of the lentil genome with a 121-fold increase in sequence contiguity. A Hi-C contact map provided final anchoring and orientation for 92% of the long-read contigs. Using the same process, we have also assembled de novo several wild relatives of interest to the lentil breeding program. Higher assembly contiguity allows us to fully characterize structural rearrangements previously seen only in interspecific genetic maps, as well as to identify novel structural variation relative to the cultivated species. Improved assembly of highly repetitive regions and intact transposable elements also allows a closer look at genomic expansion within the genus Lens.
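For readers less familiar with the N50 figures quoted above, the statistic can be sketched as follows. This is a minimal illustration of the standard definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembled bases), not the tooling used for the lentil assemblies. The function name `n50` and the toy lengths are hypothetical.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches at least half of the total assembled length;
    the length of the sequence that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length
    # Unreachable for non-empty input; kept for type checkers.
    return ordered[-1]


# Toy example with made-up lengths (not lentil data).
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_lengths)
```

Because N50 depends only on the length distribution, a few long contigs raise it sharply, which is why long reads that span repeats tend to improve the statistic so markedly.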


Larissa Ramsay is the lead bioinformatician in the Bett lab at the University of Saskatchewan’s Department of Plant Sciences. She has been pivotal in assembling the highly repetitive genomes of multiple lentil species and is currently immersed in an analysis of their abundant structural rearrangements. She has also contributed to work on other species of agricultural interest, from chickpea to canola. Before moving to the University of Saskatchewan, she was employed as a bioinformatician at the National Research Council of Canada.