Sequencing and assembling highly heterozygous and/or repeat-rich plant genomes using Oxford Nanopore technology

Victor A. Albert

Victor A. Albert, University at Buffalo


The development of long-read sequencing technologies has entirely changed the landscape of possibilities for sequencing and assembling complex plant genomes. While Pacific Biosciences SMRT sequencing has served admirably for a number of years now, Oxford Nanopore technology is highly portable and requires much less up-front investment. Concerns have arisen over nanopore error rates compared to PacBio or Illumina, but, at least with current versions of flow cells, chemistry, and base-calling, we no longer find these misgivings tenable where the goal is a highly contiguous genome. Still, a combined approach is required, at minimum including polishing with low-error but cheap Illumina reads. Assemblies at the chromosome scale often require further efforts, such as Hi-C scaffolding, but the workflow is now democratized to the extent that any university lab should be able to generate a high-quality genome of its choice. We use case-by-case workflows for generating chromosome-scale assemblies of plant genomes of various sizes. Unfortunately, no single assembly approach (e.g., de Bruijn graph or overlap-layout-consensus) works best for all species, given their varied heterozygosities, ploidy levels, and transposable element blooms (the latter two also varying in the ages of the underlying events). Despite not yet having achieved a true pipeline approach, we are satisfied with our ability to generate excellent de novo genomes on unprecedentedly low time and cost scales. We will describe several of our recent projects, the individual challenges encountered, and how they were overcome.


Prof. Victor A. Albert is currently Empire Innovation Professor of Biological Sciences at the University at Buffalo, USA, and Visiting Professor in the School of Biological Sciences, Nanyang Technological University, Singapore. Prof. Albert’s research employs genomic approaches to understand problems in plant evolutionary biology. His current research interests include genome sequencing and biodiversity “omics” analysis of the flora of Bukit Timah Nature Reserve in Singapore, which contains almost 1,200 plant species representing most of the flowering plant clades on Earth, as well as understanding the genetic basis for convergent evolution and “adaptive” radiations of plant forms, such as carnivorous plants. This work requires complete genome sequencing of carnivores from different plant families, as well as examining the role of mechanistic co-option in the evolution of carnivorous plant physiology, for example, through the repurposing of pathogenesis-related gene functions. He is also interested in population genomic approaches to the study of interspecies admixture, local environmental adaptation, and the evolution of agriculturally important traits.