Long-read nanopore cDNA sequencing and direct DNA methylation detection resolve copy number debate in Cannabis

Todd Michael

Todd Michael, Salk Institute for Biological Studies


DNA methylation plays an essential role in governing the architecture of gene expression across eukaryotes. In plants, DNA methylation not only protects the genome from transposable element (TE) jumping by repressing TE expression, it also mediates environment-specific expression responses. Here we use direct DNA methylation detection coupled with full-length cDNA sequencing across several Cannabis chemotypes (high and low THC/CBD) to understand the final step in the cannabinoid pathway. This final step is catalyzed by tetrahydrocannabinolic acid synthase (THCAS) and/or cannabidiolic acid synthase (CBDAS), which are encoded by intron-less genes nested in 50-80 kb TE units that are tandemly duplicated at several loci in the genome. Because the THCAS/CBDAS loci are complex and the synthase copies are highly similar, full-length cDNA sequencing was required to resolve which synthases are expressed in the different chemotype genomes. While 10-15 copies of THCAS/CBDAS are found across high-quality genome assemblies, only one synthase is expressed in chemotype-specific plants, and only one THCAS and one CBDAS are expressed in dual-chemotype plants. These expression differences cannot be explained by the promoter regions, which are completely conserved across the tandem repeats. Again because of the complexity of the synthase loci, short-read bisulfite sequencing fails to distinguish the DNA methylation patterns at these loci. However, direct DNA methylation detection enables characterization of DNA methylation patterns across these complex loci, providing evidence that DNA methylation in part explains the observed expression patterns. These results have implications for the modulation of THC/CBD content and for the application of full-length cDNA sequencing and direct DNA methylation detection in complex plant genomes.


Dr. Todd Michael is a Research Professor at the Salk Institute for Biological Studies, USA, where his group specializes in the sequencing and analysis of plant genomes. His group is part of the Harnessing Plants Initiative at the Salk, which aims to fight climate change by developing crop plants that sequester more carbon via extensive root systems containing recalcitrant carbon polymers.