Portable DNA sequencing in remote East African farms
Plant viruses are among the greatest problems facing crop production worldwide. Their effects are most severe in the developing world, where small-scale farming is a major source of food production and where knowledge and resources for disease management are limited. Cassava is one of the most important crops in Africa. It is a staple food for close to one billion people in the tropics, but its production is threatened by two viral diseases, cassava brown streak disease and cassava mosaic disease. These diseases are transmitted by infected planting cuttings and by whiteflies. Early detection of cassava viruses is crucial because it helps farmers decide which cassava variety to plant and/or where to source clean planting material. Our team used three portable, battery-powered devices to produce an effective on-site diagnostic for viruses in cassava: the PDQeX DNA purification system, developed by ZyGEM NZ Ltd in New Zealand, and the MinION sequencer with the MinIT, both developed by Oxford Nanopore in the United Kingdom. The PDQeX system extracted purified DNA using a cocktail of enzymes in thermo-responsive extractor cartridges inserted in a temperature control unit. A DNA library was then prepared and loaded onto the MinION for sequencing. The MinIT performed real-time basecalling, and the viruses affecting the cassava field were identified using a customized BLAST search.
Rapid genomic discovery using VolTRAX library preparation and nanopore sequencing
Here we demonstrate a rapid, simplified, and portable sequencing platform that combines the VolTRAX V2 and the MinION. We applied this platform across two distinct research areas. 1) The environmental reservoir of antibiotic resistance: environmental bacteria can act as a reservoir of opportunistic pathogens despite a lack of exposure, and identifying specific antibiotic-resistant microbes is essential for quick and appropriate treatment. 2) Detecting genomic variation in cultured cancerous cell lines: because cancerous cell lines are often used in drug discovery and testing, it is important to understand whether long-term laboratory growth might play a role in cellular responses to various genotoxic stressors. In this proof of principle, each library was constructed by undergraduate researchers with introductory laboratory skills using the guided, automated library preparation of the VolTRAX V2. Sequencing was carried out using the MinION, the portable Oxford Nanopore DNA sequencer. We found that this combination provides a reliable and repeatable bench experience for early-career researchers.
Efficient reading of DNA-encoded cellular memories using nanopore sequencing
DNA recombinases are site-specific enzymes that can precisely manipulate the structure of DNA molecules. They allow sections of DNA to be flipped or excised, or new regions of DNA to be inserted at specific locations. Because these changes are permanent, synthetic biologists have been able to create DNA-based memory that records the events cells experience in the structural configuration of their DNA. While prototypes of such devices have demonstrated the feasibility of this approach, a severe lack of quantitative measurements and the limited number of genetic designs still hamper their widespread use. To address this, we are systematically constructing large libraries of DNA-based memory register designs and assaying their switching performance en masse using multiplexed nanopore sequencing. This provides us with precise measurements of structural changes occurring over time and illustrates the benefits that the long reads generated by nanopore sequencing have for bioengineers developing DNA-based technologies.
Multiplexed direct quantification of barcoded protein reporters on a MinION
Genetically encoded reporter proteins are a cornerstone of molecular biology widely used to measure many biological activities, but the current number of unique reporters that can be used together for multiplexed tracking is small due to overlapping detection channels such as fluorescence. We therefore built an expanded library of orthogonally-barcoded nanopore-addressable protein tags engineered as reporters (NanoporeTERs), which can be read by nanopore sensors at the single-molecule level. By adapting a commercially available nanopore sensor array platform typically used for real-time DNA/RNA sequencing (Oxford Nanopore Technologies MinION), we show direct quantification of individual NanoporeTER expression levels at the protein level from engineered bacterial cultures, with little to no sample preparation. These results open new applications for multiplexed, real-time tracking of complex biological phenomena not possible with conventional protein reporters using portable, high-throughput nanopore sensor technology.
Transcriptome profiling of colon cancer by nanopore sequencing
With Oxford Nanopore technology, full-length transcripts can be read in a relatively short period of time, enabling the detection of previously unidentified transcripts such as novel isoforms or fusion genes. Despite these analytical advantages, no attempt to characterize the cancer transcriptome using nanopore sequencing technology has been made. We sequenced 20 paired samples from Korean colon cancer patients using Oxford Nanopore cDNA-PCR sequencing. We attempted to identify isoforms that could not be identified on other sequencing platforms. We also sought to detect differentially expressed transcripts, including non-coding RNAs, to reveal their molecular and functional analogy as well as their clinical implications in colon cancer. Moreover, using the nanopore data, we attempted to detect fusion genes in colorectal cancer that might have potential diagnostic implications. Overall, we have characterized the transcriptome profiles of Korean colon cancer patients using nanopore data to identify potential biomarkers for diagnostic prediction. Such technology could allow us to find biomarkers that are more precise and accurate than those obtained with previously used methods for defining transcriptome profiles, and to propose new molecular mechanisms for a better understanding of colon cancer.
Nanopore sequencing of full-length circRNAs in human and mouse brains reveals circRNA specific exon usage and intron retention
CircRNAs are a poorly understood class of non-coding RNAs, some of which have been shown to be functionally important for cell proliferation and development. CircRNAs mainly derive from back-splicing events of coding mRNAs, making it difficult to distinguish the internal exon composition of a circRNA from that of the linearly spliced mRNA. To examine the global exon composition of circRNAs, we performed long-read sequencing of single molecules using Oxford Nanopore technology for human and mouse brain-derived RNA. By applying an optimized circRNA enrichment protocol prior to sequencing, we were able to detect 7,834 and 10,975 circRNAs in human and mouse brain, respectively, of which 2,945 and 7,052 are not currently found in circBase. Alternative splicing was more prevalent in circRNAs than in linearly spliced transcripts, and notably more than 200 previously unannotated exons were used in circRNAs. We conclude that Oxford Nanopore technology provides a fast and reliable sequencing method for mapping the specific exon composition of circRNAs.
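As a quick check on the scale of these numbers, the share of detected circRNAs that are absent from circBase can be computed directly from the counts given above. The function name below is illustrative and not part of the authors' analysis.

```python
def novel_fraction(detected: int, not_in_circbase: int) -> float:
    """Fraction of detected circRNAs that are not currently found in circBase."""
    if detected <= 0:
        raise ValueError("detected must be positive")
    return not_in_circbase / detected


# Counts taken from the abstract: (detected, not in circBase)
counts = {"human": (7834, 2945), "mouse": (10975, 7052)}

for species, (detected, novel) in counts.items():
    print(f"{species}: {100 * novel_fraction(detected, novel):.1f}% not in circBase")
```

On these counts, a little over a third of the human circRNAs and close to two thirds of the mouse circRNAs are not in circBase.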
Molecular tagging with nanopore-orthogonal DNA strands
Molecular tagging is an approach to securely label physical objects of high value using DNA or other molecules. An ideal system should be inexpensive, quick and reliable to decode, and require minimal equipment. We created a novel molecular tagging system using DNA-based tags and the MinION nanopore device. In our tagging system, we first encode a 32-bit digital tag into a 90-bit codeword that is more robust to errors. We then convert it into a DNA-based molecular tag where each 1 or 0 in the codeword is represented by the presence or absence of a molecular bit (molbit). A single molbit is a DNA strand that modularly combines a unique barcode structure with a specific strand length. This allows us to classify molbits directly from the raw nanopore signal, avoiding basecalling to reduce compute time and misclassification errors. Molbits are prepared for readout at the time of tag assembly and are then stabilized by dehydration. These steps extend the shelf life of the tag, decrease decoding time, and make it robust to contamination from environmental DNA. The result is an extensible, real time, high accuracy tagging system that includes a novel approach to developing nanopore-orthogonal barcodes with applications beyond this system.
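The presence/absence representation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it omits the error-correcting step that expands the 32-bit tag into the 90-bit codeword, and the read-count threshold `min_reads` and both function names are assumptions made for the example.

```python
from typing import Dict, List, Set

CODEWORD_BITS = 90  # codeword length stated in the abstract


def codeword_to_molbits(codeword: List[int]) -> Set[int]:
    """Return the indices of the molbits to include in the physical tag.

    A 1 at position i means molbit i is present; a 0 means it is absent.
    """
    if len(codeword) != CODEWORD_BITS:
        raise ValueError(f"expected {CODEWORD_BITS} bits, got {len(codeword)}")
    if any(bit not in (0, 1) for bit in codeword):
        raise ValueError("codeword must contain only 0s and 1s")
    return {i for i, bit in enumerate(codeword) if bit == 1}


def counts_to_codeword(read_counts: Dict[int, int], min_reads: int = 5) -> List[int]:
    """Call each bit from per-molbit read counts.

    min_reads is a hypothetical threshold: a molbit is called present only
    if at least that many reads were classified as that molbit.
    """
    return [1 if read_counts.get(i, 0) >= min_reads else 0
            for i in range(CODEWORD_BITS)]


# Example: a codeword with three molbits present.
codeword = [0] * CODEWORD_BITS
for i in (0, 3, 89):
    codeword[i] = 1

tag = codeword_to_molbits(codeword)
# Simulated readout: molbit 17 shows a single stray read and is called absent.
observed = {0: 40, 3: 12, 89: 7, 17: 1}
decoded = counts_to_codeword(observed)
```

In a full system, the decoded 90-bit word would then be passed to the error-correcting decoder to recover the original 32-bit tag, so that a small number of wrongly called bits can be tolerated.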
Multi-copy qnrA1 plasmid causes elevated quinolone resistance in E. coli
Large multi-drug resistance (MDR) plasmids are often challenging to assemble and circularize due to repetitive sequences, numerous insertion elements, and regions sharing homology with the bacterial chromosome. These complications are exacerbated by copy number variation introduced by large duplication events involving multiple genes. In this work, we emphasize the utility of nanopore long-read sequencing to overcome these challenges by producing a complete sequence of a 200 kb MDR plasmid containing a duplication event mediated by insertion sequence ISCR1 transposition. The 20 kb multi-duplication introduced four additional copies of the qnrA1 quinolone resistance gene. The qnrA1 copy number appears to have a dosage effect, significantly increasing resistance to ciprofloxacin in E. coli strains carrying the plasmid. To our knowledge, this is the first report of a resistance mechanism associated with a stable qnrA1 duplication in an E. coli plasmid.
Semi-quantitative characterisation of mixed pollen samples using MinION sequencing and Reverse Metagenomics (RevMet)
The ability to identify and quantify the constituent plant species in mixed-species pollen samples has important applications in ecology, conservation, and agriculture. Metabarcoding methods for pollen can reveal the presence of species but do not allow reliable quantification. A PCR-free, shotgun metagenomics approach has greater potential for accurately quantifying relative abundances, but applying metagenomics to eukaryotes is challenging because few reference genomes are available. We have developed a pipeline, RevMet (Reverse Metagenomics), that allows reliable and semi-quantitative characterization of the species composition of mixed-species eukaryote samples, such as bee-collected pollen, without requiring reference genomes. Instead, reference species are represented by short-read genome skims. The skims are mapped to MinION-generated long reads from mixed-species samples, and each long read is uniquely assigned to a species. We genome-skimmed 49 wild UK plant species, validated our pipeline with mock DNA mixtures, and then applied RevMet to bee-collected pollen.
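The assignment step can be illustrated with a short sketch. It assumes the mapping has already been done and summarized as, for each long read, the fraction of its length covered by skim reads from each reference species. The `min_coverage` cut-off, the tie handling, and the function names are assumptions for illustration, not the published RevMet parameters.

```python
from collections import Counter
from typing import Dict, Optional


def assign_read(coverage: Dict[str, float], min_coverage: float = 0.15) -> Optional[str]:
    """Assign one long read to a single species, or to none.

    coverage maps species name -> fraction of the long read covered by that
    species' skim reads. The read goes to the best-covered species, unless
    the best coverage is below min_coverage (hypothetical threshold) or is
    tied with another species, in which case it stays unassigned.
    """
    if not coverage:
        return None
    ranked = sorted(coverage.items(), key=lambda item: item[1], reverse=True)
    best_species, best_cov = ranked[0]
    if best_cov < min_coverage:
        return None
    if len(ranked) > 1 and ranked[1][1] == best_cov:
        return None  # ambiguous: no unique best species
    return best_species


def relative_abundance(reads: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Proportion of assigned long reads per species."""
    assigned = (assign_read(cov) for cov in reads.values())
    counts = Counter(species for species in assigned if species is not None)
    total = sum(counts.values())
    return {species: n / total for species, n in counts.items()} if total else {}


# Toy input with made-up species labels and coverage values.
reads = {
    "read1": {"species_A": 0.60, "species_B": 0.10},
    "read2": {"species_A": 0.05},                      # below threshold
    "read3": {"species_A": 0.40, "species_B": 0.40},   # tie, unassigned
    "read4": {"species_B": 0.30},
}
abundance = relative_abundance(reads)
```

Counting reads rather than bases is the simplest choice here; weighting each assigned read by its length would be an equally plausible variant.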
MinION application: performing long fragment analysis on pure fungal cultures (3.5 kb and 6 kb) and genome analysis of Malassezia pachydermatis
Nanopore sequencing was assessed for two purposes. (i) Detection of fungal communities using 1D PCR barcoding, starting with pure cultures of Alternaria, Aspergillus, Candida, Malassezia, Microsporum canis, and Penicillium. We compared sequencing of the partial fungal operon (18S-ITS1-5.8S-ITS2-28S) of ~4 kb with sequencing of the whole operon of ~6 kb. Saccharomyces and Cryptococcus from the ZymoBIOMICS™ mock community were also sequenced as positive controls. Amplification of the partial and whole operon gave the same results. (ii) Malassezia pachydermatis, which is commonly found on skin and may be associated with skin diseases (e.g. atopic dermatitis), was sequenced using 1D native barcoding of genomic DNA and analysed with bioinformatics tools to obtain consensus sequences and to assess its similarity to other Malassezia species.
High-throughput annotation of full-length long noncoding RNAs using CapTrap-CLS
Our genome expresses thousands of long noncoding RNAs (lncRNAs), mRNA-like transcripts that do not encode any identifiable peptide product. Although lncRNAs perform diverse roles in the cell, the overwhelming majority of them remain uncharacterized. Decoding lncRNA function depends heavily on accurate annotations, which map the precise location of lncRNA genes in the genome. Present lncRNA annotations are far from complete, and this incompleteness can have a strong impact on downstream analyses. To improve the annotation of GENCODE lncRNAs in the human and mouse genomes, we developed the Capture Long-read Sequencing (CLS) method, which combines targeted RNA capture with long-read sequencing. Despite its benefits, CLS can produce transcripts with incomplete 5' ends. To address this limitation, we developed Cap-CLS, an upgraded version of the CLS method that uses the CapTrap protocol to enrich for the 5' ends of transcripts. Results obtained with nanopore sequencing indicate that Cap-CLS considerably improves the annotation of the 5' ends of transcripts.
Field forward sequencing in naval environments
Looking for a needle in a haystack: nanopore sequencing of a new Simian immunodeficiency virus (SIV) from a chacma baboon (Papio ursinus)
HIV-1 and HIV-2, together with the SIVs, comprise the primate lentivirus family. SIVs infect many Old World African primates, both wild and domesticated. Some SIV lineages co-evolve with their hosts, but cross-species transmission between different simian species and transmission from simians to humans can occur. SIV infection in a yellow baboon and a chacma baboon was reported in the 1990s on the basis of short sequences. We isolated SIV from the chacma baboon by co-culturing baboon PBMCs with human CD4+ cell lines and detecting virus by reverse transcriptase activity. High molecular weight (hmw) cellular DNA with integrated provirus was extracted in 1989 using the standard phenol-chloroform method and stored at 4°C for 20 years. Next-generation sequencing was attempted, but we could not assemble a complete SIV genome. After storage for another 10 years, the archived DNA was quantified using Qubit and its purity was evaluated using the NanoDrop. The 1D ligation kit was used for library preparation for nanopore sequencing on the GridION X5. Basecalled FASTQ sequences were imported into Geneious Prime 2019, and BBDuk was used for quality trimming and filtering. SIV proviral sequences were assembled against the short baboon reference sequence using the Map to Reference option of the Geneious 6.0.3 Read Mapper. After running BLAST on the consensus baboon sequence, we repeated the assembly process with the most closely related SIVagm isolate. The new baboon SIV had less than 80% nucleotide sequence identity to known SIV isolates. Preliminary phylogenetic analyses indicated that the baboon virus was related to the vervet subgroup of SIVagm. We are designing primers to amplify the baboon provirus so that the nanopore sequences can be compared with Sanger sequencing.
The fever tree: extracting and preparing the DNA of Cinchona pubescens
Identification of a new class of local copy number aberrations in lung cancer genomes using PromethION
Using PromethION, we generated whole-genome long-read sequencing data for five lung cancer cell lines at depths of 30X in LC2/ad cells and 10X in the other cell lines. We identified cancerous mutations, including point mutations, large deletions and chromosomal rearrangements, that had previously been reported using short-read sequencers. In addition, we unexpectedly detected complex middle-size structural variations, which we named Cancerous Local Copy-number Lesions (CLCLs). These CLCLs consisted of complex combinations of local inversions, duplications and micro-deletions, which are expected to cause functional alterations of the encoded proteins. CLCLs occurred even in cancer-related genes, such as STK11, NF1 and PTEN, where short-read sequencers could not confidently identify the aberrant structures. We further conducted similar analyses using ten clinical lung adenocarcinoma specimens and demonstrated that CLCLs are not specific to cultured cells but also occur in vivo.