Measuring transcriptomic diversity induced by genome SCRaMbLEing with nanopore direct RNA sequencing
Eukaryotic genomes are organized non-randomly. We are studying how abrupt changes to genome organization shape the transcriptional landscape using Sc2.0, a yeast strain composed entirely of designer, synthetic DNA. Synthetic chromosome rearrangement and modification by loxP-mediated evolution (SCRaMbLE) is a key design feature of the synthetic yeast genome project. SCRaMbLE uses Cre-mediated recombination to generate stochastic and complex rearrangements on demand at engineered recombinase sites throughout the genome. We have characterized extensive transcriptional diversity in >60 independent SCRaMbLE strains containing a circular synthetic chromosome, synIXR. These transcriptional alterations appear to be isolated to the SCRaMbLEd chromosome and dependent on rearrangements specific to each SCRaMbLE strain. Using Oxford Nanopore’s direct RNA sequencing, we have observed novel transcriptional events in SCRaMbLEd genomes, including alterations to transcript start and termination sites. Our results suggest an inextricable link between physical organization of the genome and transcript isoform expression.
Dr Aaron Brooks earned his PhD from the University of Washington and is currently an EMBL Interdisciplinary Postdoc (EIPOD) working with Dr Lars Steinmetz in Heidelberg, Germany. Aaron’s research harnesses synthetic and evolutionary biology to understand how the physical layout of the genome shapes its function. Aaron and his team have relied on nanopore sequencing to detect abrupt reorganization events in synthetic genomes and measure their consequences.
From ancient tomb to animal viruses: mobile suitcase lab for nanopore sequencing at field setting
Nanopore sequencing technology can be applied to identify the pathogen responsible for an outbreak by sequencing all nucleic acids present in the collected sample in a single run. In addition, it gives insight into the origin and variant of the causative agent. We have established a novel sequencing protocol relying on nanopore sequencing and offline BLAST search, alongside a microbiome screening of an ancient tomb. The whole procedure was conducted in a solar-powered mobile suitcase laboratory, which is easy to use at the point of need. The procedure was completed in 5 hours, including extraction, barcoding, sequencing and data analysis, and did not require a bioinformatician. Our protocol enables rapid and reliable foot-and-mouth disease virus serotyping and the differentiation of the capripoxviruses (Sheep poxvirus, Goat poxvirus and Lumpy Skin Disease virus). The microbiome composition of the ancient tomb revealed a potential threat of respiratory illness due to bacteria from the family Bacillaceae. Furthermore, bacteria from the family Pseudomonadaceae gave hints of the former use of the tomb as a byre.
Dr Abd El Wahed studied veterinary medicine at Mansoura University in Egypt, and received his PhD in biology from Göttingen University, Germany in 2011. He has participated in the development of 30 point-of-care assays for the detection of infectious agents. In 2013, he was awarded the Young Investigator award from the ASTMH for the establishment of a mobile laboratory for rapid detection of haemorrhagic fever viruses in low-resource settings. Recently, he established a mobile suitcase laboratory for rapid detection of viruses, bacteria and parasites. The mobile setup has been used in field trials in Guinea, Sri Lanka, Nepal, Senegal, Egypt, Bangladesh and Brazil.
Going full circle: Assembly of high-quality, single-contig microbial genomes from the rumen microbiome using long-read sequencing
Ruminants such as cows and sheep are important livestock species. They convert low nutritional value plant matter into high-quality meat and dairy products. Within a specialised stomach called the rumen, microbes ferment difficult-to-digest plant matter, producing short-chain fatty acids. The composition of the rumen microbial community can affect the animal’s health, feed efficiency and level of methane production. Species in the rumen are typically difficult to culture and, despite its importance, the rumen remains an underexplored environment. DNA sequencing of the contents of the rumen offers the potential to identify microbial species without culture techniques. Here we sequence cow rumen fluid using Oxford Nanopore sequencing. We show that, despite these data coming from a highly complex microbial sample, we can assemble high-quality, single-contig whole genomes and plasmids of known and novel species, including numerous circular contigs. Additionally, we compare and validate the assemblies of these genomes with binned genomes generated from short-read Illumina assemblies. We show that the long-read assembly outperforms the short-read assembly in contiguity and in incorporation of important features such as AMR genes and marker genes.
Amanda Warr recently completed her PhD at The Roslin Institute in Edinburgh, UK. Her PhD research involved using genomics to investigate reproductive traits in pigs and reassembling the pig genome using long-read sequencing. Although this work was primarily in bioinformatics, she also spent time in the lab using the MinION and training others to use the sequencer. She has accumulated a number of MinION-related side projects and collaborations, including work in a variety of species on anti-microbial resistance, viral epidemiology, genome assembly in mammals and microbiomes, and diagnostics. Currently she is employed as a Postdoctoral Research Fellow at The Roslin Institute with Mick Watson and Christine Tait-Burkard, with main projects focussing on the rumen microbiome, functional genomics in chickens and tracking the spread of porcine reproductive and respiratory syndrome virus in the Philippines.
The phyllosphere of Euphorbia lateriflora and Ficus thonningii
The phyllosphere, which includes the microbiome of the leaves of plants, remains an understudied ecological niche. The bacterial colonizers of medicinal plants have barely been studied even though their host plants have been used widely in ethnomedicine. One limitation of studying phyllosphere bacteria is that they cannot be sufficiently classified using traditional culture and biochemical methods, largely because such methods were created with a focus on medically important bacteria, which represent only a small fraction of the group. Consequently, phyllosphere studies conducted before the era of genomics were mostly not species-specific or focused on bacterial pathogens/symbionts of food plants. It soon became clear as phyllosphere studies progressed that some organisms, including novel ones, were missed as they were not covered even in the classic Bergey’s Manual of Systematic Bacteriology. With the advent of sequencing, it has become possible to study phyllosphere bacteria in detail, both culturable and unculturable. Metagenomic DNA can be obtained from plant colonizers and sequenced directly. Phyllosphere studies currently employ sequencing to characterize bacteria, including novel species. We used sequencing to characterize phyllosphere bacteria of Ficus thonningii and Euphorbia lateriflora, which are Nigerian medicinal plants, hypothesizing that both plants are colonized by diverse bacteria which are advantageous. Bacteria were isolated on 0.1X tryptone soya agar, after which isolates were identified by 16S rRNA gene sequencing and compared by multiple sequence alignments and phylogenetics. Select isolates were further classified biochemically with Microbact 24E and genotypically by whole genome sequencing. A total of 100 and 77 isolates from F. thonningii and E. lateriflora respectively were identified by 16S rRNA sequencing. The dominant phylum and family from both plants were Proteobacteria and Enterobacteriaceae respectively. Twelve other bacterial families were encountered. Biochemical and genomic classifications of 14 isolates which could not be assigned to any species based on 16S rRNA analyses were discordant: Microbact identified all but two as Acinetobacter, while genome sequencing classified them as Brevibacteria, Agrococci, Kocuria and others. Plasmids, resistance and virulence genes were also detected in a subset of isolates. Study of phyllosphere bacteria was greatly limited by traditional microbiology methods until the introduction of sequencing into the field. Apart from correctly identifying organisms, sequencing provides extensive insight into the phylogeny and metabolic capacity of organisms, making it possible to predict the functions of the bacteria for host plants after a single sequencing experiment. This has greatly changed the ways in which the phyllosphere microbiome is studied.
Anderson O. Oaikhena is a doctoral student and teaching assistant in the Department of Pharmaceutical Microbiology at the University of Ibadan. He is currently studying bacterial colonizers of medicinal plants, with a particular interest in colonization of antibacterial plants as a framework for better understanding the ecology of antimicrobial resistance in nature. Anderson’s outstanding performance in the Master of Science degree in Pharmaceutical Microbiology in 2017 earned him a doctoral scholarship award from the postgraduate college of the University of Ibadan. He serves as a research associate in the Nigeria AMR surveillance network, where he is the point person for whole genome sequencing of antibiotic-resistant bacteria.
Human long-read whole genome sequencing: applications and results
Dante Labs moved long-read whole genome sequencing into wellness and social care by leveraging Oxford Nanopore PromethION technology. Results will be presented on large-scale long-read sequencing, the data obtained and the actual impact on people's lives.
Andrea Riposati is the CEO & co-founder of Dante Labs, the global whole genome data company. Prior to Dante Labs, Andrea was a Senior Product Manager at Amazon in Seattle, where he launched 3D Printing and the Marketplace for Professional Medical Supplies. Andrea holds an MS summa cum laude in Business from Bocconi University and an MBA from Harvard Business School.
Genomic profiling in acute myeloid leukemia with complex karyotype
Acute myeloid leukemia (AML) represents clonal expansion of malignant cells. Stratification of patients into risk groups is based on cytogenetics and molecular markers for a genotype-based treatment strategy. Conventional karyotyping, which is necessary for classification of “high-risk” AML, is available after 5 to 7 days. Using Oxford Nanopore sequencing, we established karyotyping based on shallow genome sequencing within 24 hours. The throughput of one flowcell was sufficient to achieve 3-fold genome coverage and reproduce results of conventional karyotyping in 20 AML patients. To discover structural variations, we applied direct RNA sequencing and analysed fusion genes based on 1.2 million reads. A single run is sufficient to detect the balanced translocation t(9;22), which gives rise to the fusion gene BCR-ABL1, in the cell line K-562. While a study of a larger AML patient cohort is ongoing, parallel low-coverage genome and transcriptome analysis allows identification of high-risk AML during the initial diagnostic work-up of 24 hours.
Anna Dolnik is a postdoc at the Charité – University Medicine Berlin, Campus Virchow in Germany. Trained as a biologist, she switched to processing of Illumina shotgun sequencing in 2012, working at the edge of biology and bioinformatics. In 2016 Anna first experienced working with Oxford Nanopore technology through resequencing of novel fusion genes found in AML with complex karyotype and now routinely uses a GridION for better characterization of high-risk AML. Her research focusses on clonal evolution in blood cancer (acute myeloid leukemia), identification of cancer-driving genes and characterization of complex changes in the cancer genome by whole genome sequencing.
Application of nanopore sequencing in clinical haematology
Blood cancers are together the 5th most common cancer and many patients are young adults. In addition, the most common cancers of childhood worldwide are acute lymphoblastic leukaemia and endemic Burkitt’s lymphoma. Six of the 10 most lucrative cancer drugs are prescribed for haematological malignancies ($55.6 billion by 2025). Their diagnosis requires a microscope and DNA-based precision diagnostics. But precision diagnostics is not always about identifying a simple base pair change, and requires the detection of all different types of mutations from one single (often small) patient sample. Examples of where DNA-based diagnostics are critical will be given and include all leukaemias and an increasing number of lymphomas. Their detection is recommended by the WHO classification of haematological malignancies. For example, the NHS test directory includes 177 different genetic aberrations.
The problem is that current multimodality testing in diagnostics laboratories is inadequate to deal with this demand on testing. Many of the conventional single gene assays lack sensitivity, speed and precision. Illumina whole genome sequencing (WGS) of tumour and paired germline has the potential to reveal all types of different mutations and global measures across the genome, but it is limited by the small fragment size and the need for large and expensive equipment. As part of the Genomics England Chronic Lymphocytic Leukaemia Pilot, we have used information from Illumina WGS from 400 patients to develop an improved response prediction tool for chemoimmunotherapy that predicts patients who will be cured (manuscript in preparation). An alternative way, specifically for diagnostics, is to combine targeted deep sequencing and error correction with shallow whole genome sequencing using the MinION. This method and variations of it can be applied globally in haematology. For example, the most common inherited anaemias, the haemoglobinopathies, are characterised by 1700 SNVs/indels and deletions across three genes. The most common of these, sickle cell disease, occurs in sub-Saharan Africa. About 10% of patients require confirmatory diagnosis by genetics. Life-saving therapies are available for this disease, and knowledge of the presence of the condition in the fetus could significantly streamline neonatal screening programmes around the world. We have clinically validated a proprietary method for non-invasive testing for sickle cell disease from maternal plasma (pre-published in BioRxiv) and have also developed a nanopore-based test for diagnosing haemoglobinopathies from germline DNA without the need for PCR amplification.
Finally, plasma-derived DNA can also be used in other clinical indications in sub-Saharan Africa. The most common childhood cancer in the region is endemic Burkitt’s lymphoma. This is caused by EBV infection in early childhood. With simple treatment, over 90% of patients can be cured, and treatment is free of charge in all African countries affected. Currently, however, over 90% of children die. This is because children present late and are not diagnosed once in hospital, because there is a lack of trained surgeons and pathologists across the region to establish the diagnosis from an invasive biopsy. We are now clinically validating a non-invasive method to diagnose this type of lymphoma from the blood using a combination of tumour and virus sequencing.
In conclusion: haematological diseases have always spearheaded innovations and discoveries in medicine, in particular genetics. Precision medicine is a reality for an increasing number of patients with blood diseases from targeted small molecules in leukaemias and lymphomas to gene therapy in the inherited blood diseases. The next step is to leapfrog diagnostics technologies and to introduce these advances globally as expensive cancer therapies are coming off patent and are increasingly available and on the WHO list of essential medicines. Ultimately, this approach will achieve a huge impact for a large number of patients world-wide.
Professor Schuh completed academic and clinical haematology training in Oxford and in 2006, she was appointed clinical lead for haematology laboratories, including molecular diagnostics, and has also been the clinical lead for chronic lymphocytic leukaemia and other lymphoproliferative disorders for the NHS Thames Valley Cancer Network. Over the past twelve years she has led over 30 early and late phase clinical trials in leukaemia as a principal or national chief investigator. A number of these led to NICE approvals and have changed clinical practice for patients in the UK and worldwide. As a result, she was recently appointed as the Chair of Chronic Lymphocytic Leukaemia Research in the UK by the National Cancer Research Institute. In addition to other national and international roles, she has chaired the UK CLL Forum, which promotes training and education, since 2016, and she has led the UK's guidelines writing group for CLL Therapy on behalf of the British Society of Haematology. Her second research interest is the development, evaluation and implementation of new technologies for Precision Diagnostics, especially genomics. Her group published the first ever longitudinal study of the changes in the genomic landscape of patients undergoing treatment for leukaemia. She is the lead for the Genomics England Clinical Interpretation Partnership for haematological malignancies. Professor Schuh has received grants from the NIHR, Wellcome Trust, Technology Strategy Board, Cancer Research UK and Bloodwise and she has authored or co-authored over 80 peer-reviewed publications in the last five years.
Building experiential learning about DNA sequencing through algae bloom monitoring onboard R/V Sikuliaq
The warming temperatures of the oceans affect phytoplankton communities. The frequency of algae blooms is increasing, with dramatic consequences for Alaskan fisheries and for coastal native communities. Indeed, Alaskan traditional diets include impacted shellfish that have accumulated toxins produced by algae. Unfortunately, neither freezing nor cooking neutralizes these toxins. Thus, preserving the traditional way of life of native communities necessitates the implementation of monitoring programs for harmful algae blooms and toxic shellfish. In October 2018, we introduced 12 undergraduate students to field genomics while navigating the Alaskan seas aboard the research vessel Sikuliaq. Using MinIONs, students analyzed in real time the microbial communities of seawater samples. The second edition of our workshop will occur during London Calling 2019. We will sail the Northwestern Pacific Ocean for 9 days from San Diego, CA to Seward, AK. This time of the year and route are well suited to investigating algae blooms. Building on the design of the experiential learning we implemented in 2018, we will analyze seawater for phytoplankton, and collect chlorophyll content measurements and microscopic images. By integrating their sequencing results with oceanographic data and imagery, participants will have the opportunity to fathom the relevance of using Oxford Nanopore Technology for urgent environmental concerns.
MinION sequencing from the extreme to the everyday
The portability and agility of the MinION platform mean it can reach places other sequencers can’t. In our hands, this means we are able to deploy MinION DNA sequencing to provide real-time insights into microbial communities using metagenomics and 16S microbiome analyses in extreme environments. It also means that DNA data can be used to enrich public discourse about genomics and microbiomes by sequencing for the media. Here, I will share some of the insights and lessons learned from using MinION for research in off-grid environments and for public outreach. Case studies will include the use of MinION in an off-grid ice camp enduring subzero temperatures and hurricane-force blizzards, which showed for the first time the potential for anoxygenic photosynthesis as a novel metabolic process on the Greenland Ice Sheet; the live sequencing of soil metagenomes for national radio; and the blending of extreme and everyday by sequencing deep subsurface microbiomes and metagenomes in a carbon sequestration experiment in the US, directly followed by microbiome analysis for the Royal Institution Christmas Lectures. From these experiences, we are able to offer guidance on ensuring robust outcomes for sequencing in unlikely locations, which may help others develop genomics data and discourse in the extreme and the everyday.
Dr. Arwyn Edwards is Senior Lecturer in Biology and Director of the Interdisciplinary Centre for Environmental Microbiology at Aberystwyth University. Portable DNA sequencing using the MinION is enabling Arwyn and other Arctic microbiologists to respond to rapid changes in the Arctic, and to access unprecedented insights into the diversity and function of extreme microbial ecosystems. This area of research is supported by NERC, Leverhulme, the Welsh Government and the European Union.
Portable DNA sequencing in remote East African farms
Plant viruses are one of the greatest problems facing crop production in the world. Their severe effects are experienced in the developing world, where small-scale farming is a major source of food production and knowledge and resources for management are limited. Among other plants, cassava is one of the important crops in Africa. It is a staple food for close to one billion people in the tropics, but its production is threatened by two viral diseases, namely cassava brown streak and cassava mosaic disease. Diseased planting cuttings and whiteflies transmit these diseases. Early detection of the cassava viruses is crucial because it helps farmers decide which cassava variety to plant and/or where to source clean planting materials. Our team used three portable, battery-powered devices to produce an effective on-site diagnostic for viruses in cassava: the PDQeX DNA purification technology, developed by ZyGEM NZ Ltd in New Zealand, and the MinION with the MinIT sequencing platform, developed by Oxford Nanopore in the United Kingdom. The PDQeX system extracted purified DNA using a cocktail of enzymes in thermo-responsive extractor cartridges inserted in a temperature control unit. A DNA library was then prepared and loaded onto the MinION for sequencing. The MinIT performed real-time base calling, and the viruses affecting the cassava field were identified using a customized BLAST search.
Field-forward sequencing with Oxford Nanopore technology: a strategy to establish the upside-down mangrove jellyfish Cassiopea xamachana as a bioindicator
The combined relative ease of operation, high throughput and reduced cost of NGS platforms has enabled the coupling of traditional field collection methods with laboratory-based metagenomic approaches to provide a molecular snapshot of species-diversity in a plethora of aquatic environments. However, the often-extensive time-lag between field sampling, sequencing and endpoint bioinformatics analyses precludes the ability to provide a contemporaneous characterization of the target ecosystem, against the backdrop of briskly shifting global climate. Furthermore, the growing decline of healthy aquatic ecosystems due to chemical, physical and biological threats, along with concerns related to invasive species and natural disasters, highlights the critical need for field-forward sequencing protocols to provide rapid characterization of a diversity of environmental systems. The recent publication of the reference genome of the upside-down mangrove jellyfish, Cassiopea xamachana, has gained this emerging model species attention as an indicator species with promising applications for coastal ecosystem management and conservation. Taking advantage of the versatility offered by Oxford Nanopore sequencing, we developed a field-forward DNA environmental metabarcoding strategy to characterize Florida Keys mangrove ecosystems, inhabited by C. xamachana, in the year following the catastrophic landfall of Hurricane Irma in 2017. The prototype for this portable system boasts a low-complexity protocol requiring minimal training for operation, a relatively short sample-to-answer timeframe, i.e. several hours, field-forward DNA metabarcoding capabilities in austere environments, manual and/or battery-powered equipment with ease of portability and minimal footprint, as well as multiplexing capabilities for the simultaneous assessment of multiple collection sites and/or genetic markers. We present here the first ever eDNA assessment of C. xamachana populations in several Florida Keys coastal environments in the wake of a devastating natural disaster, based on the findings of our inaugural field-forward sequencing study.
Cheryl L. Ames is a National Academies of Sciences (NRC) Postdoctoral Fellow at the United States Naval Research Laboratory and a Research Associate at the National Museum of Natural History, Smithsonian Institution in Washington, DC. As an expert in jellyfish and their evolutionary novelties, such as their venom, vision and sex, Cheryl uses a MinION-based field-forward environmental DNA sequencing protocol to assess biodiversity in marine ecosystems, such as coastal mangroves, the Gulf Stream and public aquaria. Cheryl has a Ph.D. in Biological Sciences from the University of Maryland, USA, a Master’s degree in Marine Biology from the University of the Ryukyus in Okinawa, Japan and undergraduate degrees from Queen’s University and the University of Ottawa in Canada.
Long-read NGS guided preimplantation genetic testing for chromosomal structural rearrangement
It is well known that patients carrying a chromosomal structural abnormality have an increased miscarriage rate. Preimplantation genetic testing for chromosomal structural rearrangement (PGT-SR) can increase the pregnancy rate, and provides a chance to avoid the genetic defect in the next generation if handled appropriately. Identifying the DNA breakpoint has always been a difficult task in terms of both effectiveness and cost. In the past, we performed haplotype analysis to phase the disease and wild-type chromosomes. Although this is feasible, the occurrence of a double crossover between breakpoints and markers can generate a false negative result. In this session, I will demonstrate an approach to breakpoint determination using the latest technological advances that maximizes efficiency and minimizes the economic burden. Precise identification of breakpoints simplifies PGT-SR to an approach similar to PGT-M (preimplantation genetic testing – Monogenic disease). This strategy requires only a couple of PCR reactions to distinguish the derivative from wild-type chromosomes accurately. Moreover, such breakpoint information is applicable to all family members affected by the same chromosomal aberrations, whereas the haplotype approach requires reassessment of microsatellite markers for every couple, even those sharing the same chromosomal aberrations.
Dr Chan graduated from the University of Newcastle Upon Tyne in the UK and gained his PhD at the University of Hong Kong. His research interest is in genetics and epigenetics of hereditary cancers as well as clinical genetic screening. Dr Chan has long been using cutting-edge technologies in his research, including the early application of pyrosequencing in quantification of DNA methylation, leading to the discovery of the mechanism of transcriptional read-through as the cause of Lynch Syndrome. He has been using an NGS approach in preimplantation genetic testing (PGT) since 2015 and his latest work involves the use of long-read NGS for structural variants.
Discerning the origin of Epstein-Barr virus in patients using nanopore-derived DNA methylation signatures
Epstein-Barr virus (EBV) infects the vast majority of the human population. In rare instances EBV is involved in the generation of lymphomas and other malignancies. The presence of elevated EBV levels in blood is currently used in the clinical diagnosis and subclassification of infectious disease and lymphoma; yet when patients present with detectable EBV in blood, it is uncertain whether this represents an active infection or an evolving malignancy. Epigenetic changes occur as an essential component of the EBV life cycle, controlling the virus’s ability to infect, establish a chronic (latent) infected population of cells, and later reactivate to produce infectious virions. We have found that tumor-derived EBV DNA displays specific DNA methylation signatures, whereas EBV virion-derived infectious particles or lytically active (virus-replicating) cells contain largely unmethylated EBV DNA. Importantly, tumor-specific EBV methylation patterns are maintained in circulating plasma (cell-free) DNA. We have employed nanopore sequencing to interpret EBV DNA methylation signatures in patients. Analysis of tumor and plasma-derived EBV methylomes from lymphoma patients and donors with infectious EBV revealed highly discernible DNA methylation signatures, with higher levels of methylation in primary EBV+ lymphoma cases. In addition, we have uncovered an unexpectedly high complexity of EBV methylation patterns among tumors and evidence that epigenetic differences correlate with viral gene expression. These data may predict responses to combined antiviral and immunotherapeutic strategies in cancer patients. In summary, our work aims to develop approaches that harness a patient’s EBV epigenetic signature with the aid of nanopore sequencing to rapidly discern benign versus cancerous clinical scenarios and help direct the use of current and novel therapies.
Christopher Oakes is an Assistant Professor in the Departments of Internal Medicine/Division of Hematology and Biomedical Informatics at The Ohio State University. He performed his graduate studies at McGill University in Canada and post-doctoral studies at the German Cancer Research Center in Heidelberg, Germany. His laboratory investigates epigenomic, genetic and other molecular features of a broad range of hematological malignancies, with a focus on chronic lymphocytic leukemia and other non-Hodgkin’s lymphomas and acute myeloid leukemia. He explores high-throughput epigenetic and molecular profiling data and combines these analyses with functional evaluation of key genes and molecular pathways. His laboratory is interested in the developmental origins of epigenetic programs in lymphoid and myeloid malignancies and aims to uncover the ontogeny of disease development. Current research focuses on investigating the role that perturbation of gene function plays in establishing aberrant global epigenetic states and landscapes. Beyond fundamental tumor biology, he aims to develop novel clinical molecular diagnostics for diagnosis, stratification and prediction of treatment responses, as well as the identification of novel therapeutic targets.
Clinical application of long-read sequencing
With the widespread use of low-cost genome-wide diagnostic screening tests, unanticipated but possibly pathogenic dosage changes affecting single genes are discovered with increasing frequency. Clinical management demands facile validation of such incidental findings, often necessitating the design of custom variant-specific assays. Although deletion variants can be readily confirmed using a range of next-generation sequencing strategies, characterising duplication variants at nucleotide resolution remains demanding. We have addressed this challenge by deploying a novel Cas9 enrichment strategy combined with long-read sequencing with the Oxford Nanopore MinION. We used bulk genomic DNA without the need for PCR amplification. We present the diagnostic resolution of two problematic cases in which incompletely characterised duplication variants had been identified by array CGH. The first patient presented with learning difficulties and autism spectrum disorder but had been found to have an incidental 1.7-kb imbalance which included a partial duplication of VHL exon 3. This was inherited from the patient’s father, who had renal cancer aged 38 years. In the second case, we identified an incidental 200-kb duplication which included DMD exons 30-44. Parental testing was consistent with this variant having arisen de novo. In both cases, the single-molecule sequencing yielded sufficient information to define precisely the architecture of the rearranged region, enabling Sanger sequencing assays across the integration sites and the surrounding homologous regions that likely gave rise to the duplicated sequences. Adoption of this approach by diagnostic laboratories promises to enable rapid and cost-effective characterisation of challenging duplication-containing alleles.
Christopher studied molecular biology and human genetics at the University of Manchester and Mayo Clinic in Florida, USA. He subsequently moved to Leeds to undertake clinical scientist training and attained HCPC registration in 2013. As the NHS lead in the Translational Genomics Unit, he has overseen the clinical implementation of numerous short-read sequencing instruments and next-generation sequencing assays. He is currently a visiting research fellow at the University of Leeds where he is focussed on understanding the clinical utility of long-read sequencing, particularly for the diagnosis of rare Mendelian disease.
Update from Oxford Nanopore Technologies
Clive is Chief Technology Officer at Oxford Nanopore Technologies. On the Executive team, he is responsible for all of the Company’s product-development activities. Clive leads the specification and design of the Company’s nanopore-based sensing platform, including strand DNA/RNA sequencing and protein-sensing applications with a strong focus on scientific excellence and successful adoption by the scientific community.
Clive joined Oxford Nanopore Technologies from the Wellcome Trust Sanger Institute (Cambridge, UK) where he played a key role in the adoption and exploitation of next-generation DNA sequencing platforms. This involved helping to set up the world’s largest single installation of Illumina (formerly Solexa) Genome Analyzers in a production sequencing environment, initially used to pioneer the 1000 Genomes Project. From early 2003 he was Director of Computational Biology and IT at Solexa Ltd, where he was central to the development and commercialisation of the Genome Analyzer (GA). Solexa was sold to Illumina for $650m in early 2007 after the successful placement and adoption of 12 instruments. The Solexa technology, now commercialised by Illumina, is the market-leading DNA sequencing technology driving the renaissance in DNA-based discovery.
He has a strong background in computer science and genetics/molecular biology and manages interdisciplinary teams including mechanical engineering, electronics, physics, surface chemistry, electrophysiology, software engineering and applications (of the technology). Clive applies modern agile management techniques to the entire product-development lifecycle. Clive has also held various management and consulting positions at GlaxoWellcome, Oxford Glycosciences and other EU- and US-based organisations. He has worked at the interface between computing and science, ranging from genetics to proteomics. He holds degrees in Genetics and Computational Biology from the University of York.
The real Simon Pure
Dan Turner is Vice President of Applications at Oxford Nanopore Technologies and is a highly experienced scientist who has worked in the field of next-generation sequencing for the last 11 years. Dan provides scientific leadership for multi-disciplinary teams in Oxford, New York and San Francisco. The Applications group aims to bring together sample prep technologies, genomics applications and bioinformatics, to expand the utility of Oxford Nanopore Technologies devices and illustrate the benefits of these technologies to the wider world. Before joining Oxford Nanopore Technologies, Dan was Head of Sequencing Technology Development at the Wellcome Trust Sanger Institute, and prior to this he held postdoctoral positions at the Sanger Institute and Cornell University Medical College in Manhattan.
Redefining the transcriptional complexity of viral pathogens using direct RNA sequencing
Comparison of single nucleotide variants identified by Illumina and Oxford Nanopore technologies in the context of a potential outbreak of Shiga toxin-producing E. coli
Short-read sequencing platforms have been adopted by public health agencies for infectious disease surveillance worldwide and have proved to be a robust and accurate method for quantifying relatedness between bacterial genomes. However, this approach offers less flexibility for the urgent, small-scale sequencing that is often required during public health emergencies. In contrast, Oxford Nanopore Technologies offers a range of rapid real-time sequencing platforms, although it has been suggested that their currently lower read accuracy compared to other sequencing technologies might be problematic for variant identification. We compared Illumina and Oxford Nanopore sequencing data of two isolates of Shiga toxin-producing Escherichia coli to assess the utility of nanopore technologies for urgent, small-scale sequencing. We investigated whether the same single nucleotide variants were identified by the two sequencing technologies and whether inference of relatedness was consistent. We show that with optimised variant calling using nanopore sequencing data alone, it is possible to rapidly determine whether or not two cases were likely to be epidemiologically linked.
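The two comparisons described above (do both platforms call the same variants, and do they agree on how related two isolates are) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the data layout (a dict mapping reference position to alternate base) and the toy variant calls are all hypothetical and are not taken from the study.

```python
# Hypothetical sketch: compare SNV calls between two platforms and between two isolates.
# Each call set maps a reference position to the alternate base observed there.

def compare_platforms(illumina_calls, nanopore_calls):
    """Count SNVs shared by, or unique to, each platform for one isolate."""
    illumina = set(illumina_calls.items())
    nanopore = set(nanopore_calls.items())
    return {
        "shared": len(illumina & nanopore),
        "illumina_only": len(illumina - nanopore),
        "nanopore_only": len(nanopore - illumina),
    }

def snv_distance(isolate_a, isolate_b):
    """Number of positions at which two isolates carry different alleles."""
    positions = set(isolate_a) | set(isolate_b)
    # dict.get returns None where an isolate matches the reference at that position.
    return sum(1 for pos in positions if isolate_a.get(pos) != isolate_b.get(pos))

# Toy data (invented for illustration only).
a_illumina = {100: "A", 250: "T", 900: "G"}
b_illumina = {100: "A", 250: "T", 1200: "C"}
a_nanopore = {100: "A", 250: "T", 900: "G", 1500: "T"}
b_nanopore = {100: "A", 250: "T", 1200: "C", 1500: "T"}

concordance_a = compare_platforms(a_illumina, a_nanopore)
distance_illumina = snv_distance(a_illumina, b_illumina)
distance_nanopore = snv_distance(a_nanopore, b_nanopore)
print(concordance_a, distance_illumina, distance_nanopore)
```

In this toy example the extra nanopore call at position 1500 appears in both isolates, so it changes the per-platform concordance but not the pairwise distance. Whether real platform-specific errors behave this way is an empirical question that the sketch does not answer.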
David graduated from the University of Bedfordshire with a BSc in Biomedical Science in 2014 before completing an MSc in Biomedical Science, specialising in Medical Microbiology, from Ulster University in 2015. He then joined the Gastrointestinal Bacteria Reference Unit (GBRU) at Public Health England in London, for the laboratory typing of gastrointestinal pathogens, before moving to the bioinformatics team where he performed data analysis on whole genome sequencing data. David is currently working as a bioinformatician at Public Health England and is a part-time PhD student with the University of Edinburgh, Roslin Institute, studying the use of Oxford Nanopore sequencing technologies for the investigation of outbreaks of Shiga toxin-producing Escherichia coli in humans.
Rapid genomic discovery using VolTRAX library preparation and nanopore sequencing
Here we demonstrate a rapid, simplified, and portable sequencing platform using a combination of the VolTRAX V2 and the MinION. We applied this platform across two distinct research areas. 1) The environmental reservoir of antibiotic resistance: environmental bacteria can act as a reservoir of opportunistic pathogens despite a lack of exposure, and identifying specific antibiotic-resistant microbes is essential for quick and appropriate treatment. 2) Detecting genomic variation in cultured cancerous cell lines: because cancerous cell lines are often used in drug discovery and testing, it is important to understand if long-term laboratory growth might play a role in the cellular responses to various genotoxic stressors. In this proof of principle, each library was constructed by undergraduate researchers with introductory laboratory skills using the guided automated library preparation on the VolTRAX V2. Sequencing was carried out using the MinION, the portable Oxford Nanopore DNA sequencer. We found that this combination provides a reliable and repeatable bench experience for early career researchers.
Ship-Seq: nanopore sequencing of polar microbes onboard research vessels
One of the exciting features of the MinION is the ability to carry out in situ sequencing in remote environments where previously it has been impossible to effectively study ecosystems. Polar oceans are biodiversity hot-spots which disproportionately contribute to global biogeochemical cycles, but they are among the most under-explored ecosystems on Earth, as well as the most threatened by anthropogenic environmental change. As a result of this, there is increasing interest in the study of polar microorganisms such as diatoms and coccolithophores, which are the main regulators of the polar ocean biogeochemical cycles. The study of polar microbes is often challenging as they survive only at specific temperatures, which limits our ability to transport them to laboratories for experiments. Long-term maintenance in the laboratory is also problematic as many species are cold-adapted and require polar-specific environments, as well as failing to thrive in close quarters. We are addressing this challenge by using the MinION for real-time studies on the diversity and function of microbial communities from the surface ocean. Our aim is to provide a real-time assessment of microbial diversity, real-time analysis of in situ experiments in polar oceans, and genome and transcriptome sequencing of sensitive but ecologically relevant polar microbes. During January and February 2019, we carried out our first feasibility test during a research cruise on the RRS Discovery, in which a MinION was used in conjunction with NanoOK RT software for in situ sequencing and real-time analysis of metagenomic samples collected by the ship. Our experiment provided information about species composition and abundance at multiple sampling points on a long transect between the Falkland Islands and South Georgia and the South Sandwich Islands, crossing the polar front. 
This includes a range of nutrient levels and temperatures, which allows for the investigation of the genetic basis for the ability of diatoms and other phytoplankton to survive in a wide variety of conditions. Real-time analysis onboard a research vessel allows researchers to make evidence-based decisions on sampling locations and whether sampling has been sufficient. The results from our experiment will be validated against previous data from similar locations, alongside sequencing of sample replicates using alternative platforms. This will allow for a comparison between in situ sequencing with the MinION and UK-based sequencing with other platforms. Our results indicate that MinION sequencing is a powerful tool for polar microbe research, although a lack of available reference genomes currently limits its power. For further investigations, alongside the production of more reference genomes, analysis pipelines will be tailored to target specific genes and species that are of interest in terms of their function and ecological role.
Emma Langan is a NEXUSS PhD student at the University of East Anglia, Earlham Institute and the British Antarctic Survey, where she is using the Nanopore MinION for real-time metagenomic sequencing of polar ocean samples to monitor microbe populations. Emma graduated from the University of Edinburgh with a BSc in Biomedical Sciences (Infectious Diseases) in 2015, before completing an MSc in Bioinformatics where she built a genome browser for the investigation of silent cricket genotypes.
Pore-C: a method for genome-wide, multi-contact chromosome conformation capture
The DNA within the nucleus of an interphase cell is organised into a complex hierarchy of folds and loops known as the 3D Genome. The development of various chromatin conformation capture methods has enabled the detection of the structures that define each level of this hierarchy, e.g. chromosome territories, A/B compartments, topologically associated domains (TADs) and promoter-enhancer loops. This in turn has facilitated functional studies which have uncovered some of the mechanisms behind the formation and maintenance of these structures, as well as their effect on gene expression. However, most of these studies relied on methods that could only capture interactions between two points on the genome, and thus lacked the ability to resolve higher-order interactions. We will share our progress on Pore-C, a method to generate genome-wide, multi-contact chromatin conformation maps. We will also demonstrate how it can be used to improve whole genome assemblies and help resolve complex structural variants in cancer.
Eoghan Harrington is the Associate Director of Genomic Applications Bioinformatics working out of Oxford Nanopore’s New York office. He brings over a decade's worth of experience in genome sequencing to bear on his role in the Genomic Applications Group, a multi-disciplinary team tasked with finding novel uses for Oxford Nanopore devices and communicating them to a wide audience. To achieve this goal, Eoghan works closely with internal and external collaborators to identify and develop high-impact applications and publicise the results in posters, presentations and scientific publications. After graduating from Trinity College Dublin with a BA in Human Genetics and an MSc in High Performance Computing, Eoghan went to EMBL Heidelberg to carry out his doctoral research. While there he used comparative genomics to study alternative splicing, in addition to working on some of the first shotgun metagenomic datasets. He went on to do postdoctoral research in single-cell microbial genomics at Stanford University. Prior to joining Oxford Nanopore Technologies, he worked at two start-ups: a leading personal genomics company and an oncology-focused electronic healthcare record and analytics company.
C-to-U RNA editing signals revealed by nanopore direct RNAseq
RNA editing is a relevant epitranscriptome modification able to modify primary transcripts by insertions, deletions or base conversions. In mammals, it includes the deamination of adenosine to inosine or the conversion of cytosine (C) to uridine (U). While the former has been well characterized by Illumina RNAseq, the latter, carried out by the APOBEC1 enzyme, is rare and only a few instances have been discovered to date. Indeed, the identification of C-to-U events by second generation sequencing is quite challenging because real C-to-U modifications are masked by C-to-T transitions due to sequencing errors. Using nanopore direct RNA sequencing, we explored C-to-U RNA editing signals by comparing RNA molecules from a wild type macrophagic mouse cell line and the corresponding APOBEC1 knockout cell line. Our results on candidate positions reveal a strong C-to-U signal, indicating that nanopore direct RNAseq could be a valid alternative to Illumina cDNA sequencing for detecting C-to-U RNA editing.
Accurate detection of m6A RNA modifications in native RNA sequences using third-generation sequencing
From the battery of over 170 known RNA modifications, more than 70 have already been linked to human diseases, including neurological disorders and cancer, highlighting their importance in proper cellular functioning. Unfortunately, the limited availability of antibodies and chemicals selective to RNA modifications has so far limited our transcriptome-wide view to only a handful of RNA modifications. Consequently, the abundance, location, and function of the majority of RNA modifications remain unknown. To overcome these limitations, we have employed direct RNA sequencing from Oxford Nanopore Technologies, which allows direct sequencing of native RNA molecules, without any further amplification or reverse transcription step, thus potentially allowing for direct detection of RNA modifications in the full-length RNA transcripts. Using this technology, we have trained an algorithm that allows for the detection of m6A RNA modifications in a quantitative manner and with single nucleotide resolution, finding that we can detect m6A RNA modifications with an overall accuracy of 90%. We then validate our findings in vivo, showing that our methodology can detect m6A modifications in yeast. As a control, we show that these modifications are not predicted by our algorithm in Ime4 knockout strains, which lack m6A. Our results open new avenues to investigate the universe of RNA modifications in full-length transcripts, with single molecule resolution. The establishment of the Oxford Nanopore platform as a tool to map virtually any given modification will allow us to query the epitranscriptome in ways that, until now, had not been possible. Future work can expand to other modifications like 5-methylcytosine (m5C), as well as provide additional thresholds for controlling specificity and sensitivity.
Eva Maria Novoa obtained her BSc in Biochemistry in 2007 with Honours, followed by an MSc in Bioinformatics in 2009. Since then, she has conducted research across three continents, including the Institute for Research in Biomedicine (IRB Barcelona) in Spain, the Massachusetts Institute of Technology and the Broad Institute in the USA, and the Garvan Institute of Medical Research in Australia. During these years, she has generated a substantial research profile in the field of protein translation and post-transcriptional regulation, using a combination of molecular biology, biochemistry and bioinformatic approaches. Since 2018, she has been Group Leader at the Center for Genomic Regulation (CRG) in Spain, in a dual appointment with the Garvan Institute, where she leads a team of 8 people. Her current work is focused on deciphering the language of RNA modifications, and how its orchestration can regulate our cells in a space-, time- and signal-dependent manner. Eva has received fellowships from EMBO, HFSP, “LaCaixa” and the ARC, and her work has been awarded with the Fisher Scientific Prize for Young Researchers (2013) given by the Spanish Society of Molecular Biology and Biochemistry, and the Young Researcher Award (2016) given by the Catalan Society of Biology.
Splice isoform-specific analysis of endogenous NMD targets in human cells
Nonsense-mediated mRNA decay (NMD) is a translation-dependent RNA degradation pathway that targets mRNAs with premature termination codons, as well as some endogenous mRNAs that encode full-length proteins. The features that render an mRNA sensitive to NMD are still poorly understood, except for the presence of an exon junction complex (EJC) >55 nts downstream of the termination codon. Obscuring the identification of NMD-inducing features is the fact that previous transcriptome-wide analyses of endogenous NMD targets did not reveal which specific splice isoforms are degraded by NMD. This is mostly attributed to the insufficient coverage of splice junction sites and the lack of information regarding non-annotated mRNA isoforms that are enriched upon NMD inhibition. A recent comparative transcriptome analysis from our lab of cells, in which three essential NMD factors were knocked down and then rescued, identified a high-confidence set of genes whose transcripts react to NMD (Colombo et al., RNA, 2016). However, because the analysis was based on short-reads only, we could not obtain reliable isoform-specific information. For an isoform-specific analysis, we now use cDNA nanopore sequencing, which allows us to identify full-length mRNAs that are stabilized upon NMD inhibition. Our approach can detect full-length isoforms that are enriched, or even appear, when NMD is inactivated and we have experimentally verified several examples. We integrate long and short-read sequencing to accurately quantify the expression of individual isoforms and thereby identify those that are targeted by NMD. We aspire to reveal the regulatory role of NMD at isoform-specific level and generate a resource that will enable the study of features that render a specific mRNA sensitive to NMD.
Evangelos is a postdoc in Oliver Mühlemann’s group at the Department of Chemistry and Biochemistry, University of Bern, Switzerland. He is interested in post-transcriptional mRNA regulation in mammalian cells and applies nanopore sequencing to identify endogenous mRNAs that are sensitive to nonsense-mediated mRNA decay. He is a biochemist from Greece with a background in transcriptomics, translation termination and RNA decay.
Compressing Oxford Nanopore signal into CRAM
Ewan Birney is Director of EMBL-EBI with Dr Rolf Apweiler and runs a small research group. He is also EMBL-EBI's Joint Head of Research, alongside Dr Nick Goldman. Ewan completed his PhD at the Wellcome Sanger Institute with Richard Durbin. In 2000, he became Head of Nucleotide data at EMBL-EBI and in 2012 he took on the role of Associate Director at the institute. He became Director of EMBL-EBI in 2015. Ewan led the analysis of the Human Genome gene set, mouse and chicken genomes and the ENCODE project, focusing on non-coding elements of the human genome. Ewan’s main areas of research include functional genomics, DNA algorithms, statistical methods to analyse genomic information (in particular information associated with individual differences in humans and Medaka fish) and use of images for chromatin structure. Ewan is a non-executive Director of Genomics England, and a consultant and advisor to a number of companies, including Oxford Nanopore Technologies. Ewan was elected an EMBO member in 2012, a Fellow of the Royal Society in 2014 and a Fellow of the Academy of Medical Sciences in 2015. He has received a number of awards including the 2003 Francis Crick Award from the Royal Society, the 2005 Overton Prize from the International Society for Computational Biology and the 2005 Benjamin Franklin Award for contributions in Open Source Bioinformatics.
Efficient reading of DNA-encoded cellular memories using nanopore sequencing
DNA recombinases are site-specific enzymes that can precisely manipulate the structure of DNA molecules. They allow sections of DNA to be flipped or excised, or even new regions of DNA to be inserted at specific locations. Due to the permanent nature of these changes, synthetic biologists have been able to create DNA-based memory that records events that cells experience in the structural configuration of their DNA. While prototypes of such devices have demonstrated the feasibility of this approach, a severe lack of quantitative measurements and the limited number of genetic designs still hamper their widespread use. To address this, we are systematically constructing large libraries of DNA-based memory register designs and assaying their switching performance en masse using multiplexed nanopore sequencing. This provides us with precise measurements of structural changes occurring over time and illustrates the benefits that long reads generated by nanopore sequencing have for bioengineers developing DNA-based technologies.
Genomics from roadkill - high quality mammalian genomes using hybrid assembly with MinION long reads
With thousands of fatalities due to car collisions with wildlife reported each year, roadkill are an underexploited resource in genomics. Here we show that mammalian roadkill samples can be used as a suitable source of DNA for long-read sequencing using the MinION device for two carnivoran species frequently encountered along South African roads: the bat-eared fox (Otocyon megalotis) and the aardwolf (Proteles cristatus). For both species, hybrid assembly of 150PE Illumina reads at ~85X coverage (~215 Gb) and MinION long reads at ~12X coverage (~30 Gb) using the MaSuRCA assembler provided genomes with high contiguity (~10,000 contigs with N50 of ~700 Kb) and completeness (>90% of complete BUSCOs). We further demonstrate that about 90% of the 14,509 single-copy orthologous genes of the OrthoMaM database could be successfully retrieved from these assemblies. These figures compare favourably with current mammalian genome assemblies and set our genomes among the best carnivore genomes currently available. This cost-effective strategy to obtain high quality reference mammalian genomes opens the way for large-scale population genomic studies of mammalian wildlife using resequencing of samples collected from roadkill. We illustrate the potential of the approach for genome-scale species delimitation in both species, for which subspecies have been defined based on disjunct distributions and morphological differences.
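The coverage and contiguity figures quoted above can be sanity-checked with simple arithmetic. The short Python sketch below is only an illustration: it assumes coverage is total sequenced bases divided by genome size, uses the common definition of N50 (the length of the contig at which the running total of contigs, sorted longest first, reaches half the assembly), and runs N50 on an invented toy list of contig lengths rather than on the real assemblies.

```python
# Hypothetical sketch: back-of-envelope checks on coverage and contiguity figures.

def implied_genome_size_gb(total_bases_gb, coverage):
    """Genome size implied by a sequencing yield and a fold coverage."""
    return total_bases_gb / coverage

def n50(contig_lengths):
    """Length of the contig at which the cumulative sum reaches half the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")

# Figures quoted in the abstract: ~215 Gb at ~85X and ~30 Gb at ~12X.
size_from_illumina = implied_genome_size_gb(215, 85)
size_from_minion = implied_genome_size_gb(30, 12)
print(round(size_from_illumina, 2), round(size_from_minion, 2))

# Toy contig lengths in kb (invented for illustration).
toy_n50 = n50([700, 500, 300, 200, 100])
print(toy_n50)
```

Both yields imply a genome of roughly 2.5 Gb, so the two coverage figures are consistent with each other.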
Frédéric Delsuc is Research Director at the French National Centre for Scientific Research (CNRS), working in the Institute of Evolutionary Sciences at the University of Montpellier. He received his PhD in molecular phylogeny from the University of Montpellier, then worked on mammalian and tunicate phylogenomics during post-doctoral positions in New Zealand and Canada before coming back to Montpellier as a permanent CNRS researcher. He is currently directing the ERC ConvergeAnt project aimed at understanding convergent evolution in ant-eating mammals through an integrative approach combining morphology, genomes, and microbiomes. The project team has adopted nanopore sequencing technology using the MinION to produce long-reads combined with Illumina short-reads to assemble mammalian genomes mostly from roadkill animals.
University of Oxford
George and his colleagues from the University of Oxford Big Data Institute have recently returned from the Mobile Malaria Project, a six-week trip to Africa to learn about malaria research and to trial MinION sequencing pipelines in the field. Working with local collaborators in Zambia and Kenya, much of the project was spent training African scientists on the basics of nanopore sequencing and assessing its feasibility in low resource settings. The team tested amplicon sequencing pipelines for key antimalarial resistance genes in Plasmodium falciparum and insecticide resistance genes in Anopheles gambiae. They hope to use what they learnt to further develop research collaborations and build capacity for nanopore sequencing in Africa.
Long reads reveal small scale genome structural variations in Brassica napus
In this era of climate change and global warming it is our responsibility as the scientific community to find sustainable ways of meeting our energy and fuel requirements. Biodiesel based on Canola (Brassica napus) provides a perfect alternative to the use of fossil fuels and can help us cut our greenhouse gas emissions by up to 90%. In order to meet the ever-increasing demand for fuel and energy, it is crucial to maintain a high yield for this crop without generating a huge environmental footprint. However, B. napus has a very complex genome originating from an inter-specific hybridization event between Brassica oleracea (Mediterranean cabbage) and Brassica rapa (Asian cabbage or turnip). Due to high levels of homology between the two sub-genomes making up the Canola genome, it is extremely difficult to identify the novel genome polymorphisms underlying important traits such as yield, disease resistance and abiotic stress tolerance. Next generation genome sequencing has been a game changer when it comes to deciphering complex quantitatively inherited traits in B. napus. However, the resolution offered by second-generation sequencing technologies, such as Illumina sequencing, was severely limited by the small size of the sequencing reads. With Oxford Nanopore technology it is now possible to zoom into the Canola genome to identify gene-level structural variation associated with key traits such as yield. We have sequenced 4 Canola genotypes using nanopore technology and identified insertions and deletions ranging from 50bp to 10,000bp in genes involved in a plethora of important traits such as disease resistance and flowering time. This knowledge will enable us to engineer a future-ready Canola plant.
Harmeet Singh Chawla is a PhD student in the Department of Plant Breeding at the Justus Liebig University Giessen. Harmeet completed an MSc in Agro-biotechnology at JLU Giessen, and is interested in studying the impact of genome structural variations on eco-geographical adaptation and various other agronomically important traits in Brassica napus, Canola.
Direct sequencing of nascent RNA exposes splicing kinetics and order
Human genes contain many long introns with degenerate sequence information at splice sites, requiring sophisticated mechanisms to locate and coordinate the excision of multiple introns within the same pre-mRNA transcript. Fundamental aspects of this process remain unexplored due to a lack of quantitative approaches that monitor RNA processing as transcripts are produced. Here we performed nanopore sequencing of nascent, or newly synthesized, RNA to directly probe the timing and patterns of mRNA splicing. Direct RNA sequencing by the Oxford Nanopore Technologies MinION reveals the native context of long RNA molecules from 3’ to 5’ without amplification-associated biases. By combining direct RNA nanopore sequencing with stringent purification of nascent RNAs, we measure both the active transcription site (nascent RNA 3’ ends) and the splice isoform of single RNA molecules as they are transcribed. Application to human K562 cells reveals that co-transcriptional splicing occurs after RNA Polymerase II has transcribed several kilobases past the 3’ splice site of most introns. We also observe that the order of intron removal is not influenced by transcription direction in human cells. By contrast, we analyzed nascent RNA from Drosophila S2 cells, which have a different gene structure, and found that co-transcriptional splicing occurs more rapidly and in the order of transcription. Treating cells with the splicing inhibitor Pladienolide B abolishes co-transcriptional splicing in both species. Altogether, directly sequencing nascent RNAs through nanopores exposes critical molecular processes that occur during transcription in living cells.
Heather received her bachelor’s degree in Molecular Biology from Princeton University where she worked in Laura Landweber’s lab studying ciliate genome rearrangement. She then spent a year in Bob Langer’s lab at MIT developing a device to predict drug sensitivity in solid tumors. She is currently completing her PhD in Genetics at Harvard University in the lab of Stirling Churchman, working with nascent RNA and nanopore sequencing. She is interested in all aspects of RNA biology and tools to measure co-transcriptional processing.
Small, modified and highly structured: the challenge of tRNA sequencing
Oxford Nanopore technologies were initially developed to directly sequence long molecules of DNA and RNA. The possibility of sequencing shorter molecules using nanopores is widely discussed in the field but remains questionable because of the higher error rate compared to classical deep-sequencing approaches. Here we show the successful application of the MinION device to sequence tRNA molecules. The challenges of sequencing tRNAs are due to their short length and folded structure; to overcome this, we have improved the library preparation in order to cover the whole length of the tRNA. Initially, we sequenced a mixture of in vitro transcribed E. coli tRNAs and developed a bioinformatic pipeline to assign base-called reads to different tRNA species with a high degree of accuracy. Moving towards more complex samples containing native tRNAs, we found that modifications along the tRNA reduced the fidelity of called bases, so we developed an algorithm for tRNA classification based on raw-signal patterns. Comparing those patterns to unmodified in vitro transcribed tRNA signals allowed us not only to distinguish between different tRNA species, but also to detect modifications occurring in the native tRNAs. Our results show that nanopore-based approaches can be used to sequence tRNAs and classify them. This opens a new area of application for nanopore technology in short molecules: detecting modifications and even predicting potential ones that are currently unknown but may govern structure, affect decoding or play a role in disease.
Irina studied Medical Cybernetics at the Russian National Research Medical University in Moscow and graduated with a Medical Degree. She then completed a Masters in Biochemistry and Bioinformatics at the University of Potsdam in 2015, where she began working with NGS data. Irina is currently a PhD student at the Institute of Biochemistry and Molecular Biology at the University of Hamburg. As a bioinformatician, Irina processes various deep sequencing data including nanopore and develops new algorithms for analysis.
SquiggleKit: a toolkit for manipulating nanopore signal data
The management of raw nanopore sequencing data poses a challenge that must be overcome to accelerate the development of new bioinformatics algorithms predicated on signal analysis. SquiggleKit is a toolkit for manipulating and interrogating nanopore data that simplifies file handling, data extraction, visualisation, and signal processing. Its modular tools can be used to reduce file numbers and memory footprint, identify poly-A tails, target barcodes, adapters, and find nucleotide sequence motifs in raw nanopore signal, amongst other applications. SquiggleKit serves as a bioinformatics portal into signal space, for novice and experienced users alike. It is comprehensively documented, simple to use, cross-platform compatible and freely available (https://github.com/Psy-Fer/SquiggleKit).
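As a rough illustration of what working in signal space can involve, the Python sketch below flags long, relatively flat stretches in a raw signal trace, on the assumption that features such as homopolymer runs (for example poly-A tails) show up as prolonged low-variance signal. It is not SquiggleKit's implementation and does not use its API; the function name, thresholds and toy signal are all hypothetical.

```python
# Hypothetical sketch: find long, low-variance stretches in a raw signal trace.
from statistics import pstdev

def find_flat_segments(signal, window=5, max_std=1.0, min_length=10):
    """Return (start, end) pairs, end exclusive, where every sliding window
    has a standard deviation of at most max_std and the merged stretch spans
    at least min_length samples."""
    segments = []
    start = None
    for i in range(len(signal) - window + 1):
        is_flat = pstdev(signal[i:i + window]) <= max_std
        if is_flat and start is None:
            start = i
        elif not is_flat and start is not None:
            end = i - 1 + window  # exclusive end of the last flat window
            if end - start >= min_length:
                segments.append((start, end))
            start = None
    if start is not None and len(signal) - start >= min_length:
        # The final window was still flat, so the stretch runs to the end.
        segments.append((start, len(signal)))
    return segments

# Toy signal (invented): noisy head, 20 flat samples, noisy tail.
head = [80, 95, 70, 100, 75, 90]
flat = [85.0 if k % 2 == 0 else 85.5 for k in range(20)]
tail = [60, 110, 70, 100, 65]
signal = head + flat + tail

flat_segments = find_flat_segments(signal)
print(flat_segments)
```

Real traces would need normalisation and thresholds tuned to the pore chemistry and sample type, which this sketch leaves out.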
James Ferguson is a Genomic Systems Analyst in the Genomic Technologies Group at the Kinghorn Centre for Clinical Genomics, located at the Garvan Institute of Medical Research in Sydney, Australia. With a background in clinical pathology testing, algorithm development, and computer hacking, James applies his unique skill set to develop new bioinformatic tools, as well as design and support nanopore sequencing infrastructure.
Mobile Malaria Project
Despite reductions in malaria prevalence in the last two decades, the World Health Organization still reported an estimated 435,000 deaths in 2017, the majority occurring in children under the age of five. Moreover, continued progress is threatened by emerging drug and insecticide resistance. Our team won the 2019 Land Rover Bursary, supported by the Royal Geographical Society, on a proposal to convert a 2019 Land Rover Discovery into a mobile sequencing lab and drive it 6,300 km across Africa, from the Atlantic to the Indian Ocean. During our journey, we met with local research teams and policy makers striving to combat malaria, and produced materials aiming to raise public awareness and keep malaria on the global development agenda. My role in the project was to develop and pilot the mobile lab which, with local collaborators, we used to sequence antimalarial resistance genes in Zambia, and whole mosquito genomes in Kenya. We hope our project promotes the feasibility of a decentralized approach to pathogen and vector sequencing and marks the beginning of long-term collaborations incorporating in-country nanopore sequencing with policy-directed malaria research.
Jason is reading for his DPhil in statistical genetics at the University of Oxford, focusing on how malaria genetic data can be leveraged to support malaria control. He is enrolled as part of the four-year Genomic Medicine and Statistics Programme funded by the Wellcome Trust. Prior to Oxford, he studied biochemistry at the University of Toronto, graduating with an H.BSc. and an M.Sc. He joined the Mobile Malaria Project in June 2018 to develop the lab and bioinformatic pipelines that were deployed in field settings during their trip across Africa.
Multiplexed direct quantification of barcoded protein reporters on a MinION
Genetically encoded reporter proteins are a cornerstone of molecular biology widely used to measure many biological activities, but the current number of unique reporters that can be used together for multiplexed tracking is small due to overlapping detection channels such as fluorescence. We therefore built an expanded library of orthogonally-barcoded nanopore-addressable protein tags engineered as reporters (NanoporeTERs), which can be read by nanopore sensors at the single-molecule level. By adapting a commercially available nanopore sensor array platform typically used for real-time DNA/RNA sequencing (Oxford Nanopore Technologies MinION), we show direct quantification of individual NanoporeTER expression levels at the protein level from engineered bacterial cultures, with little to no sample preparation. These results open new applications for multiplexed, real-time tracking of complex biological phenomena not possible with conventional protein reporters using portable, high-throughput nanopore sensor technology.
Cyclomics: ultra-sensitive nanopore sequencing of cell free tumor DNA
In many types of cancer, tumor cells shed small (~150 bp) DNA molecules into the blood or other body fluids. Detecting mutations in the cell-free DNA (cfDNA) content of liquid biopsies from cancer patients thus offers a unique opportunity for non-invasive diagnostics, for purposes such as monitoring treatment response or detecting recurrent disease. Reliably detecting mutations from minute amounts of tumor-derived cfDNA is, however, highly challenging. To address this, we developed CyclomicsSeq, a novel sequencing approach leveraging the long-read nanopore platform to achieve single-molecule detection accuracy. We have demonstrated proof of concept on head and neck cancer (HNC) patient samples and found that CyclomicsSeq can detect mutations anywhere in the TP53 gene with single-molecule accuracy. CyclomicsSeq thus offers a reliable liquid biopsy diagnostic assay which can be cost-effectively implemented in routine clinical workflows.
Dr. Jeroen de Ridder is a Principal Investigator and Associate Professor at the Center for Molecular Medicine of the University Medical Center Utrecht, as well as a junior PI at the Oncode Institute. He runs a bioinformatics lab which aims to create and apply innovative data science methods to advance our understanding of disease biology. His research efforts are always inspired by a biological question and typically deal with big data, such as large-scale genomics and epigenomics datasets. As a result, much of the research floats on machine learning and data integration algorithms. Recently, Dr. de Ridder, along with Dr. Kloosterman and Dr. Marcozzi, founded a start-up company Cyclomics, which aims to provide ultra-sensitive sequencing of cell free tumor DNA.
Tapestry: assessing small eukaryotic genome assemblies with long-reads
Assemblies of small eukaryotic genomes using long-reads are often close to complete. However, these assemblies remain difficult to validate, especially when genomes have complex features such as large inversions, translocations, ploidy variations, and where chromosome number may not be known. While many tools for assessing assemblies with short-reads exist, long-reads have far greater power for confirming the accuracy and completeness of contigs. I will present Tapestry, a tool for validating the contigs of a small assembly automatically and visualising the contigs so the structure of the assembly can be refined before polishing. I will show how Tapestry has helped us to resolve the complex genomes of several small eukaryotes.
John Davey is a bioinformatician at the University of York, working in the Department of Biology Technology Facility. He received his PhD from the University of Edinburgh and then worked with Mark Blaxter and Edinburgh Genomics during the development of Illumina sequencing, developing methods for analysing Restriction-site Associated DNA (RAD) Sequencing data, among many other things. He then held a fellowship at the University of Cambridge, working with Chris Jiggins on speciation of Heliconius butterflies, completing a chromosomal genome assembly of H. melpomene. He now works on a wide range of genomes and metagenomes at York, mostly trying to figure out how to turn raw nanopore sequence into completed genome assemblies.
University of British Columbia
Dr John Tyson is a senior research associate in the lab of Professor Terrance Snutch and is based in the Michael Smith Laboratories and Djavad Mowafaghian Centre for Brain Health at the University of British Columbia. He is a molecular cell biologist by training and his work focuses mainly on aspects of genome modification and altered transcript splicing events relating to neurological disease, in particular the voltage-dependent calcium channels. John has been utilizing nanopore sequencing in his research since 2014, focusing on both single-molecule full-length splice variation and whole-genome sequencing and assembly. This has led to the production of expanded C. elegans genomes, with delineation of large structural variants, and the first reference human genome using the MinION as a member of the Nanopore Whole Genome Sequencing Consortium. He is currently working on methods to expand nanopore read lengths for better production of whole-genome assemblies, and is also using individual full-length RNA/cDNA transcript sequencing to better understand contextual splice variation in neurological disease.
The Three Peaks Challenge and developing extraction methods suitable for long-read, ultra-deep stool metagenomics on the PromethION
At present, most metagenomic surveys are performed using short-read sequencing. This approach limits the specificity of taxonomic assignment and results in highly fragmented assemblies. Single-molecule sequencing platforms are able to sequence much longer molecules, and the output of these platforms, particularly the PromethION from Oxford Nanopore, now supports the study of complex microbial communities using shotgun metagenomics. We assessed a variety of commercially available and manual extraction methods using both a ten-species mock community and clinical samples of stool to find a method capable of generating ultra-long reads (>100 kb). Neither bead-beating nor column-based extraction methods were found to support reads of the desired length, whereas moving to magnetic bead and manual extraction methods allowed significant improvements in read length. We also demonstrated the power of using solely chemical and enzymatic cell lysis methods for extracting high-molecular-weight DNA from recalcitrant organisms, such as Gram-positive bacteria and fungi, over popular physical disruption methods. Development of these methods is critical to support the growing field of clinical microbiome research, including the ability to perform strain tracking and produce high-quality metagenome-assembled genomes (MAGs) from metagenomic samples.
Joshua is a molecular biologist specialising in sample preparation for nanopore sequencing. He is a post-doc in Nick Loman's lab at the University of Birmingham, which explores the use of cutting-edge genomics and metagenomics approaches to the diagnosis, treatment and surveillance of infectious disease. In March 2015 he travelled to Guinea in West Africa with the MinION to establish the first mobile laboratory performing viral surveillance of Ebola virus during an epidemic. Later he developed a tiling, multiplex PCR method for sequencing Zika virus from low-titre clinical samples, which was used during the 2016 outbreak in Brazil. He also developed the popular ultra-long read sequencing method used to assemble the E. coli genome in 8 reads, sequence the human genome with an N50 > 100 kb with read lengths up to 882 kb and later to generate the first telomere-to-telomere assembly of the X chromosome. He is currently working on methods to perform untargeted sequencing to bring rapid pathogen identification out of the lab and into the clinic.
Transcriptome profiling of colon cancer by nanopore sequencing
Oxford Nanopore technology allows full-length transcripts to be read in a relatively short period of time, enabling the detection of previously unidentified transcripts such as novel isoforms or fusion genes. Despite these analytical advantages, no attempt has yet been made to characterize the cancer transcriptome using nanopore sequencing technology. We have sequenced samples from 20 pairs of Korean colon cancer patients using Oxford Nanopore cDNA-PCR sequencing. We attempted to identify isoforms that could not be identified on other sequencing platforms. We also sought to detect differentially expressed transcripts, including non-coding RNAs, to reveal their molecular and functional analogy as well as their clinical implications in colon cancer. Moreover, using nanopore data, we attempted to detect fusion genes in colorectal cancer that might have diagnostic implications. Overall, we have characterized the transcriptome profiles of Korean colon cancer patients using nanopore data to identify potential biomarkers for diagnostic prediction. Such technology could allow us to find biomarkers that are more precise and accurate than those obtained with previously used transcriptome profiling methods, and to propose new molecular mechanisms to better understand colon cancer.
MinION for barcoding and metabarcoding in polar environments
DNA barcoding (DNA-based species identification) is, like other fields of biology, being rapidly transformed by next-generation sequencing. Because of their environmental focus, barcoding (taxonomic identification of an individual specimen of plant, animal, or fungus) and metabarcoding (identification of the plant, animal, and fungal taxa present in an environmental sample) are ripe for field-ready NGS technologies, including the MinION and emerging Oxford Nanopore products. #PolarPoo, an international, interdisciplinary research campaign begun on Twitter, is an ideal test and showcase for MinION sequencing for barcoding and metabarcoding. As a #PolarPoo pilot, we compared MinION and Illumina MiSeq for sequencing identical fecal material from the Antarctic black-browed albatross (Thalassarche melanophris), an important climate change indicator species in the Antarctic. Here I report the results of this comparison and discuss the applications of nanopore sequencing in polar research.
Telomere-to-telomere assembly of a complete human X chromosome
Release of the first human genome assembly was a landmark achievement, and after nearly two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no one chromosome has yet been finished end to end, and hundreds of gaps persist across the genome. These unresolved regions include segmental duplications, ribosomal RNA (rRNA) gene arrays, and satellite arrays that harbor unexplored variation of unknown consequence. We aim to finish these remaining regions and generate the first truly complete assembly of a human genome.
Here we announce a whole-genome de novo assembly that surpasses the continuity of GRCh38, along with the first complete, telomere-to-telomere assembly of a human X chromosome. In total, we collected 40X coverage of ultra-long Oxford Nanopore sequencing for the CHM13hTERT cell line, including 44 Gb of sequence in reads >100 kb and a maximum read length exceeding 1 Mb. This unprecedented coverage of ultra-long reads enabled the resolution of most repeats in the genome, including large fractions of the centromeric satellite arrays and short arms of the acrocentrics. A de novo assembly combining this nanopore data with 70X of existing PacBio data achieved an NG50 contig size of 75 Mb (compared to 56 Mb for GRCh38), with some chromosomes broken only at the centromere. Using this assembly as a basis, we chose to manually finish the X chromosome. The few unresolved segmental duplications were assembled using ultra-long reads spanning the individual copies, and the ~2.3 Mbp X centromere was assembled by identifying unique variants within the array and using these to anchor overlapping ultra-long reads. These results demonstrate that it is now possible to finish entire human chromosomes without gaps, and our future work will focus on completing and validating the remainder of the genome.
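The NG50 figure quoted above can be illustrated with a minimal Python sketch. It assumes the usual definition of NG50 (the length of the contig at which the cumulative length of contigs, taken longest first, reaches half of the estimated genome size). The function name and the toy numbers are hypothetical and are not taken from the assembly described here.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly.

    Contigs are added longest first; the NG50 is the length of the contig
    at which the running total reaches half of genome_size. Returns 0 if
    the assembly covers less than half of the genome.
    """
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total * 2 >= genome_size:
            return length
    return 0


# Hypothetical toy assembly (lengths in Mb) and genome size (Mb).
toy_contigs = [40, 30, 20, 10, 5]
toy_genome_size = 120
toy_ng50 = ng50(toy_contigs, toy_genome_size)
print(toy_ng50)
```

Unlike N50, which is computed against the total assembly length, NG50 uses a fixed genome size, so assemblies of the same genome can be compared directly.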
Karen H. Miga, PhD, is an Assistant Research Scientist at UCSC. Dr. Miga’s research program combines innovative computational and experimental approaches to produce the high-resolution sequence maps of human centromeric and pericentromeric DNAs.
Nanopore sequencing of full-length circRNAs in human and mouse brains reveals circRNA specific exon usage and intron retention
CircRNAs are a poorly understood class of non-coding RNAs, some of which have been shown to be functionally important for cell proliferation and development. CircRNAs mainly derive from back-splicing events of coding mRNAs, making it difficult to distinguish the internal exon composition of a circRNA from that of the linearly spliced mRNA. To examine the global exon composition of circRNAs, we performed long-read sequencing of single molecules using Oxford Nanopore technology for human and mouse brain-derived RNA. By applying an optimized circRNA enrichment protocol prior to sequencing, we were able to detect 7,834 and 10,975 circRNAs in human and mouse brain, respectively, of which 2,945 and 7,052 are not currently found in circBase. Alternative splicing was more prevalent in circRNAs than in linear spliced transcripts, and notably >200 previously unannotated exons were used in circRNAs. We conclude that Oxford Nanopore technology provides a fast and reliable sequencing method to map the specific exon composition of circRNAs.
Molecular tagging with nanopore-orthogonal DNA strands
Molecular tagging is an approach to securely label physical objects of high value using DNA or other molecules. An ideal system should be inexpensive, quick and reliable to decode, and require minimal equipment. We created a novel molecular tagging system using DNA-based tags and the MinION nanopore device. In our tagging system, we first encode a 32-bit digital tag into a 90-bit codeword that is more robust to errors. We then convert it into a DNA-based molecular tag where each 1 or 0 in the codeword is represented by the presence or absence of a molecular bit (molbit). A single molbit is a DNA strand that modularly combines a unique barcode structure with a specific strand length. This allows us to classify molbits directly from the raw nanopore signal, avoiding basecalling to reduce compute time and misclassification errors. Molbits are prepared for readout at the time of tag assembly and are then stabilized by dehydration. These steps extend the shelf life of the tag, decrease decoding time, and make it robust to contamination from environmental DNA. The result is an extensible, real time, high accuracy tagging system that includes a novel approach to developing nanopore-orthogonal barcodes with applications beyond this system.
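The presence/absence encoding described above can be sketched in a few lines of Python. This is only an illustration: the error-correcting step that expands the 32-bit tag into a 90-bit codeword is not specified in the abstract and is left out, the read-count threshold `min_reads` is a hypothetical parameter, and an 8-bit toy codeword stands in for the real one.

```python
def molbits_for_codeword(codeword):
    """Indices of the molbits to mix into the tag (positions holding a 1)."""
    return [index for index, bit in enumerate(codeword) if bit == 1]


def codeword_from_counts(read_counts, n_bits, min_reads=5):
    """Call each bit as 1 if its molbit was seen at least min_reads times.

    read_counts maps molbit index -> number of reads classified as that
    molbit; molbits that were never seen are treated as absent.
    """
    return [1 if read_counts.get(index, 0) >= min_reads else 0
            for index in range(n_bits)]


# Hypothetical 8-bit codeword (the system described uses 90 bits).
toy_codeword = [1, 0, 1, 1, 0, 0, 1, 0]
present = molbits_for_codeword(toy_codeword)

# Hypothetical read counts; molbit 5 has a few stray reads below threshold.
toy_counts = {0: 120, 2: 87, 3: 45, 6: 230, 5: 2}
recovered = codeword_from_counts(toy_counts, len(toy_codeword))
print(present, recovered)
```

A threshold on read counts is one simple way to tolerate a small number of misclassified reads; any bit errors that remain would be handled by the error-correcting codeword.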
Single cell isoform profiling, 10xGenomics scRNA-seq and nanopore long read sequencing
Single cell transcriptome sequencing has become a powerful tool for high-resolution analysis of gene expression in individual cells. However, current high throughput approaches only allow sequencing of one extremity of the transcript (transcriptome profiling). Information crucial for an in-depth understanding of cell-to-cell heterogeneity on splicing, chimeric transcripts and sequence diversity (SNPs, RNA editing, imprinting) is lost. Here we present an approach that uses Oxford Nanopore sequencing with unique molecular identifiers to obtain error corrected full length single cell sequence information with the 10xGenomics single cell isolation system and apply it to examine differential RNA alternative splicing and RNA editing events in the embryonic mouse brain.
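The idea of error correction with unique molecular identifiers (UMIs) can be sketched as a per-UMI majority vote. This is a simplified illustration, not the pipeline used in the work above: it assumes that reads sharing a UMI have already been aligned to equal length, and the function name and toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Build one consensus sequence per UMI by column-wise majority vote.

    reads is an iterable of (umi, sequence) pairs; sequences that share a
    UMI are assumed to be pre-aligned and of equal length.
    """
    groups = defaultdict(list)
    for umi, sequence in reads:
        groups[umi].append(sequence)

    consensus = {}
    for umi, sequences in groups.items():
        # zip(*sequences) walks the aligned reads one column at a time.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0]
            for column in zip(*sequences)
        )
    return consensus


# Hypothetical reads: three copies of one molecule (one with an error)
# and a single read of a second molecule.
toy_reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACCTAC"),
    ("CCTA", "TTGACA"),
]
toy_consensus = umi_consensus(toy_reads)
print(toy_consensus)
```

Because sequencing errors in independent reads of the same molecule rarely coincide, the vote removes most of them while true variants such as SNPs or editing sites are retained.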
Kevin Lebrigand is Head of Bioinformatics at UCAGenomiX, the functional genomics platform of Nice-Sophia-Antipolis, one of the core nodes of the "France Genomique" network, using next generation sequencing to perform a broad range of sequencing projects such as de novo genome assembly, RNA-seq, small RNA-seq, ChIP-seq and CLIP-seq. In 2014 the platform decided to focus its expertise on methodological developments around single cell transcriptomics using the Fluidigm C1, and more recently the 10xGenomics Chromium system, on which more than 120 samples have been profiled. Last summer Kevin acquired a PromethION long-read sequencer to perform isoform-level profiling at the single cell level.
Retrotransposon variation in human genome and tumorigenesis
Retrotransposons are transposable genetic sequences that copy themselves into an RNA intermediate and insert elsewhere in the genome by reverse transcription. Almost half of the human genome consists of transposon-derived sequences, but only a few dozen full-length Long Interspersed Nuclear Element-1s (LINE1s) in the human germline are expected to be retrotransposition competent. Mapping and genotyping retrotransposons with short reads is complicated because of their size and the high copy number of their consensus sequence in the reference genome, so we applied nanopore sequencing to study multiple aspects of LINE1 retrotransposition. First, we sequenced inverse-PCR products with MinION, detecting highly subclonal insertion sites of a particular LINE1 element in two colon cancer tumors. Second, we have whole-genome sequenced a few germline and tumor genomes with PromethION, detecting the whole range of retrotransposon insertions that are variable within humans or inserted during tumorigenesis. Finally, we studied DNA methylation around the LINE1 insertion and source sites in human tumors in order to understand the mechanisms of LINE1 activation during tumorigenesis. Presently we are extending these studies to ~300 whole genomes of Uterine Leiomyoma tumors and their respective normal sequences.
Kimmo Palin completed his PhD in Computer Science at the University of Helsinki in 2007, focusing on comparative modelling of mammalian gene enhancer elements. From 2008 to 2012, he was a postdoctoral fellow at the Wellcome Trust Sanger Institute in Hinxton, UK, working on human genetics and genome sequencing. Since 2012, he has been working as a staff scientist at the University of Helsinki with Prof. L. Aaltonen and Prof. J. Taipale on tumor genetics and genomics, including copy number variation, mutational signatures, gene regulation and chromatin structure.
PSI-Sigma: a comprehensive splicing-detection method for short-read and long-read RNA-seq analysis
Percent Spliced-In (PSI) values are commonly used to report alternative pre-mRNA splicing (AS) changes. Previous PSI-detection tools were limited to specific AS events and were evaluated by in silico RNA-seq data. We developed PSI-Sigma, which uses a new PSI index, and we employed actual (non-simulated) RNA-seq data from spliced synthetic genes (RNA Sequins) to benchmark its performance (i.e., precision, recall, false positive rate, and correlation) in comparison with three leading tools (rMATS, SUPPA2, and Whippet). PSI-Sigma outperformed these tools, especially in the case of AS events with multiple alternative exons and intron-retention events. We also briefly evaluated its performance in long-read RNA-seq analysis, by sequencing a mixture of human RNAs and RNA Sequins with nanopore long-read sequencers. Based on the long-read RNA-seq data of RNA sequins, we found that nanopore long-read RNA-seq is qualitatively reliable. Also, in human U87 cells, we found that ~1 million long reads can already detect major AS changes in ~3,500 protein-coding genes with at least 10 supporting long reads. PSI-Sigma is implemented in Perl and is available at https://github.com/wososa/PSI-Sigma
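For orientation, the conventional Percent Spliced-In calculation can be sketched as follows. This shows the standard ratio of inclusion-supporting reads to all informative reads, not the new PSI index introduced by PSI-Sigma, whose formula is not given in this abstract. The `min_reads` cut-off of 10 loosely mirrors the supporting-read threshold mentioned above; the function name and default are assumptions.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads, min_reads=10):
    """Conventional PSI (in percent) for one alternative splicing event.

    inclusion_reads: reads supporting inclusion of the alternative exon.
    exclusion_reads: reads supporting its exclusion (skipping).
    Returns None when fewer than min_reads reads support the event,
    because the ratio would be too noisy to report.
    """
    total = inclusion_reads + exclusion_reads
    if total < min_reads:
        return None
    return 100.0 * inclusion_reads / total


# Hypothetical read counts for two events.
well_covered = percent_spliced_in(30, 10)
poorly_covered = percent_spliced_in(3, 2)
print(well_covered, poorly_covered)
```

With long reads, each read typically spans the whole event, so inclusion and exclusion can be counted per molecule rather than inferred from junction fragments.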
Kuan-Ting Lin is a computational post-doc at Cold Spring Harbor Laboratory, where he focusses on quantitative biology and transcriptomics technologies. His long-term research interests involve the use of mathematical, statistical or computational techniques to develop an understanding of how alterations in RNA transcription contribute to human health. His academic training and research experience have provided an excellent background in drug discovery, data mining and quantitative biology.
Nano-C: targeted poly-contact chromatin interactions for comprehensive profiling of cell-to-cell variation of 3D genome organization
Three-dimensional (3D) genome organization is an essential aspect of genomic function. Experimentally, the spatial association of regulatory elements with their targets and the characterization of complex relationships in spatial genome architecture have emerged as key challenges. To unravel these structures, Chromosome Conformation Capture (3C) methods are used, which are based on the principle that cross-linked interacting genomic regions can be cut and re-ligated such that sequences in close physical proximity become frequently joined together. The resulting ligation junctions thus reflect the 3D organization of the genome at the time of fixation. However, common short-read Illumina-based 3C methods capture only pairwise interactions present within the library of re-ligated fragments (3C library), thus limiting the ability to untangle complex multiway interaction networks. For instance, whether different interactions happen simultaneously, exclusively or in subsets cannot be discerned by methods based on pairwise interactions. The ultimate aim of my project is to determine if the CTCF insulator protein, a major architectural protein in the mammalian nucleus, functions by structuring mostly homogeneous or heterogeneous chromatin domains, particularly TADs (Topologically Associating Domains). To address this question, I have developed “Nano-C”, a method combining 3C with nanopore sequencing. Nano-C uses in vitro transcription of long DNA molecules in a 3C library followed by direct-RNA nanopore sequencing, thereby enabling the detection of up to 10-15 uniquely mapped interactions from defined genomic loci that occur within single cells. Analysis of such poly-contact chromatin interactions, both intra- and inter-chromosomal, in hundreds of cells at a time allows the uncovering of complex 3D genome architecture. I will discuss our recent Nano-C results that provide a first measure of TAD border heterogeneity when the CTCF architectural protein is bound.
Li-Hsin Chang is a postdoctoral researcher at the Chromatin Dynamics group led by Dr. Daan Noordermeer at the Institute for Integrative Biology of the Cell (I2BC) in France. Li-Hsin received her PhD in cellular and developmental biology in 2017 from the University of Illinois at Urbana-Champaign, where she studied the function and regulatory landscape of zinc-finger transcription factors with Professor Lisa Stubbs. Her current research interests focus on 3D chromatin organization, aiming to uncover the cellular heterogeneity of Topologically Associating Domains (TADs). To this end, she has developed a new method “Nano-C”, which combines chromosome conformation capture (3C) technique with nanopore sequencing for detecting poly-contact chromatin interactions.
An international collaborative effort for infectious disease analyses using MinION
GRAID is the Global Research Alliance for Infectious Disease, a collaborative international effort for infectious disease research using MinION. In this framework, we educate researchers and develop methods and guidelines for field analysis of many aspects of infectious disease. We have conducted four summer schools in three developing countries (Thailand, Indonesia, and Kenya) as part of our efforts to introduce MinION in those countries. We are collaborating with researchers and have produced papers on serotype identification of dengue virus and comprehensive drug resistance identification of malaria parasites. Using MinION and isothermal amplification, we identified the serotype of dengue virus in Manado, Indonesia and Hanoi, Vietnam. We found that this method simplifies the amplification of the viral nucleic acid, requiring only blood or serum and a water bath or a thermal block prior to library preparation. We analyzed 141 Indonesian and 80 Vietnamese patients. The overall successful detection rate was 79%, and it depended largely on the viral titer. We also determined that the serotype of dengue virus differs between Indonesia and Vietnam: DENV1 and DENV3, respectively. Our next collaborative project is to comprehensively describe the drug resistance of malaria parasites in Indonesia, Vietnam, and Thailand. Here, we used PCR to amplify nine genes correlated with the drug resistance phenotype. We sequenced 118, 11, and 5 samples from Indonesia, Thailand, and Vietnam, respectively, in a multiplexed manner and described the drug resistance pattern in each country. We found a position in the non-propeller region of the K13 gene that was mutated quite frequently in our Indonesian samples. Although we believe that the mutation is not related to artemisinin resistance, we think that the parasites may be under selective pressure due to artemisinin administration in the region.
We are also working with bioinformaticians to develop graphical user interface tools for researchers or clinicians who are unfamiliar with bioinformatics analysis. We have published Nano Pipe, which serves as an easy-to-use MinION data analysis tool for a ‘regular’ user. We have ongoing and prospective projects in the framework, such as HLA typing in severe dengue patients, identification of unknown fever-causing pathogens, and determining the drug resistance pattern in HIV. We are confident that our consortium will encourage the infectious disease community to switch to sequencing in the research context while laying some foundations for preventive or therapeutic medicine in the future.
After graduating from the Sam Ratulangi University Manado in Indonesia, Lucky Ronald Runtuwene worked as a medical doctor in a community health center on the island of Siau. There he became interested in infectious diseases, and so decided to pursue a research career in the field. He completed a PhD in Japan, where he probed gene expression in a vector mosquito infected with dengue virus, followed by a post-doc at the University of Tokyo, where his laboratory was one of the early adopters of the MinION technology. Lucky is interested in field work, and the portability of the MinION and its ease of processing aid his research, leading to the conception of the GRAID consortium, which he will introduce at London Calling 2019.
Mapping DNA replication using nanopore sequencing
We have harnessed nanopore sequencing to study DNA replication genome-wide at the single-molecule level. Using in vitro prepared DNA substrates, we characterized the effect of bromodeoxyuridine (BrdU) substitution for thymidine on the MinION nanopore electrical signal. Using a neural-network basecaller trained on yeast DNA containing various amounts of BrdU, we identified BrdU-labelled tracts in yeast cells synchronously entering S phase in the presence of hydroxyurea and BrdU. As expected, the BrdU-labelled tracts coincided with previously identified early-firing, but not late-firing, replication origins. After BrdU pulse-labelling of asynchronous cells, we could detect and orient tens of thousands of individual replication tracts. This allowed us to reproduce replication fork directionality (RFD) profiles obtained by OK-seq and to map thousands of initiation events, the vast majority of them coinciding with well-characterized ARSs. These results open the way to high-throughput, high-resolution, single-molecule analysis of DNA replication in many experimental systems.
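How oriented replication tracts can be turned into a fork directionality profile is sketched below. The sketch assumes the definition of RFD commonly used with OK-seq data, (R - L) / (R + L), where R and L are the numbers of rightward- and leftward-moving forks in a genomic bin; the abstract itself does not state the formula. The function name, bin size and toy tracts are hypothetical.

```python
def rfd_profile(tracts, bin_size, n_bins):
    """Compute replication fork directionality per genomic bin.

    tracts is an iterable of (position, direction) pairs, where direction
    is "R" for a rightward-moving fork and "L" for a leftward-moving one.
    Each bin gets (R - L) / (R + L), or None if it contains no tracts.
    """
    rightward = [0] * n_bins
    leftward = [0] * n_bins
    for position, direction in tracts:
        bin_index = position // bin_size
        if 0 <= bin_index < n_bins:
            if direction == "R":
                rightward[bin_index] += 1
            elif direction == "L":
                leftward[bin_index] += 1

    profile = []
    for r, l in zip(rightward, leftward):
        total = r + l
        profile.append((r - l) / total if total else None)
    return profile


# Hypothetical oriented tracts: (genomic position in bp, fork direction).
toy_tracts = [(100, "R"), (200, "R"), (300, "L"), (1500, "L"), (1700, "L")]
toy_profile = rfd_profile(toy_tracts, bin_size=1000, n_bins=3)
print(toy_profile)
```

In such a profile, a switch from negative to positive values along the chromosome marks a region where forks diverge, which is the signature of an initiation zone.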
Magali Hennion completed a PhD in molecular biology in Toulouse with Dr Emmanuel Käs and Dr Olivier Cuvier, where she worked on chromatin organisation and insulator proteins. She then moved to Göttingen in Germany where she worked with Dr Steven Johnson in a postdoc internship on chromatin changes during stem cell differentiation, before a second postdoc with Stefan Bonn, where she focussed on epigenetic changes associated with memory formation and maintenance in the mouse brain. Since 2016, Magali has worked in Dr Olivier Hyrien's team in Paris, where she is developing new techniques to study DNA replication based on nanopore sequencing.
Metagenomics of India's largest River Ganges confluence at Prayagraj, India using MinION sequencing
River confluences and their microbial dynamics are among the least explored environments worldwide. The River Ganges is one of the most important and holy rivers of India; it has a rich mythological history and is important for mass bathing events. The Yamuna River meets the Ganges to form a confluence at Prayagraj (Allahabad), India. However, the influence of the Yamuna on the taxonomic and functional profiles of microbial communities at the confluence of the Ganga and Yamuna, and downstream of it, remains uncharacterized. Therefore, in 2017 we undertook a large study under the directives of the Government of India’s river cleaning mission (under the aegis of the National Mission for Clean Ganga) and a network program led by the Council of Scientific and Industrial Research (CSIR) and the Indian Council of Medical Research (ICMR) to understand India’s largest and most sacred river, the Ganges. Water and sediment samples collected from the Ganga-Yamuna confluence were processed to decipher microbiome diversity and function using MinION sequencing technology. Preliminary investigations of the confluence microbiome at Prayagraj revealed similar taxonomic (bacterial, fungal, archaeal, and phage) and functional (resistome) microbial profiles at the upstream (Ganges) and farther downstream sites of the confluence, indicating a transient influence of the Yamuna River on the holy River Ganges. Overall, the Ganges harbors a plethora of microbial diversity and a hidden treasure of functional potential that could help explain the non-putrefying properties of this river.
Mahesh Dharne is a scientist in National Collection of Industrial Microorganisms (NCIM) at CSIR-National Chemical Laboratory, Pune, India. He has established state-of-art facilities include molecular identification systems (like Sanger sequencing, next generation sequencing) and biochemical identification systems (like VITEK2 and VITEK-MS), which are usually required for industries and academia. His research interests are into environmental and industrial microbiology.
Using full-length transcript sequencing to reveal the fate of mRNA in aging seeds
After seeds mature on the mother plant, they contain all the molecular machinery they will need for germination. Generally, some time elapses between maturation and germination. At the National Center for Genetic Resource Preservation, this time may be months to decades, or, optimistically, centuries. But, during this time, seeds eventually lose the ability to germinate. One explanation for this change is that the molecular machinery has gradually accumulated damage. When trying to assess the health of a seed lot, the standard method is to use a subset of seeds for a germination test, but these results are not straightforward to interpret. We hypothesized that accumulated damage at the molecular level could be quantified as an independent measure of seed lot health and chose to examine RNA. We showed that integrity of total RNA declines with storage time in seeds of many species. To show whether mRNA was similarly affected, we compared transcript integrity in 23-year-old and 2-year-old soybean seeds by sequencing full-length cDNA using MinION. In 23-year-old seeds, certain transcripts were only partially sequenced, and we confirmed that this was because of transcript fragmentation at random sites. We quantified transcript degradation for all transcripts and found that degradation increased with transcript length. This result supports the hypothesis that random damage accumulates at the molecular level. We now anticipate using the integrity of long transcripts to assess seed health over time.
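The link between random damage and transcript length can be illustrated with a minimal sketch. It assumes breaks occur independently at a uniform per-nucleotide rate (a Poisson approximation); the function name and the break rate are hypothetical and are not taken from the study.

```python
import math


def intact_fraction(length_nt, breaks_per_nt):
    """Expected fraction of transcripts of a given length with no break.

    Assumes breaks are independent and uniformly distributed along the
    transcript, so the number of breaks is approximately Poisson with
    mean breaks_per_nt * length_nt.
    """
    return math.exp(-breaks_per_nt * length_nt)


# Hypothetical break rate, chosen only for illustration.
RATE = 1e-4

for length in (500, 1000, 2000, 4000):
    print(length, round(intact_fraction(length, RATE), 3))
```

Under this assumption the intact fraction falls exponentially with length, which is why long transcripts would be the most sensitive indicators of accumulated damage.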
Margaret Fleming obtained her PhD in Botany from Colorado State University in 2015 for her work on the role of the structural cell wall protein extensin in biomass recalcitrance in the context of biofuel production. Margaret then completed a postdoc with Dr Christina Walters at the National Laboratory for Genetic Resource Preservation, where she studied how time and environment affect seeds of both cultivated and wild plants, focusing on the interrelationship of RNA degradation and seed viability. Her current work with Dr Chris Saski focusses on the transcriptomic effects of Armillaria (root-rot) infection of susceptible and resistant peach rootstock; she will soon join the lab of Dr Marjorie Weber at Michigan State University to study the evolution of mite domatia in Vitis (grape).
Using long-read nanopore sequencing to unravel structural genomic variations in plants
Transposable elements (TEs) are mobile DNA elements potentially able to move and multiply within genomes. TEs account for 40% of the total genome size in cultivated rice, Oryza sativa, and 10% in Arabidopsis thaliana, and their insertion polymorphisms are responsible for most of the structural variation between varieties or ecotypes of these two species. Although large genomic datasets are available to study the diversity within these two species, most of these datasets consist of Illumina short-read files rather than fully assembled genomes. To detect TE insertion polymorphisms (TIPs) in these large datasets, we developed new software, TRACKPOSON, and applied it to characterise TIPs for 31 families of retrotransposons in 3,000 cultivated rice varieties. We used DNA nanopore sequencing to unambiguously validate our in silico results. We further took advantage of cDNA nanopore sequencing to analyze structural variation within transcripts in rice and Arabidopsis and to improve transcriptome annotation with alternative splicing detection. Moreover, for the first time we could detect long reads corresponding to entire TE transcripts.
Marie-Christine Carpentier graduated from University of Paris VII Diderot in 2009, then spent two years at the Laboratory of Biometry and Evolutionary Biology (LBBE) in Lyon as a bioinformatician, working on invertebrate RNA-Seq analyses. Marie-Christine then moved to the Plant Genome and Development laboratory in Perpignan where she currently works as a bioinformatician. She is also a member of Olivier Panaud’s team, where she develops new concepts and techniques for the analysis of structural variations in plant genomes. This research increasingly involves the use of nanopore long-read technology, at both genomic and transcriptomic levels.
Long-read sequencing technologies resolve most dark and camouflaged gene regions
Complex genomes, including the human genome, contain ‘dark’ regions that standard short-read sequencing technologies do not adequately resolve, including protein-coding genes, leaving many variants that may be relevant to disease entirely overlooked. We systematically identified gene regions that are ‘dark by depth’ (few mappable reads), and others that are ‘camouflaged’ (ambiguous alignment). More than 100 protein-coding genes are 100% camouflaged using standard short-read sequencing. Many known disease-relevant genes are also camouflaged, including CR1, a top Alzheimer’s disease gene, as well as NEB, SMN1 and SMN2, and ARX. We further assessed how well long-read technologies resolve these regions, including 10x Genomics, PacBio’s Sequel, and Oxford Nanopore PromethION (Cliveome v. 3.0). We found that long-read technologies largely resolve the camouflaged gene regions, making it possible to identify mutations that may be important in human disease.
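The two categories described above can be sketched as a simple classifier over the mapping qualities of reads covering a region. This is only an illustration: the function name, the depth threshold, the MAPQ cutoff and the required fraction are assumptions rather than the values used in the study, and low mapping quality is used here as a proxy for ambiguous alignment.

```python
def classify_region(mapqs, min_depth=5, mapq_cutoff=10, low_mapq_fraction=0.9):
    """Label a region from the mapping qualities of its overlapping reads.

    mapqs: list of MAPQ values, one per read covering the region.
    All thresholds are illustrative assumptions.
    """
    depth = len(mapqs)
    # Too few reads map here at all: "dark by depth".
    if depth <= min_depth:
        return "dark_by_depth"
    # Reads are present but most align ambiguously: "camouflaged".
    low = sum(1 for q in mapqs if q < mapq_cutoff)
    if low / depth >= low_mapq_fraction:
        return "camouflaged"
    return "resolved"


print(classify_region([60, 60, 60]))   # few reads
print(classify_region([0] * 20))       # many reads, all ambiguous
print(classify_region([60] * 20))      # many reads, confidently placed
```

In practice the MAPQ values would come from an alignment file, and the same function could be applied to short-read and long-read alignments of the same region to compare how well each resolves it.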
Dr. Ebbert is an Assistant Professor of Neuroscience at the Mayo Clinic with a background in computational biology and bioinformatics, focusing on Alzheimer’s disease, amyotrophic lateral sclerosis (ALS), and frontotemporal dementia (FTD). He also has experience in genomics studies and analyses, algorithm design, and statistics. He has published in respected journals across cancer, bioinformatics, and Alzheimer’s disease, and recently published a manuscript demonstrating that long-read technologies can traverse the challenging C9orf72 ‘GGGGCC’ repeat expansion.
Finding disease-causing complex mutations
Various kinds of complex mutation (e.g. tandem-repeat expansion/contraction, homologous recombination, chromosome shattering, virus/transposon insertion) are known to cause diseases but have been neglected because they are hard to identify. I will describe our pipeline to find such mutations in patients. We perform whole-genome nanopore sequencing, find and group reads with structural differences from a reference genome, then de-prioritize differences shared by humans without the disease. We find most probable alignments between reads and reference, allowing for arbitrary rearrangements, based on probabilities of nucleotides, substitutions, insertions, and deletions. Our pipeline discovered the cause of neuronal intranuclear inclusion disease: expansion of a GGC tandem repeat in NOTCH2NLC. We can fully characterize complex congenital mutations caused by chromosome shattering. Some important properties of these rearrangements, such as sequence loss, are holistic: they are not present in any part of the rearrangement but are apparent only in the rearrangement as a whole.
Martin has a cross-appointment as a researcher at the AIST Artificial Intelligence Research Center, and as a Professor at the University of Tokyo Department of Computational Biology and Medical Sciences. He is broadly interested in analyzing genetic sequences to understand the information encoded in them, their evolutionary history, and their role in disease. He studied Physics and Philosophy at the University of Oxford and Mathematics at the University of Cambridge, taught English in Beijing, completed a PhD in Bioinformatics at Boston University, and carried out postdoctoral research at the University of Queensland in Australia and RIKEN in Japan.
Leveraging long reads for high-throughput multiomic analyses of cellular diversity in human tumours
Dissecting RNA biology, one molecule at a time
The human transcriptome is highly diverse and complex, as evidenced by cell-type specific expression of unique transcript isoforms. In particular, transcripts derived from non-coding regions are the most qualitatively diverse class of genetic elements, encompassing over 70% of the genome. Notably, about 80% of GWAS SNPs reside in non-coding regions, which suggests long non-coding RNAs may be the missing link, at least to some degree. I will present our recent work on high-resolution cDNA sequencing of non-coding regions associated with neuropsychiatric disorders using targeted sequencing on the PromethION. I will detail the discovery of new non-coding RNAs, mRNA isoforms, long-range exon dependencies, and how these relate to mental health and neurodegeneration. Given the apparent involvement of RNA modifications in neurological diseases, I will then segue into direct RNA sequencing and describe our in vitro strategy to train RNA base callers and detect modified RNA bases. This will include data from our recent preprint on detecting m6A in RNA with 90% accuracy. Finally, I will present how we maximise flow cell output by combining an innovative RNA barcoding strategy with 'AI'.
Martin Smith is Head of the Genomic Technologies program at the Kinghorn Centre for Clinical Genomics, located at the Garvan Institute of Medical Research in Sydney, Australia. He has been using nanopore sequencing since 2014, with a heavy focus on transcriptomic applications. Martin is a computational biologist from Canada with a background in genomics, microbiology and immunology.
Direct nanopore sequencing of ribosome protected mRNA fragments
Ribosome profiling provides positional information of ribosomes along coding mRNA sequences, thus representing a powerful tool for gene expression analysis at the protein synthesis level. The currently available technologies for ribosome profiling are time-consuming, requiring laborious sample processing and several days for the preparation of DNA-based libraries that i) cannot carry over all the sequence information contained in native RNA molecules, and ii) easily introduce amplification biases. Oxford Nanopore pioneered direct RNA sequencing with a nanopore-based long-read sequencing technology. Here, we present a new library prep strategy for nanopore sequencing of short RNA molecules. Our method enables the reading of 30 nt-long ribosome-protected fragments (RPFs) and allows for faster ribosome profiling experiments with the potential to deliver information on post-transcriptional sequence modifications.
University of Nottingham
Prof. Matt Loose is based at the School of Life Sciences, University of Nottingham. A developmental biologist and bioinformatician, he also heads up DeepSeq, the University of Nottingham next-generation sequencing service. The DeepSeq lab is equipped with MinION, GridION and now PromethION. DeepSeq actively encouraged Nottingham academics to apply to join the Nanopore Community and, in return, supported participants with both library prep and bioinformatics; this led to the development of tools including MinoTour and to work on Read Until. Matt was initially interested in the generation of long reads to sequence novel genomes alongside real-time analysis of MinION data. To that end, he recently co-led with Prof Nick Loman the sequencing and assembly of the first reference human genome on the MinION. DeepSeq also have a small urn in their possession, although perhaps not for much longer.
Plant de novo genome sequencing and assembly using Oxford Nanopore Technology
Oxford Nanopore sequencing technology has made it possible to sequence and de novo assemble plant genomes at relatively low cost with fast turn-around times. Continuous improvements in the technology as well as our optimization of DNA extraction, size selection and library preparation in combination with current long-read assemblers enable us to assemble larger plant genomes to a high-quality draft level. Using Hi-C we are further able to improve those high-quality contiguous genomes to chromosome-scale assemblies. This talk will highlight the progress we made in the past 3 years with this technology on the examples of the genomes of Solanum pennellii, Vinca minor, Physalis ixocarpa and Physalis alkekengi and also show how genes can be easily structurally annotated in de novo genome assemblies using the Oxford Nanopore RNA-seq technology. The low error-rate of these gene models derived from polished nanopore assemblies also allows for high-throughput functional annotation with the Mercator 4 pipeline.
Maximilian Schmidt was awarded a BSc in Biotechnology from the University of Cooperative Education Riesa in 2012, before moving to RWTH Aachen to complete an MSc, where he studied genes involved in plant cell wall biosynthesis. He is currently a PhD student with Prof. Usadel at RWTH Aachen, where he is interested in de novo plant genome sequencing.
Untangling heterogeneity in DNA replication with nanopore sequencing
Genome replication is a stochastic process whereby each cell exhibits different patterns of origin activation and replication fork movement. Despite this heterogeneity, replication is a remarkably stable process that works quickly and correctly over hundreds of thousands of iterations. Existing methods for measuring replication dynamics largely focus on how a population of cells behaves on average, which precludes the detection of low-probability errors that may have occurred in individual cells. These errors can have a severe impact on genome integrity, yet existing single-molecule methods, such as DNA combing, are too costly, low-throughput, and low-resolution to effectively detect them. We have created a method called D-NAscent that uses Oxford Nanopore sequencing to create high-throughput genome-wide maps of DNA replication dynamics in single molecules. I will discuss the informatics approach that our software uses, as well as questions pertaining to DNA replication and genome stability that our method is uniquely positioned to answer.
Michael Boemo is a postdoctoral research assistant in the Sir William Dunn School of Pathology at University of Oxford with Professor Conrad Nieduszynski, and currently holds the Emanoel Lee Junior Research Fellowship at St. Cross College. Michael completed his PhD in condensed matter physics in 2016 at the University of Oxford where, together with Professor Andrew Turberfield and Professor Luca Cardelli, he developed a computing system comprised of autonomous robots made from DNA. Dr Boemo is interested in developing computational methods to study systems biology, and his current work aims to develop methods to study DNA replication dynamics at single-molecule resolution and a new process algebra for the simulation of biological systems.
Deep transcriptomic sampling with long-read single cell RNA sequencing
Single cell RNA-seq (scRNA-seq) is rapidly gaining favour for understanding biology. Long-read methods have great potential for scRNA-seq by facilitating the identification and quantification of gene isoforms, allowing cell specific variation in isoform expression and splicing to be characterised. Previous methods for transcriptome-wide long-read scRNA-seq have been limited by low numbers of cells and/or few reads per cell. We have developed an improved nanopore long-read scRNA-seq method that allows the profiling of hundreds of single-cells at high read-depth and demonstrate its use by profiling five human cancer cell lines. These results demonstrate the power of long-read sequencing to characterise gene and isoform expression in single cells.
Mike Clark is head of the Transcriptomics and Neurogenetics group at the University of Melbourne in Australia. His research sits at the intersection of genomics and neuroscience, utilizing a number of genomic approaches, including nanopore sequencing, to investigate gene expression and function in the human brain and in neuropsychiatric disorders.
Accelerated de novo assembly on GPUs
Recent years have seen an uptake in the use of GPUs for genomics, from basecalling (e.g. Guppy) to variant calling (e.g. DeepVariant). Long-read sequencing technology such as Oxford Nanopore sequencing holds the promise of simple and cost-effective de novo assembly. This is important for generating reference sequences (even for complex, polyploid organisms) and identifying structural variants such as deletions and translocations. One of the difficulties of high-quality de novo assembly, however, is its substantial computational cost. Post-sequencing assembly can take longer than the sequencing experiment itself. The NVIDIA genomics team is harnessing the power of GPUs to develop a pipeline for massive acceleration of de novo assembly. Our end goal is real-time long-read de novo assembly.
Mike Vella is a Senior Deep Learning and Genomics Engineer at NVIDIA Corporation. Mike works on using GPUs to help researchers with the analysis of high-throughput sequencing data. Mike has a PhD in Computational Neuroscience from the University of Cambridge and an undergraduate degree in Physics from the University of Bristol.
It is all about accessibility: Galaxy as a framework for democratizing Oxford Nanopore data analysis
Thanks to Oxford Nanopore Technologies, long-read sequencing is becoming more accessible for a much broader range of applications and end-users. Bioinformatics analysis was already a bottleneck with the previous generations of sequencing technologies, but even more so with the new generations. Nanopore-based sequencing technologies are so much more accessible and can rapidly produce so much more data that the data analysis challenges can become fundamental. Community-driven solutions to democratize data analysis are crucial in the same way Oxford Nanopore is democratizing sequencing. Galaxy has been shown to be a successful option for short-read sequencing, but we think its advantages will shine even more in the era of long-read sequencing. Firstly, the user-friendly web interface does not require advanced computational skills, making it ideally suited for this interdisciplinary area and educational purposes. Secondly, the software and workflows can be seamlessly upgraded at the server side while maintaining 100% reproducibility of the performed analysis. Thirdly, the computational infrastructure supports a diverse spectrum from personal computers to cluster grids and the cloud. Within the scope of this project, we provide Oxford Nanopore-related tools in Galaxy. We have developed a collection of best-practice workflows for genome assembly within Galaxy. Our work is available for everyone at the European Galaxy server (https://usegalaxy.eu) and supportive self-learning training material is available. I will also introduce the Street Science Community (https://streetscience.community), a voluntary-based non-profit group that aims to teach the public the fundamental concepts of molecular biology and genetics data analysis by analyzing the “DNA of beer” using MinION and Galaxy.
Milad Miladi is a PhD candidate and research assistant in the Bioinformatics group at the University of Freiburg in Germany. With a background in computer science and RNA computational biology, his research involves transcriptomics, non-coding RNAs and reproducible data analysis with Galaxy.
Identification of new somatic structural variants and cancer driver genes using long-read nanopore sequencing
Third-generation DNA sequencing technologies have been transforming genome medicine and cancer research, producing evidence that structural variations (SVs) are common and major drivers of complex diseases and tumorigenesis. By taking advantage of the unparalleled long-read and high-throughput capability of the Oxford Nanopore PromethION platform, we investigated the role of SVs in cancer development. We sequenced DNA obtained from colorectal cancer biopsies and corresponding normal tissue samples from Han Chinese individuals. Using a comprehensive SV-calling pipeline that consists of NGMLR and Sniffles, dynamic filtering, database search and comparison, manual curation, and breakpoint mapping, we obtained high-quality SV call sets. By using PCA, population structure, and frequency spectrum analyses, we identified a set of SVs that are tumor specific. In addition to somatic point mutations in mismatch repair genes that are well known for causing colorectal cancers, we observed complex somatic SVs that show evidence of chromothriptic rearrangements, a hallmark of late-stage tumors, that were focally localized to a terminal region of a chromosome in colorectal cancer samples. One of the complex somatic rearrangements was linked to the amplification of a gene that is essential for DNA recombination. Furthermore, we also observed a direct link between the expansion of microsatellites and SVs, suggesting that microsatellite instability might drive the formation of SVs and cause genome instability in colorectal cancers. Collectively, our results demonstrate the power of the Oxford Nanopore PromethION platform for high-resolution analysis of SVs in the human genome, which can lead to a better understanding of the molecular, biochemical, and cellular events that govern tumor progression.
As Director of the Grandomics Genome Institute, Min works with a talented group of scientists and technologists who develop new genomic solutions to enhance the strengths of the Oxford Nanopore platform for genome science and genome medicine. His team integrates existing and new methods to create a comprehensive pipeline to produce complete animal and plant genomes with a minimum number of gaps. His team also studies the origin, mechanisms, and roles of SVs in adaptive evolution, complex diseases, and tumorigenesis.
Generating high-quality reference human genomes using PromethION nanopore sequencing
To catalogue and associate all forms of human genetic variation to health and disease, a new generation of genome sequencing and assembly technologies is required. However, current workflows for producing high-quality human genome assemblies have overall cost and production time bottlenecks that prohibit scaling to hundreds of individuals. We designed and evaluated an optimized PromethION-based workflow to produce near reference-quality genome assemblies for the offspring from ten parent-offspring trios. We demonstrate the production of long-read, high-quality, and high-coverage genomes with a less than one-week total turnaround time from sample extraction to complete assembly, and a total projected cost of less than $10k per genome. To lower costs and improve quality we have developed three new tools: 1) Shasta - a nanopore de novo long-read assembler that on a single compute node can produce complete human genomes in around 6 hours; 2) marginPolish - a new graphical model-based assembly polisher that improves on earlier methods in both cost and accuracy; and 3) HELEN - an RNN-based multi-task learning model that further refines the base and run-length prediction for each genomic position and produces state-of-the-art results. We evaluate the performance based on assembly accuracy, throughput/timing, and cost and demonstrate improvements relative to current best-of-breed in all areas. Recognizing that even 100kb reads are insufficient to scaffold through the most repetitive regions of the human genome, we augment this sequencing with a Hi-C long-range library to facilitate scaffolding and haplotype phasing.
Miten is an Assistant Research Scientist at the University of California, Santa Cruz. His research interests include developing methods for long-read sequencing of DNA and RNA, methods for detection of base modifications, and software for analysis of MinION and PromethION data.
Multi-copy qnrA1 plasmid causes elevated quinolone resistance in E. coli
Large multi-drug resistance (MDR) plasmids are often challenging to assemble and circularize due to repetitive sequences, numerous insertion elements, and regions sharing homology with the bacterial chromosome. These complications are exacerbated by copy number variation introduced by large duplication events involving multiple genes. In this work, we emphasize the utility of nanopore long-read sequencing to overcome these challenges by producing a complete sequence of a 200kb MDR plasmid containing a duplication event mediated by insertion sequence ISCR1 transposition. The 20kb multi-duplication introduced four additional copies of the qnrA1 quinolone resistance gene. The qnrA1 copy number appears to have a dosage effect significantly increasing resistance to ciprofloxacin in E. coli strains carrying the plasmid. To our knowledge this is the first report of a resistance mechanism associated with a stable qnrA1 duplication in an E. coli plasmid.
Characterizing large homology directed repair (HDR) insertions by CRISPR/Cas9 using MinION long-read sequencing technology
Precise genome editing by the CRISPR/Cas9 system has proven to be ground-breaking in basic research. Cas9 protein is increasingly being used for genome editing by direct transfection of an active guide RNA-Cas9 ribonucleoprotein (RNP) complex into cells, which introduces double-stranded breaks (DSBs) at targeted genomic loci. DSBs are repaired by endogenous cellular pathways such as non-homologous end joining (NHEJ) and homology-directed repair (HDR). Providing a ssDNA template during repair allows researchers to precisely introduce a desired mutation by utilizing the HDR pathway. However, rates of HDR are often low compared to NHEJ-mediated repair and analysis of large (>100 nt) insertions can be challenging. Long-read sequencing technology allows for a more comprehensive analysis of the outcome of large insertions or deletions created by CRISPR/Cas9 genome editing. Here, we use a target enrichment approach to selectively sequence a region of interest (ROI) around the CRISPR-edited site to measure the rates of precise insertion by HDR.
Mollie Schubert is a Research Scientist in the molecular genetics research group at Integrated DNA Technologies. Mollie received her master’s degree in biochemistry from Iowa State University and has been at IDT since 2013. For the past five years, she has focused on studying CRISPR gene editing, including high-throughput screening of CRISPR-Cas9 guides for the development of a site selection tool, optimizing the composition and delivery of synthetic RNA reagents complexed to recombinant CRISPR nucleases, and developing methods for efficient gene editing with a recent focus on improvements to homology directed repair.
Nanopore sequencing and analysis of plant pathogenic viruses: more than just rapid diagnostics?
The use of the portable MinION sequencer in plant pathology is rapidly increasing. Many studies have shown that the accuracy, portability and reduced time to result using the MinION are actively changing the way we do diagnostics and new diagnostic development for pests and diseases in agriculture. The first advantage of the MinION is clearly the ability to obtain rapid preliminary IDs of unknown pests and disease in the field. This was demonstrated recently by the Cassava Virus Action Project, taking just 4 hours to identify the virus present in symptomatic cassava plants in the field. The other advantage that can be overlooked is the opportunity to reduce the turn-around time to diagnosis for unknown pests and pathogens in a laboratory setting, as well as avoiding the need for expensive specialised equipment. While much has been made of the advances in in-field diagnostics, we wanted to answer the question of how does the data stack up in a laboratory setting when compared with other technologies? The small start-up costs and easy access to MinION sequencing, compared to other technologies makes it very attractive to plant virologists. Our team took a set of plant RNA samples from field pea with a known viral composition of a Potyvirus (Pea seed-borne mosaic virus - polyadenylated) and a Polerovirus (Turnip yellows virus – non polyadenylated), which we already had an Illumina data set for, and sequenced the samples using a cDNA and direct RNA kit on a MinION. We compared downstream analyses performed with both the Illumina and MinION data. The results of our research suggest that not only is MinION suitable for rapid diagnostics in the laboratory and the field, but it is also useful in a wider research capacity.
Dr. Monica Kehoe is a Plant Virologist and Molecular Plant Pathologist working for the Western Australian Department of Primary Industries and Regional Development (WA DPIRD) in the diagnostic and laboratory services section. Her current work focuses mainly on the development, validation and use of molecular methods for plant disease diagnostics across a broad range of broadacre and horticultural crops. Research interests include the cassava brown streak and mosaic viruses, luteoviruses in pulses and oilseeds, grapevine viruses, viruses of vegetable crops, supercomputing for plant disease diagnostics and the use of portable sequencing for rapid diagnostics in plant pathology, in both the field and the laboratory. Monica has a B.Sc from the University of Melbourne, Honours in Plant Virology from Murdoch University, and in 2014 completed her PhD in Plant Virology at the University of Western Australia.
Resolution of germline hereditary cancer structural variants using nanopore sequencing
Structural variants (SVs) are difficult to ascertain using short-read sequencing technology. As part of the Personalized OncoGenomics (POG) study, tumour and matched normal blood Illumina whole-genome sequencing was performed in patients with advanced cancers. We used Oxford Nanopore sequencing for validation and breakpoint resolution of four germline SVs in hereditary cancer genes: 1) ATM deletion, 2) NTHL1-TSC2-TRAF7 complex rearrangements, 3) IFT140-TSC2 inversion and 4) UIMC1-NSD1 complex rearrangements. The 12 breakpoints of these 4 SVs were seen in the nanopore data. Long-read sequencing was necessary for the resolution of SVs and corrected the initial interpretation in 3 out of 4 cases. Our results also showed the suspected IFT140-TSC2 large inversion to be a small intronic inverted duplication event that did not disrupt either gene. Our results suggest that short-read technology may not be sufficient for SV assessment. Long-read sequencing technology may eventually be considered as an option for the detection and validation of clinically relevant germline SVs.
Dr. My Linh Thibodeau is currently training in the Medical Genetics Residency Program at the University of British Columbia. In 2017, she won entry to the Royal College of Physicians and Surgeons Canada Clinician Investigator Program to apply bioinformatic approaches to the discovery and characterization of hereditary cancer predispositions. During her work in the Personalized OncoGenomics study at BC Cancer in Vancouver, Canada, Dr. Thibodeau acquired expertise in the analysis and integration of whole genome and whole transcriptome datasets. Taken together with her medical training, these experiences have allowed Dr. Thibodeau to develop a unique clinical-bioinformatic skillset.
Extracting megabase DNA
Ultra-long DNA extraction and library prep protocols have generated sequencing reads of up to 2.3 Mb on MinION flow cells, with individual runs yielding multiple reads longer than 1 Mb and N50 values higher than 100 kb. Optimisation experiments are currently being carried out to increase flow cell yields and further improve read length statistics across all Oxford Nanopore platforms. Results shown will compare DNA extracted from a diverse range of organisms and sample types and include QC comparisons from different ultra-high molecular weight (ultra-HMW) extraction protocols including manual and automated approaches. We will also share progress updates comparing library preps from these samples run over Flongle, MinION and PromethION flow cells.
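For readers unfamiliar with the read length statistic quoted above, the following is a minimal sketch of how a read N50 is conventionally computed: the length L such that reads of length L or longer contain at least half of all sequenced bases. The function name and the example lengths are illustrative only.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths (in bases)."""
    if not lengths:
        raise ValueError("at least one read length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest read down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Made-up read lengths in bases, for illustration only.
example = [2_000, 3_000, 4_000, 5_000, 6_000]
print(read_n50(example))
```

Because the statistic is weighted by bases rather than by read count, a small number of very long reads can raise the N50 considerably.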
Nadine is currently working as the Senior Technical Specialist at Deep Seq, a multi-platform sequencing facility at the University of Nottingham. Becoming a specialist in Next Generation Sequencing (NGS) followed on from a research career in bacterial genomics that focussed on the early adoption of new sequencing and comparative genomics platforms. Deep Seq is a certified service provider for GridION and PromethION sequencing and routinely receives requests to extract DNA from a wide range of different organisms and sample types for Oxford Nanopore long-read sequencing.
Ultra-long reads and ultra-long duplications: deciphering the mysteries of the Bordetella pertussis genome
In light of widespread resurgence of the respiratory disease whooping cough, ongoing research aims to identify changes to the causative bacterium, Bordetella pertussis. B. pertussis is traditionally described as a highly clonal species at the single-base level, hence our research largely focusses on identifying differences between strains on a whole-genome scale. Long-read sequencing has enabled us to produce closed genome sequences for B. pertussis isolates on an unprecedented scale, allowing visualisation of extensive inter-strain genomic rearrangements. This work also led to the unexpected discovery of a second phenomenon: large duplications which are present in some recent isolates but not in the B. pertussis reference genome. Intriguingly, these duplications may be present in only a fraction of the cells of duplication-carrying strains. At London Calling 2019, I will discuss this developing story, including the essential role of long and ultra-long nanopore sequencing in proving the existence of the duplications and characterising variable populations, alongside continuing work to quantify the phenotypic effects of the duplications.
Natalie Ring graduated from the University of Bath with a BSc in Biochemistry in 2012. She then spent four years working at MRC Harwell as a data wrangler for the International Mouse Phenotyping Consortium, as well as completing a post-graduate qualification in Science Communication from the University of Edinburgh. She is currently a PhD student at the University of Bath in the Bagby and Preston groups, studying the genome of Bordetella pertussis, the bacterium responsible for whooping cough.
Semi-quantitative characterisation of mixed pollen samples using MinION sequencing and Reverse Metagenomics (RevMet)
The ability to identify and quantify the constituent plant species in mixed-species pollen samples has important applications in ecology, conservation, and agriculture. Metabarcoding methods for pollen can reveal presence of species but do not allow reliable quantification. A PCR-free, shotgun metagenomics approach has greater potential for accurately quantifying relative abundances but applying metagenomics to eukaryotes is challenging due to low numbers of reference genomes. We have developed a pipeline, RevMet (Reverse Metagenomics), that allows reliable and semi-quantitative characterization of the species composition of mixed-species eukaryote samples, such as bee-collected pollen, without requiring reference genomes. Instead, reference species are represented by short-read genome skims. The skims are mapped to MinION-generated long reads from mixed-species samples, and each long read is uniquely assigned to a species. We genome-skimmed 49 wild UK plant species, validated our pipeline with mock DNA mixtures, and then applied RevMet to bee-collected pollen.
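The assignment step described above (skims mapped to long reads, each long read assigned to one species) can be sketched roughly as follows. This is not the RevMet implementation: the function names, the input format (per-read fraction of the long read covered by each species' skim) and the `min_fraction` threshold are all assumptions made for illustration.

```python
from collections import Counter


def assign_read(coverage_by_species, min_fraction=0.15):
    """Assign one long read to a single species, or to None.

    coverage_by_species maps species name -> fraction (0..1) of the long
    read covered by that species' skim reads. min_fraction is a
    hypothetical cut-off below which the read is left unassigned.
    """
    if not coverage_by_species:
        return None
    ranked = sorted(coverage_by_species.items(), key=lambda kv: kv[1], reverse=True)
    best_species, best_cov = ranked[0]
    if best_cov < min_fraction:
        return None
    # Leave exact ties unassigned so that every assignment is unique.
    if len(ranked) > 1 and ranked[1][1] == best_cov:
        return None
    return best_species


def relative_abundance(assignments):
    """Turn per-read assignments into per-species proportions of assigned reads."""
    counts = Counter(a for a in assignments if a is not None)
    total = sum(counts.values())
    return {species: n / total for species, n in counts.items()} if total else {}
```

Counting assigned reads per species gives only a semi-quantitative estimate, since genome size and DNA yield differ between species.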
Blood donor genotyping - how can long range sequencing help?
To ensure the safety of blood transfusions it is critical to match the blood type of the donor with that of the recipient. Current typing methods use monoclonal antibodies; however, reagents for rare blood groups are expensive, unavailable or unreliable. DNA-based identification of human blood groups has been used to overcome these limitations and its application has reduced rates of alloimmunisation in chronically transfused patients. While recent studies have shown the high degree of blood typing accuracy that can be achieved with modern high-throughput molecular techniques, structural variants and rare recombination events in the genome remain a source of error. Long range sequencing technologies can be leveraged to produce high quality haplotype reference sequences for the blood group encoding genes, which can be used to improve current typing algorithms.
Nicholas Gleadall is a PhD student working in the laboratory of Professor Willem Ouwehand at the University of Cambridge. His work focuses on the genetics of human blood group antigens and the development of techniques for high-throughput, DNA-based donor typing. Nicholas has previous experience introducing new technologies into clinical service by developing and validating diagnostic laboratory assays for large organisations such as Public Health England, where he worked on HIV whole genome sequencing for national surveillance and resistance genotyping, and NHS England, where he worked on a project focussed on exome sequencing for diagnosis of rare human inherited disorders.
University of Birmingham
Nick is Professor of Microbial Genomics and Bioinformatics in the Institute of Microbiology and Infection at the University of Birmingham and a Fellow at the Alan Turing Institute. He is supported by a Fellowship in Microbial Genomics Bioinformatics as part of the MRC CLIMB project. His research explores the use of cutting-edge genomics and metagenomics approaches to the diagnosis, treatment and surveillance of infectious disease. Nick has so far used high-throughput sequencing to investigate outbreaks of important Gram-negative multi-drug resistant pathogens, and recently helped establish real-time genomic surveillance of Ebola in Guinea and Zika in Brazil. His current work focuses on the development and evaluation of novel molecular biology, sequencing and bioinformatics methods to aid the interpretation of genome and metagenome scale data generated in clinical and public health microbiology.
Revealing mRNA alternative splicing complexity in the human brain
Identifying the cellular pathways underlying psychiatric disorders has great potential to improve patient lives. Voltage-gated calcium channels (VGCCs), including CACNA1C, have been linked to the risk of bipolar disorder and schizophrenia, so are promising targets for new treatments. VGCCs are also important in the cardiovascular system, meaning treatments must be tissue-selective. We aim to identify brain-enriched CACNA1C isoforms as novel targets for new treatments. We used targeted long-range nanopore sequencing to characterise full length transcripts of CACNA1C in human brain and mouse brain, heart and aorta. In human we identified >250, and in mouse >190, mRNA isoforms, exons and splice junctions; including many predicted to modulate protein function. In mice, splicing was clearly tissue-specific, and we expect that this underlying principle of tissue-specific splicing will be conserved between mice and humans. Our data show that splicing of CACNA1C in human brain is far more diverse than is currently appreciated. This information will be critical to reveal pathophysiological mechanisms, and to identify brain-enriched VGCC isoforms that may be novel targets for new psychiatric treatments.
Nicola Hall is a postdoctoral researcher at the University of Oxford in the Department of Psychiatry with the Tunbridge group. She is using her background in molecular biology and RNA sequencing to investigate gene expression in the human brain. Her current work focuses on alternative splicing of the calcium channel CACNA1C, implicated in schizophrenia and bipolar disorder. Nicola completed her PhD in 2017 at the University of Oxford in the Department of Biochemistry.
Nanopype: processing and quantification of short tandem repeats
The availability of substantially longer reads with the Oxford Nanopore approach opens new possibilities in many fields and explains the increasing use of nanopore technology. To facilitate access and match storage as well as processing routines to the higher demand, we assembled Nanopype, a modular, parallelized and easy-to-use pipeline to process the sequencing data from the raw signal output into standardized formats. Specifically, Nanopype facilitates the essential steps of base calling, quality control, and alignment, as well as various downstream applications, by incorporating field-specific tools complemented by custom utility scripts. To illustrate its application, we apply it to the assessment of short tandem repeats that have been implicated in neuropsychiatric disorders. Combined with a Cas12a-based enrichment strategy and the STRique package, we show efficient targeting and quantification at the raw signal level, as well as determination of the associated methylation status.
Pay received his MSc in Electrical Engineering from the Kiel University of Applied Sciences with a focus on embedded systems and hardware accelerated signal processing. He is currently a PhD student in Alex Meissner's lab at the Max-Planck-Institute for Molecular Genetics in Berlin. Pay is interested in the epigenetic regulation of the genome, direct base modification detection and developing tools and pipelines to process third generation sequencing data.
Retroviral invasion of the koala genome
Nanopore direct RNA sequencing enables comprehensive transcriptome profiling and modification detection
RNA sequencing provides insight into the molecules that are actively expressed in a given tissue type at a certain point in time. Traditional RNA sequencing approaches generate cDNA from RNA templates using reverse transcriptase, usually with PCR amplification. Therefore, traditional approaches are subject to biases that are associated with the reverse transcriptase enzyme and PCR and are not able to assess RNA modifications. Oxford Nanopore’s new direct RNA sequencing approach does not depend on reverse transcriptase or require PCR, allowing users to overcome biases typically associated with their use. Here, we apply this direct RNA sequencing approach to interrogate two important human cell lines: a widely used lymphoblastoid cell line (GM12878) and induced pluripotent stem cells. Using this approach, we extensively sampled the transcriptome, sequencing ~3 million molecules in a single sequencing run. These reads reached a median length of 1 kb, with maximum read lengths >13 kb. More than 90% of these reads align to the human reference genome and existing transcriptome, allowing us to examine known transcript variants. Additionally, existing RNA modifications such as 6mA and 5mC can be readily detected using Tombo at known transcript locations. Analyzing this data in two distinct cell types allows us to discover new cell-type specific transcripts in coding and noncoding regions of the genome and assess their specific associated modifications. Overall, this direct RNA sequencing methodology allows for efficient, comprehensive transcriptome profiling.
Rachel Goldfeder is a Computational Scientist on the Genome Technologies team at The Jackson Laboratory for Genomic Medicine. Her research interest is in using novel sequencing approaches to aid disease understanding, diagnosis, prognosis, and treatment. Rachel holds a BS in Biomedical Engineering from Washington University in St. Louis and a PhD in Biomedical Informatics from Stanford University.
‘O nā Hōkū nō nā Kiu o ka Lani: Bridging genomics research and culture-based science education through astrobiology in Hawaiʻi
Culture-based science education provides a framework to develop the potential of every student (haumāna), through the mutual exchange of cultural and academic knowledge (aʻo). Using this framework and nanopore technology, we are developing teacher training programs and classroom exercises (ʻĀina-Informatics) that bridge research in bacterial quorum sensing and Hawaiian cultural traditions. Quorum sensing is a type of intercellular signaling found in Bacteria and Archaea that coordinates activities among microbes through gene expression. Students and teachers have sequenced complete genomes of bacterial strains of interest in our research. We are also using nanopore reads to assist with de-convolution of genomes from microbial mats found in a variety of extreme environments. These data will address the importance of cooperative behavior in extreme environments. Our educational programs focus on increasing interest among groups underrepresented in STEM, helping to provide a cultural identity, and providing teachers with training in genomics and bioinformatics.
Rebecca (Becks) received her B.A. degree in Ecology and Evolutionary Biology from the University of Arizona in the USA. She then went on to continue her studies in Natural Resources at the University of Arizona, where she worked on marine benthic ecology and community-based resource management to obtain a Master's degree. During this work, she began studying biofilms, which are the focus of her research. Her Ph.D. work at the University of Hawai‘i at Mānoa investigated interactions between coral larvae and biofilms on the Great Barrier Reef in Australia. She is currently a National Science Foundation Postdoctoral Fellow in Biology working on “Survival in extreme environments through cooperation: biofilms and looking for life on Mars.”
Biological evidence of the future: the use of sequencing in forensic DNA analysis
Forensic DNA profiling uses short tandem repeat (STR) analysis for human identification purposes, i.e. to establish a link between biological evidence and an individual. This technique is currently limited to assessing the length of STR alleles via capillary electrophoresis and relies on the comparison to a reference DNA profile. The advent of DNA sequencing has revolutionised the field of forensic genetics. Alleles with the same length but a different sequence can be distinguished, providing additional discrimination between individuals, which can greatly aid in DNA mixture interpretation. Rare sequence mutations can be identified to differentiate identical twins, who cannot be told apart using conventional DNA profiling. Using sequencing, scientists have also begun to harness intelligence-based information that a biological sample can provide and that could be of use in an investigation. The analysis of single nucleotide polymorphisms (SNPs) offers new opportunities in the form of forensic DNA phenotyping and forensic epigenetics. Prediction of eye, hair and skin colour, as well as bio-geographic ancestry and chronological age estimations of an unknown individual, are all now possible. The introduction of nanopore sequencing technology has the potential to transform the field of forensic genetics even further. The portability and real-time capability of the MinION could shift analysis out of the lab and into the field, greatly reducing cost and turnaround time, both of which are critical in an investigation. Research into the feasibility of this technology for forensic applications is currently underway. Sequencing has not only changed the field of forensic genetics, but has also changed the way biological evidence is approached and could be used in investigations, with wide-reaching effects in the enforcement, legal, governmental and judicial fields.
Although not routinely used in forensic casework at present, many forensic laboratories around the world are currently validating sequencing technologies with the expectation that this will be the biological evidence of the future.
Rebecca Richards is a doctoral student in the Forensic Science Programme at the University of Auckland. Her research focuses on the development and optimisation of DNA methylation markers for forensic applications, specifically identical twin differentiation and chronological age estimation. Rebecca is also a senior technician in the Forensic Biology Group at the Institute of Environmental Science and Research (ESR), a Crown Research Institute which provides forensic services to the New Zealand Police. In addition, she is running point for the MinION research currently being undertaken at ESR and is involved in the wider validation of DNA sequencing for forensic use.
Don’t let data management be your bottleneck
Once you can generate terabytes of data from a single flowcell, simply moving and storing that data can become the bottleneck for your workflow. We present some recent and upcoming MinKNOW features designed to help with your data management challenges. Find out how to trigger analyses automatically and hear some rules of thumb to help you plan what kit you need. We will also give you some insight into how Oxford Nanopore has scaled out data management to hundreds of devices.
Richard Carter has been part of the Informatics team at Oxford Nanopore since 2010. He has worked on a diverse range of applications, from bioinformatics and customer-facing software solutions through to his current role leading a team that analyses performance metrics for Oxford Nanopore devices. He has been involved in next generation sequencing for over ten years, prior to which he worked in bioinformatics and structural biology.
From amplicons to metagenomes: Long read sequencing the environment
Complex environmental matrices, such as soil, sediment and excreta, are often synonymous with diverse microbial communities. Long read sequencing of DNA extracted from such communities can yield highly contiguous genomic data and provide information on both genetic composition and structure. However, DNA extracted from such matrices is often impure and fragmented, and can potentially lack complete representation. Therefore, techniques such as isolation, enrichment and metagenomic assembly are used to answer questions on function and diversity. Here we present the sequencing and assembly of two environmental AMR-harbouring plasmids and one novel GC-rich genome isolated from the environment. Furthermore, we describe our exploration into the sequencing and analysis of long-range amplicon-based enrichment for AMR-associated mobile genetic elements, and undertake metagenomic analysis of the community composition of two fractions of industrial anaerobic digesters. This has permitted us to investigate the evolution and selective drivers of AMR in the environment.
Dr Rob James is currently working as a post-doctoral research fellow with Prof. E. Wellington as lead investigator on the BBSRC-funded project “Mycobacterium bovis and the farmland ecosystem: understanding transmission dynamics between animals and the environment.” This collaborative project between the University of Warwick, the Zoological Society of London and Imperial College aims to identify the environmental reservoirs of infection in agricultural land use types, and routes of transmission between mammalian hosts and the environment. Furthermore, Rob has an interest in the evolution and selection of antimicrobial resistance genes in the environment and has recently undertaken work to quantify AMR gene abundances in farmland and residential areas of Karachi and Islamabad.
Rapidly mapping raw nanopore signal with UNCALLED to enable real-time targeted sequencing
UNCALLED is a tool that maps raw nanopore reads to large DNA references as they are being sequenced. It is a streaming algorithm, meaning the mapping begins as soon as the first bit of signal comes from the sequencer. UNCALLED can currently map reads from all active MinION pores to a 31 Mbp reference containing eight bacterial genomes after less than three seconds of sequencing and analysis per read. The main application for UNCALLED is ReadUntil sequencing, where reads can be ejected from the pore depending on whether or not they map to the reference.
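The ReadUntil decision described above can be illustrated with a small sketch. It does not use the UNCALLED or MinKNOW APIs: the function `decide`, its arguments and the "enrich"/"deplete" mode names are hypothetical, chosen only to show how a streaming mapping result could drive a keep-or-eject choice.

```python
def decide(maps_to_reference, mode="enrich"):
    """Return the action for one read that is still in the pore.

    maps_to_reference: True if the signal seen so far maps to the
        reference, False if it confidently does not, None if the
        streaming mapper has not reached a decision yet.
    mode: "enrich" keeps reads that map to the reference;
        "deplete" ejects reads that map to the reference.
    """
    if mode not in ("enrich", "deplete"):
        raise ValueError(f"unknown mode: {mode!r}")
    if maps_to_reference is None:
        return "wait"  # collect more signal before deciding
    if mode == "enrich":
        return "keep" if maps_to_reference else "eject"
    return "eject" if maps_to_reference else "keep"


# Hypothetical usage: three reads with different mapping states.
for state in (True, False, None):
    print(state, decide(state, mode="enrich"))
```

The value of ejecting a read depends on deciding early, which is why the time needed to map each read matters for this application.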
Sam Kovaka is a third year PhD student in computer science at Johns Hopkins University, co-advised by Michael Schatz and Mihaela Pertea. He attended Clark University for his undergraduate degree, majoring in biology and computer science. Sam started working with nanopore sequencing in the first year of his PhD with a class project that turned into the work that he will be presenting at London Calling 2019.
Structural diversity in the budding yeast
Population genomics within the last few years has been driven by short-read sequencing. For my species of interest, Saccharomyces cerevisiae, high coverage Illumina read datasets have been released for approximately 2000 strains within the last two years. Although this gives us a good view of SNPs and INDELs, heterozygosity and the pangenome, we currently lack a comprehensive view of structural variation, including complex rearrangements, within our species of interest, or any species to date. Our project has taken advantage of Oxford Nanopore long-read sequencing to catalogue structural variations (SVs) in ~100 strains representing a wide variety of clades, some with distinct phenotypes such as high sulfite resistance in some European wine strains. To discover SVs we use de novo assembly of highly contiguous genomes, which can be easily compared to a reference using whole-genome alignment. This SV catalogue, alongside detailed SNP and phenotype data, would give us the means to tease apart the phenotypic impact of SVs and SNPs, using CRISPR technology to engineer the discovered SVs into a known genetic background.
MinION application: performing long fragment analysis on pure fungal cultures (3.5 kb and 6 kb) and genome analysis of Malassezia pachydermatis
Two purposes were assessed with nanopore sequencing. (i) Detection of fungal communities using 1D PCR barcoding, starting with pure cultures of Alternaria, Aspergillus, Candida, Malassezia, Microsporum canis, and Penicillium. We compared sequencing of the partial fungal operon (18S-ITS1-5.8S-ITS2-28S) of ~4 kb with the whole operon of ~6 kb. Saccharomyces and Cryptococcus from the ZymoBIOMICS™ mock community were also sequenced as positive controls. Amplification of the partial and whole operon showed the same results. (ii) Malassezia pachydermatis, which is commonly found on skin and may be associated with skin diseases (e.g. atopic dermatitis), was sequenced by 1D native barcoding of genomic DNA and analysed with bioinformatics tools to obtain consensus sequences and to understand whether it is similar to other Malassezia species.
Nanopore sequencing in space: one small step for a MinION, one giant leap for spaceflight research
As NASA sets sights beyond low-Earth orbit, real-time monitoring and diagnostics of crew health and the environment are required. The majority of previous research on the International Space Station (ISS) has relied on ground-based analyses following sample return to Earth. As a result, biological insights could be lost due to sample fixation and time to receipt in the laboratory. Moreover, sample return will become impractical as missions move beyond the ISS. Oxford Nanopore Technologies’ MinION has made in situ sequencing a reality in any field setting. Since our 2016 demonstration of the MinION’s high-performance off-Earth, four astronauts across the last ten ISS expeditions have successfully completed 18 sequencing experiments. Here, we will describe our end-to-end, sample-to-sequencer process that can be conducted entirely aboard the ISS, and which resulted in the first identification of microbes collected and cultured entirely off-Earth. Expanding beyond the need to first culture the microbes, our culture-independent, swab-to-sequencer investigation is currently underway to characterize further the microbiome of the ISS. One of our long-term goals is to enable crew health monitoring using functional genomics. As a start, we sequenced native poly-A RNA (and cDNA) from a human cell line (GM12878) aboard the ISS and will present findings from these experiments. Nanopore sequencing technology and our sample preparation procedures have expanded the reach of molecular biology to the final frontier. These proven capabilities hold the potential to revolutionize space-based research and in-flight medical operations.
Dr. Sarah Wallace serves as technical lead in the Microbiology Laboratory at the NASA Johnson Space Center, which is responsible for mitigating infectious disease risk during human spaceflight. Her responsibilities include the assessment of microbial risk based on vehicle and mission architectures as well as crewmember, food, and environmental monitoring. These assessments are used to develop requirements for NASA and commercial spaceflight vehicles, including the International Space Station. In addition to her operational support of human spaceflight, Dr. Wallace leads new technology initiatives for her discipline with the goal of reducing Earth dependence for complex sample analysis. She has served as PI for numerous spaceflight investigations, including those to increase off-planet molecular biology capabilities and also to understand how the spaceflight environment impacts cellular behavior.
Long-read sequencing and assembly of a large environmental blaCTX-M-15 harbouring plasmid
Infections caused by antimicrobial resistant bacterial pathogens are fast becoming an important global public health issue. Using next generation sequencing data of whole sediment and cultured fractions, our research group have identified wastewater treatment plants (WWTPs) as hotspots for the dissemination of antimicrobial resistance genes/bacteria (ARG/ARB) into the environment. Whilst WWTPs can remove up to 99.9% of faecal coliforms, our results suggest that anaerobic digesters and the water treatment process positively select for ARG/ARB. The persistence of plasmid-mediated ARGs outside of the host-associated system may play a compounding role in shaping the community-acquired resistome. The aim of our research is to understand the mechanisms of enzyme secretion in E. coli and determine why only ESBLs are in the exoproteome and no other beta-lactamases. Here we investigated the secretory mechanism of an ESBL-producing E. coli strain ST131 isolated from a UK water system. Strains of E. coli ST131 carrying multiple resistance genes, including blaCTX-M-15 (encoding extended spectrum beta-lactamase, ESBL), were isolated from the rivers downstream of WWTPs. We then quantified survival under prolonged anaerobic digestion in the presence and absence of selective antibiotics. We also confirmed whether the gene was plasmid-borne and studied the secretory mechanisms associated with all beta-lactamases in the genome. Here we present a method to rapidly sequence, assemble and undertake primary annotation of the bla gene-carrying plasmids that are associated with our environmental E. coli ST131. The use of Oxford Nanopore long-read sequencing has permitted accurate de novo assembly and has helped further resolve the AMR genes' location, composition, order, function and putative mechanism of transposition. Such assembly has been previously unachievable using our existing short-read sequence data set.
Séverine Rangama is currently a PhD student at the School of Life Sciences, University of Warwick. Her research aims to develop an increased understanding of beta-lactam resistance gene expression and to elucidate the secretion of the enzyme via the SecA pathway.
Unraveling shark secrets: sequencing genomes and microbiomes for research and conservation
Chondrichthyes - sharks, rays and chimaeras (‘sharks’) evolved 500 million years ago and are among the oldest extant vertebrate groups today. Sharks have extraordinarily long life spans, exceptional wound healing capabilities and large genomes – qualities which make them ideal candidates for understanding mechanisms contributing to genome stability and immunological resilience. As apex predators, sharks are also vital to top-down regulation of oceanic ecosystems and are therefore crucial to maintaining commercial fish stocks and human food security. However, sharks are disproportionately targeted to meet the international demand for shark fins and as a result an estimated 25-50% of species are threatened with extinction. 50% of shark species are also data deficient, making it difficult to conserve remaining populations and to study their evolutionary adaptations. Our goal is to reduce data deficiency of shark populations through on-site genomic and metagenomic studies in shark biodiversity hot-spots, including the USA, India, Tanzania, Mexico, Australia, and the Philippines. Shark samples are collected from free-swimming sharks or from specimens found in fish markets. Genomic DNA is sequenced on-site by trained undergraduate and graduate students on the MinION. We sequenced four new chondrichthyan genomes: the Silky shark (Carcharhinus falciformis), the Sharpnose guitarfish (Glaucostegus granulatus), and two manta rays (Mobula japonica and Mobula tarapacana). Long-read sequencing on the MinION allowed high depth of sequencing coverage of shark genomes, which are typically 1-6 gigabases in size. Our studies increased the number of sequenced chondrichthyan genomes by 40%. Ongoing genome assessments for population size and structure will allow determination of conservation status for these shark species. Genome comparisons across taxa will increase understanding of mechanisms which impart evolutionary resilience to this species group.
Further, our microbiome analyses of free-swimming whale sharks (Rhincodon typus) in locations across the globe revealed that microbiomes are similar with respect to taxonomic composition and functional profiles in genetically diverse and geographically separated whale shark populations, providing key insights about the biogeography of whale sharks. Analyses of functional profiles of the microbiome in wild thresher sharks (Alopias vulpinus) revealed a 10-fold higher proportion of heavy metal-metabolizing genes in sharks as compared to the water column in coastal San Diego, suggesting either bioaccumulation of heavy metals or a novel baseline microbiome specific to thresher sharks. In summary, use of portable sequencing technology from Oxford Nanopore has reduced the data deficiency of shark populations through local capacity building and will facilitate greater protection of endangered species in the future.
Shaili Johri is a geneticist with a research focus in conservation genomics of wildlife populations. She completed her BSc. and M.S. in India and moved to the United States for her PhD in Genetics. She did her post-doctorate at the Center for Conservation Biology at the University of Washington in Seattle and is currently a research professor at San Diego State University in association with Dr. Elizabeth Dinsdale’s laboratory. Shaili works at the intersection of marine conservation policy and interdisciplinary research and her research career spans projects relating to conservation of tigers in western India, wolves in northwest USA, killer whales in the Pacific Northwest and now sharks and rays in the southwest US and India. Shaili’s role as a project lead involves developing genomic and metagenomic tools to assist with biodiversity assessments and population health monitoring of marine megafauna such as sharks and killer whales. In parallel to her research, Shaili works in close collaboration with fishing communities to develop science-based conservation policies through cross-sector collaborations, outreach and education. In addition to her research, Shaili engages in capacity building for genomic methods among wildlife research communities in the US and abroad and this is where the Oxford Nanopore MinION device has been a game changer.
High-throughput annotation of full-length long noncoding RNAs using CapTrap-CLS
Our genome expresses thousands of long noncoding RNAs (lncRNAs), mRNA-like transcripts that do not encode any identifiable peptide product. Although lncRNAs perform diverse roles in the cell, the overwhelming majority of them remain uncharacterized. Decoding lncRNA function depends heavily on accurate annotations, which map the precise location of lncRNA genes in the genome. Present lncRNA annotations are far from complete, and this incompleteness can have a strong impact on downstream analyses. To improve the annotation of GENCODE lncRNAs in human and mouse genomes, we developed the Capture Long-read Sequencing (CLS) method, which combines targeted RNA capture with long-read sequencing. Despite its benefits, CLS can produce transcripts with incomplete 5’ ends. To address this limitation, we developed Cap-CLS, an upgraded version of the CLS method that uses the CapTrap protocol to enrich 5’ ends of transcripts. Results obtained with nanopore sequencing indicate that Cap-CLS considerably improves the annotation of 5’ ends of transcripts.
Field forward sequencing in naval environments
Obtaining high quality DNA from plant tissues for nanopore sequencing
Setting up your first few nanopore sequencing runs is super exciting, but far too often they end with disappointing yields for those of us who work with plants and other more challenging sample types. Frequently, the DNA extracted using standard kits and common solvent-based techniques is contaminated with substances that are incompatible with nanopore sequencing. I will share with you some methods I have found useful to prepare high quality DNA from plants, how to recognise what contaminants are present and some methods to get rid of them.
Stella is currently a platform coordinator at Deakin Genomics Centre, a small, not-for-profit sequencing centre and NGS training facility in Melbourne, Australia. She has been a technical specialist at Deakin University for over a decade, providing training and support to undergraduate and postgraduate students in a broad range of molecular and cell biology techniques. Stella completed her undergraduate degree in Science at La Trobe University and her PhD at the University of Sydney, Australia. She has extensive experience working with difficult specimens such as formalin-fixed museum specimens, plants, insects and soil.
Don’t let data management be your bottleneck
Once you can generate terabytes of data from a single flowcell, simply moving and storing that data can become the bottleneck for your workflow. We present some recent and upcoming MinKNOW features designed to help with your data management challenges. Find out how to trigger analyses automatically and hear some rules of thumb to help you plan what kit you need. We will also give you some insight into how Oxford Nanopore has scaled out data management to hundreds of devices.
Stephen is Associate Director of Data Engineering at Oxford Nanopore Technologies. He is the principal architect of Oxford Nanopore’s automated mirroring, analysis and archiving system and coordinates the development of various applications supporting the Research and Development groups. Previously, Stephen worked in climate science and cheminformatics, where he developed many systems supporting UK, European and international research, including a major role in the ESGF architecture for sharing climate model outputs across the globe.
Applied bioinformatics: from basic QC to EPI2ME
The Bioinformatics Resource section of the Nanopore Community is an evolving repository of data analysis tutorials which aim to deliver best-practice workflows for researchers to explore their own data. The installation of software dependencies is managed through bioconda and the data analysis is orchestrated using the Snakemake workflow management system. The R Markdown package is used to merge bioinformatics code and nanopore data into a reproducible and literate document. The tutorials are packaged with example data, and the complete tutorial code is placed on our GitHub pages. In this data analysis session, I will introduce our tutorials and demonstrate how they can be used by laboratory researchers developing their bioinformatics skills. Tutorials cover topics such as the basic QC of individual flowcell runs, mapping reads to a reference genome and the quantitative analysis of transcriptome data. A more technical dissection of a tutorial will illustrate how the tutorials can be modified and customised to your needs. We would welcome your feedback and requests for future tutorial topics during the Q&A at the end of the presentation.
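To give a flavour of what a basic QC step on a flowcell run can involve, here is a minimal Python sketch. It is not taken from the tutorials themselves; the function names (`fastq_read_lengths`, `n50`, `read_length_summary`) are hypothetical, and it assumes plain four-line FASTQ records.

```python
from statistics import mean


def fastq_read_lengths(lines):
    """Collect read lengths from an iterable of FASTQ lines.

    Assumes strict four-line records (header, sequence, '+', qualities),
    so the sequence is always the second line of each record.
    """
    return [len(line.strip()) for i, line in enumerate(lines) if i % 4 == 1]


def n50(lengths):
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not lengths:
        raise ValueError("no reads supplied")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def read_length_summary(lengths):
    """Summarise a run with a few commonly reported read-length metrics."""
    return {
        "reads": len(lengths),
        "bases": sum(lengths),
        "mean_length": mean(lengths),
        "n50": n50(lengths),
    }
```

In practice the lines would come from an open FASTQ file handle, for example `read_length_summary(fastq_read_lengths(open(path)))`, where `path` is whatever file your run produced.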
Stephen Rudd joined the Product management team last year having previously been the Strategic Account Manager at Oxford Nanopore Technologies for customers in Germany and Austria. Stephen is a classical geneticist and has a background in genome bioinformatics. He has been project manager for a taxonomically diverse range of genome studies utilising most DNA sequencing and genotyping technologies. He is looking forward to brainstorming potential solutions to challenging problems and to learning more about different research horizons.
Looking for a needle in a haystack: nanopore sequencing of a new Simian immunodeficiency virus (SIV) from a chacma baboon (Papio ursinus)
HIV-1 and HIV-2 together with SIVs comprise the primate lentivirus family. SIVs infect many Old World African primates, both in the wild and in domesticated animals. Some SIV lineages co-evolve with their hosts, but cross-species transmission between different simian species and transmission from simians to humans can occur. Based on short sequences, SIV infection in a yellow baboon and a chacma baboon was reported in the 1990s. We isolated SIV from the chacma baboon by co-culturing baboon PBMCs with human CD4+ cell lines and detecting virus by reverse transcriptase activity. High molecular weight (HMW) cellular DNA with integrated provirus was extracted in 1989 using the standard phenol-chloroform method and stored at 4°C for 20 years. Next generation sequencing was attempted, but we could not assemble a complete SIV genome. After storage for another 10 years, the archived DNA was quantified using Qubit and its purity evaluated using the NanoDrop. The 1D ligation kit was used for library preparation to attempt nanopore sequencing on the GridION X5. Basecalled FASTQ sequences were imported into Geneious Prime 2019, and BBDuk was used for quality trimming and filtering. SIV proviral sequences were assembled to the short baboon reference sequence using the Map to Reference option of the Geneious 6.0.3 Read Mapper. After using BLAST on the consensus baboon sequence, we repeated the assembly process with the most closely related SIVagm isolate. The new baboon SIV had less than 80% nucleotide sequence identity to known SIV isolates. Preliminary phylogenetic analyses indicated the baboon virus was related to the vervet subgroup of SIVagm. We are designing primers to amplify the baboon provirus to compare nanopore sequences with Sanger sequencing.
Targeted nanopore sequencing with Cas9 for studies of methylation, structural variants and mutations
There is a need for clinical tools that can be used to rapidly assess genomic variants and epigenetic changes at medically relevant genes. We have been using the CRISPR/Cas9 system for target-enrichment nanopore sequencing. We show the ability of this method to generate greater than 200X average coverage at 10 genomic loci (mean size 18 kb) with a single MinION flow cell. We demonstrate that this high-coverage data enables us to (1) profile DNA methylation patterns at cancer driver genes, (2) detect structural variations at known hot spots, and (3) survey for the presence of single nucleotide mutations. We demonstrate applications of this technique by examining the well-characterized GM12878 cell line as well as three breast cell lines (MCF-10A, MCF-7, MDA-MB-231) with varying tumorigenic potential as a model for cancer.
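To put the coverage figure in perspective, the short Python sketch below works through the arithmetic relating average coverage to on-target bases. The function names are hypothetical and the calculation is a simplification: it treats coverage as total aligned bases divided by total target size and ignores uneven coverage across loci.

```python
def on_target_bases(coverage, n_loci, mean_locus_bp):
    """Aligned bases needed to reach a given average coverage
    across n_loci targets of mean size mean_locus_bp."""
    return coverage * n_loci * mean_locus_bp


def mean_coverage(aligned_bases, n_loci, mean_locus_bp):
    """Average coverage obtained from a given number of on-target bases."""
    return aligned_bases / (n_loci * mean_locus_bp)


# Figures quoted in the abstract: 200X at 10 loci with a mean size of 18 kb.
required = on_target_bases(200, 10, 18_000)
print(f"{required:,} on-target bases")  # 36,000,000 on-target bases
```

Under these assumptions, 200X across 10 loci of 18 kb corresponds to about 36 Mb of on-target sequence.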
Timothy Gilpatrick is a current MD/PhD student at Johns Hopkins University in Baltimore, USA. He is doing his PhD work in the lab of Winston Timp, where his studies have centred on the use of nanopore sequencing to study cancer epigenetics and structural variation. He received his BSc in Biochemistry from the University of Delaware, working in a protein structure lab to characterize the role of lipoprotein-associated enzymes in atherosclerosis. Prior to starting his graduate studies, he worked as a research fellow at the National Institutes of Health, examining how microRNAs regulate histone modifications in embryonic stem cells.
Mobile antibiotic resistome in wastewater treatment plants revealed by nanopore metagenomic sequencing
In this study, we combined Oxford Nanopore and Illumina metagenomic sequencing to comprehensively uncover the resistome context of influent, activated sludge and effluent of three wastewater treatment plants (WWTPs) and simultaneously track the hosts of antibiotic resistance genes (ARGs). The results showed that most of the ARGs detected in all compartments of the WWTPs were carried by plasmids. Transposons and integrons also showed higher prevalence on plasmids than on the ARG-carrying chromosome. Notably, integrative and conjugative elements (ICEs) carrying five types of ARGs were detected, and they may play an important role in facilitating the transfer of ARGs, particularly for tetracycline and macrolide-lincosamide-streptogramin (MLS). A broad spectrum of ARGs carried by plasmids (29 subtypes) and ICEs (4 subtypes) was persistent across the WWTPs. Host tracking showed a variety of antibiotic-resistant bacteria in the effluent, suggesting a high potential for their dissemination into receiving environments. Importantly, phenotype-genotype analysis confirmed the significant role of conjugative plasmids in facilitating the survival and persistence of multidrug-resistant bacteria in the WWTPs. Finally, the consistency in the quantitative results for major ARG types revealed by the Oxford Nanopore and Illumina sequencing platforms demonstrated the feasibility of nanopore sequencing for resistome quantification. Overall, these findings substantially expand our current knowledge of the resistome in WWTPs, and help establish a baseline analysis framework to study ARGs in the environment.
Dr Tong Zhang is a Professor in the Environmental Biotechnology Laboratory in the Department of Civil Engineering, and an Honorary Professor in the School of Public Health, at The University of Hong Kong. He received his Bachelor’s and Master’s degrees in Environmental Science and Engineering from Nanjing University, and his PhD from The University of Hong Kong. His research interests include environmental bioinformatics, omics technologies, anaerobic digestion and bioenergy from wastes/wastewater, biological wastewater treatment (N removal and P recovery), biodegradation of emerging pollutants (antibiotics, PPCP and EDCs) and antibiotics and antibiotic resistance genes. He has published over 200 peer-reviewed papers, and has more than 14,000 citations and an H index of 68 on Google Scholar. He is an associate editor of Microbiome and Applied Microbiology and Biotechnology and served as an advisor for Beijing Genomics Institute on Environmental Microbiology and Biotechnology from 2011 to 2014. He was Yi Xing Chair Professor of Nanjing University from 2013 to 2016, and is currently a distinguished visiting professor of Southern University of Science and Technology in China. He received the First-Class Award in Natural Science of China Ministry of Education in 2015, the Second-Class Award State Natural Science Award of China State Council in 2016, and the Outstanding Research Student Supervisor Award of HKU in 2017. He is listed as one of the Highly Cited Researchers by Clarivate in 2018.
Complete genomes of eighteen different Actinobacillus pleuropneumoniae reference serotypes assembled with Oxford Nanopore reads
Actinobacillus pleuropneumoniae is a Gram-negative Pasteurellaceae causing pig pleuropneumonia. Eighteen serovars with distinctive lipopolysaccharide and capsular compositions have been described so far, yet complete genomes are only available for the reference strains of serotypes 1, 4, and 5b. We aimed to complete this picture by sequencing the remaining reference strains with the MinION sequencer and Illumina HiSeq platform. Genome assemblies were performed following two different strategies, i.e., Oxford Nanopore-only de novo assemblies polished with Illumina reads or hybrid assemblies combining Oxford Nanopore and Illumina reads. Both methods proved successful in obtaining accurate circular genomes with comparable quality to PacBio-generated assemblies, as shown by comparison with the PacBio-sequenced genomes of serotype 1 and 4 reference strains from the NCBI database. Further comparative genome analysis allowed us to identify common and distinctive traits of the different serovars and infer their phylogenetic relatedness. These genomic data provide useful information for diagnostics, epidemiological studies and vaccine development.
Using direct RNA sequencing to detect RNA structures in transcriptomes
Studying how an RNA folds is crucial to understanding RNA function inside cells. RNA secondary structure mapping using second-generation short-read sequencing such as Illumina provides large scale structure information but lacks the connectivity between structures along a transcript. Here, we utilize direct RNA sequencing to detect structure modifications. Using machine learning models, we demonstrate that we can accurately detect RNA secondary structures and their dynamics on known RNAs. Structure probing of the human embryonic stem cell transcriptome captures structural features seen in other high throughput structure datasets and allows us to detect structural information in individual isoforms of a gene. We show that structural differences between isoforms are associated with differential translation efficiencies, further expanding our understanding of structure and function inside cells.
Yue Wan received her BSc in Cell Biology and Biochemistry from the University of California, San Diego and her PhD in Cancer Biology from Stanford University under the mentorship of Howard Y. Chang. During her PhD, she developed the first high-throughput method for probing RNA structures genome-wide. Yue is a recipient of the NSS-PhD scholarship from the Agency for Science, Technology and Research (A*STAR) in Singapore and is currently a Principal Investigator in the Genome Institute of Singapore. She is a Society in Science – Branco Weiss Fellow, EMBO Young Investigator and CIFAR-Azrieli Global Scholar, as well as a recipient of the Young Scientist Award and L’Oreal-UNESCO for Women in Science, Singapore National Fellowship. She is interested in studying functional RNA structures and understanding their roles in regulating cellular biology.
Nanopore sequencing of the CYP2D6 pharmacogene
The accurate genotyping of CYP2D6 is hindered by the very polymorphic nature of the gene, high homology with its pseudogene CYP2D7, and the occurrence of structural variations. Using the GridION nanopore sequencer, we sequenced 32 samples covering various haplotypes of CYP2D6, including four samples with gene duplication, over two sequencing runs. The haplotypes of 26 samples could be matched accurately to known alleles or subvariants, while the remaining 6 samples had either novel variants or variant patterns not matched to the current PharmVar CYP2D6 haplotype database. Small insertions/deletions associated with several key haplotypes were detected accurately, and five novel variants not yet catalogued in PharmVar were reported. Allele duplication could be determined by analyzing the allelic balance between the sample haplotypes. Nanopore sequencing of CYP2D6 offers a high throughput method for genotyping, accurate haplotyping, and detection of new variants and duplicated alleles.
Yusmiati is currently a PhD candidate at the Gene Structure and Function Lab in the Department of Pathology and Biomedical Science at the University of Otago, Christchurch. Her research focus includes the application of nanopore sequencing and other sequencing methods in pharmacogenetics and adverse drug reactions. She holds a master’s degree in biomedical science from the University of Hasanuddin in Indonesia and has previously worked in a molecular diagnostic lab in Jakarta, Indonesia.
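The idea of inferring a duplication from allelic balance can be sketched in a few lines of Python. This is an illustrative assumption about how such a check might work, not the method used in the study: the function names and the `max_copies` parameter are hypothetical, and the sketch simply picks the copy-number ratio whose expected read fraction is closest to the observed one.

```python
from itertools import product
from math import gcd


def allelic_balance(reads_a, reads_b):
    """Fraction of haplotype-assigned reads that support haplotype A."""
    total = reads_a + reads_b
    if total == 0:
        raise ValueError("no reads assigned to either haplotype")
    return reads_a / total


def closest_copy_ratio(reads_a, reads_b, max_copies=2):
    """Return the (copies_a, copies_b) ratio that best explains the balance.

    Only reduced ratios are considered, because read balance alone cannot
    tell 1:1 apart from 2:2. max_copies is a hypothetical per-haplotype cap.
    """
    observed = allelic_balance(reads_a, reads_b)
    candidates = [
        (a, b)
        for a, b in product(range(1, max_copies + 1), repeat=2)
        if gcd(a, b) == 1
    ]
    # Expected read fraction for haplotype A is copies_a / (copies_a + copies_b).
    return min(candidates, key=lambda c: abs(c[0] / (c[0] + c[1]) - observed))
```

For example, a sample with roughly equal read support for both haplotypes maps to a 1:1 ratio, while one with about two thirds of reads on haplotype A maps to 2:1, which would be consistent with a duplication of that allele.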
Identification of a new class of local copy number aberrations in lung cancer genomes using PromethION
Using PromethION, we generated whole-genome long-read sequencing data for five lung cancer cell lines at depths of 30X in LC2/ad cells and 10X in the other cells. We identified cancerous mutations, including point mutations, large deletions and chromosomal rearrangements, that were previously reported using short-read sequencers. In addition, we unexpectedly detected complex middle-size structural variations, which we named Cancerous Local Copy-number Lesions (CLCLs). These CLCLs consisted of complex combinations of local inversions, duplications and micro deletions, which should cause functional alterations of the encoded proteins. The CLCLs occurred even in cancer-related genes, such as STK11, NF1 and PTEN, where short-read sequencers could not have confidently identified their aberrant structures. We further conducted similar analyses using ten clinical lung adenocarcinoma specimens. We successfully demonstrated that CLCLs are not specific to cultured cells but also occur in vivo.