Two leading sequencing techniques are no longer at odds, thanks to an international effort led by scientists at University of California San Diego. In a study published July 27, 2023 in Nature Biotechnology, the researchers debuted a new reference database called Greengenes2, which makes it possible to compare and combine microbiome data derived from either 16S ribosomal RNA gene amplicon (16S) or shotgun metagenomics sequencing techniques.
“This is a significant moment in microbiome research, as we have effectively rescued over a decade’s worth of 16S data that might otherwise have become obsolete in the modern world of shotgun sequencing,” said senior author Rob Knight, PhD, professor in the departments of Pediatrics at UC San Diego School of Medicine and Bioengineering and Computer Science at UC San Diego Jacobs School of Engineering. “Standardizing results across these two methods will significantly improve our chances of discovering microbiome biomarkers for health and disease.”
Microbiome studies rely on scientists’ ability to identify which microorganisms are present in a sample. To do this, they sequence the genetic material in the sample and compare it to reference databases that list which sequences belong to which organisms. 16S and shotgun sequencing are the two techniques most commonly used in microbiome research, but they often yield different results.
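The lookup step described above can be illustrated with a minimal Python sketch. It assumes a toy reference in which each sequence maps to one organism label and reads are matched by exact string lookup; the sequences, the labels and the `profile_sample` helper are all hypothetical. Real pipelines rely on sequence alignment or phylogenetic placement rather than exact matching.

```python
from collections import Counter

# Hypothetical toy reference: sequence -> organism label (not real 16S data).
REFERENCE = {
    "ACGTACGT": "Organism A",
    "TTGACCGA": "Organism B",
}


def profile_sample(reads, reference):
    """Count reads per organism; reads absent from the reference are 'unassigned'."""
    counts = Counter()
    for read in reads:
        counts[reference.get(read, "unassigned")] += 1
    return dict(counts)


# Made-up reads standing in for the sequences recovered from one sample.
sample_reads = ["ACGTACGT", "TTGACCGA", "ACGTACGT", "GGGGCCCC"]
print(profile_sample(sample_reads, REFERENCE))
```

Even in this toy form, the resulting profile depends entirely on what the reference contains, which is why two methods scored against different references can disagree.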
“Many researchers assumed that data from 16S and shotgun sequencing were just too different to ever be integrated,” said first author of the study Daniel McDonald, PhD, scientific director of The Microsetta Initiative at UC San Diego School of Medicine. “Here we show that is not the case, and provide a reference database that researchers can now use to do exactly that.”
The original Greengenes database had been widely used in the microbiome field for well over a decade. It was the reference database used by notable projects including the National Institutes of Health Human Microbiome Project, the American Gut Project, the Earth Microbiome Project and many others.
However, one of its main limitations was that it relied on the sequence of a single gene, 16S, to identify the organisms in a sample. This well-studied gene has long been used as a taxonomic marker, with each organism having its own 16S “barcode.” This method can describe the contents of a microbiome sample with genus-level resolution, but it cannot always identify specific species or strains of microbes, which is important for clinical work.
Modern microbiome studies have since transitioned to using shotgun sequencing, which looks at DNA from across the organisms’ genomes, rather than focusing on just one gene. This powerful approach gives researchers more species-level specificity and also provides insight into the microbes’ function.
Scientists often attributed the discrepancies between the two techniques to differences in the way the samples are prepared in the lab. However, the new study demonstrates that incompatibilities between the two techniques arise from differences in computation, and that a better reference database allows the same conclusions to be drawn from both methods. This addresses an important issue in the reproducibility of microbiome research and allows the re-use of data from thousands and thousands of samples in older studies.
To resolve these incompatibilities, the researchers first expanded the Web of Life whole genome database. They then used several new computational tools developed with co-author Siavash Mirarab, PhD, associate professor at UC San Diego Jacobs School of Engineering, to integrate existing high-quality full-length 16S sequences into the whole-genome phylogeny. With another machine learning tool developed by Mirarab’s group, they placed 16S fragments from over 300,000 microbiome samples. The result was an expansive reference database that both 16S and shotgun sequencing data can be mapped onto.
To confirm whether Greengenes2 would help standardize findings from either sequencing technique, the researchers acquired both 16S and shotgun sequencing data from the same human microbiome samples and analyzed each against the backdrop of the Greengenes2 phylogeny. The results from both techniques showed highly correlated diversity assessments, taxonomic profiles and effect sizes, something researchers had not seen before.
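To give a sense of what "highly correlated diversity assessments" can mean in practice, the following minimal Python sketch computes a diversity score per sample from each method's taxon counts and then correlates the two sets of scores. It is an illustration under stated assumptions, not the study's analysis: Shannon diversity and Pearson correlation are swapped in as simple stand-ins, and all counts are made up.

```python
import math


def shannon(counts):
    """Shannon diversity (natural log) of one sample's taxon counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up taxon counts for the same three samples, profiled by each method.
profiles_16s = [[50, 30, 20], [90, 5, 5], [34, 33, 33]]
profiles_shotgun = [[48, 32, 20], [88, 7, 5], [35, 35, 30]]

diversity_16s = [shannon(p) for p in profiles_16s]
diversity_shotgun = [shannon(p) for p in profiles_shotgun]

# A value close to 1 means the two methods rank the samples' diversity alike.
r = pearson(diversity_16s, diversity_shotgun)
print(f"Pearson r between methods: {r:.3f}")
```

The comparison is only meaningful when both profiles are expressed against the same set of taxa, which is the role a shared reference plays.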
“Through Greengenes2, an enormous repository of 16S data can now be brought back into the fold and even combined with modern shotgun data in new meta-analyses,” said McDonald. “This is a major step forward in improving the reproducibility of microbiome studies and strengthening physicians’ ability to draw clinical conclusions from microbiome data.”
Co-authors include: Yueyu Jiang, Metin Balaban, Kalen Cantrell, Antonio Gonzalez, Giorgia Nicolaou, Se Jin Song and Andrew Bartko, all at UC San Diego, as well as Qiyun Zhu at Arizona State University, James T. Morton at the National Institutes of Health, Donovan H. Parks and Philip Hugenholtz at The University of Queensland, Søren Karst at Columbia University, Mads Albertsen at Aalborg University, Todd DeSantis at Second Genome, Aki S. Havulinna, Pekka Jousilahti, Teemu Niiranen and Veikko Salomaa at the Finnish Institute for Health and Welfare, Susan Cheng at Brigham and Women’s Hospital and Cedars-Sinai Medical Center, Mike Inouye at University of Cambridge and Baker Heart and Diabetes Institute, Mohit Jain at Sapient Bioanalytics and Leo Lahti at University of Turku.