Using supercomputing for multi-omic analysis is becoming a powerful approach to better understand disease biology and to advance disease prediction, and a new paper harnesses it to train genetic prediction models.
The NIHR BioResource and our volunteers facilitated part of the data generation behind this exciting combination of multi-omics and machine learning, which is now widely available to the entire research community.
Multi-omics profiling studies enable a more comprehensive understanding of molecular changes contributing to normal development, cellular response, and disease. Genomics can be combined with data from other modalities such as transcriptomics, epigenetics, and proteomics, to measure gene expression, gene activation, and protein levels.
In a comprehensive and robust study published in Nature, "An atlas of genetic scores to predict multi-omic traits", scientists from the Cambridge Baker Systems Genomics Initiative at the University of Cambridge and the Baker Heart and Diabetes Institute led a large team of global collaborators to develop, validate and apply multi-omic genetic scores using machine learning for more than 17,000 molecular traits.
They also developed an online portal (OmicsPred.org) for these genetic scores to accelerate research in this fast-growing area.
The research drew on the INTERVAL study, a large cohort of 50,000 healthy UK blood donors with extensive multi-omic profiling. This enabled prediction of 13,668 RNA transcripts, 2,692 proteins and 867 metabolites.
Multi-omics (such as transcriptomics, proteomics, and metabolomics) can provide a comprehensive and powerful view of biological systems.
Increasing evidence shows that genetic prediction of complex molecular traits, including gene expression, protein and metabolite levels, can be an accurate, efficient and powerful tool in research and clinical settings to better understand cardiovascular diseases, diabetes, cancers and other diseases.
It can also help with the discovery of novel drug targets and biomarkers.
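For readers less familiar with genetic scores: a score of this kind is generally computed as a weighted sum of a person's allele counts across a set of genetic variants. The short Python sketch below illustrates only that general idea. The function name, variant IDs, weights and dosages are all invented for illustration and are not taken from the study or from OmicsPred.org.

```python
# Hypothetical illustration of a genetic score as a weighted sum.
# All names and numbers are invented; they are NOT from the study
# or from OmicsPred.org.

def genetic_score(weights, dosages):
    """Return the weighted sum of allele dosages for one individual.

    weights: dict mapping variant ID -> per-allele effect weight
    dosages: dict mapping variant ID -> effect-allele count (0 to 2)
    Variants that are missing from `dosages` are skipped.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Made-up weights for three variants affecting one molecular trait.
example_weights = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 1.0}

# Made-up genotype for one person: copies of the effect allele per variant.
example_dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = genetic_score(example_weights, example_dosages)
print(score)
```

In practice a score may combine many more variants than this, and the predicted value is usually interpreted relative to other people in the same cohort rather than as an absolute measurement.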
Computational biology expert Professor Michael Inouye, who led the study, says it demonstrates how this pioneering research is helping to overcome challenges around time, cost and underrepresented demographics.
Munz Chair of Cardiovascular Prediction and Prevention at the Baker Institute, Professor Inouye says the collection of multi-omics data is an extremely expensive and time-consuming process:
"Because of these barriers, large-scale population cohorts typically generate multi-omic data for only a subset of participants, which reduces the statistical power of subsequent analyses and creates inequities for studies that do not have ample resources or are from underrepresented ancestries and other demographics."
He says that in this study the relative predictive value and robustness of the genetic scores were assessed and validated in seven external studies comprising European, East Asian, South Asian and African American ancestries.
Professor Inouye says they also demonstrated the longitudinal stability and utility of these genetic scores and highlighted a series of biological insights regarding genetic mechanisms in metabolism and pathway associations with disease.
"We anticipate the OmicsPred resource will be widely and routinely utilised to investigate multi-omic traits and phenotype associations."
If you are a researcher interested in finding out more about using the NIHR BioResource to facilitate your translational research or early phase clinical trials, we'd love to hear from you. You can contact us via nbr@bioresource.nihr.ac.uk.
You can also keep up to date with the BioResource's latest news on Twitter and LinkedIn.
You can learn more about joining the NIHR BioResource as a research volunteer if you'd like to contribute to future studies like this one.