Rich Green
Thesis/Dissertation Title:
A novel translational bioinformatics pipeline to improve precision medicine research
Diverse mouse models can serve as precursors to precision medicine in clinical practice (Li and Auwerx 2020), but doing so requires the integration, analysis, and cross-species interpretation of multi-omics data sets. We present a multi-omics pipeline designed to identify biomarkers with translational applicability using the Collaborative Cross (CC) mouse model. The CC project is a mouse genetic reference panel (GRP) that seeks to determine the genetic markers driving outcomes. The CC was designed to introduce genetic diversity, similar to that of a human population, into mouse models.
Our approach comprises three overarching aims: (Aim 1) construct networks and linear models; (Aim 2) detect genetic drivers and candidate genes; and (Aim 3) verify clinical correlations and biomarker detection. We applied the pipeline to our driving biological project (DBP), which seeks to identify markers of neuroinvasion during West Nile virus (WNV) infection.
Aim 1 produced three novel immune networks (A-C) in the CC mouse model of WNV infection. Network A was enriched in pattern recognition, innate immunity, and cell differentiation. Network B contained interferon- and inflammation-related genes, and Network C was enriched for interferon signaling and neutrophil degranulation. Regression modeling and pathway analysis were also performed and identified unique immune regulators of disease outcomes across different CC strains. Using public data sets and an innovative approach, Integrated Transcriptomics Analysis (ITA), we correlated novel gene-to-gene connections.
In Aim 2, using the CC mouse model of WNV infection, genetic regions were correlated to the DBP through quantitative trait locus (QTL) analysis, a statistical approach that relates genotype data (genetic markers) to phenotype data (viral detection, IFITM1 expression). The purpose of QTL analysis is to determine whether variation in a complex trait has a genetic basis. QTL analysis identified three regions: 59-80 Mb on chromosome 4, 107-110.5 Mb on chromosome 12, and 57.1-94.5 Mb on the X chromosome. Using viral load as the phenotype identified the regions on chromosomes 4 and 12, and using IFITM1 expression as the phenotype identified the QTL on chromosome X. Pairing the transcriptional analysis from Aim 1 with the QTLs from Aim 2 identified Toll-like receptor 4 (TLR4) on chromosome 4, tryptophanyl-tRNA synthetase (WARS) on chromosome 12, and membrane palmitoylated protein 1 (MPP1) on chromosome X.
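The idea behind a QTL scan can be illustrated with a minimal sketch. It assumes a single-marker regression with a biallelic dosage coding, which is a deliberate simplification: CC strains carry eight founder haplotypes, and this is not the implementation used in the thesis. The function names, marker names, and toy values below are hypothetical and were chosen only for illustration.

```python
import math


def lod_score(genotypes, phenotypes):
    """LOD score for one marker from a simple regression of phenotype on genotype.

    genotypes:  allele dosage per animal (e.g. 0 or 1); an assumed simplification.
    phenotypes: one quantitative trait value per animal (e.g. viral load).
    """
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    # Residual sum of squares under the null model (phenotype mean only).
    rss_null = sum((p - mean_p) ** 2 for p in phenotypes)
    if sxx == 0 or rss_null == 0:
        return 0.0  # monomorphic marker or constant trait carries no signal
    # Residual sum of squares after fitting the marker; floored to avoid log(inf).
    rss_marker = max(rss_null - sxy ** 2 / sxx, 1e-12)
    return (n / 2) * math.log10(rss_null / rss_marker)


def genome_scan(markers, phenotypes):
    """Compute a LOD score for every marker in a {name: dosages} mapping."""
    return {name: lod_score(dosages, phenotypes) for name, dosages in markers.items()}


# Hypothetical toy data: six animals, three markers (not taken from the study).
viral_load = [1.0, 1.2, 0.9, 3.1, 2.8, 3.0]
markers = {
    "marker_a": [0, 0, 0, 1, 1, 1],  # tracks the trait closely
    "marker_b": [0, 1, 0, 1, 0, 1],  # weakly related to the trait
    "marker_c": [1, 1, 1, 1, 1, 1],  # monomorphic
}
lods = genome_scan(markers, viral_load)
peak = max(lods, key=lods.get)
print(peak, round(lods[peak], 2))
```

In practice, the LOD peak would be compared against a significance threshold (commonly obtained by permutation), and the interval around the peak would define a region such as those reported above.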
In Aim 3, to translate findings from the CC model of WNV infection into human correlates, the genetic regions from Aim 2 were converted to human genomic coordinates, and a phenome-wide association study (PheWAS) was performed using the Electronic Medical Records and Genomics (eMERGE) network (25k and 109k genotyped human participants). A PheWAS is a statistical test that takes genetic loci (or variants) and queries them across a curated data set of phenotypes defined by clinical codes. The result is a set of genetic regions enriched for clinical phenotypes.
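The logic of a PheWAS can be sketched as follows. This is a minimal illustration, assuming a single variant, a carrier versus non-carrier comparison, Fisher's exact test for each clinical code, and a Bonferroni correction. Production PheWAS analyses typically use regression models with covariates, so this sketch stands in for the idea rather than the method applied to eMERGE. All participant IDs, code names, and function names are hypothetical.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_observed * (1 + 1e-9))
    return min(1.0, total)


def phewas(carriers, codes_by_participant, alpha=0.05):
    """Test one variant against every clinical code, with Bonferroni correction."""
    all_codes = sorted(set().union(*codes_by_participant.values()))
    if not all_codes:
        return {}
    threshold = alpha / len(all_codes)
    results = {}
    for code in all_codes:
        a = b = c = d = 0
        for participant, codes in codes_by_participant.items():
            is_case = code in codes
            if participant in carriers:
                a, b = (a + 1, b) if is_case else (a, b + 1)
            else:
                c, d = (c + 1, d) if is_case else (c, d + 1)
        p_value = fisher_exact_two_sided(a, b, c, d)
        results[code] = {"p_value": p_value, "significant": p_value < threshold}
    return results


# Hypothetical toy cohort: 20 participants, the first 10 carry the variant.
codes_by_participant = {f"p{i}": set() for i in range(20)}
for i in list(range(9)) + [10]:
    codes_by_participant[f"p{i}"].add("code_X")  # concentrated in carriers
for i in (0, 1, 10, 11):
    codes_by_participant[f"p{i}"].add("code_Y")  # evenly split
carriers = {f"p{i}" for i in range(10)}

results = phewas(carriers, codes_by_participant)
for code, result in results.items():
    print(code, result)
```

The exact test is only practical for small tables. For cohorts the size of eMERGE, a regression-based or approximate test would be the more realistic choice.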
The PheWAS identified various clinical associations with the genetic regions identified in the CC mouse model and mapped to human genomic coordinates, including essential tremor, type 2 diabetes with neurological manifestations, chronic kidney disease, intestinal infection due to Clostridium difficile, end-stage renal failure, and other similar clinical phenotypes. Additional clinical associations were identified for the genes TLR4 and TRIM32, including codes in the circulatory system, dermatologic, endocrine, hematopoietic, infectious disease, and neoplasm categories.
To augment the PheWAS, bulk RNA-seq was also performed on four human brains (two WNV-infected, two mock-infected). Several target genes identified on chromosome 4 (Tnfsf8, PTBP3, Akna, and TLR4) were also significant in WNV-infected human brains. The WARS gene on chromosome 12 and MPP1 on chromosome X were also identified.
The transcriptional analysis also revealed which regions of the brain showed activation of the QTL-derived genes. On chromosome 4, TLR4 was significant in the basal ganglia, Akna was significant in the cortex, and PTBP3 was significant in the basal ganglia, cortex, and thalamus. On chromosome 12, the WARS gene was significant in the basal ganglia, cortex, and thalamus. On chromosome X, MPP1 and MCFS were statistically significant in the basal ganglia.
Our pipeline leveraged a diverse mouse model to identify genetic and transcriptional markers associated with disease phenotypes. Connecting the findings across our aims revealed distinct connections and biomarkers that can be used in precision medicine applications.