Graduated: January 1, 2013
A Graph-Theoretic Approach to Model Genomic Data and Identify Biological Modules Associated with Cancer Outcomes
Studies of the genetic basis of complex diseases face statistical and methodological challenges in discovering reliable, high-confidence genes that reveal the biological phenomena underlying disease etiology, or gene signatures that are prognostic of disease outcomes. This thesis examines the capacity of graph-theoretic methods to integrate and analyze genomic information, and thus to use prior knowledge to create a more discrete and functionally relevant feature space. To assess the statistical and computational value of graph-based algorithms in genomic studies of cancer onset and progression, I apply an instance of a random walk graph algorithm in a weighted interaction network. I merge high-throughput co-expression and curated interaction data to search for biological modules associated with key cancer processes, and I evaluate significant modules by their predictive value and functional relevance. This approach identifies interactions among genes involved in proliferation, apoptosis, angiogenesis, immune evasion, metastasis, and energy metabolism pathways, and these interactions generate hypotheses for further cancer biology studies. Results from this analysis show that graph-based approaches are a powerful tool for integrating and analyzing complex molecular relationships and for revealing coordinated activity of significant genomic features, where previous statistical and analytical methods that focus on individual effects are limited.
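The abstract does not say which random walk variant the thesis uses, so the following is only an illustrative sketch of one common choice, a random walk with restart on a weighted undirected network. The node names, edge weights, the `restart` parameter, and the function name are all hypothetical and are not taken from the thesis. The sketch assumes every node has at least one edge.

```python
def random_walk_with_restart(weights, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score nodes by proximity to the seed nodes (illustrative only).

    weights: dict mapping node -> {neighbor: positive edge weight}
    seeds:   set of nodes the walker restarts from
    """
    nodes = list(weights)
    # Row-normalize edge weights into transition probabilities.
    trans = {}
    for u in nodes:
        total = sum(weights[u].values())
        trans[u] = {v: w / total for v, w in weights[u].items()}
    # Restart distribution: uniform over the seed nodes.
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` jump back to a seed, else follow an edge.
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            for v, t in trans[u].items():
                nxt[v] += (1.0 - restart) * p[u] * t
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network; in practice a weight might combine a
# co-expression measure with a curated interaction (an assumption here).
toy_network = {
    "A": {"B": 0.9, "C": 0.1},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"A": 0.1, "B": 0.8, "D": 0.2},
    "D": {"C": 0.2},
}
scores = random_walk_with_restart(toy_network, seeds={"A"})
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 3))
```

Nodes that are strongly connected to the seed receive higher scores, so a set of high-scoring neighbors can be read as a candidate module around that seed. How the thesis actually defines and tests modules is not described in the abstract.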
Last Known Position:
Data Scientist Lead, TensorIoT Inc
Neil F. Abernethy (Chair), John H. Gennari, Ira J. Kalet, Ali Shojaie, Barbara E. Endicott (GSR)