
Terry Hsin-Yi Shen

Graduated: January 1, 2009

Thesis/Dissertation Title:

Determining the Feasibility and Value of Federated Data Integration Combining Logical and Probabilistic Inference for SNP Annotation

Most common and complex diseases are influenced at some level by variation in the genome. The future work of statistical geneticists, molecular biologists, and physician-scientists with interests in genetics or genomics must thus take genetics into consideration. Research done in public health genetics, specifically in the area of single nucleotide polymorphisms (SNPs), is the first step toward understanding human genetic variation. Functional uncertainty, volume of information, and cost-effectiveness make the prioritization of SNPs an important research question. The SNP Integration Tool (SNPit) is a data integration system that looks at all the possible predictors of functional SNPs and provides the user with integrated information and decision-making capability. Determining the feasibility and value of SNPit with rules and probabilistic inference thus poses challenges from both the biological and the biomedical informatics standpoints concerning how to represent, integrate, and conduct inference over disparate biological data sources.

The main objective of this dissertation is to determine the feasibility and value of creating a federated integration system with combinations of logical, probabilistic, and logical combined with probabilistic inference for functional SNP annotation. Through iterative design, four versions of the SNPit system were created, which consolidate information on a variety of functional annotation predictors and include combinations of logical and probabilistic inference. Furthermore, this dissertation evaluates the feasibility of federated data integration and assesses its accuracy for SNP annotation, characterizing the suitability of adding logical and probabilistic inference to federated data integration for both point and regional SNP annotation. This study also explores the feasibility of combining both logical and probabilistic inference for point and regional SNP annotation. This dissertation contributes to general knowledge in informatics as well as SNP annotation by describing the design, implementation, and evaluation of combinations of logical, probabilistic, and both logical and probabilistic inference applied to the domain of functional SNP annotation.

Last Known Position:

Faculty Research Associate, University of Maryland School of Nursing


Peter Tarczy-Hornoch (Chair), Melissa A. Austin, James F. Brinkley III, Chris Carlson, Kelly Fryer-Edwards (GSR)