Graduated: January 1, 2016
Information extraction from clinical and radiology notes for liver cancer staging
Medical practice involves an astonishing amount of variation across individual clinicians, departments, and institutions. Compounding this, with the exponential pace of new discoveries in pockets of the biomedical literature, medical professionals, often understaffed and overworked, have little time and few resources to analyze or incorporate the latest research into clinical practice. The accelerated adoption of electronic medical records (EMRs) brings great opportunities to mitigate these issues. In computable form, large volumes of medical information can now be stored and queried, so that optimization of treatments based on patient characteristics, institutional resources, and patient preferences can be data driven. Thus, instead of relying on the skill sets of patients' support networks and medical teams, patient outcomes can have at least some statistical guarantees.
In this dissertation, we focus specifically on the task of staging hepatocellular carcinoma (HCC), a liver cancer, using natural language processing (NLP) techniques. Staging, or categorizing cancer patients by extent of disease, is important for normalizing patient characteristics. Normalized stages can then be used to facilitate evidence-based research to optimize treatments and outcomes. NLP is necessary because, as with other clinical tasks, the majority of staging information is trapped in free-text clinical data.
This thesis proposes an approach to liver cancer stage phenotype classification using a mixture of rule-based and machine learning techniques for text extraction. Included in this approach is a careful, layered design for annotation and classification. Each constituent part of our system was characterized by detailed quantitative and qualitative analysis regarding several medical conditions.
Last Known Position:
Senior Applied Scientist at Microsoft Health; Postdoctoral Scholar, Stanford University
Meliha Yetisgen (Chair), Fei Xia, Lucy Vanderwende, Sharon Kwan, Gina-Anne Levow (GSR)