Graduated: June 10, 2017
Predicting Cancer Outcome with Multispectral Tumor Tissue Images
Clinicians use tumor tissue slides to assess a cancer patient's condition and to indicate prognosis. Several studies have suggested that the distribution of important immunological biomarkers on tumor tissue slides might help predict survival outcome. These studies rely on non-parametric Kaplan-Meier survival analysis with the log-rank test to extract statistical insights. That approach has several disadvantages, such as prediction ambiguity and the inability to directly model continuous variables.
In this study, we engineered 676 features encoding cellular distribution information from multispectral tumor tissue images of 118 patients with HPV-negative oral squamous cell cancer. We used statistical methods and predictive models to explore the predictive power of these features. The Kolmogorov-Smirnov test identified 18 features as potential survival predictors. Our best model, a random forest, achieved a prediction accuracy of 58.54% on an independent validation dataset. Although this result does not suggest strong predictive power for the selected features, evaluation on larger-scale training data is still needed to further tune the model parameters and generate more conclusive results.
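To illustrate the kind of feature screening described above, the following is a minimal sketch of a two-sample Kolmogorov-Smirnov filter in plain Python. It is not the code used in the study. The abstract does not say how patients were grouped, so the split into two outcome groups (labels 0 and 1), the 0.05 threshold, and the names `ks_statistic`, `ks_pvalue`, and `screen_features` are all assumptions made for illustration. The p-value uses the asymptotic Kolmogorov series, which is only approximate for small samples.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / n
        cdf_b = bisect_right(b_sorted, x) / m
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The survival function is effectively 1 in this range.
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


def screen_features(features, groups, alpha=0.05):
    """Return names of features whose distributions differ between groups 0 and 1.

    features: dict mapping a feature name to a list of per-patient values.
    groups:   list of 0/1 outcome labels, one per patient (hypothetical grouping).
    """
    selected = []
    for name, values in features.items():
        g0 = [v for v, g in zip(values, groups) if g == 0]
        g1 = [v for v, g in zip(values, groups) if g == 1]
        d = ks_statistic(g0, g1)
        if ks_pvalue(d, len(g0), len(g1)) < alpha:
            selected.append(name)
    return selected
```

With 676 candidate features, a filter of this kind would be run once per feature. Screening many features at a fixed threshold raises the chance of false positives, so a multiple-testing correction may be worth considering.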
Last Known Position:
Full Stack Data Engineer, Salesforce AI
Drs. Peter Myler (Chair), Ilya Shmulevich