Trevor Cohen, MBChB, MPhil, PhD

Professor, Biomedical Informatics and Medical Education


Mental health informatics; post-marketing drug surveillance; drug repurposing; analysis of health-related online social media; distributed representations; distributional semantics; clinical cognition.


Dr. Cohen trained and practiced as a physician in South Africa before obtaining his PhD in Medical Informatics at Columbia University in 2007. His doctoral work focused on an approach to enhancing clinical comprehension in the domain of psychiatry, leveraging distributed representations of psychiatric clinical text. Upon graduation, he joined the faculty at Arizona State University’s nascent Department of Biomedical Informatics, where he contributed to the development of curriculum for informatics students, as well as for medical students at the University of Arizona’s Phoenix campus. In 2009 he joined the faculty at the University of Texas School of Biomedical Informatics, where (amongst other things) he developed an NLM-funded research program concerned with leveraging knowledge extracted from the biomedical literature for information retrieval and pharmacovigilance. He also contributed to large-scale national projects such as the Office of the National Coordinator’s SHARP-C initiative, which supported a range of research projects aimed at improving the usability and comprehensibility of electronic health record interfaces.


Dr. Cohen’s research focuses on the development and application of methods of distributional semantics – methods that learn to represent the meaning of terms and concepts from the ways in which they are distributed in large volumes of electronic text. The resulting distributed representations (concept or word embeddings) can be applied to a broad range of biomedical problems, such as: (1) using literature-derived models to find plausible drug/side-effect relationships; (2) finding new therapeutic applications for known drugs (drug repurposing); (3) modeling the exchanges between users of health-related online social media platforms; and (4) identifying phrases within psychiatric narrative that are pertinent to particular diagnostic constructs (such as psychosis). An area of current interest involves applying literature-derived distributed representations in conjunction with observational data as a basis for machine learning. More broadly, he is interested in clinical cognition – the thought processes through which physicians interpret clinical findings – and ways to facilitate these processes using automated methods.

Representative publications:

Google Scholar