Multivariate Time Series Analysis of Clinical and Physiological Data
Although the sophistication and volume of collected clinical data are greater now than at any point in the history of medicine, the information overload that providers face may inhibit the diagnostic process. Providers are expected to examine this large volume of data and, drawing on their own clinical experience, identify correlations among the parameters in order to detect significant medical events. Existing visualizations that assist the provider in analyzing the information consist of a table or plot of values for a particular parameter over time. Automated techniques for discovering these correlations may not only assist the provider in making a diagnosis but also help identify hidden patterns in the data associated with specific medical conditions or events. Current data mining and machine learning techniques show promise for extracting this information.
Furthermore, traditional baselines and thresholds for these parameters, which indicate “normal” ranges of values in a healthy patient, are generally based on age, gender, and/or weight, and were established (often decades ago) by measuring these parameters in a large population of patients. While most people’s vital signs and lab reports fall within the thresholds, others do not because of differences in lifestyle, genetic makeup, and environment. Consequently, providers typically adjust their physiological monitors to a personalized baseline for the patient based on the clinician’s observation or assessment. This approach is, at best, laborious and can be problematic if the provider’s assessment of a baseline is incorrect. With the development of electronic medical records, it is now possible to use a patient’s medical history to automate the personalization of a patient’s baselines and thresholds. Thus, more accurate measures of a patient’s state can be developed.
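As a rough illustration of what automating a personalized baseline could look like, the Python sketch below derives a per-patient range from past measurements and falls back to the traditional population range when the history is too short. It is not the method used in this work: the function names, the band of k standard deviations around the patient's mean, and the minimum-sample cutoff are all assumptions made for the example.

```python
from statistics import mean, stdev


def personalized_range(history, population_range, k=2.0, min_samples=5):
    """Return (low, high) thresholds for one parameter of one patient.

    history          -- past measurements for this patient (hypothetical input)
    population_range -- traditional (low, high) "normal" range
    k                -- half-width of the personal band in standard
                        deviations (an assumed value, not from the source)
    min_samples      -- below this many measurements, use the traditional range
    """
    if len(history) < min_samples:
        # Too little history to personalize: keep the traditional range.
        return population_range
    center = mean(history)
    spread = stdev(history)
    return (center - k * spread, center + k * spread)


def is_abnormal(value, history, population_range):
    """Flag a new value that falls outside the patient's own range."""
    low, high = personalized_range(history, population_range)
    return value < low or value > high


# Hypothetical resting heart rates for a patient who runs below the
# traditional 60-100 beats-per-minute range.
resting_hr = [48, 50, 52, 49, 51, 50]
flag_75 = is_abnormal(75, resting_hr, (60, 100))
flag_50 = is_abnormal(50, resting_hr, (60, 100))
```

With this history, a reading of 75 is flagged even though it lies inside the traditional range, while a reading of 50 is not. A practical version would need at least a floor on the spread, since a perfectly constant history gives a band of zero width.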
Multivariate Time Series Amalgams (MTSAs) provide an integrated, multivariate approach to representing clinical and physiological data. The hybrid representation automates the personalization of baselines and thresholds based on a patient’s medical history while also incorporating traditional baselines and thresholds. The visualization of this representation captures the rate of change of provider-selected parameters and the relationships among them. It groups parameters according to their influence on four vital organs: the heart, lung, kidney, and liver. A novel similarity metric for this representation, inspired by Bag of Patterns (BOP), is the cornerstone of the development of a search engine for large medical databases.
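For readers unfamiliar with Bag of Patterns, the Python sketch below shows the general idea for a single univariate series: slide a window over the series, convert each window to a short symbolic (SAX) word, count the words, and compare two series by the distance between their word counts. This is only the univariate technique that inspired the metric, not the MTSA similarity metric itself. The window length, number of segments, four-letter alphabet, and all function names are assumptions chosen for the example.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist, mean, pstdev


def sax_word(window, segments, alphabet="abcd"):
    """Turn one window into a SAX word: z-normalize, average, discretize."""
    mu, sigma = mean(window), pstdev(window)
    if sigma == 0:
        normalized = [0.0] * len(window)  # flat window: no shape to encode
    else:
        normalized = [(x - mu) / sigma for x in window]
    # Piecewise aggregate approximation: mean of each equal-length segment.
    # Assumes len(window) is divisible by segments; leftover points are dropped.
    size = len(window) // segments
    paa = [mean(normalized[i * size:(i + 1) * size]) for i in range(segments)]
    # Breakpoints that split a standard normal into equiprobable regions.
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / a) for j in range(1, a)]
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


def bag_of_patterns(series, window=8, segments=4):
    """Count the SAX words produced by a sliding window over the series."""
    words = []
    for start in range(len(series) - window + 1):
        word = sax_word(series[start:start + window], segments)
        # Numerosity reduction: skip immediate repeats of the same word.
        if not words or word != words[-1]:
            words.append(word)
    return Counter(words)


def bop_distance(bag_a, bag_b):
    """Euclidean distance between two word-count histograms."""
    keys = set(bag_a) | set(bag_b)
    return sqrt(sum((bag_a[k] - bag_b[k]) ** 2 for k in keys))


# Hypothetical series: one steadily rising, one steadily falling.
rising = bag_of_patterns(list(range(20)))
falling = bag_of_patterns(list(range(20, 0, -1)))
difference = bop_distance(rising, falling)
```

Because each bag is a fixed summary of a series, bags can be computed once and compared quickly, which is what makes this style of representation attractive for searching a large database.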
Patricia Ordóñez is a doctoral candidate in the Computer Science and Electrical Engineering department at UMBC. Her anticipated graduation date is August 2011. She received her M.S. in Computer Science from UMBC in August 2010 and her B.A. in Hispanic and Italian Studies from Johns Hopkins University in 1989.
Her current research centers on her dissertation, entitled “Multivariate Time Series Analysis of Physiological and Clinical Data.” She is interested in creating clinical decision support systems that help medical providers diagnose and treat patients efficiently, and that personalize medicine by applying data mining, machine learning, and visualization techniques to data warehouses of electronic medical data.
Patti has served as a lecturer for several undergraduate courses at UMBC during graduate school and as a technical trainer in industry for over 10 years. Prior to delving into technology, she taught high school math and Spanish and coached field hockey. In 2007, she received a National Science Foundation Graduate Research Fellowship to complete her doctorate, which permitted her to pursue her interests in biomedical informatics in collaboration with medical professors at Johns Hopkins School of Medicine. In 2008, her paper, “Visualizing Multivariate Time Series Data to Detect Specific Medical Conditions,” was nominated for the Best Student Paper Award at AMIA 2008.