Tempo: From Data Streams to Graph Streams, from Searching to Mining
Abstract
Time is an important dimension. Anything that has to do with us almost
always has to do with time. This includes data. In the big data era,
real-time data streams, or more generally sequence data over time such
as system logs, are increasingly common. In this talk, I will describe
a number of intriguing problems that we have encountered and conquered
in the recent past, both in searching for (querying) a known pattern of
interest and in mining (discovering) useful information from such data.
For the searching part, I discuss a series of patterns, including
subsequences, regular expressions, and graph-timing structures, each
more general than the one before it. For the mining part,
I discuss a few key issues including event outliers, stochastic data
acquisition, and lasting densest subgraphs. After all, tempo is for
the flow of data as time goes by.
Bio: Tingjian Ge is an associate professor in Computer Science at the
University of Massachusetts, Lowell. He received a Ph.D. from Brown
University in 2009. Prior to that, he earned his Bachelor’s and Master’s
degrees in Computer Science from Tsinghua University and UC Davis,
respectively, and worked at Informix and IBM in California for six
years. From 2009 to 2011 he worked as an assistant professor at the
University of Kentucky. His research areas are in big data analytics,
with a focus on applying machine learning, AI, and algorithmic
techniques in data management and mining. He has a broad interest in
topics including noisy and uncertain data, data streams, graph data,
and data security and privacy. He received the NSF CAREER Award in
2012 and a Teaching Excellence Award at UMass Lowell in 2014. He
serves on the program committees of major database and data mining
conferences such as SIGMOD, VLDB, ICDE, EDBT, CIKM, and ICDM, and
served as the Program Chair of New England Database Day 2015.