Big Data Analytics
With the fast-growing volume of data in our world, big data will become a key driver of productivity growth. This project investigates state-of-the-art techniques for mining massive data from various sources. We focus on mining both structured data (time series and event logs) and unstructured data (plain text, application traces, and system log files). We are developing advanced analysis engines for time series mining, complex event processing, graph mining, parallel and distributed mining, and stream mining.
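The project's own stream-mining engines are not detailed here. As a minimal illustration of the kind of analysis involved, the sketch below uses a sliding-window z-score detector, a textbook stream-mining baseline rather than the project's actual algorithm; the window size and threshold are illustrative assumptions.

```python
from collections import deque
from math import sqrt

def stream_anomalies(stream, window=20, threshold=3.0):
    """Flag points lying more than `threshold` standard deviations
    from the mean of the previous `window` points."""
    buf = deque(maxlen=window)  # rolling history of recent values
    flagged = []
    for i, x in enumerate(stream):
        if len(buf) == window:
            mean = sum(buf) / window
            std = sqrt(sum((v - mean) ** 2 for v in buf) / window)
            if std > 0 and abs(x - mean) / std > threshold:
                flagged.append(i)
        buf.append(x)  # the point joins the history after being scored
    return flagged

# A slowly oscillating series with one injected spike at index 25.
series = [10 + 0.1 * ((i % 5) - 2) for i in range(30)]
series[25] = 100.0
print(stream_anomalies(series))  # → [25]
```

Because each point is scored against only a bounded window of history, the detector runs in constant memory per stream, which is the property that makes this style of analysis viable at big-data scale.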
Complex System Modeling and Optimization
With ubiquitous sensing and networking, traditional complex physical systems have been undergoing revolutionary changes in their ICT capabilities. They are now equipped with large numbers of sensors distributed across different parts of the system, which collect a tremendous amount of data about system operation. This project develops innovative analytic engines that model the big data from these systems and extract the right information to improve operation. For example, the discovered data models and patterns can drive actionable insights and timely operational decisions. As a result, our predictive analytic solutions enable customers to optimize their business operations to increase revenue or reduce operational costs. Our analytic solutions can also help transform the way we live and work. Smart cities, smart power grids, and intelligent homes are all examples of applications that harness the power of big data from complex systems.
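As a simple, hedged example of turning sensor data into a timely decision, the sketch below fits a least-squares trend to a drifting sensor reading and estimates how many steps remain before it crosses an alarm threshold. This is only an illustrative stand-in for the project's predictive analytics; the sensor, threshold, and linear-trend assumption are all hypothetical.

```python
def steps_to_threshold(readings, threshold):
    """Fit a least-squares line to equally spaced readings and estimate
    how many future steps until the trend crosses `threshold`.
    Returns None if the trend is flat or falling."""
    n = len(readings)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(readings) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, readings))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Solve threshold = slope * t + intercept, then offset by the last
    # observed step to get remaining steps.
    return max(0.0, (threshold - intercept) / slope - (n - 1))

# Hypothetical temperature sensor drifting upward 0.5 degrees per step.
temps = [60 + 0.5 * i for i in range(10)]
print(steps_to_threshold(temps, 80))  # → 31.0
```

An estimate like this is what lets an operator schedule maintenance before a limit is breached instead of reacting after a failure.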
NGLA: Next Generation Log Analytics
Computer systems generate huge volumes of heterogeneous logs. These logs provide rich contextual information describing system status and are critical sources for system monitoring and diagnosis. However, manually interpreting them is impractical due to their extremely large volume and the complicated domain-specific syntax and semantics involved. NGLA is a comprehensive and scalable framework for analyzing heterogeneous logs from any source without prior domain knowledge or pattern information. It provides a self-learning engine and a stream-processing platform for new applications, including system anomaly detection with deep log inspection and unstructured log management.
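NGLA's self-learning engine is not specified here, but the core idea behind learning log structure without domain knowledge can be sketched as follows: mask tokens that look variable (numbers, IDs, addresses) and group lines by the constant words that remain. The masking rules below are illustrative assumptions, not NGLA's actual patterns.

```python
import re
from collections import Counter

# Heuristic: any token containing a digit (counters, ports, IPs,
# device names) is treated as a variable field.
VAR = re.compile(r"^\S*\d\S*$")

def to_template(line):
    """Replace variable-looking tokens with <*>, keeping the constant
    words that identify the log template."""
    return " ".join("<*>" if VAR.match(t) else t for t in line.split())

def learn_templates(lines):
    """Group raw log lines by their masked template and count each."""
    return Counter(to_template(line) for line in lines)

logs = [
    "Connection from 10.0.0.1 port 5022 closed",
    "Connection from 10.0.0.7 port 5023 closed",
    "Disk /dev/sda1 usage at 91 percent",
]
for template, count in learn_templates(logs).items():
    print(count, template)
```

The first two lines collapse into one template, `Connection from <*> port <*> closed`, seen twice. Once raw lines map to templates, downstream tasks such as anomaly detection can reason about template frequencies and sequences instead of unstructured text.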