With the rise of social networks, online news media, and automated text generation systems, the consumption and production of unstructured and semi-structured textual data have significantly increased in recent years. Analyzing the evolution and interplay of topics and ideas can help social scientists and policy-makers to understand the emergence of social movements. Marketing experts can measure the impact of trending news or identify the diffusion of negative opinions. The discovery of anomalous content in microblogs or news wire data can enable public safety officials and journalists to highlight relevant, situation-related information during critical events. At the same time, however, we are facing unprecedented threats introduced by the fast and uncontrolled global spread of misinformation and rumors.
To understand the evolution of content patterns, detect anomalous information, and discover large-scale coordinated activities, we have to cope with the inherent challenges of real-time streaming text. Because text is unstructured, has a low signal-to-noise ratio, and is semantically complex, its analysis has always been one of the most challenging scientific topics. While most past research has focused on batch data processing, comparatively little attention has been given to the challenge of analyzing live-streaming textual data. The goal of this project is to develop Visual Analytics (VA) approaches for the analysis of streaming data that tightly integrate natural language processing (NLP), machine learning, and visual interfaces. In particular, the analytical pipeline should cope with the specific challenges of constantly updated data, such as visual scalability, interaction scalability, and changing baselines and distributions.
This research project is a collaborative effort between Zhejiang University and the University of Stuttgart.