Mengnan Du works at Texas A&M University.

Posts

DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router

Large Language Models (LLMs) excel at many reasoning tasks but struggle with knowledge-intensive queries because they cannot dynamically access up-to-date or domain-specific information. Retrieval-Augmented Generation (RAG) has emerged as a promising solution, enabling LLMs to ground their responses in external sources. However, existing RAG methods lack fine-grained control over both the query and source sides, resulting in noisy retrieval, shallow reasoning, and limited adaptability to heterogeneous knowledge sources. In this work, we introduce DeepSieve, a novel RAG method that incorporates information sieving via LLM-as-a-knowledge-router. DeepSieve breaks down complex queries into structured sub-queries and recursively routes each to the most appropriate knowledge source, filtering out irrelevant information through a multi-stage information sieving process. This modular and transparent approach ensures that DeepSieve remains adaptable across diverse information needs. Experiments on three multi-hop QA benchmarks involving heterogeneous sources show that DeepSieve achieves greater reasoning depth, retrieval precision, and interpretability compared to conventional RAG approaches. Our code is available at https://github.com/MinghoKwok/DeepSieve.
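The decompose, route, and sieve steps described in the abstract can be illustrated with a toy sketch. This is not the DeepSieve implementation: the names `Source`, `decompose`, `route`, `sieve`, and `answer` are hypothetical, the LLM calls are replaced by simple string splitting and word-overlap heuristics, and the recursive re-routing step is omitted.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical stopword list, used only by the toy word-overlap heuristic.
STOPWORDS = {"the", "is", "of", "was", "in", "what", "where", "a"}


@dataclass
class Source:
    """A named knowledge source holding plain-text documents (illustrative)."""
    name: str
    documents: list[str]


def tokens(text: str) -> set[str]:
    """Lowercase content words of a text, with punctuation and stopwords removed."""
    words = (w.strip("?.,").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def decompose(query: str) -> list[str]:
    """Stand-in for LLM query decomposition: split the query on ' and '."""
    return [part.strip() for part in query.split(" and ") if part.strip()]


def route(sub_query: str, sources: list[Source]) -> Source:
    """Stand-in for LLM routing: pick the source with the largest word overlap."""
    q = tokens(sub_query)
    return max(sources, key=lambda s: len(q & tokens(" ".join(s.documents))))


def sieve(sub_query: str, documents: list[str], min_overlap: int = 2) -> list[str]:
    """Keep only documents sharing at least min_overlap content words with the sub-query."""
    q = tokens(sub_query)
    return [d for d in documents if len(q & tokens(d)) >= min_overlap]


def answer(query: str, sources: list[Source]) -> dict[str, tuple[str, list[str]]]:
    """Map each sub-query to (chosen source name, documents that survive the sieve)."""
    results = {}
    for sub_query in decompose(query):
        source = route(sub_query, sources)
        results[sub_query] = (source.name, sieve(sub_query, source.documents))
    return results


# Made-up example data, for illustration only.
people = Source("people", ["Ada Lovelace was born in London.",
                           "Alan Turing was born in London."])
places = Source("places", ["London is the capital of England.",
                           "Paris is the capital of France."])

result = answer("Where was Ada Lovelace born and what is London the capital of",
                [people, places])
for sub_query, (source_name, kept) in result.items():
    print(sub_query, "->", source_name, kept)
```

In this sketch each sub-query is handled independently. A fuller version would presumably feed earlier answers into later sub-queries and retry with another source when the sieve leaves nothing behind.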

Towards Learning Disentangled Representations for Time Series

Promising progress has been made toward learning efficient time series representations in recent years, but the learned representations often lack interpretability and do not encode semantic meaning, owing to the complex interactions of many latent factors. Learning representations that disentangle these latent factors can yield semantically rich representations of time series and further enhance interpretability. However, directly adopting sequential models, such as the Long Short-Term Memory Variational AutoEncoder (LSTM-VAE), runs into the Kullback-Leibler (KL) vanishing problem: the LSTM decoder often generates sequential data without efficiently using the latent representations, and the latent space can sometimes even be independent of the observation space. Moreover, traditional disentanglement methods may intensify KL vanishing during the disentanglement process, because they tend to penalize the mutual information between the latent space and the observations. In this paper, we propose Disentangle Time-Series (DTS), a novel disentanglement enhancement framework for time series data. Our framework achieves multi-level disentanglement by covering both individual latent factors and group semantic segments. We propose augmenting the original VAE objective by decomposing the evidence lower bound and extracting evidence linking factorial representations to disentanglement. Additionally, we introduce a mutual information maximization term between the observation space and the latent space to alleviate the KL vanishing problem while preserving the disentanglement property. Experimental results on five real-world IoT datasets demonstrate that the representations learned by DTS achieve superior performance in various tasks with better interpretability.
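The claim that penalizing the KL term also penalizes mutual information can be made concrete with a standard decomposition from the disentanglement literature. It is shown here as an illustration and is not necessarily the exact decomposition used in the paper. The notation is an assumption: q(z|x) is the encoder, q(z) is the aggregate posterior, and the prior p(z) is taken to factorize over the latent dimensions z_j.

```latex
% Assumed notation: q(z) = E_{p(x)}[ q(z|x) ],  p(z) = \prod_j p(z_j)
\mathbb{E}_{p(x)}\!\left[\mathrm{KL}\big(q(z \mid x)\,\|\,p(z)\big)\right]
  = \underbrace{I_q(x; z)}_{\text{mutual information}}
  + \underbrace{\mathrm{KL}\Big(q(z)\,\Big\|\,\prod_j q(z_j)\Big)}_{\text{total correlation}}
  + \underbrace{\sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)}_{\text{dimension-wise KL}}
```

Under this reading, increasing the weight on the whole KL term shrinks the mutual information I_q(x; z) along with the total correlation, which is consistent with the abstract's motivation for adding a separate mutual information maximization term.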