Kai Xiong
- kxiong@ir.hit.edu.cn
- Waste-Wood
- Harbin
I'm a Ph.D. student at the Research Center for Social Computing and Information Retrieval (SCIR), Harbin Institute of Technology (HIT, China). I am co-advised by Prof. Ting Liu and Prof. Xiao Ding. My research interests lie in event reasoning, eventic graphs, and large language models.
Publications
Meaningful Learning: Enhancing Abstract Reasoning in Large Language Models via Generic Fact Guidance NeurIPS 2024
Kai Xiong, Xiao Ding, Ting Liu, Bing Qin, Dongliang Xu, Qing Yang, Hongtao Liu, Yixin Cao
Large language models (LLMs) have achieved impressive performance and strong explainability across various reasoning scenarios, marking a significant stride towards mimicking human-like intelligence. Despite this, when tasked with several simple questions supported by the same generic fact, LLMs often struggle to abstract the fact and apply it to provide consistent and precise answers, revealing a deficiency in abstract reasoning. This has sparked a vigorous debate about whether LLMs are genuinely reasoning or merely memorizing. In light of this, we design a preliminary study to quantify and probe the abstract reasoning abilities of existing LLMs. Our findings reveal a substantial discrepancy between their general reasoning and abstract reasoning performance. To mitigate this problem, we tailor an abstract reasoning dataset (AbsR) together with a meaningful learning paradigm that teaches LLMs how to leverage generic facts for reasoning. The results show that our approach not only boosts the general reasoning performance of LLMs but also considerably strengthens their abstract reasoning, moving beyond simple memorization or imitation to a more nuanced understanding and application of generic facts.
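A toy illustration of the setup the abstract describes: one generic fact should support consistent answers to several surface-different questions. The fact, questions, and field names below are invented for illustration and are not taken from AbsR.

```python
# Minimal sketch: pair one generic fact with several questions so a model
# learns to abstract and apply the fact rather than memorize each instance.
# All data here is invented for illustration.

generic_fact = "Metals conduct electricity."
questions = [
    ("Can a copper wire carry current?", "yes"),
    ("Can an iron rod carry current?", "yes"),
    ("Can a rubber band carry current?", "no"),
]

def build_training_example(fact: str, question: str, answer: str) -> dict:
    # One shared fact across examples; consistency over this group is the
    # abstract-reasoning signal the paradigm targets.
    return {"fact": fact, "question": question, "answer": answer}

dataset = [build_training_example(generic_fact, q, a) for q, a in questions]
for example in dataset:
    print(example)
```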
Examining Inter-Consistency of Large Language Models Collaboration: An In-depth Analysis via Debate Findings of EMNLP 2023
Kai Xiong, Xiao Ding, Yixin Cao, Ting Liu, Bing Qin
Large Language Models (LLMs) have demonstrated human-like intelligence and are widely used in various applications. However, LLMs still exhibit various kinds of inconsistency problems. Existing works mainly focus on inconsistency issues within a single LLM, while we investigate the inter-consistency among multiple LLMs, which is critical for collaborating to solve a complex task. To examine whether LLMs can collaborate to ultimately reach a consensus on a shared goal, and how easily they change their viewpoints, we introduce a Formal Debate framework (FORD). With FORD, we conduct a three-stage debate aligned with real-world scenarios: fair debate, mismatched debate, and roundtable debate. Through extensive experiments on commonsense reasoning tasks, we find that LLMs not only become more inter-consistent but also achieve higher performance. Moreover, we observe that stronger LLMs tend to dominate debates by adhering to their perspectives, while weaker ones are more likely to change viewpoints. Additionally, we highlight the importance of a competent judge, such as GPT-4, for drawing proper conclusions. Our work contributes to understanding inter-consistency among LLMs and lays the foundation for the development of future collaboration methods.
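A minimal sketch of what a debate-style inter-consistency check could look like: two debaters exchange answers for a fixed number of rounds, then a judge draws the conclusion. The `ask` helper, model names, and prompts are hypothetical stand-ins, not the actual FORD implementation.

```python
# Toy debate loop in the spirit of FORD (illustrative only).

def ask(model: str, prompt: str) -> str:
    # Placeholder for a real LLM API call; returns a canned answer here.
    return f"[{model}] answer to: {prompt[:40]}..."

def debate(question: str, debater_a: str, debater_b: str,
           judge: str, rounds: int = 3) -> str:
    history = []
    answer_a = ask(debater_a, question)
    answer_b = ask(debater_b, question)
    for _ in range(rounds):
        if answer_a == answer_b:  # consensus reached early
            break
        # Each debater sees the opponent's latest answer and may revise.
        answer_a = ask(debater_a, f"{question}\nOpponent says: {answer_b}\nDefend or revise.")
        answer_b = ask(debater_b, f"{question}\nOpponent says: {answer_a}\nDefend or revise.")
        history.append((answer_a, answer_b))
    # A (stronger) judge summarizes the debate into a final conclusion.
    transcript = "\n".join(f"A: {a}\nB: {b}" for a, b in history)
    return ask(judge, f"Question: {question}\nDebate:\n{transcript}\nFinal answer?")

print(debate("Can a penguin fly?", "model-a", "model-b", "judge-model"))
```

A mismatched debate would simply pair debaters of unequal strength; a roundtable debate would extend the loop to more than two participants.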
ReCo: Reliable Causal Chain Reasoning via Structural Causal Recurrent Neural Networks EMNLP 2022
Kai Xiong, Xiao Ding, Zhongyang Li, Li Du, Ting Liu, Bing Qin, Yi Zheng, Baoxing Huai
Causal chain reasoning (CCR) is an essential ability for many decision-making AI systems, requiring the model to build reliable causal chains by connecting causal pairs. However, CCR suffers from two main transitivity problems: the threshold effect and scene drift. In other words, the causal pairs to be spliced may have conflicting threshold boundaries or scenarios. To address these issues, we propose a novel Reliable Causal chain reasoning framework (ReCo), which introduces exogenous variables to represent the threshold and scene factors of each causal pair within the causal chain, and estimates the threshold and scene contradictions across exogenous variables via a structural causal recurrent neural network (SRNN). Experiments show that ReCo outperforms a series of strong baselines on both Chinese and English CCR datasets. Moreover, by injecting reliable causal chain knowledge distilled by ReCo, BERT achieves better performance on four downstream causal-related tasks than BERT models enhanced with other kinds of knowledge.
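A highly simplified sketch of scoring a causal chain with per-pair exogenous variables, in the spirit of the SRNN idea above. The dimensions, random parameters, and scoring function are all invented for illustration; this is not the ReCo architecture.

```python
# Toy recurrent pass along a causal chain, with one exogenous variable per
# causal pair standing in for its threshold/scene factors.

import numpy as np

rng = np.random.default_rng(0)
dim = 8
W_h = rng.normal(size=(dim, dim))  # recurrent transition (random stand-in)
W_u = rng.normal(size=(dim, dim))  # maps exogenous variables into the state

def score_chain(pair_embs: list) -> float:
    """pair_embs: one embedding (np.ndarray) per causal pair in the chain."""
    h = np.zeros(dim)
    score = 0.0
    for emb in pair_embs:
        u = np.tanh(emb)                # exogenous variable for this pair
        h = np.tanh(W_h @ h + W_u @ u)  # recurrent update along the chain
        # Compatibility between the running state and the new exogenous
        # factors; contradictions (conflicting thresholds/scenes) score low.
        score += float(h @ u)
    return score

chain = [rng.normal(size=dim) for _ in range(3)]  # a 3-pair causal chain
print("chain reliability score:", score_chain(chain))
```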
e-CARE: a New Dataset for Exploring Explainable Causal Reasoning ACL 2022
Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin
Understanding causality is of vital importance for various Natural Language Processing (NLP) applications. Beyond labeled instances, conceptual explanations of causality can provide a deep understanding of causal facts and facilitate the causal reasoning process. However, such explanation information remains absent from existing causal reasoning resources. In this paper, we fill this gap by presenting a human-annotated explainable CAusal REasoning dataset (e-CARE), which contains over 20K causal reasoning questions together with natural-language explanations of the causal questions. Experimental results show that generating valid explanations for causal facts remains especially challenging for state-of-the-art models, and that the explanation information can help promote the accuracy and stability of causal reasoning models.
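A sketch of what an instance in this style could look like: a causal question with candidate answers plus a free-text conceptual explanation. The field names and text below are illustrative assumptions, not copied from the released dataset.

```python
# Hypothetical e-CARE-style instance (all fields and text invented).

example = {
    "premise": "Tom put ice cubes into his drink.",
    "ask_for": "effect",
    "choices": ["The drink became colder.", "The drink became hotter."],
    "label": 0,
    "explanation": "Ice absorbs heat from its surroundings as it melts, "
                   "lowering the temperature of the liquid around it.",
}

def is_valid(instance: dict) -> bool:
    # Minimal sanity check: the labeled choice must exist, and a conceptual
    # explanation must accompany the causal fact.
    return (0 <= instance["label"] < len(instance["choices"])
            and bool(instance["explanation"]))

print(is_valid(example))
```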
Enhancing pretrained language models with structured commonsense knowledge for textual inference Knowledge-Based Systems
Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin
Transformer-based pretrained language models have shown promising performance on various textual inference tasks. However, additional relational knowledge between the semantic units within the input text may also play a critical role in the inference process. To equip pretrained language models with this relational knowledge, previous methods employ a retrieval-based strategy, which obtains relational features from prebuilt knowledge bases via a lookup operation. However, the inherent sparsity of some knowledge bases prevents the direct retrieval of relational features. To address this issue, we propose a MIX-strategy based Structural commonsense integration framework (Mix-Sci). In addition to the traditional retrieval strategy, which only suits knowledge bases with high coverage, Mix-Sci introduces a generative strategy to incorporate sparse knowledge bases into pretrained language models. Specifically, during training, Mix-Sci learns to generate the structural information of the knowledge base, including the embeddings of nodes and the connections between them, so that at test time this structural information can be generated to enhance inference. Experimental results on two textual inference tasks, machine reading comprehension and event prediction, show that Mix-Sci can effectively utilize both dense and sparse knowledge bases to consistently improve the performance of pretrained language models.
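A minimal sketch of the mixed strategy described above: retrieve relational features when the knowledge base covers the query, and fall back to generating them when it does not. Every component here (the tiny KB, the generator stub, the embedding size) is an illustrative stand-in, not the Mix-Sci implementation.

```python
# Toy retrieve-or-generate dispatch for relational features.

import numpy as np

rng = np.random.default_rng(0)
kb = {("rain", "wet_ground"): rng.normal(size=4)}  # tiny prebuilt KB

def generate_relation(head: str, tail: str) -> np.ndarray:
    # Stand-in for a learned generator that predicts node embeddings and
    # connections for pairs the sparse KB does not cover.
    return np.tanh(rng.normal(size=4))

def relational_features(head: str, tail: str) -> np.ndarray:
    if (head, tail) in kb:                    # dense KB: direct lookup
        return kb[(head, tail)]
    return generate_relation(head, tail)      # sparse KB: generate instead

print(relational_features("rain", "wet_ground"))   # retrieved
print(relational_features("rain", "traffic_jam"))  # generated
```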
ExCAR: Event graph knowledge enhanced explainable causal reasoning ACL 2021
Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin
Prior work infers the causation between events mainly based on knowledge induced from annotated causal event pairs. However, additional evidence information intermediate to the cause and effect remains unexploited. By incorporating such information, the logical law behind the causality can be unveiled, and the interpretability and stability of the causal reasoning system can be improved. To facilitate this, we present an Event graph knowledge enhanced explainable CAusal Reasoning framework (ExCAR). ExCAR first acquires additional evidence information from a large-scale causal event graph as logical rules for causal reasoning. To learn the conditional probabilities of logical rules, we propose the Conditional Markov Neural Logic Network (CMNLN), which combines the representation learning and structure learning of logical rules in an end-to-end differentiable manner. Experimental results demonstrate that ExCAR outperforms previous state-of-the-art methods. Adversarial evaluation shows the improved stability of ExCAR over baseline systems, and human evaluation shows that ExCAR achieves promising explainable performance.
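A toy rendering of the underlying idea: score a cause-effect pair by chaining rule strengths along intermediate evidence paths in an event graph, then aggregating over paths. The graph, the probabilities, and the noisy-or aggregation below are invented for illustration; CMNLN learns these quantities end to end.

```python
# Toy evidence-path scoring over a hand-written causal event graph.

# edges: (antecedent, consequent) -> conditional strength of the rule
edges = {
    ("heavy_rain", "flood"): 0.7,
    ("flood", "road_closed"): 0.8,
    ("heavy_rain", "traffic_slow"): 0.6,
    ("traffic_slow", "road_closed"): 0.4,
}

def path_score(path: list) -> float:
    # Multiply conditional strengths along one evidence path.
    score = 1.0
    for a, b in zip(path, path[1:]):
        score *= edges.get((a, b), 0.0)
    return score

paths = [
    ["heavy_rain", "flood", "road_closed"],
    ["heavy_rain", "traffic_slow", "road_closed"],
]

# Aggregate evidence paths with a noisy-or (an illustrative choice).
p_none = 1.0
for path in paths:
    p_none *= 1.0 - path_score(path)
print("P(road_closed | heavy_rain) ~", 1.0 - p_none)
```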
Knowledge-Enhanced Pretrained Language Models for Textual Inference Journal of Chinese Information Processing
Kai Xiong, Li Du, Xiao Ding, Ting Liu, Bing Qin, Bo Fu
This paper focuses on enhancing pretrained language models with rich knowledge for textual inference. Although pretrained language models achieve high performance on a wide range of natural language processing tasks and exhibit strong semantic understanding, the knowledge contained in most pretrained language models themselves is insufficient to support more effective textual inference. To this end, this paper proposes a framework for textual inference with knowledge-enhanced pretrained language models, which enables graphs and graph-structured knowledge to be fused more deeply with pretrained language models. On two subtasks of textual inference, the framework outperforms a series of baseline methods, and the experimental results and analysis verify the effectiveness of the model.
Heterogeneous graph knowledge enhanced stock market prediction AI Open
Kai Xiong, Xiao Ding, Li Du, Ting Liu, Bing Qin
We focus on stock market prediction based on financial text, which contains information that could influence the movement of the stock market. Previous works mainly utilize a single semantic unit of financial text, such as words, events, or sentences, to predict the tendency of the stock market. However, the interaction of different-grained information within financial text can supplement context knowledge and help select predictive information, thereby improving stock market prediction. To facilitate this, we propose constructing a heterogeneous graph with different-grained information nodes from financial text, and we present a novel heterogeneous neural network to aggregate the multi-grained information. Experimental results demonstrate that our proposed approach achieves higher performance than baselines.
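A small sketch of the graph-construction step the abstract describes: nodes of different granularities (words, an event, a sentence) extracted from a piece of financial text, linked by containment edges. The example text, node types, and edges are illustrative; the paper's model runs a learned heterogeneous neural network over such a graph.

```python
# Toy heterogeneous graph with multi-grained nodes from one sentence.

sentence = "Company A acquires Company B"
event = ("Company A", "acquires", "Company B")
words = sentence.split()

nodes = (
    [("sentence", sentence), ("event", " ".join(event))]
    + [("word", w) for w in words]
)

edges = [(("sentence", sentence), ("event", " ".join(event)))]
for node_type, text in nodes:
    if node_type == "word":
        # Containment edge: the sentence node links to each word node.
        edges.append((("sentence", sentence), ("word", text)))

print(f"{len(nodes)} nodes, {len(edges)} edges")
# A heterogeneous GNN would aggregate word -> event -> sentence messages
# over this graph to produce a market-movement prediction.
```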