Xuan (Lily) Yang

Hi, I am Xuan (Lily) Yang (杨萱). I'm a computer science PhD student at Duke University, supervised by Prof. Jian Pei. Before that, I received my bachelor's and master's degrees from Zhejiang University (ZJU). I also visited Stanford and NUS as a research intern.

My current research interests focus on data-centric AI, including training data valuation and selection, data synthesis, and LLM-based agent systems.

📌 Open to research internship opportunities in 2026.

Email  /  Google Scholar  /  LinkedIn

Selected Publications

Local Shapley: Efficient Data Valuation for Model Training   
Xuan Yang, Jian Pei
We propose Local Shapley, a principled yet efficient approach to fair data valuation that leverages the locality of machine learning models to reduce Shapley value computation from exponential to linear complexity. Building on this, our Local Shapley via Model Reuse (LSMR) algorithm efficiently reuses trained models to minimize training costs. We further extend LSMR to Graph Neural Networks, with experiments demonstrating its effectiveness and scalability across diverse datasets and models.

Batch-of-Thought: Cross-Instance Learning for Enhanced LLM Reasoning   
Xuan Yang, Furong Jia, Roy Xie, Xi Xiong, Jian Li, Monica Agrawal
We introduce Batch-of-Thought (BoT), a training-free approach that enables collective reasoning across multiple samples, and BoT-Reflection (BoT-R), a multi-agent framework where models collaboratively reflect and refine their reasoning, effectively leveraging mutual information beyond isolated inference.

Unfolding and Modeling the Recovery Process after COVID Lockdowns   
Xuan Yang, Yang Yang, Chenhao Tan, Yinghe Lin, Zhengzhe Fu, Fei Wu, Yueting Zhuang
Nature Scientific Reports, 2022
We present a graph-learning–based computational framework leveraging electricity consumption data to analyze post-lockdown recovery dynamics. Our approach quantifies sector-specific impacts, evaluates the effectiveness of government recovery policies, and models inter-sector dependencies to inform more effective strategies for holistic economic revitalization.

Who's Next: Rising Star Prediction via Diffusion of User Interest in Social Networks   
Xuan Yang, Yang Yang, Jintao Su, Yifei Sun, Shen Fan, Zhongyao Wang
IEEE Transactions on Knowledge and Data Engineering, 2022
We propose RiseNet, a novel recommendation framework designed to identify potential “Rising Star” items and mitigate unfairness in recommendation systems. RiseNet models the dynamic diffusion of user interests alongside temporal item features, using a coupled mechanism to capture their interactions and a multi-task GNN-based framework to quantify user interest. Experiments on real-world Taobao data demonstrate its effectiveness in predicting emerging popular items.

DropMessage: Unifying Random Dropping for Graph Neural Networks   
Taoran Fang, Zhiqing Xiao, Chunping Wang, Jiarong Xu, Xuan Yang, Yang Yang
AAAI, 2023 (Distinguished Paper Award)
We present a unified framework that generalizes existing random dropping techniques by applying dropping operations to the message matrix in Graph Neural Networks (GNNs). Building on this, we propose DropMessage, a versatile method applicable to any message-passing GNN. Theoretically, DropMessage improves training stability by reducing sample variance and enhances information diversity from an information-theoretic perspective.


Internship Experience

TikTok    Bellevue, WA
Research Intern, Risk Control team
May 2025 -- Nov 2025

Alibaba Group    Hangzhou, China
Research Intern, Data Assets and Algorithm team
Oct 2020 -- Dec 2021

Stanford University    Palo Alto, CA
Research Assistant, Center for Magnetic Nanotechnology
Jan 2019 -- Mar 2019

National University of Singapore    Singapore
Research Assistant, Big Brain, BIGHEART
Jun 2018 -- Aug 2018