Xuan Yang's Homepage

Xuan (Lily) Yang 杨萱

I am currently a computer science PhD student at Duke University, advised by Dr. Jian Pei.

I obtained my bachelor's degree from the CS Department at Zhejiang University. I have visited Stanford and NUS as a research intern.

My research focuses on Data-centric Machine Learning and Data Management.

Contact: xuan.yang {at} duke [dot] edu

Email / CV (2024/10/20) / LinkedIn

Research Projects

	Local Shapley: Exploring Locality in Shapley Value Computation. Xuan Yang, Jian Pei. For machine learning models, we assess the importance of data using Shapley value methods. To overcome the high computational cost of Shapley value calculations, we leverage the “locality” properties inherent in models such as K-Nearest Neighbors (KNN) and Graph Neural Networks (GNN). This inspired the development of the Local Shapley method, which integrates localized computations to enhance efficiency while maintaining the four key properties of Shapley values: fairness, efficiency, symmetry, and additivity. Additionally, we propose efficient sampling-based approximation algorithms to extend the method's applicability to deep learning models.
	Who's Next: Rising Star Prediction via Diffusion of User Interest in Social Networks. Xuan Yang, Yang Yang, Jintao Su, Yifei Sun, Shen Fan, Zhongyao Wang, Jun Zhan, and Jingmin Chen. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022. We aim to find potential items (rising stars) to alleviate unfairness in recommender systems. We design a novel model, RiseNet, that combines the dynamic diffusion of user interest with temporal item features to predict rising stars effectively. Specifically, we adopt a coupled mechanism to capture the dynamic interplay between items and user interest, and a multi-task GNN-based framework to quantify user interest. We apply our method to a real-world application on Taobao. [ paper / slides ]
	Unfolding and Modeling the Recovery Process after COVID Lockdowns. Xuan Yang, Yang Yang, Chenhao Tan, Yinghe Lin, Zhengzhe Fu, Fei Wu, and Yueting Zhuang. Accepted by Scientific Reports, 2022. Lockdown is a common policy used to deter the spread of COVID-19. However, it remains an open question how society comes back to life after a lockdown. Here, based on electricity data, we propose novel computational methods and a graph-learning-based model to answer the following questions: How does the impact vary by sector? Do government policies to promote recovery in different sectors really work? How do sectors influence each other's recovery, and how should governments design policies to support the overall economic recovery? [ paper / poster ]
	DropMessage: Unifying Random Dropping for Graph Neural Networks. Taoran Fang, Zhiqing Xiao, Chunping Wang, Jiarong Xu, Xuan Yang, and Yang Yang. Distinguished paper, AAAI 2023. We unify existing random dropping methods in a single framework by performing dropping on the message matrix, and we analyze their effects on GNN models. Furthermore, we propose a new random dropping method called DropMessage, which performs dropping operations directly on the message matrix and can be applied uniformly to any message-passing GNN. We theoretically demonstrate the superiority of DropMessage: it stabilizes the training process by reducing the sample variance, and it preserves information diversity from the perspective of information theory. [ paper ]
	DeepGraphlet: Estimating Local Graphlet Frequencies with Graph Neural Networks. Jintao Su, Yang Yang, Xuan Yang, Yuxiao Dong and ChilieTan. Under Review, 2022. We make the first attempt to compute local graphlet frequencies for billion-scale graphs by transforming the task into a machine learning problem. We design DeepGraphlet with k-tuple features and multi-task learning, and prove theoretically that it can exceed the expressive-power limitations of GNNs. Experiments on real graphs demonstrate that our method improves estimation accuracy by 60% or more. It achieves a 20x speedup on graphs at the scale of hundreds of millions, and handles graphs with billions of edges. [ paper / code ]
	Multimodal learning with graph alignment. Xuan Yang, Quanjin Tao, Xiao Feng, Xiang Ren, Shouling Ji, and Yang Yang. Preprint. In social media, a user's interaction graph, images, and text are all important for user modelling. To better fuse these three modalities, we propose an efficient multimodal pre-training framework, MMGA. A multi-step graph alignment method is introduced to allow the three modalities to supervise each other's learning and enhance mutual information. Furthermore, we create the first large-scale multimodal social media dataset with graph structure to facilitate future research. [ paper / code ]
	Detecting Telecommunication Frauds by Human-in-the-Loop Graph Neural Networks. Teng Ke, Yang Yang, Shiliang Pu, Xuan Yang, Quanjin Tao, Yifei Sun, Weihao Jiang, Hui Wang, and Yingye Yu. Under Review, 2022. We study the calling network and find that fraudulent users tend to disguise their characteristics and do not cluster with each other, which makes traditional GNNs ineffective. We design a local adaptive graph attention network to better identify fraudulent users. Also, to improve the interpretability of GNN models, we introduce a subgraph-level human-in-the-loop framework into the learning process, which uses human prior knowledge to guide the GNN's subgraph aggregation. [ paper ]
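
For readers unfamiliar with the Shapley value used in the Local Shapley project above, here is a minimal sketch of the textbook definition: each player's value is its marginal contribution averaged over all orderings. This is not the Local Shapley method itself. The function `exact_shapley`, the players 'a', 'b', 'c', and the utility `toy_utility` are hypothetical names chosen only for illustration. In data valuation the players would be training points and the utility would be something like model accuracy.

```python
from itertools import permutations
from math import factorial


def exact_shapley(players, utility):
    """Average each player's marginal contribution over all orderings.

    Enumerates all n! orderings, so it is only practical for tiny n.
    """
    values = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        prev = utility(coalition)
        for p in order:
            coalition = coalition | {p}
            cur = utility(coalition)
            values[p] += cur - prev  # marginal contribution of p
            prev = cur
    n_orders = factorial(len(players))
    return {p: v / n_orders for p, v in values.items()}


def toy_utility(coalition):
    # Hypothetical utility: 'a' and 'b' are interchangeable,
    # and 'c' never changes the outcome.
    return 1.0 if ("a" in coalition or "b" in coalition) else 0.0


players = ["a", "b", "c"]
phi = exact_shapley(players, toy_utility)
print(phi)
```

In this toy game the two interchangeable players receive equal values (symmetry), and the values sum to the utility of the full set minus that of the empty set (efficiency). The factorial cost of the enumeration is what motivates sampling-based or locality-based shortcuts.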

About Me

I have conducted some interesting projects:

In my free time I like to hike, write and take photos.

Some photos of my life beyond research: