Qiaosheng (Eric) Zhang 张乔生
About me
Grants and Awards
Excellent Young Scientists Fund (Overseas) by NSFC
Young Scholars Award, Information Theory Society of Chinese Institute of Electronics (2024)
Outstanding Teaching Assistant Award, Department of Information Engineering, CUHK (2019)
Ph.D./Intern Opportunities
I am seeking self-motivated Ph.D. students (in collaboration with SJTU or Fudan) and research interns to join our team at the Shanghai AI Lab and become part of our proud academic heritage (check details)!
Research
Research interests
Reinforcement Learning (RL): Online/Offline RL, RL from Human Feedback (RLHF), Multi-agent RL
Large Language Model (LLM): LLM reasoning, LLM Agent, LLM Safety, RLHF
Information Theory: Covert communication, Information-theoretic security, Identification, Mismatched decoding
Community Detection (a.k.a. clustering): Stochastic Block Model (SBM), Contextual SBM, Hypergraph SBM
Projects
Selected Publications
† denotes students/interns (currently or previously) mentored by me.
A Beginner-Friendly Tutorial on LLM-based Agents
Submitted to Proceedings of the IEEE (preprint: https://llm-agent-tutorial.github.io/website/).
Capacity Region for Covert Secret Key Generation over Multiple Access Channels
Y. Zhang, L. Zhou, Q. Zhang
IEEE Transactions on Information Theory (accepted in part in IEEE ITW 2025)
The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants
Y. Zhang, H. Li, C. Wang†, L. Chen, Q. Zhang, et al.
AAAI 2026 (Oral)
Robust RLHF for Human Preference with Instance-Dependent Flipping
Y. Xu†, X. Ye, Y. Chen, Q. Zhang
AAAI 2026
ICL-Router: In-Context Learned Model Representations for LLM Routing
C. Wang†, H. Li, Y. Zhang, L. Chen, J. Chen, P. Jian, Q. Zhang, S. Hu
AAAI 2026
Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute
J. Chen, Z. Xun, B. Zhou†, H. Qi†, H. Zhang†, Q. Zhang, et al.
AAAI 2026
Adaptive Theory of Mind for LLM-based Multi-Agent Coordination
C. Mu†, Y. Zeng, Q. Zhang, et al.
AAAI 2026
Unsupervised Skill Discovery through Skill Regions Differentiation
T. Xiao, J. Zheng, R. Yang, K. Xu, Q. Zhang, P. Liu, Z. Wang, C. Bai
IEEE Transactions on Neural Networks and Learning Systems, 2025
Graph Feedback Bandits on Similar Arms: With and Without Graph Structures
H. Qi†, F. Guo, L. Zhu, Q. Zhang
Submitted to IEEE Transactions on Information Theory
Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning
Q. Zhang, C. Bai, S. Hu, Z. Wang, X. Li
Artificial Intelligence (AIJ), 2025.
Sample-Efficient Reinforcement Learning from Human Feedback via Information-Directed Sampling
H. Qi†, H. Yang, Q. Zhang, Z. Yang
IEEE Transactions on Information Theory, 2025
Graph Attention is Not Always Beneficial: A Theoretical Analysis of Graph Attention Mechanisms via Contextual Stochastic Block Models
Z. Ma†, Q. Zhang, B. Zhou†, Y. Zhang†, S. Hu, Z. Wang
International Conference on Machine Learning (ICML), 2025.
ROME is Forged in Adversity: Robust Distilled Datasets via Information Bottleneck
Z. Zhou, W. Feng, Q. Zhang, S. Lyu, Q. Zhao, G. Cheng
International Conference on Machine Learning (ICML), 2025.
Online Preference Alignment for Language Models via Count-based Exploration
C. Bai, Y. Zhang, S. Qiu, Q. Zhang, K. Xu, X. Li
International Conference on Learning Representations (ICLR), 2025 (Spotlight, Top 5.1%).
Multi-LLM-Agents Debate - Performance, Efficiency, and Scaling Challenges
H. Zhang†, Z. Cui, Q. Zhang, S. Hu
International Conference on Learning Representations (ICLR), Blogpost Track, 2025.
Community Detection for Contextual-LSBM: Theoretical Limitations of Misclassification Rate and Efficient Algorithms
D. Jin†, Y. Zhang, Q. Zhang
IEEE International Symposium on Information Theory (ISIT), 2025
Optimal Information Security Against Limited-View Adversaries: The Benefits of Causality and Feedback
M. Bakshi, S. Kadhe, Q. Zhang, S. Jaggi, A. Sprintson
IEEE Transactions on Communications, 2025
Matrix Completion with Hypergraphs: Sharp Thresholds and Efficient Algorithms
Z. Ma†, Q. Zhang, Z. Wang
Learning on Graphs Conference (LoG), 2024.
Constrained Ensemble Exploration for Unsupervised Skill Discovery
C. Bai, R. Yang, Q. Zhang, K. Xu, Y. Chen, T. Xiao, X. Li
International Conference on Machine Learning (ICML), 2024.
On the Role of General Function Approximation in Offline Reinforcement Learning
C. Mao†, Q. Zhang, Z. Wang, X. Li
International Conference on Learning Representations (ICLR), 2024. (Spotlight, Top 5%)
Enhancing Covert Communication in OOK Schemes by Phase Deflection
X. Ji, R. Zhu, Q. Zhang, C. Li, D. Cao
IEEE Transactions on Information Forensics and Security, 2024.
Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning
C. Wang, X. Yu, C. Bai, Q. Zhang, Z. Wang
SCIENCE CHINA Information Sciences, 2024.
Exact Recovery in the General Hypergraph Stochastic Block Model
Q. Zhang, V. Y. F. Tan
IEEE Transactions on Information Theory, 2023.
Covert Communication with Mismatched Decoders
Q. Zhang, V. Y. F. Tan
IEEE Transactions on Information Theory, 2023. (accepted in part in IEEE ISIT 2022)
Optimal Information Security Against Limited-view Adversaries: Beyond MDS Codes
Q. Zhang, S. Kadhe, M. Bakshi, S. Jaggi, A. Sprintson
IEEE Transactions on Communications, 2023. (accepted in part in IEEE ISIT 2015 and IEEE ITW 2015)
Covert Communication Gains from Adversary’s Uncertainty of Phase Angles
S. Qiao, D. Cao, Q. Zhang, Y. Xu, G. Liu
IEEE Transactions on Information Forensics and Security, 2023.
MC2G: An Efficient Algorithm for Matrix Completion with Social and Item Similarity Graphs
Q. Zhang#, G. Suh#, C. Suh, V. Y. F. Tan (# indicates equal contribution)
IEEE Transactions on Signal Processing, 2022.
Covert Communication over Adversarially Jammed Channels
Q. Zhang, M. Bakshi, S. Jaggi
IEEE Transactions on Information Theory, 2021. (accepted in part in IEEE ITW 2018)
Covert Identification over Binary-Input Discrete Memoryless Channels
Q. Zhang, V. Y. F. Tan
IEEE Transactions on Information Theory, 2021.
Community Detection and Matrix Completion with Social and Item Similarity Graphs
Q. Zhang, V. Y. F. Tan, C. Suh
IEEE Transactions on Signal Processing, 2021. (accepted in part in IEEE ISIT 2020)
Optimal Change-Point Detection with Training Sequences in the Large and Moderate Deviations Regimes
H. He, Q. Zhang, V. Y. F. Tan
IEEE Transactions on Information Theory, 2021. (accepted in part in IEEE ISITA 2020)
Covert Communication with Polynomial Computational Complexity
Q. Zhang, M. Bakshi, S. Jaggi
IEEE Transactions on Information Theory, 2020. (accepted in part in IEEE ISIT 2016)
Stealthy Communication Over Adversarially Jammed Multipath Networks
J. Song†, Q. Zhang, S. Kadhe, M. Bakshi, S. Jaggi
IEEE Transactions on Communications, 2020. (accepted in part in IEEE ISIT 2018)
Undetectable Radios: Covert Communication under Spectral Mask Constraints
Q. Zhang, M. Bloch, M. Bakshi, S. Jaggi
IEEE International Symposium on Information Theory (ISIT), 2019