I am a fourth-year direct Ph.D. student in Artificial Intelligence at Xi'an Jiaotong University, where I am advised by Prof. Badong Chen. Prior to my doctoral studies, I received my Bachelor's degree in Automation from Chongqing University in 2022, under the supervision of Prof. Min Zhao.
My research interests lie in generalization in computer vision and robot learning, vision-language models, and vision-language-action models.
Feel free to contact me for discussion and collaboration!
News
- [2025/08/02]: Long-VLA, the first VLA model to enable skill chaining in long-horizon tasks, got accepted to CoRL 2025! The paper also introduces a new benchmark, L-CALVIN. See Project page.
- [2025/05/06]: We released OpenHelix, which provides a short survey and empirical analysis of dual-system VLA, and introduces a novel open-source dual-system VLA model.
- [2025/05/01]: BC-IB, the first work to bring information bottleneck theory to visual imitation learning for robotic manipulation, got accepted to ICML 2025! See Project page.
- [2025/03/24]: One paper on causal discovery, which integrates Minimum Error Entropy to adapt dynamically to varying levels of complexity and noise, got accepted by Neural Networks (2025)!
- [2025/01/23]: VLAS, the first vision-language-action model that incorporates speech instructions for robotic manipulation, got accepted for ICLR 2025!
Historical News
- [2024/12/21]: PromptTA, a novel VLM-based source-free domain generalization method integrating a text adapter and diverse prompt inputs, got accepted by ICASSP 2025!
- [2024/10/23]: The GitHub repository Awesome-Robotics-Manipulation is now public! Let's work together to build a comprehensive and valuable resource for the robotics and AI community!
- [2024/07/04]: SPG, a novel VLM-based domain generalization method that introduces generative concepts into prompt learning, got accepted by ECCV 2024.
- [2024/05/02]: JRNGC, a unified causal discovery method that leverages the Jacobian matrix to address high-dimensional multivariate causal discovery, got accepted by ICML 2024!
- [2023/12/14]: One paper on cross-domain few-shot classification got accepted by ICASSP 2024.
- [2023/12/09]: PDA, a novel VLM-based prompt learning approach for unsupervised domain adaptation that integrates and thoroughly evaluates diverse prompt learning methods, got accepted by AAAI 2024!
Publications
Published


Soft Prompt Generation for Domain Generalization
Shuanghao Bai*, Yuedi Zhang*, Wanqi Zhou, Yicong He, Zhirong Luan, Badong Chen
ECCV 2024
Paper | arXiv | Code | Chinese Intro


Yiguo Fan*, Pengxiang Ding*, Shuanghao Bai*, Xinyang Tong*, Yuyang Zhu, Hongchao Lu, Fengqi Dai, Wei Zhao, Yang Liu, Siteng Huang, Zhaoxin Fan, Badong Chen, Donglin Wang. "Long-VLA: Unleashing Long-Horizon Capability of Vision Language Action Model for Robot Manipulation". [arXiv] [Project]
Wanqi Zhou, Shuanghao Bai, Qibin Zhao, Badong Chen. "An Information-Theoretic Approach for Heterogeneous Differentiable Causal Discovery". [Paper] [Code]
Wei Zhao, Pengxiang Ding, Zhang Min, Zhefei Gong, Shuanghao Bai, Han Zhao, Donglin Wang. "VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation". [arXiv] [Code]
Haoran Zhang*, Shuanghao Bai*, Wanqi Zhou, Jingwen Fu, Badong Chen. "PromptTA: Prompt-driven Text Adapter for Source-free Domain Generalization". [Paper] [arXiv] [Code]
Shuanghao Bai, Wanqi Zhou, Zhirong Luan, Donglin Wang, Badong Chen. "Improving Cross-domain Few-shot Classification with Multilayer Perceptron". [Paper] [arXiv] [Code]
Preprints & Under Submission
Yuedi Zhang, Shuanghao Bai, Wanqi Zhou, Zhirong Luan, Badong Chen. "Dual-Path Stable Soft Prompt Generation for Domain Generalization". [arXiv] [Code]
Can Cui, Pengxiang Ding, Wenxuan Song, Shuanghao Bai, Xinyang Tong, Zirui Ge, Runze Suo, Wanqi Zhou, Yang Liu, Bofang Jia, Han Zhao, Siteng Huang, Donglin Wang. "OpenHelix: A Short Survey, Empirical Analysis, and Open-source Dual-system VLA Model for Robotic Manipulation". [arXiv] [Code] [Project]
Wanqi Zhou*, Shuanghao Bai*, Danilo Mandic, Qibin Zhao, Badong Chen. "Revisiting the Adversarial Robustness of Vision Language Models: a Multimodal Perspective". [arXiv] [Code]
Research Experience
- Beijing Academy of Artificial Intelligence (BAAI): Aug. 2025 - Present
  - Research Intern
  - Research Direction: Embodied multimodal foundation models, e.g., VLA models.
  - Advisor: Chi Cheng
  - Co-Advisor: Shanghang Zhang
- Westlake University - MiLab: Sept. 2024 - Mar. 2025
  - Visiting Student
  - Research Direction: Robotics.
  - Advisor: Donglin Wang
Honors and Awards
- National Scholarship, 2024
- Outstanding Undergraduate Thesis of College of Automation, Chongqing University, 2022
- National Scholarship, 2019
- Outstanding Student of Chongqing University, 2019