About Me
I am a Research Engineer at Alibaba DAMO Academy. I received my PhD from The Chinese University of Hong Kong (CUHK) in 2021, under the supervision of Prof. Wai Lam. Before my PhD, I was an undergraduate at Sun Yat-sen University (SYSU). I have interned at Microsoft Research Asia, Tencent AI Lab and Huawei Noah’s Ark Lab.
Contact E-mail: lixin4ever@gmail.com.
Research Interests
- Efficient LLM Fine-tuning & Inference
- Multimodal & Multilingual LLMs
Education
Experiences
- Jul 2020-Feb 2021, Research Intern, Speech and Semantics Group@Huawei Noah’s Ark Lab. Mentor: Dr. Yi Liao.
- Jan 2018-Apr 2018, Research Intern, NLP Center@Tencent AI Lab. Mentors: Dr. Lidong Bing and Dr. Piji Li.
- Jul 2015-Jun 2016, Research Intern, Knowledge Computing Group@Microsoft Research Asia. Mentors: Dr. Chin-Yew Lin and Dr. Jing Liu.
- Sep 2014-Jul 2015, Research Assistant, SentiNet Group, Sun Yat-sen University. Supervisor: Prof. Rao Yanghui.
Recent Preprints & Publications [Full List]
- VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
Boqiang Zhang*, Kehan Li*, Zesen Cheng*, Zhiqiang Hu*, Yuqian Yuan*, Guanzheng Chen*, Sicong Leng*, Yuming Jiang*, Hang Zhang*, Xin Li*, Peng Jin, Wenqi Zhang, Fan Wang, Lidong Bing, Deli Zhao.
arXiv:2501.13106
[code][checkpoints & demos]
- VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Zesen Cheng*, Sicong Leng*, Hang Zhang*, Yifei Xin*, Xin Li*, Guanzheng Chen, Yongxin Zhu, Wenqi Zhang, Ziyang Luo, Deli Zhao, Lidong Bing.
arXiv:2406.07476
[code][checkpoints & demos]
- VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
Yuqian Yuan, Hang Zhang, Wentong Li, Zesen Cheng, Boqiang Zhang, Long Li, Xin Li, Deli Zhao, Wenqiao Zhang, Yueting Zhuang, Jianke Zhu, Lidong Bing.
arXiv:2501.00599
[code][checkpoints & benchmark]
- 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Wenqi Zhang, Hang Zhang, Xin Li, Jiashuo Sun, Yongliang Shen, Weiming Lu, Deli Zhao, Yueting Zhuang, Lidong Bing.
arXiv:2501.00958
[code][dataset]
- The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Sicong Leng, Yun Xing, Zesen Cheng, Yang Zhou, Hang Zhang, Xin Li, Deli Zhao, Shijian Lu, Chunyan Miao, Lidong Bing.
arXiv:2410.12787
[code][leaderboard]
- Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
Zesen Cheng*, Hang Zhang*, Kehan Li*, Sicong Leng, Zhiqiang Hu, Fei Wu, Deli Zhao, Xin Li^, Lidong Bing.
arXiv:2410.17243
[code][pypi]
- LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
Guanzheng Chen, Xin Li, Michael Shieh, Lidong Bing.
To appear in ICLR 2025 (Full paper).
- Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs
Sen Yang, Xin Li, Leyang Cui, Lidong Bing, Wai Lam.
To appear in Findings of NAACL 2025 (Full paper).
[code by Sen]
Honors & Awards
- AAAI Student Scholarship, 2020
- Outstanding Graduate Award, Sun Yat-sen University.
- Excellent Undergraduate Thesis Award, Sun Yat-sen University.
Professional Activities
- Reviewer (or PC Member):
- ACL 2020-2023, EMNLP 2018-2023, NAACL 2021
- AAAI 2019-2020
- WSDM 2023, CIKM 2021
- ACM Transactions on Knowledge Discovery from Data (TKDD)
- ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)
- IEEE Transactions on Multimedia (TMM)
- Neurocomputing
- Computational Intelligence
Some Useful Notes & Links
Hobbies
- Playing basketball
- Swimming
- Hiking