I am currently a postdoctoral research fellow at the NExT++ research center, National University of Singapore, working with Prof. Tat-Seng Chua. I am also an associate researcher at Singapore SEA AI lab (previously working with Prof. Shuicheng Yan). Prior to that, I received my Ph.D degree from Wuhan University, advised by Prof. Donghong Ji, and my B.E degree from Xidian University. I interned at Baidu Inc. during my B.E studies.
My research direction lies at the intersection of Natural Language Processing (NLP) and Computer Vision (CV), i.e., Vision-Language Learning or Multimodal Machine Learning.
My interests broadly cover NLP and multimedia applications, such as
Language Modeling, Information Extraction, Affective Computing, Syntax/Semantic Parsing,
Text-to-Image/Video Generation, Image/Video Synthesis.
I develop learning models with the fundamental goal of building systems capable of human-level understanding of the world.
My ongoing research focuses on Structure-aware Intelligence Learning (SAIL),
which aims to enhance the semantic understanding of varied modalities by modeling the intrinsic structure of the data.
The SAIL idea works effectively for deep-learning-based AI and also holds for current large language models (LLMs),
which will ultimately help achieve AGI of universal modalities (world modeling).
See my research statement for more details.
I also firmly believe that the key to realizing human-level AGI lies in two fundamental aspects simultaneously:
A. human-level complex reasoning ability,
and B. mastery of world knowledge,
with neither sufficient without the other.
My research has been published in top-tier ML/NLP/DM venues, including ICML, NeurIPS, ACL, AAAI, WWW, SIGIR, IJCAI, EMNLP, COLING, TOIS, TNNLS, and TASLP. My Ph.D thesis was awarded the Excellent Doctoral Thesis of the Chinese Information Processing Society (CIPS). I won more than ten honors and awards during my Ph.D studies. I’ve served as (Senior) Area Chair or Senior Program Committee member for top-tier conferences such as ACL, IJCAI, EMNLP, WSDM, and ARR. I am also a regularly invited reviewer for prestigious journals including TPAMI, TNNLS, TKDE, TOIS, TAFFC, and TASLP. I am (or was) on the organizing committee of WSDM 2022 (Volunteer Chair), NSSDM 2023 (Program Chair), and EMNLP 2023 (Workshop Chair).
I am constantly looking for collaborations. Hit me up if you are a Ph.D/master/bachelor student interested in what I am doing now. If you are from a Chinese university, there are also vacancies for research interns (e.g., self-/CSC-funded joint PhD project). Please describe your research status and attach your resume.
We are excited to announce the release of NExT-GPT (Demo, Code, Paper), the first end-to-end MM-LLM that perceives input and generates output in arbitrary combinations (any-to-any) of text, image, video, audio, and beyond.
• 25 Aug 2023 Invited to give a talk at WING lab @ NUS, on the topic of LLM-Empowered Text-to-Vision Diffusion Models
• 23 Aug 2023 Invited to give a talk at Institute of Computing Technology, Chinese Academy of Sciences, on the topic of Scene Graph-driven Structured Vision-Language Learning
• 10 Aug 2023 Four papers have been accepted by ACM MM 2023, on Text-to-Image Generation, Multimodal Emotion Recognition, Video Semantic Role Labeling, and Video Moment Retrieval. Congrats to my co-authors!
• 4 Aug 2023 Our Universal Structured NLP (XNLP) demonstration system has been launched online; access it here (paper).