Hao Fei

Research Fellow

School of Computing, National University of Singapore
5 Prince George's Park, Singapore, 118404


I am a postdoctoral research fellow at the National University of Singapore, working with Prof. Tat-Seng Chua at the NExT++ research center, and with Prof. Wynne Hsu and Prof. Mong Li Lee at the Institute of Data Science. I am also an associate researcher at Kunlun 2050 Research, Skywork AI Lab Singapore, working with Prof. Shuicheng Yan (previously an associate researcher at SEA AI lab). Prior to that, I received my Ph.D. degree from Wuhan University. I was an intern at Baidu Inc. during my bachelor's studies.

Research Briefing

My research lies at the intersection of Natural Language Processing (NLP) and Computer Vision (CV), i.e., Vision-Language Learning, with broad interests including Large Language Models, Information Extraction, Affective Computing, Syntactic/Semantic Parsing, Text-Image/Video/Audio/3D Modeling, and Cross-modal Reasoning. I develop learning models with the fundamental goal of building systems capable of human-level understanding of the world. My ongoing research focuses on Structure-aware Intelligence Learning (SAIL), which aims to enhance the semantic understanding of varied modalities by modeling the intrinsic structure of the data. The SAIL idea works effectively for deep-learning-based AI and also holds for current large language models (LLMs), which will ultimately help achieve AGI of universal modalities (world modeling). See my research statement for more details. I also strongly believe that the key to realizing human-level AGI lies in two fundamental aspects together: (A) human-level complex reasoning ability and (B) mastery of world knowledge, neither of which can do without the other.

My research has been published in top-tier ML/NLP/DM venues, including ICML, NeurIPS, ACL, CVPR, AAAI, WWW, SIGIR, IJCAI, EMNLP, ACM MM, TPAMI, TKDE, TOIS, TNNLS, and TASLP. My Ph.D. thesis was awarded the Excellent Doctoral Thesis of the Chinese Information Processing Society (CIPS). I won more than ten honors and awards during my Ph.D. studies. I have served as (Senior) Area Chair or Senior Program Committee member for top-tier conferences such as ACL, AAAI, IJCAI, EMNLP, NAACL, WSDM, COLING, and ARR. I am a regularly invited reviewer for prestigious journals including TPAMI, TNNLS, TKDE, TOIS, TAFFC, and TASLP. I am (or was) on the organizing committee of WSDM 2022 (Volunteer Chair), NSSDM 2023 (Program Chair), EMNLP 2023 (Workshop Chair), SSNLP 2023 (Organizing Committee), and ACL 2024 (Volunteer Chair). I am also an Associate Editor of journals including TALLIP and Neurocomputing.


I am always looking for collaborations, especially on the topics mentioned above. Remote collaboration is also possible. For promising students, I will provide sufficient GPUs. Hit me up if you are a Ph.D./master's/bachelor's student and interested in what I am doing now. If you are from a Chinese university, there are also vacancies for research interns (e.g., self-funded or CSC-funded joint Ph.D. projects). When you write, please describe your research status and attach your resume.


  2 May 2024

Three papers have been accepted by ICML 2024: 1) NExT-GPT, 2) Video-of-Thought, and 3) Video-LLM Momentor. Congrats to all my co-authors!

  25 April 2024

One paper on Video-Language Modeling has been accepted by TPAMI!

  16 April 2024

One paper on Few-shot Named Entity Recognition has been accepted by TKDE!

  15 April 2024

We are excited to announce the release of Vitron (Demo, Paper, Code), a universal pixel-level vision LLM designed for comprehensive understanding (perceiving and reasoning), generation, segmentation (grounding and tracking), and editing (inpainting) of both static images and dynamic videos.

  4 March 2024

We are holding the Grand Challenge of Visual Spatial Description (VSD) at ACM Multimedia 2023. Participants are welcome!
