I lead the LLM Post-Training team at Baidu, where our work spans post-training alignment (RLHF, agentic RL, reasoning RL, reward modeling, on-policy distillation, data curation), general agents (tool use, planning, multi-agent systems), coding agents (SWE agent, WebDev agent), and deep search (deep search, wide search, deep research).
Before joining Baidu in December 2017, I was a researcher at Microsoft Research Asia (MSRA), working on information retrieval, question answering, and knowledge bases. I obtained my Ph.D. in computer science from Harbin Institute of Technology (HIT) under the supervision of Prof. Hsiao-Wuen Hon, Prof. Ting Liu, and Dr. Chin-Yew Lin.
The LLM Post-Training team at Baidu is looking for talented engineers and researchers. Both internship and full-time positions are available.
Post-training alignment: RLHF, agentic RL, reasoning RL, reward modeling, on-policy distillation, data curation
Coding agents: SWE agent, WebDev agent
General agents: tool use, planning, multi-agent systems
Deep search: deep search, wide search, deep research
Interested? Send your resume to legendarydan (at) gmail (dot) com
An open-source project of Chinese NLP benchmarks, jointly launched by Baidu, CCF, and CIPSC. LUGE aims to provide comprehensive evaluations across multiple NLP tasks.
A large-scale Chinese dataset for passage retrieval, containing over 90K questions and 8M passages from Baidu Search.
A Chinese dataset for evaluating the robustness of question matching models across lexical, syntactic, and semantic dimensions.