Jing Liu (刘璟)

Distinguished Architect
Natural Language Processing Department at Baidu Inc.


I have been a distinguished architect at Baidu NLP since December 2017. Before that, I was a researcher at Microsoft Research Asia (MSRA) from September 2014 to December 2017. I obtained my Ph.D. in computer science from Harbin Institute of Technology (HIT) in September 2014, under the supervision of Prof. Hsiao-Wuen Hon (MSRA), Prof. Ting Liu (HIT) and Dr. Chin-Yew Lin (MSRA).


Please contact me via legendarydan (at) gmail (dot) com
My Sina Weibo (in Chinese) and Twitter (in English)

News

  • Please email me your resume (for internships or full-time positions) if you are interested in working with us on augmented large language models, including autonomous agents, question answering, automated numerical reasoning and theorem proving. Experience with machine learning (including but not limited to deep learning) for NLP is preferred.
  • May 2023: Our work on generative retrieval 📚 TOME was accepted by ACL 2023.
  • Oct 2022: DuReader_retrieval and DuQM were accepted by EMNLP 2022.
  • Mar 2022: We released a large-scale Chinese dataset DuReader_retrieval for passage retrieval.
  • Sep 2021: We released a dataset DuQM for robust question matching.
  • Aug 2021: 🚀 RocketQAv2 was accepted by EMNLP 2021.
  • Aug 2021: Dr. Hua Wu and I gave a talk, "Benchmarks: An Industry Perspective", at the ACL 2021 Workshop on Benchmarking: Past, Present and Future (BPPF).
  • May 2021: 💑 PAIR and DuReader_robust were accepted by ACL 2021.
  • Apr 2021: We released a dataset DuReader_checklist for robust machine reading comprehension.
  • Mar 2021: 🚀 RocketQAv1 was accepted by NAACL 2021.
  • Oct 2020: We proposed RocketQAv1, an optimized training approach to dense passage retrieval for open-domain question answering. RocketQA ranked 1st on the leaderboard of the MS MARCO Passage Ranking Task. It was featured in zh-cn and en-us.
  • Aug 2020: We launched LUGE (千言), an open-source project of Chinese NLP benchmarks.
  • Apr 2020: We released a dataset DuReader_robust for robust machine reading comprehension.
  • Nov 2019: Our machine reading comprehension system D-NET ranked first in the MRQA 2019 Shared Task, which tests whether MRC systems can generalize beyond the datasets on which they were trained. D-NET was featured in zh-cn (1, 2) and en-us.

Papers [Google Scholar]

Datasets

Professional Activities

  • Area Chair: ACL 2021 (Question Answering), AACL 2022 (Question Answering)
  • Session Chair: AACL 2020 (Question Answering)
  • Program committee member/reviewer: ACL, EMNLP, NAACL, EACL, AACL, SIGIR, KDD, WSDM, WWW, CIKM, ICWSM, ACM Transactions on the Web (TWEB), ACM Transactions on Intelligent Systems and Technology (TIST), ACM Transactions on Information Systems (TOIS), Frontiers of Computer Science (FCS)

Working Experience

  • Principal Architect, Baidu NLP, Dec. 2017 - present
  • Researcher, Microsoft Research Asia, Sep. 2014 - Dec. 2017
  • Intern, Microsoft Research Asia, Jul. 2009 - Sep. 2014

Education

  • PhD, Computer Science, Harbin Institute of Technology, Sep. 2009 - Sep. 2014
  • M.Sc, Computer Science, Harbin Institute of Technology, Sep. 2007 - Jul. 2009
  • B.Sc, Computer Science, Xidian University, Sep. 2003 - Jul. 2007

Last updated Jan 30 2022 (This template was originally designed by Mu Li.)