Natural Language Processing Department at Baidu Inc.
I have been a principal architect and tech lead of the deep question answering team at Baidu NLP since December 2017. Before that, I was a researcher at Microsoft Research Asia (MSRA) from September 2014 to December 2017. I obtained my Ph.D. degree in computer science from Harbin Institute of Technology (HIT) in September 2014, under the supervision of Prof. Hsiao-Wuen Hon (MSRA), Prof. Ting Liu (HIT) and Dr. Chin-Yew Lin (MSRA). My research interests include question answering, information extraction and social computing.
- Please email me your resume (for internships or FTE positions) if you are interested in working with us on question answering and machine reading comprehension. Experience with machine learning (including but not limited to deep learning) for NLP is preferred.
- Aug 2021: RocketQAv2 has been accepted by EMNLP 2021.
- Aug 2021: Hua Wu and I gave a talk, "Benchmarks: An Industry Perspective", at the ACL 2021 Workshop on Benchmarking: Past, Present and Future (BPPF).
- Aug 2021: The [code and models] of PAIR for dense retrieval have been released on GitHub (repo).
- Jun 2021: The [code and models] of RocketQAv1 for dense retrieval have been released on GitHub (repo).
- May 2021: PAIR [paper] and DuReader_robust [paper] were accepted by ACL 2021.
- Apr 2021: We released a Chinese dataset, DuReader_checklist [data and code], which focuses on challenging machine reading comprehension models from multiple aspects, including understanding of vocabulary, phrases, semantic roles, reasoning and so on. We hosted a shared task on DuReader_checklist [leaderboard] at the 2021 Language and Intelligence Challenge, which drew more than 1,080 teams and more than 4,800 submissions. The shared task was featured in zh-cn.
- Mar 2021: RocketQAv1 [paper] was accepted by NAACL 2021.
- Oct 2020: We proposed RocketQA [paper], an optimized training approach to dense passage retrieval for open-domain question answering. RocketQA ranked 1st on the leaderboard of the MS MARCO Passage Ranking task. It was featured in zh-cn and en-us.
- Aug 2020: Baidu, CCF (China Computer Federation) and CIPSC (Chinese Information Processing Society of China) jointly launched LUGE (千言) [portal], an open-source project of Chinese NLP benchmarks. Specifically, LUGE aims to provide researchers with a variety of datasets and comprehensive evaluations, and to promote progress in Chinese NLP technology. Currently, we have collected more than 20 NLP datasets for 7 tasks from contributors at 11 organizations. LUGE was featured in videos (zh-cn, en-us) and articles (zh-cn). If you are interested in LUGE, please contact me.
- Apr 2020: We released a Chinese dataset, DuReader_robust [paper][data and code], for evaluating the robustness of machine reading comprehension models. We hosted a shared task on DuReader_robust [leaderboard] at the 2020 Language and Intelligence Challenge, which drew more than 1,500 teams and more than 4,600 submissions. The shared task was featured in zh-cn.
- Nov 2019: Our machine reading comprehension system D-NET [paper][code] ranked 1st in the MRQA 2019 Shared Task, which tests whether MRC systems can generalize beyond the datasets on which they were trained. D-NET was featured in zh-cn (1, 2) and en-us.
- Area Chair: ACL 2021 (Question Answering)
- Session Chair: AACL 2020 (Question Answering)
- Program committee/reviewer: ACL, EMNLP, NAACL, EACL, AACL, SIGIR, KDD, WSDM, WWW, CIKM, ICWSM, ACM Transactions on the Web (TWEB), ACM Transactions on Intelligent Systems and Technology (TIST), ACM Transactions on Information Systems (TOIS), Frontiers of Computer Science (FCS)
Papers [Google Scholar]
RocketQAv2: A Joint Training Method for Dense Passage Retrieval and Passage Re-ranking
Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang and Ji-Rong Wen
PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval
Findings of ACL 2021
Ruiyang Ren, Shangwen Lv, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang and Ji-Rong Wen
[Code and Model]
RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering
Yingqi Qu, Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Wayne Xin Zhao, Daxiang Dong, Hua Wu and Haifeng Wang
[Blog] [Code and Model]
DuReader_robust: A Chinese Dataset Towards Evaluating Robustness and Generalization of Machine Reading Comprehension in Real-World Applications
Hongxuan Tang, Hongyu Li, Jing Liu, Yu Hong, Hua Wu and Haifeng Wang
[Data and Code] [Leaderboard]
A Robust Adversarial Training Approach to Machine Reading Comprehension
Kai Liu, Xin Liu, An Yang, Jing Liu, Jinsong Su, Sujian Li and Qiaoqiao She
CoKE: Contextualized Knowledge Graph Embedding
Quan Wang, Pingping Huang, Haifeng Wang, Songtai Dai, Wenbin Jiang, Jing Liu, Yajuan Lyu, Yong Zhu, Hua Wu
D-NET: A Simple Framework for Improving the Generalization of Machine Reading Comprehension,
EMNLP 2019 Workshop on Machine Reading for Question Answering (MRQA)
Hongyu Li, Xiyuan Zhang, Yibing Liu, Yiming Zhang, Quan Wang, Xiangyang Zhou, Jing Liu, Hua Wu and Haifeng Wang
Enhancing Pre-trained Language Representations with Rich Knowledge for Machine Reading Comprehension
An Yang, Quan Wang, Jing Liu, Kai Liu, Yajuan Lyu, Hua Wu, Qiaoqiao She and Sujian Li
Towards Robust Neural Machine Reading Comprehension via Question Paraphrases
Ying Li, Hongyu Li and Jing Liu
Towards Time-Aware Distant Supervision for Relation Extraction
Tianwen Jiang, Sendong Zhao, Jing Liu, Jin-Ge Yao, Ming Liu, Bing Qin, Ting Liu, Chin-Yew Lin
Answer-focused and Position-aware Neural Question Generation
Xingwu Sun, Jing Liu, Yajuan Lyu, Yanjun Ma and Shi Wang
Aggregated Semantic Matching for Short Text Entity Linking
Feng Nie, Shuyan Zhou, Jing Liu, Jinpeng Wang, Chin-Yew Lin and Rong Pan
Neural Math Word Problem Solver with Reinforcement Learning
Danqing Huang, Jing Liu, Chin-Yew Lin and Jian Yin
DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications,
ACL 2018 Workshop on Machine Reading for Question Answering (MRQA)
Wei He, Kai Liu, Jing Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yuan Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, Haifeng Wang
Adaptations of ROUGE and BLEU to Better Evaluate Machine Reading Comprehension Task,
ACL 2018 Workshop on Machine Reading for Question Answering (MRQA)
An Yang, Kai Liu, Jing Liu, Yajuan Lyu, Sujian Li
Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification
Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li, Haifeng Wang
Revisiting Distant Supervision for Relation Extraction
Tingsong Jiang, Jing Liu and Chin-Yew Lin
A Statistical Framework for Product Description Generation
Jinpeng Wang, Yutai Hou, Jing Liu, Yunbo Cao and Chin-Yew Lin
News Citation Recommendation with Implicit and Explicit Semantics
Hao Peng, Jing Liu and Chin-Yew Lin
Knowledge Base Completion via Coupled Path Ranking
Quan Wang, Jing Liu, Yuanfei Luo, Bin Wang and Chin-Yew Lin
RBPB: Regularization-Based Pattern Balancing Method for Event Extraction
Lei Sha, Jing Liu, Chin-Yew Lin, Sujian Li, Baobao Chang and Zhifang Sui
Improving Ranking Consistency for Web Search by Leveraging Knowledge Base and Search Logs
Jyun-Yu Jiang, Jing Liu, Chin-Yew Lin and Pu-Jen Cheng
A Regularized Competition Model for Question Difficulty Estimation in Community Question Answering Services, EMNLP 2014
Quan Wang, Jing Liu, Bin Wang and Li Guo
A Computational Approach to Measuring the Correlation between Expertise and Social Media Influence for Celebrities on Microblogs, ASONAM 2014
Xin Zhao, Jing Liu, Yulan He, Chin-Yew Lin and Ji-Rong Wen
Question Difficulty Estimation in Community Question Answering Services
Jing Liu, Quan Wang, Chin-Yew Lin and Hsiao-Wuen Hon
A Hierarchical Entity-based Approach to Structuralize User Generated Content in Social Media: A Case of Yahoo! Answers
Baichuan Li, Jing Liu, Chin-Yew Lin, Irwin King and Michael R. Lyu
What's in a Name? An Unsupervised Approach to Link Users across Communities
Jing Liu, Fan Zhang, Xinying Song, Young-In Song, Chin-Yew Lin and Hsiao-Wuen Hon
An Unsupervised Method for Author Extraction from Web Pages Containing User-Generated Content
Jing Liu, Xinying Song, Jingtian Jiang and Chin-Yew Lin
Competition-based User Expertise Score Estimation
Jing Liu, Young-In Song and Chin-Yew Lin
Automatic Extraction of Web Data Records Containing User-Generated Content
Xinying Song, Jing Liu, Yunbo Cao, Chin-Yew Lin and Hsiao-Wuen Hon
Microsoft Research Asia with Redmond at the NTCIR-8 Community QA Pilot Task
Young-In Song, Jing Liu, Tetsuya Sakai, Xinjing Wang, Guwen Feng, Yunbo Cao, Hisami Suzuki and Chin-Yew Lin
- Principal Architect, Baidu NLP, Dec. 2017 - present
- Researcher, Microsoft Research Asia, Sep. 2014 - Dec. 2017
- Intern, Microsoft Research Asia, Jul. 2009 - Sep. 2014
- PhD, Computer Science, Harbin Institute of Technology, Sep. 2009 - Sep. 2014
- M.Sc, Computer Science, Harbin Institute of Technology, Sep. 2007 - Jul. 2009
- B.Sc, Computer Science, Xidian University, Sep. 2003 - Jul. 2007