A Simple yet Effective Framework for Active Learning to Rank

Authors: Qingzhong Wang; Haifang Li; Haoyi Xiong; Wen Wang; Jiang Bian; Yu Lu; Shuaiqiang Wang; Zhicong Cheng; Dejing Dou; Dawei Yin

Affiliation: Baidu Incorporated, Beijing 100085, China

Publication: Machine Intelligence Research (机器智能研究, English edition)

Year, volume, issue: 2024, Vol. 21, No. 1

Pages: 169-183

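Subject classification: 081203 [Engineering - Computer Application Technology]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (degrees may be conferred in engineering or science)]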

Funding: This work was supported in part by the National Key R&D Program of China (No. 2021ZD0110303).

Keywords: Search; information retrieval; learning to rank; active learning; query by committee

Abstract: While China has become the largest online market in the world, with approximately 1 billion internet users, Baidu runs the world's largest Chinese search engine, serving hundreds of millions of daily active users and responding to billions of queries per *** handle the diverse query requests from users at web scale, Baidu has made tremendous efforts in understanding users' queries, retrieving relevant content from a pool of trillions of webpages, and ranking the most relevant webpages at the top of the *** the components used in Baidu search, learning to rank (LTR) plays a critical role, and an extremely large number of queries, together with their relevant webpages, must be labelled in a timely manner to train and update the online LTR *** reduce the cost and time consumption of query/webpage labelling, we study the problem of active learning to rank (active LTR), which selects unlabeled queries for annotation and training in this ***, we first investigate the criterion ranking entropy (RE), which characterizes the entropy of relevant webpages under a query produced by a sequence of online LTR models updated by different checkpoints, using a query-by-committee (QBC) ***, we explore a new criterion, namely prediction variances (PV), that measures the variance of prediction results for all relevant webpages under a *** empirical studies find that RE may favor low-frequency queries from the pool for labelling, while PV prioritizes high-frequency queries ***, we combine these two complementary criteria as the sample selection strategies for active *** experiments with comparisons to baseline algorithms show that the proposed approach could train LTR models to achieve higher discounted cumulative gain (i.e., the relative improvement DCG4 = 1.38%) with the same budgeted labelling efforts.
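The abstract names two selection criteria but gives no formulas, so the following is only a minimal sketch of how such criteria might be computed, not the authors' definitions. It assumes a committee is a list of score lists (one per model checkpoint), takes ranking entropy as the mean pairwise vote entropy across committee members, takes prediction variance as the variance of committee-averaged scores over a query's webpages, and combines the two by summing min-max normalized values. The function names `ranking_entropy`, `prediction_variance`, and `select_queries` are hypothetical.

```python
import math
from itertools import combinations
from statistics import pvariance


def ranking_entropy(committee_scores):
    """committee_scores[m][i] is the score of webpage i from committee member m.

    Assumption: RE is the mean binary entropy of the committee's pairwise
    votes on whether webpage i should rank above webpage j.
    """
    n_pages = len(committee_scores[0])
    pairs = list(combinations(range(n_pages), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        # Fraction of committee members that score page i above page j.
        votes = sum(1 for scores in committee_scores if scores[i] > scores[j])
        p = votes / len(committee_scores)
        for prob in (p, 1.0 - p):
            if prob > 0.0:
                total -= prob * math.log(prob)
    return total / len(pairs)


def prediction_variance(committee_scores):
    """Assumption: PV is the population variance, over a query's webpages,
    of the committee-averaged predicted score."""
    n_members = len(committee_scores)
    n_pages = len(committee_scores[0])
    mean_scores = [
        sum(scores[i] for scores in committee_scores) / n_members
        for i in range(n_pages)
    ]
    return pvariance(mean_scores)


def select_queries(pool, budget):
    """pool maps each unlabeled query to its committee_scores.

    Assumption: the criteria are combined by summing their min-max
    normalized values; the `budget` highest-scoring queries are returned.
    """
    def minmax(values):
        lo, hi = min(values.values()), max(values.values())
        span = hi - lo
        return {q: (v - lo) / span if span > 0 else 0.0 for q, v in values.items()}

    re_norm = minmax({q: ranking_entropy(s) for q, s in pool.items()})
    pv_norm = minmax({q: prediction_variance(s) for q, s in pool.items()})
    combined = {q: re_norm[q] + pv_norm[q] for q in pool}
    return sorted(pool, key=lambda q: combined[q], reverse=True)[:budget]
```

Under these assumptions, a query on which the checkpoints disagree about the ordering scores high on RE, while a query whose webpages receive widely spread scores scores high on PV, so the summed score can surface both kinds of query for labelling.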
