When Crowdsourcing Meets Data Markets: A Fair Data Value Metric for Data Trading
Author affiliation: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Publication: Journal of Computer Science & Technology (English edition)
Year/Volume/Issue: 2024, Vol. 39, No. 3
Pages: 671-690
Core indexing:
Subject classification: 08 [Engineering] 0835 [Engineering: Software Engineering] 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be conferred)]
Funding: Supported in part by the National Key Research and Development Program of China under Grant No. 2020YFB1707900 and the National Natural Science Foundation of China under Grant Nos. U2268204, 62322206, 62132018, 62025204, 62272307, and 62372296
Keywords: data trading; crowdsourcing; mechanism design; Shapley value
Abstract: Large-quantity and high-quality data is critical to the success of machine learning in diverse *** with the dilemma of data silos where data is difficult to circulate, emerging data markets attempt to break the dilemma by facilitating data exchange on the ***, on the other hand, is one of the important methods to efficiently collect large amounts of high-value data in data *** this paper, we investigate the joint problem of efficient data acquisition and fair budget distribution across the crowdsourcing and data *** propose a new metric of data value as the uncertainty reduction of a Bayesian machine learning model by integrating the data into model *** by this data value metric, we design a mechanism called Shapley Value Mechanism with Individual Rationality (SV-IR), in which we design a greedy algorithm with a constant approximation ratio to select the most cost-efficient data brokers, and a fair compensation determination rule based on the Shapley value, respecting the individual rationality *** further propose a fair reward distribution method for the data holders with various effort levels under the charge of a data *** demonstrate the fairness of the compensation determination rule and reward distribution rule by evaluating our mechanisms on two real-world *** evaluation results also show that the selection algorithm in SV-IR can approach the optimal solution and outperforms state-of-the-art methods.
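Illustrative sketch (not the paper's SV-IR mechanism): the abstract combines two ideas, valuing data by the uncertainty it removes from a Bayesian model and sharing compensation by Shapley value. The Python below shows how these could fit together in a toy setting. Everything in it is an assumption made for illustration: a one-dimensional Bayesian linear regression with a Gaussian prior, the entropy reduction of the posterior as the value function, exact Shapley values by enumerating all orderings, and a proportional budget split. The names `information_gain`, `shapley_values`, `features`, and `budget` are hypothetical, and individual rationality constraints and the greedy broker selection are not modelled.

```python
from itertools import permutations
from math import factorial, log


def information_gain(coalition, features, noise_var=1.0, prior_var=1.0):
    """Entropy reduction of a 1-D Gaussian posterior (assumed toy model).

    Prior: w ~ N(0, prior_var); each data point i observes y = x_i * w + noise.
    The posterior precision grows by sum(x_i^2) / noise_var, so the entropy
    drops by 0.5 * log(1 + prior_var * sum(x_i^2) / noise_var).
    """
    precision_gain = sum(features[i] ** 2 for i in coalition) / noise_var
    return 0.5 * log(1.0 + prior_var * precision_gain)


def shapley_values(players, value_fn):
    """Exact Shapley values by averaging marginal gains over all orderings.

    Cost is O(n!), so this is only suitable for a handful of players.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        seen = []
        prev = value_fn(seen)
        for p in order:
            seen.append(p)
            cur = value_fn(seen)
            totals[p] += cur - prev  # marginal contribution of p in this order
            prev = cur
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


# Hypothetical contributors, each holding one scalar feature value.
features = {"a": 1.0, "b": 1.0, "c": 2.0}
phi = shapley_values(list(features), lambda s: information_gain(s, features))

# Assumed rule for illustration: split a fixed budget in proportion to Shapley value.
budget = 100.0
total_value = sum(phi.values())
payments = {p: budget * v / total_value for p, v in phi.items()}
```

In this toy instance the two contributors with identical data ("a" and "b") receive the same share, and the Shapley values sum to the value of the full dataset, which are the symmetry and efficiency properties that usually motivate Shapley-based compensation.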