Minimizing the Discrepancy Between Source and Target Domains by Learning Adapting Components
Author Affiliations: Department of Computer Science, University of British Columbia; Department of Statistics and Actuarial Science, University of Waterloo
Publication: Journal of Computer Science & Technology (计算机科学技术学报(英文版))
Year/Volume/Issue: 2014, Vol. 29, No. 1
Pages: 105-115
Subject Classification: 12 [Management]; 1201 [Management Science and Engineering (degrees in Management or Engineering)]; 081104 [Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 0835 [Software Engineering]; 0811 [Control Science and Engineering]; 0812 [Computer Science and Technology (degrees in Engineering or Science)]
Funding: A preliminary version of this paper was published in the Proceedings of ICDM 2012.
Keywords: domain adaptation; kernel embedding; Hilbert-Schmidt independence criterion
Abstract: Predicting the response variables of a target dataset is one of the main problems in machine learning. Predictive models are expected to perform satisfactorily across a broad range of target domains, but this may not be feasible when there is a mismatch between the source and target domain distributions. Domain adaptation algorithms aim to resolve this issue so that a model can be deployed across different target domains. We propose a method based on kernel distribution embedding and the Hilbert-Schmidt independence criterion (HSIC) to address this problem. The proposed method embeds both the source and target data into a new feature space with two properties: 1) the distributions of the source and target datasets are as close as possible in the new feature space, and 2) the important structural information of the data is preserved. The embedded data can lie in a lower-dimensional space while preserving these properties, so the method can also be viewed as a dimensionality reduction method. Our proposed method has a closed-form solution, and the experimental results show that it works well in practice.
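The abstract's two ingredients, kernel mean embeddings of distributions and HSIC, both have standard empirical estimators. The NumPy sketch below computes the squared maximum mean discrepancy (MMD) between two samples, which measures the distance between their kernel mean embeddings, and the biased empirical HSIC estimator tr(KHLH)/(n-1)^2. This is only an illustration of the quantities such a method trades off, not the paper's closed-form algorithm; the function names, the RBF kernel, and the bandwidth `sigma` are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    """Gaussian RBF kernel matrix between the rows of X and the rows of Y."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-sq / (2.0 * sigma**2))

def mmd2(Xs, Xt, sigma=1.0):
    """Squared MMD between the kernel mean embeddings of samples Xs and Xt."""
    Kss = rbf_kernel(Xs, Xs, sigma)
    Ktt = rbf_kernel(Xt, Xt, sigma)
    Kst = rbf_kernel(Xs, Xt, sigma)
    return Kss.mean() - 2.0 * Kst.mean() + Ktt.mean()

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2, H the centering matrix."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    L = rbf_kernel(Y, Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centers the kernel matrices
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Xs = rng.normal(0.0, 1.0, size=(100, 5))  # source sample
    Xt = rng.normal(1.0, 1.0, size=(100, 5))  # target sample with shifted mean
    print("MMD^2 before adaptation:", mmd2(Xs, Xt))
    # Per-domain centering is a crude stand-in for a learned mapping; the
    # paper instead learns components that minimize this discrepancy while
    # preserving structure (e.g., keeping HSIC with side information high).
    print("MMD^2 after per-domain centering:",
          mmd2(Xs - Xs.mean(0), Xt - Xt.mean(0)))
```

In a method of this kind, the discrepancy term (MMD) is driven down while an HSIC term between the embedded data and structural side information (such as source labels) is kept high; with linear or kernelized maps, both terms are quadratic in the projection, which is what makes a closed-form (eigendecomposition-style) solution possible.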