Kernel k-nearest neighbor algorithm as a flexible SAR modeling tool
Author affiliations: Research Center of Modernization of Traditional Chinese Medicines, Central South University, Changsha 410083, P.R. China; School of Mathematical Sciences and Computing Technology, Central South University, Changsha 410083, P.R. China
Conference: 28th Annual Academic Meeting of the Chinese Chemical Society (中国化学会第28届学术年会)
Conference date: 2012
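Subject classification: 081704 [Engineering - Applied Chemistry]; 07 [Science]; 070304 [Science - Physical Chemistry (including Chemical Physics)]; 08 [Engineering]; 0817 [Engineering - Chemical Engineering and Technology]; 0703 [Science - Chemistry]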
Keywords: k-nearest neighbor (k-NN); kernel methods; string kernel; structure-activity relationship (SAR)
Abstract: A kernel version of the k-nearest neighbor (k-NN) algorithm has been developed to model the complex relationship between molecular descriptors and bioactivities of *** k-NN performs the original k-NN algorithm after mapping the training samples from the input space into a high-dimensional feature *** can be easily constructed by calculating the distance between samples in the feature space, which follows directly from simple evaluations of the kernel *** developed kernel k-NN is flexible in handling complex nonlinear relationships; more importantly, it can also conveniently cope with some non-vectorial data simply by defining different *** results obtained from several real SAR datasets indicated that the performance of kernel k-NN is comparable to that of support vector machine *** can be regarded as an alternative modeling technique for several chemical problems, including the study of structure-activity relationships (SAR). The source code implementing kernel k-NN in R is freely available at http://***/p/kernelmethods/.
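Illustrative sketch: the abstract states that distances in the feature space can be obtained from kernel evaluations alone. The standard identity behind this is d(x, y)^2 = k(x, x) - 2 k(x, y) + k(y, y). The Python code below is a minimal sketch of that idea and is not the authors' R implementation. All function names, the Gaussian kernel with gamma = 0.5, the substring-counting (spectrum) string kernel, and the toy data are assumptions made for illustration only.

```python
import math
from collections import Counter


def linear_kernel(x, y):
    """Plain dot product; with it the kernel distance equals Euclidean distance."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel on numeric vectors; gamma is an illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def spectrum_kernel(s, t, p=2):
    """Simple string kernel: counts shared substrings of length p (assumed choice)."""
    cs = Counter(s[i:i + p] for i in range(len(s) - p + 1))
    ct = Counter(t[i:i + p] for i in range(len(t) - p + 1))
    return float(sum(cs[u] * ct[u] for u in cs))


def kernel_distance(x, y, kernel):
    """Feature-space distance from kernel values: sqrt(k(x,x) - 2k(x,y) + k(y,y))."""
    d2 = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(d2, 0.0))  # clamp tiny negative rounding errors


def kernel_knn_predict(train_x, train_y, query, kernel, k=3):
    """Majority vote among the k training samples closest to query in feature space."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: kernel_distance(pair[0], query, kernel),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy descriptor vectors (hypothetical data, not from the paper).
vec_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
vec_y = ["inactive"] * 3 + ["active"] * 3

# Toy SMILES-like strings (hypothetical data) to show the non-vectorial case.
str_x = ["CCO", "CCCO", "CCN", "c1ccccc1", "c1ccccc1C", "c1ccccc1O"]
str_y = ["aliphatic"] * 3 + ["aromatic"] * 3

if __name__ == "__main__":
    print(kernel_knn_predict(vec_x, vec_y, (0.2, 0.1), rbf_kernel))
    print(kernel_knn_predict(str_x, str_y, "CCCCO", spectrum_kernel))
```

Only the kernel argument changes between the vector and the string case, which is the sense in which non-vectorial data can be handled by defining a different kernel.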