Comparative Studies of Model Performance Based on Different Data Sampling Methods
Author affiliation: State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources, North China Electric Power University
Conference: 25th Chinese Control and Decision Conference
Organizers: IEEE; NE Univ; IEEE Ind Elect Chapter; IEEE Harbin Sect Control Syst Soc Chapter; Guizhou Univ; IEEE Control Syst Soc; Syst Engn Soc China; Chinese Assoc Artificial Intelligence; Chinese Assoc Automat; Tech Comm Control Theory; Chinese Assoc Aeronaut; Automat Control Soc; Chinese Assoc Syst Simulat; Simulat Methods & Modeling Soc; Intelligent Control & Management Soc
Conference date: 2013
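Subject classification: 0810 [Engineering - Information and Communication Engineering]; 08 [Engineering]; 080401 [Engineering - Precision Instruments and Machinery]; 0804 [Engineering - Instrument Science and Technology]; 080402 [Engineering - Measurement Technology and Instruments]; 0835 [Engineering - Software Engineering]; 081002 [Engineering - Signal and Information Processing]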
Funding: Supported by the National Basic Research Program (973 Program) (2012CB215203), the National Natural Science Foundation of China (51036002), and the Fundamental Research Funds for the Central Universities (12QX15).
Keywords: orthogonal Latin sampling; uniform design; data-driven model; least squares support vector machine; artificial neural network; partial least squares
Abstract: This paper presents a comparative study of how different data sampling methods affect the performance of data-driven models. An engineering benchmark modeling problem is investigated, for which three sampling methods (orthogonal Latin sampling, uniform design sampling, and random sampling) are used to generate training data with different properties. Six typical data-driven modeling techniques are compared: three artificial intelligence methods (least squares support vector machine, BP neural network, and RBF neural network) and three statistical methods (multiple linear regression, and linear and nonlinear partial least squares regression). The root mean square error (RMSE), R-squared (R²), and mean relative error (MRE) are used as the comparison criteria. The results show that the sampling method and the properties of the resulting data play a key role in establishing an accurate data-driven model.
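The sampling and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas or algorithms, so the function names here are hypothetical, the metric definitions are the commonly used ones, and a basic (non-orthogonal) Latin hypercube stands in for the paper's orthogonal Latin sampling. Uniform design sampling is omitted because it relies on precomputed design tables.

```python
import math
import random


def random_sampling(n, dim, rng):
    """Plain random sampling: n points drawn uniformly from [0, 1)^dim."""
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


def latin_hypercube(n, dim, rng):
    """Basic Latin hypercube (assumed stand-in for orthogonal Latin sampling).

    Each axis is cut into n equal strata, and every stratum is used
    exactly once per axis.
    """
    columns = []
    for _ in range(dim):
        strata = list(range(n))
        rng.shuffle(strata)
        # Place one point at a random offset inside each stratum.
        columns.append([(s + rng.random()) / n for s in strata])
    # Transpose the per-axis columns into a list of n points.
    return [list(point) for point in zip(*columns)]


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mre(y_true, y_pred):
    """Mean relative error; assumes no true value is zero."""
    n = len(y_true)
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / n
```

In a comparison of this kind, each sampler would generate the training inputs for the same model, and the three criteria would then be computed on a common test set so that differences in the scores can be attributed to the sampling method.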