Estimation of soil organic matter in the Ogan-Kuqa River Oasis, Northwest China, based on visible and near-infrared spectroscopy and machine learning
Author affiliations: College of Geography and Remote Sensing Science, Xinjiang University, Urumqi 830046, China; Xinjiang Key Laboratory of Oasis Ecology, Xinjiang University, Urumqi 830046, China; Key Laboratory of Smart City and Environment Modelling of Higher Education Institute, Xinjiang University, Urumqi 830046, China
Publication: Journal of Arid Land (干旱区科学, English edition)
Year, volume, issue: 2023, Vol. 15, No. 2
Pages: 191-204
Subject classification: 09 [Agronomy]; 0903 [Agronomy: Agricultural Resources and Environment]; 090301 [Agronomy: Soil Science]
Funding: Supported by the Key Project of Natural Science Foundation of Xinjiang Uygur Autonomous Region, China (2021D01D06) and the National Natural Science Foundation of China (41961059).
Keywords: soil organic matter content; vis-NIR spectroscopy; random forest; Boruta algorithm; machine learning
Abstract: The visible and near-infrared (vis-NIR) spectroscopy technique allows for fast and efficient determination of soil organic matter (SOM). However, a prior requirement for the vis-NIR spectroscopy technique to predict SOM is the effective removal of redundant ***, this study aims to select three wavelength selection strategies for obtaining the spectral response characteristics of *** SOM content and spectral information of 110 soil samples from the Ogan-Kuqa River Oasis were measured under laboratory conditions in July *** correlation analysis was introduced to preselect spectral wavelengths from the preprocessed spectra that passed the 0.01 level significance *** successive projection algorithm (SPA), competitive adaptive reweighted sampling (CARS), and Boruta algorithm were used to detect the optimal variables from the preselected ***, partial least squares regression (PLSR) and random forest (RF) models combined with the optimal wavelengths were applied to develop a quantitative estimation model of the SOM *** results demonstrate that the optimal variables selected were mainly located near the spectral absorption features (i.e., 1400.0, 1900.0, and 2200.0 nm), and the CARS and Boruta algorithms also selected a few visible wavelengths located in the range of 480.0–510.0 *** models can achieve a more satisfactory prediction of the SOM content, and the RF model had better accuracy than the PLSR *** SOM content prediction model established by the Boruta algorithm combined with the RF model performed best with 23 variables, and the model achieved a coefficient of determination (R²) of 0.78 and a residual prediction deviation (RPD) of *** Boruta algorithm effectively removed redundant information and optimized the optimal wavelengths to improve the prediction accuracy of the estimated SOM ***, combining vis-NIR spectroscopy with machine learning to estimate SOM content is an important method to improve th
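The correlation-based preselection step and the two accuracy measures named in the abstract (R² and RPD) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not the authors' code: the data are synthetic, the function names `preselect_wavelengths` and `r2_and_rpd` are hypothetical, Pearson correlation is assumed as the correlation measure, RPD is taken as the sample standard deviation of the observed values divided by the RMSE of prediction (conventions differ between studies), and an ordinary least-squares fit stands in for the PLSR and RF models. The metrics are computed on the fitting data only to keep the sketch short; a real study would use a held-out validation set.

```python
import numpy as np
from scipy import stats


def preselect_wavelengths(spectra, som, alpha=0.01):
    """Return indices of bands whose Pearson correlation with SOM has p < alpha."""
    keep = []
    for j in range(spectra.shape[1]):
        _, p_value = stats.pearsonr(spectra[:, j], som)
        if p_value < alpha:
            keep.append(j)
    return np.array(keep, dtype=int)


def r2_and_rpd(observed, predicted):
    """Coefficient of determination and residual prediction deviation.

    Assumed definition: RPD = sample SD of observed values / RMSE of prediction.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = observed - predicted
    rmse = np.sqrt(np.mean(residual ** 2))
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((observed - observed.mean()) ** 2)
    rpd = np.std(observed, ddof=1) / rmse
    return r2, rpd


# Synthetic stand-in data (NOT the study's measurements): 110 samples, 50 bands.
rng = np.random.default_rng(0)
n_samples, n_bands = 110, 50
spectra = rng.normal(size=(n_samples, n_bands))
# Hypothetical ground truth: only bands 3 and 17 carry SOM signal.
som = 2.0 * spectra[:, 3] - 1.5 * spectra[:, 17] + rng.normal(scale=0.5, size=n_samples)

selected = preselect_wavelengths(spectra, som, alpha=0.01)

# Least-squares fit on the preselected bands, a simple placeholder for PLSR/RF.
design = np.column_stack([np.ones(n_samples), spectra[:, selected]])
coef, *_ = np.linalg.lstsq(design, som, rcond=None)
r2, rpd = r2_and_rpd(som, design @ coef)
print("selected bands:", selected)
print("R2 = %.2f, RPD = %.2f" % (r2, rpd))
```

In a fuller pipeline, the preselected bands would then be passed to a variable selection method such as SPA, CARS, or Boruta before model fitting; those steps are omitted here because the abstract does not give their settings.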