Research on three-step accelerated gradient algorithm in deep learning
Author affiliation: KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, People's Republic of China
Publication: Statistical Theory and Related Fields (Chinese title: 统计理论及其应用)
Year/Volume/Issue: 2022, Vol. 6, No. 1
Pages: 40-57
Subject classification: 02 [Economics]; 0202 [Applied Economics]; 020208 [Statistics]; 07 [Science]; 0714 [Statistics (Science or Economics degrees)]; 070103 [Probability Theory and Mathematical Statistics]; 0701 [Mathematics]; 0812 [Computer Science and Technology (Engineering or Science degrees)]
Funding: This work was supported by the National Natural Science Foundation of China (11271136, 81530086), the Program of Shanghai Subject Chief Scientist (14XD1401600), and the 111 Project of China (No. B14019).
Keywords: accelerated algorithm; backpropagation; deep learning; learning rate; momentum; stochastic gradient descent
Abstract: The gradient descent (GD) algorithm is a widely used optimisation method for training machine learning and deep learning models. In this paper, based on GD, Polyak's momentum (PM), and Nesterov accelerated gradient (NAG), we give the convergence of these algorithms from an initial value to the optimal value of an objective function for a simple quadratic function. Based on the convergence property of the quadratic function, the two sister sequences of NAG's iteration, and parallel tangent methods in neural networks, the three-step accelerated gradient (TAG) algorithm is proposed, which has three sequences rather than two sister sequences. To illustrate the performance of this algorithm, we compare the proposed algorithm with the other three algorithms on a quadratic function, high-dimensional quadratic functions, and a non-quadratic function. We then combine the TAG algorithm with the backpropagation algorithm and the stochastic gradient descent algorithm in deep learning. To facilitate the proposed algorithms, we rewrite the R package 'neuralnet' and extend it to 'supneuralnet'; all deep learning algorithms in this paper are included in 'supneuralnet'. Finally, we show that our algorithms are superior to the other algorithms in four case studies.
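As context for the abstract, the three baseline methods it references (GD, Polyak's momentum, and NAG with its two sister sequences) have standard textbook update rules; below is a minimal R sketch of them on a simple quadratic, not the authors' 'supneuralnet' code. The names grad and run, the step size eta, the momentum mu, and the test problem A, b are all illustrative assumptions; the TAG three-sequence recursion itself is defined in the paper and is not reconstructed here.

    # Minimal illustrative sketch (assumed names and parameters, not the paper's code):
    # GD, Polyak's momentum (PM), and NAG on f(x) = 0.5 * t(x) %*% A %*% x - t(b) %*% x.
    grad <- function(A, b, x) as.vector(A %*% x - b)   # gradient of the quadratic

    run <- function(method, A, b, x0, eta = 0.05, mu = 0.9, iters = 200) {
      x <- x0; x_prev <- x0; y <- x0
      for (k in seq_len(iters)) {
        if (method == "GD") {                     # plain gradient descent
          x <- x - eta * grad(A, b, x)
        } else if (method == "PM") {              # heavy-ball momentum
          x_new <- x - eta * grad(A, b, x) + mu * (x - x_prev)
          x_prev <- x; x <- x_new
        } else if (method == "NAG") {             # two sister sequences x and y
          x_new <- y - eta * grad(A, b, y)        # gradient step from the look-ahead point
          y <- x_new + mu * (x_new - x)           # momentum extrapolation
          x <- x_new
        }
      }
      x
    }

    A <- diag(c(1, 10)); b <- c(1, 1)             # ill-conditioned 2-d quadratic
    run("NAG", A, b, x0 = c(5, 5))                # approaches solve(A, b) = (1, 0.1)

On such ill-conditioned quadratics, with suitably tuned eta and mu the two momentum methods typically approach solve(A, b) in fewer iterations than plain GD; this is the kind of comparison the abstract describes before introducing TAG's third sequence.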