Progressive framework for deep neural networks: from linear to non-linear
Author affiliations: School of Information and Communication Engineering, Beijing University of Posts and Telecommunications; Beijing Key Laboratory of Network System and Network Culture, Beijing University of Posts and Telecommunications
Publication: The Journal of China Universities of Posts and Telecommunications
Year/Volume/Issue: 2016, Vol. 23, No. 6
Pages: 1-7
Subject classification: 12 [Management] 1201 [Management - Management Science and Engineering (degrees in Management or Engineering)] 081104 [Engineering - Pattern Recognition and Intelligent Systems] 08 [Engineering] 0835 [Engineering - Software Engineering] 0811 [Engineering - Control Science and Engineering] 0812 [Engineering - Computer Science and Technology (degrees in Engineering or Science)]
Funding: supported by the National Natural Science Foundation of China (61471049, 61372169, 61532018) and the Postgraduate Innovation Fund of SICE, BUPT, 2015
Keywords: framework; neural network; DCCA; semantic; RankNet
Abstract: We propose a novel progressive framework for optimizing deep neural networks. The idea is to combine the stability of linear methods with the ability of deep learning methods to learn complex and abstract internal representations. We insert a linear loss layer between the input layer and the first hidden non-linear layer of a traditional deep model. The optimization objective is a weighted sum of the linear loss of the newly added layer and the non-linear loss of the last output layer. We modify the model structure of deep canonical correlation analysis (DCCA) by adding a third semantic view to regularize text and image pairs, and embed this structure into our framework for cross-modal retrieval tasks such as text-to-image and image-to-text search. Experimental results show that the modified model outperforms similar state-of-the-art approaches on the NUS-WIDE dataset from the National University of Singapore. To validate the generalization ability of our framework, we apply it to RankNet, a ranking model optimized by stochastic gradient descent. Our method outperforms RankNet and converges more quickly, indicating that our progressive framework can provide a better and faster solution for deep neural networks.
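To make the weighted objective concrete, below is a minimal PyTorch sketch, assuming an MSE loss for both branches and a fixed trade-off weight alpha; the paper may use different losses, architectures, and a different weighting schedule, and all names here (ProgressiveNet, progressive_loss, alpha) are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProgressiveNet(nn.Module):
    """Deep model with an auxiliary linear branch inserted before the
    first non-linearity, loosely following the abstract's description."""

    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.first = nn.Linear(in_dim, hidden_dim)         # shared affine layer
        self.linear_head = nn.Linear(hidden_dim, out_dim)  # linear-loss branch
        self.deep = nn.Sequential(                         # non-linear path
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):
        h = self.first(x)  # output is still linear in the input
        return self.linear_head(h), self.deep(h)

def progressive_loss(linear_out, deep_out, target, alpha=0.5):
    """Weighted sum of the linear branch's loss and the final non-linear
    loss; alpha is a hypothetical trade-off weight, not from the paper."""
    return alpha * F.mse_loss(linear_out, target) + \
           (1.0 - alpha) * F.mse_loss(deep_out, target)

# Usage sketch: one stochastic gradient descent step on random data.
model = ProgressiveNet(in_dim=128, hidden_dim=64, out_dim=10)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
x, y = torch.randn(32, 128), torch.randn(32, 10)
lin_out, deep_out = model(x)
loss = progressive_loss(lin_out, deep_out, y)
loss.backward()
opt.step()
```

Because the first affine layer is shared by both branches, gradients from the convex linear loss flow into it directly, which is one plausible reading of how the framework stabilizes early training while the non-linear path learns richer representations.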