View-invariant human action recognition via robust locally adaptive multi-view learning
Author affiliation: Institute of Artificial Intelligence, College of Computer Science and Technology, Zhejiang University
Published in: Frontiers of Information Technology & Electronic Engineering
Year/Volume/Issue: 2015, Vol. 16, No. 11
Pages: 917-929
Subject classification: 08 [Engineering]; 080203 [Engineering - Mechanical Design and Theory]; 0802 [Engineering - Mechanical Engineering]
Funding: Supported by the National Natural Science Foundation of China (No. 61572431), the National Key Technology R&D Program (No. 2013BAH59F00), the Zhejiang Provincial Natural Science Foundation of China (No. LY13F020001), and the Zhejiang Province Public Technology Applied Research Projects, China (No. 2014C33090)
Keywords: View-invariant; Action recognition; Multi-view learning; L1-norm; Local learning
Abstract: Human action recognition is currently one of the most active research areas in computer vision. It has been widely used in many applications, such as intelligent surveillance, perceptual interfaces, and content-based video retrieval. However, some extrinsic factors hinder the development of action recognition; e.g., human actions may be observed from arbitrary camera viewpoints in realistic scenes. Thus, view-invariant analysis becomes important for action recognition algorithms, and a number of researchers have paid much attention to this issue. In this paper, we present a multi-view learning approach to recognize human actions from different views. Since most existing multi-view learning algorithms suffer from a lack of data adaptiveness in the nearest-neighbor graph construction procedure, a robust locally adaptive multi-view learning algorithm based on learning multiple local L1-graphs is proposed. Moreover, an efficient iterative optimization method is proposed to solve the objective function. Experiments on three public view-invariant action recognition datasets, i.e., ViHASi, IXMAS, and WVU, demonstrate the data adaptiveness, effectiveness, and efficiency of our algorithm. More importantly, when the feature dimension is correctly selected (i.e., >60), the proposed algorithm stably outperforms state-of-the-art counterparts and obtains about 6% improvement in recognition accuracy on the three datasets.
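Illustration: to make the local L1-graph idea mentioned in the abstract concrete, the short Python sketch below sparsely codes each sample over its k nearest neighbors and uses the absolute sparse coefficients as graph edge weights. This is a minimal single-view approximation under assumed settings, not the authors' robust multi-view objective or their iterative solver; the function name local_l1_graph and the parameters k and lam are hypothetical.

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.neighbors import NearestNeighbors

    def local_l1_graph(X, k=10, lam=0.1):
        # X: (n_samples, n_features) action descriptors from one view (assumed input).
        # Each sample is sparsely reconstructed from its k nearest neighbors;
        # the absolute sparse codes become edge weights of a local L1-graph.
        n = X.shape[0]
        W = np.zeros((n, n))
        nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
        _, idx = nbrs.kneighbors(X)          # first neighbor is the sample itself
        for i in range(n):
            neighbors = idx[i, 1:]
            D = X[neighbors].T               # local dictionary, shape (d, k)
            coder = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
            coder.fit(D, X[i])               # min ||x_i - D a||^2 + lam ||a||_1
            W[i, neighbors] = np.abs(coder.coef_)
        return (W + W.T) / 2                 # symmetrize for Laplacian-style embedding

Restricting the sparse coding to each sample's neighborhood is what makes the graph locally adaptive: the sparsity pattern, and hence the number of effective neighbors, is determined per sample by the data rather than fixed globally.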