Human Action Recognition Based on Pose Spatio-Temporal Features

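Journal of Computer-Aided Design & Computer Graphics, Vol. 30, No. 9, Sep. 2018

Received: 2017-08-21; revised: 2017-12-07. Funding: Equipment Pre-Research Field Fund (6140001010216ZK24001). Zheng Xiao (born 1992), female, M.S., assistant engineer, CCF member; main research interests: computer vision, object detection and recognition. Peng Xiaodong (born 1981), male, Ph.D., research professor, doctoral supervisor, CCF member; main research interests: computer vision, augmented reality, and situational awareness. Wang Jiaxuan (born 1994), female, master's student; main research interests: machine learning and computer vision.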

Zheng Xiao, Peng Xiaodong, Wang Jiaxuan

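(Key Laboratory of Electronic and Information Technology for Space System, National Space Science Center, Chinese Academy of Sciences, Beijing 101499)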

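Abstract: To obtain human action and motion information from video efficiently and accurately, an action recognition method based on spatio-temporal features of human pose is proposed. First, given the human joint positions in each frame of the video, joint information is extracted to describe pose changes: joint position relations are extracted from each frame in the spatial dimension, and the changes in these spatial relations are computed in the temporal dimension; together they form the pose spatio-temporal feature descriptors. Next, a Fisher vector model encodes each type of feature descriptor separately, yielding Fisher vectors of fixed dimension. Finally, the Fisher vectors of the different types are fused by weighting and then classified. Experimental results show that the method effectively recognizes complex human actions in video and improves the recognition rate.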

Keywords: action recognition; pose spatio-temporal features; Fisher vector; weighted fusion

CLC number: TP391.41    DOI: 10.3724/SP.J.1089.2018.16848

Human Action Recognition Based on Pose Spatio-Temporal Features

Zheng Xiao, Peng Xiaodong, and Wang Jiaxuan

(Key Laboratory of Electronic and Information Technology for Space System, National Space Science Center, Chinese Academy of Sciences, Beijing 101499)

Abstract: To extract human motion information efficiently and improve the accuracy of action recognition from videos, an approach to action recognition based on human pose spatio-temporal features is proposed. First, given the joint positions of the human body in each frame of the video, pose information is extracted with handcrafted features. Specifically, the relative positions of the joints are calculated in the spatial dimension, and the changes in those relations are calculated in the temporal dimension. Together they constitute the human pose spatio-temporal feature descriptors. Then the Fisher vector model is used to compute a fixed-dimension Fisher vector for each type of descriptor separately. Finally, the Fisher vectors are fused by weighting and classified. Experimental results show that the proposed algorithm effectively improves action recognition performance.

Key words: action recognition; pose spatiotemporal features; Fisher vector; weighted fusion
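The pipeline summarized in the abstract (per-frame joint relations, their change over time, Fisher vector encoding, and weighted fusion) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the use of pairwise joint offsets as the spatial relation, the fusion by scaling and concatenating L2-normalized vectors, and all function names (`spatial_descriptor`, `temporal_descriptor`, `fisher_vector`, `weighted_fusion`) are assumptions made for illustration, and the diagonal-covariance GMM parameters are taken as given rather than fitted.

```python
import math
from itertools import combinations


def spatial_descriptor(joints):
    """Offsets (dx, dy) between every pair of joints in one frame.

    Assumption: pairwise offsets stand in for the paper's joint position relations.
    """
    desc = []
    for (xa, ya), (xb, yb) in combinations(joints, 2):
        desc.extend([xb - xa, yb - ya])
    return desc


def temporal_descriptor(prev_joints, next_joints):
    """Change of the spatial descriptor between two frames."""
    prev_d = spatial_descriptor(prev_joints)
    next_d = spatial_descriptor(next_joints)
    return [n - p for p, n in zip(prev_d, next_d)]


def fisher_vector(descriptors, weights, means, variances):
    """Fisher vector (gradients w.r.t. means and variances) under a diagonal GMM.

    Output length is 2 * K * D, independent of the number of descriptors.
    """
    K, D, N = len(weights), len(means[0]), len(descriptors)
    g_mu = [[0.0] * D for _ in range(K)]
    g_var = [[0.0] * D for _ in range(K)]
    for x in descriptors:
        # Log of each weighted Gaussian density, then a stable softmax
        # to obtain the posterior probability of each component.
        logp = []
        for k in range(K):
            s = math.log(weights[k])
            for d in range(D):
                var = variances[k][d]
                s -= 0.5 * (math.log(2 * math.pi * var) + (x[d] - means[k][d]) ** 2 / var)
            logp.append(s)
        m = max(logp)
        post = [math.exp(v - m) for v in logp]
        z = sum(post)
        post = [p / z for p in post]
        for k in range(K):
            for d in range(D):
                u = (x[d] - means[k][d]) / math.sqrt(variances[k][d])
                g_mu[k][d] += post[k] * u
                g_var[k][d] += post[k] * (u * u - 1.0)
    fv = []
    for k in range(K):
        fv.extend(g / (N * math.sqrt(weights[k])) for g in g_mu[k])
    for k in range(K):
        fv.extend(g / (N * math.sqrt(2 * weights[k])) for g in g_var[k])
    return fv


def l2_normalize(v):
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 0 else list(v)


def weighted_fusion(fvs, fusion_weights):
    """Assumed fusion rule: scale each normalized Fisher vector and concatenate."""
    fused = []
    for fv, w in zip(fvs, fusion_weights):
        fused.extend(w * a for a in l2_normalize(fv))
    return fused


if __name__ == "__main__":
    # Toy input: 3 frames, 3 joints each, as (x, y) coordinates.
    frames = [
        [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)],
        [(0.0, 0.0), (1.0, 0.5), (0.0, 2.0)],
        [(0.0, 0.0), (1.0, 1.0), (0.5, 2.0)],
    ]
    spatial = [spatial_descriptor(f) for f in frames]
    temporal = [temporal_descriptor(a, b) for a, b in zip(frames, frames[1:])]
    # Hypothetical GMM with K=2 components over D=6 dimensions.
    w, mu, var = [0.5, 0.5], [[0.0] * 6, [1.0] * 6], [[1.0] * 6, [1.0] * 6]
    fused = weighted_fusion(
        [fisher_vector(spatial, w, mu, var), fisher_vector(temporal, w, mu, var)],
        [0.6, 0.4],
    )
    print(len(fused))
```

Because the Fisher vector has length 2KD for K Gaussian components and D-dimensional descriptors, videos with different frame counts map to vectors of the same size, which is what allows a standard classifier to be applied after fusion.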

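Human action recognition is an important research topic in computer vision. Given a video clip containing a single human motion, the task is to infer the action label of the person in the video, such as walking, running, or jumping. Action recognition has broad application prospects in video surveillance, virtual reality, human-computer interaction, and video retrieval. Because it is easily affected by background changes, illumination changes, moving cameras, shooting viewpoint, occlusion, and clothing [1], research on action recognition remains challenging. According to how action features are extracted, action recognition methods fall into two categories: handcrafted features and deep-learned features. (1) Handcrafted features include those based on local feature descriptors and those based on pose feature descriptors. In a video, local features describe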
