An action semantics-based method for keyframe extraction in production line videos

  • Abstract: Production line surveillance videos contain a large amount of redundancy, and existing keyframe extraction methods suffer from missing action semantics, an uncontrollable number of extracted frames, and poor real-time performance. To address these problems, the action characteristics of a single actor in the video are analyzed in depth, and a keyframe extraction method based on action semantics is proposed. First, ORB (oriented FAST and rotated BRIEF) local features are extracted from the video and used to train an action semantic dictionary; the local features are then mapped into the action semantic space to enrich the semantic information of the feature representation. Second, VLAD (vector of locally aggregated descriptors) encoding is constructed to generate global video features. Finally, K-means clustering is applied so that the extraction results are precise and controllable. Experimental results show that the proposed method still retains complete action semantics at a compression ratio as low as 3.33%, improves the F1 score by 52.16% over the baseline method, and delivers strong overall performance in comparison with several other methods. The method extracts keyframes quickly, uniformly, and stably, shows strong adaptability and robustness to different compression ratios, and offers low redundancy and high restoration fidelity.
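The three stages named in the abstract (dictionary training, VLAD encoding, K-means selection) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: each frame's local descriptors are already available as lists of floats (in practice they might come from an ORB extractor such as OpenCV's), the dictionary is learned with plain Euclidean K-means even though ORB descriptors are binary, and the keyframe of each cluster is taken to be the frame nearest its center. The function names `kmeans`, `vlad`, and `extract_keyframes` are hypothetical.

```python
import math
import random


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: sqdist(p, centers[c]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, [nearest(p, centers) for p in points]


def vlad(descriptors, codebook):
    """VLAD: sum residuals to the nearest codeword, flatten, L2-normalize."""
    dim = len(codebook[0])
    acc = [[0.0] * dim for _ in codebook]
    for d in descriptors:
        j = nearest(d, codebook)
        for t in range(dim):
            acc[j][t] += d[t] - codebook[j][t]
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


def extract_keyframes(frame_descriptors, n_words, n_keyframes, seed=0):
    """Return sorted frame indices chosen as keyframes (at most n_keyframes)."""
    # Stage 1: learn the dictionary from all local descriptors.
    all_desc = [d for frame in frame_descriptors for d in frame]
    codebook, _ = kmeans(all_desc, n_words, seed=seed)
    # Stage 2: one global VLAD vector per frame.
    feats = [vlad(frame, codebook) for frame in frame_descriptors]
    # Stage 3: cluster frames; keep the frame closest to each center.
    centers, labels = kmeans(feats, n_keyframes, seed=seed)
    keyframes = []
    for c in range(n_keyframes):
        members = [i for i, l in enumerate(labels) if l == c]
        if members:
            keyframes.append(min(members, key=lambda i: sqdist(feats[i], centers[c])))
    return sorted(keyframes)
```

Because the number of clusters is chosen by the caller, the number of extracted frames is bounded by `n_keyframes`, which is one way to read the abstract's claim of controllable extraction. If the compression ratio is taken to mean keyframes divided by total frames (an assumption, since the abstract does not define it), a ratio of 3.33% corresponds to roughly one keyframe per 30 frames.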

     
