Dynamic feature point removal method with joint target detection and depth information
-
Abstract: In dynamic environments, the localization accuracy and robustness of visual localization systems are easily degraded by dynamic feature points. To address this problem, a dynamic feature point removal method that combines target detection with depth information is proposed. The YOLOv7 target detection network is introduced to quickly obtain the category and position of the targets in the current image frame, and a coordinate attention (CA) mechanism is added to optimize the deep learning model and improve the detection accuracy of the network. In addition, a dynamic feature point optimization strategy based on depth information and epipolar geometry constraints is proposed. It effectively removes dynamic feature points while retaining as many static points as possible, thereby reducing the impact of dynamic points on the localization accuracy and robustness of the system. Experiments are carried out on the public TUM dataset. The results show that, compared with ORB-SLAM2 (oriented FAST and rotated BRIEF SLAM), the proposed scheme has clear advantages in localization accuracy and robustness. Compared with DynaSLAM (dynamic simultaneous localization and mapping), the localization accuracy is essentially on par, while the running speed is significantly improved.
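The abstract does not spell out the decision rule, so the following is only a minimal sketch of how a detection box, depth, and the epipolar constraint might be combined. All names (`epipolar_distance`, `is_dynamic`), the thresholds `depth_tol` and `epi_tol`, and the rule itself are assumptions made for illustration, not the authors' implementation. The idea shown: a feature point inside the box of a potentially dynamic object is removed if its depth matches the object's representative depth, while a background point inside the box is kept as long as it stays close to its epipolar line.

```python
import numpy as np

def epipolar_distance(F, p_prev, p_curr):
    """Pixel distance from p_curr to the epipolar line induced by p_prev."""
    x1 = np.array([p_prev[0], p_prev[1], 1.0])
    x2 = np.array([p_curr[0], p_curr[1], 1.0])
    line = F @ x1  # epipolar line (a, b, c) in the current frame
    return abs(x2 @ line) / np.hypot(line[0], line[1])

def in_box(pt, box):
    """True if pt lies inside box = (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return x_min <= pt[0] <= x_max and y_min <= pt[1] <= y_max

def is_dynamic(p_prev, p_curr, depth, box, box_depth, F,
               depth_tol=0.3, epi_tol=1.0):
    """Hypothetical rule (an assumption, not the paper's exact strategy)."""
    if not in_box(p_curr, box):
        return False  # outside any dynamic-object box: keep the point
    if abs(depth - box_depth) <= depth_tol:
        return True   # same depth as the object: treat as foreground
    # Background point inside the box: remove only if it violates
    # the epipolar constraint.
    return bool(epipolar_distance(F, p_prev, p_curr) > epi_tol)

# Toy fundamental matrix: pure x-translation with identity intrinsics,
# so epipolar lines are horizontal.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
box = (0, 0, 100, 100)   # detection box of a person (made-up values)
box_depth = 1.5          # representative depth of the box in metres (made up)

on_person = is_dynamic((10, 20), (15, 20), 1.5, box, box_depth, F)
background = is_dynamic((10, 20), (15, 20), 4.0, box, box_depth, F)
print(on_person, background)
```

In practice the fundamental matrix would be estimated from matched features (for example with RANSAC), and the thresholds would need tuning per sensor; both are outside what the abstract specifies.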
-
Table 1 Network performance before and after the improvement
Model                       Precision  Recall  mAP
Original YOLOv7             0.955      0.926   0.958
YOLOv7 with CA attention    0.981      0.945   0.982

Table 2 ATE RMSE of the different schemes
Sequence                ORB-SLAM2/m  DynaSLAM/m  YOLOv7 only/m  With CA/m  CA + dynamic point filtering/m  Improvement/%
fr3/walking_xyz         0.703        0.015       0.025          0.024      0.019                           97.3
fr3/walking_rpy         0.590        0.035       0.048          0.046      0.036                           94.0
fr3/walking_halfsphere  0.423        0.025       0.038          0.039      0.033                           92.2

Table 3 RPE RMSE of the different schemes
Sequence                ORB-SLAM2/m  DynaSLAM/m  YOLOv7 only/m  With CA/m  CA + dynamic point filtering/m  Improvement/%
fr3/walking_xyz         0.457        -           0.023          0.022      0.021                           95.4
fr3/walking_rpy         0.419        -           0.047          0.045      0.035                           91.7
fr3/walking_halfsphere  0.412        -           0.039          0.041      0.032                           92.2

Table 4 Running time of the different schemes on the fr3 dataset (s/frame)
Sequence                Statistic  ORB-SLAM2  DynaSLAM  Proposed scheme
fr3/walking_xyz         Median     0.0646     -         0.2981
                        Mean       0.0729     0.5       0.3020
fr3/walking_rpy         Median     0.0698     -         0.3059
                        Mean       0.0713     0.5       0.3098
fr3/walking_halfsphere  Median     0.0705     -         0.3108
                        Mean       0.0717     0.5       0.3162
-
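The improvement column in Tables 2 and 3 appears to be the relative RMSE reduction of the full scheme (CA plus dynamic point filtering) with respect to ORB-SLAM2. That reading is an inference from the printed numbers, not something the page states, and the helper name `improvement_percent` is hypothetical. A short check using the ATE values from Table 2:

```python
def improvement_percent(baseline_rmse, proposed_rmse):
    """Relative RMSE reduction with respect to the baseline, in percent."""
    return (baseline_rmse - proposed_rmse) / baseline_rmse * 100.0

# ATE RMSE in metres: (ORB-SLAM2, CA + dynamic point filtering), from Table 2.
ate = {
    "fr3/walking_xyz": (0.703, 0.019),
    "fr3/walking_rpy": (0.590, 0.036),
    "fr3/walking_halfsphere": (0.423, 0.033),
}

improvements = {seq: improvement_percent(b, p) for seq, (b, p) in ate.items()}
for seq, value in improvements.items():
    print(f"{seq}: {value:.1f} %")
```

From the rounded table entries this gives about 97.3, 93.9, and 92.2; the small gap to the printed 94.0 for fr3/walking_rpy is presumably due to the published RMSE values having been rounded before tabulation.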