Indoor dynamic SLAM based on geometric constraints and target detection
Abstract: Aiming at the problems of low localization accuracy and poor mapping quality that visual simultaneous localization and mapping (SLAM) faces in indoor dynamic environments, an indoor dynamic SLAM method based on geometric constraints and target detection is proposed. A target detection network is used to obtain semantic information, and a method for handling missed detections of moving objects is proposed. Based on prior knowledge, an information determination method is proposed to accurately identify dynamic regions. Dynamic points are removed by combining geometric constraints with deep learning, and the remaining static points are used to estimate the camera pose. A static map that supports loop closure is built from the stored information. Experiments on the TUM dataset show that localization accuracy improves by up to 97.5% over ORB-SLAM2, and that the method outperforms other dynamic SLAM systems. Experiments in a real indoor environment show that the constructed static map is more accurate; the method effectively improves the localization accuracy and mapping quality of indoor dynamic SLAM.
Keywords:
- simultaneous localization and mapping (SLAM)
- dynamic environment
- target detection
- geometric constraints
- static map
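To make the per-frame flow described in the abstract concrete, below is a minimal, runnable Python sketch: detect objects, decide which regions are dynamic, cull feature points inside them, and keep the rest for pose estimation. Every name, the box format, and the culling rule are illustrative assumptions, not the authors' implementation.

```python
# Hedged skeleton of the per-frame flow from the abstract:
# detect -> determine dynamic regions -> cull dynamic points -> track on the rest.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    box: tuple    # (x0, y0, x1, y1) in pixels; format is an assumption
    moving: bool  # verdict of the information-determination step

def cull_dynamic(keypoints, detections):
    """Keep only keypoints that fall outside every dynamic region."""
    dyn = [d.box for d in detections if d.label == "person" or d.moving]
    def inside(p, b):
        return b[0] <= p[0] < b[2] and b[1] <= p[1] < b[3]
    return [p for p in keypoints if not any(inside(p, b) for b in dyn)]

kps = [(50, 60), (200, 200), (620, 420)]
dets = [Detection("person", (150, 100, 400, 460), True),   # always dynamic
        Detection("chair", (500, 300, 640, 480), False)]   # static unless moving
print(cull_dynamic(kps, dets))   # [(50, 60), (620, 420)]
```

The surviving points would then feed the usual pose-estimation and mapping steps; dynamic boxes are stored so the mapper can also exclude them.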
Table 1 Classification of indoor objects

| Category | Examples | State |
| --- | --- | --- |
| Moving | People, pets | Active motion |
| Relatively static | Chairs | Passive motion |
| Static | Cabinets, refrigerators | Absolutely static |
| Static | Books, cups | Passive motion |
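Table 1's categories act as the prior knowledge behind the information determination step. Below is a minimal sketch of how such a prior could be encoded, assuming COCO-style detector labels; the category names and the fallback rule for unknown labels are illustrative assumptions, not the paper's exact design.

```python
# Hedged sketch: a prior-knowledge table like Table 1, mapping detector
# class labels to motion categories.
MOTION_PRIOR = {
    "person": "active",        # moving objects: always treated as dynamic
    "cat":    "active",
    "dog":    "active",
    "chair":  "passive",       # relatively static: dynamic only while moved
    "book":   "passive",
    "cup":    "passive",
    "refrigerator": "static",  # absolutely static: never dynamic
}

def is_dynamic(label: str, observed_moving: bool) -> bool:
    """Decide whether a detected region should be treated as dynamic."""
    prior = MOTION_PRIOR.get(label, "passive")  # unknown label -> be cautious
    if prior == "active":
        return True                # e.g. people are always culled
    if prior == "passive":
        return observed_moving     # e.g. a chair only while it is pushed
    return False                   # absolutely static furniture

if __name__ == "__main__":
    print(is_dynamic("person", observed_moving=False))        # True
    print(is_dynamic("chair", observed_moving=True))          # True
    print(is_dynamic("refrigerator", observed_moving=True))   # False
```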
Table 2 Dynamic point removal

| Method | Principle | Dynamic points |
| --- | --- | --- |
| Deep learning | Target detection + information determination | People, moving chairs |
| Geometric constraints | Epipolar constraint | Cups being picked up |
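The epipolar-constraint row of Table 2 is the classic two-view check: for a matched pair, the second point should lie on the epipolar line induced by the first, so a large point-to-line distance flags the match as dynamic. A self-contained sketch with synthetic correspondences follows; the TUM-like intrinsics, the 2-pixel threshold, and the injected offsets are assumptions for illustration.

```python
# Hedged sketch of the epipolar-constraint test: matches far from their
# epipolar line are treated as dynamic points.
import cv2
import numpy as np

def epipolar_distances(pts1, pts2, F):
    """Distance of each pts2 match to the epipolar line of pts1 under F."""
    pts1_h = cv2.convertPointsToHomogeneous(pts1).reshape(-1, 3)
    lines = (F @ pts1_h.T).T                      # epipolar lines in image 2
    num = np.abs(np.sum(lines * np.c_[pts2, np.ones(len(pts2))], axis=1))
    den = np.hypot(lines[:, 0], lines[:, 1])
    return num / den

rng = np.random.default_rng(0)
K = np.array([[525.0, 0, 319.5], [0, 525.0, 239.5], [0, 0, 1]])  # TUM-like intrinsics
X = rng.uniform([-1, -1, 2], [1, 1, 4], size=(60, 3))            # static 3-D points

# Two camera poses: identity and a small baseline to the right.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])

def project(P, X):
    x = (P @ np.c_[X, np.ones(len(X))].T).T
    return (x[:, :2] / x[:, 2:]).astype(np.float32)

pts1, pts2 = project(P1, X), project(P2, X)
pts2[:5] += rng.uniform(5, 15, size=(5, 2)).astype(np.float32)   # 5 "moving" points

F, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
dynamic = epipolar_distances(pts1, pts2, F) > 2.0                # pixel threshold
print("flagged as dynamic:", np.flatnonzero(dynamic))            # expect 0..4
```

Because RANSAC estimates F from the consistent majority, the check catches movers the detector has no class for, such as a cup being picked up.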
Table 3 Static map construction

| Method | Information | Environment |
| --- | --- | --- |
| Traditional methods | Geometric information | Static |
| Proposed method | Geometric + semantic information | Static + dynamic |
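One common way to realize Table 3's combination of geometric and semantic information is to mask dynamic regions out of each depth frame before it is fused into the map, so moving objects leave no traces. A minimal sketch; the box format and the zero-depth "no measurement" convention are assumptions, not the authors' exact mechanism.

```python
# Hedged sketch: remove dynamic regions from a depth frame before mapping.
import numpy as np

def mask_dynamic(depth, dynamic_boxes):
    """Zero out depth pixels inside dynamic bounding boxes (x0, y0, x1, y1)."""
    out = depth.copy()
    for x0, y0, x1, y1 in dynamic_boxes:
        out[y0:y1, x0:x1] = 0.0   # zero depth = no measurement, skipped by mapper
    return out

depth = np.full((480, 640), 2.5, dtype=np.float32)          # fake 2.5 m wall
static_depth = mask_dynamic(depth, [(100, 50, 300, 400)])   # a person's box
print("pixels kept:", int((static_depth > 0).sum()), "of", depth.size)
```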
Table 4 Localization accuracy of the proposed method and ORB-SLAM2

| Sequence | ORB-SLAM2/m | Proposed/m | Improvement/% |
| --- | --- | --- | --- |
| fr3/walking_xyz | 0.756 | 0.019 | 97.5 |
| fr3/walking_rpy | 0.789 | 0.058 | 92.3 |
| fr3/walking_static | 0.405 | 0.010 | 97.5 |
| fr3/walking_halfsphere | 0.589 | 0.045 | 92.4 |
| fr3/sitting_xyz | 0.009 | 0.023 | - |
| fr3/sitting_rpy | 0.020 | 0.019 | - |
| fr3/sitting_static | 0.009 | 0.007 | - |
| fr3/sitting_halfsphere | 0.041 | 0.028 | - |
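The Improvement/% column in Table 4 is the relative error reduction, (e_ORB − e_ours)/e_ORB × 100, where the errors are assumed to be absolute trajectory error (ATE) RMSE, the TUM benchmark's usual metric. A quick check against three of the walking sequences reproduces the reported figures:

```python
# Worked check of Table 4's Improvement/% column (values from the table).
ate = {  # sequence: (ORB-SLAM2 ATE/m, proposed ATE/m)
    "fr3/walking_xyz":        (0.756, 0.019),
    "fr3/walking_static":     (0.405, 0.010),
    "fr3/walking_halfsphere": (0.589, 0.045),
}
for seq, (orb, ours) in ate.items():
    improvement = (orb - ours) / orb * 100.0
    print(f"{seq}: {improvement:.1f}%")   # 97.5, 97.5, 92.4
```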
Table 5 Performance of the proposed method and DS-SLAM

| Sequence | DS-SLAM localization/m | DS-SLAM time/s | Proposed localization/m | Proposed time/s |
| --- | --- | --- | --- | --- |
| fr3/walking_xyz | 0.064 | 97.6 | 0.019 | 29.5 |
| fr3/walking_rpy | 0.491 | 103.4 | 0.058 | 29.6 |
| fr3/walking_static | 0.008 | 81.6 | 0.010 | 25.3 |
| fr3/walking_halfsphere | 0.031 | 123.2 | 0.045 | 37.4 |