Building extraction based on advanced attention gate U-Net
Abstract: Classic deep learning semantic segmentation networks extract buildings with low accuracy, blurred boundaries, and poor recognition of small targets. To address these problems, we propose an advanced attention gate U-Net (AA_U-Net) for building extraction. The network modifies the classic U-Net structure in three ways: VGG16 serves as the backbone feature extraction network, attention gate modules take part in the skip connections, and bilinear interpolation replaces deconvolution for upsampling. Using the Wuhan University building dataset (WHU building dataset, WHD), we compare the proposed network with several classic semantic segmentation networks and examine how each modified module affects extraction. The overall accuracy, intersection over union, precision, recall, and F1 score of the network are 98.78%, 89.71%, 93.30%, 95.89%, and 94.58%, respectively. All evaluation metrics exceed those of the classic semantic segmentation networks, and each modified module improves extraction accuracy, reducing unclear building outlines and the fragmentation of small buildings. The network can be used to accurately extract building information from high-resolution remote sensing images, which has guiding significance for urban planning, land use, production and daily life, and military reconnaissance.
Keywords:
- high-resolution remote sensing imagery
- deep learning
- semantic segmentation
- advanced attention gate U-Net
- building extraction
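Two of the modifications named in the abstract, bilinear upsampling and attention-gated skip connections, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes single-channel feature maps, scalar weights standing in for 1x1 convolutions, a corner-aligned sampling grid, and an additive gate in the general style of Attention U-Net. The function names and parameters (`bilinear_upsample_2x`, `attention_gate`, `w_x`, `w_g`, `psi`) are hypothetical.

```python
import math


def bilinear_upsample_2x(grid):
    """Upsample a 2D list of floats by 2x using bilinear interpolation.

    Assumption: corner-aligned sampling, so the four corner values of the
    input are preserved exactly in the output.
    """
    h, w = len(grid), len(grid[0])
    out_h, out_w = 2 * h, 2 * w
    out = []
    for i in range(out_h):
        # Map the output row back to a fractional source row.
        src_y = i * (h - 1) / (out_h - 1)
        y0 = int(math.floor(src_y))
        y1 = min(y0 + 1, h - 1)
        dy = src_y - y0
        row = []
        for j in range(out_w):
            src_x = j * (w - 1) / (out_w - 1)
            x0 = int(math.floor(src_x))
            x1 = min(x0 + 1, w - 1)
            dx = src_x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = grid[y0][x0] * (1 - dx) + grid[y0][x1] * dx
            bottom = grid[y1][x0] * (1 - dx) + grid[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


def attention_gate(skip, gate, w_x, w_g, bias, psi, psi_bias):
    """Scale each skip-connection value by an attention coefficient in (0, 1).

    Assumption: `skip` and `gate` already have the same spatial size, and the
    scalar weights are stand-ins for learned 1x1 convolutions.
    """
    out = []
    for skip_row, gate_row in zip(skip, gate):
        out_row = []
        for x, g in zip(skip_row, gate_row):
            q = max(0.0, w_x * x + w_g * g + bias)  # ReLU of the additive mix
            alpha = 1.0 / (1.0 + math.exp(-(psi * q + psi_bias)))  # sigmoid
            out_row.append(alpha * x)
        out.append(out_row)
    return out


if __name__ == "__main__":
    coarse = [[0.0, 2.0], [4.0, 6.0]]           # decoder feature map (2x2)
    skip = [[1.0] * 4 for _ in range(4)]         # encoder skip feature (4x4)
    gate = bilinear_upsample_2x(coarse)          # bring decoder map to 4x4
    gated = attention_gate(skip, gate, 1.0, 1.0, 0.0, 1.0, 0.0)
    for row in gated:
        print([round(v, 3) for v in row])
```

Because bilinear interpolation has no learnable parameters, using it in place of deconvolution removes weights from the decoder; the gate then decides how much of each skip feature is passed on.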
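Figure 3 Structural comparison of the U-Net encoder and the VGG16 backbone
Note: Boxes and colors have the same meaning as in Figure 2; (a) shows the encoder part of U-Net, and (b) shows the backbone feature extraction part of VGG16; red text marks where the feature-map channel counts of VGG16 and U-Net differ.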
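As a rough illustration of the kind of channel difference the figure highlights, the sketch below compares per-stage output channels of the two encoders. The figure itself is not reproduced here, so the numbers are an assumption taken from the standard published configurations of the original U-Net and VGG16, not from the figure; the constant and function names are hypothetical.

```python
# Output channels of each encoder stage (assumed standard configurations,
# not values read from Figure 3).
UNET_ENCODER_CHANNELS = [64, 128, 256, 512, 1024]
VGG16_BLOCK_CHANNELS = [64, 128, 256, 512, 512]


def channel_differences(channels_a, channels_b):
    """Return (stage, channels_a, channels_b) for every stage that differs."""
    return [
        (stage, a, b)
        for stage, (a, b) in enumerate(zip(channels_a, channels_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    for stage, unet_c, vgg_c in channel_differences(
        UNET_ENCODER_CHANNELS, VGG16_BLOCK_CHANNELS
    ):
        print(f"stage {stage}: U-Net {unet_c} channels, VGG16 {vgg_c} channels")
```

Under these assumed configurations, only the deepest stage differs, which is why a decoder attached to a VGG16 backbone has to expect a narrower bottleneck than the original U-Net.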
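Table 1 Comparison of segmentation accuracy across networks (%)

| Network | IoU | Precision | Recall | F1 | OA |
|---|---|---|---|---|---|
| U-Net | 87.41 | 91.46 | 95.18 | 93.28 | 98.47 |
| SegNet | 87.13 | 91.94 | 94.34 | 93.12 | 98.45 |
| FCN | 69.15 | 85.03 | 78.73 | 91.76 | 96.09 |
| DeepLabV3 | 68.89 | 82.38 | 80.79 | 81.58 | 95.94 |
| DeepLabV3+ | 79.31 | 88.27 | 88.65 | 88.46 | 97.43 |
| PSPNet | 79.80 | 85.13 | 92.72 | 88.76 | 97.39 |
| AA_U-Net | 89.71 | 93.30 | 95.89 | 94.58 | 98.78 |

Table 2 Comparison of extraction accuracy across backbone networks (%)

| Network | IoU | Precision | Recall | F1 | OA |
|---|---|---|---|---|---|
| ResNet50 | 85.17 | 89.98 | 94.09 | 91.99 | 98.18 |
| MobileNetV3 | 83.93 | 90.00 | 92.56 | 91.27 | 98.03 |
| VGG16 | 89.16 | 93.29 | 95.27 | 94.27 | 98.71 |

Table 3 Comparison of segmentation accuracy before and after the decoder improvement (%)

| Network state | IoU | Precision | Recall | F1 | OA |
|---|---|---|---|---|---|
| Before | 89.43 | 93.42 | 95.44 | 94.42 | 98.74 |
| After | 89.71 | 93.30 | 95.89 | 94.58 | 98.78 |

Table 4 Comparison of segmentation accuracy in the module ablation study

| Network | VGG | AG | BIU | IoU/% | Precision/% | Recall/% | F1/% | OA/% |
|---|---|---|---|---|---|---|---|---|
| U-Net | - | - | - | 87.41 | 91.46 | 95.18 | 93.28 | 98.47 |
| VGG | yes | - | - | 89.16 | 93.29 | 95.27 | 94.27 | 98.71 |
| VGG+AG | yes | yes | - | 89.43 | 93.42 | 95.44 | 94.42 | 98.74 |
| VGG+AG+BIU | yes | yes | yes | 89.71 | 93.30 | 95.89 | 94.58 | 98.78 |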
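The five metrics reported in the tables are conventionally computed from pixel-level confusion counts. The sketch below assumes the standard binary-segmentation definitions with "building" as the positive class; the source does not spell out its formulas, so treat this as an assumption. The function names are hypothetical.

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Compute IoU, precision, recall, F1 and overall accuracy (as fractions).

    tp/fp/fn/tn are pixel counts with 'building' as the positive class.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "iou": tp / (tp + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "oa": (tp + tn) / (tp + fp + fn + tn),
    }


def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall (any common scale)."""
    return 2 * precision * recall / (precision + recall)


def iou_from_precision_recall(precision, recall):
    """IoU from precision and recall given as fractions in (0, 1].

    Follows from 1/IoU = 1/precision + 1/recall - 1.
    """
    return precision * recall / (precision + recall - precision * recall)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(segmentation_metrics(tp=90, fp=10, fn=10, tn=890))
    # Consistency check against the AA_U-Net precision and recall.
    print(round(f1_from_precision_recall(93.30, 95.89), 2))
    print(round(100 * iou_from_precision_recall(0.9330, 0.9589), 2))
```

Since F1 and IoU are both determined by precision and recall, these helpers give a quick consistency check on any row of the tables.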