Short-term rainfall forecast by several typical machine learning algorithms
Abstract: Precipitable water vapor (PWV) and surface meteorological parameters (temperature (T), humidity (U), dew point temperature (Td), and pressure (P)) change in characteristic ways during rainfall. Based on these changes, this paper proposes short-term rainfall forecast models built on machine learning algorithms. Using the 3-hour zenith tropospheric delay (ZTD) and meteorological data of the Beijing (BJFS) and Wuhan (WUH2) stations in 2020 as examples, forecast models are constructed with four algorithms: random forest (RF), support vector machine (SVM), K-nearest neighbor (KNN), and naive Bayes classifier (NBC). The rainfall status at each epoch is introduced as a new feature, the data are split with 70% and 80% training sets respectively, the rainfall status serves as the model output, and the applicability of the models is evaluated by accuracy, precision, and false negative rate. With an accuracy of about 0.92, a precision of about 80%, and a false negative rate of about 20%, the data for days of year 150 to 200 are then used as samples to forecast rainfall for days of year 200 to 250. The results show that the machine-learning-based models can forecast more than 80% of the rainfall events in the next 3 hours with a false negative rate below 20%, and that the SVM model has the best overall performance. Compared with the traditional threshold model, accuracy is comparable and the false negative rate is reduced by about 50%.

Keywords:
- machine learning
- zenith tropospheric delay (ZTD)
- precipitable water vapor (PWV)
- meteorological data
- short-term rainfall
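The abstract describes two data-preparation steps: adding the rainfall status at each epoch as an extra feature, and training on days of year 150 to 200 to forecast days 200 to 250. The following is a minimal sketch of one possible reading of those steps, not the authors' implementation. The `Epoch` record layout, the units, the helper names `build_samples` and `split_by_doy`, and the synthetic data are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Epoch:
    """One 3-hourly record (hypothetical layout, not taken from the paper)."""
    doy: float   # day of year; the fractional part encodes the 3 h epoch
    pwv: float   # precipitable water vapor, mm
    t: float     # temperature (assumed deg C)
    u: float     # humidity (assumed relative humidity, %)
    td: float    # dew point temperature (assumed deg C)
    p: float     # surface pressure (assumed hPa)
    rain: int    # 1 if rain was recorded at this epoch, else 0


Sample = Tuple[float, List[float], int]  # (doy, features, label)


def build_samples(epochs: List[Epoch]) -> List[Sample]:
    """Pair each epoch's features with the rain flag of the next epoch.

    The current rain flag is appended to the feature vector, and the
    label is the rainfall status 3 h later (assumed interpretation).
    """
    samples: List[Sample] = []
    for now, nxt in zip(epochs, epochs[1:]):
        features = [now.pwv, now.t, now.u, now.td, now.p, float(now.rain)]
        samples.append((now.doy, features, nxt.rain))
    return samples


def split_by_doy(samples: List[Sample],
                 train=(150.0, 200.0),
                 test=(200.0, 250.0)) -> Tuple[List[Sample], List[Sample]]:
    """Chronological split on half-open day-of-year intervals."""
    train_set = [s for s in samples if train[0] <= s[0] < train[1]]
    test_set = [s for s in samples if test[0] <= s[0] < test[1]]
    return train_set, test_set


# Synthetic, purely illustrative data: 8 epochs per day for DOY 150 to 249.
rng = random.Random(0)
epochs = []
for i in range(800):
    pwv = rng.uniform(10.0, 60.0)
    epochs.append(Epoch(
        doy=150.0 + i * 0.125,
        pwv=pwv,
        t=rng.uniform(15.0, 35.0),
        u=rng.uniform(30.0, 100.0),
        td=rng.uniform(5.0, 25.0),
        p=rng.uniform(990.0, 1020.0),
        rain=1 if pwv > 50.0 else 0,  # arbitrary rule, only to create labels
    ))

samples = build_samples(epochs)
train_set, test_set = split_by_doy(samples)
print(len(train_set), len(test_set))
```

The resulting feature lists and labels could then be passed to any classifier, for example scikit-learn's `RandomForestClassifier`, `SVC`, `KNeighborsClassifier`, or `GaussianNB`. Which library the authors used is not stated in the abstract.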
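Table 1 Confusion matrix for rainfall forecasting

| Actual \ Forecast | Rain | No rain |
|---|---|---|
| Rain | TP | FN |
| No rain | FP | TN |

Table 2 Statistical results of rainfall forecasts at the BJFS and WUH2 stations

| Station | PWV change (mm) | PWV change rate (mm·h⁻¹) | Precision (%) | False negative rate (%) |
|---|---|---|---|---|
| BJFS | 2.5 | 0.6 | 79.2 | 66.1 |
| WUH2 | 3.0 | 0.8 | 83.3 | 63.2 |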
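The confusion-matrix entries TP, FN, FP, and TN map onto the three evaluation metrics named in the abstract. The sketch below uses the standard textbook definitions (accuracy = (TP + TN) / total, precision = TP / (TP + FP), false negative rate = FN / (TP + FN)). These are assumed here, since the paper's exact formulas are not given in this excerpt. The function names and the toy label sequences are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(actual: Sequence[int],
                     forecast: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FN, FP, TN) with 1 = rain and 0 = no rain."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    tp = sum(1 for a, f in zip(actual, forecast) if a == 1 and f == 1)
    fn = sum(1 for a, f in zip(actual, forecast) if a == 1 and f == 0)
    fp = sum(1 for a, f in zip(actual, forecast) if a == 0 and f == 1)
    tn = sum(1 for a, f in zip(actual, forecast) if a == 0 and f == 0)
    return tp, fn, fp, tn


def rain_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, and false negative rate (assumed definitions)."""
    total = tp + fn + fp + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total,
        # Share of forecast rain events that actually occurred.
        "precision": tp / (tp + fp) if (tp + fp) else nan,
        # Share of actual rain events that the forecast missed.
        "false_negative_rate": fn / (tp + fn) if (tp + fn) else nan,
    }


# Toy sequences, for illustration only.
actual = [1, 1, 0, 0, 1, 0, 0, 0]
forecast = [1, 0, 0, 1, 1, 0, 0, 0]
counts = confusion_counts(actual, forecast)
metrics = rain_metrics(*counts)
print(counts, metrics)
```

Under these definitions, a low false negative rate means few rain events go unforecast, which is the metric on which the abstract reports the largest difference from the threshold model.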