Analysis of Several Typical Machine Learning Algorithms for Short-Term Rainfall Forecasting

池钦; 赵兴旺; 陈健

doi:10.12265/j.gnss.2022039

Short-term rainfall forecast by several typical machine learning algorithms

Abstract

Abstract: Precipitable water vapor (PWV) and surface meteorological parameters (temperature (T), humidity (U), dew point temperature (T_d), and pressure (P)) change in characteristic ways during rainfall. Based on these changes, this paper proposes short-term rainfall forecast models built on machine learning algorithms. Using the 3 h zenith tropospheric delay (ZTD) and meteorological data from the Beijing (BJFS) and Wuhan (WUH2) stations in 2020 as examples, forecast models are constructed with four algorithms: random forest (RF), support vector machine (SVM), K-nearest neighbor (KNN), and naive Bayes classifier (NBC). The rainfall status at each epoch is introduced as an additional feature, the data are split with 70% and 80% used for training, respectively, and rainfall occurrence is the model output. Model applicability is evaluated by accuracy, precision, and false negative rate. After obtaining an accuracy of about 0.92, a precision of about 80%, and a false negative rate of about 20%, the data from days of year 150 to 200 are further used as samples to forecast rainfall for days of year 200 to 250. The results indicate that the machine-learning-based short-term rainfall forecast models can predict more than 80% of rainfall events in the next 3 h with a false negative rate below 20%, and that the SVM model has the better overall performance. Compared with the traditional threshold model, accuracy is comparable and the false negative rate is reduced by about 50%.
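The three evaluation scores named in the abstract can be illustrated with a minimal sketch. It assumes the standard confusion-matrix definitions (the abstract does not spell out its formulas), a binary label per 3 h epoch (1 = rain, 0 = no rain), and a hypothetical helper name `rain_forecast_scores`; the sample labels are made up for illustration and are not data from the paper.

```python
from math import nan
from typing import Dict, Sequence


def rain_forecast_scores(observed: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Score binary rain forecasts (1 = rain, 0 = no rain) epoch by epoch.

    Assumed (standard) definitions:
      accuracy            = (TP + TN) / all epochs
      precision           = TP / (TP + FP)
      false negative rate = FN / (FN + TP), i.e. the share of rain events missed
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # rain forecast, rain observed
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # no rain forecast, none observed
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # false alarm
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # missed rain event

    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
        "precision": tp / (tp + fp) if (tp + fp) else nan,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else nan,
    }


# Illustrative labels only: 5 rainy epochs followed by 5 dry epochs.
observed = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
scores = rain_forecast_scores(observed, predicted)
print(scores)
```

With these made-up labels, one rain event is missed and one false alarm is raised. Under these definitions the false negative rate is the complement of the share of rain events that were correctly forecast, which is why the abstract can report "more than 80% of rainfall events" alongside "a false negative rate below 20%".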
