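Hail is a highly hazardous type of severe convective weather, yet issuing fast and accurate warnings and forecasts for it remains difficult in operational meteorology. Based on C-band radar echo data, this study builds and applies a random forest model to classify, identify, and forecast hail and its accompanying severe convective weather. For the training set (2008-2017), the random forest model achieves a mean probability of detection (POD) of 90.2% and a mean false alarm ratio (FAR) of 11.1% across four types of hail weather (hail, hail with strong wind, hail with short-duration heavy precipitation, and hail with strong wind and short-duration heavy precipitation). For the independent test set from 2018-2019, the mean POD and FAR are 72.8% and 34.7%, respectively. The random forest model built here is therefore considered reasonably satisfactory. The model is then combined with Storm Cell Identification and Tracking (SCIT) products to forecast severe convective weather 15 to 60 min ahead; for the four types of hail weather, the mean POD is 74.8%, the mean critical success index is 60.8%, and the mean FAR is 24.4%. Using C-band radar products, the random forest model can thus classify, warn of, and forecast hail and its accompanying severe convective weather efficiently, automatically, and fairly accurately, and can be applied in operational weather forecasting.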
Hail is a type of severe convective weather that frequently causes serious damage, but issuing timely and accurate warnings and nowcasts for it remains difficult in current meteorological operations. This study uses C-band radar products and the random forest (RF) algorithm to warn of and nowcast hail and its accompanying severe convective weather, with the following results. For the training set (2008 to 2017), the RF model identifies the four types of hail weather (hail, hail with strong wind, hail with short-time heavy precipitation, and hail with strong wind and short-time heavy precipitation) with a mean probability of detection (POD) of 90.2% and a mean false alarm ratio (FAR) of 11.1%, which indicates that the RF model developed in this study performs well overall. The RF model is then applied to an independent testing set from 2018-2019, for which the mean POD and FAR are 72.8% and 34.7%, respectively. Among the four types of hail weather in the testing set, hail with strong wind has the highest POD (83.3%), while hail with short-time heavy precipitation has the lowest FAR (12.5%). The testing set thus has a lower POD and a higher FAR than the training set, probably because it contains fewer samples. The RF model is next combined with Storm Cell Identification and Tracking (SCIT) products to nowcast hail weather 15 to 60 min ahead. The results show that the RF model performs well for this nowcast, with a mean POD of 74.8%, a mean critical success index (CSI) of 60.8%, and a mean FAR of 24.4%. Among the four types of hail weather, hail with strong wind has the highest POD and CSI for the 15 to 60 min forecast, while hail with short-time heavy precipitation has the lowest FAR. Furthermore, the POD of the RF model is comparable to that of other hail identification systems, but its FAR (<35%) is lower. Therefore, the random forest model can classify, warn of, and nowcast hail and its accompanying severe convective weather efficiently and automatically, and it is suitable for use in operational weather forecasting.
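The three scores quoted above (POD, FAR, CSI) are conventionally computed from the counts of hits, misses, and false alarms for each weather type. The following is a minimal sketch of those standard definitions, assuming the reported mean values are simple averages over the four hail types; the abstract does not state the averaging scheme. The names `Contingency` and `mean_scores` and the counts in the usage example are hypothetical illustrations, not data from the study.

```python
import math
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class Contingency:
    """Per-class verification counts (hypothetical container, not from the paper)."""
    hits: int          # event predicted and observed
    misses: int        # event observed but not predicted
    false_alarms: int  # event predicted but not observed


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a class has no relevant samples.
    return numerator / denominator if denominator else math.nan


def pod(c: Contingency) -> float:
    """Probability of detection: hits / (hits + misses)."""
    return _ratio(c.hits, c.hits + c.misses)


def far(c: Contingency) -> float:
    """False alarm ratio: false alarms / (hits + false alarms)."""
    return _ratio(c.false_alarms, c.hits + c.false_alarms)


def csi(c: Contingency) -> float:
    """Critical success index: hits / (hits + misses + false alarms)."""
    return _ratio(c.hits, c.hits + c.misses + c.false_alarms)


def mean_scores(tables: dict[str, Contingency]) -> dict[str, float]:
    """Unweighted mean of each score over all classes (assumed averaging scheme)."""
    return {
        "POD": fmean(pod(c) for c in tables.values()),
        "FAR": fmean(far(c) for c in tables.values()),
        "CSI": fmean(csi(c) for c in tables.values()),
    }


if __name__ == "__main__":
    # Made-up counts for the four hail types, for illustration only.
    example = {
        "hail": Contingency(hits=8, misses=2, false_alarms=2),
        "hail + strong wind": Contingency(hits=10, misses=2, false_alarms=3),
        "hail + heavy precipitation": Contingency(hits=7, misses=3, false_alarms=1),
        "hail + wind + heavy precipitation": Contingency(hits=6, misses=2, false_alarms=2),
    }
    for name, value in mean_scores(example).items():
        print(f"mean {name}: {value:.1%}")
```

Because CSI penalizes both misses and false alarms, it can never exceed POD for the same class, which is consistent with the nowcast results reported above (mean CSI of 60.8% against a mean POD of 74.8%).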