Hail Forecasting and Key Feature Analysis in the Qinghai Plateau Using Decision Tree Algorithms

  • 1. School of Computer Technology and Applications, Qinghai University, Xining 810016, Qinghai, China
    2. Qinghai Provincial Laboratory of Intelligent Computing and Applications, Qinghai University, Xining 810016, Qinghai, China
    3. School of Computer and Information Science, Qinghai Institute of Technology, Xining 810018, Qinghai, China
    4. Qinghai Institute of Meteorological Science, Xining 810001, Qinghai, China

Published online: 2025-04-29

Funding

General Program of the Qinghai Provincial Department of Science and Technology (2023-ZJ-906M); National Natural Science Foundation of China (62162053, 42265010)

Cite this article

Liu Jie, Zhang Guojing, Wang Xiaoying, Guan Qin. Hail Forecasting and Key Feature Analysis in the Qinghai Plateau Using Decision Tree Algorithms[J]. Plateau Meteorology, 0: 1. DOI: 10.7522/j.issn.1000-0534.2025.00044

Abstract

Owing to its unique geographical environment, Qinghai Province experiences frequent hail events. Considering the complex topography of high-altitude regions, particularly the Qinghai Plateau, this study constructs a hail forecasting dataset by integrating hail observations from 52 surface meteorological stations in Qinghai from 2009 to 2023, the corresponding hail disaster records, and the ERA5 atmospheric reanalysis dataset. Based on this dataset, three decision-tree ensemble models (Random Forest, XGBoost, and LightGBM) are used to build hail forecasting models, with separate experiments on hail samples with diameters of ≥2 mm and ≥5 mm. The results show that the LightGBM model outperforms both Random Forest and XGBoost on both sample sets, and performs best when forecasting large hail (diameter ≥5 mm). Specifically, for small hail samples (diameter ≥2 mm), the LightGBM model achieves a hit rate of 0.923, a false alarm rate of 0.041, a Critical Success Index (CSI) of 0.858, an accuracy of 0.946, and a recall of 0.924; for large hail samples (diameter ≥5 mm), it attains a hit rate of 0.938, a false alarm rate of 0.038, a CSI of 0.908, an accuracy of 0.960, and a recall of 0.964. Further analysis of the hail forecasting model in the complex terrain of the plateau reveals that the most influential feature factors for hail forecasting in Qinghai Province are, in order, thermodynamic conditions (vertically integrated temperature p54.162, vertically integrated thermal energy p60.162, and 2 m dew point temperature d2m), characteristic height-layer conditions (100 hPa temperature t100, 400 hPa temperature t400, and 20 hPa geopotential height z20), and dynamic conditions (500 hPa zonal wind component u500, 200 hPa meridional wind component v200, and 200 hPa zonal wind component u200). Kernel density estimation shows that the density curves of most feature factors are not clearly separated, indicating that no single factor alone determines whether hail occurs.

A case study demonstrates that the LightGBM-based hail forecasting model has good spatial forecasting capability. Analysis of the 24-hour evolution of key physical variables shows that a large hail event at the Chaka station was closely associated with marked changes in several weather elements: (1) large fluctuations in vertically integrated temperature (p54.162), indicating intense convective activity; (2) persistently high 2 m dew point temperature (d2m), reflecting abundant near-surface moisture; (3) strong 500 hPa zonal wind (u500), suggesting pronounced mid-level dynamic forcing; and (4) low 100 hPa temperature (t100), reflecting upper-atmosphere conditions. The coordinated evolution of these variables not only reveals key stages in the development of severe convective weather systems but also provides a scientific basis for improving hail potential forecasting methods in Qinghai Province.
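
The scores quoted in the abstract can all be derived from a 2×2 forecast contingency table. The sketch below is a minimal illustration, not the authors' evaluation code: the names `Contingency` and `verification_scores` and the example counts are hypothetical. It assumes the false alarm rate is the number of false alarms divided by all non-hail cases; some verification conventions instead divide by all hail forecasts, and the abstract does not state which definition was used.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Contingency:
    """Counts from a yes/no hail forecast compared with observations."""
    hits: int               # hail forecast, hail observed
    false_alarms: int       # hail forecast, no hail observed
    misses: int             # no hail forecast, hail observed
    correct_negatives: int  # no hail forecast, no hail observed


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def verification_scores(t: Contingency) -> Dict[str, float]:
    """Compute common categorical forecast scores from a contingency table."""
    total = t.hits + t.false_alarms + t.misses + t.correct_negatives
    return {
        # Fraction of observed hail events that were forecast.
        "hit_rate": _ratio(t.hits, t.hits + t.misses),
        # ASSUMPTION: false alarms over all observed non-events.
        "false_alarm_rate": _ratio(t.false_alarms,
                                   t.false_alarms + t.correct_negatives),
        # Critical Success Index: hits over everything except correct negatives.
        "csi": _ratio(t.hits, t.hits + t.false_alarms + t.misses),
        # Fraction of all cases classified correctly.
        "accuracy": _ratio(t.hits + t.correct_negatives, total),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; not data from the study.
    example = Contingency(hits=90, false_alarms=5, misses=10,
                          correct_negatives=95)
    for name, value in verification_scores(example).items():
        print(f"{name}: {value:.3f}")
```

Because the CSI ignores correct negatives, it is less flattered than accuracy when non-hail cases dominate the sample, which is one reason it is commonly reported alongside accuracy for rare events.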
