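Differences between Machine Learning and Traditional Downscaling Method in Processing Summer Meteorological Elements in the Yellow River Basin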

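  • 1. School of Atmospheric Sciences / Key Laboratory for Semi-Arid Climate Change of the Ministry of Education, Lanzhou University, Lanzhou 730000, Gansu, China
    2. Collaborative Innovation Center for Western Ecological Safety, Lanzhou University, Lanzhou 730000, Gansu, China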

Online published: 2025-02-24

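Supported by

National Natural Science Foundation of China (42041004); International Cooperation Project of the Ministry of Science and Technology (2024YFE0103200)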



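Cite this article

Chen Han, Guan Xiaodan, Ma Tingting. Differences between Machine Learning and Traditional Downscaling Method in Processing Summer Meteorological Elements in the Yellow River Basin[J]. Plateau Meteorology, 0: 1. DOI: 10.7522/j.issn.1000-0534.2024.00118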

Abstract

Global Climate Models (GCMs) are the primary tools currently used to project future climate change; however, their coarse spatial resolution limits their ability to support assessments of local climate change impacts. To address this issue, statistical downscaling methods based on linear regression equations have been developed to enhance the spatial resolution of GCMs, and they have been continuously improved and refined since. Meanwhile, machine learning techniques have demonstrated superior performance in a variety of predictive modeling problems, making them potential new tools for statistical downscaling. Therefore, this study applied a machine learning model, the Light Gradient Boosting Machine (LightGBM), to construct a statistical downscaling model for each station, combined with the information flow method to select predictors, and compared it with a linear regression method (stepwise multiple linear regression based on Empirical Orthogonal Function (EOF) decomposition) to explore the applicability of LightGBM to statistical downscaling. The two methods were applied to the Yellow River Basin, an important climate-change-sensitive region in China: statistical downscaling models were established for 90 stations within the basin to generate station temperature and precipitation for the summer months (June, July, and August) from 1965 to 2014. The performance of both methods was evaluated by analyzing the correlation coefficients, root mean square errors (RMSE), and spatial distributions of the downscaled values relative to the observed values. The results show that, for temperature downscaling, both methods can correct the temperature error of the reanalysis data (ERA5) in the northern part of the basin. LightGBM shows superior inter-station correlation, but 60, 64, and 52 stations show a higher RMSE than the regression method in June, July, and August, respectively.
For precipitation downscaling, neither the two downscaled datasets nor ERA5 could accurately reproduce the spatial distribution of the observed values, but the downscaled values obtained by LightGBM had a higher inter-station correlation coefficient than those of the regression method, and only 16, 7, and 14 stations showed a higher RMSE than the regression method in June, July, and August, respectively. Considering the potential of machine learning methods for modeling nonlinear problems, further improvement of the algorithm is still needed to raise the quality of downscaled datasets. The advantages and disadvantages of machine learning in downscaling identified here provide a technical reference and support for choosing statistical downscaling methods to generate high-resolution temperature and precipitation data in the future.
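
The station-wise comparison described in the abstract (RMSE per station, correlation across stations, and a count of stations where one method has the higher RMSE) can be illustrated with a minimal sketch. This is not the authors' code: the function names, station identifiers, and all numbers below are hypothetical toy values chosen only to show the bookkeeping, and the correlation is computed across station means as one plausible reading of "inter-station correlation".

```python
import math

def rmse(pred, obs):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def count_higher_rmse(method_a, method_b, observed):
    """Count stations where method_a has a larger RMSE than method_b.

    Each argument maps a station id to a list of yearly values.
    """
    return sum(
        1
        for s in observed
        if rmse(method_a[s], observed[s]) > rmse(method_b[s], observed[s])
    )

def station_means(series):
    """Mean value per station, in sorted station-id order."""
    return [sum(series[s]) / len(series[s]) for s in sorted(series)]

# Hypothetical toy data: 3 stations, 4 years of June mean temperature (deg C).
observed = {
    "S1": [20.0, 21.0, 19.5, 20.5],
    "S2": [15.0, 16.0, 15.5, 14.5],
    "S3": [25.0, 24.0, 26.0, 25.5],
}
ml_pred = {  # stand-in for machine-learning downscaled values
    "S1": [20.5, 21.5, 20.0, 21.0],
    "S2": [15.1, 15.9, 15.5, 14.6],
    "S3": [25.0, 24.5, 26.0, 25.0],
}
reg_pred = {  # stand-in for regression downscaled values
    "S1": [20.1, 20.9, 19.6, 20.4],
    "S2": [15.5, 16.5, 15.0, 14.0],
    "S3": [24.0, 25.0, 25.0, 26.5],
}

# Number of stations where the ML stand-in has the higher RMSE.
n_worse = count_higher_rmse(ml_pred, reg_pred, observed)
# Correlation of station means between the ML stand-in and observations.
r_ml = pearson(station_means(ml_pred), station_means(observed))
print(n_worse, round(r_ml, 3))
```

In this toy setup only station S1 has a higher RMSE for the machine-learning stand-in, which mirrors how per-month station counts such as those reported above would be tallied.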

 
