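Differences between Machine Learning and Traditional Downscaling Method in Processing Summer Meteorological Elements in the Yellow River Basin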

Chen Han, Guan Xiaodan, Ma Tingting

doi:10.7522/j.issn.1000-0534.2024.00118

Plateau Meteorology (高原气象)

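1. School of Atmospheric Sciences / Key Laboratory for Semi-Arid Climate Change of the Ministry of Education, Lanzhou University, Lanzhou 730000, Gansu, China;
2. Collaborative Innovation Center for Western Ecological Safety, Lanzhou University, Lanzhou 730000, Gansu, China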

Published online: 2025-02-24

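Funding

National Natural Science Foundation of China (42041004); International Cooperation Project of the Ministry of Science and Technology (2024YFE0103200)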

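Cite this article

Chen Han, Guan Xiaodan, Ma Tingting. Differences between Machine Learning and Traditional Downscaling Method in Processing Summer Meteorological Elements in the Yellow River Basin[J]. Plateau Meteorology, 0: 1. DOI: 10.7522/j.issn.1000-0534.2024.00118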

Abstract

Global climate models (GCMs) are the primary tools for projecting future climate change; however, their coarse spatial resolution limits their usefulness for assessing local climate change impacts. To address this, statistical downscaling methods based on linear regression equations were developed to enhance the spatial resolution of GCM output and have been continuously improved since. Meanwhile, machine learning techniques have performed well across a wide range of predictive modeling problems, making them potential new tools for statistical downscaling. This study therefore applied a machine learning model, the Light Gradient Boosting Machine (LightGBM), to construct a statistical downscaling model for each station, using the information flow method to select predictors, and compared it with a linear regression method (stepwise multiple linear regression based on empirical orthogonal function (EOF) decomposition) to assess the capability of LightGBM for statistical downscaling. Both methods were applied to the Yellow River Basin, an important climate-change-sensitive region of China: statistical downscaling models were built for 90 stations in the basin to generate temperature and precipitation for the summer months (June, July, and August) of 1965-2014. Performance was evaluated using the correlation coefficients, root mean square errors (RMSE), and spatial distributions of the downscaled values relative to observations. For temperature, both methods corrected the temperature error of the reanalysis data (ERA5) in the northern part of the basin. LightGBM showed superior inter-station correlation, but 60, 64, and 52 stations had a higher RMSE than the regression method in June, July, and August, respectively. For precipitation, neither of the two downscaled datasets nor ERA5 accurately reproduced the observed spatial distribution; however, the LightGBM values had a higher inter-station correlation coefficient than the regression method, and only 16, 7, and 14 stations had a higher RMSE than the regression method in June, July, and August, respectively. Given the potential of machine learning for modeling nonlinear problems, the algorithm still needs further improvement to raise the quality of downscaled datasets. The strengths and weaknesses of machine learning identified here provide a technical reference for choosing statistical downscaling methods to generate high-resolution temperature and precipitation data.
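The station-level comparison described in the abstract (correlation, RMSE, and the number of stations where one method has a higher RMSE than the other) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `(n_years, n_stations)` array layout, the choice of a per-station temporal Pearson correlation, and the synthetic data are all assumptions made here for demonstration.

```python
import numpy as np


def station_rmse(pred, obs):
    """RMSE per station; arrays are shaped (n_years, n_stations)."""
    return np.sqrt(np.mean((pred - obs) ** 2, axis=0))


def station_correlation(pred, obs):
    """Pearson correlation per station, computed along the time axis."""
    p = pred - pred.mean(axis=0)
    o = obs - obs.mean(axis=0)
    return (p * o).sum(axis=0) / np.sqrt((p ** 2).sum(axis=0) * (o ** 2).sum(axis=0))


def count_higher_rmse(pred_a, pred_b, obs):
    """Number of stations where method A has a higher RMSE than method B."""
    return int(np.sum(station_rmse(pred_a, obs) > station_rmse(pred_b, obs)))


# Synthetic illustration only (hypothetical values, not the study's data):
# 50 summers at 90 stations, with two noisy "downscaled" series.
rng = np.random.default_rng(0)
obs = rng.normal(20.0, 2.0, size=(50, 90))
pred_a = obs + rng.normal(0.0, 1.0, size=obs.shape)  # stand-in for method A
pred_b = obs + rng.normal(0.0, 0.5, size=obs.shape)  # stand-in for method B

n_worse = count_higher_rmse(pred_a, pred_b, obs)
corr_a = station_correlation(pred_a, obs)
```

Here `n_worse` plays the role of the station counts quoted in the abstract, computed for one month at a time; running the same comparison separately on June, July, and August data would give three such counts.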

Keywords: statistical downscaling; machine learning; information flow; temperature; precipitation
