Ensemble machine learning for modeling greenhouse gas emissions at different time scales from irrigated paddy fields

Zewei Jiang, Shihong Yang* (Corresponding Author), Pete Smith, Qingqing Pang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Quantifying greenhouse gas (GHG) emissions from irrigated paddy fields is important for addressing climate change. Machine learning (ML) offers an alternative to empirical and biogeochemical models, but it has not previously been used to simulate GHG emissions from paddy fields. In this study, a three-year field experiment dataset was collected from paddy fields under controlled irrigation (CI, a water-saving irrigation (WSI) technology) and flood irrigation (FI) in Kunshan, China, together with a global dataset on WSI and FI. A stacking ensemble model was then developed from three base ML models, random forest (RF), K-nearest neighbor regression (KNN), and gradient boosting regression (GBR), with linear regression (LR) as the meta-learner. These models were used for the first time to simulate CH₄ and N₂O emissions from paddy fields under WSI and FI at different time scales. The results showed that, compared with FI, cumulative CH₄ emissions from the WSI paddy fields decreased by 61.14% and cumulative N₂O emissions increased by 19.52%. The ML algorithms can be applied to simulate daily, growth-stage, and cumulative GHG emissions from paddy fields, although the linear regression model performed worse than the other ML models. Compared with the base models, the stacking model improved accuracy, increasing R² by 0.37% to 13.36% and reducing RMSE by 3.23% to 42.78%. Soil redox, temperature, and moisture are necessary inputs for accurate modeling, and training the model on WSI and FI data separately further improves accuracy. The applicability of the stacking model at other stations was verified with literature data, with R² ranging from 0.7634 to 0.9985. The stacking model is therefore recommended for predicting GHG emissions from paddy fields at different time scales, and the method can be extended to estimate GHG emissions from paddy fields at stations around the world.
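The stacking setup described in the abstract (RF, KNN, and GBR as base models with LR as the meta-learner) can be illustrated with a minimal sketch. It assumes scikit-learn's StackingRegressor, which the abstract does not name, and it runs on synthetic stand-in data rather than the study's measurements. The feature names, hyperparameters, train/test split, and 5-fold setting are all illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the study's measurements). The three columns are
# hypothetical, standardized proxies for soil redox, temperature, and moisture.
rng = np.random.default_rng(0)
n = 400
redox = rng.normal(0.0, 1.0, n)
temperature = rng.normal(0.0, 1.0, n)
moisture = rng.normal(0.0, 1.0, n)
X = np.column_stack([redox, temperature, moisture])
# Arbitrary nonlinear response used only to give the models something to fit.
y = (-1.5 * redox + 0.8 * temperature * moisture
     + np.sin(moisture) + rng.normal(0.0, 0.2, n))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Base learners named in the abstract; hyperparameters are assumed defaults.
base_models = [
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))),
    ("gbr", GradientBoostingRegressor(random_state=0)),
]

# The meta-learner (linear regression) is trained on out-of-fold predictions
# of the base learners, produced here with 5-fold cross-validation.
stack = StackingRegressor(
    estimators=base_models, final_estimator=LinearRegression(), cv=5
)


def evaluate(model):
    """Fit on the training split and return (R2, RMSE) on the test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    return float(r2_score(y_test, pred)), rmse


scores = {name: evaluate(model) for name, model in base_models}
scores["stack"] = evaluate(stack)

for name, (r2, rmse) in scores.items():
    print(f"{name:>5}: R2={r2:.3f}  RMSE={rmse:.3f}")
```

Comparing the "stack" row with the base-model rows mirrors the kind of R² and RMSE comparison reported in the abstract. On synthetic data the size of any improvement is not meaningful and need not resemble the published figures.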

Original language: English
Article number: 108821
Number of pages: 18
Journal: Field Crops Research
Volume: 292
Early online date: 23 Jan 2023
DOI: 10.1016/j.fcr.2023.108821
Publication status: Published, 1 Mar 2023

Bibliographical note

Funding Information:
This work was supported by the Fundamental Research Funds for the Central Universities (B220203009, B210204016), the Postgraduate Research & Practice Program of Jiangsu Province (KYCX22_0669), the National Key R&D Program of China (2018YFC1508303), the Water Conservancy Science and Technology Project of Jiangxi Province (201921ZDKT06, 202124ZDKT09), and the National Natural Science Foundation of China (51879076).

Data Availability Statement

Data Availability:
Data will be made available upon request.

Supporting Information:
Supplementary data associated with this article can be found in the online version at doi:10.1016/j.fcr.2023.108821.

Keywords

  • Greenhouse gas
  • Machine learning
  • Paddy
  • Stacking
  • Water-saving irrigation
