Ofori-Ampofo et al. (2025) On the strategy of exploring spatio-temporal information from Earth observation data for crop yield prediction
Identification
- Journal: Smart Agricultural Technology
- Year: 2025
- Date: 2025-10-17
- Authors: Stella Ofori-Ampofo, Rıdvan Salih Kuzu, Peter Schauer, Martin Willberg, Aaron Höhl, Xiao Xiang Zhu
- DOI: 10.1016/j.atech.2025.101540
Research Groups
- Technical University of Munich, Germany (TUM School of Engineering and Design, Department of Aerospace and Geodesy; Munich Center for Machine Learning)
- Remote Sensing Technology Institute, German Aerospace Center (DLR), Weßling, Germany
- Industrieanlagen-Betriebsgesellschaft mbH (IABG), Ottobrunn, Germany
Short Summary
This study compares strategies for encoding spatial and temporal information from Earth observation data for county-level corn yield prediction in the USA, using a range of machine learning models. It finds that yield can be predicted effectively from time series data alone, that surface reflectance is a critical predictor, and that recent historical data matter more for model accuracy than long-term records.
Objective
- To conduct a comprehensive comparison of existing spatial and temporal encoding techniques for corn yield prediction in the USA, evaluating their trade-offs and contributions to prediction accuracy under a unified dataset and experimental setup.
Study Configuration
- Spatial Scale: County level (473 counties across Iowa, Illinois, Indiana, Nebraska, and Minnesota, USA). Data resolutions: MODIS at 500 meters, Daymet at 1 kilometer, USDA-NASS crop type maps at 30 meters.
- Temporal Scale: 19 years of county-level corn yield records (2003-2021). Satellite image time series (SITS) from MODIS with an 8-day revisit frequency (46 timesteps/year, truncated to 27 timesteps for April-October). Daily Daymet weather data aggregated to 8-day resolution.
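The temporal alignment described above (daily weather aggregated to 8-day steps, then a 27-step growing-season subset of the 46 yearly steps) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the choice of mean versus sum, and the starting index `first_step=11` are assumptions. The summary states only that 27 timesteps cover April-October.

```python
import numpy as np

DAYS_PER_YEAR = 365  # Daymet years always contain 365 days
STEP = 8             # composite length in days, matching the MODIS 8-day grid


def aggregate_to_8day(daily, how="mean"):
    """Collapse a 365-value daily series into 46 composites.

    The first 45 composites span 8 days each; the last spans the
    remaining 5 days of the year.
    """
    daily = np.asarray(daily, dtype=float)
    if daily.shape != (DAYS_PER_YEAR,):
        raise ValueError("expected a 1-D series with 365 daily values")
    starts = np.arange(0, DAYS_PER_YEAR, STEP)  # day indices 0, 8, ..., 360
    sums = np.add.reduceat(daily, starts)       # per-composite totals
    if how == "sum":
        return sums
    if how != "mean":
        raise ValueError("how must be 'mean' or 'sum'")
    lengths = np.diff(np.append(starts, DAYS_PER_YEAR))
    return sums / lengths


def select_growing_season(series, first_step=11, n_steps=27):
    """Keep n_steps consecutive composites starting at first_step.

    first_step=11 is a hypothetical choice (composite starting near
    day-of-year 89); adjust it to the actual April-October definition.
    """
    series = np.asarray(series)
    if first_step + n_steps > series.shape[0]:
        raise ValueError("requested window exceeds the series length")
    return series[first_step:first_step + n_steps]
```

A typical call would use `how="sum"` for precipitation and `how="mean"` for minimum and maximum temperature, although the summary does not say which reducer the study applied.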
Methodology and Data
- Models used:
  - Spatial Encoding Strategies: Pixel averages, Image histograms, Pixel-set encoders (PSE).
  - Temporal Encoding Strategies: Random Forests (RF), XGBoost, Support Vector Machines (SVM), Multilayer Perceptrons (MLP), Temporal Convolutional Neural Networks (TempCNN), Multi-Scale Residual Network (MSResNet), InceptionTime, Long Short-Term Memory (LSTM), LSTM with Attention, Lightweight Temporal Attention Encoder (LTAE).
  - Combined Models: Histogram-LSTM, Histogram-TempCNN, Histogram-2D CNN, PSE-LTAE.
- Data sources:
  - Yield Data: County-level corn yield (bushels per acre) from the National Agricultural Statistics Service (NASS) of the United States Department of Agriculture (USDA) (2003-2021).
  - Satellite Imagery: Gridded 8-day MODIS surface reflectance (MOD09A1.061, 500 m resolution) for visible and infrared bands.
  - Spectral Indices: Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI) derived from MODIS bands.
  - Weather Data: Gridded daily precipitation, minimum temperature, and maximum temperature from Daymet (1 km resolution).
  - Ancillary Data: Annual gridded crop type maps from USDA-NASS (30 m resolution) for crop masking.
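Two of the spatial encodings listed above, pixel averages and image histograms, together with a normalized-difference index, can be illustrated with a short sketch. It assumes a county is stored as an array of shape (time, bands, height, width) with a boolean corn mask. The function names, bin count, and value range are hypothetical, and the band pair used for NDWI is not specified in this summary, so the index helper takes any two bands.

```python
import numpy as np


def normalized_difference(a, b):
    """Compute (a - b) / (a + b), returning 0 where the denominator is 0.

    NDVI is the standard case with a = near-infrared and b = red.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    return np.divide(a - b, denom, out=np.zeros_like(denom), where=denom != 0)


def pixel_average(sits, mask):
    """Average the masked pixels per timestep and band: (T, C, H, W) -> (T, C)."""
    pixels = sits[:, :, mask]  # shape (T, C, N) with N masked pixels
    return pixels.mean(axis=-1)


def histogram_encoding(sits, mask, bins=32, value_range=(0.0, 1.0)):
    """Encode masked pixels as normalized histograms: (T, C, H, W) -> (T, C, bins)."""
    pixels = sits[:, :, mask]
    n_steps, n_bands, _ = pixels.shape
    out = np.zeros((n_steps, n_bands, bins))
    for t in range(n_steps):
        for c in range(n_bands):
            counts, _ = np.histogram(pixels[t, c], bins=bins, range=value_range)
            total = counts.sum()
            if total > 0:  # values outside value_range are not counted
                out[t, c] = counts / total
    return out
```

The pixel average discards the within-county distribution entirely, while the histogram keeps it at the cost of a larger input, which is the trade-off the comparison in this study examines.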
Main Results
- The average Mean Absolute Percentage Error (MAPE) on test sets was generally below 10%. Root Mean Squared Error (RMSE) was higher in 2021 than in 2020.
- Among models using time-series pixel averages, SVM demonstrated strong performance, outperforming other classical baselines (RF, XGBoost, MLP) and even some deep learning models such as LSTM and Histogram-2D CNN.
- TempCNN achieved the best overall performance (lowest RMSE) when averaged across both test years (RMSE 13.89 bu/acre or 872.1 kg/hectare in 2020, 16.05 bu/acre or 1007.4 kg/hectare in 2021), with modest improvements over MSResNet and SVM.
- The performance of LTAE declined when combined with the Pixel-Set Encoder (PSE-LTAE), possibly due to challenges in handling high spectral variation within large prediction units.
- Flattening histograms and modeling the temporal component with 1D temporal convolution (Histogram-TempCNN) proved more efficient than using LSTMs or 2D CNNs on histogram images.
- Surface reflectance (SR) features consistently performed well independently (highest R²), indicating their strong ability to capture yield variability. Models combining SR with either weather or spectral indices achieved the best overall performance and stability.
- An 8-year training window (2012-2019) generalized better to test years than a 17-year (2003-2019) or 4-year (2016-2019) window, suggesting the presence of concept drift and the benefit of more recent data.
- In-season forecasting showed that longer time spans improved prediction accuracy, with optimal performance observed around late August, after which extending time steps provided diminishing returns.
- Including the previous year's data (2020) in the training set significantly improved the prediction accuracy for the current year (2021), reducing RMSE by about 2 bu/acre (approximately 125.5 kg/hectare) and improving R² by 12%.
- The MSResNet model achieved RMSEs of 13.77 bu/acre (864.6 kg/hectare) in 2020 and 14.89 bu/acre (934.7 kg/hectare) in 2021 (when 2020 was included in training), outperforming some existing 1D CNN-LSTM and LSTM-attention models (reported around 17 bu/acre or 1067.1 kg/hectare).
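The metrics and unit conversions quoted in the results above can be reproduced with a few lines. This is a sketch under stated assumptions: it uses the standard corn bushel weight of 56 lb, and the function names are illustrative. The resulting factor is roughly 62.77 kg/hectare per bu/acre, which agrees with the converted figures above to within rounding.

```python
import numpy as np

LB_PER_BUSHEL_CORN = 56.0     # standard US bushel weight for shelled corn
KG_PER_LB = 0.45359237
HA_PER_ACRE = 0.40468564224


def bu_per_acre_to_kg_per_ha(value):
    """Convert a corn yield (or yield error) from bu/acre to kg/hectare."""
    return value * LB_PER_BUSHEL_CORN * KG_PER_LB / HA_PER_ACRE


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the inputs."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Because RMSE is linear in the units of the target, converting an RMSE reported in bu/acre with `bu_per_acre_to_kg_per_ha` gives the same number as computing RMSE on yields already expressed in kg/hectare.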
Contributions
- Provides the first comprehensive comparison of various spatial and temporal encoding techniques for corn yield prediction in the USA, using a unified dataset and experimental setup.
- Introduces and evaluates advanced deep learning architectures (e.g., MSResNet, InceptionTime, LTAE) for yield prediction, including the application of pixel-set encoders from crop classification studies to large prediction units.
- Offers practical insights into the trade-offs between different data encoding strategies and their impact on predictive performance.
- Conducts an in-depth analysis of critical factors influencing model performance, such as training data sample size, feature combinations, in-season forecasting capabilities, and the significance of prior-year observations.
- Highlights the effectiveness of predicting crop yield using solely time series data without explicit spatial features and underscores the importance of surface reflectance as a key predictor.
- Demonstrates that reliance on long-term historical data may hinder models' ability to reflect current conditions accurately, suggesting the presence of concept drift.
- Reinforces the value of temporal continuity in training data, showing that including the previous year's data significantly enhances current year prediction accuracy.
- Compiles and makes available a multi-source dataset for crop monitoring in the USA, facilitating future methodological advancements in remote sensing for agricultural applications.
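The training-window and prior-year experiments referred to above amount to selecting samples by harvest year. A minimal sketch of such a split follows. The helper name and signature are hypothetical, and the study's actual data loader may differ.

```python
import numpy as np


def year_split(years, train_start, train_end, test_year):
    """Return boolean (train, test) masks over samples labelled by year.

    Training covers train_start..train_end inclusive. The test year must
    lie outside that window so no test samples leak into training.
    """
    years = np.asarray(years)
    if train_start > train_end:
        raise ValueError("train_start must not exceed train_end")
    if train_start <= test_year <= train_end:
        raise ValueError("test year lies inside the training window")
    train = (years >= train_start) & (years <= train_end)
    test = years == test_year
    return train, test


# Windows taken from the results summarized in this note.
WINDOWS = {
    "17-year": (2003, 2019),
    "8-year": (2012, 2019),
    "4-year": (2016, 2019),
}
```

For example, `year_split(years, *WINDOWS["8-year"], test_year=2020)` selects the 2012-2019 window, and `year_split(years, 2012, 2020, test_year=2021)` corresponds to adding the previous year (2020) when predicting 2021.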
Funding
- Munich Aerospace e.V. scholarship (S. Ofori-Ampofo)
- ML4Earth project by the German Federal Ministry for Economic Affairs and Climate Action (grant number 50EE2201C) (A. Höhl)
- MONITOR and ML4Earth projects (for dataset collation)
Citation
@article{OforiAmpofo2025strategy,
author = {Ofori-Ampofo, Stella and Kuzu, Rıdvan Salih and Schauer, Peter and Willberg, Martin and Höhl, Aaron and Zhu, Xiao Xiang},
title = {On the strategy of exploring spatio-temporal information from Earth observation data for crop yield prediction},
journal = {Smart Agricultural Technology},
year = {2025},
doi = {10.1016/j.atech.2025.101540},
url = {https://doi.org/10.1016/j.atech.2025.101540}
}
Original Source: https://doi.org/10.1016/j.atech.2025.101540