Ventura et al. (2026) Optimizing canopy cover evaluation: A machine learning approach using LiDAR data
Identification
- Journal: Environmental Modelling & Software
- Year: 2026
- Date: 2026-04-01
- Authors: Pau Bosch Ventura, Carles Carrillo, Alejandro Donaire, Eric Sánchez
- DOI: 10.1016/j.envsoft.2026.106982
Research Groups
Computer Architecture & Operating Systems Department, Universitat Autònoma de Barcelona, Barcelona, Spain
Short Summary
This study develops AI-CanopyMapper, a machine learning framework leveraging LiDAR data for efficient and accurate prediction of canopy cover, achieving a mean absolute error of 6.47% and an R² of 0.88 for the full model in Catalonia. The framework demonstrates strong generalization capabilities and computational efficiency, even with limited data, offering a fast and scalable alternative to traditional methods.
Objective
- To develop a fast, scalable, and accurate machine learning framework (AI-CanopyMapper) that uses LiDAR data to predict canopy cover, addressing the limitations of traditional methods in balancing speed and accuracy.
Study Configuration
- Spatial Scale: Catalonia, Spain. LiDAR data organized in 2 km × 2 km blocks with a minimum density of 0.5 points/m². Ground truth and derived maps (Slope, Aspect) at 20 m × 20 m resolution. LandUse data at 10 m × 10 m resolution.
- Temporal Scale: LiDAR and Canopy Cover ground truth data collected between 2016 and 2017. LandUse data collected in 2017. Slope and Aspect data collected in 2020.
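The resolutions above imply some simple grid arithmetic, sketched below. It assumes predictions are made per 20 m × 20 m cell (the resolution of the ground truth) and treats the minimum density as uniform, so the per-cell point count is only a nominal figure. The constant names are illustrative.

```python
# Figures taken from the study configuration; derived values are nominal.
BLOCK_SIDE_M = 2_000        # LiDAR blocks are 2 km x 2 km
CELL_SIDE_M = 20            # ground truth, slope and aspect cells are 20 m x 20 m
LANDUSE_SIDE_M = 10         # land-use cells are 10 m x 10 m
MIN_DENSITY_PTS_M2 = 0.5    # minimum LiDAR density in points per square metre

cells_per_side = BLOCK_SIDE_M // CELL_SIDE_M
cells_per_block = cells_per_side ** 2
min_points_per_cell = MIN_DENSITY_PTS_M2 * CELL_SIDE_M ** 2
landuse_pixels_per_cell = (CELL_SIDE_M // LANDUSE_SIDE_M) ** 2

print(cells_per_block)          # 10000
print(min_points_per_cell)      # 200.0
print(landuse_pixels_per_cell)  # 4
```

Under these assumptions each block holds 10,000 target cells, each backed by about 200 LiDAR points at the minimum density and by four land-use pixels.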
Methodology and Data
- Models used: XGBoost (gradient-boosting algorithm) for canopy cover prediction. Mean imputer for handling missing values. Standard scaler for feature normalization.
- Data sources:
  - LiDAR data: Institut Cartogràfic i Geològic de Catalunya (ICGC).
  - Ground truth (Canopy Cover map): Institut Cartogràfic i Geològic de Catalunya (ICGC).
  - LandUse data: Impact Observatory, Microsoft, and Esri.
  - Slope and Aspect data: Derived from Digital Terrain Model (ICGC).
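The two preprocessing steps named above (mean imputation, then standard scaling) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the feature columns and values are made up, and in practice the same steps are commonly done with scikit-learn's `SimpleImputer(strategy="mean")` and `StandardScaler` before fitting an XGBoost regressor. Whether the paper uses those classes is not stated here.

```python
import math


def fit_mean_imputer(rows):
    """Per-column mean of the observed (non-None) training values.

    Assumes every column has at least one observed value.
    """
    means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed))
    return means


def impute(rows, means):
    """Replace each missing value with its column's training mean."""
    return [[m if v is None else v for v, m in zip(row, means)] for row in rows]


def fit_standard_scaler(rows):
    """Per-column mean and population standard deviation of the training rows."""
    stats = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        stats.append((mean, std if std > 0 else 1.0))  # constant column: leave unscaled
    return stats


def scale(rows, stats):
    """Centre each column on zero and divide by its standard deviation."""
    return [[(v - mean) / std for v, (mean, std) in zip(row, stats)] for row in rows]


# Hypothetical per-cell features: [mean return height (m), mean number of returns]
train = [[12.0, 2.0], [None, 1.0], [4.0, 3.0]]
means = fit_mean_imputer(train)
train_imputed = impute(train, means)
stats = fit_standard_scaler(train_imputed)
train_ready = scale(train_imputed, stats)
```

The imputer and scaler statistics are learned on the training rows only and then reused unchanged on test data, which avoids leaking test information into preprocessing.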
Main Results
- The full AI-CanopyMapper model achieved a Mean Absolute Error (MAE) of 6.47% and an R² of 0.88 on the testing set.
- A partial model, trained with only 1.3% of the available data, achieved an MAE of approximately 15%, demonstrating strong generalization capabilities under data-limited conditions.
- The optimized and parallelized methodology processed all 8424 blocks of Catalonia in 4 hours and 30 minutes, against an estimated 41 to 53 days for sequential processing (roughly 220 to 280 times faster).
- Feature selection identified 20 optimal features, with height and number of returns being the most important.
- Over 50% of predictions had an error within ±2%. Small errors were slightly more often overestimates, while large errors were slightly more often underestimates.
- AI-CanopyMapper reaches an R² of 0.88 at a LiDAR density of 0.5 points/m², whereas existing approaches in the literature typically report R² values between 0.60 and 0.70, often using denser LiDAR data or combined sources.
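For reference, the three accuracy figures reported above (MAE, R², and the share of predictions within a tolerance) follow the standard definitions sketched here. The sample values are made up, and the sketch assumes errors are expressed in percentage points of canopy cover, which this summary does not state explicitly.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def share_within(y_true, y_pred, tolerance):
    """Fraction of predictions whose absolute error is at most `tolerance`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tolerance)
    return hits / len(y_true)


# Made-up canopy cover values in percent, for illustration only
observed = [10.0, 40.0, 75.0, 90.0]
predicted = [12.0, 38.0, 70.0, 91.0]

print(mean_absolute_error(observed, predicted))   # 2.5
print(round(r2_score(observed, predicted), 4))    # 0.9912
print(share_within(observed, predicted, 2.0))     # 0.75
```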
Contributions
- Introduction of AI-CanopyMapper, a novel, fast, scalable, and accurate machine learning framework for canopy cover estimation using LiDAR data.
- Optimized feature engineering, selection, and parallel processing techniques significantly enhance computational efficiency and reduce processing time for large datasets.
- Demonstrated state-of-the-art accuracy (R² 0.88, MAE 6.47%) using a relatively low LiDAR point cloud density (0.5 points/m²), outperforming many existing methods that require denser data or additional data sources.
- Validation of a "partial model" approach, showing that reliable canopy cover predictions can be obtained with significantly less training data (1.3% of total), enabling rapid preliminary assessments and efficient resource allocation.
- The framework is designed to be density-agnostic and transferable, allowing for adaptation and retraining in diverse ecosystems.
- Addresses the critical challenge of balancing speed and accuracy in canopy cover mapping, providing updated values essential for applications such as forest management and wildfire risk assessment.
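The parallel processing mentioned above is not detailed in this summary. One plausible shape, assuming each 2 km × 2 km block can be processed independently, is a process pool that maps a per-block worker over the block identifiers. The function names, the placeholder worker, and the output file naming below are all hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor


def predict_block(block_id):
    """Hypothetical stand-in: load one LiDAR block, build features, predict canopy cover."""
    return block_id, f"canopy_cover_{block_id}.tif"  # placeholder output name


def process_blocks(block_ids, worker=predict_block,
                   executor_cls=ProcessPoolExecutor, max_workers=None):
    """Apply `worker` to every block independently and collect results keyed by block id."""
    with executor_cls(max_workers=max_workers) as pool:
        return dict(pool.map(worker, block_ids))
```

A call such as `process_blocks(range(8424))` would distribute the blocks across the available cores. On platforms that spawn worker processes, the call should sit under an `if __name__ == "__main__":` guard.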
Funding
- Ministerio de Ciencia e Innovación (MCIN/AEI/10.13039/501100011033) under contracts PID2020-113614RB-C21 and PID2023-146193OB-I00.
- Grant CPP2021-008762, funded by the European Union NextGenerationEU/PRTR.
- Catalan Government under grant 2021-SGR-574.
Citation
@article{Ventura2026Optimizing,
author = {Ventura, Pau Bosch and Carrillo, Carles and Donaire, Alejandro and Sánchez, Eric},
title = {Optimizing canopy cover evaluation: A machine learning approach using LiDAR data},
journal = {Environmental Modelling \& Software},
year = {2026},
doi = {10.1016/j.envsoft.2026.106982},
url = {https://doi.org/10.1016/j.envsoft.2026.106982}
}
Original Source: https://doi.org/10.1016/j.envsoft.2026.106982