Ventura et al. (2026) Optimizing canopy cover evaluation: A machine learning approach using LiDAR data

Identification

Journal: Environmental Modelling & Software
Year: 2026
Date: 2026-04-01
Authors: Pau Bosch Ventura, Carles Carrillo, Alejandro Donaire, Eric Sánchez
DOI: 10.1016/j.envsoft.2026.106982

Research Groups

Computer Architecture & Operating Systems Department, Universitat Autònoma de Barcelona, Barcelona, Spain

Short Summary

This study develops AI-CanopyMapper, a machine learning framework leveraging LiDAR data for efficient and accurate prediction of canopy cover, achieving a mean absolute error of 6.47% and an R² of 0.88 for the full model in Catalonia. The framework demonstrates strong generalization capabilities and computational efficiency, even with limited data, offering a fast and scalable alternative to traditional methods.

Objective

To develop a fast, scalable, and accurate machine learning framework (AI-CanopyMapper) that uses LiDAR data to predict canopy cover, addressing the limitations of traditional methods in balancing speed and accuracy.

Study Configuration

Spatial Scale: Catalonia, Spain. LiDAR data organized in 2 km × 2 km blocks with a minimum density of 0.5 points/m². Ground truth and derived maps (Slope, Aspect) at 20 m × 20 m resolution. LandUse data at 10 m × 10 m resolution.
Temporal Scale: LiDAR and Canopy Cover ground truth data collected between 2016 and 2017. LandUse data collected in 2017. Slope and Aspect data collected in 2020.
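
The grid sizes above imply some per-block quantities that are worth making explicit. The short calculation below derives them from the stated figures only; the variable names are illustrative and do not come from the paper.

```python
# Figures taken from the study configuration above.
BLOCK_SIDE_M = 2_000        # LiDAR block edge (2 km)
CELL_SIDE_M = 20            # ground truth, Slope and Aspect cell edge
LANDUSE_CELL_SIDE_M = 10    # LandUse cell edge
MIN_DENSITY_PTS_M2 = 0.5    # minimum LiDAR point density

# Number of 20 m cells along one side of a block, and per block.
cells_per_side = BLOCK_SIDE_M // CELL_SIDE_M
cells_per_block = cells_per_side ** 2

# Lower bounds on LiDAR points implied by the minimum density.
min_points_per_cell = int(MIN_DENSITY_PTS_M2 * CELL_SIDE_M ** 2)
min_points_per_block = int(MIN_DENSITY_PTS_M2 * BLOCK_SIDE_M ** 2)

# LandUse pixels that fall inside one 20 m cell.
landuse_pixels_per_cell = (CELL_SIDE_M // LANDUSE_CELL_SIDE_M) ** 2

print(cells_per_block)          # 10000
print(min_points_per_cell)      # 200
print(min_points_per_block)     # 2000000
print(landuse_pixels_per_cell)  # 4
```

Each 2 km block therefore yields 10,000 prediction cells, and each cell is backed by at least 200 LiDAR points at the minimum density.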

Methodology and Data

Models used: XGBoost (gradient-boosting algorithm) for canopy cover prediction. Mean imputer for handling missing values. Standard scaler for feature normalization.
Data sources:
- LiDAR data: Institut Cartogràfic i Geològic de Catalunya (ICGC).
- Ground truth (Canopy Cover map): Institut Cartogràfic i Geològic de Catalunya (ICGC).
- LandUse data: Impact Observatory, Microsoft, and Esri.
- Slope and Aspect data: Derived from Digital Terrain Model (ICGC).
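
The three components listed under "Models used" could be chained roughly as follows. This is a minimal sketch, assuming a scikit-learn pipeline; the hyperparameters, the synthetic data, and the function name `build_canopy_pipeline` are placeholders and are not taken from the paper. If `xgboost` is not installed, the sketch falls back to scikit-learn's gradient boosting, which is a different implementation of the same family of models.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

try:
    from xgboost import XGBRegressor as Booster
except ImportError:
    # Stand-in only: not the model named in the summary.
    from sklearn.ensemble import GradientBoostingRegressor as Booster


def build_canopy_pipeline(random_state=0):
    """Mean imputation -> standard scaling -> gradient-boosted regressor."""
    return Pipeline([
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", StandardScaler()),
        # Illustrative hyperparameters, not the paper's settings.
        ("model", Booster(n_estimators=200, max_depth=6,
                          learning_rate=0.1, random_state=random_state)),
    ])


# Synthetic stand-in for per-cell LiDAR features (20 columns) and
# canopy cover targets in percent; real inputs would come from ICGC data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = np.clip(50 + 20 * X[:, 0] + 5 * X[:, 1]
            + rng.normal(scale=2, size=500), 0, 100)
X[rng.random(X.shape) < 0.05] = np.nan  # simulate missing feature values

pipe = build_canopy_pipeline()
pipe.fit(X[:400], y[:400])
pred = pipe.predict(X[400:])
```

Wrapping the imputer and scaler in the pipeline means their statistics are learned from the training rows only and then reused unchanged at prediction time.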

Main Results

The full AI-CanopyMapper model achieved a Mean Absolute Error (MAE) of 6.47% and an R² of 0.88 on the testing set.
A partial model, trained with only 1.3% of the available data, achieved an MAE of approximately 15%, demonstrating strong generalization capabilities under data-limited conditions.
The optimized and parallelized methodology processed all 8424 blocks of Catalonia in 4 hours and 30 minutes, against an estimated 41 to 53 days for sequential processing, a speedup of roughly 220 to 280 times.
Feature selection identified 20 optimal features, with height and number of returns being the most important.
More than 50% of predictions fell within ±2% of the ground truth. Small errors tended to be overestimates, while large errors tended to be underestimates.
AI-CanopyMapper's performance (R² of 0.88 with 0.5 points/m² LiDAR density) is notably strong compared to existing approaches in the literature, which typically report R² values between 0.60 and 0.70, often using denser LiDAR data or combined sources.
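
The metrics quoted in these results (MAE, R², and the share of predictions within ±2%) can be computed as in the sketch below. It assumes canopy cover is expressed in percent, so that errors are in percentage points; the function name and the toy values are illustrative and not from the paper.

```python
import numpy as np


def canopy_metrics(y_true, y_pred, tolerance=2.0):
    """Return MAE, R^2, share of |error| <= tolerance, and mean signed error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true  # positive values are overestimates
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "mae": float(np.mean(np.abs(err))),
        "r2": float(1.0 - ss_res / ss_tot),
        "share_within_tol": float(np.mean(np.abs(err) <= tolerance)),
        "mean_signed_error": float(np.mean(err)),
    }


# Toy example with four cells (values in percent canopy cover).
m = canopy_metrics([10, 40, 80, 95], [12, 37, 81, 90])
print(m["mae"])                # 2.75
print(m["share_within_tol"])   # 0.5
print(m["mean_signed_error"])  # -1.25
```

The mean signed error is included because the MAE alone cannot show the over- and underestimation tendencies described above.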

Contributions

Introduction of AI-CanopyMapper, a novel, fast, scalable, and accurate machine learning framework for canopy cover estimation using LiDAR data.
Optimized feature engineering, selection, and parallel processing techniques significantly enhance computational efficiency and reduce processing time for large datasets.
Demonstrated state-of-the-art accuracy (R² 0.88, MAE 6.47%) using a relatively low LiDAR point cloud density (0.5 points/m²), outperforming many existing methods that require denser data or additional data sources.
Validation of a "partial model" approach, showing that reliable canopy cover predictions can be obtained with significantly less training data (1.3% of total), enabling rapid preliminary assessments and efficient resource allocation.
The framework is designed to be density-agnostic and transferable, allowing for adaptation and retraining in diverse ecosystems.
Addresses the critical challenge of balancing speed and accuracy in canopy cover mapping, providing updated values essential for applications such as forest management and wildfire risk assessment.
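
The "partial model" contribution rests on training with about 1.3% of the available data. The summary does not say how that subset was chosen, so the sketch below simply assumes uniform random sampling of training rows; the function name and the row count are hypothetical.

```python
import numpy as np


def sample_training_rows(n_rows, fraction=0.013, seed=0):
    """Return sorted indices of a random subset holding `fraction` of the rows."""
    n_keep = max(1, round(n_rows * fraction))
    rng = np.random.default_rng(seed)
    # Sampling without replacement so no row is used twice.
    return np.sort(rng.choice(n_rows, size=n_keep, replace=False))


# Hypothetical training set of one million cells.
idx = sample_training_rows(1_000_000)
print(len(idx))  # 13000
```

Fixing the seed keeps the subset reproducible, which matters when comparing a partial model against the full one.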

Funding

Ministerio de Ciencia e Innovación (MCIN/AEI/10.13039/501100011033) under contracts PID2020-113614RB-C21 and PID2023-146193OB-I00.
CPP2021-008762 by the European Union-NextGenerationEU/PRTR.
Catalan Government under grant 2021-SGR-574.

Citation

@article{Ventura2026Optimizing,
  author = {Ventura, Pau Bosch and Carrillo, Carles and Donaire, Alejandro and Sánchez, Eric},
  title = {Optimizing canopy cover evaluation: A machine learning approach using LiDAR data},
  journal = {Environmental Modelling \& Software},
  year = {2026},
  doi = {10.1016/j.envsoft.2026.106982},
  url = {https://doi.org/10.1016/j.envsoft.2026.106982}
}

Original Source: https://doi.org/10.1016/j.envsoft.2026.106982