Evaluation of static kNN predictions
The improvements to the reference set and the better selection of predictive features provided the bulk of the increase in agreement with the reference data relative to the original process of Beaudoin et al. (2014). Overall, gains exceeded 14% in all four multivariate error metrics and ranged from 36% to 56% for the two MD% error metrics (MD–% and MD+%, respectively, for the lower and upper ends of the distribution of test variables) (Fig. 2A). Additional improvements of about 10% were obtained for the MD+% and MD–% measures through predictions stratified by forest and nonforest covers, at the cost of slightly lower gains for T2 and RMSD% (Fig. 2B). Finally, the selection of k = 3 within the a priori stratified kNN workflow ensured that all four multivariate error metrics remained within 5% of their best values, as compared with within 11% of their best values when using k = 6 as in Beaudoin et al. (2014) (Fig. 3). The resulting implementation of the stratified kNN workflow and the lower k value provided gains in univariate error metrics in the 2001 values of our three test variables (Table 2). Values of T2 increased substantially, by 7% to 13%, whereas RMSD decreased by 2% to 9%. Major gains were observed for the MD– and MD+ measures, with overestimation decreasing by 31% to 41% and underestimation decreasing by 37% to 45%.
Overall, the continued use of the Euclidean distance and t = 0 and the selection of k = 3 are in line with practices reported in a recent meta-analysis of kNN studies (Chirici et al. 2016). Low values of k have been found to yield good error measures in other inventory-based applications of kNN (Halperin et al. 2016). Compared with the original 2001 maps, the new 2001 maps are more contrasted, particularly in areas of sparse tree cover such as the prairie–forest ecotone of central Canada and the boreal–taiga transition, where patterns of attributes such as AGB are better spatially defined (Figs. 4A and 4B). The substantial reduction in overestimation and underestimation at the lower and upper ends of the AGB range, respectively (Table 2), is reflected in the more frequent occurrence of low and high biomass values in the maps (Figs. 4A and 4B).
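As an illustration only, the prediction rule implied by these settings can be sketched as follows. The sketch assumes that t denotes the exponent of an inverse-distance weight applied to the k nearest reference plots, which this section does not define explicitly; the function name, the feature vectors, and the AGB values are hypothetical and are not taken from the reference set.

```python
import math


def knn_predict(target, ref_features, ref_values, k=3, t=0.0):
    """Predict an attribute for one target pixel from its k nearest reference plots.

    Neighbours are ranked by Euclidean distance in feature space and weighted
    by 1 / d**t (assumed meaning of t). With t = 0 every neighbour receives the
    same weight, so the prediction is the plain mean of the k nearest values.
    """
    distances = [math.dist(target, feats) for feats in ref_features]
    nearest = sorted(range(len(ref_features)), key=lambda i: distances[i])[:k]
    if t == 0:
        weights = [1.0] * len(nearest)
    else:
        # Small floor avoids division by zero for exact matches (illustrative choice).
        weights = [1.0 / max(distances[i], 1e-12) ** t for i in nearest]
    total = sum(weights)
    return sum(w * ref_values[i] for w, i in zip(weights, nearest)) / total


# Hypothetical reference plots: two spectral features and an AGB value (t/ha).
ref_features = [(0.10, 0.30), (0.12, 0.28), (0.50, 0.60), (0.11, 0.33), (0.90, 0.10)]
ref_values = [40.0, 50.0, 150.0, 60.0, 5.0]

# Unweighted prediction (t = 0) from the three nearest plots.
agb_k3 = knn_predict((0.11, 0.30), ref_features, ref_values, k=3, t=0.0)
# Same neighbours, but weighted by inverse distance (t = 1).
agb_k3_weighted = knn_predict((0.11, 0.30), ref_features, ref_values, k=3, t=1.0)
```

With t = 0 the result depends only on which plots are selected, not on how close they are, which is one reason a small k keeps more of the contrast present in the reference values.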
Estimated values of univariate error metrics show a slight degradation of agreement with the reference set in the 2011 map as compared with the 2001 map in unchanged pixels, but differences in univariate error estimates between the 2001 and 2011 predictions do not exceed 7% for TREED and NLS attributes and 11% for AGB (Table 2). Although the changes in the error estimates are small, they likely reflect the inability of our various procedures to eliminate time-related effects entirely. Possible error sources include imperfect spectral normalization of interannual variability in MODIS composites, as well as imperfect selection of dynamic spectral variables. Another possible source of error is the larger uncertainty in the forest–nonforest mask of 2011 relative to that of 2001, which results from the updating process based on available ancillary datasets (Supplementary Table S1).
Evaluation of 2001–2011 changes from kNN predictions
Estimated changes in AGB between 2001 and 2011 for test pixels that had been fully disturbed (fractional disturbance > 90%) by harvest or fire, or within which fractional gain had been identified over more than 90% of their surface, show the expected behavior. Overlaps in the frequency distributions of AGB change among these change classes and with the no-change pixels are relatively limited (Fig. 5). The near-zero mean change in AGB (0.18 t·ha⁻¹) between 2001 and 2011 for no-change pixels meets our expectations, while the standard deviation (±26.3 t·ha⁻¹) reflects the combined 2001 and 2011 prediction errors and the possible effects of partial disturbances or limited growth not identified in any of the ancillary datasets.
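The way two yearly prediction errors combine in a difference can be sketched with the standard variance formula for a difference of two variables. This is an illustration under stated assumptions, not a result of the study: it assumes that both maps share the same error standard deviation, that no real change occurred in the no-change pixels, and that the error correlation between years (a hypothetical parameter here) is known.

```python
import math


def sd_of_difference(sd_a, sd_b, correlation=0.0):
    """SD of (b - a) given the error SD of each estimate and their correlation."""
    return math.sqrt(sd_a**2 + sd_b**2 - 2.0 * correlation * sd_a * sd_b)


def per_map_sd_from_difference(sd_diff, correlation=0.0):
    """Invert the formula above for two maps assumed to share the same error SD."""
    return sd_diff / math.sqrt(2.0 * (1.0 - correlation))


# Reported SD of AGB change in no-change pixels (t/ha); independence is assumed.
per_map_sd = per_map_sd_from_difference(26.3)
```

Because both years are predicted from the same reference set, their errors are probably positively correlated, in which case the per-map error implied by a given standard deviation of change would be larger than under independence.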
Pixels having undergone harvest or fire show the expected negative changes in AGB in 99.2% and 98.2% of the cases in our sample, respectively. In addition, losses are larger for harvested pixels (–103.7 ± 59.5 t·ha⁻¹) than for burned pixels (–35.7 ± 26.4 t·ha⁻¹). Again, this result agrees with our expectations, as it reflects the systematic selection of mature, well-stocked stands for harvesting, in contrast to the more random and northerly occurrence of fires. Moreover, the mean harvest-related loss of about 104 t·ha⁻¹ translates roughly to a stem-only volume of 210 m³·ha⁻¹ for softwood species (conversion factors: stem wood = 80% of total biomass, wood density = 0.4 t·m⁻³). This value is close to the mean of 205 m³·ha⁻¹ that can be calculated from reported annual values of total volume harvested and total area harvested across Canada (Natural Resources Canada 2015).
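The biomass-to-volume conversion above is simple arithmetic and can be checked with a short sketch. The two conversion factors are those stated in the text; the function and variable names are illustrative.

```python
def stem_volume_from_agb(agb_t_per_ha, stem_fraction=0.80, wood_density_t_per_m3=0.40):
    """Convert aboveground biomass (t/ha) to stem-only volume (m3/ha).

    stem_fraction: share of total biomass found in stem wood (0.80 in the text).
    wood_density_t_per_m3: softwood density used in the text (0.4 t/m3).
    """
    return agb_t_per_ha * stem_fraction / wood_density_t_per_m3


# Mean harvest-related AGB loss of about 104 t/ha.
harvest_loss_volume = stem_volume_from_agb(104.0)
```

The result is about 208 m³·ha⁻¹, which rounds to the roughly 210 m³·ha⁻¹ quoted above.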
Change in AGB from 2001 to 2011 in pixels having undergone regrowth (29.9 ± 50.9 t·ha⁻¹) is positive, as expected, in 74% of our 1% sample set, but shows greater overlap with the no-change AGB difference distribution than do the fire and harvest classes. The occurrence of negative values in 26% of the test pixels is contrary to our expectations but may result, in part, from kNN prediction errors and from the inherent difficulty of identifying pixels with forest cover gains in the original product of Hansen et al. (2013). Nevertheless, the larger proportion of pixels with positive AGB changes agrees with our expectations. In addition, the resulting mean yearly biomass rate of 3.6 t·ha⁻¹·year⁻¹ over the 10-year period is compatible with the faster rates of juvenile growth across productive forest stands in Canada, as documented in operational timber yield tables (e.g., Pothier and Savard 1998), where forest cover gains would have been detected by Hansen et al. (2013).
Values of AGB and TREED in pixels affected by harvest or fire drop between target years 2001 and 2011 in proportion to the fractional disturbance, a result that is in line with our expectations (Figs. 6A and 6B). The changes in AGB due to regrowth also behave as expected, with pixels showing gains in both AGB and TREED proportional to the fractional gain (Figs. 6A and 6B). Finally, when mapped across the landscape, AGB change values are quite similar for fire and harvest (Fig. 4C). This again meets our expectations. Although the average AGB loss per hectare is larger for harvest than for fire, values of fractional change are larger for fire than for harvest (Fig. 4D) (Guindon et al. 2014). As a result, the realized pixel-level losses (AGB loss per hectare × fractional change) are quite similar for both disturbances.
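The compensation between per-hectare loss and fractional change can be made concrete with a small sketch. The per-hectare losses are the means reported above for fully disturbed pixels; the two fractional change values are hypothetical, chosen only to illustrate the effect, and are not values read from Fig. 4D.

```python
def realized_pixel_loss(loss_per_ha, fractional_change):
    """AGB loss realized in a pixel: per-hectare loss scaled by the disturbed fraction."""
    if not 0.0 <= fractional_change <= 1.0:
        raise ValueError("fractional_change must lie between 0 and 1")
    return loss_per_ha * fractional_change


# Mean per-hectare losses reported in the text for fully disturbed pixels (t/ha).
HARVEST_LOSS = 103.7
FIRE_LOSS = 35.7

# Hypothetical fractional changes, chosen only to illustrate the compensation effect.
harvest_realized = realized_pixel_loss(HARVEST_LOSS, 0.30)
fire_realized = realized_pixel_loss(FIRE_LOSS, 0.85)
```

Under these assumed fractions the two realized losses differ by less than 1 t·ha⁻¹, even though the per-hectare losses differ by almost a factor of three.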
The new maps of forest attributes for 2001 and 2011 share a reference dataset presumed to be time invariant and therefore applicable to any particular year of the MODIS time series. Time invariance is a general principle used in all MODIS-based products that rely on a fixed set of relationships between spectral features and properties at the surface of the Earth to map these properties across years (Pouliot et al. 2009). In this study, time invariance was sought by using MODIS images that were normalized across years and by removing predictive variables that still varied across years over stable dense forests in spite of the normalization. However, the imperfect time invariance of our reference set can be expected to generate some uncertainty in the estimates.
Although the changes made to the kNN prediction process listed above have improved the overall level of agreement across all error measures, substantial uncertainties remain associated with these predictions. Potential sources include the residual impact of the time mismatch between the 2001 MODIS images and the year of aerial photo capture for the pre-2001 photo plots retained for this analysis (McRoberts et al. 2016) and the errors associated with the use of photo interpretation and modelling to create the photo-plot dataset (Magnussen and Russo 2012). Ongoing re-measurement of the photo plots will alleviate some of these problems as these data become available, but a more fundamental shift in approach to the current analysis may be required to achieve substantial gains in accuracy.
Our estimates of changes between 2001 and 2011 are based on national-level samples assembled by classes of fractional disturbance or fractional gain and thus represent aggregate values from pixels with no systematic spatial affiliation. Because true validation data for change estimates are lacking, we cannot at this stage evaluate the appropriate area over which pixel-level results have to be aggregated to correctly represent quantitative change in forest attributes following regrowth or losses due to disturbances.
Beaudoin et al. (2014) found that aggregation of pixel-level biomass estimates to 1 km² improved values of T2 by 25% and of RMSD by 40%. Because estimates of change involve a difference between two yearly estimates, we suggest that aggregation would need to be performed over units larger than 1 km² to obtain similar improvements in error measures. Also, Beaudoin et al. (2014) found that the quality of estimates degrades in areas with more complex topography or that are under-represented within our reference dataset through lack of forest inventory. The same limitations will apply to change estimates.
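Spatial aggregation of pixel-level change estimates can be sketched as a block mean over a regular grid. This is a minimal illustration, not the procedure of Beaudoin et al. (2014): it assumes a square pixel grid whose dimensions are exact multiples of the block size (for example, a 4 × 4 block of 250 m pixels would cover 1 km²), and the grid values below are hypothetical.

```python
def block_mean(grid, block):
    """Average a 2D grid of pixel values over non-overlapping block x block windows."""
    rows, cols = len(grid), len(grid[0])
    if rows % block or cols % block:
        raise ValueError("grid dimensions must be multiples of the block size")
    aggregated_rows = []
    for r0 in range(0, rows, block):
        out_row = []
        for c0 in range(0, cols, block):
            cells = [
                grid[r][c]
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)
            ]
            out_row.append(sum(cells) / len(cells))
        aggregated_rows.append(out_row)
    return aggregated_rows


# Hypothetical 4 x 4 grid of pixel-level AGB change values (t/ha).
demo_grid = [
    [1.0, 3.0, 5.0, 7.0],
    [1.0, 3.0, 5.0, 7.0],
    [2.0, 2.0, 6.0, 6.0],
    [2.0, 2.0, 6.0, 6.0],
]

# Aggregate to 2 x 2 blocks; a larger block size gives coarser reporting units.
aggregated = block_mean(demo_grid, 2)
```

Increasing the block size averages out more of the pixel-level prediction error, at the cost of spatial detail.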