A.1. Neural network models
This section gives a brief summary of the neural network (NN) models developed in our previous study to map the functional relationships driving net ecosystem exchange (NEE) (CO
2 exchange) and net methane exchange (NME) (CH
4 exchange), respectively. For an overview of the data collected during the 2017 study and a comprehensive discussion of the NN models, we direct the reader to
Skeeter et al. (2022). NNs are universal approximators; they make no prior assumptions about the distribution of a dataset and with enough hidden nodes they are capable of mapping any continuous function to an arbitrary degree of accuracy (
Hornik 1991;
Khosravi et al. 2011). Given this, an NN will fit any pattern, real or artificial, so care must be taken to ensure model estimates are physically plausible. Our previous study took a number of steps to prevent overfitting and ensure our models were plausible.
Both models (one for NEE and one for NME) were ensembles of 30 randomly initialized feed forward single layer NN, with each NN trained on a bootstrapped sample of the 2017 eddy covariance (EC) data. Early stopping was used to prevent overfitting (
Sarle 1995;
Heskes 1997). Ensemble means and standard deviations were used to calculate hourly NEE and NME estimates with 95% confidence intervals (
Khosravi et al. 2011). First, overparametrized models were trained on a large set (
n = 21) of potential inputs. Then, weights method (
Gevrey et al. 2003) was used to prune unnecessary inputs and the models were retrained using only the most important inputs (drivers). This approach had not been applied to EC data before, distinguishing our work from other EC studies that have treated NN as “black box” models (e.g.,
Papale and Valentini 2003;
Moffat et al. 2007;
Dengel et al. 2013;
Knox et al. 2019;
Schäfer et al. 2019;
Lee et al. 2021).
Figure A1a shows the fit between observed CO
2 fluxes and modelled NEE,
Fig. A2 shows observed modelled relationships driving NEE. The NN model was not set up to explicitly partition NEE into its component fluxes, ecosystem respiration (ER) and gross primary productivity (GPP), where NEE = ER − GPP (
Aubinet et al. 2012). However, modelled NEE provided a strong fit with the EC observations (
r2 = 0.95, RMSE = 0.33 µmol m
−2 s
−1) and the NN correctly identified the relevant drivers of both ER and GPP. Photosynthetic photon flux density (PPFD) was the primary driver of GPP, which was modulated by vapor pressure deficit (VPD). The NN mapped the exponential relationship between soil temperature and ER; the response function was similar to the commonly used Q
10 approach but was weighted to account for the differential respiration rates of polygon rims and centers. Thaw depth mapped seasonality in the NN, which was a conflation of seasonal phenology changes and the increasing volume of substrate available to respiration. This model also included a one-hot-coded “daytime” variable denoting whether the sun was above/below the horizon to explicitly distinguish between day and night.
Fig. A1b shows the fit between observed CH
4 fluxes and modelled NME;
Fig. A3 shows the modelled relationships driving NME. Methane fluxes are complex; the literature does not suggest any broadly applicable, well-defined functional relationships governing ecosystem scale NME (
Irvin et al. 2021). The balance of methanogenesis and methanotrophy dictates NME and both components are driven by soil temperature, moisture, and substrate availability (
Lai 2009). However, substrate availability can change over short timescales and the rate of CH
4 transport varies by emission pathway (diffusion, ebullition, or vascular transport), limiting methanotrophy (
Joabsson et al. 1999;
Ramirez et al. 2015). Modelled NME provided a moderate fit with the EC observations (
r2 = 0.79, RMSE = 5.89 nmol m
−2 s
−1). The NN model showed that at subdaily timescales, NME was driven by the dependence of methanogenesis on GPP, the dependence of vascular transport on net radiation (
Rn), ventilation of peat under turbulent conditions, and the relative contributions of polygon centers/rims to flux observations. At longer timescales, soil temperature, water table depth, and thaw depth were the important drivers of NME.
The relative areal fractions of polygon centers (66%) and rims (29%) at the ecosystem scale were slightly different than the median distribution of polygon centers (63%) and rims (23%) in the flux footprint climatology; the flux footprint climatology does not sum to 100% because the footprint function is technically infinite. We estimated that location bias (placement of the EC station) caused our NME estimates to be approximately 1.0 nmol m−2 s−1 above background levels in the low-center polygon (LCP) ecosystem at Fish Island. We did not find location bias impacted our NEE estimates because flux observations were not as sensitive to changes in the flux source area. We did find CO2 fluxes varied between polygon centers and rims at the plot scale, but this variability averaged out at the footprint scale because the magnitude of the differences was less drastic than for CH4.
A.2. Driver estimation
A.2.1. Fish Island AWS
In August 2008 an AWS was installed in the LCP ecosystem at Fish Island (69°22′19.7″N, 134°52′55.9″W), operated by the Canadian Department of Indian Affairs and Northern Development, Water Resources Division. The AWS recorded hourly data on a CR1000 data logger equipped with a CFM100-XT data module (Campbell Scientific Inc., Logan, UT, USA; CSI). The AWS measured wind speed (U) and direction at 2.9 m with a 05103AP-10-L Wind Monitor (RM Young, Traverse City, MI, USA), air temperature (Ta) at 2 m with a 44212 Temperature Thermilinear Probe (CSI), incoming and outgoing longwave (LW) and shortwave (SW) radiation at 1.55 m with a CNR2 4-component net radiometer (Kipp & Zonen, Delft, NL). On August 21st 2009, an unheated TE525M Tipping Bucket (Texas Electronics, Dallas, TX, USA) was installed to measure rainfall (rain). Data were nearly complete over the full period of record, except rainfall, which are missing until August 21st 2009 and during periods of snowfall. Data gaps of 2 h or less were filled with linear interpolation and longer gaps were filled using the average of corresponding hourly values for the day before and after the gap. We obtained the AWS data (spanning August 28th 2008–August 11th 2020) through personal communications with the Government of the Northwest Territories Department of Environment and Natural Resources.
Over the 2017 study period, AWS data were in close agreement with hourly averages of equivalent observations from our EC station; our EC station was located 47 m northeast of the AWS in 2017. Regression models were used to scale each AWS variable to its equivalent EC variable to remove bias between the two observations (
Table A1). Since friction velocity was not measured at the AWS, we used a slightly different approach to estimate this variable. An exponential regression model was fit to the half-hourly EC data:
where the coefficients were
a = 0.082 m s
−1,
b = 0.96 m s
−1, and
c = 6.0 × 10
−5 m
3 J
−1. The model provided a reasonable fit (
r2 = 0.92; RMSE = 0.039 m s
−1) and accounted for the influence of diurnal heating on friction velocity. Then, hourly
u* was estimated from the scaled estimates of
U and NR measured by the AWS using these coefficients.
A.2.2. ERA5 reanalysis
One-hour resolution ERA5-Land reanalysis data produced by the European Center for Medium-Range Weather Forecasting were used to complement the AWS data (
Muñoz 2019). Hourly ERA5 data from January 1st 2008 to December 31st 2020 were obtained at 11 km grid resolution from Google Earth Engine. Fish Island was covered by parts of three grid cells, so the spatially weighted mean of each hourly value was calculated over Fish Island. The data included estimates of air temperature and dewpoint at 2 m, total precipitation (liquid equivalent), fractional snow cover, soil temperature and volumetric water content. The soil data were averaged across depth profiles: (1) 0–7 cm, (2) 7–28 cm, and (3) 28–100 cm.
The reanalysis estimates of air temperature were in strong agreement with observations from the EC station in 2017 (r2 = 0.90; RMSE = 1.5 K) and with observations from the AWS over its full period of record (r2 = 0.96; RMSE = 3.0 K). Reanalysis estimated rainfall data were included in this study because a rain gauge was not installed at the AWS until late August 2019 and because the AWS rainfall data did not include snow. On an hourly basis, the observed and estimated rainfall were poorly correlated (r2 = 0.19), but when summed over 48 h intervals the correlation improved (r2 = 0.63). Monthly ERA5 precipitation totals were about 74% higher than AWS observations, but the values were reasonably correlated (r2 = 0.82).
Reanalysis estimated dewpoint temperature was in moderate agreement with the EC station in 2017 (
r2 = 0.71; RMSE = 1.7 K). These data were necessary to model the VPD at Fish Island, because humidity was not measured at the AWS. The reanalysis appears to underestimate the maritime influences at Fish Island (
Fig. A4a). For example, when winds were off the ocean from the northwest (315°–360°), estimated dewpoint temperature was less accurate (
r2 = 0.33; RMSE = 2.2 K) compared with when winds were from the southeast (135°–180°) with no maritime influence (
r2 = 0.86; RMSE = 1.2 K)
. To estimate VPD, we estimated the dewpoint from ERA5 data, then calculated the VPD using the Clausius–Clapeyron equation with the estimated dewpoint and observed air temperatures. Because the dewpoint estimate was only accurate for certain wind sectors, we used an ordinary least squares (OLS) regression model with seven inputs to scale the dewpoint estimates and account for the continental bias of the reanalysis data. The seven OLS inputs included ERA5 estimate dewpoint and air temperature, along with five AWS observations of
Ta, SW, LW, and the u and v wind components. Inputs were
z-score normalized before training and relative magnitude of each coefficient is shown in
Fig. A4b. The OLS scaled dewpoint temperature provided a better fit with observations from the EC station in 2017 (
r2 = 0.86; RMSE = 1.0 K) (
Fig. A4c). Before calculating the VPD, any dewpoint estimates that were above the
Ta at the AWS were fixed to
Ta. The final VPD estimates provided a reasonable fit with observations from 2017 (
r2 = 0.88; RMSE = 120 hPa) (
Fig. A4d).
The ERA5 data were also needed because the AWS data did not include any subsurface observations. The ERA5 soil data are integrated over depth levels; no single estimate is directly comparable to any observation.
Table A2 shows the correlations between soil conditions observed at EC station in 2017 to reanalysis estimates. The ERA5 soil temperatures for levels one and two were moderately correlated with soil temperatures at 5 cm from polygon rims and polygon center, respectively, but the reanalysis tended to over/under estimate the daily maximum/minimums. Polygon rims are elevated and drier than polygon centers, which explains why they had better associations with reanalysis estimates at levels 1 and 2, respectively. The relationships between reanalysis estimates with temperatures at 15 cm were weak for all levels, but correlations did improve when averaged daily (not shown). Reanalysis estimates of soil moisture at both levels were well correlated with water table depth observations. Each soil parameter was estimated using an OLS regression model, following a similar procedure as was used to estimate the dewpoint temperature. Each model had nine inputs that were
z-score normalized before training (
Fig. A5). The inputs included the five ERA5 subsurface parameters: ERA5 estimated rainfall (rolling 48 h totals); AWS observations of
Ta, SW, LW; and rolling 6-h average air temperatures. Each estimated parameter provided a reasonable fit (
r2 ≥ 0.88) with its corresponding observed values in 2017.
A.2.3. MODIS data
Daily 500 m resolution Nadir Bidirectional Reflectance Distribution Function Adjusted Reflectance (NBAR) MODIS data were obtained using Google Earth Engine. The NBAR data product is a daily composite of surface reflectance covering every pixel of the global land surface; it is derived from an algorithm that models the surface reflectivity using multidate, multiangular, cloudfree, atmospherically corrected, surface reflectance observations (
Wanner et al. 1997;
Lucht et al. 2000;
Schaaf 2021). Daily NBAR data are modelled using observations from MODIS instruments on the Terra and Aqua satellites over a 16-day period, with the image date centered on the 9th day, weighted data as a function of quality, coverage, and distance (temporal) from the day of interest (
Schaaf et al. 2002;
Schaaf 2021). The NBAR data aims to minimize missing data due to cloud cover while providing representative estimates of reflectivity (
Schaaf 2021). The NBAR data had good coverage, but there were missing data in September (4%) and October (25%) due to cloudy conditions and low light levels; these data gaps were backfilled linearly. The NBAR data were spatially averaged across the LCP ecosystem at Fish Island (defined by the landscape classification map in
Fig. 1) and used to calculate daily normalized difference vegetation index (NDVI) (Bands 1 & 2) and Normalized Difference Snow Index (NDSI) (Bands 4 & 6) for the periods between May 1st and October 31st, each year from 2009 to 2019.
When NDSI > 0, it indicates a pixel is snow covered, so days were classified as snowy if NDSI > 0. Comparing the NDSI data (excluding gaps) with ERA5 fractional snow cover in May/June, the values were moderately correlated (
r2 = 0.61) but the reanalysis estimated earlier snowmelt dates than were observed by satellite; in September/October, the two metrics were in better agreement (
r2 = 071). The NDSI data were combined with AWS mean daily air temperatures (
Ta) and used to calculate a daily thaw degree day index:
were

and

. Then, thaw depth (TD) was then calculated by fitting a limited number (
n = 50) thaw depth measurements from 2017 to an exponential regression model:
Thaw depth was measured manually on 5 days (day of year 174, 184, 213, 233, and 258; 10 replications each day) in 2017. The model was trained with K-fold (
K = 10) cross validation; the coefficients
a = 0.035 and
b = 0.53; the fit with the 2017 thaw depth observations was
r2 = 0.92 and RMSE = 0.02 m (
Fig. A6a). With the exponential function, thaw depth increases rapidly with snowmelt and continued to increase monotonically over the season until freezing conditions returned in late fall (
Fig. A6b). Thaw depth remained fixed at its maximum value until the end of the year; in reality, the active layer refreezes from the top and bottom and a portion of the active layer remains thawed well into early winter (
Natali et al. 2019). However, there is no reasonable way to incorporate these dynamics here, and estimating C fluxes during freeze-up is beyond the scope of this analysis.
We obtained maximum annual thaw depth data from Lousey Point and Reindeer Depot, two Circumpolar Active Layer Monitoring network sites (CALM, 2022). Lousey Point is an upland site on Richard's Island, 22 km east of Fish Island and Reindeer Depot is a forested site in the delta 86 km south of Fish Island. The data included values through 2017; both CALM stations had thicker active layers than Fish Island, but these data were needed to provide an independent assessment of our estimates at Fish Island (
Figs. A7c and
A7d). Both stations Lousey Point (Spearman
ρ = 0.79;
p = 0.01) and Reindeer Depot (Spearman
ρ = 0.93;
p < 0.01) provided a reasonable fit with estimates of maximum annual thaw at Fish Island. To check for the plausibility of early season thaw depth estimates, May and June soil temperature estimates were compared with mean daily soil temperatures at 5 and 15 cm with linear regression models. They had reasonable fits (
r2 ≥ 0.57) and the 0 °C intercepts with estimated thaw depths were reasonably close to the respective depth levels of the temperature observations (
Table A3). Given the simplicity of the thaw depth estimation procedures, these relationships are encouraging.
A.3. Evaluating the upscaled fluxes
Figure A7shows a comparison between estimated and observed fluxes. Net ecosystem exchange closely matched observed values and the low mean bias error (MBE) indicated that there was not a significant bias in estimated NEE. Net methane exchange did not match observations as closely, and the estimates had a small positive MBE. The bias was because the flux driver time series did not account for the influence that variable source area had on hourly fluxes observed by the EC station in 2017; we projected NME to the mean distribution of polygon rims/centers at the site. Daily sums show a similar fit for NEE and an improved fit for NME (
Figs. 3c and
3d). Overall, this comparison suggests the estimates of daily NEE and NME presented in this study are reasonable during times when conditions were most similar to the training data used to develop the NN models. The NN models are capable of making projections beyond conditions they were trained on; but error increases as conditions diverge from the training data, and care must be taken to ensure conditions are not fundamentally different than the training data. By limiting our analysis to days with mean soil temperatures at 15 cm > 0 °C, we are restricting our upscaled estimates to conditions that are not drastically different than the 2017 study period.
For an additional check of the plausibility of our daily NEE estimates, we compared them to SMAP L4 global daily 9 km resolution NEE data (
Kimball et al. 2021). These satellite estimates of NEE are calculated from 500 m resolution MODIS data, 9 km resolution Soil Moisture Active Passive (SMAP) observations, and 1/4 degree resolution Goddard Earth Observing System Model, Version 5 (GEOS-5) reanalysis data. The SMAP L4 data product is based on a landscape classification map that does not include tundra or wetlands; the LCP ecosystem at Fish Island is classified as shrubland. The 9 km SMAP data are averaged across multiple ecosystems (LCP, shrub tundra, small lakes, etc.), so they are not directly comparable. However, the comparison does help show that our growing season NEE estimates are reasonable (
Table A4;
Fig. A8). The 95% confidence intervals for our NEE estimates contain overlap with the SMAP L4 estimates of each year. The correlation between the two NEE estimates by year was Spearman
ρ = 0.70 and
p = 0.18. Overall, daily NEE estimated by our model matched the seasonal patterns of the SMAP data reasonably well (
r2 = 0.66) and there was not a significant disparity between the 5 years of overlapping data.