A carbon monitoring system for mapping regional, annual aboveground biomass across the northwestern USA

Andrew T Hudak; Patrick A Fekety; Van R Kane; Robert E Kennedy; Steven K Filippelli; Michael J Falkowski; Wade T Tinkham; Alistair M S Smith; Nicholas L Crookston; Grant M Domke; Mark V Corrao; Benjamin C Bright; Derek J Churchill; Peter J Gould; Robert J McGaughey; Jonathan T Kane; Jinwei Dong

doi:10.1088/1748-9326/ab93f9

1. Introduction

Humans heavily influence forest carbon (C) dynamics directly via land management as well as indirectly by changing greenhouse gas composition and climate (Birdsey and Pan 2015). The U.S. Forest Service (USFS) Forest Inventory and Analysis (FIA) Program plays a critical role in monitoring national forest trends, as do the National Aeronautics and Space Administration (NASA) and the U.S. Geological Survey (USGS) with Landsat and other remote sensing data products (Masek et al 2008, 2013, 2015). Light detection and ranging (lidar) is the most accurate remote sensing technology for estimating forest and stand structure attributes (Lefsky et al 2001, 2002b, Hyde et al 2007, Sexton et al 2009, Mondino et al 2020). Thus, there is considerable motivation to integrate lidar measurements into forest inventories (Duncanson et al 2010, Johnson et al 2014, 2015, Sheridan et al 2015).

Indeed, land managers have cumulatively invested and continue to invest millions of dollars for commercial off-the-shelf (COTS) airborne lidar collections to achieve multiple natural resource management objectives (Hudak et al 2009). Many COTS lidar collections aim to update the National Elevation Dataset (NED) while acquiring data towards national lidar coverage, partially funded by the USGS 3D Elevation Program (3DEP) (Sugarbaker et al 2017). Applications in geomorphology, hydrology, and road planning may only require a Digital Terrain Model (DTM) of the bare earth surface. However, applications of lidar for forest inventory require further investment into field plot data collections to estimate most stand attributes of interest (volume, basal area, biomass, etc.), often at substantial added expense. Forest managers can justify the added cost for field plot data collections because lidar-based inventory can map stand structure attributes with high precision at a pixel level to display within-stand variation, whereas the spatial resolution of traditional forest inventory is limited to the stand or stratum level (Hummel et al 2011).

Lidar is known to provide accurate characterization of forest structure at the plot, stand, and landscape levels (Kane et al 2010). Therefore, multiple COTS lidar and associated field plot data collections, which we henceforth term 'project datasets,' have unrealized potential to more broadly inform forest planning, management, and monitoring efforts at the regional level. Integrating multiple project datasets into a common database adds further value to stakeholders' already large investments into project data collections. However, the caveat is that they collectively represent a spatially biased sample of forests in the region. This is a key distinction of our project datasets from FIA plot data, which provide an unbiased, systematic sample of forest conditions in space and time, as is needed in support of forest planning and monitoring, reporting, and verification (MRV) (Tinkham et al 2018, Hurtt et al 2019).

The use of project datasets that are disjunct in time or space assumes the transferability of estimates from project landscapes where both lidar and field plot data are available to project areas where only lidar data are available. It has been demonstrated that estimates of stand structure attributes can be transferred in both time (Fekety et al 2015) and space (Fekety et al 2018) in northern Idaho, where coniferous forest composition and structure may be the most diverse in the northwestern USA. These studies show that although there is some loss in the accuracy and precision of stand structure attributes transferred across time or space, acceptable estimates are still obtained that meet management needs, or at least suffice until more contemporary or local inventory plot measures can be obtained (Fekety et al 2015, 2018). While airborne lidar coverage is complete in many eastern USA states, lidar collections in the West are spatially (Fekety et al 2018) and temporally (Fekety et al 2015) disjunct. Yet western USA lidar coverage is extensive enough that all major forest types and conditions are represented in many locations, primarily as a single snapshot in time (as early as 2002 in this study), but with little overlapping coverage (i.e. repeat observations).

For mapping, radiometrically and geometrically calibrated Landsat image time series provide the spatial and temporal continuity needed for regional and national forest planning and monitoring (Banskota et al 2014, Wulder et al 2019). For these reasons, the vast majority of predictive models for mapping stand structure attributes use FIA plots for calibration (because they are statistically unbiased) and Landsat images as the primary source of predictor variables (because they are spatially and temporally continuous and consistent) (Powell et al 2010, Zhu and Liu 2015, Kennedy et al 2018a). The unrivaled duration of moderate spatial resolution Landsat image time series and recent availability of advanced time series algorithms and products (Masek et al 2008, 2013, Huang et al 2010, Kennedy et al 2010, 2015, Hermosilla et al 2016) provides an important temporal dimension to estimating not just AGB but changes in AGB to improve understanding of the carbon consequences of forest management and disturbance (Kennedy et al 2018a, 2018b).

However, the 30 m resolution of Landsat TM is less than satisfying to land managers that seek to make operational decisions at the local level. Moreover, Landsat and for that matter all passive optical imagery suffers from lack of sensitivity in cases of high canopy closure (Smith et al 2009) or in high biomass forests, with the signal saturating at an AGB density of ∼150–250 Mg ha⁻¹ or more depending on the study (Huete et al 1997, Steininger 2000, Dong et al 2003, Avitabile et al 2012, Zhu and Liu 2015, Durante et al 2019). The sparseness of the FIA sample plot grid at local scales also compromises the local accuracy of regional map products, and the spatial configuration of FIA plots makes them problematic to relate to Landsat pixel data directly for model development, given the spatial mismatch between the 7.3 m radius, round configuration of an FIA subplot and 30 m × 30 m square Landsat pixels (Tinkham et al 2018). Inevitably, the four subplots will intersect a different number of pixels and in varying proportions; the subplots occasionally (10% of plots in this study) also represent multiple condition classes, such that forest edge effects add further noise to any modeled relationships (figure 1).

Figure 1. Refer to the following caption and surrounding text. — **Figure 1.** (a) Lidar canopy height model overlay versus, (b) Landsat pixel overlay on the FIA plot configuration on a forest/non-forest edge of mixed condition class.
Download figure:
Standard image High-resolution image

Finally, the geolocation accuracy of the center subplot varies by USFS FIA Region, depending on the protocol and the quality of the Global Navigation Satellite System (GNSS) receiver used in the field; the three peripheral subplots, while systematically laid out from the center subplot by consistent distances (36.6 m) and bearings (120°, 240°, 360°) are usually not georeferenced, making them more subject to locational inaccuracy due to additive errors in accounting for horizontal distance on slopes and for magnetic declination on compass azimuths (Zald et al 2014).

Accuracy issues aside, FIA plot locations are confidential and not readily available to users. This policy, although justifiable for maintaining plot integrity and landowner trust, has been a major deterrent for forestry applications that involve developing relationships to remotely sensed data, particularly globally available 30 m Landsat TM imagery (Tinkham et al 2018). The scale of lidar point cloud data provide a way to bridge the scale gap, incompatible shapes, and location issues that complicate the relationships between Landsat pixel data and FIA or other field plots (Zald et al 2014, 2016). Lidar points can be binned just as easily within fixed-radius plot footprints as within 30 m pixels, such that the lidar points are tightly coupled to tree measures or Landsat pixel values, respectively. Indeed, this is the basis for small area-based predictive modeling of stand structure attributes for lidar-based forest inventory, as demonstrated in multiple research papers and widely implemented operationally by forest managers (Næsset 2004, Næsset et al 2004, Maltamo et al 2004, Hudak et al 2006, 2008, Hyyppä et al 2008, White et al 2013, Wulder et al 2013).

In this paper, we present a prototype Carbon Monitoring System (CMS) that addresses three objectives in three steps.

The first step leverages the spatial strength of lidar for characterizing AGB at the scale of forest inventory plots and across lidar project landscapes; our hypothesis is that AGB mapped from lidar data at 30 m resolution represents the full range of AGB conditions and can be used to train a regional model to estimate AGB where lidar data are unavailable in space or time.
The second step leverages the temporal strength of Landsat for monitoring AGB at the regional scale; our hypothesis is that annual Landsat time series metrics that capture disturbance dynamics can be combined with metrics derived from 30 year climate normals and static topography to estimate AGB across large productivity gradients without asymptote.
The third step leverages the strength of FIA plots for design-based, unbiased AGB estimation; our hypothesis is that FIA plot data can be used for regional AGB map calibration instead of regional AGB model calibration as is conventional.

We propose that our prototype CMS provides an MRV framework that can provide unbiased AGB estimates over time and engage stakeholders, including those who contributed the project datasets upon which our prototype CMS is based.

2. Material and methods

2.1. Study area

The study area is comprised of the forested portions of Washington, Oregon, Idaho, and the 12 counties in Montana located west of the Continental Divide (figure 2). The climate for this region transitions from continental in the east to maritime in west, with mean annual precipitation ranging greatly from 196 mm to 3041 mm (Current Results 2019). The major mountain ranges are the Northern Rocky Mountains, Cascade Range, and the Coast Range, with western slopes being more productive then eastern slopes due to steep orographic precipitation gradients. Forests east of the Cascade Range are primarily coniferous whereas those west of the Cascade Range are a mixture of conifer and deciduous species.

Figure 2. Refer to the following caption and surrounding text. — **Figure 2.** Study area, with field plot and project lidar datasets shown.
Download figure:
Standard image High-resolution image

2.2. Project datasets

No field data were collected specifically for this study. Rather, field plot and lidar project data collected by stakeholders for forest inventories were used, contributed by Federal State University, tribal, and private organizations, hereby collectively termed our stakeholders (N = 29), which collectively totaled 3805 field-measured plots (figure 2). The requirements for acceptable plot data were: (1) fixed-area plots; (2) georeferenced with a GNSS capable of differential correction; (3) established within ±3 years of an overlapping lidar collection; and (4) not disturbed in the time between field and lidar data collections.

Field plots consisted of a single, fixed-radius plot ranging between 169 m² and 809 m², with the exception of 60 nested plots from Priest River Experimental Forest, Idaho, which were established following the FIA protocol of four distributed subplots (Bechtold and Patterson 2005). Field plots in the other 28 project datasets were established following stratified random designs, usually as support for the corresponding lidar collection, although the stratification variables used differed among contributing stakeholders. Across all field protocols, a minimum of species, status (live or dead), and diameter at breast height (DBH) were recorded for all trees with a DBH greater than or equal to 12.7 cm and consequentially trees with DBH less than 12.7 cm were excluded from this study (including the FIA plot data used for validation). Tree heights were measured on subsamples of the trees; in total, 48 836 out of 97 386 trees in the database (50.1%) had a height measurement.

AGB was calculated from project field plot tree measurements using the default equations found in local variants of the Fire and Fuels Extension (FFE) of the Forest Vegetation Simulator (FVS) (Rebain 2015, Dixon 2018). These equations are based on a series of regional volume and biomass equations. The aboveground portion of the live and standing dead trees were summed to a single, plot-level AGB value; the belowground portion of the trees and non-tree species were excluded from the AGB estimates. These plot-level AGB estimates were the response variable for the landscape model (section 2.3), the estimates of which informed the regional model (section 2.4), via a two-tiered modeling strategy (figure 3).

Figure 3. Refer to the following caption and surrounding text. — **Figure 3.** Flowchart of the methodology.
Download figure:
Standard image High-resolution image

2.3. Landscape model

The landscape model predicted AGB as the response variable using climate, topographic, and lidar metrics as predictor variables (table 1), to estimate AGB where lidar data were available. Local and regional databases were searched to identify discrete airborne lidar collections located in the study area. Lidar collections included in this prototype CMS covered forested ecoregions and were collected between 2002 and 2016. Lidar data were processed using FUSION v3.60, a free and open source lidar processing software developed by the USFS (Mcgaughey 2015). Plot-level lidar metrics used as predictor variables in the landscape model (table 1) were calculated by clipping and height-normalizing the lidar point cloud. Project-level gridded lidar metrics used in creating AGB maps (maps are described below) were also created with FUSION. Gridded lidar metrics for Idaho and eastern Washington were created at a 30 m spatial resolution that matched the LandTrendr data (i.e. EPSG:5070; https://epsg.io/5070). Gridded metrics calculated with the same FUSION parameters at 30 m resolution for Oregon and the remaining portion of Washington already existed and were reprojected and resampled to match the spatial reference system used in this project. In total, 13 007 443 ha of lidar coverages were processed (figure 2).

Table 1. Candidate and selected predictor variables for the Landscape Model.

Predictor	Description	Selected	Data source
Mean	Lidar height mean	X	FUSION
StdDev	Lidar height standard deviation	X	FUSION
IQRange	Lidar height inter-quartile range		FUSION
Skewness	Lidar height skewness		FUSION
Kurtosis	Lidar height kurtosis	X	FUSION
P05	Lidar height 5th percentile	X	FUSION
P25	Lidar height 25th percentile		FUSION
P50	Lidar height 50th percentile		FUSION
P75	Lidar height 75th percentile		FUSION
P95	Lidar height 95th percentile		FUSION
CanopyReliefRatio	(mean height—min height)/(max height—min height)	X	FUSION
Pct1stRtnAbove2 m	% lidar returns > 2 m	X	FUSION
Pct1stRtnAboveMean	% lidar returns > mean lidar height	X	FUSION
Stratum00 mTo0p5 m	% lidar returns in 0–0.5 m height stratum	X	FUSION
Stratum0p5 mTo01 m	% lidar returns in 0.5–1 m height stratum	X	FUSION
Stratum01 mTo02 m	% lidar returns in 1–2 m height stratum	X	FUSION
Stratum02 mTo04 m	% lidar returns in 2–4 m height stratum	X	FUSION
Stratum04 mTo08 m	% lidar returns in 4–8 m height stratum	X	FUSION
Stratum8 mTo16 m	% lidar returns in 8–16 m height stratum		FUSION
Stratum16 mTo32 m	% lidar returns in 16–32 m height stratum	X	FUSION
Stratum32 mTo48 m	% lidar returns in 32–48 m height stratum	X	FUSION
Stratum48 mTo64 m	% lidar returns in 48–64 m height stratum	X	FUSION
StratumAbove64 m	% lidar returns > 64 m height stratum	X	FUSION
mat	Mean annual temperature		Climate FVS
mmin	Minimum temperature in the coldest month	X	Climate FVS
mmax	Maximum temperature in the warmest month		Climate FVS
map	Mean annual precipitation	X	Climate FVS
gsp	Growing season precipitation	X	Climate FVS
sday	Julian date of the last freezing date of spring		Climate FVS
ffp	Frost free period		Climate FVS
Slope	Percent slope		SRTM
TPI90 m	Topographic position index in a 90 m window	X	SRTM
TPI990 m	Topographic position index in a 990 m window	X	SRTM
TrAsp	Transformed aspect	X	SRTM
SCosAsp	Slope X Cosine(Aspect)		SRTM
SSinAsp	Slope X Sine(Aspect)	X	SRTM

Additional predictor variables used in the landscape model (table 1) included plot-level and gridded climate metrics (1961–1990 normals) obtained from the Climate-FVS Ready Data Server (Crookston 2016). Also included were topographic metrics based on Shuttle Radar Topography Mission (SRTM) 1 Arc-Second Global products because of its near global coverage (USGS 2018). SRTM elevation rasters were reprojected and resampled to 30 m resolution, and topographic metrics were calculated in GRASS GIS v7.4. Plot-level topographic estimates, calculated as the area-weighted mean of the pixel values intersecting the plot footprint, were extracted from these rasters.

Random Forests (RF) regression (Breiman 2001) was chosen to predict AGB because it is a flexible, non-parametric approach that can account for non-linear variable interactions. RF is encoded to bootstrap the dataset to generate a virtual forest of regression trees while employing an out-of-bag sampling with replacement strategy that provides resistance to overfitting (Hudak et al 2008, Latifi and Koch 2012, Hayashi et al 2015). To further guard against overfitting, highly correlated explanatory variables (r ≥ 0.9) were removed, and the Model Improvement Ratio statistic available in the model selection tool of the R package rfUtililies (Evans and Murphy 2017) was used to specify the model using the rf.modelSel tool. Based on preliminary analysis, the parsimony parameter of rf.modelSel was set to 1%; the RF model error was found to stabilize after generating 200–400 trees, hence the default RF setting of 500 regression trees was maintained to assess RF model fit as reported by the pseudo-R² statistic. All field plots were used to construct the landscape model, since the AGB estimates would ultimately be validated against independent FIA data (section 2.5). The landscape model was applied to rasters of predictor variables that covered the spatial extent of available lidar collections to estimate mean AGB calculated from the 500 trees. The standard deviation in estimated AGB calculated across the 500 trees was also mapped to show pixel-level model uncertainty (Urbazaev et al 2018).

2.4. Regional model

The regional model used a stratified random sample of pixel-level estimates of AGB from the landscape model as the response variable, and climate, topographic, and Landsat image time series (LandTrendr; Kennedy et al 2010, 2015) metrics as predictor variables, to estimate AGB annually and synoptically across the study area. A post-stratification and random sample of pixels from the AGB maps derived from the landscape model were used to train the regional model. Lidar collections often cross ecological and land-use boundaries; therefore, AGB pixels were subsetted to include only the forested ecoregions as determined by the Commission for Environmental Cooperation land classification system (CEC 1997, Omernik and Griffith 2014). Also, AGB pixels identified as water or developed land (e.g. urban areas) according to the 2011 National Land Cover Database (Homer et al 2015) were excluded from the random sample. AGB pixels mapped with the landscape model were binned into 10 Mg ha⁻¹ bins and 500 pixels were randomly selected from each bin. When a bin had fewer than 500 pixels, one-half of the pixels were randomly selected. This procedure generated a sample of 117 808 pixels that was divided in half into training and testing datasets.

Landsat imagery from 1984–2016 were processed through the LandTrendr algorithm (Kennedy et al 2010, 2015) to create a Normalized Burn Ratio (NBR) image time series. Intermediate LandTrendr products, namely annual tasseled-cap brightness, greenness, and wetness values and disturbance layers, were used as predictor variables in the regional model (table 2) along with the fully derived LandTrendr outputs quantifying the magnitude and time since last disturbance. These LandTrendr products were calculated on an annual basis and provide the temporal component to this CMS, such that annual maps covering the entire study area would be created. The year of the lidar collection determined which year of the LandTrendr time series was selected to extract a given 30 m AGB pixel value to be used for model training.

AGB is strongly correlated to tree height across forest types (Lefsky et al 2002a); therefore, the Simard et al (2011) global canopy height map based on nominal 2005 GLAS height estimates and 2000 SRTM data was included as an explanatory variable. Climate and topographic metrics described in the landscape model (table 1) were also used in the regional model (table 2). The regional model was specified using the same RF model selection procedure as the landscape model, and maps of AGB means and standard deviations calculated across 500 trees were also generated in the same manner.

2.5. Bias correction

Mapped estimates of AGB from the regional model were biased compared to FIA estimates at the plot level; we therefore calibrated the maps to make them consistent with unbiased FIA estimates of AGB at both the pixel- and county-level aggregate scales. The measurement year for each FIA field plot was determined and the corresponding pixel value was extracted from the annual regional map of that same year. Annual bias correction factors were calculated as the ratio of the mean of the plot-level FIA AGB estimates to the mean of the mapped AGB estimates. Because our maps included years not present in the FIA database (2000–2003) for some states, a relationship between calendar year and annual bias correction factor was calculated using linear regression for the years when FIA measurements were available in all four states, such that the map calibration was regionally balanced. Annual values from the line of best fit were used to calibrate the corresponding map. Equivalence plots (Robinson et al 2005) were used to test whether mapped AGB estimates were statistically equivalent to AGB estimated independently by FIA.

2.6. Forest/non-forest mask

Non-forested pixels in the annual, regional AGB maps were masked using the ALOS PALSAR 2009 forest/non-forest (F/NF) mask (PALSAR JAXA 2014, Shimada et al 2014). Global PALSAR-derived F/NF masks were also available in 2007, 2008, 2010, 2015, and 2016, but we selected the 2009 mask because it was the least affected by misclassified water bodies in our study area and was near the midpoint of our 2000–2016 annual map products.

Clearly, commission and omission errors are associated with the 2009 PALSAR mask, as they would be with any mask. Had PALSAR or similar F/NF masks been available for all years (not just 2007–2010, 2015 and 2016) in the 2000–2016 AGB time series, an annual F/NF mask could have been applied to each annual AGB map. Using multiple F/NF masks introduces another source of variability and error in AGB accounting related to the datasets, which we judged exceeded the variability of interest related to changing conditions in the scene (i.e. disturbance effects). Although the National Land Cover Data (NLCD) map products (Homer et al 2015) had F/NF masks in 2001, 2004, 2006, 2008, 2011, 2013, and 2016, their spatial extent was limited to the contiguous USA. Therefore, the 2009 PALSAR was applied to give our prototype CMS the potential for broader applicability. However, we compared the influence of mask choice on county-level biomass estimates to test if local AGB differences between masks would translate to differences in scaled-up AGB aggregations.

3. Results

3.1. Landscape model

Mean canopy height was the most important predictor of AGB in the landscape model (figure 4(a)), but density metrics across all vertical canopy strata comprised the largest group of predictors. Climate and topographic metrics were less important than the lidar metrics, with mean annual precipitation and transformed aspect being the top predictors, respectively (figure 4(a)). The landscape model explained 80% (RF Pseudo-R² = 0.80) of the variability in AGB across all project areas; the RF model RMSE was 114.6 Mg ha⁻¹ (46.9 %) and Bias was 1.9 Mg ha⁻¹ (0.8 %) (figure 5(a)). Even though these project areas covered a wide range of conditions, model predictions did not asymptote with high levels of AGB, as is commonly observed with lidar-derived estimates of AGB or related forest structure attributes.

Figure 4. Refer to the following caption and surrounding text. — **Figure 4.** Selected predictors in the (a) landscape and (b) regional Random Forests (RF) models, sorted in descending order by RF variable importance scores, which is the mean decrease in mean squared error from inclusion of the variable across all model nodes. See tables 1 and 2 for variable descriptions.
Download figure:
Standard image High-resolution image

Figure 5. Refer to the following caption and surrounding text. — **Figure 5.** Hexbin plots of observed vs. predicted AGB estimates for the (a) landscape model, where AGB observations are the project-level field plots; a stratified random sample of landscape model AGB predictions was subsequently split into halves to (b) train, and (c) test the regional model. Lines of best fit are shaded gray; dashed lines show the 1:1 relationship.
Download figure:
Standard image High-resolution image

Table 2. Candidate and selected predictor variables for the Regional Model.

Predictor	Description	Selected	Data source
ftv_tcb	LandTrendr fitted Brightness	X	LandTrendr
ftv_tcg	LandTrendr fitted Greenness	X	LandTrendr
ftv_tcw	LandTrendr fitted Wetness	X	LandTrendr
ftv_nbr	LandTrendr fitted normalized burn ratio	X	LandTrendr
delta_tcb	LandTrendr fitted delta Brightness	X	LandTrendr
delta_tcg	LandTrendr fitted delta Greenness	X	LandTrendr
delta_tcw	LandTrendr fitted delta Wetness	X	LandTrendr
delta_nbr	LandTrendr fitted delta normalized burn ratio		LandTrendr
TimeSinceDisturbance	Time since last disturbance	X	LandTrendr
MagnitudeOfLastDisturbance	Magnitude of last disturbance		LandTrendr
mat	Mean annual temperature	X	Climate FVS
mmin	Minimum temperature in the coldest month		Climate FVS
mmax	Maximum temperature in the warmest month	X	Climate FVS
map	Mean annual precipitation	X	Climate FVS
gsp	Growing season precipitation		Climate FVS
sday	Julian date of the last freezing date of spring		Climate FVS
ffp	Frost free period	X	Climate FVS
Slope	Percent slope	X	SRTM
TPI90Meters	Topographic position index in a 90 m window	X	SRTM
TPI990Meters	Topographic position index in a 990 m window		SRTM
TrAsp	Transformed aspect	X	SRTM
SCosAsp	Slope X Cosine(Aspect)	X	SRTM
SSinAsp	Slope X Sine(Aspect)	X	SRTM
SimardCHM	Simard canopy height map	X	Simard et al (2011)

3.2. Regional model

We considered AGB estimates from the landscape model, available where lidar had been flown, as representative of the full range of forest conditions across the region; therefore, a stratified random sample of AGB pixels from those maps was used to train the regional AGB model. Disturbance and spectral metrics had lower variable importance scores than the climate and topographic metrics and the global canopy height product (Simard et al 2011), which ranked fifth in importance (figure 4(b)). The most important climate and topographic metrics were mean annual precipitation (top predictor overall) and slope x sine(aspect), respectively (figure 4(b)). The regional model explained almost as much variance (RF Pseudo-R² = 0.78, figure 5(b)) as the landscape model (figure 5(a)), while the precision decreased (RMSE of 161.0 Mg ha⁻¹) (27.0 %) (figure 5(b)). The regional model Bias of 2.9 Mg ha⁻¹ (0.5 %) was similar to the landscape model, and the AGB estimates also did not asymptote (figure 5(b)). Validation of the regional model AGB estimates with the test dataset from the landscape model showed very slightly improved model fit (R² = 0.80) and precision (RMSE = 151.6 (25.9%)), but the Bias increased three-fold to 9.4 Mg ha⁻¹ (1.6%) (figure 5(c)), which highlighted the need for subsequent bias correction.

3.3. Bias correction

The ratio of FIA AGB/CMS AGB was calculated for each year (e.g. figure 7(a)). We observed an unanticipated trend in these ratios, which tended to increase over time as function of map year (figure 7(b)). Thus, we constructed a simple linear model (c = 0.005466y-10.265) of the annual correction factor (c) by year (y) to calibrate the annual AGB maps (figure 6) to unbiased annual FIA estimates of AGB at the plot level. Applying the simple linear model to the annual 2000–2016 maps resulted in the closest match to FIA estimates at the plot level (figure 7(c)) and upon aggregation to the county level (figure 7(d)).

Figure 6. Refer to the following caption and surrounding text. — **Figure 6.** Predicted AGB (a) mean and (b) standard deviation in 2011 after bias correction to calibrate the map to FIA estimates.
Download figure:
Standard image High-resolution image

Figure 7. Refer to the following caption and surrounding text. — **Figure 7.** Bias correction process for 2000–2016 AGB maps: (a) Scatterplot of unbiased FIA estimates versus mapped AGB pixel estimates *before* bias correction (e.g. 2011); (b) Ratio of mean FIA AGB to mean CMS AGB for all forested FIA plots in the study region versus AGB map year; (c) Annual mean FIA AGB versus mean CMS AGB before and after bias correction; (d) Total county-level AGB in forested FIA plots versus total county-level mapped AGB in forested pixels *after* bias correction (e.g. 2011).
Download figure:
Standard image High-resolution image

It is important to clarify that the spatially discontinuous AGB maps generated with the landscape model were not calibrated with the bias correction; nevertheless, they matched independent AGB estimates from FIA very well and were statistically equivalent (figure 8(a)). Only the continuous, annual AGB maps generated with the regional model were calibrated with the bias correction, after which these AGB estimates were also statistically equivalent to independent AGB estimates from FIA (figure 8(b)).

Figure 8. Refer to the following caption and surrounding text. — **Figure 8.** Equivalence plots of aboveground biomass (AGB) independently estimated at FIA plots (e.g. 2011) versus: (a) Landscape model AGB estimates *without* bias correction; (b) Regional model AGB estimates *with* bias correction (e.g. 2011). The 2009 PALSAR forest/non-forest (F/NF) mask was applied in both cases. The confidence interval positioned at the mean is within the region of similarity described by the shaded grey polygon defining the region of similarity, meaning the intercept of the observed/predicted regression is significantly similar to 0; therefore the estimates are *unbiased*. The black, solid line of best fit is within the angle described by the dashed black lines defining the region of similarity, meaning the slope of the observed/predicted regression is significantly similar to 1; therefore the estimates are *proportional.*
Download figure:
Standard image High-resolution image

Inter-annual variation in mapped AGB predictions upon aggregation to the county level was less than in inter-annual variation in AGB estimates from FIA, calculated for the shorter, 8 year span (2006–2013) that annual county-level FIA estimates were available from EVALIDator (figure 9; Miles 2019). Moreover, annual FIA estimates from EVALIDator were available for only 3 years in Oregon (2006, 2011, 2012) and Washington (2007, 2011, 2012); therefore, AGB variation displayed in figure 9 was calculated for the same 3 years in these two states.

Figure 9. Refer to the following caption and surrounding text. — **Figure 9.** Interannual variation in aboveground biomass (AGB) estimated by FIA at the county level versus mapped AGB aggregated to the county level, after applying the 2009 PALSAR forest/non-forest mask, in (a) Washington, (b) western Montana, (c) Oregon, and (d) Idaho. AGB means (±1 standard deviation) calculated from interannual FIA and mapped AGB estimates are indicated by the vertical and horizontal lines, respectively.
Download figure:
Standard image High-resolution image

3.4. Forest/non-forest mask

The choice of F/NF mask to use influenced the AGB maps locally, which was evident at the county level (figure 10). We assessed nine F/NF masks: five other global PALSAR masks besides the 2009 mask selected, and also spectral data derived masks, one produced by the USGS Gap Analysis Project (GAP; USGS 2011) at the national scale as well as two regional FIA masks (Blackard et al 2008, Blackard 2009). A few disparities were evident at the county level, where the selected 2009 PALSAR mask differed from the mean of all nine F/NF masks by more than a standard deviation, but these localized effects of the mask did not translate into differences at the state level for any year (figure 10). By applying the F/NF mask as the last step in our CMS workflow, any errors introduced by the F/NF mask were minimized. Aggregated county-level estimates of AGB density (Mg ha⁻¹) in figures 9 and 10 appear more scattered than aggregated county-level estimates of total AGB (Tg) in figure 7(d) because in the former case smaller counties have equal weight, and in these smaller counties the aggregated county-level AGB estimates are less stable because they are based on less forested land and fewer FIA plots.

Figure 10. Refer to the following caption and surrounding text. — **Figure 10.** Aboveground biomass (AGB) estimated by FIA at the county level (e.g. 2011) versus mapped AGB aggregated to the county level (e.g. 2011), after applying the 2009 PALSAR forest/non-forest (F/NF) mask, in (a) Washington, (b) western Montana, (c) Oregon, and (d) Idaho. Also graphed for comparison are AGB means (± standard deviation) showing spatial variation in mapped AGB in 2011 based on 9 different F/NF masks.
Download figure:
Standard image High-resolution image

4. Discussion

Our overarching strategy was to develop a CMS that would maximize the utility of project datasets contributed by stakeholders and deliver to them unbiased regional maps that are temporally and spatially consistent. Given the ad hoc nature of various lidar-based forest inventory projects initiated by diverse stakeholders acting independently to achieve multiple forest management objectives, it was inevitable that projects would differ with regard to plot sampling protocols and lidar acquisition parameters. Nevertheless, AGB estimates by the landscape model (figure 5(a)) were unbiased (figure 8(a)). These estimates were hence highly suited for training the regional model (figure 5(b)), supporting our first hypothesis. Using a simple linear model to calibrate the regional maps (figure 7) could be thought of as the simplest model-assisted strategy for unbiased AGB estimation (figure 8(b)) (Gregoire 1998, Ståhl et al 2016). The bias correction improved the accuracy of the maps (by definition) but did not alter the variance structure of the data, thus leaving the model precision unchanged because the pixel-level standard deviation in AGB scales with the AGB mean calculated across the same 500 RF trees.

Our mapped AGB estimates from both the landscape and regional models were similarly unbiased and proportional compared to independent FIA AGB estimates as illustrated using equivalence plots (figure 8). This supports earlier findings that lidar- and Landsat-based AGB estimates are comparable upon aggregation to the watershed scale (Ohmann and Gregory 2002, Bell et al 2015, 2018) or larger units (e.g. counties), even if the landscape model AGB estimates largely derived from lidar are more locally accurate. As expected, AGB largely predicted from lidar metrics did not asymptote (figure 5(a)). However, spectral variables lose sensitivity at AGB densities above 300–400 Mg ha⁻¹ (Zhang et al 2014) or 450–500 Mg ha^–1 (Kennedy et al 2018a); we attribute this lack of a saturation effect in the regional model to the inclusion of climate metrics, which proved to be the most important predictors in the regional model (figure 4(b)). Thus, climate metrics helped capture without asymptote (figures 5(b) and (c)) the higher biomass forests that commonly occur west of the Cascades in our northwestern USA study region (figure 6), supporting our second hypothesis.

Our prototype CMS approach is bottom-up, beginning with the tree measurements. FVS uses expansion factors to account for different inventory plot sizes (Dixon 2002, 2018), which works well to ensure consistent stand structure attributes (e.g. AGB) across different plot sizes. Traditional forest inventories typically use variable-radius plot designs for the efficiency gained in the field (Avery and Burkhardt 2002, Husch et al 2003, Packard and Radtke 2007); however, we included only fixed-radius plots in our project reference database because variable-radius plots do not allow the lidar returns to be precisely binned within a defined plot footprint to ensure clean relationships to the tree measures in the spatial domain (also provided the plot geolocations are accurate) (Evans et al 2009). While we excluded variable-radius plots from our project database, it is worthwhile noting that there is evidence for successful integration of variable-radius plot data with lidar data (Deo et al 2016).

We also sought to limit plot-scale sources of noise in the temporal domain, by excluding plots that had been disturbed between collection dates, and restricting temporal offsets between tree and lidar measures to ±3 years, a reasonable compromise, but this criterion for our project reference database could be easily relaxed or tightened as warranted. Maintaining both the tree and lidar measurements in the project database ensures that these plot-level reference data remain relevant at a regional scale, regardless of how long ago the reference data were collected, even as forest conditions change. While vegetation conditions can and do change at any given location, the broad range of conditions in the regional population should not change, even as disturbance regimes may shift the relative proportions of forest conditions in response to changing disturbance and climate regimes (Cohen et al 2016, Dolan et al 2017, Fekety et al 2020a).

Our project reference database, and indeed FIA plot data as well, are also FVS-ready in the sense that the tree measurements can be projected to estimate tree growth and future AGB stores, and FVS also has options to consider the influence of disturbance (fire, insects, disease) (Dixon 2002, 2018) and climate (Crookston et al 2010, 2014) on future forest structure and composition (Crookston Buma and Wessman 2013, Fekety et al 2020a). Furthermore, FVS can output any number of stand structure attributes (tree density, basal area, volume, etc) besides AGB; our project reference database, like the FIA plot database, can also provide these attributes (Fekety et al 2018). Finally, we can continue to add future project datasets contributed by stakeholders to our project reference database, such that it is a 'living' database that continually expands the range of forest conditions represented with tree and lidar measurements.

While airborne lidar data are highly valued for accurate forest structure characterization, other remote sensing technologies are emerging that could be incorporated into our prototype CMS. Digital aerial photography (DAP) and Structure from Motion (SfM) techniques including methods collected from low-cost drones provide a cheaper alternative to lidar, which could fit well into this CMS, provided that the derived data products can be consistent (Strunk et al 2019). Studies have shown DAP measures of canopy height may be comparable to those derived from lidar first returns, although canopy density metrics are probably not comparable (White et al 2018, Dietmaier et al 2019, Goodbody et al 2019).

The best F/NF mask to choose for AGB accounting and MRV is likely to change as new remote sensing data products become available. We preferred the 2009 PALSAR mask not only for its global coverage but also for its independence from all other datasets used in our CMS. Global maps of forest cover change derived from PALSAR have compared well to the United Nation's Food and Agricultural Organization (FAO) global Forest Resource Assessments (FRA) in tropical rainforests of Southeast Asia (Dong et al 2012) and South America (Qin et al 2017). PALSAR can be used to detect forest cover changes due to either fragmentation in closed canopy forests (Qin et al 2017) or encroachment in open canopy forests (Wang et al 2018). Timing of dataset acquisition is another consideration for accurate acquisition of canopy structure using PALSAR. Deng et al (2014) found that acquiring data during drier seasons helped mitigate the influence of soil and surface moisture on L-band backscatter, such that it more accurately detects biomass. Fusion between PALSAR and Landsat images has clear potential for more accurate forest cover mapping (Qin et al 2016).

We must note that the estimated uncertainties in AGB by our bottom-up approach encompass but do not specifically account for the different sources of error, namely: (1) tree measurement errors, plot X, Y location errors, and allometric errors within our project reference plot database; (2) lidar X,Y,Z measurement and georegistration errors; (3) DEM elevation and climate interpolation errors; (4) landscape model errors; (5) Landsat measurement and georegistration errors; and (6) spatiotemporal and model errors associated with the GLAS and SRTM data-derived canopy height product (Simard et al 2011). These additive sources of error collectively degrade the R² and RMSE fit statistics reporting model precision throughout this paper, but especially the regional AGB estimates in figures 5(b), (c) and 8(b).

However, we do assert that calibrating the maps near the end of our workflow using unbiased FIA plot estimates effectively minimized bias, supporting our third hypothesis. The global accuracy of the maps is evident upon aggregation to the county level (figure 7(d), 9, 10). Locally, our AGB estimates vary less at the county level than the AGB estimates from FIA (figure 9). Future research could more thoroughly evaluate local accuracies in AGB mapped by our CMS compared to other methodologies.

5. Conclusion

The strength of our project reference database is that it precisely matches tree and lidar measurements at the plot scale in space and time, while acknowledging that it is a biased sample of the regional population. We proceeded to use it to train RF models for predicting AGB from lidar (and topographic and climate) metrics across project landscapes, leveraging the strength of lidar for characterizing forest structure variation at finer (plot/pixel) scales. We in turn used these AGB maps to train a regional model predicting AGB from climate, topographic, and LandTrendr metrics to generate regional AGB maps on an annual time step, thus leveraging the strength of Landsat for capturing annual forest cover dynamics over many years. The regional AGB maps also overcame the spectral saturation limitation of Landsat in high biomass forests by including climate and topographic metrics as predictors, which can capture the large productivity gradients across the northwestern USA non-asymptotically. Our CMS leveraged the strength of FIA plots as an unbiased estimator of forest conditions to calibrate our regional, annual AGB maps, therefore overcoming our initial constraint of using a biased AGB sample as the basis of our prototype CMS. Our AGB maps benefit not just the contributing stakeholders but a much broader group of users, particularly USFS regional planners, and provide critical information for carbon monitoring.

Acknowledgments

This work was primarily funded by a NASA Carbon Monitoring Systems (CMS) Program Award (NNH15AZ06I) with secondary funding for additional WA and OR lidar processing provided by the Joint Fire Science Program (JFSP) Fire and Smoke Model Evaluation Experiment (FASMEE) Project (15-S-01-01). University partners were funded through Joint Venture Agreements with the University of Minnesota (15-JV-044), Colorado State University (16-JV-061), Oregon State University (15-JV-041), and the University of Idaho (15-JV-040).

We greatly appreciate our stakeholders who contributed both field plot and lidar project datasets from several USFS National Forests [Idaho Panhandle (Jason Jerman), Nez Perce-Clearwater (Ken Marshall), Colville, Malheur (Ed Uebler), Ochoco (Brian Wing), Fremont-Winema (Timothy Bryant)], Bureau of Land Management Coos Bay District Office (George McFadden), the Confederated Tribes of the Colville Reservation (Rodney Cawston and Cody Desautel), Seattle Public Utilities (Rolf Gersonde), and the University of Idaho. Jacob Strunk designed the sampling scheme applied on Washington Department of Natural Resources lands. We thank the FIA program and in particular Andrew Lister for their data and expertise.

Data Availability Statement

The data that support the findings of this study are openly available at the following DOIs. The NASA Oak Ridge National Lab (ORNL) Distributed Active Archive Center (DAAC) hosts both the regional 2000–2016 AGB maps (Fekety and Hudak 2019; 10.3334/ORNLDAAC/1719) and the landscape AGB maps from which they were derived (Fekety and Hudak 2020; 10.3334/ORNLDAAC/1766). Both AGB map products are accompanied by maps of RF model uncertainty, also at 30 m resolution. Publicly available project reference plot data are archived on the USDA Research Data Archive (Fekety et al 2020b; www.fs.usda.gov/rds/archive/catalog/RDS-2020-0026).

Dates

Peer review information