The tundra phenology database: More than two decades of tundra phenology responses to climate change

Observations of changes in phenology have provided some of the strongest signals of the effects of climate change on terrestrial ecosystems. The International Tundra Experiment (ITEX), initiated in the early 1990s, established a common protocol to measure plant phenology in tundra study areas across the globe. Today, this valuable collection of phenology measurements depicts the responses of plants at the colder extremes of our planet to experimental and ambient changes in temperature over the past decades. The database contains 150,434 phenology observations of 278 plant species taken at 28 study areas for periods of 1 to 26 years. Here we describe the full dataset to increase the visibility and use of these data in global analyses, and to invite phenology data contributions from underrepresented tundra locations. Portions of this tundra phenology database have been used in three recent syntheses; some datasets have been expanded, and others come from entirely new study areas. All of these data are now available at the Polar Data Catalogue (https://doi.org/10.21963/13215).


Introduction
Changes in phenology are some of the most well-recorded and easily observable biotic responses to climate change (Parmesan and Yohe, 2003; Thackeray et al., 2016; Menzel et al., 2020), and phenology observations provide important information for predicting ecosystem response to future climatic change (Diez et al., 2012). While climate change has significantly altered the phenology of many organisms around the world, the magnitude of phenological responses can differ greatly among genotypes, species, sites, ecosystems, and biomes (Primack, 1980; Parmesan, 2007; Post et al., 2018; Prevéy et al., 2018). The many consequences of shifts in the timing of plant flowering and growth include altered trophic interactions (Post and Forchhammer, 2008; Høye et al., 2013) and changes in carbon sequestration and trace gas feedbacks (Pattison et al., 2015; Leffler et al., 2016).
The International Tundra Experiment (ITEX) was established in 1990 to examine effects of experimental warming in tundra ecosystems, using common experimental warming protocols and standardized measurements of treatment responses at plant, community, and ecosystem scales (Webber and Walker, 1991; Henry and Molau, 1997). Some of the first and most frequent measurements taken at ITEX sites were plant phenology observations, and the value of these coordinated observations, taken using a common protocol across sites in similar experimental conditions, has continued to grow over time (Arft et al., 1999; Prevéy et al., 2019). Phenology data from ITEX experiments have supported numerous publications, including single-site studies (Molau et al., 2005; Bjorkman et al., 2015; Hollister et al., 2015; Panchen and Gorelick, 2015; Semenchuk et al., 2016), comparisons of single taxa across sites (Alatalo and Totland, 1997; Jones et al., 1997; Lévesque et al., 1997; Stenström et al., 1997; Welker et al., 1997), and analyses of phenology data from multiple species and sites (Arft et al., 1999; Oberbauer et al., 2013; Prevéy et al., 2017, 2019; Assmann et al., 2019). Observations from this dataset revealed that phenology of plants at colder Arctic sites is more sensitive to changes in temperature than phenology of plants from warmer Arctic sites (Arft et al., 1999; Prevéy et al., 2017); that late-flowering species flower earlier with warmer temperatures than early-flowering species, potentially leading to shorter flowering seasons with predicted warmer summers in the future (Prevéy et al., 2019); and that snowmelt and temperature are important drivers of plant phenology along coastal tundra sites (Assmann et al., 2019).
Although the ITEX phenology data have been used in several syntheses within the tundra biome, data from tundra sites are underrepresented in regional and global plant phenology syntheses (Parmesan and Yohe, 2003; Menzel et al., 2006; Cleland et al., 2007; Cook et al., 2012). Thus, one goal of publishing this database is to increase the visibility and accessibility of these data for use in global analyses. In addition, the phenology dataset described here is the most comprehensive collection of tundra phenology observations to date: it contains over 100,000 more phenology observations than previously published datasets, and covers more phenophases, sites, and years of data. In this data paper, we describe the structure and content of the tundra phenology database and establish a publicly available DOI with the Polar Data Catalogue (https://doi.org/10.21963/13215) where updates to the database can be added to aid in future syntheses.

Study area information and experimental setup
The tundra phenology database currently contains observations from 28 study areas in tundra ecosystems (i.e. ecosystems above latitudinal or elevational tree lines; Fig. 1, Table 1, Table S1). 'Study areas' indicate general regions ranging in size from several hundred square meters to tens of kilometers. 'Subsites' are smaller regions within larger study areas, either located in different habitat types or created as blocks of plots within study areas, and 'plots' are the smallest study area units, located within subsites and study areas, and range in size based on the plant species of interest and landscape characteristics (Table S1). Study areas with warming experiments have clear plastic or fiberglass open-top chambers (OTCs) that were designed to artificially increase air temperature within the chambers by an average of 0.5-3°C (Webber and Walker, 1991; Marion et al., 1997; Arft et al., 1999; Bokhorst et al., 2013; Prevéy et al., 2019; Table S1). Variation in the amount of warming experienced in OTCs likely results from variation in habitat types and ambient climate conditions, building materials used, and differences in the height and diameter of OTCs at different study areas. The OTCs were constructed from clear fiberglass or polycarbonate materials, and have a footprint of ca. 1-2 m² (Marion et al., 1997).
The OTCs were placed on plots either during the summer and removed in the winter, or left on plots throughout the year, depending on the study area (Table S1). All warmed plots had associated control plots, and some study areas continued monitoring the control plots beyond the time period during which warming treatments were applied. More details on study area and experimental characteristics for many of the study areas can be found at the ITEX Wikipedia page: https://en.wikipedia.org/wiki/International_Tundra_Experiment.
Eleven of the 28 study areas in the database were originally established as part of the International Tundra Experiment (ITEX) network (Webber and Walker, 1991; Henry and Molau, 1997). Oberbauer et al. (2013) added one additional study area and years of data from 1992 through 2009 for use in a phenology synthesis paper. Most recently, phenology data from 11 additional study areas and years through 2015 were collected for three cross-site syntheses (Prevéy et al., 2017, 2019; Assmann et al., 2019). The updated phenology dataset described here includes five additional study areas and years of data through 2019. A current synthesis (Collins et al., in review) is using some of these data to examine the variation in plant responses to warming across multiple phenophases, over time, and with inter-annual climate.

Phenology data collection protocols
Phenology measurements collected at all original ITEX study areas were taken using a common protocol outlined in the ITEX manual (Molau and Mølgaard, 1996). The standardized protocol involves checking the phenological status of plant species within study areas or plots one to three times per week over the snow-free season. Scientific names for plant species were standardized across all study areas using The Plant List (2013, v 1.1) via the package Taxonstand in the statistical program R (R Core Team, 2020). The date that a phenological event, or phenophase, is observed is recorded as the day of year (DOY) and retained in this database. The five phenophases that were recorded most frequently across study areas, and are included in the database, are: green-up of leaves (green), first flowering date (flower), last flowering date (flowerend), seed maturation (seedmat), and leaf senescence (senesce; Arft et al., 1999).
Phenophases were defined differently depending on plant species (Molau and Mølgaard, 1996), but were recorded consistently over time for each species at each study area (Table S2).
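As a minimal sketch of how a calendar date maps onto the day-of-year (DOY) values stored in the database, the following Python snippet converts a date to DOY and attaches one of the five phenophase codes listed above. The record layout, the function names, and the species used in the example are illustrative assumptions, not the database's actual schema (see Table 2 for the real variable descriptions).

```python
from datetime import date

# Phenophase codes as listed in the text above.
PHENOPHASES = ("green", "flower", "flowerend", "seedmat", "senesce")


def day_of_year(d: date) -> int:
    """Return the day of year (1 = 1 January) for a calendar date."""
    return d.timetuple().tm_yday


def make_record(species: str, phenophase: str, observed: date) -> dict:
    """Build one observation record (hypothetical layout, for illustration)."""
    if phenophase not in PHENOPHASES:
        raise ValueError(f"unknown phenophase: {phenophase}")
    return {
        "species": species,
        "phenophase": phenophase,
        "year": observed.year,
        "doy": day_of_year(observed),
    }


# Illustrative species name and date; not taken from the database.
record = make_record("Dryas integrifolia", "flower", date(2015, 6, 20))
```

Because DOY depends on leap years, the same calendar date can differ by one day between years, which may matter when comparing dates across long records.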
For 21 of the study areas in the tundra phenology database, the phenophase observations reflect the first observed phenological event per species, plot, study area, and year (Table S2). At these sites, 'flower' was defined as the date when either the first flower was open, the first pollen was visible, or the first anthers were exposed, and 'flowerend' was defined as the date when the first anthers withered or the first petals dropped. Seven study areas recorded phenological events differently, as noted below. At the Lapland subsite at Latnjajaure there were no distinct plots, so observations for this subsite reflect the first observed phenological events per species. At Baker Lake and Tanquary Fiord, the phenological observations reflect the mean date of phenophases across 20-30 monitored plants at each site. The phenological observations for Narsarsuaq, Zackenberg, and Nuuk per plot and year reflect the dates of 50% flowering or senescence rather than the first observed open or senesced flower. In all cases, the manner of data collection and aggregation is consistent over time within each study area and noted in Table S2.
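The difference between the "first observed event" and "mean date" conventions described above can be sketched as follows. This is only an illustration under stated assumptions: the tuple layout, the grouping by plot, and the example values are hypothetical, and the 50% flowering convention is not covered because it requires counts of open flowers per visit.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-plant observations: (species, plot, year, doy).
observations = [
    ("Salix arctica", "P1", 2010, 165),
    ("Salix arctica", "P1", 2010, 170),
    ("Salix arctica", "P1", 2010, 172),
    ("Salix arctica", "P2", 2010, 168),
    ("Salix arctica", "P2", 2010, 174),
]


def aggregate(rows, how):
    """Reduce per-plant DOYs to one value per (species, plot, year).

    how="first" keeps the earliest DOY (first observed event);
    how="mean" averages DOYs across the monitored plants.
    """
    groups = defaultdict(list)
    for species, plot, year, doy in rows:
        groups[(species, plot, year)].append(doy)
    reducer = {"first": min, "mean": mean}[how]
    return {key: reducer(doys) for key, doys in groups.items()}


first_dates = aggregate(observations, "first")
mean_dates = aggregate(observations, "mean")
```

In this toy example the mean date is several days later than the first observed date for the same plot, which is why the aggregation method recorded in Table S2 should be considered before pooling study areas.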
All phenology observations at all study areas were graphed and visually inspected to ensure that dates were within logical ranges. For example, phenophase observations from November through March in the Northern Hemisphere were double-checked with data owners, as these would have occurred outside the short growing season in tundra ecosystems. Additionally, any phenological observation outliers that were greater than three standard deviations away from the mean day of year per site, species, and phenophase were double-checked with data contributors and removed if they were determined to be errors. However, we cannot ensure that the database is entirely free of errors (e.g. observations being improperly recorded on datasheets), and we reserve the right to make corrections to the database as necessary.
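The three-standard-deviation screen described above could be sketched roughly as follows. This is an assumption-laden illustration rather than the procedure actually used: the tuple layout, the use of the sample standard deviation, and the minimum group size are all choices made here for the example. The function only flags candidate outliers; as in the text, a flagged value would still need to be checked with the data contributor before removal.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_outliers(rows, n_sd=3.0, min_group_size=3):
    """Return rows whose DOY lies more than n_sd standard deviations
    from the mean DOY of their (site, species, phenophase) group.

    rows: list of (site, species, phenophase, doy) tuples (hypothetical layout).
    """
    groups = defaultdict(list)
    for site, species, phenophase, doy in rows:
        groups[(site, species, phenophase)].append(doy)

    flagged = []
    for row in rows:
        doys = groups[row[:3]]
        if len(doys) < min_group_size:
            continue  # too few observations to estimate spread
        mu, sd = mean(doys), stdev(doys)
        if sd > 0 and abs(row[3] - mu) > n_sd * sd:
            flagged.append(row)
    return flagged


# Toy data: 20 plausible flowering dates plus one implausibly late value.
rows = [("SiteA", "sp1", "flower", d) for d in [168, 169, 170, 171, 172] * 4]
rows.append(("SiteA", "sp1", "flower", 320))
suspects = flag_outliers(rows)
```

Note that with small groups a single extreme value inflates the standard deviation enough that it can never exceed three standard deviations, so visual inspection remains a useful complement to this kind of screen.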

Dataset availability and usage guidelines
The tundra phenology database is available at the Polar Data Catalogue (www.polardata.ca): https://doi.org/10.21963/13215. Since the phenology data collection at some study areas is ongoing, and we encourage the inclusion of data from new tundra study areas, the phenology database will occasionally be updated with new years of data, or more details on study area characteristics, and each update will be released with a new version number and made available at the DOI above. We are enthusiastic to welcome new phenology observations to the database, especially from underrepresented tundra regions (Fig. 1). Principal investigators wishing to join the ITEX experimental network and contribute to the tundra phenology database should contact the corresponding author of this data paper, or visit the ITEX webpage: https://www.gvsu.edu/itex/ for more information.
The dataset is fully available to the public and should be appropriately referenced by citing this data paper if used in published analyses. A large amount of time, effort, and funding has gone into conducting these frequent phenology observations in remote tundra locations. Thus, collaborating with the relevant data contributors helps recognize the huge effort of the study area PIs and data collectors and facilitates site-specific interpretations of cross-site data analyses. Full recognition for data use allows investigators to secure funding for the continued collection of data at remote tundra study areas. We therefore kindly request that data users contact and invite the data contributors of relevant observations in the database as coauthors should the dataset form a key contribution to the scientific analysis conducted in any resulting publications. The names and emails of data contributors are provided in the 'data_provider' column of the dataset.

Results / Dataset Description
Phenology observations were collected from a total of 28 study areas in Arctic and alpine tundra ecosystems on a total of 278 plant species (Fig. 1, Table 1). Seventeen study areas include observations from both control and experimentally warmed plots, and eleven study areas include observations from only control plots (Fig. 1). There was a median of 10 and a mean of 11 years of data collected per study area, on a mean of 15 species per study area (Table 1). [...] observations of seed maturation (seedmat; Fig. 3). Phenological events that happen earlier in the summer (green-up, first flowering dates) are almost twice as numerous in the dataset as late-season events (leaf senescence, seed maturation), possibly because of herbivory, early snowfall, or the difficulty of staffing seasonal personnel through late August and September in remote tundra locations, when later phenological events may be occurring.
Twenty-six percent of the observations in the database were first flowering observations, with all 28 study areas recording this event over a mean of 10.8 years (Figs. 3, 4). There was a large range in flowering dates between species, study areas, plots, and years (Fig. 4), with an average range of 54 days among flowering dates within a year at study areas that recorded flowering of six species or more (Fig. 4). The structure of the database and variable descriptions are provided in Table 2.
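One plausible way to compute a within-year range of flowering dates like the one summarized above is sketched below. The exact calculation behind the 54-day figure is not specified in the text, so the grouping by study area and year, the species threshold applied per study area, and the tuple layout are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_within_year_range(rows, min_species=6):
    """Average, over (study area, year) pairs, of max DOY minus min DOY.

    rows: list of (study_area, species, year, doy) tuples for the 'flower'
    phenophase (hypothetical layout). Study areas with fewer than
    min_species recorded species are skipped.
    """
    species_by_area = defaultdict(set)
    doys_by_area_year = defaultdict(list)
    for area, species, year, doy in rows:
        species_by_area[area].add(species)
        doys_by_area_year[(area, year)].append(doy)

    ranges = [
        max(doys) - min(doys)
        for (area, _year), doys in doys_by_area_year.items()
        if len(species_by_area[area]) >= min_species
    ]
    return mean(ranges) if ranges else None


# Toy data: area "A" has six species in two years; area "B" has only two.
rows = []
for i, doy in enumerate([150, 160, 170, 180, 190, 200]):
    rows.append(("A", f"sp{i}", 2010, doy))
for i, doy in enumerate([160, 170, 180, 190, 200, 220]):
    rows.append(("A", f"sp{i}", 2011, doy))
rows += [("B", "sp0", 2010, 140), ("B", "sp1", 2010, 230)]

average_range = mean_within_year_range(rows)
```

In the toy data only study area "A" meets the six-species threshold, so the result averages its two yearly ranges and ignores area "B".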

Conclusion
To our knowledge, this database represents the largest collection of repeated phenology observations of plant species from across the tundra biome. This large collection of data has the potential to be used in future syntheses of vegetation response to climate change, both globally and locally. These data could be used to inform and refine climate-vegetation models, and, among many other research directions, help predict phenology in tundra regions where phenological mismatch between vegetation, herbivores and/or pollinators could occur as the climate changes.

Table 1. Tundra study areas with phenology observations included in the database. '# Species' is the total number of species with phenological observations at each study area. 'Phenophase' represents 'flower' for first flowering dates, 'flowerend' for last flowering dates, 'green' for dates of first green-up of leaves, 'seedmat' for dates of seed maturation or seed release, and 'senesce' for dates of first observed leaf coloring in fall or leaf senescence. 'Years' lists the years of data present for each data type. 'Treatment' is either 'CTL' for control plots only or 'CTL/OTC' for data from control and warming (open-top chamber) plots.

Fig. 1. General locations of tundra study areas with plant phenology observations in the database. The size of the symbols indicates the number of years of data from study areas with either only control plots (blue circles) or both control and experimentally-warmed plots (red circles). The map was created with the 'ggplot2' package (Wickham, 2016) in the statistical program R (R Core Team, 2020) using a base map from Natural Earth, and location data from the tundra phenology database (https://doi.org/10.21963/13215).

[...] alpine tundra study areas that recorded flowering, jittered horizontally and colored by study area latitude. The DOYs for the southern hemisphere were shifted by 6 months to match those from the northern hemisphere sites. (B) The number of flowering observations recorded by latitude, colored by study area latitude, and (C) locations of study areas with flowering observations, colored by study area latitude. The map was created with the 'ggplot2' package (Wickham, 2016) in the statistical program R (R Core Team, 2020) using a base map from Natural Earth, and location data from the tundra phenology database (https://doi.org/10.21963/13215).