A programmatic interface to the PhenoCam web services. Allows for easy downloads of PhenoCam near-surface remote sensing greenness (Gcc) time series directly to your R workspace or your computer. Post-processing allows for the smoothing of the time-series and the calculation of phenological transition dates as a final product.
The package gives access to the latest generated PhenoCam time series (at most 1-day old for running sites) and allows for the extraction of up-to-date phenological transition dates. However, the data acquired through the package will not be curated and vetted for data quality.
For a quality-controlled and fully described dataset I suggest downloading the dataset described by Richardson et al. (2018). This dataset uses the phenocamr package in its final processing steps; however, quality control is guaranteed through careful review of the data. The data can be interactively explored on explore.phenocam.us and downloaded in full from the ORNL DAAC. If you need more recent data you can use the package and its functionality, but be mindful of quality control, especially the region-of-interest (ROI) used and potential unaccounted field-of-view (FOV) shifts in the dataset.
Below I describe the most common use of the package: downloading recent PhenoCam time series and generating phenological transition dates for a given site and data type. I intentionally disable most automated processing and step through some of the routines to illustrate the workflow that normally happens internally in the main function download_phenocam(). Generated transition date files can be used in later analysis or modelling exercises using, for example, the phenor R package.
A full list of meta-data for all sites can be queried using the list_sites() function.
sites <- list_sites()
print(head(sites))
#> wwf_biome nimage
#> 1 4 41592
#> 2 12 34695
#> 3 12 2639
#> 4 5 43429
#> 5 4 1452
#> 6 5 44629
#> site_acknowledgements
#> 1 Camera images from Acadia National Park are provided courtesy of the National Park Service Air Resources Program.
#> 2 Camera images from Agua Tibia Wilderness are provided courtesy of the USDA Forest Service Air Resources Management Program.
#> 3 Camera images from Agua Tibia Wilderness are provided courtesy of the USDA Forest Service Air Resources Management Program.
#> 4 Camera images from Yosemite National Park are provided courtesy of the National Park Service Air Resources Program.
#> 5
#> 6 Research at the Alligator River flux site is supported by DOE NICCR (award 08-SC-NICCR-1072), DOE-TES (awards 11-DE-SC-0006700 and 7090112), USDA Forest Service (award 13-JV-11330110-081) and USDA-NIFA (award 2014-67003-22068).
#> MAT_site date_end landcover_igbp koeppen_geiger site
#> 1 NA 2018-08-21 5 Dfb acadia
#> 2 NA 2018-08-21 7 Csa aguatibiaeast
#> 3 NA 2006-10-25 7 Csa aguatibianorth
#> 4 NA 2018-08-21 8 Csb ahwahnee
#> 5 NA 2015-10-13 13 Cfa alleypond
#> 6 16.6 2018-08-21 5 Cfa alligatorriver
#> infrared active MAT_daymet site_type lat ecoregion
#> 1 N TRUE 7.05 III 44.37694 8
#> 2 N TRUE 15.75 III 33.62200 11
#> 3 N FALSE 16.00 III 33.60222 11
#> 4 N TRUE 12.25 III 37.74670 6
#> 5 N FALSE 11.90 II 40.74284 8
#> 6 Y TRUE 16.75 I 35.78790 8
#> camera_description flux_sitenames tzoffset group
#> 1 unknown -5 National Park Service
#> 2 unknown -8 USFS
#> 3 unknown -8 USFS
#> 4 unknown -8 National Park Service
#> 5 StarDot NetCam SC -5 SmartForests
#> 6 StarDot NetCam SC US-NC4 -5 PhenoCam
#> contact1
#> 1 Dee Morse <dee_morse AT nps DOT gov>
#> 2 Ann E Mebane <amebane AT fs DOT fed DOT us>
#> 3
#> 4 Dee Morse <dee_morse AT nps DOT gov>
#> 5 Mary Martin <mary DOT martin AT unh DOT edu>
#> 6 Asko Noormets <noormets AT tamu DOT edu>
#> contact2 flux_data MAP_site
#> 1 John Gross <John_Gross AT nps DOT gov> FALSE NA
#> 2 Kristi Savig <KSavig AT air-resource DOT com> FALSE NA
#> 3 FALSE NA
#> 4 John Gross <John_Gross AT nps DOT gov> FALSE NA
#> 5 Nicholas Grant <ngrant02 AT fs DOT fed DOT us> FALSE NA
#> 6 John King <john_king AT ncsu DOT edu> TRUE 1310
#> camera_orientation secondary_veg_type date_start lon
#> 1 NE EN 2007-03-15 -68.26083
#> 2 SW 2007-08-16 -116.86700
#> 3 NE 2003-10-01 -117.34368
#> 4 E GR 2008-08-28 -119.58160
#> 5 S 2014-11-05 -73.74304
#> 6 N WL 2012-05-03 -75.90380
#> dominant_species
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6 Nyssa sylvatica, Taxodium distichum, Nyssa aquatica, Acer rubrum
#> site_description elev
#> 1 Acadia National Park, McFarland Hill, near Bar Harbor, Maine 158
#> 2 Agua Tibia Wilderness, California 1086
#> 3 Agua Tibia Wilderness, California 1090
#> 4 Ahwahnee Meadow, Yosemite National Park, California 1199
#> 5 Alley Pond, Queens, New York 61
#> 6 Alligator River National Wildlife Refuge, North Carolina 1
#> MAT_worldclim MAP_worldclim MAP_daymet flux_networks method
#> 1 6.5 1303 1439 httppull
#> 2 14.9 504 483 httppull
#> 3 13.8 729 489 httppull
#> 4 11.8 886 871 httppull
#> 5 11.7 1109 1263 ftppush
#> 6 16.4 1312 1371 AMERIFLUX ftppush
#> primary_veg_type
#> 1 DB
#> 2 SH
#> 3 SH
#> 4 EN
#> 5 DB
#> 6 DB
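As a rough sketch of how this table might be used, the snippet below filters for active deciduous broadleaf (DB) sites. The `sites_demo` data frame is a hypothetical stand-in built from a few columns of the output above, so the example runs without network access. With real data the same `subset()` call would presumably be applied to the result of list_sites().

```r
# Hypothetical stand-in for the list_sites() result, using values
# copied from the printed output above (only four columns kept).
sites_demo <- data.frame(
  site = c("acadia", "aguatibiaeast", "aguatibianorth",
           "ahwahnee", "alleypond", "alligatorriver"),
  primary_veg_type = c("DB", "SH", "SH", "EN", "DB", "DB"),
  active = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE),
  lat = c(44.37694, 33.62200, 33.60222, 37.74670, 40.74284, 35.78790),
  stringsAsFactors = FALSE
)

# Keep sites that are still running and whose primary vegetation
# type is deciduous broadleaf (DB)
active_db <- subset(sites_demo, active & primary_veg_type == "DB")
print(active_db$site)
```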
To select a site, first download an overview meta-data table of all available sites together with their ROI ids, vegetation types and a limited set of meta-data parameters.
rois <- list_rois()
print(head(rois))
#> description
#> 1 Deciduous trees in foreground center
#> 2 Mixed forest in foreground. Start new timeseries due to camera/FOV change.
#> 3 General canopy
#> 4 Grassy field in foreground
#> 5 Grassy field in foreground
#> 6 Grassy field in foreground
#> first_date site lat site_years missing_data_pct veg_type
#> 1 2007-03-15 acadia 44.37694 9.8 7 DB
#> 2 2017-10-11 acadia 44.37694 0.8 9 DB
#> 3 2003-10-02 aguatibianorth 33.60222 2.5 19 XX
#> 4 2008-08-29 ahwahnee 37.74670 2.8 9 GR
#> 5 2012-03-29 ahwahnee 37.74670 3.3 0 GR
#> 6 2015-07-28 ahwahnee 37.74670 3.0 1 GR
#> lon last_date roi_id_number data_release
#> 1 -68.26083 2017-09-20 1000 pre
#> 2 -68.26083 2018-08-21 2000 pre
#> 3 -117.34368 2006-10-26 1 pre
#> 4 -119.58160 2011-10-14 1 v1
#> 5 -119.58160 2015-07-01 2 v1
#> 6 -119.58160 2018-08-21 3 pre
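A site can have several ROIs, for example when a camera or FOV change starts a new time series (see the description column above). The sketch below shows one way to list the ROIs of a single site and pick the one with the most recent data. The `rois_demo` data frame is a hypothetical stand-in with values copied from the output above. Choosing the latest ROI is only a reasonable first guess, not a rule of the package.

```r
# Hypothetical stand-in for the list_rois() result, using values
# copied from the printed output above.
rois_demo <- data.frame(
  site = c("acadia", "acadia", "aguatibianorth",
           "ahwahnee", "ahwahnee", "ahwahnee"),
  veg_type = c("DB", "DB", "XX", "GR", "GR", "GR"),
  roi_id_number = c(1000, 2000, 1, 1, 2, 3),
  first_date = as.Date(c("2007-03-15", "2017-10-11", "2003-10-02",
                         "2008-08-29", "2012-03-29", "2015-07-28")),
  last_date = as.Date(c("2017-09-20", "2018-08-21", "2006-10-26",
                        "2011-10-14", "2015-07-01", "2018-08-21")),
  stringsAsFactors = FALSE
)

# All ROIs for one site, ordered by their start date
site_rois <- subset(rois_demo, site == "ahwahnee")
site_rois <- site_rois[order(site_rois$first_date), ]

# The ROI whose record ends most recently is a reasonable first
# candidate when up-to-date data are needed
latest <- site_rois[which.max(site_rois$last_date), ]
print(latest[, c("site", "veg_type", "roi_id_number")])
```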
The code below shows how to download a PhenoCam time series for the “harvard” site, ROI (roi_id) 1000 and a time step frequency of 3 days. In this case the default outlier detection and smoothing routines have been disabled and will be run separately in subsequent steps. In normal use these are enabled by default, so outlier detection and smoothing are performed automatically. The default output directory is tempdir() but any directory can be specified for data management purposes. If so desired, phenology dates can be estimated in the same pass; in that case the new data will be written to the same directory as specified for downloading the time series data.
download_phenocam(site = "harvard$",
veg_type = "DB",
roi_id = "1000",
frequency = 3,
outlier_detection = FALSE,
smooth = FALSE,
out_dir = tempdir())
#> Downloading: harvard_DB_1000_3day.csv
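The trailing `$` in "harvard$" suggests that the site argument is treated as a regular expression. This is an assumption based on the call above, not something stated in this text. Under that assumption, base R's `grepl()` illustrates why the anchor matters. The names in `candidate_sites` are illustrative only.

```r
# Illustrative site names (hypothetical list, for demonstration only)
candidate_sites <- c("harvard", "harvardbarn", "harvardhemlock")

# Without the end-of-string anchor, every name containing "harvard" matches
unanchored <- candidate_sites[grepl("harvard", candidate_sites)]

# With the anchor, only the name that ends in "harvard" matches
anchored <- candidate_sites[grepl("harvard$", candidate_sites)]

print(unanchored)
print(anchored)
```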
After downloading we read in the data from disk. The data has a header and is comma separated.
df <- read_phenocam(file.path(tempdir(),"harvard_DB_1000_3day.csv"))
print(str(df))
#> List of 10
#> $ site : chr "harvard"
#> $ veg_type : chr "DB"
#> $ roi_id : chr "1000"
#> $ frequency : chr "3day"
#> $ lat : num 42.5
#> $ lon : num -72.2
#> $ elev : num 340
#> $ solar_elev_min: num 10
#> $ header : Named chr [1:24] NA NA NA "harvard" ...
#> ..- attr(*, "names")= chr [1:24] "#" "# 3-day summary product time series for harvard" "#" "# Site" ...
#> $ data :'data.frame': 3972 obs. of 32 variables:
#> ..$ date : chr [1:3972] "2008-01-05" "2008-01-06" "2008-01-07" "2008-01-08" ...
#> ..$ year : int [1:3972] 2008 2008 2008 2008 2008 2008 2008 2008 2008 2008 ...
#> ..$ doy : int [1:3972] 5 6 7 8 9 10 11 12 13 14 ...
#> ..$ image_count : int [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ midday_filename : chr [1:3972] NA NA NA NA ...
#> ..$ midday_r : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ midday_g : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ midday_b : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ midday_gcc : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ midday_rcc : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ r_mean : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ r_std : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ g_mean : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ g_std : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ b_mean : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ b_std : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ gcc_mean : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ gcc_std : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ gcc_50 : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ gcc_75 : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ gcc_90 : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ rcc_mean : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ rcc_std : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ rcc_50 : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ rcc_75 : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ rcc_90 : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ max_solar_elev : num [1:3972] NA NA NA NA NA NA NA NA NA NA ...
#> ..$ snow_flag : logi [1:3972] NA NA NA NA NA NA ...
#> ..$ outlierflag_gcc_mean: logi [1:3972] NA NA NA NA NA NA ...
#> ..$ outlierflag_gcc_50 : logi [1:3972] NA NA NA NA NA NA ...
#> ..$ outlierflag_gcc_75 : logi [1:3972] NA NA NA NA NA NA ...
#> ..$ outlierflag_gcc_90 : logi [1:3972] NA NA NA NA NA NA ...
#> - attr(*, "class")= chr "phenocamr"
#> NULL
The downloaded time series has a 3-day resolution. However, to correctly evaluate the phenology on a daily time step, the time series needs to be expanded to a one-day time step. This can be achieved using the expand_phenocam() function.
df <- expand_phenocam(df)
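To make the idea concrete, the sketch below expands a toy 3-day series to a daily grid with base R by merging it onto a complete date sequence, leaving the in-between days as NA. It illustrates the concept only and is not the implementation of expand_phenocam(). The `toy` data frame and its values are made up.

```r
# Toy 3-day series (made-up values)
toy <- data.frame(
  date = as.Date("2018-05-01") + c(0, 3, 6, 9),
  gcc_90 = c(0.36, 0.38, 0.41, 0.43)
)

# Complete daily sequence spanning the same period
daily <- data.frame(date = seq(min(toy$date), max(toy$date), by = "day"))

# Left join: days without an observation get NA
expanded <- merge(daily, toy, by = "date", all.x = TRUE)
print(expanded)
```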
After reading in the data you can apply the outlier detection routine. This routine uses an iterative method to detect outlier values in the Gcc time series and filters out most spurious values due to contamination by snow, mist, rain or otherwise very bright events. Warnings are suppressed as the routine is iterative and might throw warnings if it does not converge on a solution. This has no implications for the routine and data returned.
df <- detect_outliers(df)
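For readers unfamiliar with iterative outlier screening, the sketch below shows one generic variant on synthetic data: fit a loess curve, flag points whose residual exceeds a multiple of the median absolute deviation, refit without them, and repeat until nothing new is flagged. This is a swapped-in illustration and not the algorithm used by detect_outliers(). The span, the cutoff of 4 MAD and the injected spikes are all arbitrary assumptions.

```r
set.seed(42)

# Synthetic seasonal greenness curve with noise (made-up data)
d <- data.frame(doy = 1:365)
d$gcc <- 0.36 + 0.08 * exp(-((d$doy - 200) / 60)^2) + rnorm(365, sd = 0.003)

# Inject a few bright artefacts (made-up positions and magnitude)
spikes <- c(30, 120, 250)
d$gcc[spikes] <- d$gcc[spikes] + 0.05

flag <- rep(FALSE, nrow(d))
for (i in 1:10) {
  # Fit only to the points not yet flagged
  fit <- loess(gcc ~ doy, data = d[!flag, ], span = 0.3)
  res <- d$gcc - predict(fit, newdata = d)

  # Flag large residuals relative to the spread of the retained points
  new_flag <- abs(res) > 4 * mad(res[!flag], na.rm = TRUE)
  new_flag[is.na(new_flag)] <- FALSE

  # Stop once an iteration flags nothing new
  if (!any(new_flag & !flag)) break
  flag <- flag | new_flag
}
print(which(flag))
```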
After detecting outliers you can smooth the data. This function uses an AIC-based methodology to find the optimal loess smoothing window. Warnings are suppressed as the routine uses an optimization in which certain parameter settings return warnings. This has no implications for the routine and data returned.
df <- smooth_ts(df)
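The general idea of choosing a loess window by an information criterion can be sketched in base R as follows. This is a minimal illustration on synthetic data, assuming a corrected AIC computed from the residual variance and the trace of the smoother matrix. It is not the code of smooth_ts(), and the candidate spans are arbitrary.

```r
set.seed(1)

# Synthetic seasonal greenness curve with noise (made-up data)
doy <- 1:365
gcc <- 0.36 + 0.08 * exp(-((doy - 200) / 60)^2) + rnorm(365, sd = 0.005)

# Corrected AIC for a loess fit, based on the residual variance and
# the trace of the smoother matrix
loess_aicc <- function(fit) {
  n <- fit$n
  tr <- fit$trace.hat
  sigma2 <- sum(fit$residuals^2) / n
  log(sigma2) + 1 + 2 * (tr + 1) / (n - tr - 2)
}

# Candidate smoothing windows, expressed as a fraction of the series
spans <- seq(0.1, 0.9, by = 0.1)
scores <- sapply(spans, function(s) loess_aicc(loess(gcc ~ doy, span = s)))

# Refit with the span that minimises the criterion
best_span <- spans[which.min(scores)]
smoothed <- predict(loess(gcc ~ doy, span = best_span))
print(best_span)
```

Small spans follow the noise and are penalised through the trace term, while large spans flatten the seasonal peak and are penalised through the residual variance.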
Finally, if smoothed data is available you can calculate phenological transition dates. This routine uses an approach based on PELT changepoint detection to find meaningful seasonal cycles in the data. By default start of growing season dates are returned. If the reverse parameter is set to TRUE the end of growing season dates are returned. Dates are formatted as unix time (the values shown below are consistent with days since 1970-01-01) and are provided for three default threshold values (10 / 25 / 50%) of the Gcc amplitude.
start_of_season <- transition_dates(df)
print(head(start_of_season))
#> transition_10 transition_25 transition_50 transition_10_lower_ci
#> 1 14002 14007 14014 14001
#> 2 14361 14367 14374 14357
#> 3 14720 14724 14730 14701
#> 4 15096 15101 15107 15092
#> 5 15451 15461 15467 15444
#> 6 15825 15829 15835 15824
#> transition_25_lower_ci transition_50_lower_ci transition_10_upper_ci
#> 1 14006 14012 14005
#> 2 14365 14372 14364
#> 3 14723 14729 14723
#> 4 15100 15105 15099
#> 5 15460 15465 15457
#> 6 15828 15833 15827
#> transition_25_upper_ci transition_50_upper_ci threshold_10 threshold_25
#> 1 14009 14015 0.37666 0.39155
#> 2 14369 14375 0.37693 0.39121
#> 3 14726 14731 0.37737 0.38842
#> 4 15103 15108 0.37998 0.39287
#> 5 15463 15468 0.38307 0.39777
#> 6 15831 15836 0.38060 0.39620
#> threshold_50 min_gcc max_gcc
#> 1 0.42041 0.36936 0.46562
#> 2 0.41848 0.36828 0.46187
#> 3 0.41848 0.36912 0.46467
#> 4 0.42230 0.37073 0.46878
#> 5 0.42725 0.37275 0.47838
#> 6 0.42533 0.37274 0.47138
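The magnitude of the transition values above (roughly 14000 to 15800) is consistent with days since 1970-01-01 rather than seconds. Under that assumption they can be converted to calendar dates with base R, as sketched here using three transition_50 values copied from the output.

```r
# Numeric transition dates copied from the first rows of the output above
transition_50 <- c(14014, 14374, 14730)

# Interpret the values as days since 1970-01-01
dates <- as.Date(transition_50, origin = "1970-01-01")
print(dates)

# Day of year is often more convenient for comparing seasons
doy <- as.integer(format(dates, "%j"))
print(doy)
```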
Alternatively you can use the phenophases() function, which is a wrapper around the transition_dates() function. However, as it potentially writes data to disk it needs additional information such as the roi_id, site name etc. The phenophases() function is the function which generated the final data products in the Richardson et al. (2018) paper. If used internally the output will be formatted in unix time; when written to file the dates will be human readable in YYYY-MM-DD format. Both start and end of season estimates will be provided.
phenology_dates <- phenophases(df, internal = TRUE)
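To illustrate what rising and falling amplitude thresholds mean, the sketch below finds the 50% crossings on a synthetic smoothed curve for a single season. It only shows the threshold logic. It does not reproduce the changepoint-based season detection or the confidence intervals of the package, and the helper `amplitude_threshold()` is hypothetical.

```r
# Synthetic smoothed seasonal curve for one year (made-up data)
dates <- seq(as.Date("2018-01-01"), as.Date("2018-12-31"), by = "day")
doy <- as.integer(format(dates, "%j"))
gcc <- 0.36 + 0.10 * exp(-((doy - 200) / 50)^2)

# Greenness value at a given fraction of the seasonal amplitude
# (hypothetical helper, not part of the package)
amplitude_threshold <- function(x, fraction) {
  min(x) + fraction * (max(x) - min(x))
}

thr <- amplitude_threshold(gcc, 0.5)
peak <- which.max(gcc)

# Rising limb: first day up to the peak at or above the threshold
rising <- dates[which(gcc[1:peak] >= thr)[1]]

# Falling limb: last day from the peak onwards at or above the threshold
after_peak <- peak:length(gcc)
falling <- dates[after_peak[max(which(gcc[after_peak] >= thr))]]

print(rising)
print(falling)
```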
With the phenology dates calculated we can plot their respective locations on the smoothed time series. In this case the plot will show the 50% amplitude threshold values for both rising and falling parts of the 90th percentile Gcc curve, marked with green and brown vertical lines respectively.
plot(as.Date(df$data$date),
df$data$smooth_gcc_90,
type = "l",
xlab = "date",
ylab = "Gcc")
# rising "spring" greenup dates
abline(v = phenology_dates$rising$transition_50,
col = "green")
# falling "autumn" senescence dates
abline(v = phenology_dates$falling$transition_50,
col = "brown")
Hufkens K., Basler J. D., Milliman T., Melaas E., Richardson A.D. 2018. An integrated phenology modelling framework in R: Phenology modelling with phenor. Methods in Ecology & Evolution, 9: 1-10.
This project is supported by the National Science Foundation’s Macro-system Biology Program (awards EF-1065029 and EF-1702697).