# WyAdapt Future Climate API Example

In this example we walk through some basic workflows for accessing, querying, and even downloading climate data available within the WyAdapt cyberinfrastructure. We use REST APIs to query existing climate summary data that are stored on the ARCC S3 Pathfinder storage server as Cloud Optimized GeoTIFFs (COGs). Currently we have multiple GCMs for which daily data have been aggregated to monthly and annual COGs.

## Using R with WyAdapt Climate Data

Within the WyAdapt.org application, you can find a complete listing of API endpoints here: https://wyadapt.org/swagger/index.html

For the examples below we will use Program R and need the following packages installed. If you do not have any of them, the following code should get you started.

```r
# install.packages(c("sf", "terra", "lubridate", "httr", "jsonlite", "tidyverse", "tidyjson", "plotly", "kableExtra"))
```

Once you have the correct packages installed, let's load them into our session so we can do some work.
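A simple loading block, mirroring the packages from the install list above:

```r
# Load the packages used throughout this example
library(sf)         # vector data (counties)
library(terra)      # raster data (COGs)
library(lubridate)  # date handling
library(httr)       # REST API requests
library(jsonlite)   # JSON parsing
library(tidyverse)  # data wrangling and plotting
library(tidyjson)   # spread_all() for nested JSON
library(plotly)     # interactive plots
library(kableExtra) # table formatting
```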
Now that our environment is prepared, let's check out the available datasets, starting with the GCMs:
<- GET("https://wyadapt.org/api/ClimateDataRepo/GcmRcpVariantCombinations") %>%
gcmcontent() %>%
spread_all() %>%
data.frame() %>%
select(gcm, variant, description) %>%
distinct() %>%
arrange(gcm)
| gcm | variant | description |
|---|---|---|
| access-cm2 | r5i1p1f1 | Commonwealth Scientific and Industrial Research Organization (CSIRO) and Bureau of Meteorology (BOM) Australia |
| canesm5 | r1i1p2f1 | Canadian Centre for Climate Modelling and Analysis |
| cesm2 | r11i1p1f1 | Community Earth System Model Contributors |
| cnrm-esm2-1 | r1i1p1f2 | Centre National de Recherches Météorologiques/ Centre Européen de Recherche et Formation Avancée en Calcul Scientifique |
| ec-earth3 | r1i1p1f1 | EC-Earth consortium |
| ec-earth3-veg | r1i1p1f1 | EC-Earth consortium |
| ensemble | p75 | 75th percentile across all GCMs |
| ensemble | p10 | 10th percentile across all GCMs |
| ensemble | p50 | 50th percentile across all GCMs |
| ensemble | p90 | 90th percentile across all GCMs |
| ensemble | p25 | 25th percentile across all GCMs |
| fgoals-g3 | r1i1p1f1 | Chinese Academy of Sciences |
| giss-e2-1-g | r1i1p1f2 | Goddard Institute for Space Studies |
| miroc6 | r1i1p1f1 | Atmosphere and Ocean Research Institute (The University of Tokyo), National Institute for Environmental Studies, and Japan Agency for Marine-Earth Science and Technology |
| mpi-esm1-2-hr | r7i1p1f1 | Max Planck Institute for Meteorology |
| mpi-esm1-2-hr | r3i1p1f1 | Max Planck Institute for Meteorology |
| mpi-esm1-2-lr | r7i1p1f1 | Max Planck Institute for Meteorology |
| noresm2-mm | r1i1p1f1 | Norwegian Earth System Model |
| taiesm1 | r1i1p1f1 | Research Center for Environmental Changes, Academia Sinica (Taiwan) |
| ukesm1-0-ll | r2i1p1f2 | Met Office, Hadley Centre (UK) |
Admittedly, GCMs can feel a little opaque for individuals not 'in the know' (variant labels such as r1i1p1f1 encode the CMIP6 realization, initialization, physics, and forcing indices), but the ensembles have piqued my interest given that they are an aggregation across all of the available GCMs. What are the available variables?
<- GET("https://wyadapt.org/api/ClimateDataRepo/Variables") %>%
varscontent() %>%
spread_all() %>%
data.frame() %>%
select(variable, alias, units, scaleFactor, description)
| variable | alias | units | scaleFactor | description |
|---|---|---|---|---|
| prec | Precipitation | mm | 0.01 | Precipitation for monthly or annual |
| t2 | Average Temperature | °C | 0.01 | Average Temperature for monthly or annual |
| t2max | Maximum Temperature | °C | 0.01 | Maximum Temperature for monthly or annual |
| t2min | Minimum Temperature | °C | 0.01 | Minimum Temperature for monthly or annual |
| cdd | Max Consecutive Days Precipitation <1mm/day | days | 1.00 | Max number of consecutive days with precipitation < 1mm/day (integer) |
| cwd | Max Consecutive Days Precipitation >1mm/day | days | 1.00 | Max number of consecutive days with precipitation >= 1 mm/day (integer) |
| fd | Frost Days | days | 1.00 | Frost days: Annual number of days with Min temp < 0C (integer) |
| id | Icing Days | days | 1.00 | Icing days: Annual number of days with Max temp < 0C (integer) |
| r20mm | Annual Count Precipitation >=20mm/day | days | 1.00 | Annual count of days when precipitation >= 20mm/day (integer) |
| r95p | Annual Total Precipitation When Daily >95th | mm | 0.01 | Annual total precipitation when daily precip exceeds the 95th percentile of wet day precip (float) |
| rx1day | Annual Max 1-day Precipitation | mm | 0.01 | Rx1day - Annual maximum one-day precipitation (float) |
| rx5day | Annual Max 5-day Precipitation | mm | 0.01 | Rx5day - Annual maximum five-day precipitation (float) |
| sdii | Simple Precipitation Intensity Index | mm | 0.01 | Simple Precipitation Intensity Index (Annual mean Wet days >1mm precipitation) (float) |
| snow | Snow | mm | 0.01 | Snow Water Equivalent monthly |
| snowApril | SWE April 1st | mm | 0.01 | Snow Water Equivalent on April 1st for given year |
| snowFeb | SWE February 1st | mm | 0.01 | Snow Water Equivalent on February 1st for given year |
| snowJan | SWE January 1st | mm | 0.01 | Snow Water Equivalent on January 1st for given year |
| snowJune | SWE June 1st | mm | 0.01 | Snow Water Equivalent on June 1st for given year |
| snowMar | SWE March 1st | mm | 0.01 | Snow Water Equivalent on March 1st for given year |
| snowMay | SWE May 1st | mm | 0.01 | Snow Water Equivalent on May 1st for given year |
| tnn | Monthly Minimum of Daily Temperature | °C | 0.01 | Monthly minimum of daily minimum temperature (float) |
| txx | Monthly Maximum of Daily Temperature | °C | 0.01 | Monthly maximum of daily maximum temperature (float) |
There are a lot of different variables, and it is important to note the scale factor: raster values are stored as scaled integers and must be multiplied by scaleFactor to recover physical units.
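As a quick illustration (the stored value here is made up for the example), converting a raw t2 cell value to degrees Celsius looks like this:

```r
# Hypothetical stored cell value for t2, which has scaleFactor = 0.01
stored <- 1250
stored * 0.01 # 12.5 degrees C
```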
I mentioned above that daily data are aggregated into monthly and annual products. Here are the available timescales…
<- GET("https://wyadapt.org/api/ClimateDataRepo/Timescale") %>%
( temporalcontent() %>%
unlist() )
[1] "30yearmonthlyrefdif" "monthlyrefdif" "annual"
[4] "monthly" "30yearannual" "30yearmonthly"
[7] "30yearannualrefdif" "annualrefdif"
…and the available spatial resolutions:
<- GET("https://wyadapt.org/api/ClimateDataRepo/Resolution") %>%
rescontent() %>%
spread_all() %>%
data.frame() %>%
select(-c(`document.id`, `..JSON`))
| gcm | variant | resolution | description |
|---|---|---|---|
| access-cm2 | r5i1p1f1 | d02_9km | 9 kilometer cell size |
| canesm5 | r1i1p2f1 | d02_9km | 9 kilometer cell size |
| cesm2 | r11i1p1f1 | d02_9km | 9 kilometer cell size |
| cnrm-esm2-1 | r1i1p1f2 | d02_9km | 9 kilometer cell size |
| ec-earth3 | r1i1p1f1 | d02_9km | 9 kilometer cell size |
| ec-earth3-veg | r1i1p1f1 | d02_9km | 9 kilometer cell size |
OK, now that we have a pretty good idea of the available datasets, let's learn how to interact with them; there are several methods!
Since the climate data are COGs, we can query them spatially from the cloud without having to download files. To do this, let's first add some polygons to our environment; let's go with the counties of Wyoming from the Esri feature service. Notice that we add an ID column that matches the row.names() of the layer. We will use this later on in our processing.
```r
# Get Wyoming counties
url <- parse_url("https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Boundaries_2022/FeatureServer/2/query")
url$query <- list(where = "STATE_FIPS='56'",
                  outFields = "*",
                  returnGeometry = "true",
                  f = "geojson")
request <- build_url(url)

county <- st_read(request) %>%
  mutate(ID = as.integer(rownames(.)))
```
```
Reading layer `OGRGeoJSON' from data source
  `https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Boundaries_2022/FeatureServer/2/query?where=STATE_FIPS%3D%2756%27&outFields=%2A&returnGeometry=true&f=geojson'
  using driver `GeoJSON'
Simple feature collection with 23 features and 14 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -111.0546 ymin: 40.99478 xmax: -104.0523 ymax: 45.00582
Geodetic CRS:  WGS 84
```
```r
ggplot(county) +
  geom_sf()
```
Let's look at the ensemble of annual mean temperature for Albany County for the entire time series. First we need to obtain the URLs for the specified variable and timescale (notice the query parameters within the URL). In later steps we will prepend "/vsicurl/" to each URL in a new column, urlCOG; this prefix tells GDAL (and therefore R) to query the file in place on the S3 store, not download it. In this first example we instead let terra add the prefix for us with vsi = TRUE.
<- GET("https://wyadapt.org/api/ClimateDataRepo/ClimateUrl?timescale=annual&gcm=ensemble&variant=p50&variable=t2") %>%
cogscontent() %>%
spread_all() %>%
data.frame() %>%
select(-c(document.id, `..JSON`))
| url | filename | resolution | biascorrected | timescale | gcm | rcp | variable | variant | years | sampMonth | scale | units | alias | statistics |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| https://pathfinder.arcc.uwyo.edu/wyadapt/d02_9km/bias_corrected/cogs/annual/t2/t2_ensemble.p50_ssp370_BC_d02_1981-2099_annual.tif | t2_ensemble.p50_ssp370_BC_d02_1981-2099_annual.tif | d02_9km | TRUE | annual | ensemble | ssp370 | t2 | p50 | 1981-2099 | 13 | 0.01 | °C | Average Temperature | [-620, 3186, 1274.2671, 604.6369] |
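To make the two streaming options concrete, here is a minimal sketch; both forms point GDAL's vsicurl driver at the same file in the repository:

```r
# Option 1: prepend the GDAL vsicurl prefix ourselves
r1 <- rast(paste0("/vsicurl/", cogs$url[1]))

# Option 2: let terra add the prefix for us via the vsi argument
r2 <- rast(cogs$url[1], vsi = TRUE)
```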
Now that we know which data we are querying (annual estimates of mean temperature from the median ensemble values) and where the data live (the url column), we are all set to conduct some raster analyses using cloud resources! First we stack the rasters, then we extract the mean of all cells within our polygon boundary (filtering for Albany County) for each layer (in this case, one layer per year of annual mean temperature). Be sure to stream the files via vsicurl (here, vsi = TRUE) AND to adjust the raster values by the scale factor!
```r
# Stack the rasters
rs <- rast(cogs$url, vsi = TRUE)

# Extract mean values for only Albany County
e <- terra::extract(rs, county %>%
  filter(FIPS %in% c("56001")), fun = mean)
```
You will notice our output has 1 row, representing Albany County, and 119 additional columns representing the annual mean temperatures for 1981-2099. Let's build on this logic to make the data more user-friendly and meaningful. Note that we use the ID column we made when reading in the county dataset.
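Before we parse them, it helps to see what the layer names look like. A minimal peek (the exact names shown are illustrative, inferred from the parsing step below, which expects a gcm.variant_variable_year pattern):

```r
# Inspect the first few column names of the extract output
names(e)[1:4]
# e.g. "ID" "ensemble.p50_t2_1981" "ensemble.p50_t2_1982" "ensemble.p50_t2_1983"
```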
```r
# Extract mean values for only Albany County, parse metadata from the layer names, and append county info
e <- terra::extract(rs, county %>%
  filter(FIPS %in% c("56001")), fun = mean) %>%
  pivot_longer(2:ncol(.), names_to = "raster", values_to = "value") %>%
  separate_wider_delim(raster, delim = "_", names = c("gcm", "variable", "waterYear")) %>%
  separate_wider_delim(gcm, delim = ".", names = c("gcm", "variant")) %>%
  mutate(sampDate = as.Date(paste(waterYear, "01", "01", sep = "-")),
         waterYear = as.integer(waterYear)) %>%
  left_join(cogs %>%
              select(gcm, variant, variable, scale), by = c("gcm", "variant", "variable")) %>%
  left_join(county %>%
              st_drop_geometry() %>% # drop the geometry so the join returns a plain data frame
              select(ID, NAME:FIPS), by = "ID") %>%
  mutate(value = value * scale) # need to rescale the results
```
Excellent! Let’s visualize the results:
```r
ggplotly(
  e %>%
    ggplot(aes(x = sampDate, y = value, group = NAME, color = NAME)) +
    geom_line() +
    ylab(paste(cogs$alias[1], cogs$units[1])) +
    theme_classic()
)
```
Hopefully you can imagine that the above workflow can be expanded to work for all of the counties simultaneously!
```r
# Extract mean values for all counties, parse metadata from the layer names, and append county info
eAll <- terra::extract(rs, county, fun = mean) %>%
  pivot_longer(2:ncol(.), names_to = "raster", values_to = "value") %>%
  separate_wider_delim(raster, delim = "_", names = c("gcm", "variable", "waterYear")) %>%
  separate_wider_delim(gcm, delim = ".", names = c("gcm", "variant")) %>%
  mutate(sampDate = as.Date(paste(waterYear, "01", "01", sep = "-")),
         waterYear = as.integer(waterYear)) %>%
  left_join(cogs %>%
              select(gcm, variant, variable, scale), by = c("gcm", "variant", "variable")) %>%
  left_join(county %>%
              st_drop_geometry() %>% # drop the geometry so the join returns a plain data frame
              select(ID, NAME:FIPS), by = "ID") %>%
  mutate(value = value * scale) # need to rescale the results
```
```r
ggplotly(
  eAll %>%
    ggplot(aes(x = sampDate, y = value, group = NAME, color = NAME)) +
    geom_line() +
    ylab(paste(cogs$alias[1], cogs$units[1])) +
    theme_classic()
)
```
OK, this is great: we can look at the median temperatures through time for each county. But we know there is more than one GCM, hence the ensemble of models, so it might be a good idea to incorporate model variance. Luckily for us, these data are available: the ensemble dataset has the 10th, 25th, 50th, 75th, and 90th percentiles calculated. Let's take advantage of this, using our Albany County example.
```r
# Expand our query to include all ensemble percentiles
cogs <- GET("https://wyadapt.org/api/ClimateDataRepo/ClimateUrl?timescale=annual&gcm=ensemble&variable=t2") %>%
  content() %>%
  spread_all() %>%
  data.frame() %>%
  select(-c(`document.id`, `..JSON`)) %>%
  mutate(urlCOG = paste0("/vsicurl/", url))
```
Again we have a list of URLs, albeit a larger one. We can stack this larger dataset and extract values. Notice that the code is exactly the same as when we queried a single GCM (ensemble p50), only now it runs over all of the percentiles!
```r
# Stack the rasters
rs <- rast(cogs$urlCOG)

e <- terra::extract(rs, county %>%
  filter(FIPS %in% c("56001")), fun = mean) %>%
  pivot_longer(2:ncol(.), names_to = "raster", values_to = "value") %>%
  separate_wider_delim(raster, delim = "_", names = c("gcm", "variable", "waterYear")) %>%
  separate_wider_delim(gcm, delim = ".", names = c("gcm", "variant")) %>%
  mutate(sampDate = as.Date(paste(waterYear, "01", "01", sep = "-")),
         waterYear = as.integer(waterYear)) %>%
  left_join(cogs %>%
              select(gcm, variant, variable, scale), by = c("gcm", "variant", "variable")) %>%
  left_join(county %>%
              st_drop_geometry() %>% # drop the geometry so the join returns a plain data frame
              select(ID, NAME:FIPS), by = "ID") %>%
  mutate(value = value * scale) # need to rescale the results
```
Now we have to adjust the shape of the data slightly so we can visualize the temperature range across GCMs (via the ensemble percentiles).
```r
# Flatten the data: one column per ensemble percentile
dat <- e %>%
  pivot_wider(id_cols = c(NAME, STATE_ABBR, variable, sampDate), names_from = variant, values_from = value)
```
```r
ggplotly(
  dat %>%
    ggplot(aes(sampDate)) +
    geom_ribbon(aes(ymin = p10, ymax = p90), fill = "grey90") +
    geom_ribbon(aes(ymin = p25, ymax = p75), fill = "grey75") +
    geom_line(aes(y = p50, group = NAME, color = NAME)) +
    ylab(paste(cogs$alias[1], cogs$units[1])) +
    theme_classic()
)
```
We can map one of the layers if we like (this uses the mapview package):
```r
mapview::mapview(rs[[44]] * cogs$scale[3], layer.name = "2024 Mean Temp (C)")
```
## Download WyAdapt Future Climate Rasters

If you really want to, you can easily download these data to your personal machine. The example below should get you started.
```r
# Output location
odir <- "G:/Test/COG"

cogs <- GET("https://wyadapt.org/api/ClimateDataRepo/ClimateUrl?timescale=annual&gcm=ensemble-p50&variable=t2") %>%
  content() %>%
  spread_all() %>%
  data.frame() %>%
  select(-c(`document.id`, `..JSON`)) %>%
  mutate(urlCOG = paste0("/vsicurl/", url)) %>%
  slice_head(n = 3) # shorten the amount of data as this is an example
```
```r
# Download the files from the repository
for(i in 1:nrow(cogs)){
  try(utils::download.file(url = cogs$url[i], destfile = paste0(odir, "/", cogs$filename[i]), quiet = TRUE, mode = "wb"), silent = TRUE)
}
```
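As a quick sanity check (a sketch, assuming the downloads succeeded), you can load one of the downloaded files and apply the scale factor before plotting:

```r
# Load the first downloaded COG and rescale to physical units
r <- rast(file.path(odir, cogs$filename[1]))
plot(r[[1]] * cogs$scale[1], main = cogs$alias[1])
```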