Explore and analyze the continuous chlorophyll data to be included in the NDFS synthesis manuscript. We will attempt to fit a generalized additive model (GAM) to the data set to help account for seasonality in the data. We will also include interaction terms in the model which wasn’t explored in the analysis for the NDFS Synthesis report.
# Load packages
library(tidyverse)
library(lubridate)
library(scales)
library(knitr)
library(mgcv)
library(lme4)
library(car)
library(emmeans)
library(multcomp)
library(gratia)
library(here)
library(conflicted)
# Source functions
source(here("manuscript_synthesis/src/global_functions.R"))
# Declare package conflict preferences
conflicts_prefer(dplyr::filter(), dplyr::lag(), dplyr::select())
Display current versions of R and packages used for this analysis:
devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.2.3 (2023-03-15 ucrt)
## os Windows 10 x64 (build 19045)
## system x86_64, mingw32
## ui RTerm
## language (EN)
## collate English_United States.utf8
## ctype English_United States.utf8
## tz America/Los_Angeles
## date 2024-02-28
## pandoc 3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## abind 1.4-5 2016-07-21 [1] CRAN (R 4.2.0)
## boot 1.3-28.1 2022-11-22 [2] CRAN (R 4.2.3)
## bslib 0.4.2 2022-12-16 [1] CRAN (R 4.2.2)
## cachem 1.0.8 2023-05-01 [1] CRAN (R 4.2.3)
## callr 3.7.3 2022-11-02 [1] CRAN (R 4.2.2)
## car * 3.1-2 2023-03-30 [1] CRAN (R 4.2.3)
## carData * 3.0-5 2022-01-06 [1] CRAN (R 4.2.1)
## cli 3.6.2 2023-12-11 [1] CRAN (R 4.2.3)
## coda 0.19-4 2020-09-30 [1] CRAN (R 4.2.1)
## codetools 0.2-19 2023-02-01 [2] CRAN (R 4.2.3)
## colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.2.2)
## conflicted * 1.2.0 2023-02-01 [1] CRAN (R 4.2.2)
## crayon 1.5.2 2022-09-29 [1] CRAN (R 4.2.1)
## devtools 2.4.5 2022-10-11 [1] CRAN (R 4.2.1)
## digest 0.6.33 2023-07-07 [1] CRAN (R 4.2.3)
## dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.2.3)
## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.1)
## emmeans * 1.8.5 2023-03-08 [1] CRAN (R 4.2.2)
## estimability 1.4.1 2022-08-05 [1] CRAN (R 4.2.1)
## evaluate 0.21 2023-05-05 [1] CRAN (R 4.2.3)
## fansi 1.0.6 2023-12-08 [1] CRAN (R 4.2.3)
## fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.2.2)
## forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.2.2)
## fs * 1.6.3 2023-07-20 [1] CRAN (R 4.2.3)
## generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.1)
## ggplot2 * 3.4.3 2023-08-14 [1] CRAN (R 4.2.3)
## glue 1.7.0 2024-01-09 [1] CRAN (R 4.2.3)
## gratia * 0.8.1 2023-02-02 [1] CRAN (R 4.2.2)
## gtable 0.3.4 2023-08-21 [1] CRAN (R 4.2.3)
## here * 1.0.1 2020-12-13 [1] CRAN (R 4.2.1)
## hms 1.1.3 2023-03-21 [1] CRAN (R 4.2.3)
## htmltools 0.5.5 2023-03-23 [1] CRAN (R 4.2.3)
## htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.2.3)
## httpuv 1.6.9 2023-02-14 [1] CRAN (R 4.2.2)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.2.1)
## jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.2.3)
## knitr * 1.42 2023-01-25 [1] CRAN (R 4.2.2)
## later 1.3.0 2021-08-18 [1] CRAN (R 4.2.1)
## lattice 0.21-8 2023-04-05 [1] CRAN (R 4.2.3)
## lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.2.3)
## lme4 * 1.1-34 2023-07-04 [1] CRAN (R 4.2.3)
## lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.2.3)
## magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.1)
## MASS * 7.3-58.2 2023-01-23 [2] CRAN (R 4.2.3)
## Matrix * 1.5-4 2023-04-04 [1] CRAN (R 4.2.3)
## memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.1)
## mgcv * 1.8-42 2023-03-02 [1] CRAN (R 4.2.2)
## mime 0.12 2021-09-28 [1] CRAN (R 4.2.0)
## miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.2.1)
## minqa 1.2.5 2022-10-19 [1] CRAN (R 4.2.2)
## multcomp * 1.4-23 2023-03-09 [1] CRAN (R 4.2.2)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.1)
## mvnfast 0.2.8 2023-02-23 [1] CRAN (R 4.2.2)
## mvtnorm * 1.1-3 2021-10-08 [1] CRAN (R 4.2.0)
## nlme * 3.1-162 2023-01-31 [2] CRAN (R 4.2.3)
## nloptr 2.0.3 2022-05-26 [1] CRAN (R 4.2.1)
## patchwork * 1.1.2 2022-08-19 [1] CRAN (R 4.2.1)
## pillar 1.9.0 2023-03-22 [1] CRAN (R 4.2.3)
## pkgbuild 1.4.2 2023-06-26 [1] CRAN (R 4.2.3)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.1)
## pkgload 1.3.2.1 2023-07-08 [1] CRAN (R 4.2.3)
## prettyunits 1.2.0 2023-09-24 [1] CRAN (R 4.2.3)
## processx 3.8.2 2023-06-30 [1] CRAN (R 4.2.3)
## profvis 0.3.7 2020-11-02 [1] CRAN (R 4.2.1)
## promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.2.1)
## ps 1.7.5 2023-04-18 [1] CRAN (R 4.2.3)
## purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.2.3)
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.1)
## Rcpp 1.0.11 2023-07-06 [1] CRAN (R 4.2.3)
## readr * 2.1.5 2024-01-10 [1] CRAN (R 4.2.3)
## remotes 2.4.2 2021-11-30 [1] CRAN (R 4.2.1)
## rlang * 1.1.3 2024-01-10 [1] CRAN (R 4.2.3)
## rmarkdown 2.21 2023-03-26 [1] CRAN (R 4.2.3)
## rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.2.1)
## rstudioapi 0.14 2022-08-22 [1] CRAN (R 4.2.1)
## sandwich 3.0-2 2022-06-15 [1] CRAN (R 4.2.1)
## sass 0.4.6 2023-05-03 [1] CRAN (R 4.2.3)
## scales * 1.2.1 2022-08-20 [1] CRAN (R 4.2.1)
## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.1)
## shiny 1.7.4 2022-12-15 [1] CRAN (R 4.2.2)
## stringi 1.8.3 2023-12-11 [1] CRAN (R 4.2.3)
## stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.2.3)
## survival * 3.5-5 2023-03-12 [1] CRAN (R 4.2.3)
## TH.data * 1.1-2 2023-04-17 [1] CRAN (R 4.2.3)
## tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.2.3)
## tidyr * 1.3.1 2024-01-24 [1] CRAN (R 4.2.3)
## tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.2.1)
## tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.2.2)
## timechange 0.3.0 2024-01-18 [1] CRAN (R 4.2.3)
## tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.2.3)
## urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.2.1)
## usethis 2.1.6 2022-05-25 [1] CRAN (R 4.2.1)
## utf8 1.2.3 2023-01-31 [1] CRAN (R 4.2.2)
## vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.2.3)
## withr 3.0.0 2024-01-16 [1] CRAN (R 4.2.3)
## xfun 0.39 2023-04-20 [1] CRAN (R 4.2.3)
## xtable 1.8-4 2019-04-21 [1] CRAN (R 4.2.1)
## yaml 2.3.7 2023-01-23 [1] CRAN (R 4.2.2)
## zoo 1.8-12 2023-04-13 [1] CRAN (R 4.2.3)
##
## [1] C:/R/win-library/4.2
## [2] C:/Program Files/R/R-4.2.3/library
##
## ──────────────────────────────────────────────────────────────────────────────
# Define root file path for data
fp_data <- here("manuscript_synthesis/data")
# Import daily average water quality data
df_wq <- readRDS(file.path(fp_data, "processed/wq_daily_avg_2013-2019.rds"))
# Define file path for file containing region assignments for the continuous stations
fp_rtm_region <- ndfa_abs_sp_path(
"2011-2019 Synthesis Study-FASTR/WQ_Subteam/Processed_Data/Continuous/NDFA_Cont_WQ_Stations.csv"
)
# Import region assignments for the continuous stations
df_rtm_region <- read_csv(fp_rtm_region) %>% select(StationCode, Region = BroadRegion)
# Import dates of flow action periods
df_fa_dates <- read_csv(file.path(fp_data, "raw/FlowDatesDesignations_45days.csv"))
# Create a vector for the factor order of StationCode
sta_order <- c(
"RCS",
"RD22",
"I80",
"LIS",
"STTD",
"LIB",
"RYI",
"RVB"
)
# Prepare continuous chlorophyll for exploration and analysis
df_chla_c <- df_wq %>%
select(StationCode, Date, Chla) %>%
drop_na(Chla) %>%
# create Year variable
mutate(Year = year(Date)) %>%
# Scale and log transform chlorophyll values
mutate(Chla_log = log(Chla * 1000)) %>%
# Add Flow Action Periods, Region assignments, and DOY
ndfa_action_periods() %>%
left_join(df_rtm_region) %>%
mutate(DOY = yday(Date)) %>%
arrange(StationCode, Date) %>%
mutate(
# Apply factor orders to FlowActionPeriod, Region, and StationCode
FlowActionPeriod = factor(FlowActionPeriod, levels = c("Before", "During", "After")),
Region = factor(Region, levels = c("Upstream", "Downstream")),
StationCode = factor(StationCode, levels = sta_order),
# Add a column for Year as a factor for the model
Year_fct = factor(Year)
)
# Prepare dates of flow action periods to highlight the flow action periods for
# each year in the plots
df_fa_dates_c <- df_fa_dates %>%
transmute(
Year,
across(c(PreFlowEnd, PostFlowStart), mdy),
# add 1 day to PreFlowEnd so that the highlight for the flow action periods
# aligns correctly
PreFlowEnd = PreFlowEnd + days(1),
# Add DOY for PreFlowEnd and PostFlowStart
across(where(is.Date), yday, .names = "{.col}_DOY")
) %>%
# only include years 2013-2019
filter(Year > 2012)
df_chla_c %>%
count(Year, FlowActionPeriod, StationCode) %>%
arrange(StationCode) %>%
pivot_wider(names_from = StationCode, values_from = n) %>%
arrange(Year, FlowActionPeriod) %>%
kable()
Year | FlowActionPeriod | RCS | RD22 | I80 | LIS | STTD | LIB | RYI | RVB |
---|---|---|---|---|---|---|---|---|---|
2013 | Before | 7 | 7 | 39 | 7 | 37 | 46 | ||
2013 | During | 42 | 42 | 38 | 42 | 42 | 42 | ||
2013 | After | 33 | 30 | 34 | 33 | 46 | 46 | ||
2014 | Before | 46 | 46 | 43 | 46 | 46 | 46 | 45 | |
2014 | During | 11 | 15 | 15 | 14 | 15 | 15 | 14 | |
2014 | After | 1 | 46 | 46 | 46 | 29 | 46 | 46 | |
2015 | Before | 4 | 28 | 15 | 46 | 25 | 46 | 46 | |
2015 | During | 42 | 42 | 36 | 42 | 42 | 38 | 1 | 42 |
2015 | After | 40 | 40 | 40 | 46 | 46 | 46 | 46 | 46 |
2016 | Before | 21 | 21 | 21 | 46 | 21 | 46 | 39 | 39 |
2016 | During | 19 | 19 | 19 | 19 | 19 | 19 | 19 | 19 |
2016 | After | 46 | 46 | 46 | 46 | 46 | 46 | 46 | 37 |
2017 | Before | 46 | 46 | 46 | 46 | 46 | 46 | 46 | |
2017 | During | 21 | 21 | 21 | 12 | 4 | 21 | 21 | |
2017 | After | 46 | 46 | 46 | 46 | 6 | 46 | 46 | |
2018 | Before | 46 | 46 | 46 | 46 | 39 | 14 | 46 | 46 |
2018 | During | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 |
2018 | After | 46 | 46 | 46 | 46 | 19 | 46 | 46 | 46 |
2019 | Before | 46 | 46 | 46 | 46 | 31 | 20 | 46 | 46 |
2019 | During | 27 | 27 | 27 | 27 | 27 | 17 | 27 | 27 |
2019 | After | 46 | 46 | 46 | 46 | 46 | 39 | 46 | 46 |
RCS and RYI have some gaps, but we’ll keep them in for now.
We would like to include interaction terms in the model, so we need to look at sample sizes and visuals of the data.
df_chla_c %>% distinct(Region, StationCode) %>% arrange(Region, StationCode)
## # A tibble: 8 × 2
## Region StationCode
## <fct> <fct>
## 1 Upstream RCS
## 2 Upstream RD22
## 3 Upstream I80
## 4 Upstream LIS
## 5 Upstream STTD
## 6 Downstream LIB
## 7 Downstream RYI
## 8 Downstream RVB
df_chla_c %>%
count(Year, FlowActionPeriod, Region) %>%
pivot_wider(names_from = Region, values_from = n) %>%
kable()
Year | FlowActionPeriod | Upstream | Downstream |
---|---|---|---|
2013 | Before | 60 | 83 |
2013 | During | 164 | 84 |
2013 | After | 130 | 92 |
2014 | Before | 227 | 91 |
2014 | During | 70 | 29 |
2014 | After | 168 | 92 |
2015 | Before | 118 | 92 |
2015 | During | 204 | 81 |
2015 | After | 212 | 138 |
2016 | Before | 130 | 124 |
2016 | During | 95 | 57 |
2016 | After | 230 | 129 |
2017 | Before | 230 | 92 |
2017 | During | 79 | 42 |
2017 | After | 190 | 92 |
2018 | Before | 223 | 106 |
2018 | During | 150 | 90 |
2018 | After | 203 | 138 |
2019 | Before | 215 | 112 |
2019 | During | 135 | 71 |
2019 | After | 230 | 131 |
It looks like there is an adequate number of samples in each group.
Let’s explore the data with some plots. First lets plot the data in boxplots facetted by Year and Region using a log10 scale to see the results better.
df_chla_c %>%
ggplot(aes(x = FlowActionPeriod, y = Chla, fill = FlowActionPeriod)) +
geom_boxplot() +
facet_grid(rows = vars(Year), cols = vars(Region)) +
scale_y_log10(labels = trans_format("log10", math_format(10^.x))) +
annotation_logticks(sides = "l")
There may be some interaction between Flow Period, Year, and Region, but it’s difficult to see clearly. Its obvious that Upstream is higher than Downstream. Also 2018 and 2019 stand out in the Upstream region - During appears lower than Before and After.
Now let’s look at the same boxplots but grouped by Station. First, the stations in the Upstream region:
df_chla_c %>%
filter(Region == "Upstream") %>%
ggplot(aes(x = FlowActionPeriod, y = Chla, fill = FlowActionPeriod)) +
geom_boxplot() +
facet_grid(rows = vars(Year), cols = vars(StationCode)) +
scale_y_log10(labels = trans_format("log10", math_format(10^.x))) +
annotation_logticks(sides = "l")
The patterns differ by Station, however we’ll keep these all in the same region for consistency purposes.
Next, let’s look at the stations in the Downstream region:
df_chla_c %>%
filter(Region == "Downstream") %>%
ggplot(aes(x = FlowActionPeriod, y = Chla, fill = FlowActionPeriod)) +
geom_boxplot() +
facet_grid(rows = vars(Year), cols = vars(StationCode)) +
scale_y_log10(labels = trans_format("log10", math_format(10^.x))) +
annotation_logticks(sides = "l")
The patterns appear to differ by Station, but not as obviously as with the stations in the upstream region.
Let’s look at time-series plots based on day of year for each Region facetted by Year. The brown shaded areas represent the flow pulse periods for each year.
df_chla_c %>%
ggplot(aes(x = DOY, y = Chla, color = Region)) +
geom_point(size = 1, alpha = 0.2) +
scale_color_viridis_d(option = "plasma", end = 0.8) +
facet_wrap(vars(Year), scales = "free_y") +
geom_rect(
data = df_fa_dates_c,
aes(
xmin = PreFlowEnd_DOY,
xmax = PostFlowStart_DOY,
ymin = -Inf,
ymax = Inf
),
inherit.aes = FALSE,
alpha = 0.2,
fill = "brown"
) +
theme_bw()
Now, let’s fit some GAM models to these time-series plots.
df_chla_c %>%
ggplot(aes(x = DOY, y = Chla, color = Region)) +
# using bs = "tp" since this is the default smooth for s terms in mgcv::gam
geom_smooth(method = "gam", formula = y ~ s(x, bs = "tp")) +
scale_color_viridis_d(option = "plasma", end = 0.8) +
facet_wrap(vars(Year), scales = "free_y") +
geom_rect(
data = df_fa_dates_c,
aes(
xmin = PreFlowEnd_DOY,
xmax = PostFlowStart_DOY,
ymin = -Inf,
ymax = Inf
),
inherit.aes = FALSE,
alpha = 0.2,
fill = "brown"
) +
theme_bw()
These GAMs do a nice job of displaying the general trends for each Region. Overall, the Downstream region didn’t vary much through time except for in 2016. There is an obvious decrease in chlorophyll in the Upstream region during flow pulses in 2015 and 2017-2019.
Since the model will fit the smooth to all the data across years, regions and flow action periods, lets take a look at what that looks like using the log-transformed data.
df_chla_c %>%
ggplot(aes(x = DOY, y = Chla_log)) +
geom_point(size = 1, alpha = 0.2) +
# using bs = "tp" since this is the default smooth for s terms in mgcv::gam
geom_smooth(method = "gam", formula = y ~ s(x, bs = "tp")) +
theme_bw()
Finally, let’s see how these smooths look if we group by Region.
df_chla_c %>%
ggplot(aes(x = DOY, y = Chla_log, color = Region)) +
geom_point(size = 1, alpha = 0.2) +
# using bs = "tp" since this is the default smooth for s terms in mgcv::gam
geom_smooth(method = "gam", formula = y ~ s(x, bs = "tp")) +
scale_color_viridis_d(option = "plasma", end = 0.8) +
theme_bw()
We’ll try running a GAM including all two-way interactions between Year, Flow Action Period, and Region, a smooth term for day of year to account for seasonality, and Station as a random effect. First we’ll run the GAM without accounting for serial autocorrelation.
m_chla_gam <- gam(
Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + s(DOY) + s(StationCode, bs = "re"),
data = df_chla_c,
method = "REML"
)
Lets look at the model summary and diagnostics:
summary(m_chla_gam)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + s(DOY) +
## s(StationCode, bs = "re")
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.14927 0.15495 59.048 < 2e-16
## Year_fct2014 -0.08015 0.05913 -1.355 0.175349
## Year_fct2015 -0.19372 0.06160 -3.145 0.001672
## Year_fct2016 0.03639 0.08296 0.439 0.660907
## Year_fct2017 0.13888 0.05801 2.394 0.016706
## Year_fct2018 -0.25272 0.05755 -4.391 1.15e-05
## Year_fct2019 -0.06579 0.05772 -1.140 0.254399
## FlowActionPeriodDuring 0.30340 0.06489 4.676 3.00e-06
## FlowActionPeriodAfter 0.46490 0.08085 5.751 9.39e-09
## RegionDownstream -1.59727 0.24075 -6.635 3.57e-11
## Year_fct2014:FlowActionPeriodDuring -0.16455 0.07979 -2.062 0.039226
## Year_fct2015:FlowActionPeriodDuring -0.11091 0.07048 -1.574 0.115631
## Year_fct2016:FlowActionPeriodDuring -0.42141 0.09837 -4.284 1.87e-05
## Year_fct2017:FlowActionPeriodDuring -0.35651 0.07629 -4.673 3.04e-06
## Year_fct2018:FlowActionPeriodDuring -0.24945 0.06861 -3.636 0.000280
## Year_fct2019:FlowActionPeriodDuring -0.59046 0.07004 -8.430 < 2e-16
## Year_fct2014:FlowActionPeriodAfter -0.48075 0.07011 -6.858 7.79e-12
## Year_fct2015:FlowActionPeriodAfter -0.18275 0.06965 -2.624 0.008720
## Year_fct2016:FlowActionPeriodAfter -0.41231 0.10285 -4.009 6.18e-05
## Year_fct2017:FlowActionPeriodAfter -0.49840 0.06924 -7.199 6.92e-13
## Year_fct2018:FlowActionPeriodAfter -0.50229 0.06698 -7.499 7.46e-14
## Year_fct2019:FlowActionPeriodAfter -0.22473 0.06696 -3.356 0.000796
## Year_fct2014:RegionDownstream 0.54588 0.05952 9.172 < 2e-16
## Year_fct2015:RegionDownstream 0.34363 0.05532 6.212 5.63e-10
## Year_fct2016:RegionDownstream 0.68145 0.05764 11.823 < 2e-16
## Year_fct2017:RegionDownstream 0.07648 0.05860 1.305 0.191907
## Year_fct2018:RegionDownstream -0.37224 0.05563 -6.691 2.44e-11
## Year_fct2019:RegionDownstream -0.44992 0.05613 -8.015 1.34e-15
## FlowActionPeriodDuring:RegionDownstream 0.07136 0.03848 1.855 0.063702
## FlowActionPeriodAfter:RegionDownstream -0.28316 0.03322 -8.524 < 2e-16
##
## (Intercept) ***
## Year_fct2014
## Year_fct2015 **
## Year_fct2016
## Year_fct2017 *
## Year_fct2018 ***
## Year_fct2019
## FlowActionPeriodDuring ***
## FlowActionPeriodAfter ***
## RegionDownstream ***
## Year_fct2014:FlowActionPeriodDuring *
## Year_fct2015:FlowActionPeriodDuring
## Year_fct2016:FlowActionPeriodDuring ***
## Year_fct2017:FlowActionPeriodDuring ***
## Year_fct2018:FlowActionPeriodDuring ***
## Year_fct2019:FlowActionPeriodDuring ***
## Year_fct2014:FlowActionPeriodAfter ***
## Year_fct2015:FlowActionPeriodAfter **
## Year_fct2016:FlowActionPeriodAfter ***
## Year_fct2017:FlowActionPeriodAfter ***
## Year_fct2018:FlowActionPeriodAfter ***
## Year_fct2019:FlowActionPeriodAfter ***
## Year_fct2014:RegionDownstream ***
## Year_fct2015:RegionDownstream ***
## Year_fct2016:RegionDownstream ***
## Year_fct2017:RegionDownstream
## Year_fct2018:RegionDownstream ***
## Year_fct2019:RegionDownstream ***
## FlowActionPeriodDuring:RegionDownstream .
## FlowActionPeriodAfter:RegionDownstream ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 7.207 8.217 14.07 <2e-16 ***
## s(StationCode) 5.977 6.000 265.55 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.769 Deviance explained = 77.1%
## -REML = 4011.5 Scale est. = 0.24845 n = 5429
appraise(m_chla_gam)
k.check(m_chla_gam)
## k' edf k-index p-value
## s(DOY) 9 7.206775 0.9911756 0.2225
## s(StationCode) 8 5.976525 NA NA
draw(m_chla_gam, select = 1, residuals = TRUE, rug = FALSE)
plot(m_chla_gam, pages = 1, all.terms = TRUE)
acf(residuals(m_chla_gam))
Box.test(residuals(m_chla_gam), lag = 20, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: residuals(m_chla_gam)
## X-squared = 35027, df = 20, p-value < 2.2e-16
The model definitely has residual autocorrelation as indicated by the ACF plot and the Box-Ljung test. The smooth term for day of year may also be overfitted. Let’s try a smaller k-value for the smooth first, then lets try to address the residual autocorrelation.
m_chla_gam_k5 <- gam(
Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + s(DOY, k = 5) + s(StationCode, bs = "re"),
data = df_chla_c,
method = "REML"
)
summary(m_chla_gam_k5)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + s(DOY,
## k = 5) + s(StationCode, bs = "re")
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.16569 0.15508 59.104 < 2e-16
## Year_fct2014 -0.09652 0.05891 -1.638 0.101389
## Year_fct2015 -0.19918 0.06162 -3.232 0.001235
## Year_fct2016 0.09722 0.07577 1.283 0.199468
## Year_fct2017 0.12653 0.05796 2.183 0.029080
## Year_fct2018 -0.26419 0.05752 -4.593 4.47e-06
## Year_fct2019 -0.07601 0.05772 -1.317 0.187909
## FlowActionPeriodDuring 0.26731 0.06387 4.185 2.89e-05
## FlowActionPeriodAfter 0.45477 0.07852 5.792 7.37e-09
## RegionDownstream -1.60530 0.24119 -6.656 3.10e-11
## Year_fct2014:FlowActionPeriodDuring -0.16172 0.07945 -2.035 0.041853
## Year_fct2015:FlowActionPeriodDuring -0.10778 0.07055 -1.528 0.126659
## Year_fct2016:FlowActionPeriodDuring -0.47464 0.09052 -5.244 1.64e-07
## Year_fct2017:FlowActionPeriodDuring -0.35483 0.07605 -4.666 3.15e-06
## Year_fct2018:FlowActionPeriodDuring -0.24774 0.06857 -3.613 0.000305
## Year_fct2019:FlowActionPeriodDuring -0.58847 0.06991 -8.418 < 2e-16
## Year_fct2014:FlowActionPeriodAfter -0.46592 0.06982 -6.673 2.76e-11
## Year_fct2015:FlowActionPeriodAfter -0.18215 0.06972 -2.613 0.009011
## Year_fct2016:FlowActionPeriodAfter -0.48396 0.09595 -5.044 4.71e-07
## Year_fct2017:FlowActionPeriodAfter -0.48820 0.06890 -7.086 1.56e-12
## Year_fct2018:FlowActionPeriodAfter -0.49350 0.06693 -7.374 1.91e-13
## Year_fct2019:FlowActionPeriodAfter -0.21378 0.06679 -3.201 0.001378
## Year_fct2014:RegionDownstream 0.55570 0.05954 9.333 < 2e-16
## Year_fct2015:RegionDownstream 0.34960 0.05534 6.317 2.88e-10
## Year_fct2016:RegionDownstream 0.69038 0.05761 11.984 < 2e-16
## Year_fct2017:RegionDownstream 0.08567 0.05863 1.461 0.144057
## Year_fct2018:RegionDownstream -0.36302 0.05564 -6.524 7.47e-11
## Year_fct2019:RegionDownstream -0.44290 0.05617 -7.886 3.76e-15
## FlowActionPeriodDuring:RegionDownstream 0.07273 0.03853 1.888 0.059118
## FlowActionPeriodAfter:RegionDownstream -0.28499 0.03325 -8.570 < 2e-16
##
## (Intercept) ***
## Year_fct2014
## Year_fct2015 **
## Year_fct2016
## Year_fct2017 *
## Year_fct2018 ***
## Year_fct2019
## FlowActionPeriodDuring ***
## FlowActionPeriodAfter ***
## RegionDownstream ***
## Year_fct2014:FlowActionPeriodDuring *
## Year_fct2015:FlowActionPeriodDuring
## Year_fct2016:FlowActionPeriodDuring ***
## Year_fct2017:FlowActionPeriodDuring ***
## Year_fct2018:FlowActionPeriodDuring ***
## Year_fct2019:FlowActionPeriodDuring ***
## Year_fct2014:FlowActionPeriodAfter ***
## Year_fct2015:FlowActionPeriodAfter **
## Year_fct2016:FlowActionPeriodAfter ***
## Year_fct2017:FlowActionPeriodAfter ***
## Year_fct2018:FlowActionPeriodAfter ***
## Year_fct2019:FlowActionPeriodAfter **
## Year_fct2014:RegionDownstream ***
## Year_fct2015:RegionDownstream ***
## Year_fct2016:RegionDownstream ***
## Year_fct2017:RegionDownstream
## Year_fct2018:RegionDownstream ***
## Year_fct2019:RegionDownstream ***
## FlowActionPeriodDuring:RegionDownstream .
## FlowActionPeriodAfter:RegionDownstream ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 3.837 3.987 25.45 <2e-16 ***
## s(StationCode) 5.977 6.000 265.90 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.768 Deviance explained = 77%
## -REML = 4015.4 Scale est. = 0.24921 n = 5429
k.check(m_chla_gam_k5)
## k' edf k-index p-value
## s(DOY) 4 3.837093 1.001014 0.4875
## s(StationCode) 8 5.976554 NA NA
draw(m_chla_gam_k5, select = 1, residuals = TRUE, rug = FALSE)
plot(m_chla_gam_k5, pages = 1, all.terms = TRUE)
Changing the k-value to 5 seems to help reduce the “wiggliness” of the smooth term for DOY. Now, let’s add a lag1 term to the model to see if that helps with the residual autocorrelation.
df_rtm_chla_lag1 <- df_chla_c %>%
# Fill in missing days for each StationCode-Year combination
group_by(StationCode, Year) %>%
complete(Date = seq.Date(min(Date), max(Date), by = "1 day")) %>%
# Create lag1 of scaled log transformed chlorophyll values
mutate(lag1 = lag(Chla_log)) %>%
ungroup() %>%
drop_na(Chla_log, lag1)
m_chla_gam_lag1 <- gam(
Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + s(DOY, k = 5) + s(StationCode, bs = "re") + lag1,
data = df_rtm_chla_lag1,
method = "REML"
)
summary(m_chla_gam_lag1)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + s(DOY,
## k = 5) + s(StationCode, bs = "re") + lag1
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.448959 0.043550 10.309 < 2e-16
## Year_fct2014 -0.019120 0.019418 -0.985 0.3248
## Year_fct2015 -0.019504 0.020421 -0.955 0.3396
## Year_fct2016 0.006455 0.021270 0.303 0.7615
## Year_fct2017 -0.004660 0.019203 -0.243 0.8083
## Year_fct2018 -0.016989 0.019077 -0.891 0.3732
## Year_fct2019 -0.009242 0.019110 -0.484 0.6287
## FlowActionPeriodDuring 0.003125 0.019970 0.156 0.8757
## FlowActionPeriodAfter 0.025055 0.023617 1.061 0.2888
## RegionDownstream -0.080121 0.019930 -4.020 5.9e-05
## lag1 0.951626 0.004278 222.464 < 2e-16
## Year_fct2014:FlowActionPeriodDuring 0.015793 0.025975 0.608 0.5432
## Year_fct2015:FlowActionPeriodDuring -0.002537 0.023172 -0.109 0.9128
## Year_fct2016:FlowActionPeriodDuring -0.015012 0.024484 -0.613 0.5398
## Year_fct2017:FlowActionPeriodDuring 0.005332 0.024869 0.214 0.8303
## Year_fct2018:FlowActionPeriodDuring -0.008844 0.022499 -0.393 0.6943
## Year_fct2019:FlowActionPeriodDuring -0.026590 0.023033 -1.154 0.2484
## Year_fct2014:FlowActionPeriodAfter -0.022765 0.022939 -0.992 0.3210
## Year_fct2015:FlowActionPeriodAfter -0.001135 0.022885 -0.050 0.9604
## Year_fct2016:FlowActionPeriodAfter -0.047141 0.023045 -2.046 0.0408
## Year_fct2017:FlowActionPeriodAfter -0.018813 0.022632 -0.831 0.4059
## Year_fct2018:FlowActionPeriodAfter -0.032383 0.022059 -1.468 0.1422
## Year_fct2019:FlowActionPeriodAfter -0.007907 0.021886 -0.361 0.7179
## Year_fct2014:RegionDownstream 0.035290 0.019514 1.808 0.0706
## Year_fct2015:RegionDownstream 0.022617 0.018053 1.253 0.2103
## Year_fct2016:RegionDownstream 0.034360 0.018768 1.831 0.0672
## Year_fct2017:RegionDownstream 0.002749 0.019039 0.144 0.8852
## Year_fct2018:RegionDownstream -0.014273 0.018032 -0.792 0.4287
## Year_fct2019:RegionDownstream -0.014404 0.018246 -0.789 0.4299
## FlowActionPeriodDuring:RegionDownstream 0.001805 0.012497 0.144 0.8852
## FlowActionPeriodAfter:RegionDownstream -0.019658 0.010840 -1.813 0.0698
##
## (Intercept) ***
## Year_fct2014
## Year_fct2015
## Year_fct2016
## Year_fct2017
## Year_fct2018
## Year_fct2019
## FlowActionPeriodDuring
## FlowActionPeriodAfter
## RegionDownstream ***
## lag1 ***
## Year_fct2014:FlowActionPeriodDuring
## Year_fct2015:FlowActionPeriodDuring
## Year_fct2016:FlowActionPeriodDuring
## Year_fct2017:FlowActionPeriodDuring
## Year_fct2018:FlowActionPeriodDuring
## Year_fct2019:FlowActionPeriodDuring
## Year_fct2014:FlowActionPeriodAfter
## Year_fct2015:FlowActionPeriodAfter
## Year_fct2016:FlowActionPeriodAfter *
## Year_fct2017:FlowActionPeriodAfter
## Year_fct2018:FlowActionPeriodAfter
## Year_fct2019:FlowActionPeriodAfter
## Year_fct2014:RegionDownstream .
## Year_fct2015:RegionDownstream
## Year_fct2016:RegionDownstream .
## Year_fct2017:RegionDownstream
## Year_fct2018:RegionDownstream
## Year_fct2019:RegionDownstream
## FlowActionPeriodDuring:RegionDownstream
## FlowActionPeriodAfter:RegionDownstream .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 1.056 1.11 0.175 0.771408
## s(StationCode) 4.806 6.00 3.525 0.000199 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.976 Deviance explained = 97.6%
## -REML = -2075 Scale est. = 0.025937 n = 5357
appraise(m_chla_gam_lag1)
k.check(m_chla_gam_lag1)
## k' edf k-index p-value
## s(DOY) 4 1.055954 0.9714273 0.01
## s(StationCode) 8 4.805585 NA NA
draw(m_chla_gam_lag1, select = 1, residuals = TRUE, rug = FALSE)
plot(m_chla_gam_lag1, pages = 1, all.terms = TRUE)
acf(residuals(m_chla_gam_lag1), na.action = na.pass)
Box.test(residuals(m_chla_gam_lag1), lag = 20, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: residuals(m_chla_gam_lag1)
## X-squared = 67.682, df = 20, p-value = 4.334e-07
Well, adding a lag1 term to the model helped with the residual autocorrelation, but it basically turned the smooth term of day of year into a straight line. This isn’t what we are looking for. After a brief internet search, I found a blog post that suggests using an AR(p) model to account for the correlated residuals. We can give that a try. We’ll run AR(1), AR(3), and AR(4) models and compare them using AIC. It wasn’t possible running an AR(2) model.
# Define model formula as an object
f_chla_gam_k5 <- as.formula("Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + s(DOY, k = 5)")
# Fit original model with k = 5 and three successive AR(p) models
m_chla_gamm_k5 <- gamm(
f_chla_gam_k5,
data = df_chla_c,
random = list(StationCode = ~ 1),
method = "REML"
)
m_chla_gamm_k5_ar1 <- gamm(
f_chla_gam_k5,
data = df_chla_c,
random = list(StationCode = ~ 1),
correlation = corARMA(form = ~ 1|Year_fct, p = 1), # grouped by Year_fct
method = "REML"
)
m_chla_gamm_k5_ar3 <- gamm(
f_chla_gam_k5,
data = df_chla_c,
random = list(StationCode = ~ 1),
correlation = corARMA(form = ~ 1|Year_fct, p = 3),
method = "REML"
)
m_chla_gamm_k5_ar4 <- gamm(
f_chla_gam_k5,
data = df_chla_c,
random = list(StationCode = ~ 1),
correlation = corARMA(form = ~ 1|Year_fct, p = 4),
method = "REML"
)
# Compare models
anova(
m_chla_gamm_k5$lme,
m_chla_gamm_k5_ar1$lme,
m_chla_gamm_k5_ar3$lme,
m_chla_gamm_k5_ar4$lme
)
## Model df AIC BIC logLik Test L.Ratio
## m_chla_gamm_k5$lme 1 34 8098.715 8322.904 -4015.358
## m_chla_gamm_k5_ar1$lme 2 35 -3912.471 -3681.689 1991.236 1 vs 2 12013.186
## m_chla_gamm_k5_ar3$lme 3 37 -3925.242 -3681.272 1999.621 2 vs 3 16.771
## m_chla_gamm_k5_ar4$lme 4 38 -3925.695 -3675.131 2000.847 3 vs 4 2.453
## p-value
## m_chla_gamm_k5$lme
## m_chla_gamm_k5_ar1$lme <.0001
## m_chla_gamm_k5_ar3$lme 0.0002
## m_chla_gamm_k5_ar4$lme 0.1173
It looks like the AR(3) model has the best fit according to the AIC values and backed up by the BIC values. Let’s take a closer look at that one.
# Pull out the residuals and the GAM model
resid_ar3 <- residuals(m_chla_gamm_k5_ar3$lme, type = "normalized")
m_chla_gamm_k5_ar3_gam <- m_chla_gamm_k5_ar3$gam
summary(m_chla_gamm_k5_ar3_gam)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + s(DOY,
## k = 5)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.371564 0.227256 41.238 < 2e-16
## Year_fct2014 -0.278551 0.212444 -1.311 0.1899
## Year_fct2015 -0.227842 0.210662 -1.082 0.2795
## Year_fct2016 -0.215292 0.220986 -0.974 0.3300
## Year_fct2017 -0.129832 0.212117 -0.612 0.5405
## Year_fct2018 -0.443270 0.207354 -2.138 0.0326
## Year_fct2019 -0.309485 0.207641 -1.490 0.1362
## FlowActionPeriodDuring 0.038106 0.068551 0.556 0.5783
## FlowActionPeriodAfter 0.109433 0.095184 1.150 0.2503
## RegionDownstream -1.709765 0.359173 -4.760 1.98e-06
## Year_fct2014:FlowActionPeriodDuring -0.033895 0.090417 -0.375 0.7078
## Year_fct2015:FlowActionPeriodDuring 0.009413 0.090442 0.104 0.9171
## Year_fct2016:FlowActionPeriodDuring -0.066812 0.088260 -0.757 0.4491
## Year_fct2017:FlowActionPeriodDuring 0.035884 0.090321 0.397 0.6912
## Year_fct2018:FlowActionPeriodDuring -0.137338 0.087611 -1.568 0.1170
## Year_fct2019:FlowActionPeriodDuring 0.051955 0.087890 0.591 0.5545
## Year_fct2014:FlowActionPeriodAfter -0.136834 0.124546 -1.099 0.2720
## Year_fct2015:FlowActionPeriodAfter -0.034415 0.123137 -0.279 0.7799
## Year_fct2016:FlowActionPeriodAfter -0.111610 0.122416 -0.912 0.3620
## Year_fct2017:FlowActionPeriodAfter -0.133229 0.124635 -1.069 0.2851
## Year_fct2018:FlowActionPeriodAfter -0.235718 0.120857 -1.950 0.0512
## Year_fct2019:FlowActionPeriodAfter 0.055188 0.121057 0.456 0.6485
## Year_fct2014:RegionDownstream 0.504068 0.338861 1.488 0.1369
## Year_fct2015:RegionDownstream 0.358468 0.318325 1.126 0.2602
## Year_fct2016:RegionDownstream 0.594875 0.325356 1.828 0.0675
## Year_fct2017:RegionDownstream 0.136364 0.335529 0.406 0.6845
## Year_fct2018:RegionDownstream -0.327494 0.319201 -1.026 0.3049
## Year_fct2019:RegionDownstream -0.489606 0.321029 -1.525 0.1273
## FlowActionPeriodDuring:RegionDownstream -0.021554 0.048308 -0.446 0.6555
## FlowActionPeriodAfter:RegionDownstream -0.006239 0.065697 -0.095 0.9243
##
## (Intercept) ***
## Year_fct2014
## Year_fct2015
## Year_fct2016
## Year_fct2017
## Year_fct2018 *
## Year_fct2019
## FlowActionPeriodDuring
## FlowActionPeriodAfter
## RegionDownstream ***
## Year_fct2014:FlowActionPeriodDuring
## Year_fct2015:FlowActionPeriodDuring
## Year_fct2016:FlowActionPeriodDuring
## Year_fct2017:FlowActionPeriodDuring
## Year_fct2018:FlowActionPeriodDuring
## Year_fct2019:FlowActionPeriodDuring
## Year_fct2014:FlowActionPeriodAfter
## Year_fct2015:FlowActionPeriodAfter
## Year_fct2016:FlowActionPeriodAfter
## Year_fct2017:FlowActionPeriodAfter
## Year_fct2018:FlowActionPeriodAfter .
## Year_fct2019:FlowActionPeriodAfter
## Year_fct2014:RegionDownstream
## Year_fct2015:RegionDownstream
## Year_fct2016:RegionDownstream .
## Year_fct2017:RegionDownstream
## Year_fct2018:RegionDownstream
## Year_fct2019:RegionDownstream
## FlowActionPeriodDuring:RegionDownstream
## FlowActionPeriodAfter:RegionDownstream
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 2.354 2.354 10.28 2.63e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.677
## Scale est. = 0.2963 n = 5429
appraise(m_chla_gamm_k5_ar3_gam)
k.check(m_chla_gamm_k5_ar3_gam)
## k' edf k-index p-value
## s(DOY) 4 2.354098 0.9737098 0.05
draw(m_chla_gamm_k5_ar3_gam, select = 1, residuals = TRUE, rug = FALSE)
plot(m_chla_gamm_k5_ar3_gam, pages = 1, all.terms = TRUE)
acf(resid_ar3)
Box.test(resid_ar3, lag = 20, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: resid_ar3
## X-squared = 63.145, df = 20, p-value = 2.297e-06
The AR(3) model has much less residual autocorrelation, and the diagnostics plots look pretty good. What does the ANOVA table look like?
# the anova.gam function is similar to a type 3 ANOVA
anova(m_chla_gamm_k5_ar3_gam)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + s(DOY,
## k = 5)
##
## Parametric Terms:
## df F p-value
## Year_fct 6 0.960 0.45048
## FlowActionPeriod 2 0.719 0.48718
## Region 1 22.660 1.98e-06
## Year_fct:FlowActionPeriod 12 1.173 0.29639
## Year_fct:Region 6 3.739 0.00103
## FlowActionPeriod:Region 2 0.144 0.86558
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 2.354 2.354 10.28 2.63e-05
Only Region and the Year:Region interaction was significant among the parametric terms of the model. Let’s re-run the AR(3) GAM model dropping the two interactions that weren’t significant.
m_chla_gamm_k5_ar3b <- gamm(
Chla_log ~ Year_fct * Region + FlowActionPeriod + s(DOY, k = 5),
data = df_chla_c,
random = list(StationCode = ~ 1),
correlation = corARMA(form = ~ 1|Year_fct, p = 3),
method = "REML"
)
# Pull out the residuals and the GAM model
resid_ar3b <- residuals(m_chla_gamm_k5_ar3b$lme, type = "normalized")
m_chla_gamm_k5_ar3b_gam <- m_chla_gamm_k5_ar3b$gam
summary(m_chla_gamm_k5_ar3b_gam)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ Year_fct * Region + FlowActionPeriod + s(DOY, k = 5)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.418025 0.221364 42.545 < 2e-16 ***
## Year_fct2014 -0.344070 0.202454 -1.699 0.08928 .
## Year_fct2015 -0.238766 0.198401 -1.203 0.22885
## Year_fct2016 -0.278026 0.209110 -1.330 0.18372
## Year_fct2017 -0.185742 0.201799 -0.920 0.35739
## Year_fct2018 -0.566709 0.197017 -2.876 0.00404 **
## Year_fct2019 -0.279312 0.196733 -1.420 0.15574
## RegionDownstream -1.719302 0.357881 -4.804 1.6e-06 ***
## FlowActionPeriodDuring 0.008016 0.023756 0.337 0.73580
## FlowActionPeriodAfter 0.020049 0.033624 0.596 0.55103
## Year_fct2014:RegionDownstream 0.507612 0.340264 1.492 0.13581
## Year_fct2015:RegionDownstream 0.361458 0.319723 1.131 0.25830
## Year_fct2016:RegionDownstream 0.599692 0.326685 1.836 0.06646 .
## Year_fct2017:RegionDownstream 0.142378 0.336946 0.423 0.67264
## Year_fct2018:RegionDownstream -0.332696 0.320541 -1.038 0.29935
## Year_fct2019:RegionDownstream -0.484732 0.322401 -1.504 0.13277
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 2.315 2.315 10.46 2.37e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.675
## Scale est. = 0.29796 n = 5429
appraise(m_chla_gamm_k5_ar3b_gam)
k.check(m_chla_gamm_k5_ar3b_gam)
## k' edf k-index p-value
## s(DOY) 4 2.315079 0.9919255 0.31
draw(m_chla_gamm_k5_ar3b_gam, select = 1, residuals = TRUE, rug = FALSE)
plot(m_chla_gamm_k5_ar3b_gam, pages = 1, all.terms = TRUE)
acf(resid_ar3b)
Box.test(resid_ar3b, lag = 20, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: resid_ar3b
## X-squared = 63.547, df = 20, p-value = 1.984e-06
anova(m_chla_gamm_k5_ar3b_gam)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ Year_fct * Region + FlowActionPeriod + s(DOY, k = 5)
##
## Parametric Terms:
## df F p-value
## Year_fct 6 1.603 0.14187
## Region 1 23.080 1.6e-06
## FlowActionPeriod 2 0.187 0.82984
## Year_fct:Region 6 3.736 0.00104
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 2.315 2.315 10.46 2.37e-05
Only the main effect of Region and the Year:Region interaction is significant in the model. Let’s take a closer look at the pairwise contrasts.
# Contrasts in Region main effect
emmeans(
m_chla_gamm_k5_ar3b,
data = df_chla_c,
specs = pairwise ~ Region
)
## $emmeans
## Region emmean SE df lower.CL upper.CL
## Upstream 9.11 0.169 5411 8.78 9.44
## Downstream 7.50 0.220 5411 7.07 7.93
##
## Results are averaged over the levels of: Year_fct, FlowActionPeriod
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## Upstream - Downstream 1.61 0.273 5411 5.882 <.0001
##
## Results are averaged over the levels of: Year_fct, FlowActionPeriod
# Contrasts in Region for each Year
emmeans(
m_chla_gamm_k5_ar3b,
data = df_chla_c,
specs = pairwise ~ Region | Year_fct
)
## $emmeans
## Year_fct = 2013:
## Region emmean SE df lower.CL upper.CL
## Upstream 9.38 0.222 5411 8.94 9.81
## Downstream 7.66 0.284 5411 7.10 8.22
##
## Year_fct = 2014:
## Region emmean SE df lower.CL upper.CL
## Upstream 9.03 0.209 5411 8.62 9.44
## Downstream 7.82 0.293 5411 7.25 8.40
##
## Year_fct = 2015:
## Region emmean SE df lower.CL upper.CL
## Upstream 9.14 0.206 5411 8.74 9.54
## Downstream 7.78 0.267 5411 7.26 8.30
##
## Year_fct = 2016:
## Region emmean SE df lower.CL upper.CL
## Upstream 9.10 0.213 5411 8.68 9.52
## Downstream 7.98 0.272 5411 7.45 8.51
##
## Year_fct = 2017:
## Region emmean SE df lower.CL upper.CL
## Upstream 9.19 0.207 5411 8.79 9.60
## Downstream 7.61 0.290 5411 7.05 8.18
##
## Year_fct = 2018:
## Region emmean SE df lower.CL upper.CL
## Upstream 8.81 0.203 5411 8.41 9.21
## Downstream 6.76 0.262 5411 6.24 7.27
##
## Year_fct = 2019:
## Region emmean SE df lower.CL upper.CL
## Upstream 9.10 0.203 5411 8.70 9.50
## Downstream 6.89 0.265 5411 6.37 7.41
##
## Results are averaged over the levels of: FlowActionPeriod
## Confidence level used: 0.95
##
## $contrasts
## Year_fct = 2013:
## contrast estimate SE df t.ratio p.value
## Upstream - Downstream 1.72 0.358 5411 4.804 <.0001
##
## Year_fct = 2014:
## contrast estimate SE df t.ratio p.value
## Upstream - Downstream 1.21 0.358 5411 3.389 0.0007
##
## Year_fct = 2015:
## contrast estimate SE df t.ratio p.value
## Upstream - Downstream 1.36 0.333 5411 4.082 <.0001
##
## Year_fct = 2016:
## contrast estimate SE df t.ratio p.value
## Upstream - Downstream 1.12 0.336 5411 3.335 0.0009
##
## Year_fct = 2017:
## contrast estimate SE df t.ratio p.value
## Upstream - Downstream 1.58 0.354 5411 4.459 <.0001
##
## Year_fct = 2018:
## contrast estimate SE df t.ratio p.value
## Upstream - Downstream 2.05 0.328 5411 6.247 <.0001
##
## Year_fct = 2019:
## contrast estimate SE df t.ratio p.value
## Upstream - Downstream 2.20 0.330 5411 6.675 <.0001
##
## Results are averaged over the levels of: FlowActionPeriod
# Contrasts in Year for each Region
cld(
emmeans(
m_chla_gamm_k5_ar3b,
data = df_chla_c,
specs = pairwise ~ Year_fct | Region
)$emmeans,
sort = FALSE,
Letters = letters
)
## Region = Upstream:
## Year_fct emmean SE df lower.CL upper.CL .group
## 2013 9.38 0.222 5411 8.94 9.81 a
## 2014 9.03 0.209 5411 8.62 9.44 a
## 2015 9.14 0.206 5411 8.74 9.54 a
## 2016 9.10 0.213 5411 8.68 9.52 a
## 2017 9.19 0.207 5411 8.79 9.60 a
## 2018 8.81 0.203 5411 8.41 9.21 a
## 2019 9.10 0.203 5411 8.70 9.50 a
##
## Region = Downstream:
## Year_fct emmean SE df lower.CL upper.CL .group
## 2013 7.66 0.284 5411 7.10 8.22 a
## 2014 7.82 0.293 5411 7.25 8.40 a
## 2015 7.78 0.267 5411 7.26 8.30 a
## 2016 7.98 0.272 5411 7.45 8.51 a
## 2017 7.61 0.290 5411 7.05 8.18 ab
## 2018 6.76 0.262 5411 6.24 7.27 c
## 2019 6.89 0.265 5411 6.37 7.41 bc
##
## Results are averaged over the levels of: FlowActionPeriod
## Confidence level used: 0.95
## P value adjustment: tukey method for comparing a family of 7 estimates
## significance level used: alpha = 0.05
## NOTE: If two or more means share the same grouping symbol,
## then we cannot show them to be different.
## But we also did not show them to be the same.
Upstream was always higher than downstream in all years. None of the years differed significantly in the upstream region. In the downstream region, 2018 was lower than 2013-2017, and 2019 was lower than 2013-2016.
Let’s compare the results of the GAM model to a linear mixed effects model like we used in the FASTR report. We’ll start by adding a lag1 term to account for residual autocorrelation.
m_chla_lmer_lag1 <-
lmer(
Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + lag1 + (1|StationCode),
data = df_rtm_chla_lag1
)
# Pull out residuals and look at autocorrelation
resid_lmer_lag1 <- residuals(m_chla_lmer_lag1)
acf(resid_lmer_lag1)
Box.test(resid_lmer_lag1, lag = 20, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: resid_lmer_lag1
## X-squared = 67.564, df = 20, p-value = 4.527e-07
There is still some residual autocorrelation so lets add a second lag term, lag2.
df_rtm_chla_lag2 <- df_chla_c %>%
# Fill in missing days for each StationCode-Year combination
group_by(StationCode, Year) %>%
complete(Date = seq.Date(min(Date), max(Date), by = "1 day")) %>%
# Create lag1 of scaled log transformed chlorophyll values
mutate(
lag1 = lag(Chla_log),
lag2 = lag(Chla_log, 2)
) %>%
ungroup() %>%
drop_na(Chla_log, lag1, lag2)
m_chla_lmer_lag2 <-
lmer(
Chla_log ~ (Year_fct + FlowActionPeriod + Region)^2 + lag1 + lag2 + (1|StationCode),
data = df_rtm_chla_lag2
)
# Pull out residuals and look at autocorrelation
resid_lmer_lag2 <- residuals(m_chla_lmer_lag2)
acf(resid_lmer_lag2)
Box.test(resid_lmer_lag2, lag = 20, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: resid_lmer_lag2
## X-squared = 58.203, df = 20, p-value = 1.346e-05
There is still some residual autocorrelation, but less than with the lag1 model. Let’s go with this one for now. Let’s look at its diagnostic plots.
df_rtm_chla_lag2 %>% plot_lm_diag(Chla_log, m_chla_lmer_lag2)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
The residuals look a little funky. Let’s take a look at the ANOVA table
Anova(m_chla_lmer_lag2, type = 3, contrasts=list(topic=contr.sum, sys=contr.sum))
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: Chla_log
## Chisq Df Pr(>Chisq)
## (Intercept) 120.0880 1 < 2.2e-16 ***
## Year_fct 6.5637 6 0.3631
## FlowActionPeriod 1.9362 2 0.3798
## Region 17.8706 1 2.364e-05 ***
## lag1 5387.5896 1 < 2.2e-16 ***
## lag2 23.1479 1 1.500e-06 ***
## Year_fct:FlowActionPeriod 15.8369 12 0.1988
## Year_fct:Region 18.7005 6 0.0047 **
## FlowActionPeriod:Region 5.3766 2 0.0680 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
As with the GAM model, only the Region main effect and the Year:Region interaction terms are significant. Let’s re-run the model dropping the two interactions that weren’t significant.
m_chla_lmer_lag2b <-
lmer(
Chla_log ~ Year_fct * Region + FlowActionPeriod + lag1 + lag2 + (1|StationCode),
data = df_rtm_chla_lag2
)
# Pull out residuals and look at autocorrelation and diagnostic plots
resid_lmer_lag2b <- residuals(m_chla_lmer_lag2b)
acf(resid_lmer_lag2b)
Box.test(resid_lmer_lag2b, lag = 20, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: resid_lmer_lag2b
## X-squared = 58.957, df = 20, p-value = 1.032e-05
df_rtm_chla_lag2 %>% plot_lm_diag(Chla_log, m_chla_lmer_lag2b)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
The residuals still look a little strange. Let’s look at the ANOVA table.
Anova(m_chla_lmer_lag2b, type = 3, contrasts=list(topic=contr.sum, sys=contr.sum))
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: Chla_log
## Chisq Df Pr(>Chisq)
## (Intercept) 118.0011 1 < 2.2e-16 ***
## Year_fct 10.7569 6 0.09619 .
## Region 22.9914 1 1.627e-06 ***
## FlowActionPeriod 1.6245 2 0.44386
## lag1 5445.3569 1 < 2.2e-16 ***
## lag2 22.5668 1 2.030e-06 ***
## Year_fct:Region 15.9784 6 0.01387 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Again, only the main effect of Region and the Year:Region interaction is significant in the model. Let’s take a closer look at the pairwise contrasts.
# Contrasts in Region main effect
emmeans(m_chla_lmer_lag2b, specs = pairwise ~ Region, adjust = "sidak")
## $emmeans
## Region emmean SE df asymp.LCL asymp.UCL
## Upstream 8.578 0.006888 Inf 8.564 8.591
## Downstream 8.501 0.009378 Inf 8.483 8.520
##
## Results are averaged over the levels of: Year_fct, FlowActionPeriod
## Degrees-of-freedom method: asymptotic
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df z.ratio p.value
## Upstream - Downstream 0.0763 0.0125 Inf 6.106 <.0001
##
## Results are averaged over the levels of: Year_fct, FlowActionPeriod
## Degrees-of-freedom method: asymptotic
# Contrasts in Region for each Year
emmeans(m_chla_lmer_lag2b, specs = pairwise ~ Region | Year_fct, adjust = "sidak")
## $emmeans
## Year_fct = 2013:
## Region emmean SE df asymp.LCL asymp.UCL
## Upstream 8.597 0.011271 Inf 8.574 8.619
## Downstream 8.509 0.013320 Inf 8.483 8.535
##
## Year_fct = 2014:
## Region emmean SE df asymp.LCL asymp.UCL
## Upstream 8.567 0.009797 Inf 8.548 8.586
## Downstream 8.516 0.014116 Inf 8.488 8.543
##
## Year_fct = 2015:
## Region emmean SE df asymp.LCL asymp.UCL
## Upstream 8.575 0.009419 Inf 8.557 8.594
## Downstream 8.512 0.012431 Inf 8.488 8.537
##
## Year_fct = 2016:
## Region emmean SE df asymp.LCL asymp.UCL
## Upstream 8.581 0.010120 Inf 8.561 8.601
## Downstream 8.530 0.011969 Inf 8.506 8.553
##
## Year_fct = 2017:
## Region emmean SE df asymp.LCL asymp.UCL
## Upstream 8.583 0.009805 Inf 8.564 8.602
## Downstream 8.501 0.014036 Inf 8.473 8.528
##
## Year_fct = 2018:
## Region emmean SE df asymp.LCL asymp.UCL
## Upstream 8.563 0.008986 Inf 8.546 8.581
## Downstream 8.464 0.013714 Inf 8.437 8.491
##
## Year_fct = 2019:
## Region emmean SE df asymp.LCL asymp.UCL
## Upstream 8.577 0.009150 Inf 8.559 8.595
## Downstream 8.477 0.013622 Inf 8.451 8.504
##
## Results are averaged over the levels of: FlowActionPeriod
## Degrees-of-freedom method: asymptotic
## Confidence level used: 0.95
##
## $contrasts
## Year_fct = 2013:
## contrast estimate SE df z.ratio p.value
## Upstream - Downstream 0.0873 0.0182 Inf 4.795 <.0001
##
## Year_fct = 2014:
## contrast estimate SE df z.ratio p.value
## Upstream - Downstream 0.0512 0.0175 Inf 2.930 0.0034
##
## Year_fct = 2015:
## contrast estimate SE df z.ratio p.value
## Upstream - Downstream 0.0626 0.0161 Inf 3.891 0.0001
##
## Year_fct = 2016:
## contrast estimate SE df z.ratio p.value
## Upstream - Downstream 0.0515 0.0158 Inf 3.253 0.0011
##
## Year_fct = 2017:
## contrast estimate SE df z.ratio p.value
## Upstream - Downstream 0.0823 0.0178 Inf 4.630 <.0001
##
## Year_fct = 2018:
## contrast estimate SE df z.ratio p.value
## Upstream - Downstream 0.0994 0.0170 Inf 5.849 <.0001
##
## Year_fct = 2019:
## contrast estimate SE df z.ratio p.value
## Upstream - Downstream 0.0999 0.0173 Inf 5.777 <.0001
##
## Results are averaged over the levels of: FlowActionPeriod
## Degrees-of-freedom method: asymptotic
# Contrasts in Year for each Region
em_lmer_yr_reg <- emmeans(m_chla_lmer_lag2b, specs = pairwise ~ Year_fct | Region, adjust = "sidak")
cld(em_lmer_yr_reg$emmeans, sort = FALSE, Letters = letters)
## Region = Upstream:
## Year_fct emmean SE df asymp.LCL asymp.UCL .group
## 2013 8.597 0.011271 Inf 8.574 8.619 a
## 2014 8.567 0.009797 Inf 8.548 8.586 a
## 2015 8.575 0.009419 Inf 8.557 8.594 a
## 2016 8.581 0.010120 Inf 8.561 8.601 a
## 2017 8.583 0.009805 Inf 8.564 8.602 a
## 2018 8.563 0.008986 Inf 8.546 8.581 a
## 2019 8.577 0.009150 Inf 8.559 8.595 a
##
## Region = Downstream:
## Year_fct emmean SE df asymp.LCL asymp.UCL .group
## 2013 8.509 0.013320 Inf 8.483 8.535 ab
## 2014 8.516 0.014116 Inf 8.488 8.543 ab
## 2015 8.512 0.012431 Inf 8.488 8.537 ab
## 2016 8.530 0.011969 Inf 8.506 8.553 a
## 2017 8.501 0.014036 Inf 8.473 8.528 abc
## 2018 8.464 0.013714 Inf 8.437 8.491 c
## 2019 8.477 0.013622 Inf 8.451 8.504 bc
##
## Results are averaged over the levels of: FlowActionPeriod
## Degrees-of-freedom method: asymptotic
## Confidence level used: 0.95
## P value adjustment: tukey method for comparing a family of 7 estimates
## significance level used: alpha = 0.05
## NOTE: If two or more means share the same grouping symbol,
## then we cannot show them to be different.
## But we also did not show them to be the same.
These results are very similar to the GAM model results. Like with the GAM model, upstream was always higher than downstream in all years, and none of the years differed significantly in the upstream region. In the downstream region, 2018 was lower than 2013-2016, and 2019 was lower than 2016.
It’s encouraging that the GAM and LMER models had similar results. Unfortunately, neither found a significant effect of flow pulse period. Looking at the station level plots, it does appear that there is a chlorophyll response to flow pulse period at some of the stations. Let’s re-run the GAM model at the station level, including all stations at first. We’ll include all two-way interactions between Year, Flow Action Period, and Station and a smooth term for day of year to account for seasonality. We won’t account for serial autocorrelation in our initial GAM model.
m_chla_gam_st <- gam(
Chla_log ~ (Year_fct + FlowActionPeriod + StationCode)^2 + s(DOY),
data = df_chla_c,
method = "REML"
)
Lets look at the model summary and diagnostics:
summary(m_chla_gam_st)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ (Year_fct + FlowActionPeriod + StationCode)^2 + s(DOY)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.9877125 0.0850465 105.680 < 2e-16
## Year_fct2014 -0.0001447 0.0936376 -0.002 0.998767
## Year_fct2015 0.0742784 0.0688351 1.079 0.280602
## Year_fct2016 -0.0560153 0.0977430 -0.573 0.566610
## Year_fct2017 0.4126917 0.0858876 4.805 1.59e-06
## Year_fct2018 0.1086446 0.0851215 1.276 0.201889
## Year_fct2019 -0.0116491 0.0853000 -0.137 0.891379
## FlowActionPeriodDuring 0.0217904 0.0635643 0.343 0.731756
## FlowActionPeriodAfter 0.8455015 0.0706154 11.973 < 2e-16
## StationCodeRD22 0.7161405 0.0889423 8.052 9.98e-16
## StationCodeI80 0.6259439 0.0678434 9.226 < 2e-16
## StationCodeLIS -0.0803806 0.0854926 -0.940 0.347155
## StationCodeSTTD -0.3913993 0.0903159 -4.334 1.49e-05
## StationCodeLIB -1.4277519 0.0853133 -16.735 < 2e-16
## StationCodeRYI -1.1252069 0.0864671 -13.013 < 2e-16
## StationCodeRVB -1.5884529 0.0843502 -18.832 < 2e-16
## Year_fct2014:FlowActionPeriodDuring -0.0945012 0.0582321 -1.623 0.104683
## Year_fct2015:FlowActionPeriodDuring -0.0823757 0.0517134 -1.593 0.111236
## Year_fct2016:FlowActionPeriodDuring -0.2810510 0.0735623 -3.821 0.000135
## Year_fct2017:FlowActionPeriodDuring -0.1481757 0.0563787 -2.628 0.008608
## Year_fct2018:FlowActionPeriodDuring -0.0699917 0.0514125 -1.361 0.173452
## Year_fct2019:FlowActionPeriodDuring -0.4655619 0.0523856 -8.887 < 2e-16
## Year_fct2014:FlowActionPeriodAfter -0.4250913 0.0515033 -8.254 < 2e-16
## Year_fct2015:FlowActionPeriodAfter -0.2401225 0.0514701 -4.665 3.16e-06
## Year_fct2016:FlowActionPeriodAfter -0.4289829 0.0766805 -5.594 2.32e-08
## Year_fct2017:FlowActionPeriodAfter -0.5399129 0.0512193 -10.541 < 2e-16
## Year_fct2018:FlowActionPeriodAfter -0.4701620 0.0500465 -9.394 < 2e-16
## Year_fct2019:FlowActionPeriodAfter -0.2172227 0.0499426 -4.349 1.39e-05
## Year_fct2014:StationCodeRD22 -0.5721979 0.1024033 -5.588 2.42e-08
## Year_fct2015:StationCodeRD22 -0.1463480 0.0760411 -1.925 0.054334
## Year_fct2016:StationCodeRD22 -0.4098169 0.0960924 -4.265 2.04e-05
## Year_fct2017:StationCodeRD22 -0.6506343 0.0931247 -6.987 3.16e-12
## Year_fct2018:StationCodeRD22 -0.4396712 0.0917940 -4.790 1.71e-06
## Year_fct2019:StationCodeRD22 -0.2299593 0.0922080 -2.494 0.012664
## Year_fct2014:StationCodeI80 -0.1317280 0.0851794 -1.546 0.122048
## Year_fct2015:StationCodeI80 0.0000000 0.0000000 NaN NaN
## Year_fct2016:StationCodeI80 0.3771334 0.0774083 4.872 1.14e-06
## Year_fct2017:StationCodeI80 -0.1019820 0.0735836 -1.386 0.165825
## Year_fct2018:StationCodeI80 -0.3396135 0.0719854 -4.718 2.44e-06
## Year_fct2019:StationCodeI80 0.6074234 0.0724831 8.380 < 2e-16
## Year_fct2014:StationCodeLIS -0.0057236 0.1006072 -0.057 0.954635
## Year_fct2015:StationCodeLIS -0.4913146 0.0725485 -6.772 1.40e-11
## Year_fct2016:StationCodeLIS 0.2549530 0.0929650 2.742 0.006118
## Year_fct2017:StationCodeLIS -0.1532797 0.0918969 -1.668 0.095384
## Year_fct2018:StationCodeLIS -0.4451383 0.0898162 -4.956 7.41e-07
## Year_fct2019:StationCodeLIS 0.2726706 0.0902147 3.022 0.002519
## Year_fct2014:StationCodeSTTD 0.3299993 0.1035781 3.186 0.001451
## Year_fct2015:StationCodeSTTD -0.4049359 0.0758340 -5.340 9.69e-08
## Year_fct2016:StationCodeSTTD 0.2817640 0.0961734 2.930 0.003407
## Year_fct2017:StationCodeSTTD -0.3916241 0.1025270 -3.820 0.000135
## Year_fct2018:StationCodeSTTD -0.9186986 0.0943557 -9.737 < 2e-16
## Year_fct2019:StationCodeSTTD -1.1166322 0.0929262 -12.016 < 2e-16
## Year_fct2014:StationCodeLIB 0.4468476 0.0997561 4.479 7.64e-06
## Year_fct2015:StationCodeLIB -0.0060370 0.0717562 -0.084 0.932954
## Year_fct2016:StationCodeLIB 1.1537717 0.0921366 12.522 < 2e-16
## Year_fct2017:StationCodeLIB -0.1403588 0.0902471 -1.555 0.119941
## Year_fct2018:StationCodeLIB -1.9005010 0.0910690 -20.869 < 2e-16
## Year_fct2019:StationCodeLIB -0.5416512 0.0926395 -5.847 5.31e-09
## Year_fct2014:StationCodeRYI 0.0000000 0.0000000 NaN NaN
## Year_fct2015:StationCodeRYI 0.0000000 0.0000000 NaN NaN
## Year_fct2016:StationCodeRYI 0.5078780 0.0879377 5.775 8.11e-09
## Year_fct2017:StationCodeRYI 0.0000000 0.0000000 NaN NaN
## Year_fct2018:StationCodeRYI -0.7915348 0.0851690 -9.294 < 2e-16
## Year_fct2019:StationCodeRYI -0.7408140 0.0854320 -8.671 < 2e-16
## Year_fct2014:StationCodeRVB 0.3789463 0.0996205 3.804 0.000144
## Year_fct2015:StationCodeRVB 0.1404472 0.0711731 1.973 0.048511
## Year_fct2016:StationCodeRVB 0.4250235 0.0928583 4.577 4.82e-06
## Year_fct2017:StationCodeRVB -0.3152080 0.0899553 -3.504 0.000462
## Year_fct2018:StationCodeRVB -0.1156868 0.0886763 -1.305 0.192087
## Year_fct2019:StationCodeRVB -0.4404207 0.0890737 -4.944 7.87e-07
## FlowActionPeriodDuring:StationCodeRD22 -0.4018276 0.0544828 -7.375 1.89e-13
## FlowActionPeriodAfter:StationCodeRD22 -0.6314416 0.0488985 -12.913 < 2e-16
## FlowActionPeriodDuring:StationCodeI80 -0.3683415 0.0551013 -6.685 2.55e-11
## FlowActionPeriodAfter:StationCodeI80 -0.4262107 0.0493037 -8.645 < 2e-16
## FlowActionPeriodDuring:StationCodeLIS 0.4990072 0.0537691 9.281 < 2e-16
## FlowActionPeriodAfter:StationCodeLIS -0.5418892 0.0476717 -11.367 < 2e-16
## FlowActionPeriodDuring:StationCodeSTTD 1.0966314 0.0570739 19.214 < 2e-16
## FlowActionPeriodAfter:StationCodeSTTD -0.3876528 0.0530591 -7.306 3.16e-13
## FlowActionPeriodDuring:StationCodeLIB 0.4007993 0.0551686 7.265 4.27e-13
## FlowActionPeriodAfter:StationCodeLIB -0.7048803 0.0493113 -14.295 < 2e-16
## FlowActionPeriodDuring:StationCodeRYI 0.2746507 0.0669034 4.105 4.10e-05
## FlowActionPeriodAfter:StationCodeRYI -0.5372528 0.0583450 -9.208 < 2e-16
## FlowActionPeriodDuring:StationCodeRVB 0.1548582 0.0532327 2.909 0.003640
## FlowActionPeriodAfter:StationCodeRVB -0.6628368 0.0477859 -13.871 < 2e-16
##
## (Intercept) ***
## Year_fct2014
## Year_fct2015
## Year_fct2016
## Year_fct2017 ***
## Year_fct2018
## Year_fct2019
## FlowActionPeriodDuring
## FlowActionPeriodAfter ***
## StationCodeRD22 ***
## StationCodeI80 ***
## StationCodeLIS
## StationCodeSTTD ***
## StationCodeLIB ***
## StationCodeRYI ***
## StationCodeRVB ***
## Year_fct2014:FlowActionPeriodDuring
## Year_fct2015:FlowActionPeriodDuring
## Year_fct2016:FlowActionPeriodDuring ***
## Year_fct2017:FlowActionPeriodDuring **
## Year_fct2018:FlowActionPeriodDuring
## Year_fct2019:FlowActionPeriodDuring ***
## Year_fct2014:FlowActionPeriodAfter ***
## Year_fct2015:FlowActionPeriodAfter ***
## Year_fct2016:FlowActionPeriodAfter ***
## Year_fct2017:FlowActionPeriodAfter ***
## Year_fct2018:FlowActionPeriodAfter ***
## Year_fct2019:FlowActionPeriodAfter ***
## Year_fct2014:StationCodeRD22 ***
## Year_fct2015:StationCodeRD22 .
## Year_fct2016:StationCodeRD22 ***
## Year_fct2017:StationCodeRD22 ***
## Year_fct2018:StationCodeRD22 ***
## Year_fct2019:StationCodeRD22 *
## Year_fct2014:StationCodeI80
## Year_fct2015:StationCodeI80
## Year_fct2016:StationCodeI80 ***
## Year_fct2017:StationCodeI80
## Year_fct2018:StationCodeI80 ***
## Year_fct2019:StationCodeI80 ***
## Year_fct2014:StationCodeLIS
## Year_fct2015:StationCodeLIS ***
## Year_fct2016:StationCodeLIS **
## Year_fct2017:StationCodeLIS .
## Year_fct2018:StationCodeLIS ***
## Year_fct2019:StationCodeLIS **
## Year_fct2014:StationCodeSTTD **
## Year_fct2015:StationCodeSTTD ***
## Year_fct2016:StationCodeSTTD **
## Year_fct2017:StationCodeSTTD ***
## Year_fct2018:StationCodeSTTD ***
## Year_fct2019:StationCodeSTTD ***
## Year_fct2014:StationCodeLIB ***
## Year_fct2015:StationCodeLIB
## Year_fct2016:StationCodeLIB ***
## Year_fct2017:StationCodeLIB
## Year_fct2018:StationCodeLIB ***
## Year_fct2019:StationCodeLIB ***
## Year_fct2014:StationCodeRYI
## Year_fct2015:StationCodeRYI
## Year_fct2016:StationCodeRYI ***
## Year_fct2017:StationCodeRYI
## Year_fct2018:StationCodeRYI ***
## Year_fct2019:StationCodeRYI ***
## Year_fct2014:StationCodeRVB ***
## Year_fct2015:StationCodeRVB *
## Year_fct2016:StationCodeRVB ***
## Year_fct2017:StationCodeRVB ***
## Year_fct2018:StationCodeRVB
## Year_fct2019:StationCodeRVB ***
## FlowActionPeriodDuring:StationCodeRD22 ***
## FlowActionPeriodAfter:StationCodeRD22 ***
## FlowActionPeriodDuring:StationCodeI80 ***
## FlowActionPeriodAfter:StationCodeI80 ***
## FlowActionPeriodDuring:StationCodeLIS ***
## FlowActionPeriodAfter:StationCodeLIS ***
## FlowActionPeriodDuring:StationCodeSTTD ***
## FlowActionPeriodAfter:StationCodeSTTD ***
## FlowActionPeriodDuring:StationCodeLIB ***
## FlowActionPeriodAfter:StationCodeLIB ***
## FlowActionPeriodDuring:StationCodeRYI ***
## FlowActionPeriodAfter:StationCodeRYI ***
## FlowActionPeriodDuring:StationCodeRVB **
## FlowActionPeriodAfter:StationCodeRVB ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 8.005 8.725 19.31 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Rank: 89/93
## R-sq.(adj) = 0.882 Deviance explained = 88.3%
## -REML = 2280.5 Scale est. = 0.1274 n = 5429
appraise(m_chla_gam_st)
k.check(m_chla_gam_st)
## k' edf k-index p-value
## s(DOY) 9 8.005388 1.012803 0.795
draw(m_chla_gam_st, select = 1, residuals = TRUE, rug = FALSE)
plot(m_chla_gam_st, pages = 1, all.terms = TRUE)
acf(residuals(m_chla_gam_st))
Box.test(residuals(m_chla_gam_st), lag = 20, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: residuals(m_chla_gam_st)
## X-squared = 13752, df = 20, p-value < 2.2e-16
Not surprisingly, the model definitely has residual autocorrelation as indicated by the ACF plot and the Box-Ljung test. The smooth term for day of year may also be overfitted as it was with the initial model using Regions. Again, let’s try a smaller k-value for the smooth first, then lets try to address the residual autocorrelation.
m_chla_gam_st_k5 <- gam(
Chla_log ~ (Year_fct + FlowActionPeriod + StationCode)^2 + s(DOY, k = 5),
data = df_chla_c,
method = "REML"
)
summary(m_chla_gam_st_k5)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ (Year_fct + FlowActionPeriod + StationCode)^2 + s(DOY,
## k = 5)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.68631 0.08372 103.760 < 2e-16 ***
## Year_fct2014 0.29960 0.06777 4.421 1.00e-05 ***
## Year_fct2015 0.39326 0.09804 4.011 6.13e-05 ***
## Year_fct2016 0.35872 0.10005 3.585 0.00034 ***
## Year_fct2017 0.71846 0.09040 7.947 2.31e-15 ***
## Year_fct2018 0.41537 0.09021 4.605 4.23e-06 ***
## Year_fct2019 0.29622 0.09031 3.280 0.00104 **
## FlowActionPeriodDuring -0.02294 0.06306 -0.364 0.71601
## FlowActionPeriodAfter 0.82808 0.06895 12.010 < 2e-16 ***
## StationCodeRD22 1.04110 0.08871 11.736 < 2e-16 ***
## StationCodeI80 0.95049 0.08937 10.636 < 2e-16 ***
## StationCodeLIS 0.23561 0.08493 2.774 0.00556 **
## StationCodeSTTD -0.06234 0.06350 -0.982 0.32627
## StationCodeLIB -1.11634 0.08459 -13.198 < 2e-16 ***
## StationCodeRYI -1.86375 0.05677 -32.829 < 2e-16 ***
## StationCodeRVB -1.27406 0.08377 -15.209 < 2e-16 ***
## Year_fct2014:FlowActionPeriodDuring -0.08386 0.05800 -1.446 0.14829
## Year_fct2015:FlowActionPeriodDuring -0.07698 0.05184 -1.485 0.13762
## Year_fct2016:FlowActionPeriodDuring -0.36767 0.06662 -5.519 3.56e-08 ***
## Year_fct2017:FlowActionPeriodDuring -0.14640 0.05623 -2.603 0.00926 **
## Year_fct2018:FlowActionPeriodDuring -0.06467 0.05144 -1.257 0.20877
## Year_fct2019:FlowActionPeriodDuring -0.46214 0.05234 -8.830 < 2e-16 ***
## Year_fct2014:FlowActionPeriodAfter -0.40263 0.05127 -7.853 4.89e-15 ***
## Year_fct2015:FlowActionPeriodAfter -0.23648 0.05160 -4.583 4.69e-06 ***
## Year_fct2016:FlowActionPeriodAfter -0.53143 0.07046 -7.542 5.40e-14 ***
## Year_fct2017:FlowActionPeriodAfter -0.52391 0.05092 -10.288 < 2e-16 ***
## Year_fct2018:FlowActionPeriodAfter -0.45572 0.05005 -9.105 < 2e-16 ***
## Year_fct2019:FlowActionPeriodAfter -0.20050 0.04982 -4.025 5.78e-05 ***
## Year_fct2014:StationCodeRD22 -0.89755 0.07853 -11.429 < 2e-16 ***
## Year_fct2015:StationCodeRD22 -0.47440 0.10258 -4.625 3.84e-06 ***
## Year_fct2016:StationCodeRD22 -0.73577 0.10295 -7.147 1.01e-12 ***
## Year_fct2017:StationCodeRD22 -0.97635 0.09865 -9.897 < 2e-16 ***
## Year_fct2018:StationCodeRD22 -0.76534 0.09765 -7.838 5.50e-15 ***
## Year_fct2019:StationCodeRD22 -0.55564 0.09796 -5.672 1.48e-08 ***
## Year_fct2014:StationCodeI80 -0.45600 0.07923 -5.755 9.14e-09 ***
## Year_fct2015:StationCodeI80 -0.33258 0.10387 -3.202 0.00137 **
## Year_fct2016:StationCodeI80 0.05111 0.10332 0.495 0.62086
## Year_fct2017:StationCodeI80 -0.42766 0.09906 -4.317 1.61e-05 ***
## Year_fct2018:StationCodeI80 -0.66522 0.09804 -6.785 1.29e-11 ***
## Year_fct2019:StationCodeI80 0.28179 0.09836 2.865 0.00419 **
## Year_fct2014:StationCodeLIS -0.31988 0.07586 -4.217 2.52e-05 ***
## Year_fct2015:StationCodeLIS -0.81211 0.09999 -8.122 5.66e-16 ***
## Year_fct2016:StationCodeLIS -0.05797 0.09968 -0.582 0.56084
## Year_fct2017:StationCodeLIS -0.46782 0.09719 -4.814 1.52e-06 ***
## Year_fct2018:StationCodeLIS -0.76019 0.09557 -7.955 2.18e-15 ***
## Year_fct2019:StationCodeLIS -0.04232 0.09586 -0.441 0.65889
## Year_fct2014:StationCodeSTTD 0.00000 0.00000 NaN NaN
## Year_fct2015:StationCodeSTTD -0.73703 0.08478 -8.694 < 2e-16 ***
## Year_fct2016:StationCodeSTTD -0.04384 0.08505 -0.515 0.60628
## Year_fct2017:StationCodeSTTD -0.72202 0.08693 -8.306 < 2e-16 ***
## Year_fct2018:StationCodeSTTD -1.24753 0.08063 -15.472 < 2e-16 ***
## Year_fct2019:StationCodeSTTD -1.44129 0.07963 -18.100 < 2e-16 ***
## Year_fct2014:StationCodeLIB 0.13571 0.07468 1.817 0.06923 .
## Year_fct2015:StationCodeLIB -0.32376 0.09939 -3.257 0.00113 **
## Year_fct2016:StationCodeLIB 0.84399 0.09889 8.534 < 2e-16 ***
## Year_fct2017:StationCodeLIB -0.45206 0.09566 -4.726 2.35e-06 ***
## Year_fct2018:StationCodeLIB -2.21044 0.09695 -22.799 < 2e-16 ***
## Year_fct2019:StationCodeLIB -0.85874 0.09830 -8.736 < 2e-16 ***
## Year_fct2014:StationCodeRYI 0.00000 0.00000 NaN NaN
## Year_fct2015:StationCodeRYI 0.73294 0.08565 8.557 < 2e-16 ***
## Year_fct2016:StationCodeRYI 1.25387 0.07070 17.736 < 2e-16 ***
## Year_fct2017:StationCodeRYI 0.00000 0.00000 NaN NaN
## Year_fct2018:StationCodeRYI -0.05072 0.06525 -0.777 0.43703
## Year_fct2019:StationCodeRYI 0.00000 0.00000 NaN NaN
## Year_fct2014:StationCodeRVB 0.06587 0.07451 0.884 0.37671
## Year_fct2015:StationCodeRVB -0.17976 0.09897 -1.816 0.06938 .
## Year_fct2016:StationCodeRVB 0.10632 0.09959 1.068 0.28576
## Year_fct2017:StationCodeRVB -0.62942 0.09541 -6.597 4.61e-11 ***
## Year_fct2018:StationCodeRVB -0.43013 0.09448 -4.552 5.42e-06 ***
## Year_fct2019:StationCodeRVB -0.75479 0.09477 -7.965 2.01e-15 ***
## FlowActionPeriodDuring:StationCodeRD22 -0.40175 0.05465 -7.351 2.26e-13 ***
## FlowActionPeriodAfter:StationCodeRD22 -0.62960 0.04905 -12.836 < 2e-16 ***
## FlowActionPeriodDuring:StationCodeI80 -0.36811 0.05527 -6.660 3.01e-11 ***
## FlowActionPeriodAfter:StationCodeI80 -0.42354 0.04945 -8.565 < 2e-16 ***
## FlowActionPeriodDuring:StationCodeLIS 0.50068 0.05393 9.284 < 2e-16 ***
## FlowActionPeriodAfter:StationCodeLIS -0.54546 0.04781 -11.410 < 2e-16 ***
## FlowActionPeriodDuring:StationCodeSTTD 1.09454 0.05722 19.128 < 2e-16 ***
## FlowActionPeriodAfter:StationCodeSTTD -0.39325 0.05318 -7.395 1.63e-13 ***
## FlowActionPeriodDuring:StationCodeLIB 0.40378 0.05533 7.297 3.36e-13 ***
## FlowActionPeriodAfter:StationCodeLIB -0.70552 0.04945 -14.267 < 2e-16 ***
## FlowActionPeriodDuring:StationCodeRYI 0.27237 0.06707 4.061 4.96e-05 ***
## FlowActionPeriodAfter:StationCodeRYI -0.54180 0.05846 -9.268 < 2e-16 ***
## FlowActionPeriodDuring:StationCodeRVB 0.15774 0.05339 2.954 0.00315 **
## FlowActionPeriodAfter:StationCodeRVB -0.66458 0.04793 -13.867 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 3.876 3.992 34.17 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Rank: 84/88
## R-sq.(adj) = 0.881 Deviance explained = 88.3%
## -REML = 2290.6 Scale est. = 0.12822 n = 5429
k.check(m_chla_gam_st_k5)
## k' edf k-index p-value
## s(DOY) 4 3.876466 0.9817392 0.1075
draw(m_chla_gam_st_k5, select = 1, residuals = TRUE, rug = FALSE)
plot(m_chla_gam_st_k5, pages = 1, all.terms = TRUE)
Changing the k-value to 5 seems to help reduce the “wiggliness” of
the smooth term for DOY. Now, let’s use an AR(p) model to account for
the correlated residuals. We’ll run AR(1), AR(2), AR(3), and AR(4)
models and compare them using AIC. The nlme
model engine
that underlies the gamm()
function produces an error when
there are empty station-year-action period combinations in the data set,
so for now we’ll remove the two stations (RCS and RYI) that are missing
data for a few years.
# Define model formula as an object
f_chla_gam_st_k5 <- as.formula("Chla_log ~ (Year_fct + FlowActionPeriod + StationCode)^2 + s(DOY, k = 5)")
# Remove RCS and RYI from the data set
df_chla_c2 <- df_chla_c %>% filter(!StationCode %in% c("RCS", "RYI"))
# Fit original model with k = 5 and four successive AR(p) models
m_chla_gamm_st_k5 <- gamm(
f_chla_gam_st_k5,
data = df_chla_c2,
method = "REML"
)
m_chla_gamm_st_k5_ar1 <- gamm(
f_chla_gam_st_k5,
data = df_chla_c2,
correlation = corARMA(form = ~ 1|Year_fct, p = 1), # grouped by Year_fct
method = "REML"
)
m_chla_gamm_st_k5_ar2 <- gamm(
f_chla_gam_st_k5,
data = df_chla_c2,
correlation = corARMA(form = ~ 1|Year_fct, p = 2),
method = "REML"
)
m_chla_gamm_st_k5_ar3 <- gamm(
f_chla_gam_st_k5,
data = df_chla_c2,
correlation = corARMA(form = ~ 1|Year_fct, p = 3),
method = "REML"
)
m_chla_gamm_st_k5_ar4 <- gamm(
f_chla_gam_st_k5,
data = df_chla_c2,
correlation = corARMA(form = ~ 1|Year_fct, p = 4),
method = "REML"
)
# Compare models
anova(
m_chla_gamm_st_k5$lme,
m_chla_gamm_st_k5_ar1$lme,
m_chla_gamm_st_k5_ar2$lme,
m_chla_gamm_st_k5_ar3$lme,
m_chla_gamm_st_k5_ar4$lme
)
## Model df AIC BIC logLik Test
## m_chla_gamm_st_k5$lme 1 69 3848.889 4289.535 -1855.445
## m_chla_gamm_st_k5_ar1$lme 2 70 -2970.642 -2523.610 1555.321 1 vs 2
## m_chla_gamm_st_k5_ar2$lme 3 71 -2984.225 -2530.807 1563.113 2 vs 3
## m_chla_gamm_st_k5_ar3$lme 4 72 -2986.140 -2526.336 1565.070 3 vs 4
## m_chla_gamm_st_k5_ar4$lme 5 73 -2990.612 -2524.421 1568.306 4 vs 5
## L.Ratio p-value
## m_chla_gamm_st_k5$lme
## m_chla_gamm_st_k5_ar1$lme 6821.532 <.0001
## m_chla_gamm_st_k5_ar2$lme 15.583 0.0001
## m_chla_gamm_st_k5_ar3$lme 3.915 0.0479
## m_chla_gamm_st_k5_ar4$lme 6.471 0.0110
It looks like the AR(4) model has the best fit according to the AIC values. However, the AR(2) model is best according to the BIC values. Since AIC tends to pick more complex models than BIC, let’s choose the AR(2) model and take a closer look at that one.
# Pull out the residuals and the GAM model
resid_st_ar2 <- residuals(m_chla_gamm_st_k5_ar2$lme, type = "normalized")
m_chla_gamm_st_k5_ar2_gam <- m_chla_gamm_st_k5_ar2$gam
summary(m_chla_gamm_st_k5_ar2_gam)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ (Year_fct + FlowActionPeriod + StationCode)^2 + s(DOY,
## k = 5)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.543052 0.181283 52.642 < 2e-16
## Year_fct2014 0.023001 0.238494 0.096 0.923173
## Year_fct2015 -0.143971 0.235386 -0.612 0.540809
## Year_fct2016 -0.878058 0.254099 -3.456 0.000554
## Year_fct2017 -0.301980 0.238890 -1.264 0.206264
## Year_fct2018 -0.022575 0.235242 -0.096 0.923552
## Year_fct2019 0.289060 0.237351 1.218 0.223344
## FlowActionPeriodDuring 0.007088 0.083545 0.085 0.932391
## FlowActionPeriodAfter 0.214046 0.107352 1.994 0.046227
## StationCodeI80 0.346569 0.269398 1.286 0.198352
## StationCodeLIS -0.515793 0.170289 -3.029 0.002469
## StationCodeSTTD -0.410490 0.229645 -1.787 0.073926
## StationCodeLIB -2.127402 0.217168 -9.796 < 2e-16
## StationCodeRVB -2.032855 0.169858 -11.968 < 2e-16
## Year_fct2014:FlowActionPeriodDuring -0.088494 0.085508 -1.035 0.300765
## Year_fct2015:FlowActionPeriodDuring -0.033589 0.085323 -0.394 0.693841
## Year_fct2016:FlowActionPeriodDuring -0.119400 0.091186 -1.309 0.190465
## Year_fct2017:FlowActionPeriodDuring -0.020808 0.085956 -0.242 0.808729
## Year_fct2018:FlowActionPeriodDuring -0.198609 0.085245 -2.330 0.019858
## Year_fct2019:FlowActionPeriodDuring 0.074944 0.085849 0.873 0.382723
## Year_fct2014:FlowActionPeriodAfter -0.262518 0.099492 -2.639 0.008355
## Year_fct2015:FlowActionPeriodAfter -0.098981 0.099156 -0.998 0.318219
## Year_fct2016:FlowActionPeriodAfter -0.250797 0.118315 -2.120 0.034084
## Year_fct2017:FlowActionPeriodAfter -0.283734 0.101472 -2.796 0.005193
## Year_fct2018:FlowActionPeriodAfter -0.321220 0.099345 -3.233 0.001232
## Year_fct2019:FlowActionPeriodAfter 0.095366 0.099778 0.956 0.339235
## Year_fct2014:StationCodeI80 -0.578816 0.346381 -1.671 0.094785
## Year_fct2015:StationCodeI80 -0.001156 0.346479 -0.003 0.997338
## Year_fct2016:StationCodeI80 0.746313 0.366019 2.039 0.041510
## Year_fct2017:StationCodeI80 0.352251 0.346105 1.018 0.308849
## Year_fct2018:StationCodeI80 -0.820787 0.342609 -2.396 0.016631
## Year_fct2019:StationCodeI80 0.030704 0.342736 0.090 0.928621
## Year_fct2014:StationCodeLIS -0.228140 0.216105 -1.056 0.291168
## Year_fct2015:StationCodeLIS -0.479993 0.212414 -2.260 0.023888
## Year_fct2016:StationCodeLIS 0.892600 0.221366 4.032 5.62e-05
## Year_fct2017:StationCodeLIS 0.179177 0.216879 0.826 0.408758
## Year_fct2018:StationCodeLIS -0.050645 0.216494 -0.234 0.815048
## Year_fct2019:StationCodeLIS -0.391006 0.215955 -1.811 0.070272
## Year_fct2014:StationCodeSTTD -0.217508 0.303145 -0.718 0.473102
## Year_fct2015:StationCodeSTTD -0.469506 0.297992 -1.576 0.115199
## Year_fct2016:StationCodeSTTD 1.051458 0.309716 3.395 0.000693
## Year_fct2017:StationCodeSTTD -0.326809 0.311206 -1.050 0.293713
## Year_fct2018:StationCodeSTTD -1.541801 0.303092 -5.087 3.79e-07
## Year_fct2019:StationCodeSTTD -2.057137 0.299568 -6.867 7.48e-12
## Year_fct2014:StationCodeLIB 0.342943 0.281372 1.219 0.222976
## Year_fct2015:StationCodeLIB 0.555753 0.277262 2.004 0.045085
## Year_fct2016:StationCodeLIB 1.999924 0.288864 6.923 5.05e-12
## Year_fct2017:StationCodeLIB 0.370186 0.281643 1.314 0.188788
## Year_fct2018:StationCodeLIB -1.353660 0.282145 -4.798 1.66e-06
## Year_fct2019:StationCodeLIB -1.259669 0.282116 -4.465 8.21e-06
## Year_fct2014:StationCodeRVB 0.317219 0.218054 1.455 0.145804
## Year_fct2015:StationCodeRVB 0.603775 0.213992 2.821 0.004802
## Year_fct2016:StationCodeRVB 1.264268 0.222676 5.678 1.45e-08
## Year_fct2017:StationCodeRVB 0.561237 0.218445 2.569 0.010225
## Year_fct2018:StationCodeRVB 0.175853 0.216284 0.813 0.416223
## Year_fct2019:StationCodeRVB -1.335158 0.215864 -6.185 6.77e-10
## FlowActionPeriodDuring:StationCodeI80 0.023265 0.086874 0.268 0.788869
## FlowActionPeriodAfter:StationCodeI80 0.012523 0.116052 0.108 0.914076
## FlowActionPeriodDuring:StationCodeLIS 0.013683 0.087462 0.156 0.875693
## FlowActionPeriodAfter:StationCodeLIS 0.002908 0.119192 0.024 0.980535
## FlowActionPeriodDuring:StationCodeSTTD 0.175203 0.086711 2.021 0.043387
## FlowActionPeriodAfter:StationCodeSTTD -0.282126 0.117698 -2.397 0.016570
## FlowActionPeriodDuring:StationCodeLIB 0.018342 0.085850 0.214 0.830827
## FlowActionPeriodAfter:StationCodeLIB -0.053094 0.113765 -0.467 0.640737
## FlowActionPeriodDuring:StationCodeRVB 0.051731 0.087228 0.593 0.553178
## FlowActionPeriodAfter:StationCodeRVB -0.014969 0.118819 -0.126 0.899755
##
## (Intercept) ***
## Year_fct2014
## Year_fct2015
## Year_fct2016 ***
## Year_fct2017
## Year_fct2018
## Year_fct2019
## FlowActionPeriodDuring
## FlowActionPeriodAfter *
## StationCodeI80
## StationCodeLIS **
## StationCodeSTTD .
## StationCodeLIB ***
## StationCodeRVB ***
## Year_fct2014:FlowActionPeriodDuring
## Year_fct2015:FlowActionPeriodDuring
## Year_fct2016:FlowActionPeriodDuring
## Year_fct2017:FlowActionPeriodDuring
## Year_fct2018:FlowActionPeriodDuring *
## Year_fct2019:FlowActionPeriodDuring
## Year_fct2014:FlowActionPeriodAfter **
## Year_fct2015:FlowActionPeriodAfter
## Year_fct2016:FlowActionPeriodAfter *
## Year_fct2017:FlowActionPeriodAfter **
## Year_fct2018:FlowActionPeriodAfter **
## Year_fct2019:FlowActionPeriodAfter
## Year_fct2014:StationCodeI80 .
## Year_fct2015:StationCodeI80
## Year_fct2016:StationCodeI80 *
## Year_fct2017:StationCodeI80
## Year_fct2018:StationCodeI80 *
## Year_fct2019:StationCodeI80
## Year_fct2014:StationCodeLIS
## Year_fct2015:StationCodeLIS *
## Year_fct2016:StationCodeLIS ***
## Year_fct2017:StationCodeLIS
## Year_fct2018:StationCodeLIS
## Year_fct2019:StationCodeLIS .
## Year_fct2014:StationCodeSTTD
## Year_fct2015:StationCodeSTTD
## Year_fct2016:StationCodeSTTD ***
## Year_fct2017:StationCodeSTTD
## Year_fct2018:StationCodeSTTD ***
## Year_fct2019:StationCodeSTTD ***
## Year_fct2014:StationCodeLIB
## Year_fct2015:StationCodeLIB *
## Year_fct2016:StationCodeLIB ***
## Year_fct2017:StationCodeLIB
## Year_fct2018:StationCodeLIB ***
## Year_fct2019:StationCodeLIB ***
## Year_fct2014:StationCodeRVB
## Year_fct2015:StationCodeRVB **
## Year_fct2016:StationCodeRVB ***
## Year_fct2017:StationCodeRVB *
## Year_fct2018:StationCodeRVB
## Year_fct2019:StationCodeRVB ***
## FlowActionPeriodDuring:StationCodeI80
## FlowActionPeriodAfter:StationCodeI80
## FlowActionPeriodDuring:StationCodeLIS
## FlowActionPeriodAfter:StationCodeLIS
## FlowActionPeriodDuring:StationCodeSTTD *
## FlowActionPeriodAfter:StationCodeSTTD *
## FlowActionPeriodDuring:StationCodeLIB
## FlowActionPeriodAfter:StationCodeLIB
## FlowActionPeriodDuring:StationCodeRVB
## FlowActionPeriodAfter:StationCodeRVB
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 2.633 2.633 19.05 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.77
## Scale est. = 0.28473 n = 4453
appraise(m_chla_gamm_st_k5_ar2_gam)
k.check(m_chla_gamm_st_k5_ar2_gam)
## k' edf k-index p-value
## s(DOY) 4 2.63297 0.8606074 0
draw(m_chla_gamm_st_k5_ar2_gam, select = 1, residuals = TRUE, rug = FALSE)
plot(m_chla_gamm_st_k5_ar2_gam, pages = 1, all.terms = TRUE)
acf(resid_st_ar2)
Box.test(resid_st_ar2, lag = 20, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: resid_st_ar2
## X-squared = 64.149, df = 20, p-value = 1.594e-06
The AR(2) model has much less residual autocorrelation, and the
diagnostics plots look pretty good. For some reason the p-value for the
k.check
is zero which may indicate that k = 5
is too low. I may have to look into this further.
What does the ANOVA table look like for the AR(2) model?
# the anova.gam function is similar to a type 3 ANOVA
anova(m_chla_gamm_st_k5_ar2_gam)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Chla_log ~ (Year_fct + FlowActionPeriod + StationCode)^2 + s(DOY,
## k = 5)
##
## Parametric Terms:
## df F p-value
## Year_fct 6 4.464 0.000164
## FlowActionPeriod 2 3.211 0.040408
## StationCode 5 87.468 < 2e-16
## Year_fct:FlowActionPeriod 12 3.113 0.000204
## Year_fct:StationCode 30 21.176 < 2e-16
## FlowActionPeriod:StationCode 10 4.392 3.67e-06
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DOY) 2.633 2.633 19.05 <2e-16
All the interaction terms are significant in the AR(2) model. Let’s take a closer look at the pairwise contrasts.
# Contrasts in FlowActionPeriod for each Year
emmeans(
m_chla_gamm_st_k5_ar2,
data = df_chla_c2,
specs = pairwise ~ FlowActionPeriod | Year_fct
)
## $emmeans
## Year_fct = 2013:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 8.70 0.132 4384 8.44 8.96
## During 8.75 0.128 4384 8.50 9.00
## After 8.86 0.132 4384 8.60 9.12
##
## Year_fct = 2014:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 8.66 0.128 4384 8.41 8.91
## During 8.63 0.131 4384 8.37 8.88
## After 8.56 0.132 4384 8.30 8.82
##
## Year_fct = 2015:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 8.59 0.126 4384 8.34 8.84
## During 8.61 0.122 4384 8.37 8.85
## After 8.65 0.125 4384 8.40 8.89
##
## Year_fct = 2016:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 8.81 0.150 4384 8.52 9.11
## During 8.75 0.140 4384 8.47 9.02
## After 8.72 0.133 4384 8.46 8.98
##
## Year_fct = 2017:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 8.59 0.130 4384 8.33 8.84
## During 8.62 0.133 4384 8.36 8.88
## After 8.46 0.134 4384 8.20 8.72
##
## Year_fct = 2018:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 8.08 0.126 4384 7.83 8.32
## During 7.93 0.126 4384 7.69 8.18
## After 7.92 0.128 4384 7.66 8.17
##
## Year_fct = 2019:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 8.15 0.129 4384 7.90 8.41
## During 8.28 0.127 4384 8.03 8.53
## After 8.41 0.127 4384 8.16 8.66
##
## Results are averaged over the levels of: StationCode
## Confidence level used: 0.95
##
## $contrasts
## Year_fct = 2013:
## contrast estimate SE df t.ratio p.value
## Before - During -0.0541 0.0620 4384 -0.873 0.6575
## Before - After -0.1583 0.0763 4384 -2.074 0.0955
## During - After -0.1041 0.0620 4384 -1.678 0.2136
##
## Year_fct = 2014:
## contrast estimate SE df t.ratio p.value
## Before - During 0.0344 0.0622 4384 0.552 0.8454
## Before - After 0.1043 0.0759 4384 1.374 0.3546
## During - After 0.0699 0.0622 4384 1.123 0.4998
##
## Year_fct = 2015:
## contrast estimate SE df t.ratio p.value
## Before - During -0.0205 0.0623 4384 -0.330 0.9418
## Before - After -0.0593 0.0762 4384 -0.778 0.7167
## During - After -0.0387 0.0619 4384 -0.626 0.8059
##
## Year_fct = 2016:
## contrast estimate SE df t.ratio p.value
## Before - During 0.0653 0.0672 4384 0.971 0.5952
## Before - After 0.0925 0.0906 4384 1.022 0.5632
## During - After 0.0273 0.0666 4384 0.409 0.9117
##
## Year_fct = 2017:
## contrast estimate SE df t.ratio p.value
## Before - During -0.0333 0.0622 4384 -0.536 0.8538
## Before - After 0.1255 0.0771 4384 1.627 0.2345
## During - After 0.1588 0.0632 4384 2.513 0.0322
##
## Year_fct = 2018:
## contrast estimate SE df t.ratio p.value
## Before - During 0.1445 0.0618 4384 2.337 0.0509
## Before - After 0.1630 0.0757 4384 2.154 0.0794
## During - After 0.0185 0.0620 4384 0.298 0.9522
##
## Year_fct = 2019:
## contrast estimate SE df t.ratio p.value
## Before - During -0.1291 0.0629 4384 -2.053 0.0999
## Before - After -0.2536 0.0757 4384 -3.349 0.0024
## During - After -0.1245 0.0616 4384 -2.023 0.1067
##
## Results are averaged over the levels of: StationCode
## P value adjustment: tukey method for comparing a family of 3 estimates
# Contrasts in FlowActionPeriod for each Station
emmeans(
m_chla_gamm_st_k5_ar2,
data = df_chla_c2,
specs = pairwise ~ FlowActionPeriod | StationCode
)
## $emmeans
## StationCode = RD22:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 9.34 0.0861 4384 9.17 9.51
## During 9.29 0.0840 4384 9.13 9.46
## After 9.40 0.0840 4384 9.23 9.56
##
## StationCode = I80:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 9.65 0.1082 4384 9.44 9.86
## During 9.62 0.1006 4384 9.43 9.82
## After 9.72 0.0982 4384 9.52 9.91
##
## StationCode = LIS:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 8.81 0.0859 4384 8.65 8.98
## During 8.78 0.0848 4384 8.61 8.95
## After 8.87 0.0863 4384 8.70 9.04
##
## StationCode = STTD:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 8.42 0.0998 4384 8.23 8.62
## During 8.55 0.1023 4384 8.35 8.75
## After 8.19 0.1085 4384 7.98 8.41
##
## StationCode = LIB:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 7.31 0.0882 4384 7.13 7.48
## During 7.28 0.0859 4384 7.11 7.45
## After 7.31 0.0872 4384 7.14 7.48
##
## StationCode = RVB:
## FlowActionPeriod emmean SE df lower.CL upper.CL
## Before 7.54 0.0905 4384 7.36 7.71
## During 7.54 0.0886 4384 7.37 7.71
## After 7.57 0.0891 4384 7.40 7.75
##
## Results are averaged over the levels of: Year_fct
## Confidence level used: 0.95
##
## $contrasts
## StationCode = RD22:
## contrast estimate SE df t.ratio p.value
## Before - During 0.048049 0.0620 4384 0.775 0.7182
## Before - After -0.053777 0.0845 4384 -0.636 0.8001
## During - After -0.101825 0.0619 4384 -1.646 0.2266
##
## StationCode = I80:
## contrast estimate SE df t.ratio p.value
## Before - During 0.024784 0.0629 4384 0.394 0.9179
## Before - After -0.066299 0.0854 4384 -0.777 0.7174
## During - After -0.091083 0.0618 4384 -1.474 0.3038
##
## StationCode = LIS:
## contrast estimate SE df t.ratio p.value
## Before - During 0.034366 0.0619 4384 0.555 0.8438
## Before - After -0.056685 0.0847 4384 -0.669 0.7813
## During - After -0.091051 0.0622 4384 -1.464 0.3082
##
## StationCode = STTD:
## contrast estimate SE df t.ratio p.value
## Before - During -0.127154 0.0620 4384 -2.051 0.1003
## Before - After 0.228350 0.0856 4384 2.667 0.0210
## During - After 0.355504 0.0628 4384 5.657 <.0001
##
## StationCode = LIB:
## contrast estimate SE df t.ratio p.value
## Before - During 0.029706 0.0626 4384 0.474 0.8833
## Before - After -0.000682 0.0846 4384 -0.008 1.0000
## During - After -0.030389 0.0617 4384 -0.493 0.8747
##
## StationCode = RVB:
## contrast estimate SE df t.ratio p.value
## Before - During -0.003682 0.0617 4384 -0.060 0.9980
## Before - After -0.038808 0.0835 4384 -0.465 0.8879
## During - After -0.035126 0.0615 4384 -0.571 0.8357
##
## Results are averaged over the levels of: Year_fct
## P value adjustment: tukey method for comparing a family of 3 estimates
# Contrasts in Station for each Year
cld(
emmeans(
m_chla_gamm_st_k5_ar2,
data = df_chla_c2,
specs = pairwise ~ StationCode | Year_fct
)$emmeans,
sort = FALSE,
Letters = letters
)
## Year_fct = 2013:
## StationCode emmean SE df lower.CL upper.CL .group
## RD22 9.56 0.172 4384 9.23 9.90 a
## I80 9.92 0.226 4384 9.48 10.36 a
## LIS 9.05 0.173 4384 8.71 9.39 b
## STTD 9.12 0.217 4384 8.69 9.54 ab
## LIB 7.42 0.182 4384 7.07 7.78 c
## RVB 7.54 0.178 4384 7.19 7.89 c
##
## Year_fct = 2014:
## StationCode emmean SE df lower.CL upper.CL .group
## RD22 9.47 0.170 4384 9.14 9.80 a
## I80 9.25 0.215 4384 8.83 9.67 ab
## LIS 8.73 0.169 4384 8.40 9.06 b
## STTD 8.81 0.220 4384 8.37 9.24 b
## LIB 7.67 0.180 4384 7.32 8.03 c
## RVB 7.77 0.182 4384 7.41 8.12 c
##
## Year_fct = 2015:
## StationCode emmean SE df lower.CL upper.CL .group
## RD22 9.37 0.163 4384 9.06 9.69 a
## I80 9.73 0.219 4384 9.30 10.16 a
## LIS 8.38 0.163 4384 8.06 8.70 bc
## STTD 8.46 0.203 4384 8.06 8.86 b d
## LIB 7.79 0.175 4384 7.45 8.13 de
## RVB 7.96 0.171 4384 7.62 8.29 c e
##
## Year_fct = 2016:
## StationCode emmean SE df lower.CL upper.CL .group
## RD22 8.56 0.181 4384 8.21 8.92 ab
## I80 9.67 0.231 4384 9.21 10.12 c
## LIS 8.94 0.177 4384 8.60 9.29 a
## STTD 9.17 0.242 4384 8.69 9.64 abc
## LIB 8.42 0.182 4384 8.07 8.78 b d
## RVB 7.81 0.205 4384 7.40 8.21 d
##
## Year_fct = 2017:
## StationCode emmean SE df lower.CL upper.CL .group
## RD22 9.16 0.170 4384 8.83 9.49 a
## I80 9.87 0.208 4384 9.46 10.28 b
## LIS 8.83 0.169 4384 8.50 9.16 a c
## STTD 8.39 0.244 4384 7.91 8.87 c
## LIB 7.39 0.176 4384 7.05 7.74 d
## RVB 7.70 0.188 4384 7.33 8.07 d
##
## Year_fct = 2018:
## StationCode emmean SE df lower.CL upper.CL .group
## RD22 9.37 0.164 4384 9.05 9.69 a
## I80 8.90 0.212 4384 8.49 9.32 ab
## LIS 8.81 0.164 4384 8.48 9.13 b
## STTD 7.38 0.220 4384 6.95 7.81 c
## LIB 5.87 0.179 4384 5.52 6.23 d
## RVB 7.52 0.176 4384 7.18 7.87 c
##
## Year_fct = 2019:
## StationCode emmean SE df lower.CL upper.CL .group
## RD22 9.91 0.165 4384 9.58 10.23 a
## I80 10.30 0.211 4384 9.88 10.71 a
## LIS 9.01 0.168 4384 8.68 9.34 b
## STTD 7.41 0.208 4384 7.00 7.81 c
## LIB 6.51 0.185 4384 6.15 6.87 d
## RVB 6.55 0.175 4384 6.21 6.90 d
##
## Results are averaged over the levels of: FlowActionPeriod
## Confidence level used: 0.95
## P value adjustment: tukey method for comparing a family of 6 estimates
## significance level used: alpha = 0.05
## NOTE: If two or more means share the same grouping symbol,
## then we cannot show them to be different.
## But we also did not show them to be the same.
# Contrasts in Year for each Station
cld(
emmeans(
m_chla_gamm_st_k5_ar2,
data = df_chla_c2,
specs = pairwise ~ Year_fct | StationCode
)$emmeans,
sort = FALSE,
Letters = letters
)
## StationCode = RD22:
## Year_fct emmean SE df lower.CL upper.CL .group
## 2013 9.56 0.172 4384 9.23 9.90 ab
## 2014 9.47 0.170 4384 9.14 9.80 ab
## 2015 9.37 0.163 4384 9.06 9.69 ab
## 2016 8.56 0.181 4384 8.21 8.92 c
## 2017 9.16 0.170 4384 8.83 9.49 a c
## 2018 9.37 0.164 4384 9.05 9.69 ab
## 2019 9.91 0.165 4384 9.58 10.23 b
##
## StationCode = I80:
## Year_fct emmean SE df lower.CL upper.CL .group
## 2013 9.92 0.226 4384 9.48 10.36 ab
## 2014 9.25 0.215 4384 8.83 9.67 a c
## 2015 9.73 0.219 4384 9.30 10.16 abc
## 2016 9.67 0.231 4384 9.21 10.12 abc
## 2017 9.87 0.208 4384 9.46 10.28 ab
## 2018 8.90 0.212 4384 8.49 9.32 c
## 2019 10.30 0.211 4384 9.88 10.71 b
##
## StationCode = LIS:
## Year_fct emmean SE df lower.CL upper.CL .group
## 2013 9.05 0.173 4384 8.71 9.39 a
## 2014 8.73 0.169 4384 8.40 9.06 ab
## 2015 8.38 0.163 4384 8.06 8.70 b
## 2016 8.94 0.177 4384 8.60 9.29 ab
## 2017 8.83 0.169 4384 8.50 9.16 ab
## 2018 8.81 0.164 4384 8.48 9.13 ab
## 2019 9.01 0.168 4384 8.68 9.34 ab
##
## StationCode = STTD:
## Year_fct emmean SE df lower.CL upper.CL .group
## 2013 9.12 0.217 4384 8.69 9.54 a
## 2014 8.81 0.220 4384 8.37 9.24 a
## 2015 8.46 0.203 4384 8.06 8.86 a
## 2016 9.17 0.242 4384 8.69 9.64 a
## 2017 8.39 0.244 4384 7.91 8.87 a
## 2018 7.38 0.220 4384 6.95 7.81 b
## 2019 7.41 0.208 4384 7.00 7.81 b
##
## StationCode = LIB:
## Year_fct emmean SE df lower.CL upper.CL .group
## 2013 7.42 0.182 4384 7.07 7.78 a
## 2014 7.67 0.180 4384 7.32 8.03 a
## 2015 7.79 0.175 4384 7.45 8.13 ab
## 2016 8.42 0.182 4384 8.07 8.78 b
## 2017 7.39 0.176 4384 7.05 7.74 a
## 2018 5.87 0.179 4384 5.52 6.23 c
## 2019 6.51 0.185 4384 6.15 6.87 c
##
## StationCode = RVB:
## Year_fct emmean SE df lower.CL upper.CL .group
## 2013 7.54 0.178 4384 7.19 7.89 a
## 2014 7.77 0.182 4384 7.41 8.12 a
## 2015 7.96 0.171 4384 7.62 8.29 a
## 2016 7.81 0.205 4384 7.40 8.21 a
## 2017 7.70 0.188 4384 7.33 8.07 a
## 2018 7.52 0.176 4384 7.18 7.87 a
## 2019 6.55 0.175 4384 6.21 6.90 b
##
## Results are averaged over the levels of: FlowActionPeriod
## Confidence level used: 0.95
## P value adjustment: tukey method for comparing a family of 7 estimates
## significance level used: alpha = 0.05
## NOTE: If two or more means share the same grouping symbol,
## then we cannot show them to be different.
## But we also did not show them to be the same.
The differences between flow pulse periods were not significant in most years except During was higher than After in 2017 and After was higher than Before in 2019. Before was marginally higher than both During and After in 2018.
Similarly, the differences between flow pulse periods were not significant at most stations except After was lower than both Before and During at STTD.
Comparing Stations in each Year, there was an obvious gradient from the upstream to downstream stations with the highest chlorophyll values at the most upstream stations in the Toe Drain and the lowest values at the downstream stations across all years.