Skip to contents

Function that runs a Bayesian sampler estimating parameters of an eDITH model fitted on both eDNA and direct sampling data.

Usage

run_eDITH_BT_joint(data, river, covariates = NULL, Z.normalize = TRUE,
           use.AEM = FALSE, n.AEM = NULL, par.AEM = NULL,
           no.det = FALSE, ll.type = "norm", source.area = "AG",
           mcmc.settings = NULL, likelihood = NULL, prior = NULL, 
       sampler.type = "DREAMzs",
           tau.prior = list(spec="lnorm",a=0,b=Inf, meanlog=log(5), 
                        sd=sqrt(log(5)-log(4))),
           log_p0.prior = list(spec="unif",min=-20, max=0),
           beta.prior = list(spec="norm",sd=1),
           sigma.prior = list(spec="unif",min=0, 
                        max=max(data$values[data$type=="e"], 
                na.rm = TRUE)),
           omega.prior = list(spec="unif",min=1, 
                        max=10*max(data$values[data$type=="e"], 
                na.rm = TRUE)),
           Cstar.prior = list(spec="unif",min=0, 
                        max=max(data$values[data$type=="e"], 
                na.rm = TRUE)),
           omega_d.prior = list(spec="unif",min=1,
                                max=10*max(c(0.11, data$values[data$type=="d"]), 
                na.rm = TRUE)),
           alpha.prior = list(spec="unif", min=0, max=1e6),
           verbose = FALSE)

Arguments

data

eDNA and direct observation data. Data frame containing columns ID (index of the AG node/reach where the sample was taken), values (value of the eDNA or direct measurement) and type (equal to "e" for eDNA data and to "d" for direct observation data). eDNA values are expressed as concentration or number of reads; direct observations are expressed as numbers of individuals.

river

A river object generated via aggregate_river.

covariates

Data frame containing covariate values for all river reaches. If NULL (default option), production rates are estimated via AEMs.

Z.normalize

Logical. Should covariates be Z-normalized?

use.AEM

Logical. Should eigenvectors based on AEMs be used as covariates? If covariates = NULL, it is set to TRUE. If TRUE and covariates are provided, AEM eigenvectors are appended to the covariates data frame.

n.AEM

Number of AEM eigenvectors (sorted by the decreasing respective eigenvalue) to be used as covariates. If par.AEM$moranI = TRUE, this parameter is not used. Instead, the eigenvectors with significantly positive spatial autocorrelation are used as AEM covariates.

par.AEM

List of additional parameters that are passed to river_to_AEM for calculation of AEMs. In particular, par.AEM$moranI = TRUE imposes the use of AEM covariates with significantly positive spatial autocorrelation based on Moran's I statistic.

no.det

Logical. Should a probability of non-detection be included in the model?

ll.type

Character. String defining the error distribution used in the log-likelihood formulation. Allowed values are norm (for normal distribution), lnorm (for lognormal distribution), nbinom (for negative binomial distribution) and geom (for geometric distribution). The two latter choices are suited when eDNA data are expressed as read numbers, while norm and lnorm are better suited to eDNA concentrations.

source.area

Defines the extent of the source area of a node. Possible values are "AG" (if the source area is the reach surface, i.e. length*width), "SC" (if the source area is the subcatchment area), or, alternatively, a vector with length river$AG$nodes.

mcmc.settings

List. It is passed as argument settings in runMCMC. Default is list(iterations = 2.7e6, burnin=1.8e6, message = TRUE, thin = 10).

likelihood

Likelihood function to be passed as likelihood argument to createBayesianSetup. If not specified, it is generated based on arguments no.det and ll.type. If a custom likelihood is specified, a custom prior must also be specified.

prior

Prior function to be passed as prior argument to createBayesianSetup. If not specified, it is generated based on the *.prior arguments provided. If a user-defined prior is provided, parameter names must be included in prior$lower, prior$upper (see example).

sampler.type

Character. It is passed as argument sampler in runMCMC.

tau.prior

List that defines the prior distribution for the decay time parameter tau. See details.

log_p0.prior

List that defines the prior distribution for the logarithm (in base 10) of the baseline production rate p0. See details. If covariates = NULL, this defines the prior distribution for the logarithm (in base 10) of production rates for all river reaches.

beta.prior

List that defines the prior distribution for the covariate effects beta. See details. If a single spec is provided, the same prior distribution is specified for all beta parameters. Alternatively, if spec (and the other arguments, if provided) is a vector with length equal to the number of covariates included, different prior distributions can be specified for the different beta parameters.

sigma.prior

List that defines the prior distribution for the standard deviation of the measurement error when ll.type is "norm" or "lnorm". It is not used if ll.type = "nbinom". See details.

omega.prior

List that defines the prior distribution for the overdispersion parameter omega of the measurement error when ll.type = "nbinom". It is not used if ll.type is "norm" or "lnorm". See details.

Cstar.prior

List that defines the prior distribution for the Cstar parameter controlling the probability of no detection. It is only used if no.det = TRUE. See details.

omega_d.prior

Prior distribution for the overdispersion parameter for direct sampling density observations.

alpha.prior

Prior distribution for the inverse DNA shedding rate (i.e., the organismal density that sheds a unit eDNA value per unit time).

verbose

Logical. Should console output be displayed?

Details

The arguments of the type *.prior consist in the lists of arguments required by dtrunc (except the first argument x).

By default, AEMs are computed without attributing weights to the edges of the river network. Use e.g. par.AEM = list(weight = "gravity") to attribute weights.

Value

A list with objects:

param_map

Vector of named parameters corresponding to the maximum a posteriori estimate. It is the output of the call to MAP.

p_map

Vector of best-fit eDNA production rates corresponding to the maximum a posteriori parameter estimate param_map. It has length equal to river$AG$nNodes.

C_map

Vector of best-fit eDNA values (in the same unit as data$values, i.e. concentrations or read numbers) corresponding to the maximum a posteriori parameter estimate param_map. It has length equal to river$AG$nNodes.

probDet_map

Vector of best-fit detection probabilities corresponding to the maximum a posteriori parameter estimate param_map. It has length equal to river$AG$nNodes. If a custom likelihood is provided, this is a vector of null length (in which case the user should calculate the probability of detection independently, based on the chosen likelihood).

cI

Output of the call to getCredibleIntervals.

gD

Output of the call to gelmanDiagnostics.

covariates

Data frame containing input covariate values (possibly Z-normalized).

source.area

Vector of source area values.

outMCMC

Object of class mcmcSampler returned by the call to runMCMC.

Moreover, arguments ll.type (possibly changed to "custom" if a custom likelihood is specified), no.det and data are added to the list.

Examples

data(wigger)
data(dataCD)

# reduce number of iterations for illustrative purposes 
# (use default mcmc.settings to ensure convergence)
settings.short <- list(iterations = 1e3, thin = 10)
set.seed(1)
out <- run_eDITH_BT_joint(dataCD, wigger, mcmc.settings = settings.short)
#> Covariates not specified. Production rates will be estimated
#>                       based on the first n.AEM = 26 AEMs. 
#> 
 Running DREAM-MCMC, chain  1 iteration 300 of 1002 . Current logp  1556.836 1482.211 1585.538 Please wait! 

 Running DREAM-MCMC, chain  1 iteration 600 of 1002 . Current logp  1557.273 1527.394 1585.538 Please wait! 

 Running DREAM-MCMC, chain  1 iteration 900 of 1002 . Current logp  1559.214 1529.97 1604.182 Please wait! 

 Running DREAM-MCMC, chain  1 iteration 1002 of 1002 . Current logp  1559.214 1529.97 1604.182 Please wait! 

#> runMCMC terminated after 2.048seconds
#> gelmanDiagnostics could not be calculated, possibly there is not enoug variance in your MCMC chains. Try running the sampler longer

# \donttest{
library(rivnet)
# best-fit (maximum a posteriori) map of eDNA production rates
plot(wigger, out$p_map)


# best-fit map (maximum a posteriori) of detection probability
plot(wigger, out$probDet_map)



# compare best-fit vs observed values
data.e <- which(dataCD$type=="e")
data.d <- which(dataCD$type=="d")

plot(out$C_map[dataCD$ID[data.e]], dataCD$values[data.e],
  xlab="Modelled (MAP) eDNA concentrations", ylab="Observed eDNA concentrations")
abline(a=0, b=1) 


plot(out$p_map[dataCD$ID[data.d]], dataCD$values[data.d],
  xlab="Modelled (MAP) eDNA production rate", ylab="Observed density data")



# }