Title: | Phylogenetic Structural Equation Model |
---|---|
Description: | Applies phylogenetic comparative methods (PCM) and phylogenetic trait imputation using structural equation models (SEM), extending methods from Thorson et al. (2023) <doi:10.1111/2041-210X.14076>. This implementation includes a minimal set of features, to allow users to easily read all of the documentation and source code. PCM using SEM includes phylogenetic linear models and structural equation models as nested submodels, but also allows imputation of missing values. Features and comparison with other packages are described in Thorson and van der Bijl (2023) <doi:10.1111/jeb.14234>. |
Authors: | James Thorson [aut, cre] , Wouter van der Bijl [ctb] |
Maintainer: | James Thorson <[email protected]> |
License: | GPL-3 |
Version: | 1.1.4 |
Built: | 2024-11-09 02:48:28 UTC |
Source: | https://github.com/james-thorson-noaa/phylosem |
Convert output from package phylosem to phylopath
as_fitted_DAG(object)
as_fitted_DAG(object)
object |
Output from |
Convert output to format supplied by est_DAG
Convert output from package phylosem to phylo4d object from package phylobase
as_phylo4d(object, what = c("Estimate", "Std. Error"))
as_phylo4d(object, what = c("Estimate", "Std. Error"))
object |
Output from |
what |
Select what to convert (Estimate / Std. Error). |
This package is intended to for use in using plots assocaited with package sem,
e.g., using package plotSEM semPlot::semPlotModel
phylosem output to converted format supplied by phylo4d
Convert output from package phylosem to output from package sem
as_sem(object)
as_sem(object)
object |
Output from |
Output converted to format supplied by sem
Choose model
average(x, cut_off, avg_method)
average(x, cut_off, avg_method)
x |
output from |
cut_off |
threshold where any model with delta-AIC greater than this value is excluded from average |
avg_method |
see |
Returns an AIC-weighted average of fitted models from compare_phylosem
after conversion to format from [phylopath::est_DAG]
Choose model
## S3 method for class 'compare_phylosem' average(x, cut_off = 2, avg_method = "conditional")
## S3 method for class 'compare_phylosem' average(x, cut_off = 2, avg_method = "conditional")
x |
output from |
cut_off |
threshold where any model with delta-AIC greater than this value is excluded from average |
avg_method |
see |
Returns an AIC-weighted average of fitted models from compare_phylosem
after conversion to format from est_DAG
Extract best fitted model
best(x)
best(x)
x |
output from |
Returns best model from those fitted using compare_phylosem
Extract best fitted model
## S3 method for class 'compare_phylosem' best(x)
## S3 method for class 'compare_phylosem' best(x)
x |
output from |
Returns best model from those fitted using compare_phylosem
Choose model
choice(x, choice)
choice(x, choice)
x |
output from |
choice |
Integer indicating model to extract |
Returns chosen model from those fitted using compare_phylosem
Choose model
## S3 method for class 'compare_phylosem' choice(x, choice)
## S3 method for class 'compare_phylosem' choice(x, choice)
x |
output from |
choice |
Integer indicating model to extract |
Returns chosen model from those fitted using compare_phylosem
Extract path coefficients.
## S3 method for class 'phylosem' coef(object, standardized = FALSE, ...)
## S3 method for class 'phylosem' coef(object, standardized = FALSE, ...)
object |
Output from |
standardized |
Whether to standardize regression coefficients |
... |
Not used |
Data-frame listing all path coefficients, their parameter index and estimated values
Fits several phylogenetic structural equation model for further comparison
compare_phylosem( sem_set, tree, data, family = rep("fixed", ncol(data)), covs, estimate_ou = FALSE, estimate_lambda = FALSE, estimate_kappa = FALSE, control = phylosem_control(), ... )
compare_phylosem( sem_set, tree, data, family = rep("fixed", ncol(data)), covs, estimate_ou = FALSE, estimate_lambda = FALSE, estimate_kappa = FALSE, control = phylosem_control(), ... )
sem_set |
A named list of structural equation model specifications,
where each element will be passed as argument |
tree |
phylogenetic structure, using class |
data |
data-frame providing variables being modeled. Missing values are inputted
as NA. If an SEM includes a latent variable (i.e., variable with no available measurements)
then it still must be inputted as a column of |
family |
Character-vector listing the distribution used for each column of |
covs |
optional: a character vector of one or more elements, with each element
giving a string of variable names, separated by commas. Variances and covariances
among all variables in each such string are added to the model. For confirmatory
factor analysis models specified via |
estimate_ou |
Boolean indicating whether to estimate an autoregressive (Ornstein-Uhlenbeck)
process using additional parameter |
estimate_lambda |
Boolean indicating whether to estimate additional branch lengths for
phylogenetic tips (a.k.a. the Pagel-lambda term) using additional parameter |
estimate_kappa |
Boolean indicating whether to estimate a nonlinear scaling of branch
lengths (a.k.a. the Pagel-kappa term) using additional parameter |
control |
Output from |
... |
Additional arguments passed to |
An object (list) of class 'compare_phylosem', containing a list of
output from phylosem
list_parameters
lists all fixed and random effects
list_parameters(Obj, verbose = TRUE)
list_parameters(Obj, verbose = TRUE)
Obj |
Compiled TMB object |
verbose |
Boolean, whether to print messages to terminal |
Tagged-list of fixed and random effects, returned invisibly and printed to screen
Data used to demonstrate phylogenetic comparative methods for fisheries science. Specifically a copy of the Then et al. database doi:10.1093/icesjms/fsu136 using file "Mlifehist_ver1.0.csv" accessed from https://www.vims.edu/research/departments/fisheries/programs/mort_db/
data(Mlifehist_ver1_0)
data(Mlifehist_ver1_0)
parse_path
is copied from sem::parse.path
parse_path(path)
parse_path(path)
path |
text to parse |
Copied with permission from John Fox under licence GPL (>= 2)
Tagged-list defining variables and direction for a specified path coefficient
Fits a phylogenetic structural equation model
phylosem( sem, tree, data, family = rep("fixed", ncol(data)), covs = colnames(data), estimate_ou = FALSE, estimate_lambda = FALSE, estimate_kappa = FALSE, data_labels = rownames(data), tmb_inputs = NULL, control = phylosem_control() )
phylosem( sem, tree, data, family = rep("fixed", ncol(data)), covs = colnames(data), estimate_ou = FALSE, estimate_lambda = FALSE, estimate_kappa = FALSE, data_labels = rownames(data), tmb_inputs = NULL, control = phylosem_control() )
sem |
structural equation model structure, passed to either |
tree |
phylogenetic structure, using class |
data |
data-frame providing variables being modeled. Missing values are inputted
as NA. If an SEM includes a latent variable (i.e., variable with no available measurements)
then it still must be inputted as a column of |
family |
Character-vector listing the distribution used for each column of |
covs |
optional: a character vector of one or more elements, with each element
giving a string of variable names, separated by commas. Variances and covariances
among all variables in each such string are added to the model. For confirmatory
factor analysis models specified via |
estimate_ou |
Boolean indicating whether to estimate an autoregressive (Ornstein-Uhlenbeck)
process using additional parameter |
estimate_lambda |
Boolean indicating whether to estimate additional branch lengths for
phylogenetic tips (a.k.a. the Pagel-lambda term) using additional parameter |
estimate_kappa |
Boolean indicating whether to estimate a nonlinear scaling of branch
lengths (a.k.a. the Pagel-kappa term) using additional parameter |
data_labels |
For each row of |
tmb_inputs |
optional tagged list that overrides the default constructor for TMB inputs (use at your own risk) |
control |
Output from |
Note that parameters logitlambda
, lnkappa
, and lnalpha
if estimated are each estimated as having a single value
that applies to all modeled variables.
This differs from default behavior in phylolm, where these parameters only apply to the "response" and not "predictor" variables.
This also differs from default behavior in phylopath, where a different value is estimated
in each call to phylolm during the d-separation estimate of path coefficients. However, it is
consistent with default behavior in Rphylopars, and estimates should be comparable in that case.
These additional parameters are estimated with unbounded support, which differs somewhat from default
bounded estimates in phylolm, although parameters should match if overriding phylolm defaults
to use unbounded support. Finally, phylosem
allows these three parameters to be estimated in any
combination, which is expanded functionality relative to the single-option functionality in phylolm.
Also note that phylopath by default uses standardized coefficients. To achieve matching parameter estimates between phylosem and phylopath, standardize each variable to have a standard deviation of 1.0 prior to fitting with phylosem.
An object (list) of class 'phylosem'. Elements include:
Copy of argument data
SEM model parsed from sem
using specifyModel
or specifyEquations
TMB object from MakeADFun
Copy of argument tree
The list of inputs passed to MakeADFun
The output from nlminb
The output from sdreport
The output from obj$report()
The output from obj$env$parList()
containing maximum likelihood estimates and empirical Bayes predictions
**Introducing the package, its features, and comparison with other software (to cite when using phylosem):**
Thorson, J. T., & van der Bijl, W. (In press). phylosem: A fast and simple R package for phylogenetic inference and trait imputation using phylogenetic structural equation models. Journal of Evolutionary Biology. doi:10.1111/jeb.14234
*Statistical methods for phylogenetic structural equation models*
Thorson, J. T., Maureaud, A. A., Frelat, R., Merigot, B., Bigman, J. S., Friedman, S. T., Palomares, M. L. D., Pinsky, M. L., Price, S. A., & Wainwright, P. (2023). Identifying direct and indirect associations among traits by merging phylogenetic comparative methods and structural equation models. Methods in Ecology and Evolution, 14(5), 1259-1275. doi:10.1111/2041-210X.14076
*Earlier development of computational methods, originally used for phlogenetic factor analysis:*
Thorson, J. T. (2020). Predicting recruitment density dependence and intrinsic growth rate for all fishes worldwide using a data-integrated life-history model. Fish and Fisheries, 21(2), 237-251. doi:10.1111/faf.12427
Thorson, J. T., Munch, S. B., Cope, J. M., & Gao, J. (2017). Predicting life history parameters for all fishes worldwide. Ecological Applications, 27(8), 2262-2276. doi:10.1002/eap.1606
*Earlier development of phylogenetic path analysis:*
van der Bijl, W. (2018). phylopath: Easy phylogenetic path analysis in R. PeerJ, 6, e4718. doi:10.7717/peerj.4718
von Hardenberg, A., & Gonzalez-Voyer, A. (2013). Disentangling evolutionary cause-effect relationships with phylogenetic confirmatory path analysis. Evolution; International Journal of Organic Evolution, 67(2), 378-387. doi:10.1111/j.1558-5646.2012.01790.x
*Interface involving SEM 'arrow notation' is repurposed from:*
Fox, J., Nie, Z., & Byrnes, J. (2020). Sem: Structural equation models. R package version 3.1-11. https://CRAN.R-project.org/package=sem
*Coercing output to phylo4d depends upon:*
Bolker, B., Butler, M., Cowan, P., de Vienne, D., Eddelbuettel, D., Holder, M., Jombart, T., Kembel, S., Michonneau, F., & Orme, B. (2015). phylobase: Base package for phylogenetic structures and comparative data. R Package Version 0.8.0. https://CRAN.R-project.org/package=phylobase
*Laplace approximation for parameter estimation depends upon:*
Kristensen, K., Nielsen, A., Berg, C. W., Skaug, H., & Bell, B. M. (2016). TMB: Automatic differentiation and Laplace approximation. Journal of Statistical Software, 70(5), 1-21. doi:10.18637/jss.v070.i05
# Load data set data(rhino, rhino_tree, package="phylopath") # Run phylosem model = " DD -> RS, p1 BM -> LS, p2 BM -> NL, p3 NL -> DD, p4 " psem = phylosem( sem = model, data = rhino[,c("BM","NL","DD","RS","LS")], tree = rhino_tree ) # Convert and plot using phylopath library(phylopath) my_fitted_DAG = as_fitted_DAG(psem) coef_plot( my_fitted_DAG ) plot( my_fitted_DAG ) # Convert to phylo4d to extract estimated traits and Standard errors # for all ancestors and tips in the tree. # In this rhino example, note that species are labeled s1-s100 # and ancestral nodes are not named. (traits_est = as_phylo4d(psem)) (traits_SE = as_phylo4d(psem, what="Std. Error")) # Convert to sem and plot library(sem) my_sem = as_sem(psem) pathDiagram( model = my_sem, style = "traditional", edge.labels = "values" ) effects( my_sem ) # Plot using semPlot if( require(semPlot) ){ myplot = semPlotModel( my_sem ) semPaths( my_sem, nodeLabels = myplot@Vars$name ) }
# Load data set data(rhino, rhino_tree, package="phylopath") # Run phylosem model = " DD -> RS, p1 BM -> LS, p2 BM -> NL, p3 NL -> DD, p4 " psem = phylosem( sem = model, data = rhino[,c("BM","NL","DD","RS","LS")], tree = rhino_tree ) # Convert and plot using phylopath library(phylopath) my_fitted_DAG = as_fitted_DAG(psem) coef_plot( my_fitted_DAG ) plot( my_fitted_DAG ) # Convert to phylo4d to extract estimated traits and Standard errors # for all ancestors and tips in the tree. # In this rhino example, note that species are labeled s1-s100 # and ancestral nodes are not named. (traits_est = as_phylo4d(psem)) (traits_SE = as_phylo4d(psem, what="Std. Error")) # Convert to sem and plot library(sem) my_sem = as_sem(psem) pathDiagram( model = my_sem, style = "traditional", edge.labels = "values" ) effects( my_sem ) # Plot using semPlot if( require(semPlot) ){ myplot = semPlotModel( my_sem ) semPaths( my_sem, nodeLabels = myplot@Vars$name ) }
Define a list of control parameters. Note that
the format of this input is likely to change more rapidly than that of
phylosem
phylosem_control( nlminb_loops = 1, newton_loops = 1, trace = 0, eval.max = 1000, iter.max = 1000, getsd = TRUE, quiet = FALSE, run_model = TRUE, getJointPrecision = FALSE )
phylosem_control( nlminb_loops = 1, newton_loops = 1, trace = 0, eval.max = 1000, iter.max = 1000, getsd = TRUE, quiet = FALSE, run_model = TRUE, getJointPrecision = FALSE )
nlminb_loops |
Integer number of times to call |
newton_loops |
Integer number of Newton steps to do after running
|
trace |
Parameter values are printed every 'trace' iteration
for the outer optimizer. Passed to
'control' in |
eval.max |
Maximum number of evaluations of the objective function
allowed. Passed to 'control' in |
iter.max |
Maximum number of iterations allowed. Passed to 'control' in
|
getsd |
Boolean indicating whether to call |
quiet |
Boolean indicating whether to run model printing messages to terminal or not; |
run_model |
Boolean indicating whether to estimate parameters (the default), or instead to return the model inputs and compiled TMB object without running; |
getJointPrecision |
whether to get the joint precision matrix. Passed
to |
An S3 object of class "phylosem_control" that specifies detailed model settings, allowing user specification while also specifying default values
Print parameter estimates
## S3 method for class 'phylosem' print(x, ...)
## S3 method for class 'phylosem' print(x, ...)
x |
Output from |
... |
Not used |
prints (and invisibly returns) output from nlminb
Summarize phylosem output from phylosem, including calculating intercepts at the tree root
## S3 method for class 'phylosem' summary(object, ...)
## S3 method for class 'phylosem' summary(object, ...)
object |
Output from |
... |
Not used |
Data-frame containing all estimated intercepts, path coefficients, and variance-covariance parameters as well as their standard errors
extract the covariance of fixed effects, or both fixed and random effects.
## S3 method for class 'phylosem' vcov(object, which = c("fixed", "random", "both"), ...)
## S3 method for class 'phylosem' vcov(object, which = c("fixed", "random", "both"), ...)
object |
output from |
which |
whether to extract the covariance among fixed effects, random effects, or both |
... |
ignored, for method compatibility |