pls() estimates Partial Least Squares Structural Equation Models (PLS-SEM)
and their consistent (PLSc) variants. The function accepts lavaan-style
syntax, handles ordered indicators through polychoric correlations and probit
factor scores, and supports multilevel specifications expressed with
lme4-style random effects terms inside the structural model.
pls(
  syntax,
  data,
  standardize = TRUE,
  consistent = TRUE,
  bootstrap = FALSE,
  ordered = NULL,
  missing = c("listwise", "mean", "kNN"),
  knn.k = 5,
  mcpls = NULL,
  probit = NULL,
  tolerance = 1e-05,
  max.iter.0_5 = 100L,
  boot.ncpus = 1L,
  boot.parallel = c("no", "multicore", "snow"),
  boot.R = 50L,
  boot.iseed = NULL,
  sample = NULL,
  mc.min.iter = 5L,
  mc.max.iter = 250L,
  mc.reps = 20000L,
  mc.tol = 0.001,
  mc.fixed.seed = FALSE,
  mc.polyak.juditsky = FALSE,
  mc.fn.args = list(),
  verbose = interactive(),
  boot.optimize = TRUE,
  mc.boot.control = list(
    min.iter = mc.min.iter,
    max.iter = mc.max.iter,
    mc.reps = floor(0.5 * mc.reps),
    tol = 2 * mc.tol,
    polyak.juditsky = FALSE,
    verbose = FALSE,
    fixed.seed = TRUE,
    reuse.p.start = TRUE
  ),
  reliabilities = NULL,
  ...
)

syntax
Character string with lavaan-style model syntax describing
both measurement (=~) and structural (~) relations. Random effects are
specified with (term | cluster) statements.

data
A data.frame or coercible object containing the manifest
indicators referenced in syntax. Ordered factors are detected
automatically but can also be declared explicitly through ordered.

standardize
Logical; if TRUE, indicators are standardized before
estimation so that factor scores have comparable scales.

consistent
Logical; TRUE requests PLSc corrections, whereas FALSE
fits the traditional PLS model.

bootstrap
Logical; if TRUE, nonparametric bootstrap standard errors
are computed with boot.R resamples.

ordered
Optional character vector naming manifest indicators that should be
treated as ordered when computing polychoric correlations.

missing
Character string specifying how to handle missing indicator data.
"listwise" removes rows with missing values (listwise deletion).
"mean" imputes missing indicator values using simple univariate
imputation: the mean for continuous variables, the median for ordered variables
with more than two categories, and the mode for binary ordered variables (two
categories) or nominal variables.
"kNN" (or "knn") imputes missing indicator values using
k-nearest neighbors imputation (kNN). When missing = "kNN", rows with
all indicators missing are removed prior to imputation, and rows with missing
cluster values are removed for multilevel models.

knn.k
Integer specifying the number of neighbors (k) used when
missing = "kNN".

mcpls
Should a Monte Carlo consistency correction (MC-PLS) be applied?

probit
Logical; overrides the automatic choice of probit factor scores, which is
otherwise based on whether ordered indicators are present.

tolerance
Numeric; convergence tolerance.

max.iter.0_5
Maximum number of PLS iterations performed when estimating the measurement
and structural models.

boot.ncpus
Integer; number of processes to use in parallel operation. The default in
the usage above is 1L. For parallel runs, a typical choice is the number of
cores reported by parallel::detectCores() minus one.

boot.parallel
The type of parallel operation to be used (if any). If missing,
the default is "no".

boot.R
Integer giving the number of bootstrap resamples drawn when
bootstrap = TRUE.

boot.iseed
An integer used to seed the bootstrap, or NULL if reproducible results are
not needed. Seeding works in both serial (non-parallel) and parallel
settings. Internally, RNGkind() is set to "L'Ecuyer-CMRG"
if boot.parallel = "multicore". If boot.parallel = "snow" (for example on
Windows), parallel::clusterSetRNGStream() is called, which automatically
switches to "L'Ecuyer-CMRG". When boot.iseed is not NULL, .Random.seed
(if it exists) in the global environment is left untouched.

sample
Deprecated. Integer giving the number of bootstrap resamples drawn when
bootstrap = TRUE; boot.R now documents the same quantity.

mc.min.iter
Minimum number of iterations in the MC-PLS algorithm.

mc.max.iter
Maximum number of iterations in the MC-PLS algorithm.

mc.reps
Monte Carlo sample size in the MC-PLS algorithm.

mc.tol
Tolerance in the MC-PLS algorithm.

mc.fixed.seed
Should a fixed seed be used in the MC-PLS algorithm?

mc.polyak.juditsky
Should the Polyak-Juditsky running-average method be applied in the MC-PLS
algorithm?

mc.fn.args
Additional arguments to the MC-PLS algorithm, mainly for controlling the
step size.

verbose
Should verbose output be printed?

boot.optimize
Logical; if TRUE and bootstrap = TRUE, applies
the settings in mc.boot.control inside each bootstrap replicate (MC-PLS only).

mc.boot.control
List of control parameters passed to the MC-PLS algorithm
inside each bootstrap replicate when boot.optimize = TRUE.

reliabilities
Optional named numeric vector of user-supplied reliabilities used for the
PLSc consistency correction.

...
Internal arguments. For advanced users only.
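
The default for mc.boot.control is derived from the top-level mc.* arguments: half the Monte Carlo sample size and twice the tolerance. The base R sketch below only rebuilds that default list from the values in the signature and shows one way to alter a single entry. The object name custom.control and the override value are hypothetical, and whether pls() fills in entries omitted from a partial list is not stated here, so the sketch keeps the list complete.

```r
# Top-level MC-PLS settings, copied from the defaults in the usage block.
mc.min.iter <- 5L
mc.max.iter <- 250L
mc.reps     <- 20000L
mc.tol      <- 0.001

# Default per-replicate control list, written as in the signature.
mc.boot.control <- list(
  min.iter = mc.min.iter,
  max.iter = mc.max.iter,
  mc.reps = floor(0.5 * mc.reps),   # half the Monte Carlo sample size
  tol = 2 * mc.tol,                 # twice the tolerance
  polyak.juditsky = FALSE,
  verbose = FALSE,
  fixed.seed = TRUE,
  reuse.p.start = TRUE
)

# Hypothetical override: keep every default except the iteration cap.
custom.control <- modifyList(mc.boot.control, list(max.iter = 100L))
str(custom.control)
```

A list built this way could then be supplied as mc.boot.control = custom.control together with bootstrap = TRUE and boot.optimize = TRUE.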
A Plssem object containing the estimated parameters, fit measures,
factor scores, and any bootstrap results. Methods such as summary(),
coef(), and parameter_estimates() can be applied to inspect the fit.
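
The rule described for missing = "mean" (mean for continuous variables, median for ordered variables with more than two categories, mode otherwise) can be sketched in base R. This is an illustration of the rule as documented, not the package's internal code: the helper names stat_mode and impute_univariate are hypothetical, and taking the median category as the lower median (quantile type 1) is an assumption.

```r
# Most frequent non-missing value, returned as a character label.
stat_mode <- function(x) {
  counts <- table(x)                 # table() ignores NA by default
  names(counts)[which.max(counts)]
}

# Univariate imputation for a single column.
impute_univariate <- function(x) {
  miss <- is.na(x)
  if (!any(miss)) return(x)
  if (is.ordered(x) && nlevels(x) > 2L) {
    # Ordered, more than two categories: median category.
    # type = 1 returns an observed value, so idx is a valid level index.
    idx <- quantile(as.integer(x), probs = 0.5, type = 1, na.rm = TRUE)
    x[miss] <- levels(x)[idx]
  } else if (is.factor(x)) {
    # Binary ordered or nominal: most frequent category.
    x[miss] <- stat_mode(x)
  } else {
    # Continuous: arithmetic mean of the observed values.
    x[miss] <- mean(x, na.rm = TRUE)
  }
  x
}

# Toy data with one missing value per column.
toy <- data.frame(
  y   = c(1, 2, NA, 5),
  ord = factor(c("low", "mid", "high", NA),
               levels = c("low", "mid", "high"), ordered = TRUE),
  bin = factor(c("no", "yes", "yes", NA),
               levels = c("no", "yes"), ordered = TRUE)
)
imputed <- toy
imputed[] <- lapply(toy, impute_univariate)
print(imputed)
```

Each column is imputed independently of the others, which is what distinguishes this option from the kNN alternative.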
# \donttest{
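library(plssem)
library(modsem)  # loaded for the TPB example data used below

# Theory of Planned Behavior model in lavaan-style syntax:
# "=~" lines define the measurement model, "~" lines the structural model.
tpb <- '
ATT =~ att1 + att2 + att3 + att4 + att5
SN =~ sn1 + sn2
PBC =~ pbc1 + pbc2 + pbc3
INT =~ int1 + int2 + int3
BEH =~ b1 + b2
INT ~ ATT + SN + PBC
BEH ~ INT + PBC
'

# Fit with the default settings (PLSc, standardized indicators) and
# request bootstrap standard errors.
fit <- pls(tpb, TPB, bootstrap = TRUE)
summary(fit)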
#> plssem (0.1.2) ended normally after 3 iterations
#>
#>   Estimator                         PLSc
#>   Link                            LINEAR
#>
#>   Number of observations            2000
#>   Number of iterations                 3
#>   Number of latent variables           5
#>   Number of observed variables        15
#>
#> Fit Measures:
#>   Chi-Square                     106.316
#>   Degrees of Freedom                  82
#>   SRMR                             0.008
#>   RMSEA                            0.012
#>
#> R-squared (indicators):
#>   att1                             0.847
#>   att2                             0.825
#>   att3                             0.805
#>   att4                             0.745
#>   att5                             0.845
#>   sn1                              0.817
#>   sn2                              0.863
#>   pbc1                             0.856
#>   pbc2                             0.859
#>   pbc3                             0.787
#>   int1                             0.816
#>   int2                             0.827
#>   int3                             0.742
#>   b1                               0.762
#>   b2                               0.821
#>
#> R-squared (latents):
#>   INT                              0.367
#>   BEH                              0.210
#>
#> Latent Variables:
#>                     Estimate  Std.Error  z.value  P(>|z|)
#>   ATT =~
#>     att1               0.921      0.013   71.482    0.000
#>     att2               0.908      0.016   55.688    0.000
#>     att3               0.897      0.018   50.441    0.000
#>     att4               0.863      0.018   48.740    0.000
#>     att5               0.919      0.015   61.155    0.000
#>   SN =~
#>     sn1                0.904      0.011   80.275    0.000
#>     sn2                0.929      0.013   71.812    0.000
#>   PBC =~
#>     pbc1               0.925      0.009   98.920    0.000
#>     pbc2               0.927      0.010   91.179    0.000
#>     pbc3               0.887      0.012   75.698    0.000
#>   INT =~
#>     int1               0.903      0.011   81.076    0.000
#>     int2               0.909      0.011   84.452    0.000
#>     int3               0.861      0.014   60.972    0.000
#>   BEH =~
#>     b1                 0.873      0.016   54.499    0.000
#>     b2                 0.906      0.016   57.931    0.000
#>
#> Regressions:
#>                     Estimate  Std.Error  z.value  P(>|z|)
#>   INT ~
#>     ATT                0.243      0.031    7.756    0.000
#>     SN                 0.201      0.029    6.902    0.000
#>     PBC                0.240      0.030    7.977    0.000
#>   BEH ~
#>     PBC                0.308      0.026   11.936    0.000
#>     INT                0.210      0.029    7.227    0.000
#>
#> Covariances:
#>                     Estimate  Std.Error  z.value  P(>|z|)
#>   ATT ~~
#>     SN                 0.633      0.014   45.157    0.000
#>     PBC                0.692      0.014   48.360    0.000
#>   SN ~~
#>     PBC                0.696      0.014   50.952    0.000
#>
#> Variances:
#>                     Estimate  Std.Error  z.value  P(>|z|)
#>     ATT                1.000
#>     SN                 1.000
#>     PBC                1.000
#>    .INT                0.633      0.016   40.026    0.000
#>    .BEH                0.790      0.019   41.975    0.000
#>    .att1               0.153      0.024    6.427    0.000
#>    .att2               0.175      0.030    5.916    0.000
#>    .att3               0.195      0.032    6.139    0.000
#>    .att4               0.255      0.031    8.256    0.000
#>    .att5               0.155      0.028    5.585    0.000
#>    .sn1                0.183      0.020    8.987    0.000
#>    .sn2                0.137      0.024    5.661    0.000
#>    .pbc1               0.144      0.017    8.271    0.000
#>    .pbc2               0.141      0.019    7.501    0.000
#>    .pbc3               0.213      0.021   10.253    0.000
#>    .int1               0.184      0.020    9.135    0.000
#>    .int2               0.173      0.020    8.869    0.000
#>    .int3               0.258      0.024   10.628    0.000
#>    .b1                 0.238      0.028    8.599    0.000
#>    .b2                 0.179      0.029    6.251    0.000
#>
# }
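
The printed summary is internally consistent with a standardized solution: each indicator R-squared is close to the squared loading, and each residual variance equals one minus the corresponding R-squared. The base R sketch below checks this for a few rows copied from the output above. It does not call plssem, the object names are illustrative, and small differences in the third decimal are expected because the printed loadings are rounded.

```r
# Values copied from the printed summary (rounded to three decimals).
loading     <- c(att1 = 0.921, sn1 = 0.904, pbc1 = 0.925)
r2_printed  <- c(att1 = 0.847, sn1 = 0.817, pbc1 = 0.856)
res_printed <- c(att1 = 0.153, sn1 = 0.183, pbc1 = 0.144)

# Squared standardized loading, to compare with the reported R-squared.
r2_from_loading <- loading^2

# Residual variance implied by the reported R-squared.
res_from_r2 <- 1 - r2_printed

comparison <- data.frame(
  r2_printed,
  r2_from_loading = round(r2_from_loading, 3),
  res_printed,
  res_from_r2
)
print(comparison)

# The same identity for the endogenous latent variables INT and BEH,
# whose residual variances appear as .INT and .BEH in the summary.
latent_r2  <- c(INT = 0.367, BEH = 0.210)
latent_res <- 1 - latent_r2
print(latent_res)
```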