Ordinal Regression with plssem • plssem

This vignette demonstrates how to fit an ordinal regression model with plssem using the titanic dataset.

In plssem, regression-style model syntax like y ~ x1 + x2 is supported. When the dependent variable (and/or predictors) are ordinal. Ordinal variables can be supplied via the ordered argument, or by making sure they are ordered in the dataset.

Data

head(titanic[, c("Survived", "Age", "Sex", "Female", "Pclass")])
#>   Survived Age    Sex Female Pclass
#> 1        0  22   male      0      3
#> 2        1  38 female      1      1
#> 3        1  26 female      1      3
#> 4        1  35 female      1      1
#> 5        0  35   male      0      3
#> 6        0  NA   male      0      3

Linear ordinal regression

This model predicts survival as a function of age and sex.

m_linear <- "Survived ~ Age + Female"

fit_linear <- pls(
  m_linear,
  data          = titanic,
  ordered       = "Survived",
  boot.R        = 50,
  bootstrap     = TRUE,
  boot.parallel = "multicore",
  boot.ncores   = 2
)

summary(fit_linear)
#> plssem (0.1.2) ended normally after 1 iterations
#>   Estimator                                    OrdPLSc
#>   Link                                          PROBIT
#>                                                       
#>   Number of observations                           714
#>   Number of iterations                               1
#>   Number of latent variables                         0
#>   Number of observed variables                       3
#> 
#> Fit Measures:
#>   Chi-Square                                     0.000
#>   Degrees of Freedom                                 0
#>   SRMR                                           0.000
#>   RMSEA                                            NaN
#> 
#> R-squared (latents):
#>   Survived                                       0.347
#> 
#> Regressions:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   Survived ~    
#>     Age            -0.043      0.041   -1.070    0.284
#>     Female          0.584      0.028   20.673    0.000
#> 
#> Covariances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   Age ~~        
#>     Female         -0.093      0.040   -2.303    0.021
#> 
#> Variances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>    .Survived        0.653      0.033   19.570    0.000
#>     Age             1.000                             
#>     Female          1.000

Optional: evaluate predictive performance.

pls_predict(fit_linear, benchmark = "acc")
#> PlsSemPredict object
#> Available fields: $Y, $X.cont, $X.cont.pred, $X.ord, $X.ord.pred, $benchmark
#> 
#> Y [714 x 3] (head)
#>   Survvd    Age Female
#> 1 -0.420 -0.530 -0.759
#> 2  0.743  0.571  1.317
#> 3  0.779 -0.255  1.317
#> 4  0.752  0.365  1.317
#> 5 -0.458  0.365 -0.759
#> 7 -0.515  1.673 -0.759
#> 
#> X.cont [714 x 3] (head)
#>   Survvd    Age Female
#> 1 -0.653 -0.530 -0.759
#> 2  0.955  0.571  1.317
#> 3  0.955 -0.255  1.317
#> 4  0.955  0.365  1.317
#> 5 -0.653  0.365 -0.759
#> 7 -0.653  1.673 -0.759
#> 
#> X.cont.pred [714 x 3] (head)
#>   Survvd    Age Female
#> 1 -0.420 -0.530 -0.759
#> 2  0.743  0.571  1.317
#> 3  0.779 -0.255  1.317
#> 4  0.752  0.365  1.317
#> 5 -0.458  0.365 -0.759
#> 7 -0.515  1.673 -0.759
#> 
#> X.ord [714 x 3] (head)
#>   Survvd    Age Female
#> 1  1.000 -0.530 -0.759
#> 2  2.000  0.571  1.317
#> 3  2.000 -0.255  1.317
#> 4  2.000  0.365  1.317
#> 5  1.000  0.365 -0.759
#> 7  1.000  1.673 -0.759
#> 
#> X.ord.pred [714 x 3] (head)
#>   Survvd    Age Female
#> 1  1.000 -0.530 -0.759
#> 2  2.000  0.571  1.317
#> 3  2.000 -0.255  1.317
#> 4  2.000  0.365  1.317
#> 5  1.000  0.365 -0.759
#> 7  1.000  1.673 -0.759
#> 
#> Benchmark summary
#> acc: n=1, mean=0.780, median=0.780, min=0.780, max=0.780
#>  variable value
#>  Survived 0.780

Non-linear ordinal regression (interaction)

To include a non-linear (interaction) effect, add an interaction term. With ordinal indicators and interactions, plssem automatically switches to the Monte-Carlo ordinal PLSc estimator.

m_int <- "Survived ~ Age + Female + Age:Female"

fit_int <- pls(
  m_int,
  data          = titanic,
  ordered       = "Survived",
  boot.R        = 50,
  bootstrap     = TRUE,
  boot.parallel = "multicore",
  boot.ncores   = 2
)

summary(fit_int)
#> plssem (0.1.2) ended normally after 40 iterations
#>   Estimator                                  MCOrdPLSc
#>   Link                                          PROBIT
#>                                                       
#>   Number of observations                           714
#>   Number of iterations                              40
#>   Number of latent variables                         0
#>   Number of observed variables                       3
#> 
#> Fit Measures:
#>   Chi-Square                                     0.036
#>   Degrees of Freedom                                -3
#>   SRMR                                           0.002
#>   RMSEA                                            NaN
#> 
#> R-squared (latents):
#>   Survived                                       0.560
#> 
#> Regressions:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   Survived ~    
#>     Age            -0.073      0.039   -1.896    0.058
#>     Female          0.696      0.041   17.019    0.000
#>     Age:Female      0.245      0.069    3.543    0.000
#> 
#> Covariances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   Age ~~        
#>     Female         -0.091      0.038   -2.423    0.015
#>     Age:Female     -0.017      0.014   -1.212    0.225
#>   Female ~~     
#>     Age:Female      0.001      0.015    0.063    0.949
#> 
#> Variances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>    .Survived        0.440      0.078    5.619    0.000
#>     Age             1.000                             
#>     Female          1.000                             
#>     Age:Female      1.000

pls_predict(fit_int, benchmark = "acc")
#> PlsSemPredict object
#> Available fields: $Y, $X.cont, $X.cont.pred, $X.ord, $X.ord.pred, $benchmark
#> 
#> Y [714 x 4] (head)
#>   Survvd    Age Female Ag:Fml
#> 1 -0.368 -0.530 -0.759  0.495
#> 2  1.082  0.571  1.317  0.845
#> 3  0.876 -0.255  1.317 -0.242
#> 4  1.031  0.365  1.317  0.574
#> 5 -0.600  0.365 -0.759 -0.184
#> 7 -0.938  1.673 -0.759 -1.176
#> 
#> X.cont [714 x 3] (head)
#>   Survvd    Age Female
#> 1 -0.636 -0.530 -0.759
#> 2  0.947  0.571  1.317
#> 3  0.947 -0.255  1.317
#> 4  0.947  0.365  1.317
#> 5 -0.636  0.365 -0.759
#> 7 -0.636  1.673 -0.759
#> 
#> X.cont.pred [714 x 3] (head)
#>   Survvd    Age Female
#> 1 -0.368 -0.530 -0.759
#> 2  1.082  0.571  1.317
#> 3  0.876 -0.255  1.317
#> 4  1.031  0.365  1.317
#> 5 -0.600  0.365 -0.759
#> 7 -0.938  1.673 -0.759
#> 
#> X.ord [714 x 3] (head)
#>   Survvd    Age Female
#> 1  1.000 -0.530 -0.759
#> 2  2.000  0.571  1.317
#> 3  2.000 -0.255  1.317
#> 4  2.000  0.365  1.317
#> 5  1.000  0.365 -0.759
#> 7  1.000  1.673 -0.759
#> 
#> X.ord.pred [714 x 3] (head)
#>   Survvd    Age Female
#> 1  1.000 -0.530 -0.759
#> 2  2.000  0.571  1.317
#> 3  2.000 -0.255  1.317
#> 4  2.000  0.365  1.317
#> 5  1.000  0.365 -0.759
#> 7  1.000  1.673 -0.759
#> 
#> Benchmark summary
#> acc: n=1, mean=0.780, median=0.780, min=0.780, max=0.780
#>  variable value
#>  Survived 0.780