Power Analysis for Moderation, Mediation, and Moderated Mediation

2025-08-31

Introduction

This article is a brief illustration of how to use power4test() from the package power4mome to estimate the power of tests of mediation, moderation, and moderated mediation in a model fitted by structural equation modeling using lavaan.

Prerequisite

Basic knowledge about fitting models by lavaan is required. Readers are also expected to have basic knowledge of mediation, moderation, and/or moderated mediation.

Scope

This is a brief illustration. More complicated scenarios and other features of power4mome will be described in other vignettes.

Package

This introduction only needs the following package:

library(power4mome)

Workflow

Two functions are sufficient for estimating power given a model, population values, sample size, and the test to be used. This is the workflow:

  1. Specify the model syntax for the population model, in lavaan style, and set the population values of the model parameters.

  2. Call power4test() to examine the setup and the datasets generated. Repeat this and previous steps until the model is specified correctly.

  3. Call power4test() again, with the test to do specified.

  4. Call rejection_rates() to compute the power.
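In code, the workflow looks roughly like this. The sketch below condenses the mediation example developed later in this article (the object names out_check and out_test are arbitrary); every argument is explained in the sections that follow:

# 1. Population model (lavaan style) and population values
mod <- "
m ~ x
y ~ m + x
"
mod_es <- "
m ~ x: l
y ~ m: m
y ~ x: s
"

# 2. Check the setup with a few large datasets
out_check <- power4test(nrep = 2, model = mod, pop_es = mod_es,
                        n = 50000, iseed = 1234)

# 3. Rerun with the target sample size and the test to be done
out_test <- power4test(nrep = 400, model = mod, pop_es = mod_es,
                       n = 50, R = 2000, ci_type = "mc",
                       test_fun = test_indirect_effect,
                       test_args = list(x = "x", m = "m",
                                        y = "y", mc_ci = TRUE),
                       iseed = 1234)

# 4. Compute the power (rejection rate)
rejection_rates(out_test)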

Mediation

Let’s consider a simple mediation model. We would like to estimate the power of testing a mediation effect by Monte Carlo confidence interval.

Specify the Population Model

This is the model syntax:

mod <-
"
m ~ x
y ~ m + x
"

Note that, even if we are going to test mediation, moderation, or moderated mediation effects, we do not need to add any labels to this model. This will be taken care of by the test functions, through the use of the package manymome (Cheung & Cheung, 2024).

Specify The Population Values

There are two approaches to do this: supplying a named vector of values, or supplying a multiline string similar to the model syntax.

The second approach, the multiline string, is demonstrated below.

Suppose we want to estimate the power when:

  - the effect of x on m is large,
  - the effect of m on y is medium, and
  - the direct effect of x on y is small.

By default, power4mome uses this convention for regression paths and correlations [1]:

  - s: .10 (small)
  - m: .30 (medium)
  - l: .50 (large)

For a product term in moderation, the convention uses smaller values; for example, s denotes .05 and l denotes .15.

All these values are for the standardized solution (the correlations and so-called “betas”).

The following string denotes the desired values:

mod_es <-
"
m ~ x: l
y ~ m: m
y ~ x: s
"

Each line starts with a tag, which is the parameter written in lavaan syntax. The tag ends with a colon, :.

After the colon is the population value, which can be one of the labels described above (such as s, m, or l) or a value specified directly.

All other regression coefficients and covariances, if not specified in this string, are set to zero.
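If the labels do not give the desired values, the population values can also be supplied directly. A minimal sketch, assuming numeric values are accepted in the same string format as the labels (the object name mod_es_numeric is arbitrary), which would be equivalent to the labels above under the default convention:

# Hypothetical equivalent specification using numeric values instead of
# labels (s = .10, m = .30, l = .50 under the default convention):
mod_es_numeric <-
"
m ~ x: .50
y ~ m: .30
y ~ x: .10
"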

Call power4test() to Check the Model

out <- power4test(nrep = 2,
                  model = mod,
                  pop_es = mod_es,
                  n = 50000,
                  iseed = 1234)

These are the arguments used:

  - nrep: the number of replications (datasets to generate). Only 2 are used here because the goal at this stage is merely to check the model, not to estimate power.
  - model: the model syntax of the population model.
  - pop_es: the string setting the population values (effect sizes) of the model parameters.
  - n: the sample size of each generated dataset. A very large value (50000) is used so that the sample estimates will be close to the population values.
  - iseed: the seed for the random number generator, to make the results reproducible [2].

The population values can be shown by printing this object:

out
#> 
#> ====================== Model Information ======================
#> 
#> == Model on Factors/Variables ==
#> 
#> m ~ x
#> y ~ m + x
#> 
#> == Model on Variables/Indicators ==
#> 
#> m ~ x
#> y ~ m + x
#> 
#> ====== Population Values ======
#> 
#> Regressions:
#>                    Population
#>   m ~                        
#>     x                 0.500  
#>   y ~                        
#>     m                 0.300  
#>     x                 0.100  
#> 
#> Variances:
#>                    Population
#>    .m                 0.750  
#>    .y                 0.870  
#>     x                 1.000  
#> 
#> (Computing indirect effects for 2 paths ...)
#> 
#> == Population Conditional/Indirect Effect(s) ==
#> 
#> == Indirect Effect(s) ==
#> 
#>               ind
#> x -> m -> y 0.150
#> x -> y      0.100
#> 
#>  - The 'ind' column shows the indirect effect(s).
#>  
#> ======================= Data Information =======================
#> 
#> Number of Replications:  2 
#> Sample Sizes:  50000 
#> 
#> Call print with 'data_long = TRUE' for further information.
#> 
#> ==================== Extra Element(s) Found ====================
#> 
#> - fit
#> 
#> === Element(s) of the First Dataset ===
#> 
#> ============ <fit> ============
#> 
#> lavaan 0.6-19 ended normally after 1 iteration
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                         5
#> 
#>   Number of observations                         50000
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                                 0.000
#>   Degrees of freedom                                 0

By default, the population model will be fitted to each dataset, hence the section <fit>. Perfect fit is expected if the population model is a saturated model.

We check the section Population Values to see whether the values are what we expected.

If they are different from what we expect, check the string for pop_es to see whether we set the population values correctly.

If necessary, we can check the data generation by adding data_long = TRUE when printing the output:

print(out,
      data_long = TRUE)
#> 
#> ====================== Model Information ======================
#> 
#> == Model on Factors/Variables ==
#> 
#> m ~ x
#> y ~ m + x
#> 
#> == Model on Variables/Indicators ==
#> 
#> m ~ x
#> y ~ m + x
#> 
#> ====== Population Values ======
#> 
#> Regressions:
#>                    Population
#>   m ~                        
#>     x                 0.500  
#>   y ~                        
#>     m                 0.300  
#>     x                 0.100  
#> 
#> Variances:
#>                    Population
#>    .m                 0.750  
#>    .y                 0.870  
#>     x                 1.000  
#> 
#> (Computing indirect effects for 2 paths ...)
#> 
#> == Population Conditional/Indirect Effect(s) ==
#> 
#> == Indirect Effect(s) ==
#> 
#>               ind
#> x -> m -> y 0.150
#> x -> y      0.100
#> 
#>  - The 'ind' column shows the indirect effect(s).
#>  
#> ======================= Data Information =======================
#> 
#> Number of Replications:  2 
#> Sample Sizes:  50000 
#> 
#> ==== Descriptive Statistics ====
#> 
#>   vars     n mean sd skew kurtosis se
#> m    1 1e+05 0.00  1 0.01     0.03  0
#> y    2 1e+05 0.01  1 0.01     0.00  0
#> x    3 1e+05 0.00  1 0.01     0.01  0
#> 
#> ===== Parameter Estimates Based on All 2 Samples Combined =====
#> 
#> Total Sample Size: 100000 
#> 
#> ==== Standardized Estimates ====
#> 
#> Variances and error variances omitted.
#> 
#> Regressions:
#>                     est.std
#>   m ~                      
#>     x                 0.500
#>   y ~                      
#>     m                 0.295
#>     x                 0.102
#> 
#> 
#> ==================== Extra Element(s) Found ====================
#> 
#> - fit
#> 
#> === Element(s) of the First Dataset ===
#> 
#> ============ <fit> ============
#> 
#> lavaan 0.6-19 ended normally after 1 iteration
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                         5
#> 
#>   Number of observations                         50000
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                                 0.000
#>   Degrees of freedom                                 0

The section Descriptive Statistics, generated by psych::describe(), shows basic descriptive statistics for the observed variables. As expected, they have means close to zero and standard deviations close to one, because the datasets were generated using the standardized model.

The section Parameter Estimates Based on All 2 Samples Combined shows the parameter estimates when the population model is fitted to all the datasets combined. When the total sample size is large, these estimates should be close to the population values.

The results show that we have specified the population model correctly. We can proceed to specify the test and estimate the power.

Call power4test() to Do the Target Test

We can now do the simulation to estimate power. A large number of datasets (400 in this example) of the target sample size are generated, and the target test is then conducted on each of them.

Suppose we would like to estimate the power of using Monte Carlo confidence intervals to test the indirect effect from x to y through m, when the sample size is 50. This is the call:

out <- power4test(nrep = 400,
                  model = mod,
                  pop_es = mod_es,
                  n = 50,
                  R = 2000,
                  ci_type = "mc",
                  test_fun = test_indirect_effect,
                  test_args = list(x = "x",
                                   m = "m",
                                   y = "y",
                                   mc_ci = TRUE),
                  iseed = 1234,
                  parallel = TRUE)

These are the new arguments used:

  - nrep: 400 replications are used, because the goal now is to estimate power.
  - n: the target sample size, 50.
  - R and ci_type: R = 2000 Monte Carlo replications are used to form the confidence intervals, and ci_type = "mc" requests Monte Carlo confidence intervals [3].
  - test_fun: the function that conducts the target test, test_indirect_effect in this example [4].
  - test_args: a named list of arguments for the test function; here, the x, m, and y variables of the path, and mc_ci = TRUE to test the indirect effect by Monte Carlo confidence interval.
  - parallel: if TRUE, parallel processing is used to speed up the simulation [5].

For nrep = 400, the 95% confidence limits for a power of .80 are about .04 below and above .80. This should be precise enough for determining whether a sample size has sufficient power.
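As a quick check of this claim, the half-width of the normal-approximation 95% confidence interval for a power of .80 estimated from 400 replications can be computed directly:

# Half-width of the normal-approximation 95% CI for a power of .80
# estimated from nrep = 400 replications
qnorm(.975) * sqrt(.80 * (1 - .80) / 400)
#> [1] 0.03919928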

This is the default printout:

out
#> 
#> ====================== Model Information ======================
#> 
#> == Model on Factors/Variables ==
#> 
#> m ~ x
#> y ~ m + x
#> 
#> == Model on Variables/Indicators ==
#> 
#> m ~ x
#> y ~ m + x
#> 
#> ====== Population Values ======
#> 
#> Regressions:
#>                    Population
#>   m ~                        
#>     x                 0.500  
#>   y ~                        
#>     m                 0.300  
#>     x                 0.100  
#> 
#> Variances:
#>                    Population
#>    .m                 0.750  
#>    .y                 0.870  
#>     x                 1.000  
#> 
#> (Computing indirect effects for 2 paths ...)
#> 
#> == Population Conditional/Indirect Effect(s) ==
#> 
#> == Indirect Effect(s) ==
#> 
#>               ind
#> x -> m -> y 0.150
#> x -> y      0.100
#> 
#>  - The 'ind' column shows the indirect effect(s).
#>  
#> ======================= Data Information =======================
#> 
#> Number of Replications:  400 
#> Sample Sizes:  50 
#> 
#> Call print with 'data_long = TRUE' for further information.
#> 
#> ==================== Extra Element(s) Found ====================
#> 
#> - fit
#> - mc_out
#> 
#> === Element(s) of the First Dataset ===
#> 
#> ============ <fit> ============
#> 
#> lavaan 0.6-19 ended normally after 1 iteration
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                         5
#> 
#>   Number of observations                            50
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                                 0.000
#>   Degrees of freedom                                 0
#> 
#> =========== <mc_out> ===========
#> 
#> 
#> == A 'mc_out' class object ==
#> 
#> Number of Monte Carlo replications: 2000 
#> 
#> 
#> ====================== Test(s) Conducted ======================
#> 
#> - test_indirect: x->m->y
#> 
#> Call print() and set 'test_long = TRUE' for a detailed report.

If test_long = TRUE is added when printing the output by print(), a summary of the test will also be printed.

print(out,
      test_long = TRUE)

The summary of the test:

#>                                                       
#>   Test statistic                                 0.000
#>   Degrees of freedom                                 0
#> 
#> =========== <mc_out> ===========
#> 
#> 
#> == A 'mc_out' class object ==
#> 
#> Number of Monte Carlo replications: 2000

The mean of the estimates across all the replications is 0.152, close to the population value.

Compute the Power

The power estimate is simply the proportion of significant results, the rejection rate, because the null hypothesis is false. In addition to using test_long = TRUE in print(), the rejection rate can also be retrieved by rejection_rates().

out_power <- rejection_rates(out)
out_power
#> [test]: test_indirect: x->m->y 
#> [test_label]: Test 
#>     est   p.v reject r.cilo r.cihi
#> 1 0.152 1.000  0.468  0.419  0.516
#> Notes:
#> - p.v: The proportion of valid replications.
#> - est: The mean of the estimates in a test across replications.
#> - reject: The proportion of 'significant' replications, that is, the
#>   rejection rate. If the null hypothesis is true, this is the Type I
#>   error rate. If the null hypothesis is false, this is the power.
#> - r.cilo,r.cihi: The confidence interval of the rejection rate, based
#>   on normal approximation.
#> - Refer to the tests for the meanings of other columns.

In the example above, the estimated power of the test of the indirect effect, conducted by Monte Carlo confidence intervals, is 0.468, under the column reject.

p.v is the proportion of valid results across replications. A value of 1.000 means that the test was conducted successfully in all replications.

By default, the 95% confidence interval of the rejection rate (power), based on a normal approximation, is also printed, under the columns r.cilo and r.cihi. In this example, the 95% confidence interval is [0.419; 0.516].

Moderation

Let’s consider a moderation model, with some control variables.

Specify the Population Model and Values

mod2 <-
"
y ~ x + w + x:w + control
"

This model has only moderation, with the predictor x and the moderator w. The product term is specified in lavaan style as x:w.

It is unrealistic to specify the population values for all control variables individually. Therefore, we can just add a proxy variable, control, to represent the set of control variables that may be included.

This is the syntax for the population values:

mod2_es <-
"
.beta.: s
x ~~ control: s
y ~ control: s
y ~ x:w: l
"

This example introduces one useful tag: .beta. For a model with many paths, it is inconvenient to specify all of them manually. The tag .beta. sets the default value for all regression paths not specified explicitly, which is small (.10) in this example. If a path is specified explicitly (such as y ~ control and y ~ x:w), the specified value is used instead of the .beta. default.

This example also illustrates that we can set the population values for correlations (covariances in the standardized solution). Control variables are usually included because they may correlate with the predictors. Therefore, in this example, it is hypothesized that there is a small correlation between x and the proxy control variable (x ~~ control: s).

Last, recall from the section on population values above that the convention for a product term is different: l denotes .15 for product terms.
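For reference, under these conventions the shorthand above should be equivalent to spelling out every path explicitly. A sketch of such a fully explicit specification, matching the population values printed in the next section (the object name is arbitrary):

# Equivalent to mod2_es above, with every path written out explicitly
# (s = .10 for paths and correlations, l = .15 for the product term)
mod2_es_explicit <-
"
y ~ x: s
y ~ w: s
y ~ control: s
x ~~ control: s
y ~ x:w: l
"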

Call power4test() to Check the Model

We check the model first:

out2 <- power4test(nrep = 2,
                   model = mod2,
                   pop_es = mod2_es,
                   n = 50000,
                   iseed = 1234)
print(out2,
      data_long = TRUE)
#> 
#> ====================== Model Information ======================
#> 
#> == Model on Factors/Variables ==
#> 
#> y ~ x + w + x:w + control
#> 
#> == Model on Variables/Indicators ==
#> 
#> y ~ x + w + x:w + control
#> 
#> ====== Population Values ======
#> 
#> Regressions:
#>                    Population
#>   y ~                        
#>     x                 0.100  
#>     w                 0.100  
#>     x:w               0.150  
#>     control           0.100  
#> 
#> Covariances:
#>                    Population
#>   x ~~                       
#>     w                 0.000  
#>     x:w               0.000  
#>     control           0.100  
#>   w ~~                       
#>     x:w               0.000  
#>     control           0.000  
#>   x:w ~~                     
#>     control           0.000  
#> 
#> Variances:
#>                    Population
#>    .y                 0.945  
#>     x                 1.000  
#>     w                 1.000  
#>     x:w               1.000  
#>     control           1.000  
#> 
#> (Computing indirect effects for 1 paths ...)
#> 
#> (Computing conditional effects for 2 paths ...)
#> 
#> == Population Conditional/Indirect Effect(s) ==
#> 
#> == Effect(s) ==
#> 
#>                ind
#> control -> y 0.100
#> 
#>  - The 'ind' column shows the effect(s).
#>  
#> == Conditional effects ==
#> 
#>  Path: x -> y
#>  Conditional on moderator(s): w
#>  Moderator(s) represented by: w
#> 
#>       [w] (w)    ind
#> 1 M+1.0SD   1  0.250
#> 2 Mean      0  0.100
#> 3 M-1.0SD  -1 -0.050
#> 
#>  - The 'ind' column shows the conditional effects.
#>  
#> 
#> == Conditional effects ==
#> 
#>  Path: w -> y
#>  Conditional on moderator(s): x
#>  Moderator(s) represented by: x
#> 
#>       [x] (x)    ind
#> 1 M+1.0SD   1  0.250
#> 2 Mean      0  0.100
#> 3 M-1.0SD  -1 -0.050
#> 
#>  - The 'ind' column shows the conditional effects.
#>  
#> 
#> ======================= Data Information =======================
#> 
#> Number of Replications:  2 
#> Sample Sizes:  50000 
#> 
#> ==== Descriptive Statistics ====
#> 
#>         vars     n mean sd  skew kurtosis se
#> y          1 1e+05 0.00  1  0.02     0.01  0
#> x          2 1e+05 0.00  1  0.01     0.01  0
#> w          3 1e+05 0.00  1  0.00    -0.02  0
#> x:w        4 1e+05 0.00  1 -0.03     6.01  0
#> control    5 1e+05 0.01  1  0.00     0.00  0
#> 
#> ===== Parameter Estimates Based on All 2 Samples Combined =====
#> 
#> Total Sample Size: 100000 
#> 
#> ==== Standardized Estimates ====
#> 
#> Variances and error variances omitted.
#> 
#> Regressions:
#>                     est.std
#>   y ~                      
#>     x                 0.097
#>     w                 0.104
#>     x:w               0.152
#>     control           0.099
#> 
#> Covariances:
#>                     est.std
#>   x ~~                     
#>     w                -0.003
#>     x:w               0.002
#>     control           0.101
#>   w ~~                     
#>     x:w              -0.004
#>     control           0.001
#>   x:w ~~                   
#>     control           0.004
#> 
#> 
#> ==================== Extra Element(s) Found ====================
#> 
#> - fit
#> 
#> === Element(s) of the First Dataset ===
#> 
#> ============ <fit> ============
#> 
#> lavaan 0.6-19 ended normally after 1 iteration
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                         5
#> 
#>   Number of observations                         50000
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                                 0.000
#>   Degrees of freedom                                 0

The population values for the regression paths are what we specified, and the estimates based on the 2 × 50,000 = 100,000 combined cases support that the datasets were generated correctly.

NOTE: If a product term is involved, and the component terms (x and w in this example) are correlated, the population standard deviation of this product term may not be equal to one (Bohrnstedt & Goldberger, 1969). Therefore, the model can be specified correctly even if the standard deviations of product terms in the section Descriptive Statistics are not close to one.
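A small standalone illustration of this point, not using power4mome (it assumes the MASS package is available): for two standardized jointly normal variables with correlation rho, the variance of their product is 1 + rho^2, so its standard deviation exceeds one whenever they are correlated.

# SD of the product of two correlated standardized normal variables is
# sqrt(1 + rho^2), not 1 (cf. Bohrnstedt & Goldberger, 1969)
library(MASS)
set.seed(1234)
rho <- .40
xw <- mvrnorm(1e6, mu = c(0, 0),
              Sigma = matrix(c(1, rho, rho, 1), nrow = 2))
sd(xw[, 1] * xw[, 2])  # close to sqrt(1 + .40^2), about 1.077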

Call power4test() to Test The Moderation Effect

We can now do the simulation to estimate power. In this simple model, the test is just a test of the product term, x:w. This model can be fitted by linear regression using lm(). Let’s estimate the power when the sample size is 100 and the model is fitted by lm():

out2 <- power4test(nrep = 400,
                   model = mod2,
                   pop_es = mod2_es,
                   n = 100,
                   fit_model_args = list(fit_function = "lm"),
                   test_fun = test_moderation,
                   iseed = 1234,
                   parallel = TRUE)

These are the new arguments used:

  - fit_model_args: a list of arguments used when fitting the model; fit_function = "lm" requests fitting the model by linear regression using lm() instead of lavaan [6].
  - test_fun: test_moderation, which tests the moderation effect(s) in the model, here the product term x:w [7].

Compute the Power

We can use rejection_rates() again to estimate the power:

out2_power <- rejection_rates(out2)
out2_power
#> [test]: test_moderation: CIs  
#> [test_label]: y~x:w 
#>     est   p.v reject r.cilo r.cihi
#> 1 0.158 1.000  0.347  0.301  0.394
#> Notes:
#> - p.v: The proportion of valid replications.
#> - est: The mean of the estimates in a test across replications.
#> - reject: The proportion of 'significant' replications, that is, the
#>   rejection rate. If the null hypothesis is true, this is the Type I
#>   error rate. If the null hypothesis is false, this is the power.
#> - r.cilo,r.cihi: The confidence interval of the rejection rate, based
#>   on normal approximation.
#> - Refer to the tests for the meanings of other columns.

The estimated power of the test of the product term, x:w, is 0.347, with 95% confidence interval [0.301; 0.394].

Moderated Mediation

Let’s consider a moderated mediation model.

Specify the Population Model and Values

mod3 <-
"
m ~ x + w + x:w
y ~ m + x
"

This model is a mediation model with the a-path, m ~ x, moderated by w. As explained before, there is no need to use any labels or define any parameters; this will be handled by the test function to be used.

This is the syntax for the population values:

mod3_es <-
"
.beta.: s
m ~ x: m
y ~ m: m
m ~ x:w: s
"

Please refer to the previous section on setting up this syntax.

Call power4test() to Check the Model

We check the model first:

out3 <- power4test(nrep = 2,
                   model = mod3,
                   pop_es = mod3_es,
                   n = 50000,
                   iseed = 1234)
print(out3,
      data_long = TRUE)
#> 
#> ====================== Model Information ======================
#> 
#> == Model on Factors/Variables ==
#> 
#> m ~ x + w + x:w
#> y ~ m + x
#> 
#> == Model on Variables/Indicators ==
#> 
#> m ~ x + w + x:w
#> y ~ m + x
#> 
#> ====== Population Values ======
#> 
#> Regressions:
#>                    Population
#>   m ~                        
#>     x                 0.300  
#>     w                 0.100  
#>     x:w               0.050  
#>   y ~                        
#>     m                 0.300  
#>     x                 0.100  
#> 
#> Covariances:
#>                    Population
#>   x ~~                       
#>     w                 0.000  
#>     x:w               0.000  
#>   w ~~                       
#>     x:w               0.000  
#> 
#> Variances:
#>                    Population
#>    .m                 0.898  
#>    .y                 0.881  
#>     x                 1.000  
#>     w                 1.000  
#>     x:w               1.000  
#> 
#> (Computing indirect effects for 1 paths ...)
#> 
#> (Computing conditional effects for 2 paths ...)
#> 
#> == Population Conditional/Indirect Effect(s) ==
#> 
#> == Effect(s) ==
#> 
#>          ind
#> x -> y 0.100
#> 
#>  - The 'ind' column shows the effect(s).
#>  
#> == Conditional indirect effects ==
#> 
#>  Path: x -> m -> y
#>  Conditional on moderator(s): w
#>  Moderator(s) represented by: w
#> 
#>       [w] (w)   ind   m~x   y~m
#> 1 M+1.0SD   1 0.105 0.350 0.300
#> 2 Mean      0 0.090 0.300 0.300
#> 3 M-1.0SD  -1 0.075 0.250 0.300
#> 
#>  - The 'ind' column shows the conditional indirect effects.
#>  - 'm~x','y~m' is/are the path coefficient(s) along the path conditional
#>    on the moderator(s).
#> 
#> 
#> == Conditional indirect effects ==
#> 
#>  Path: w -> m -> y
#>  Conditional on moderator(s): x
#>  Moderator(s) represented by: x
#> 
#>       [x] (x)   ind   m~w   y~m
#> 1 M+1.0SD   1 0.045 0.150 0.300
#> 2 Mean      0 0.030 0.100 0.300
#> 3 M-1.0SD  -1 0.015 0.050 0.300
#> 
#>  - The 'ind' column shows the conditional indirect effects.
#>  - 'm~w','y~m' is/are the path coefficient(s) along the path conditional
#>    on the moderator(s).
#> 
#> 
#> ======================= Data Information =======================
#> 
#> Number of Replications:  2 
#> Sample Sizes:  50000 
#> 
#> ==== Descriptive Statistics ====
#> 
#>     vars     n mean sd skew kurtosis se
#> m      1 1e+05    0  1 0.03     0.03  0
#> y      2 1e+05    0  1 0.01    -0.01  0
#> x      3 1e+05    0  1 0.00    -0.02  0
#> w      4 1e+05    0  1 0.00     0.01  0
#> x:w    5 1e+05    0  1 0.04     5.92  0
#> 
#> ===== Parameter Estimates Based on All 2 Samples Combined =====
#> 
#> Total Sample Size: 100000 
#> 
#> ==== Standardized Estimates ====
#> 
#> Variances and error variances omitted.
#> 
#> Regressions:
#>                     est.std
#>   m ~                      
#>     x                 0.303
#>     w                 0.099
#>     x:w               0.052
#>   y ~                      
#>     m                 0.299
#>     x                 0.098
#> 
#> Covariances:
#>                     est.std
#>   x ~~                     
#>     w                 0.003
#>     x:w              -0.001
#>   w ~~                     
#>     x:w               0.008
#> 
#> 
#> ==================== Extra Element(s) Found ====================
#> 
#> - fit
#> 
#> === Element(s) of the First Dataset ===
#> 
#> ============ <fit> ============
#> 
#> lavaan 0.6-19 ended normally after 1 iteration
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                         7
#> 
#>   Number of observations                         50000
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                                 0.007
#>   Degrees of freedom                                 2
#>   P-value (Chi-square)                           0.997

The population values and the estimates based on the 2 × 50,000 = 100,000 combined cases are what we expect.

Call power4test() to Test The Moderated Mediation Effect

To estimate the power of the test of a moderated mediation effect, we can test the index of moderated mediation (Hayes, 2015). In this example, the index is the product of the coefficient of m ~ x:w and the coefficient of y ~ m. This can be done by the test function test_index_of_mome(), provided by power4mome. Again, a Monte Carlo confidence interval is used.
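In this population, the index can be computed directly from the values set earlier, which also helps explain the low power found below:

# Population index of moderated mediation: (m ~ x:w) * (y ~ m)
.05 * .30
#> [1] 0.015

This matches the mean estimate (est = 0.016) reported in the results below.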

Let’s estimate the power when sample size is 100.

out3 <- power4test(nrep = 400,
                   model = mod3,
                   pop_es = mod3_es,
                   n = 100,
                   R = 2000,
                   ci_type = "mc",
                   test_fun = test_index_of_mome,
                   test_args = list(x = "x",
                                    m = "m",
                                    y = "y",
                                    w = "w",
                                    mc_ci = TRUE),
                   iseed = 1234,
                   parallel = TRUE)

The call is similar to the one used in testing mediation.

This is the new argument used:

  - test_fun: test_index_of_mome, which tests the index of moderated mediation [8]. Note that test_args now also includes the moderator, w = "w".

Compute the Power

We can use rejection_rates() again to estimate the power:

out3_power <- rejection_rates(out3)
out3_power
#> [test]: test_index_of_mome: x->m->y, moderated by w 
#> [test_label]: Test 
#>     est   p.v reject r.cilo r.cihi
#> 1 0.016 1.000  0.055  0.033  0.077
#> Notes:
#> - p.v: The proportion of valid replications.
#> - est: The mean of the estimates in a test across replications.
#> - reject: The proportion of 'significant' replications, that is, the
#>   rejection rate. If the null hypothesis is true, this is the Type I
#>   error rate. If the null hypothesis is false, this is the power.
#> - r.cilo,r.cihi: The confidence interval of the rejection rate, based
#>   on normal approximation.
#> - Refer to the tests for the meanings of other columns.

The estimated power of the test of the moderated mediation effect, conducted by testing the index of moderated mediation, is 0.055, with 95% confidence interval [0.033; 0.077].

Unlike the previous example, in which moderation was tested by regression, estimating the power of Monte Carlo confidence intervals is substantially slower. However, this is necessary because Monte Carlo or nonparametric bootstrap confidence intervals are the methods usually used to test moderated mediation (and mediation).

Repeating a Simulation With A Different Sample Size

The function power4test() also supports redoing an analysis with a new sample size (or new population effect sizes, set by pop_es). Simply call power4test() again with the previous output as the first argument and supply the argument(s) to be changed.

For example, we can repeat the simulation for the test of moderation above, but for a sample size of 200. We simply call power4test() again, set the previous output (out2 in the example for moderation) as the first argument, and set n to a new value (200 in this example):

out2_new_n <- power4test(out2,
                         n = 200)
out2_new_n

This is the estimated power when the sample size is 200.

out2_new_n_reject <- rejection_rates(out2_new_n)
out2_new_n_reject
#> [test]: test_moderation: CIs  
#> [test_label]: y~x:w 
#>     est   p.v reject r.cilo r.cihi
#> 1 0.148 1.000  0.527  0.479  0.576
#> Notes:
#> - p.v: The proportion of valid replications.
#> - est: The mean of the estimates in a test across replications.
#> - reject: The proportion of 'significant' replications, that is, the
#>   rejection rate. If the null hypothesis is true, this is the Type I
#>   error rate. If the null hypothesis is false, this is the power.
#> - r.cilo,r.cihi: The confidence interval of the rejection rate, based
#>   on normal approximation.
#> - Refer to the tests for the meanings of other columns.

The estimated power is 0.527, 95% confidence interval [0.479; 0.576], when the sample size is 200.

This technique can be repeated to find the required sample size for a target power, and can be used for all the other scenarios covered above, such as mediation and moderated mediation.

Find the Sample Size With The Desired Power

There are several more efficient ways to find the sample size with the desired power.

Using n_region_from_power()

The function n_region_from_power() can be used to find the region of sample sizes likely to have the desired power. If the default settings are to be used, then it can be called directly on the output of power4test():

out2_region <- n_region_from_power(out2,
                                   seed = 2345)

This is the recommended way for sample size planning, when there is no predetermined range of sample sizes.

See the templates for examples on using n_region_from_power() for common models.

Using power4test_by_n()

The function power4test_by_n() can be used to estimate the power for a sequence of sample sizes. For example, we can estimate the power in the moderation model above for these sample sizes: 250, 300, 350, 400.

out2_several_ns <- power4test_by_n(out2,
                                   n = c(250, 300, 350, 400),
                                   by_seed = 4567)

The first argument is the output of power4test() for an arbitrary sample size.

The argument n is a numeric vector of sample sizes to examine.

The argument by_seed, if set to an integer, is used to make the results reproducible.

The call will take some time to run because it is equivalent to calling power4test() once for each sample size.

The rejection rates for each sample size can be retrieved by rejection_rates() too:

rejection_rates(out2_several_ns)
#> [test]: test_moderation: CIs  
#> [test_label]: y~x:w 
#>     n   est   p.v reject r.cilo r.cihi
#> 1 250 0.149 1.000  0.660  0.614  0.706
#> 2 300 0.150 1.000  0.733  0.689  0.776
#> 3 350 0.151 1.000  0.810  0.772  0.848
#> 4 400 0.150 1.000  0.850  0.815  0.885
#> Notes:
#> - n: The sample size in a trial.
#> - p.v: The proportion of valid replications.
#> - est: The mean of the estimates in a test across replications.
#> - reject: The proportion of 'significant' replications, that is, the
#>   rejection rate. If the null hypothesis is true, this is the Type I
#>   error rate. If the null hypothesis is false, this is the power.
#> - r.cilo,r.cihi: The confidence interval of the rejection rate, based
#>   on normal approximation.
#> - Refer to the tests for the meanings of other columns.

The results show that, to have a power of about .800 to detect the moderation effect, a sample size of about 350 is needed.

This approach is used when the range of sample sizes has already been decided and the levels of power are needed to determine the final sample size.

Please refer to the help page of power4test_by_n() for other examples.

Using x_from_power()

The function x_from_power() can be used to systematically search, within an interval, for the sample size with the target power. This takes longer to run but, instead of manually trying different sample sizes, this function does the search automatically.

This approach can be used when the goal is to find the probable minimum or maximum sample size with the desired level of power. The first method, n_region_from_power(), simply applies this search twice to find the region of sample sizes.

See this article for an illustration of how to use x_from_power().

Other Advanced Features

This brief illustration only covers the basic features of power4mome. Other advanced features will be covered in other articles. For example, instead of changing the sample size, the population effect sizes of an existing analysis can also be changed by supplying new values to pop_es, such as a named vector:

out2_new_xw <- power4test(out2,
                          pop_es = c("y ~ x:w" = ".30"))

Limitations

References

Bohrnstedt, G. W., & Goldberger, A. S. (1969). On the exact covariance of products of random variables. Journal of the American Statistical Association, 64(328), 1439–1442. https://doi.org/10.2307/2286081
Cheung, S. F., & Cheung, S.-H. (2024). Manymome: An R package for computing the indirect effects, conditional effects, and conditional indirect effects, standardized or unstandardized, and their bootstrap confidence intervals, in many (though not all) models. Behavior Research Methods, 56(5), 4862–4882. https://doi.org/10.3758/s13428-023-02224-z
Hayes, A. F. (2015). An index and test of linear moderated mediation. Multivariate Behavioral Research, 50(1), 1–22. https://doi.org/10.1080/00273171.2014.962683

  1. Users can specify the values directly if necessary.

  2. The functions used are parallel::clusterSetRNGStream() for parallel processing, and set.seed() for serial processing.

  3. They are implemented by manymome::do_mc() and manymome::do_boot(), respectively.

  4. The test is implemented by manymome::indirect().

  5. The number of cores is determined automatically but can be set directly by the ncores argument.

  6. See the help page of fit_model() on other available arguments.

  7. The test name has CIs in it but this is equivalent to using the t test when the model is fitted by lm().

  8. The test is implemented by manymome::test_index_of_mome().