Title: | Twang Causal Mediation Modeling via Weighting |
---|---|
Description: | Provides functions for estimating natural direct and indirect effects for mediation analysis. It uses weighting where the weights are functions of estimates of the probability of exposure or treatment assignment (Hong, G (2010). <https://cepa.stanford.edu/sites/default/files/workshops/GH_JSM%20Proceedings%202010.pdf> Huber, M. (2014). <doi:10.1002/jae.2341>). Estimation of probabilities can use generalized boosting or logistic regression. Additional functions provide diagnostics of the model fit and weights. The vignette provides details and examples. |
Authors: | Dan McCaffrey [aut, cre], Katherine Castellano [aut], Donna Coffman [aut], Brian Vegetabile [aut], Megan Schuler [aut], Haoyu Zhou [aut] |
Maintainer: | Dan McCaffrey <[email protected]> |
License: | GPL-3 |
Version: | 1.2 |
Built: | 2025-03-13 04:31:42 UTC |
Source: | https://github.com/cran/twangMediation |
Compute the balance table for a mediation object.
bal.table.mediation(x, digits = 3, details = FALSE, plot = FALSE, ...)
x |
A |
digits |
Number of digits to round to. Default: 3 |
details |
logical. If |
plot |
logical. If |
... |
Additional arguments. |
res |
tables detailing covariate balance across exposure groups both before and after weighting |
print.bal.table.mediation, wgtmed
data("tMdat")
## tMdat is a small simulated data set included in twangMediation for
## demonstrating the functions. See ?tMdat for details
fit.es.max <- wgtmed(M ~ w1 + w2 + w3,
                     data = tMdat,
                     a_treatment = "A",
                     y_outcome = "Y",
                     total_effect_wts = tMdat$te.wgt,
                     method = "logistic")
bal.table.mediation(fit.es.max)
Calculate the actual effects
calculate_effects(w_11, w_00, w_10, w_01, y_outcome, sampw = NULL)
w_11 |
The Y(1, M(1)) weights |
w_00 |
The Y(0, M(0)) weights |
w_10 |
The Y(1, M(0)) weights |
w_01 |
The Y(0, M(1)) weights |
y_outcome |
The Y variable |
sampw |
Sampling weights, set to NULL by default. |
res |
The actual effects |
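The relationship between the four sets of weights and the reported effects can be sketched in base R. This is a minimal illustration, assuming each effect is a contrast of weighted outcome means; the helper `wmean`, the toy data, and the toy weights are hypothetical and are not taken from the package, whose internal computation may differ in detail.

```r
# Hypothetical helper: weighted mean of y using weights w.
# Observations outside the relevant exposure arm carry weight 0.
wmean <- function(y, w) sum(w * y) / sum(w)

# Toy outcome and exposure indicator (made up for illustration).
y <- c(2, 4, 6, 8, 10, 12)
a <- c(1, 1, 1, 0, 0, 0)

# Toy weights for each potential-outcome mean (made up for illustration).
w_11 <- a * c(1, 2, 1, 0, 0, 0)        # targets E(Y(1, M(1)))
w_10 <- a * c(2, 1, 1, 0, 0, 0)        # targets E(Y(1, M(0)))
w_00 <- (1 - a) * c(0, 0, 0, 1, 1, 2)  # targets E(Y(0, M(0)))
w_01 <- (1 - a) * c(0, 0, 0, 2, 1, 1)  # targets E(Y(0, M(1)))

e11 <- wmean(y, w_11)
e10 <- wmean(y, w_10)
e00 <- wmean(y, w_00)
e01 <- wmean(y, w_01)

# Each effect is a difference between two weighted means.
effects <- c(
  TE    = e11 - e00,  # total effect
  NDE_0 = e10 - e00,  # direct effect with the mediator at M(0)
  NIE_1 = e11 - e10,  # indirect effect with exposure fixed at 1
  NDE_1 = e11 - e01,  # direct effect with the mediator at M(1)
  NIE_0 = e01 - e00   # indirect effect with exposure fixed at 0
)
print(effects)
```

Under these definitions the total effect decomposes in two ways, TE = NDE_0 + NIE_1 and TE = NIE_0 + NDE_1, which gives a quick sanity check on any set of estimates.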
check_missing raises an error if the data contains NA or NaN values.
check_missing(x)
x |
numeric The data set to check for NA or NaN values. |
Indicator of the existence of NA or NaN values
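A check of this kind can be sketched in base R as follows. The functions `has_missing` and `stop_if_missing` are hypothetical stand-ins written for illustration, not the package's own code, and the error message is likewise an assumption.

```r
# Hypothetical stand-in for a missing-value check on a numeric vector.
has_missing <- function(x) {
  # anyNA() returns TRUE for both NA and NaN in numeric input
  anyNA(x)
}

# Hypothetical wrapper that stops with an error when values are missing.
stop_if_missing <- function(x) {
  if (has_missing(x)) {
    stop("data contain NA or NaN values")
  }
  invisible(FALSE)
}

has_missing(c(1, 2, 3))    # complete data
has_missing(c(1, NaN, 3))  # NaN counts as missing
```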
Describe the effects, and calculate standard errors and confidence intervals
desc.effects(x, ...)
x |
An object |
... |
list, optional Additional arguments. |
Effects, standard errors and confidence intervals of an object
desc.effects.mediation, wgtmed
data("tMdat")
## tMdat is a small simulated data set included in twangMediation for
## demonstrating the functions. See ?tMdat for details
fit.es.max <- wgtmed(M ~ w1 + w2 + w3,
                     data = tMdat,
                     a_treatment = "A",
                     y_outcome = "Y",
                     total_effect_wts = tMdat$te.wgt,
                     method = "ps",
                     ps_n.trees = 1500,
                     ps_shrinkage = 0.01,
                     ps_stop.method = c("es.max"))
desc.effects(fit.es.max)
Describe the effects, and calculate standard errors and confidence intervals from a mediation object
## S3 method for class 'mediation'
desc.effects(x, y_outcome = NULL, ...)
x |
A mediation object |
y_outcome |
The outcome; if |
... |
Additional arguments. |
results |
effects, standard errors, and confidence intervals of a mediation object |
data("tMdat")
## tMdat is a small simulated data set included in twangMediation for
## demonstrating the functions. See ?tMdat for details
fit.es.max <- wgtmed(M ~ w1 + w2 + w3,
                     data = tMdat,
                     a_treatment = "A",
                     y_outcome = "Y",
                     total_effect_wts = tMdat$te.wgt,
                     method = "ps",
                     ps_n.trees = 1500,
                     ps_shrinkage = 0.01,
                     ps_stop.method = c("es.max"))
desc.effects(fit.es.max)
dx.wts.mediation takes a ps object or a set of propensity scores and computes diagnostics assessing covariate balance.
dx.wts.mediation( x, data, estimand, vars = NULL, treat.var, x.as.weights = TRUE, sampw = NULL, perm.test.iters = 0 )
x |
A data frame, matrix, or vector of propensity score weights or a ps object. |
data |
A data frame. |
estimand |
The estimand of interest: either "ATT" or "ATE". |
vars |
A vector of character strings naming variables in |
treat.var |
A character string indicating which variable in |
x.as.weights |
|
sampw |
Optional sampling weights. If |
perm.test.iters |
A non-negative integer giving the number of iterations
of the permutation test for the KS statistic. If |
Creates a balance table that compares unweighted and weighted means and standard deviations, computes effect sizes, and KS statistics to assess the ability of the propensity scores to balance the treatment and control groups.
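The two balance measures named above, a standardized effect size and a KS statistic, can be sketched in base R. This is an illustration under stated assumptions rather than the package's implementation: the functions `w_mean`, `w_ecdf`, and `balance_stats` are hypothetical, the effect size is standardized by the unweighted standard deviation of the pooled sample, and the data are simulated.

```r
# Hypothetical helper: weighted mean.
w_mean <- function(x, w) sum(w * x) / sum(w)

# Hypothetical helper: weighted empirical CDF of x evaluated at points q.
w_ecdf <- function(x, w, q) {
  vapply(q, function(v) sum(w[x <= v]) / sum(w), numeric(1))
}

# Standardized effect size and KS statistic for one covariate.
balance_stats <- function(x, treat, w) {
  x1 <- x[treat == 1]; w1 <- w[treat == 1]
  x0 <- x[treat == 0]; w0 <- w[treat == 0]
  # Assumption: standardize by the unweighted SD of the pooled sample.
  es <- (w_mean(x1, w1) - w_mean(x0, w0)) / sd(x)
  # KS statistic: largest gap between the two weighted ECDFs.
  grid <- sort(unique(x))
  ks <- max(abs(w_ecdf(x1, w1, grid) - w_ecdf(x0, w0, grid)))
  c(std.eff.sz = es, ks = ks)
}

# Simulated covariate whose value influences treatment assignment.
set.seed(42)
n <- 500
x <- rnorm(n)
p <- plogis(0.8 * x)
treat <- rbinom(n, 1, p)
ipw <- ifelse(treat == 1, 1 / p, 1 / (1 - p))  # ATE weights from the true p

unweighted <- balance_stats(x, treat, rep(1, n))
weighted <- balance_stats(x, treat, ipw)
print(rbind(unweighted, weighted))
```

Comparing the unweighted and weighted rows is the same kind of before-and-after comparison that the balance table reports for each covariate.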
Returns a list containing
treat
The vector of 0/1 treatment assignment indicators.
wgtmed, bal.table.mediation, print.mediation, summary.mediation
data("tMdat")
## tMdat is a small simulated data set included in twangMediation for
## demonstrating the functions. See ?tMdat for details
fit.es.max <- wgtmed(M ~ w1 + w2 + w3,
                     data = tMdat,
                     a_treatment = "A",
                     y_outcome = "Y",
                     total_effect_wts = tMdat$te.wgt,
                     method = "ps",
                     ps_n.trees = 1500,
                     ps_shrinkage = 0.01,
                     ps_stop.method = c("es.max"))
## dx.wts.mediation is used internally by bal.table.mediation,
## print.mediation, and summary.mediation
summary(fit.es.max)
A dataset containing the substance use condition and sexual orientation of 40293 women respondents to the 2017 and 2018 National Survey on Drug Use and Health.
NSDUH_female
A data frame with 40293 rows and 24 variables:
individual smoked any cigarettes within the past month, yes or no
education level, 1 = less than high school diploma, 2 = high school diploma, 3 = some college/associates degree, 4 = college degree or higher
income level, 1 = less than $20,000, 2 = $20,000 - $49,999, 3 = $50,000 - $74,999, 4 = $75,000+
NSDUH sampling weight
NSDUH strata variable
NSDUH replicate within stratum
employment status, 1 = full-time employment, 2 = part-time employment, 3 = student, 4 = unemployed, 5 = other
1 = non-Hispanic white, 2 = non-Hispanic Black, 3 = student, 4 = multiracial/other race
initiated alcohol use prior to 15 years old
initiated smoking prior to 15 years old, yes or no
1 = lesbian, gay or bisexual, 0 = heterosexual
individual meets criteria for either past-year alcohol use disorder or nicotine dependence
NSDUH sampling weights (scaled for pooling 2017 and 2018 survey years)
age, 1 = 18-25, 2 = 26-34, 3 = 35-49, 4 = 50+
NSDUH_female |
A sample dataset for demonstration |
https://nsduhweb.rti.org/respweb/homepage.cfm
## Not run:
data(NSDUH_female)
## End(Not run)
Plot the mediation object.
## S3 method for class 'mediation'
plot(x, subset = NULL, color = TRUE, ...)
x |
weighted_mediation object |
subset |
Used to restrict which of the |
color |
If |
... |
Additional arguments. |
Distribution plots of NIE1 (distribution of mediator for treatment sample weighted to match distribution of mediator under control for the population) and NIE0 (distribution of mediator for control sample weighted to match distribution of mediator under treatment for the population) for each mediator. For continuous mediators, distributions are plotted with density curves, and for categorical (factor) mediators, distributions are plotted with barplots.
wgtmed for function input
data("tMdat")
## tMdat is a small simulated data set included in twangMediation for
## demonstrating the functions. See ?tMdat for details
fit.es.max <- wgtmed(M ~ w1 + w2 + w3,
                     data = tMdat,
                     a_treatment = "A",
                     y_outcome = "Y",
                     total_effect_wts = tMdat$te.wgt,
                     method = "ps",
                     ps_n.trees = 1500,
                     ps_shrinkage = 0.01,
                     ps_stop.method = c("es.max"))
plot(fit.es.max)
Default print statement for mediation class
## S3 method for class 'bal.table.mediation'
print(x, ...)
x |
A |
... |
Additional arguments. |
Default print statement.
data("tMdat")
## tMdat is a small simulated data set included in twangMediation for
## demonstrating the functions. See ?tMdat for details
fit.es.max <- wgtmed(M ~ w1 + w2 + w3,
                     data = tMdat,
                     a_treatment = "A",
                     y_outcome = "Y",
                     total_effect_wts = tMdat$te.wgt,
                     method = "ps",
                     ps_n.trees = 1500,
                     ps_shrinkage = 0.01,
                     ps_stop.method = c("es.max"))
bal.table.mediation(fit.es.max)
Default print statement for mediation class
## S3 method for class 'mediation'
print(x, ...)
x |
A |
... |
Additional arguments. |
Default print statement.
wgtmed for input.
data("tMdat")
## tMdat is a small simulated data set included in twangMediation for
## demonstrating the functions. See ?tMdat for details
fit.es.max <- wgtmed(M ~ w1 + w2 + w3,
                     data = tMdat,
                     a_treatment = "A",
                     y_outcome = "Y",
                     total_effect_wts = tMdat$te.wgt,
                     method = "ps",
                     ps_n.trees = 1500,
                     ps_shrinkage = 0.01,
                     ps_stop.method = c("es.max"))
print(fit.es.max)
Displays a useful description of a mediation object.
## S3 method for class 'mediation'
summary(object, ...)
object |
A |
... |
Additional arguments. |
ps_tables |
Table of observations' propensity scores |
mediator_distribution_check |
balance tables for NIE_1 and NIE_0 |
data("tMdat")
## tMdat is a small simulated data set included in twangMediation for
## demonstrating the functions. See ?tMdat for details
fit.es.max <- wgtmed(M ~ w1 + w2 + w3,
                     data = tMdat,
                     a_treatment = "A",
                     y_outcome = "Y",
                     total_effect_wts = tMdat$te.wgt,
                     method = "ps",
                     ps_n.trees = 1500,
                     ps_shrinkage = 0.01,
                     ps_stop.method = c("es.max"))
summary(fit.es.max)
Call this in the wgtmed() function and the bal.table.mediation() function.
swapTxCtrl(dd)
dd |
numeric An element of a desc object from a ps or dx.wts object |
A desc object with swapped treatment and control
A simulated dataset for demonstrating the functions in the twangMediation package.
tMdat
A data frame with 500 rows and 7 variables:
Simulated continuous covariate
Simulated continuous covariate
Simulated continuous covariate
Simulated dichotomous exposure indicator
Simulated continuous outcome
Simulated mediator that has 11 unique values
Estimated inverse probability weight, estimated using GBM via the twang ps function
tMdat |
A sample of simulated data for demonstration |
## Not run:
data(tMdat)
## End(Not run)
weighted_mean calculates a weighted mean, given a vector.
weighted_mean(x, weights, multiplier = NULL, na.rm = TRUE)
x |
numeric The data set |
weights |
numeric The weights |
multiplier |
An additional vector to multiply
Default : |
na.rm |
Whether to remove NA values.
Default: |
numeric The weighted mean of the data.
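A weighted mean with this argument list can be sketched in base R. The function `weighted_mean_sketch` below is a hypothetical re-implementation for illustration only. In particular, it assumes that `multiplier` scales the weights elementwise and that `na.rm` drops observations with a missing value or weight; neither behavior is confirmed by the text above.

```r
# Hypothetical re-implementation for illustration; not the package's code.
weighted_mean_sketch <- function(x, weights, multiplier = NULL, na.rm = TRUE) {
  # Assumption: the multiplier scales the weights elementwise.
  if (!is.null(multiplier)) {
    weights <- weights * multiplier
  }
  # Assumption: na.rm drops observations with a missing value or weight.
  if (na.rm) {
    keep <- !is.na(x) & !is.na(weights)
    x <- x[keep]
    weights <- weights[keep]
  }
  sum(x * weights) / sum(weights)
}

# The missing third value is dropped: (1*1 + 2*1 + 4*2) / (1 + 1 + 2)
weighted_mean_sketch(c(1, 2, NA, 4), weights = c(1, 1, 1, 2))

# A 0/1 multiplier restricts the mean to a subset of observations.
weighted_mean_sketch(c(1, 2, NA, 4), weights = c(1, 1, 1, 2),
                     multiplier = c(1, 0, 1, 1))
```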
Estimate causal mediation mechanism of a treatment using propensity score weighting.
wgtmed(
  formula.med,
  data,
  a_treatment,
  y_outcome = NULL,
  med_interact = NULL,
  total_effect_wts = NULL,
  total_effect_ps = NULL,
  total_effect_stop_rule = NULL,
  method = "ps",
  sampw = NULL,
  ps_n.trees = 10000,
  ps_interaction.depth = 3,
  ps_shrinkage = 0.01,
  ps_bag.fraction = 1,
  ps_n.minobsinnode = 10,
  ps_perm.test.iters = 0,
  ps_verbose = FALSE,
  ps_stop.method = c("ks.mean", "ks.max"),
  ps_version = "gbm",
  ps_ks.exact = NULL,
  ps_n.keep = 1,
  ps_n.grid = 25,
  ps_cv.folds = 10,
  ps_keep.data = FALSE
)
formula.med |
An object of class formula relating the mediator(s) to the covariates (potential confounding variables). |
data |
A dataset of class data.frame that includes the treatment indicator, mediator(s), and covariates. |
a_treatment |
The (character) name of the treatment variable, which must be dichotomous (0, 1). |
y_outcome |
The (character) name of the outcome variable, y. If this is not provided, then
no effects will be calculated and a warning will be raised. Default : |
med_interact |
The (character) vector of names of variables specified on the right-hand side of formula.med that consist of crossproducts or interactions between a covariate and the mediator. See the tutorial for details on these variables. |
total_effect_wts |
A vector of total effect weights, which if left |
total_effect_ps |
A ps object that contains the total effect weights, |
total_effect_stop_rule |
The stopping rule ( |
method |
The method for getting weights ("ps", "logistic", or "crossval"). Default : |
sampw |
Optional sampling weights Default : |
ps_n.trees |
Number of gbm iterations passed on to gbm. Default: 10000. |
ps_interaction.depth |
A positive integer denoting the tree depth used in gradient boosting. Default: 3. |
ps_shrinkage |
A numeric value between 0 and 1 denoting the learning rate. See gbm for more details. Default: 0.01. |
ps_bag.fraction |
A numeric value between 0 and 1 denoting the fraction of the observations randomly selected in each iteration of the gradient boosting algorithm to propose the next tree. See gbm for more details. Default: 1.0. |
ps_n.minobsinnode |
An integer specifying the minimum number of observations in the terminal nodes of the trees used in the gradient boosting. See gbm for more details. Default: 10. |
ps_perm.test.iters |
A non-negative integer giving the number of iterations
of the permutation test for the KS statistic. If |
ps_verbose |
If |
ps_stop.method |
A method or methods of measuring and summarizing balance across pretreatment
variables. Current options are |
ps_version |
"gbm", "xgboost", or "legacy", indicating which version of the twang package to use. |
ps_ks.exact |
|
ps_n.keep |
A numeric variable indicating the algorithm should only
consider every |
ps_n.grid |
A numeric variable that sets the grid size for an initial
search of the region most likely to minimize the |
ps_cv.folds |
A numeric variable that sets the number of cross-validation folds if using method='crossval'. Default: 10. |
ps_keep.data |
A logical variable that determines if the dataset should be saved
in the resulting |
For users comfortable with ps, any options prefaced with ps_ are passed directly to the ps() function.
Model A is used to estimate Pr(A=1 | X) where X is the vector of background covariates specified in formula.med. If method equals "ps", model A is fit using the twang ps function with estimand = "ATE". If method equals "logistic", then model A is fit using logistic regression. If method equals "crossval", then gbm using cross-validation is used to estimate model A. Because X might include variables not used to estimate the user-provided total effect weights, model A is fit rather than using the user-provided total effect weights to derive Pr(A | X). If the user uses the same set of variables to estimate their provided total effect weights as they enter in the wgtmed function to estimate the cross-world weights and the user uses the same estimation method and arguments as specified in the wgtmed function, then the estimated model A will match the model the user used to obtain the provided total effect weights.
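For the logistic regression case, the role of model A can be sketched with base R alone. This is a minimal illustration under stated assumptions, not the package's internal code: the data are simulated, the names `dat`, `model_a`, `p_a`, and `ipw` are hypothetical, and only the standard ATE inverse probability weights are shown. The cross-world weights, which also involve the mediator models, are not reproduced here.

```r
# Simulated stand-in data (hypothetical; not the package's tMdat).
set.seed(1)
n <- 200
dat <- data.frame(w1 = rnorm(n), w2 = rnorm(n))
dat$A <- rbinom(n, 1, plogis(0.5 * dat$w1 - 0.25 * dat$w2))

# Model A: logistic regression of exposure on the background covariates.
model_a <- glm(A ~ w1 + w2, family = binomial(), data = dat)

# Estimated Pr(A = 1 | X) for each observation.
p_a <- fitted(model_a)

# Standard ATE inverse probability weights built from those estimates:
# 1 / p for exposed observations and 1 / (1 - p) for unexposed ones.
ipw <- ifelse(dat$A == 1, 1 / p_a, 1 / (1 - p_a))

summary(ipw)
```

Very large weights in a summary like this one signal estimated probabilities close to 0 or 1, which is one reason the balance and weight diagnostics are worth checking after fitting.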
mediation object
The mediation object includes the following:
model_a
The model A ps() results.
model_m1
The model M1 ps() results.
model_m0
The model M0 ps() results.
data
The data set used to compute models
stopping_methods
The stopping methods passed to stop.method.
datestamp
The date when the analysis was run.
For each stop.method, a list with the following:
TE
The total effect.
NDE_0
The natural direct effect, holding the mediator constant at 0.
NIE_1
The natural indirect effect, holding the exposure constant at 1.
NDE_1
The natural direct effect, holding the mediator constant at 1.
NIE_0
The natural indirect effect, holding the exposure constant at 0.
expected_treatment0_mediator0
E(Y(0, M(0)))
expected_treatment1_mediator1
E(Y(1, M(1)))
expected_treatment1_mediator0
E(Y(1, M(0)))
expected_treatment0_mediator1
E(Y(0, M(1)))
dx.wts
A list with information for checking covariate balance for each estimated effect. Elements are TE, NIE1, NDE0, NIE0, NDE1, with results of twang dx.wts for the covariates when weighted by the weights used in estimating the effect.
data("tMdat")
## tMdat is a small simulated data set included in twangMediation for
## demonstrating the functions. See ?tMdat for details
head(tMdat)
## The tMdat data contains the following variables:
##   w1, w2, w3 -- Simulated covariates
##   A -- Simulated dichotomous exposure indicator
##   M -- Simulated discrete mediator (11 values)
##   Y -- Simulated continuous outcome
##   te.wgt -- Estimated inverse probability weight, estimated using
##             GBM via the twang ps function
fit.es.max <- wgtmed(M ~ w1 + w2 + w3,
                     data = tMdat,
                     a_treatment = "A",
                     y_outcome = "Y",
                     total_effect_wts = tMdat$te.wgt,
                     method = "ps",
                     ps_n.trees = 1500,
                     ps_shrinkage = 0.01,
                     ps_stop.method = c("es.max"))
fit.es.max