Estimating the Capacity for Improvement in Diagnostic Risk Prediction with an Additional Marker Based on the Diagnostic Likelihood Ratio (DLR)

This function estimates the log diagnostic likelihood ratio using a regression modelling approach. It can be used to assess the gain in diagnostic accuracy from a new binary or continuous diagnostic marker relative to established markers, to determine the impact of covariates on the risk prediction model, and to estimate the DLR for selected marker and covariate values.

Usage

DLR(basemodel, augmentedmodel, diseasestatus, dataset, clustervar = NULL, alpha = 0.05)

Arguments

basemodel: pre-test (base) model X, given as a formula character string.
augmentedmodel: post-test (augmented) model V, given as a formula character string. This is usually the base model X plus the additional diagnostic test of interest Y and the XY interactions.
diseasestatus: name of the variable containing disease status, given as a character string. The variable is assumed to be coded 0/1: condition of interest present (1) or absent (0).
dataset: data frame in wide format, with one observation per subject.
clustervar: optional name of the cluster variable in dataset, given as a character string.
alpha: significance level used for confidence intervals; the default is 0.05.
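
The role of alpha can be illustrated with a small base-R sketch. It assumes, judging from the printed example output rather than from the package source, that the reported limits are two-sided Wald-type intervals built from the robust standard error. The helper name `wald_ci` is hypothetical and not part of DTComPair.

```r
# Hypothetical helper (not part of DTComPair): two-sided Wald interval
# for an estimate with a given standard error and significance level.
wald_ci <- function(estimate, se, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)  # standard normal quantile for the chosen alpha
  c(lower = estimate - z * se, upper = estimate + z * se)
}

# Intercept and robust SE of $logDLRModel, copied from the first example
# in the Examples section (test y1 conditioned on the null model)
ci <- wald_ci(-1.734113, 0.1346088, alpha = 0.05)
print(ci)

# Exponentiating moves the limits from the log DLR scale to the DLR scale
print(exp(ci))
```

Under that assumption, the two printed intervals should agree, up to rounding, with the lowerCI/upperCI columns reported for the intercept in $logDLRModel and for DLR(Y=0|X=1) in $DLR.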

Details

This function implements the algorithm described in the appendix of Gu and Pepe (2009), using the GEE approach to obtain standard error estimates. The I and Zero matrices are defined slightly more flexibly than those described in section 3, so that models without interaction terms are also supported.
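
For orientation, the quantity being modelled can be written out explicitly. The identity below is the standard odds form of Bayes' theorem and is given here only as a reading aid; it is not a restatement of the estimation algorithm in the paper.

```latex
\begin{align*}
\mathrm{DLR}(Y \mid X)
  &= \frac{P(Y \mid D=1, X)}{P(Y \mid D=0, X)}
   = \frac{P(D=1 \mid X, Y) \,/\, P(D=0 \mid X, Y)}{P(D=1 \mid X) \,/\, P(D=0 \mid X)}, \\[4pt]
\log \mathrm{DLR}(Y \mid X)
  &= \operatorname{logit} P(D=1 \mid X, Y) \;-\; \operatorname{logit} P(D=1 \mid X).
\end{align*}
```

The second line is why the log DLR model can be obtained as the difference between the post-test and pre-test logistic models.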

Value

Returns a list including

logPreTestModel: logistic regression model output for the prior (pre-test) disease probability using base model X: P(D=1|X). All estimates are on a log scale.
logPostTestModel: logistic regression model output for the posterior (post-test) disease probability using augmented model V: P(D=1|X,Y), i.e. P(D=1|V). All estimates are on a log scale.
logDLRModel: regression model output for the log DLR, defined as the difference between logPostTestModel and logPreTestModel. All estimates are on a log scale.
DLR: positive/negative DLR for diagnostic marker Y, with all base covariates X set to 1. Results are only sensible for a binary marker Y taking values 0/1.
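
How the DLR component relates to logDLRModel can be checked by hand. The base-R sketch below copies the logDLRModel coefficients from the second example in the Examples section and exponentiates the linear predictor with the base covariate y2 set to 1. The helper `dlr_at` is hypothetical and not part of DTComPair, and it reproduces point estimates only, since the confidence limits also depend on the covariance of the coefficients.

```r
# Coefficients of $logDLRModel for the model pair "~ y2" / "~ y2 * y1",
# copied from the Examples section
coefs <- c(intercept = -0.9406315, y2 = -0.9209591,
           y1 = 1.9641061, y2.y1 = 0.3670978)

# Hypothetical helper (not part of DTComPair): DLR at given covariate values
dlr_at <- function(coefs, y2, y1) {
  x <- c(1, y2, y1, y2 * y1)  # design row: intercept, y2, y1, interaction
  exp(sum(coefs * x))         # back-transform from the log DLR scale
}

dlr_pos <- dlr_at(coefs, y2 = 1, y1 = 1)  # corresponds to DLR(Y=1|X=1)
dlr_neg <- dlr_at(coefs, y2 = 1, y1 = 0)  # corresponds to DLR(Y=0|X=1)
print(c(positive = dlr_pos, negative = dlr_neg))
```

The two values should match the DLR column of $DLR in that example up to rounding. Passing other values of y2 or y1 to the helper gives the DLR for other marker/covariate combinations.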

References

Gu, W. and Pepe, M. S. (2009). Estimating the capacity for improvement in risk prediction with a marker. Biostatistics, 10(1):172-186.

Author

Thomas Hielscher (t.hielscher@dkfz.de)

Examples

library(DTComPair)
data(Paired1)

# test y1 conditioned on null model: DLR+(Y1=1) and DLR-(Y1=0)

DLR("~ 1", "~ y1", "d", Paired1)
#> Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
#> running glm to get initial regression estimate
#> x.intercept v.intercept        v.y1 
#>   0.5469469  -1.1871657   2.7402852 
#> $logPreTestModel
#>              Estimate Naive S.E.  Naive z Robust S.E. Robust z
#> x.intercept 0.5469469 0.07785552 7.025153  0.07777347 7.032564
#> 
#> $logPostTestModel
#>              Estimate Naive S.E.   Naive z Robust S.E.  Robust z
#> v.intercept -1.187166  0.1556254 -7.628355   0.1554614 -7.636403
#> v.y1         2.740285  0.1966554 13.934448   0.1964482 13.949150
#> 
#> $logDLRModel
#>               coefs  robustSE     Zstat       pvalue   lowerCI   upperCI
#> intercept -1.734113 0.1346088 -12.88261 5.639448e-38 -1.997941 -1.470284
#> y1         2.740285 0.1964482  13.94915 3.183780e-44  2.355254  3.125317
#> 
#> $DLR
#>                    DLR   lowerCI   upperCI
#> DLR(Y=1|X=1) 2.7351124 2.2860079 3.2724472
#> DLR(Y=0|X=1) 0.1765568 0.1356142 0.2298601
#> 

# test y1 conditioned on test y2 with interaction: DLR+(Y1=1|Y2=1) and DLR-(Y1=0|Y2=1)

DLR("~ y2", "~ y2 * y1", "d", Paired1)
#> Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
#> running glm to get initial regression estimate
#> x.intercept        x.y2 v.intercept        v.y2        v.y1     v.y2.y1 
#>  -0.6370577   2.4986483  -1.5776892   1.5776892   1.9641061   0.3670976 
#> $logPreTestModel
#>               Estimate Naive S.E.   Naive z Robust S.E.  Robust z
#> x.intercept -0.6370577  0.1181415 -5.392329   0.1178923 -5.403725
#> x.y2         2.4986483  0.1893413 13.196530   0.1889420 13.224420
#> 
#> $logPostTestModel
#>               Estimate Naive S.E.    Naive z Robust S.E.   Robust z
#> v.intercept -1.5776892  0.1945794 -8.1082030   0.1941690 -8.1253390
#> v.y2         1.5776892  0.3593813  4.3900152   0.3586233  4.3992931
#> v.y1         1.9641061  0.2639766  7.4404546   0.2634199  7.4561794
#> v.y2.y1      0.3670978  0.4433058  0.8280915   0.4423709  0.8298416
#> 
#> $logDLRModel
#>                coefs  robustSE      Zstat       pvalue    lowerCI    upperCI
#> intercept -0.9406315 0.1542822 -6.0968227 1.081974e-09 -1.2430191 -0.6382439
#> y2        -0.9209591 0.3048141 -3.0213796 2.516257e-03 -1.5183837 -0.3235344
#> y1         1.9641061 0.2634199  7.4561794 8.906744e-14  1.4478126  2.4803997
#> y2.y1      0.3670978 0.4423709  0.8298416 4.066283e-01 -0.4999332  1.2341288
#> 
#> $DLR
#>                    DLR    lowerCI   upperCI
#> DLR(Y=1|X=1) 1.5993757 1.27265876 2.0099673
#> DLR(Y=0|X=1) 0.1554252 0.09284387 0.2601895
#>