Identification through group testing

> library(binGroup2)

We provide a few examples here showing how to use binGroup2 for the identification aspect of group testing. More in-depth examples are available in Bilder et al. (2023), which has been tentatively accepted by the R Journal at the time of writing this vignette. For further information on group testing, please see www.chrisbilder.com/grouptesting. This website also includes the most recent version of Bilder et al. (2023).

Operating characteristics

The opChar1() and opChar2() functions compute operating characteristics, such as the expected number of tests, for a group testing algorithm. Below is an example of how opChar1() can be used for two-stage hierarchical testing (also known as Dorfman testing) with a one-infection assay. In this example, we use an overall prevalence of 0.01, a sensitivity and specificity of 0.99 at each stage, and an initial group size of 10.

> # Group membership matrix
> group.member <- GroupMembershipMatrix(stage1 = 10)
> group.member
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]    1    1    1    1    1    1    1    1    1     1
#> [2,]    1    2    3    4    5    6    7    8    9    10
> 
> # Compute operating characteristics for a one-infection assay
> save1 <- opChar1(algorithm = "D2", p = 0.01, Se = 0.99, Sp = 0.99,
     hier.config = group.member, print.time = FALSE)
> names(save1)
#> [1] "algorithm" "prob"      "Se"        "Sp"        "Config"    "p.vec"    
#> [7] "ET"        "value"     "Accuracy"
> summary(save1)
#> 
#> Algorithm: Non-informative two-stage hierarchical testing 
#> 
#> Testing configuration:
#> Stage 1: 10
#> 
#> Expected number of tests: 2.04
#> Expected number of tests per individual: 0.2037
#> 
#> Accuracy for individuals:
#>      PSe    PSp   PPPV   PNPV Individuals
#> 1 0.9801 0.9991 0.9127 0.9998         All
#> 
#> Overall accuracy of the algorithm:
#>      PSe    PSp   PPPV   PNPV
#> 1 0.9801 0.9991 0.9127 0.9998
#> 
#> PSe denotes the pooling sensitivity.
#> PSp denotes the pooling specificity.
#> PPPV denotes the pooling positive predictive value.
#> PNPV denotes the pooling negative predictive value.
> ExpTests(save1)
#>   ExpTests ExpTestsPerIndividual PercentReductionTests PercentIncreaseTestCap
#> 1   2.0371                0.2037                 79.63                 390.90

The group membership matrix shows that each individual is tested in group 1 for the first stage. If this group tests positive, each individual is tested separately in the second stage. The opChar1() function returns its calculations in a list object that is then summarized by summary(). For example, the expected number of tests is 2.04 for this group. Additional functions, like ExpTests(), can access the information available in objects created by opChar1() as well.
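The figures in this output can be cross-checked by hand. The base R sketch below is not binGroup2 code; it uses the standard closed-form expressions for two-stage hierarchical testing, assuming test outcomes are independent given the true statuses. The function name dorfman_one() and its return layout are hypothetical and chosen only for illustration. With the inputs from this example it gives approximately 2.0371 expected tests and accuracy values that agree with the summary() output to four decimal places.

```r
# Base R sketch (not binGroup2 code): closed-form calculations for two-stage
# hierarchical (Dorfman) testing with a one-infection assay.
# Assumption: test outcomes are independent given the true statuses.
dorfman_one <- function(p, Se, Sp, I) {
  # Probability that the initial group of size I tests positive
  all_neg <- (1 - p)^I
  p_group_pos <- Se * (1 - all_neg) + (1 - Sp) * all_neg

  # One group test, plus I individual retests if the group tests positive
  ET <- 1 + I * p_group_pos

  # Pooling sensitivity: a positive individual must be detected at both stages
  PSe <- Se^2

  # Probability the group tests positive given that one specific member is negative
  others_neg <- (1 - p)^(I - 1)
  p_pos_given_neg <- Se * (1 - others_neg) + (1 - Sp) * others_neg

  # A negative individual is misclassified only if the group tests positive
  # and the individual retest is a false positive
  PSp <- 1 - p_pos_given_neg * (1 - Sp)

  # Predictive values from Bayes' rule
  PPPV <- p * PSe / (p * PSe + (1 - p) * (1 - PSp))
  PNPV <- (1 - p) * PSp / ((1 - p) * PSp + p * (1 - PSe))

  c(ET = ET, value = ET / I, PSe = PSe, PSp = PSp, PPPV = PPPV, PNPV = PNPV)
}

res <- dorfman_one(p = 0.01, Se = 0.99, Sp = 0.99, I = 10)
round(res, 4)

# Efficiency relative to individual testing (one test per individual)
pct_reduction <- 100 * (1 - res[["value"]])      # percent reduction in tests
pct_capacity  <- 100 * (1 / res[["value"]] - 1)  # percent increase in testing capacity
round(c(pct_reduction, pct_capacity), 2)
```

The last two quantities correspond to the PercentReductionTests and PercentIncreaseTestCap columns reported by ExpTests(); the formulas used here are inferred from that output rather than taken from the package source.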

The opChar2() function performs very similar calculations but for two-infection assays. Below is an example using two-stage hierarchical testing again. Because there are two infections, a vector of joint probabilities is specified in the form \((p_{--},p_{+-},p_{-+},p_{++})\), where \(p_{ab}\) is the joint probability that an individual has status \(a\) for the first infection and status \(b\) for the second, with \(+\) denoting positive and \(-\) denoting negative. We continue using a sensitivity and specificity of 0.99 for each infection at each stage and an initial group size of 10.

> # Compute operating characteristics for a two-infection assay
> save2 <- opChar2(algorithm = "D2", p.vec = c(0.95, 0.02, 0.02, 0.01),
     Se = c(0.99, 0.99), Sp = c(0.99, 0.99), hier.config = group.member,
     print.time = FALSE)
> names(save2)
#> [1] "algorithm" "prob.vec"  "Se"        "Sp"        "Config"    "p.mat"    
#> [7] "ET"        "value"     "Accuracy"
> summary(save2)
#> 
#> Algorithm: Non-informative two-stage hierarchical testing 
#> 
#> Testing configuration:
#> Stage 1: 10
#> 
#> Expected number of tests: 5.10
#> Expected number of tests per individual: 0.5104
#> 
#> Disease 1 accuracy for individuals:
#>      PSe    PSp   PPPV   PNPV Individuals
#> 1 0.9850 0.9961 0.8859 0.9995         All
#> 
#> Disease 2 accuracy for individuals:
#>      PSe    PSp   PPPV   PNPV Individuals
#> 1 0.9850 0.9961 0.8859 0.9995         All
#> 
#> Overall accuracy of the algorithm:
#>      PSe    PSp   PPPV   PNPV
#> 1 0.9850 0.9961 0.8859 0.9995
#> 2 0.9850 0.9961 0.8859 0.9995
#> 
#> PSe denotes the pooling sensitivity.
#> PSp denotes the pooling specificity.
#> PPPV denotes the pooling positive predictive value.
#> PNPV denotes the pooling negative predictive value.
> ExpTests(save2)
#>   ExpTests ExpTestsPerIndividual PercentReductionTests PercentIncreaseTestCap
#> 1   5.1042                0.5104                 48.96                  95.92

The expected number of tests is 5.10 for this testing configuration.
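To see where this value comes from, the following base R sketch computes the expected number of tests directly. It is not binGroup2 code. It assumes that the group is retested individually whenever it tests positive for at least one infection, and that the assay's results for the two infections are independent given the group's true status. The function name dorfman_two_ET() is hypothetical. Under these assumptions the sketch gives approximately 5.1042 for a group size of 10.

```r
# Base R sketch (not binGroup2 code): expected number of tests for two-stage
# hierarchical testing with a two-infection assay.
# Assumption: results for the two infections are independent given true status.
dorfman_two_ET <- function(p.vec, Se, Sp, I) {
  # p.vec = (p--, p+-, p-+, p++)
  none    <- p.vec[1]^I                 # no group member has either infection
  no_inf2 <- (p.vec[1] + p.vec[2])^I    # no group member has infection 2
  no_inf1 <- (p.vec[1] + p.vec[3])^I    # no group member has infection 1

  only1 <- no_inf2 - none               # infection 1 present, infection 2 absent
  only2 <- no_inf1 - none               # infection 2 present, infection 1 absent
  both  <- 1 - no_inf1 - no_inf2 + none # both infections present in the group

  # Probability that the group tests negative for both infections
  p_group_neg <- none * Sp[1] * Sp[2] +
    only1 * (1 - Se[1]) * Sp[2] +
    only2 * Sp[1] * (1 - Se[2]) +
    both * (1 - Se[1]) * (1 - Se[2])

  # One group test, plus I individual retests if the group is positive
  # for at least one infection
  1 + I * (1 - p_group_neg)
}

p.vec <- c(0.95, 0.02, 0.02, 0.01)

# Marginal prevalence of each infection implied by the joint probabilities
c(infection1 = p.vec[2] + p.vec[4], infection2 = p.vec[3] + p.vec[4])

ET10 <- dorfman_two_ET(p.vec, Se = c(0.99, 0.99), Sp = c(0.99, 0.99), I = 10)
round(c(ET = ET10, value = ET10 / 10), 4)
```

The expected number of tests is higher than in the one-infection example mainly because each infection has a marginal prevalence of 0.03, so a group of 10 is much more likely to contain at least one positive individual.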

Optimal testing configuration

The OTC1() and OTC2() functions find the optimal testing configuration (OTC) for a group testing algorithm. Below is an example of how OTC1() can be used for two-stage hierarchical testing with a one-infection assay. In this example, we again use an overall prevalence of 0.01 and a sensitivity and specificity of 0.99 at each stage. The search covers initial group sizes of 3 to 20 and uses the expected number of tests per individual as the objective function (obj.fn = "ET").

> # Find OTC for a one-infection assay
> save3 <- OTC1(algorithm = "D2", p = 0.01, Se = 0.99, Sp = 0.99, group.sz = 3:20,
     obj.fn = "ET", print.time = FALSE)
#> Initial Group Size = 3
#> Initial Group Size = 4
#> Initial Group Size = 5
#> Initial Group Size = 6
#> Initial Group Size = 7
#> Initial Group Size = 8
#> Initial Group Size = 9
#> Initial Group Size = 10
#> Initial Group Size = 11
#> Initial Group Size = 12
#> Initial Group Size = 13
#> Initial Group Size = 14
#> Initial Group Size = 15
#> Initial Group Size = 16
#> Initial Group Size = 17
#> Initial Group Size = 18
#> Initial Group Size = 19
#> Initial Group Size = 20
> names(save3)
#> [1] "algorithm" "prob"      "Se"        "Sp"        "opt.ET"    "Configs"  
#> [7] "group.sz"
> summary(save3)
#> 
#> Algorithm: Non-informative two-stage hierarchical testing 
#> 
#> Optimal testing configuration:
#>    Stage 1
#> ET      11
#> 
#> Expected number of tests:
#>    E(T)  Value
#> ET 2.24 0.2035
#> 
#> E(T) denotes the expected number of tests.
#> Value denotes the objective function value per individual.
#> 
#> Overall accuracy of the algorithm:
#>       PSe    PSp   PPPV   PNPV
#> ET 0.9801 0.9990 0.9052 0.9998
#> 
#> PSe denotes the pooling sensitivity.
#> PSp denotes the pooling specificity.
#> PPPV denotes the pooling positive predictive value.
#> PNPV denotes the pooling negative predictive value.
> Config(save3)
#>    I     ET  value    PSe    PSp   PPPV   PNPV
#> 1 11 2.2383 0.2035 0.9801 0.9990 0.9052 0.9998
#> 2 10 2.0371 0.2037 0.9801 0.9991 0.9127 0.9998
#> 3 12 2.4561 0.2047 0.9801 0.9989 0.8979 0.9998
#> 4  9 1.8528 0.2059 0.9801 0.9991 0.9203 0.9998
#> 5 13 2.6904 0.2070 0.9801 0.9988 0.8908 0.9998

The OTC has a group size of 11 with an expected number of tests per individual of 0.2035. This testing configuration is slightly more efficient than the group size of 10 used in our previous example, which has 0.2037 expected tests per individual. This can be seen by comparing the summary() outputs or from the results of Config(), which accesses information stored in save3.
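For this simple setting, the search can be mimicked with a few lines of base R. The sketch below is not the OTC1() implementation; it evaluates the closed-form expected number of tests for two-stage hierarchical testing at each candidate group size and picks the size with the fewest expected tests per individual. The helper name ET_dorfman() is hypothetical, and the formula assumes test outcomes are independent given the true statuses.

```r
# Base R sketch (not the OTC1() implementation): grid search over initial
# group sizes using the closed-form Dorfman expected number of tests.
ET_dorfman <- function(I, p, Se, Sp) {
  all_neg <- (1 - p)^I   # probability that all I group members are negative
  1 + I * (Se * (1 - all_neg) + (1 - Sp) * all_neg)
}

sizes <- 3:20
ET <- ET_dorfman(sizes, p = 0.01, Se = 0.99, Sp = 0.99)
value <- ET / sizes      # expected number of tests per individual

# Rank the candidate group sizes from most to least efficient
configs <- data.frame(I = sizes, ET = round(ET, 4), value = round(value, 4))
head(configs[order(value), ], 5)

# Group size that minimizes the expected number of tests per individual
best_size <- sizes[which.min(value)]
best_size
```

The ranking should agree with the I, ET, and value columns printed by Config(save3), with a group size of 11 ranked first.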

The OTC2() function works in much the same way as OTC1(), with the main difference being the specification of joint probabilities rather than a single-infection prevalence. In the example below, we search for the OTC over group sizes of 3 to 20 when using a two-stage hierarchical testing algorithm.

> # Find OTC for a two-infection assay
> save4 <- OTC2(algorithm = "D2", p.vec = c(0.95, 0.02, 0.02, 0.01),
     Se = c(0.99, 0.99), Sp = c(0.99, 0.99), group.sz = 3:20, obj.fn = "ET",
     print.time = FALSE)
#> Initial Group Size = 3
#> Initial Group Size = 4
#> Initial Group Size = 5
#> Initial Group Size = 6
#> Initial Group Size = 7
#> Initial Group Size = 8
#> Initial Group Size = 9
#> Initial Group Size = 10
#> Initial Group Size = 11
#> Initial Group Size = 12
#> Initial Group Size = 13
#> Initial Group Size = 14
#> Initial Group Size = 15
#> Initial Group Size = 16
#> Initial Group Size = 17
#> Initial Group Size = 18
#> Initial Group Size = 19
#> Initial Group Size = 20
> names(save4)
#> [1] "algorithm" "prob.vec"  "Se"        "Sp"        "opt.ET"    "Configs"  
#> [7] "group.sz"
> summary(save4)
#> 
#> Algorithm: Non-informative two-stage hierarchical testing 
#> 
#> Optimal testing configuration:
#>    Stage 1
#> ET       5
#> 
#> Expected number of tests:
#>    E(T)  Value
#> ET 2.20 0.4399
#> 
#> E(T) denotes the expected number of tests.
#> Value denotes the objective function value per individual.
#> 
#> Overall accuracy of the algorithm:
#>      PSe    PSp   PPPV   PNPV
#> 1 0.9842 0.9978 0.9336 0.9995
#> 2 0.9842 0.9978 0.9336 0.9995
#> 
#> PSe denotes the pooling sensitivity.
#> PSp denotes the pooling specificity.
#> PPPV denotes the pooling positive predictive value.
#> PNPV denotes the pooling negative predictive value.
> Config(save4)
#>   I     ET  value   PSe1   PSp1  PPPV1  PNPV1   PSe2   PSp2  PPPV2  PNPV2
#> 1 5 2.1996 0.4399 0.9842 0.9978 0.9336 0.9995 0.9842 0.9978 0.9336 0.9995
#> 2 6 2.6655 0.4443 0.9843 0.9974 0.9226 0.9995 0.9843 0.9974 0.9226 0.9995
#> 3 4 1.8012 0.4503 0.9840 0.9982 0.9454 0.9995 0.9840 0.9982 0.9454 0.9995
#> 4 7 3.1937 0.4562 0.9845 0.9971 0.9124 0.9995 0.9845 0.9971 0.9124 0.9995
#> 5 8 3.7791 0.4724 0.9847 0.9967 0.9029 0.9995 0.9847 0.9967 0.9029 0.9995

The OTC has a group size of 5 with an expected number of tests per individual of 0.4399. To determine how much more efficient the OTC is in comparison to our previous two-infection example’s use of a group size of 10, we use the CompareConfig() function.

> # Compare testing configurations
> group.member.OTC <- GroupMembershipMatrix(stage1 = 5)
> save5 <- opChar2(algorithm = "D2", p.vec = c(0.95, 0.02, 0.02, 0.01),
     Se = c(0.99, 0.99), Sp = c(0.99, 0.99), hier.config = group.member.OTC,
     print.time = FALSE)
> CompareConfig(save2, save5)
#> Testing configurations compared to individual testing: 
#>   ExpTests ExpTestsPerIndividual PercentReductionTests PercentIncreaseTestCap
#> 1   5.1042                0.5104                 48.96                  95.92
#> 2   2.1996                0.4399                 56.01                 127.31
#> 
#> Percent reduction in tests when using the second testing
#>      configuration rather than the first: 13.81 
#>  
#> Percent increase in testing capacity when using the second testing
#>      configuration rather than the first: 16.02
#> 

The OTC requires 13.81% fewer expected tests per individual than a group size of 10. Equivalently, a fixed number of tests can cover 16.02% more individuals.
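The two comparison percentages follow from the expected number of tests per individual under each configuration. The base R sketch below shows the arithmetic; the formulas are inferred from the printed output rather than taken from the package source, and the variable names are hypothetical. Because the inputs are the rounded values printed above, the results can differ from the CompareConfig() output in the last digit.

```r
# Expected number of tests per individual, from the rounded values printed above
value_I10 <- 5.1042 / 10   # group size 10
value_I5  <- 2.1996 / 5    # group size 5 (the OTC)

# Percent reduction in tests when using group size 5 rather than 10
pct_reduction <- 100 * (1 - value_I5 / value_I10)

# Percent increase in testing capacity: how many more individuals a fixed
# number of tests can cover with group size 5 rather than 10
pct_capacity <- 100 * (value_I10 / value_I5 - 1)

round(c(reduction = pct_reduction, capacity = pct_capacity), 2)
```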

This vignette focuses on a two-stage hierarchical testing algorithm in a homogeneous population. Many other group testing algorithms are available! We provide a large number of examples within the help for the package to illustrate these other algorithms. Our Bilder et al. (2023) paper also provides many examples. In particular, the appendix of the paper provides more advanced uses of functions in the package.