To give a first impression of how DiscreteFDR works, we consider an artificial toy example. A more realistic example involving pharmacovigilance data is given in Section 2.
Suppose we would like to compare two treatments in nine different populations. For each population we do this by evaluating the responders and non-responders for each treatment. This leads to categorical data which can be represented, for each population i = 1, …, 9 in the following 2 × 2 table:
Responders | Non-responders | ||
---|---|---|---|
Treatment 1 | x1i | y1i | n1i |
Treatment 2 | x2i | y2i | n2i |
Total | x1i + x2i | y1i + y2i | n = n1i + n2i |
Denoting the responder probabilities for population i by π1i and π2i we can test e.g.
H0i : π1i = π2i vs. H1i : π1i ≠ π2i
by using Fisher’s (two-sided) exact test (see Lehmann and Romano
(2006)), which is implemented in the R function
fisher.test
. Suppose the data in the nine populations are
independent and we observe the following data frame df
library(knitr)
X1 <- c(4, 2, 2, 14, 6, 9, 4, 0, 1)
X2 <- c(0, 0, 1, 3, 2, 1, 2, 2, 2)
N1 <- rep(148, 9)
N2 <- rep(132, 9)
Y1 <- N1 - X1
Y2 <- N2 - X2
df <- data.frame(X1, Y1, X2, Y2)
kable(df, caption = "Toy Example")
X1 | Y1 | X2 | Y2 |
---|---|---|---|
4 | 144 | 0 | 132 |
2 | 146 | 0 | 132 |
2 | 146 | 1 | 131 |
14 | 134 | 3 | 129 |
6 | 142 | 2 | 130 |
9 | 139 | 1 | 131 |
4 | 144 | 2 | 130 |
0 | 148 | 2 | 130 |
1 | 147 | 2 | 130 |
In this data frame each of the 9 rows represents the data of an
observed 2 × 2 table: e.g., the third
row of the data corresponds to x13 = 2, y13 = 146, x23 = 1, y23 = 131.
Even though in this example, the total number of tested hypotheses m = 9 is very small, for
illustrative purposes we deal with the multiplicity problem here by
controlling FDR at level α = 5%. The DBH step-down procedure
can be applied directly to the data frame object df
and
perform Fisher’s exact test in-between. This yields an S3 object of
class DiscreteFDR
, for which we provide both
print
and summary
methods:
library(DiscreteFDR)
DBH.sd.fast <- direct.discrete.BH(df, "fisher", direction = "sd")
print(DBH.sd.fast)
##
## Discrete Benjamini-Hochberg procedure (step-down)
##
## Data: df
## Number of tests = 9
## Number of rejections = 2 at global FDR level 0.05
## (Original BH rejections = 0)
## Largest rejected p-value: 0.02126871
##
## Discrete Benjamini-Hochberg procedure (step-down)
##
## Data: df
## Number of tests = 9
## Number of rejections = 2 at global FDR level 0.05
## (Original BH rejections = 0)
## Largest rejected p-value: 0.02126871
##
## Index P.value Adjusted Rejected
## 1 4 0.01243145 0.03819796 TRUE
## 2 6 0.02126871 0.03819796 TRUE
## 3 1 0.12476691 0.25630985 FALSE
## 4 8 0.22135177 0.47895996 FALSE
## 5 5 0.28849298 0.51482782 FALSE
## 6 2 0.49984639 1.00000000 FALSE
## 7 9 0.60329543 1.00000000 FALSE
## 8 7 0.68723229 1.00000000 FALSE
## 9 3 1.00000000 1.00000000 FALSE
The output of the summary
function contains the same
output as the print
method, but adds a table that lists the
raw p-values, their adjusted
counterparts and their respective rejection decisions. It is sorted by
raw p-values in ascending
order. Our summary
method actually creates a
summary.DiscreteFDR
object, which includes all contents of
an DiscreteFDR
object plus the aforementioned table. This
table can be accessed directly by the $Table
command.
DBH.sd.fast.summary <- summary(DBH.sd.fast)
summary.table <- DBH.sd.fast.summary$Table
kable(summary.table, caption = "Summary table")
Index | P.value | Adjusted | Rejected |
---|---|---|---|
4 | 0.0124314 | 0.0381980 | TRUE |
6 | 0.0212687 | 0.0381980 | TRUE |
1 | 0.1247669 | 0.2563098 | FALSE |
8 | 0.2213518 | 0.4789600 | FALSE |
5 | 0.2884930 | 0.5148278 | FALSE |
2 | 0.4998464 | 1.0000000 | FALSE |
9 | 0.6032954 | 1.0000000 | FALSE |
7 | 0.6872323 | 1.0000000 | FALSE |
3 | 1.0000000 | 1.0000000 | FALSE |
Thus we can reject two hypotheses at FDR-level α = 5%. Note, that our
print
method also gives the number of rejections of the
usual (continuous) BH procedure. In order to compare its adjusted p-values with ours, we have to
determine the raw p-values
first. This would be possible by applying the fisher.test
function to all nine 2 × 2 tables.
Alternatively, we may use the more convenient function
generate.pvalues
, which is included in our package, for
accessing the raw p-values.
Since it only accept hypothesis test functions from the package
DiscreteTests
(either as a function object or a character
string that can be abbreviated), we could also use such a function
directly, e.g. fisher.test.pv
. An even more simple way is
to extract them from the DiscreteFDR
object that we
obtained before and contains the results:
# compute results of Fisher's exact test for each row of 'df'
fisher.p <- generate.pvalues(df, "fisher", list(alternative = "two.sided"))
# extract raw observed p-values
raw.pvalues <- fisher.p$get_pvalues()
# or
library(DiscreteTests)
fisher.p.2 <- fisher.test.pv(df, "two.sided")
raw.pvalues.2 <- fisher.p.2$get_pvalues()
# or:
raw.pvalues.3 <- DBH.sd.fast$Data$raw.pvalues
all(raw.pvalues == raw.pvalues.2) && all(raw.pvalues == raw.pvalues.3)
## [1] TRUE
## 1 2 3 4 5 6 7
## 0.37430072 0.74976959 1.00000000 0.09570921 0.51928737 0.09570921 0.77313633
## 8 9
## 0.49804147 0.77313633
Applying the continuous BH procedure from the stats
package in the last line of code, we find that we cannot reject any
hypotheses at FDR-level α = 5%. Another approach of
determining this is to compare the critical values of the discrete and
continuous BH procedures. In the discrete case, we need the observed
p-values and their
distributions. Both were computed by our generate.pvalues
function above. We can either extract them and pass them to the
DBH
function, or directly apply the function to the test
results object itself:
# extract raw observed p-values
raw.pvalues <- fisher.p$get_pvalues()
# extract p-value support sets
pCDFlist <- fisher.p$get_pvalue_supports()
# use raw pvalues and list of supports:
DBH.sd.crit <- DBH(raw.pvalues, pCDFlist, 0.05, "sd", TRUE)
crit.vals.BH.disc <- DBH.sd.crit$Critical.values
# use test results object directly
DBH.sd.crit.2 <- DBH(fisher.p, 0.05, "sd", TRUE)
crit.vals.BH.disc.2 <- DBH.sd.crit.2$Critical.values
# compare
all(crit.vals.BH.disc == crit.vals.BH.disc.2)
## [1] TRUE
The latter way enables the use of a pipe:
## [1] 0.01243145 0.02832448 0.03109596 0.04839433 0.05014119 0.07657062 0.07657062
## [8] 0.10328523 0.10328523
Now, we can compare the critical values:
crit.vals.BH.cont <- 1:9 * 0.05/9
tab <- data.frame(raw.pvalues = sort(raw.pvalues), crit.vals.BH.disc, crit.vals.BH.cont)
kable(tab, caption = "Observed p-values and critical values")
raw.pvalues | crit.vals.BH.disc | crit.vals.BH.cont | |
---|---|---|---|
4 | 0.0124314 | 0.0124314 | 0.0055556 |
6 | 0.0212687 | 0.0283245 | 0.0111111 |
1 | 0.1247669 | 0.0310960 | 0.0166667 |
8 | 0.2213518 | 0.0483943 | 0.0222222 |
5 | 0.2884930 | 0.0501412 | 0.0277778 |
2 | 0.4998464 | 0.0765706 | 0.0333333 |
9 | 0.6032954 | 0.0765706 | 0.0388889 |
7 | 0.6872323 | 0.1032852 | 0.0444444 |
3 | 1.0000000 | 0.1032852 | 0.0500000 |
Obviously, the critical values of the discrete approach are considerably larger than those of the continuous method. As a result, the two smallest p-values are rejected by the discrete BH procedure, because they are smaller than or equal to the respective critical values. The continuous BH approach does not reject any hypothesis, because all its critical values are smaller than the observed p-values.
For illustration, our package includes a plot
method for
DiscreteFDR
S3 class objects. It allows to define colors,
line types and point characters separately for accepted and rejected
p-values and for critical
values (if present in the object).
plot(DBH.sd.crit, col = c("red", "blue", "green"), pch = c(4, 2, 19), lwd = 2, type.crit = 'o',
legend = "topleft", cex = 1.3)
Now, it is visible which observed p-values are rejected. A comparison with the continuous BH procedure could be done as follows:
plot(DBH.sd.crit, col = c("red", "blue", "green"), pch = c(4, 2, 19), lwd = 2, type.crit = 'o',
cex = 1.3, ylim = c(0, 0.25), main = "Comparison of discrete and continuous BH procedures")
points(crit.vals.BH.cont, pch = 19, cex = 1.3, lwd = 2)
legend("topright", c("Rejected", "Accepted", "Critical Values (disc.)", "Critical Values (cont.)"),
col = c("red", "blue", "green", "black"), pch = c(4, 2, 19, 19), lwd = 2, lty = 0)
As this example illustrates, the discrete approach can potentially yield a large increase in power. The gain depends on the involved distribution functions and the raw p-values. To appreciate where this comes from, it is instructive to consider the distribution functions F1, …, F9 of the p-values under the null in more detail. Take for instance the first 2 × 2 table:
Responders | Non-responders | ||
---|---|---|---|
Treatment 1 | 4 | 144 | 148 |
Treatment 2 | 0 | 132 | 132 |
Total | 4 | 276 | 280 |
Fisher’s exact test works by determining the probability of observing
this (or a more ‘extreme’) table, given that the margins are fixed. So
each Fi is
determined by the margins of table i. Since x11 + x21 = 4,
the only potentially observable tables are given by x11 = 0, …, 4. For each
one of these 5 values a p-value can be determined using the
hypergeometric distribution. Therefore, the p-value of any 2 × 2 table with margins given by the above
table can take (at most) 5 distinct values, say x1, …, x5.
Combining these 5 values into a set, we obtain the support
𝒜1 = {x1, …, x5}
of distribution F1.
Now we can continue in this vein for the remaining 2 × 2 tables to obtain the supports 𝒜1, …, 𝒜9 for the
distribution functions F1, …, F9.
The supports can be accessed via the $get_pvalue_supports()
function of the results object, e.g.
# extract p-value support sets
pCDFlist <- fisher.p$get_pvalue_supports()
# extract 1st and 5th support set
pCDFlist[c(1,5)]
## [[1]]
## [1] 0.04820493 0.12476691 0.34598645 0.62477763 1.00000000
##
## [[2]]
## [1] 0.002173856 0.007733719 0.028324482 0.069964309 0.154043258 0.288492981
## [7] 0.481808361 0.726262402 1.000000000
Here, this returns 𝒜1 and 𝒜5. Panel (a) in the following figure shows a graph of the distribution functions F1, …, F9. Each Fi is a step-function with Fi(0) = 0, the jumps occurring only on the support 𝒜i and Fi(x) = x only for x ∈ 𝒜i. In particular, all distributions are stochastically larger than the uniform distribution (i.e., Fi(x) ≤ x), but in a heterogeneous manner. This heterogeneity can be exploited e.g., by transforming the raw p-values from the exact Fisher’s test using the function $$\displaystyle \xi_{\text{SD}}(x) = \sum_{i = 1}^{9} \frac{F_i(x)}{1 - F_i(x)}.$$ Panel (b) of the following plot shows that ξSD is a step function. Its jumps occur on the joint support 𝒜 = 𝒜1 ∪ … ∪ 𝒜9. Panel (b) also shows that ξSD(x) ≪ x, at least for small values of x. It turns out that the critical values of our new DBH step-down procedure are essentially given by inverting ξSD at the critical values of the [BH] procedure 1 ⋅ α/9, 2 ⋅ α/9, …, α, so that these values are considerably larger than the [BH] critical values. This is illustrated in panel (c) along with the ordered p~values. In particular, all asterisks are located above the green [BH] dots, therefore this procedure can not reject any hypothesis. In contrast, the two smallest p~values are located below red DBH step-down dots, so that this procedure rejects two hypotheses as we had already seen earlier.
stepf <- lapply(pCDFlist, function(x) stepfun(x, c(0, x)))
par(mfcol = c(1, 3), mai = c(1, 0.5, 0.3, 0.1))
plot(stepf[[1]], xlim = c(0, 1), ylim = c(0, 1), do.points = FALSE, lwd = 1, lty = 1, ylab = "F(x)",
main = "(a)")
for(i in (2:9)){
plot(stepf[[i]], add = TRUE, do.points = FALSE, lwd = 1, col = i)
}
segments(0, 0, 1, 1, col = "grey", lty = 2)
# Plot xi
support <- sort(unique(unlist(pCDFlist)))
components <- lapply(stepf, function(s){s(support) / (1 - s(support))})
xi.values <- 1/9 * Reduce('+', components)
xi <- stepfun(support, c(0, xi.values))
plot(xi, xlim = c(0, 0.10), ylim = c(0, 0.10), do.points = FALSE, ylab = expression(xi), main = "(b)")
segments(0, 0, 0.1, 0.1, col = "grey", lty = 2)
# Plot discrete critical values as well a BH constants and raw p-values
DBH.sd <- DBH(fisher.p, direction = "sd", ret.crit.consts = TRUE)
plot(DBH.sd, col = c("black", "black", "red"), pch = c(4, 4, 19), type.crit = 'p', ylim = c(0, 0.15),
cex = 1.3, main = "(c)", ylab = "Critical Values")
points(1:9, 0.05 * (1:9) / 9, col = "green", pch = 19, cex = 1.3)
mtext("Figure 1", 1, outer = TRUE, line = -2)
Panel (a) depicts the distribution functions F1, …, F9 in various colors, (b) is a graph of the transformation ξSD. The uniform distribution function is shown in light gray in (a) and (b). Panel (c) shows the [BH] critical values (green dots), the DBH step-down critical values (red dots) and the sorted raw p-values (asterisks).