Volume 8 Issue 9 - May 15, 2009
Interval Estimation of Binomial Proportion in Clinical Trials with a Two-Stage Design
Wei-Yann Tsai1, Yunchan Chi2,*, Chia-Min Chen2

1Department of Biostatistics, Columbia University, New York City, New York, U.S.A
2Department of Statistics, College of Management, National Cheng Kung University

Statistics in Medicine 2008, 27, 15-35

In the drug development process, the primary goal of a Phase II clinical trial is to determine whether there is sufficient evidence of efficacy and safety to justify further study in a Phase III clinical trial. For example, suppose a new drug is developed for patients with liver cancer. To investigate whether this new drug extends the life of liver cancer patients, the criterion of drug efficacy is defined to be the presence of tumor shrinkage, which is a binary endpoint with response probability p. In a single-stage design, if the number of patients with tumor shrinkage is large enough, this is taken as evidence that the new drug merits further testing.

Most often, drug experiments employ Simon’s two-stage design [1] in order to avoid giving patients an ineffective drug. Suppose n1 patients participate in the first stage, and n2 additional patients participate in the second stage if the trial is allowed to continue. The number of responses Y1 is observed at the first stage, and if Y1 is less than a specified value a, the trial is stopped. Otherwise, the trial continues and the number of responses Y2, which is independent of Y1, is observed at the second stage. The response probability of patients with tumor shrinkage must then be estimated in order to plan a further study.

It is natural to use the sample proportion to estimate p. However, when the trial continues to the second stage, this estimator overestimates the true p, because the number of responses at the first stage is truncated by a and follows a truncated binomial distribution. Therefore, a maximum likelihood estimator based on the truncated binomial distribution is derived to take the truncation effect into account. In addition, to account for the inherent variability of patients in the measured responses, a confidence interval, that is, a range of plausible values within which the true p may lie, is also constructed for p. Two types of interval estimators, the Wald interval without (with) continuity correction [2] and the score interval without (with) continuity correction [3], are constructed in the next section.
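The overestimation described above can be checked numerically. The following is a minimal sketch, not taken from the paper: it computes the expected value of the sample proportion conditional on the trial reaching the second stage. The function names and the design values (n1 = 10, n2 = 19, a = 4) are hypothetical and chosen only for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial(n, p) probability of exactly k responses."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_sample_proportion(p, n1, n2, a):
    """E[(Y1 + Y2) / (n1 + n2) | Y1 >= a], i.e. given that stage 2 ran."""
    support = range(a, n1 + 1)
    tail = sum(binom_pmf(k, n1, p) for k in support)  # P(Y1 >= a)
    mean_y1 = sum(k * binom_pmf(k, n1, p) for k in support) / tail
    return (mean_y1 + n2 * p) / (n1 + n2)


# Hypothetical design: n1 = 10, n2 = 19, continue when Y1 >= 4.
bias = expected_sample_proportion(0.3, 10, 19, 4) - 0.3
print(f"bias of the sample proportion at p = 0.3: {bias:.4f}")
```

With a = 0 there is no truncation and the function returns p itself, so any positive bias is due entirely to conditioning on Y1 >= a.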

2. The proposed method

When the second stage is allowed to continue, the possible values of Y1 are truncated by a, so Y1 follows a truncated binomial distribution with response probability p, and its probability mass function is
$$P(Y_1 = y_1 \mid Y_1 \ge a) = \frac{\binom{n_1}{y_1} p^{y_1} (1-p)^{n_1-y_1}}{\sum_{k=a}^{n_1} \binom{n_1}{k} p^{k} (1-p)^{n_1-k}},$$
with $y_1 = a, a+1, \ldots, n_1$. The probability distribution of Y2 is binomial with the same response probability p, and its probability mass function is
$$P(Y_2 = y_2) = \binom{n_2}{y_2} p^{y_2} (1-p)^{n_2-y_2},$$
where $y_2 = 0, 1, \ldots, n_2$. Let log L(p) denote the logarithm of the likelihood function based jointly on Y1 = y1 and Y2 = y2, and let U(p) denote its first derivative with respect to p. The maximum likelihood estimator of p is the solution to U(p) = 0. The graph of U(p) shows that a unique solution exists for a < y1+y2 < n1+n2, but the equation U(p) = 0 does not have a closed-form solution. Hence, the Newton-Raphson algorithm is employed to obtain a numerical estimate of p, with the iteration equation p[j] = p[j-1] + U(p[j-1])/J(p[j-1]), where the observed Fisher information, J(p), is the negative of the second derivative of log L(p) with respect to p. The iterations proceed until the difference between p[j] and p[j-1] is smaller than a required error tolerance level (say, $10^{-4}$).
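The Newton-Raphson iteration can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the log-likelihood is taken to be the truncated binomial for Y1 (given Y1 >= a) times the binomial for Y2, the starting value is the sample proportion, and the iterate is clamped to the open interval (0, 1). All function names are hypothetical.

```python
from math import comb


def tail_terms(p, n1, a):
    """Return T = P(Bin(n1, p) >= a) and its first two derivatives in p."""
    t = sum(comb(n1, k) * p**k * (1 - p) ** (n1 - k) for k in range(a, n1 + 1))
    if a == 0:  # no truncation: T is identically 1
        return t, 0.0, 0.0
    # Identity: d/dp P(X >= a) = n1 * P(Bin(n1 - 1, p) = a - 1)
    t1 = n1 * comb(n1 - 1, a - 1) * p ** (a - 1) * (1 - p) ** (n1 - a)
    t2 = t1 * ((a - 1) / p - (n1 - a) / (1 - p))
    return t, t1, t2


def score(p, y1, y2, n1, n2, a):
    """U(p): first derivative of the joint log-likelihood."""
    s, n = y1 + y2, n1 + n2
    t, t1, _ = tail_terms(p, n1, a)
    return s / p - (n - s) / (1 - p) - t1 / t


def observed_info(p, y1, y2, n1, n2, a):
    """J(p): negative second derivative of the joint log-likelihood."""
    s, n = y1 + y2, n1 + n2
    t, t1, t2 = tail_terms(p, n1, a)
    return s / p**2 + (n - s) / (1 - p) ** 2 + (t2 * t - t1**2) / t**2


def mle(y1, y2, n1, n2, a, tol=1e-4, max_iter=100):
    """Newton-Raphson for U(p) = 0; requires 0 < y1 + y2 < n1 + n2."""
    p = (y1 + y2) / (n1 + n2)  # start from the sample proportion
    for _ in range(max_iter):
        step = score(p, y1, y2, n1, n2, a) / observed_info(p, y1, y2, n1, n2, a)
        p_new = min(max(p + step, 1e-8), 1 - 1e-8)  # stay inside (0, 1)
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    raise RuntimeError("Newton-Raphson did not converge")


# Hypothetical data: n1 = 10, n2 = 19, a = 4, y1 = 5, y2 = 6.
p_hat = mle(5, 6, 10, 19, 4)
print(f"MLE: {p_hat:.4f}, sample proportion: {11 / 29:.4f}")
```

Because the truncated mean of Y1 exceeds n1·p, the estimate from this sketch falls below the sample proportion whenever a > 0.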

Next, the interval estimators for p are constructed as follows. Let $\hat{p}$ denote the maximum likelihood estimator of p. By the asymptotic property of the maximum likelihood estimator, the statistic $(\hat{p} - p)\sqrt{I(\hat{p})}$ has an asymptotic standard normal distribution, where the Fisher information of p, I(p), is the expectation of J(p). Consequently, the lower and upper limits of a two-sided Wald interval for p are given by solving the equations $(\hat{p} - p_L)\sqrt{I(\hat{p})} = z_{\alpha/2}$ and $(p_U - \hat{p})\sqrt{I(\hat{p})} = z_{\alpha/2}$, respectively, where $z_{\alpha/2}$ is the 100(1-α/2)th percentile of the standard normal distribution. The resulting Wald interval (Wald) is denoted by $(p_L, p_U)$, with $p_L = \hat{p} - z_{\alpha/2}/\sqrt{I(\hat{p})}$ and $p_U = \hat{p} + z_{\alpha/2}/\sqrt{I(\hat{p})}$. Because a discrete random variable can take on only specified values, the continuity correction 1/(2(n-a)) is employed. The resulting Wald interval with continuity correction (Wald_c) is $(p_L - 1/(2(n-a)),\ p_U + 1/(2(n-a)))$. Following Wilson’s [3] concept, the lower and upper limits of the score confidence interval without (with) continuity correction can also be obtained by replacing $\hat{p}$ in the Fisher information with the actual success probability p. So, for given $\hat{p}$, the lower and upper limits of the score interval are the solutions in p to the equations $(\hat{p} - p)\sqrt{I(p)} = z_{\alpha/2}$ and $(p - \hat{p})\sqrt{I(p)} = z_{\alpha/2}$, respectively. The resulting confidence interval is referred to as Score. Similarly, the score interval with continuity correction is referred to as Score_c.
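A rough sketch of the uncorrected Wald and score intervals is given below. It relies on an assumption not stated in the text: because the joint likelihood is an exponential family in logit(p), the expected information can be written as I(p) = [Var(Y1 | Y1 >= a) + n2·p(1-p)] / (p(1-p))², which equals the expectation of J(p). The score limits are found by bisection, which is a choice made here for simplicity. Function names and the input value `p_hat = 0.33` are hypothetical, and the continuity-corrected variants are left out.

```python
from math import comb, sqrt
from statistics import NormalDist


def fisher_info(p, n1, n2, a):
    """Expected information I(p) via the variance of Y1 + Y2 given Y1 >= a."""
    support = range(a, n1 + 1)
    w = [comb(n1, k) * p**k * (1 - p) ** (n1 - k) for k in support]
    tail = sum(w)
    mean = sum(k * wk for k, wk in zip(support, w)) / tail
    var = sum((k - mean) ** 2 * wk for k, wk in zip(support, w)) / tail
    return (var + n2 * p * (1 - p)) / (p * (1 - p)) ** 2


def wald_interval(p_hat, n1, n2, a, alpha=0.05):
    """Wald limits: p_hat -/+ z / sqrt(I(p_hat)), without continuity correction."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z / sqrt(fisher_info(p_hat, n1, n2, a))
    return p_hat - half, p_hat + half


def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def score_interval(p_hat, n1, n2, a, alpha=0.05, eps=1e-9):
    """Score limits: solve |p_hat - p| * sqrt(I(p)) = z on each side of p_hat."""
    z = NormalDist().inv_cdf(1 - alpha / 2)

    def g(p):
        return abs(p_hat - p) * sqrt(fisher_info(p, n1, n2, a)) - z

    return bisect(g, eps, p_hat), bisect(g, p_hat, 1 - eps)


# Hypothetical inputs: p_hat = 0.33 for a design with n1 = 10, n2 = 19, a = 4.
print("Wald :", wald_interval(0.33, 10, 19, 4))
print("Score:", score_interval(0.33, 10, 19, 4))
```

With a = 0 the information reduces to (n1 + n2)/(p(1-p)) and the score limits coincide with Wilson's closed-form interval, which is a convenient sanity check on the sketch.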

3. Comparison results

The evaluation criteria for comparing the performance of interval estimators are the actual coverage probability and the expected width of an interval estimator, which are defined as $CP(p) = \sum_x I(x, p)\,P(X = x)$ and $EW(p) = \sum_x w(x)\,P(X = x)$, respectively, where I(x, p) is an indicator equal to 1 if the interval based on X = x contains p and 0 otherwise, and w(x) is the width of the interval estimate based on X = x. The width of a confidence interval measures the precision of estimation, and it is desirable to obtain a confidence interval that is as short as possible while maintaining adequate confidence (or coverage probability).
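Both criteria can be computed exactly by enumerating the outcomes. The sketch below assumes that X is the pair (Y1, Y2) and that probabilities are conditional on the trial continuing (Y1 >= a). The interval plugged in, `naive_wald`, is only a placeholder built from the sample proportion and is not one of the four intervals studied in the paper. All names and design values are hypothetical.

```python
from math import comb, sqrt


def outcome_probs(p, n1, n2, a):
    """P(Y1 = y1, Y2 = y2 | Y1 >= a) for every outcome with a second stage."""

    def b(k, n):
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    tail = sum(b(k, n1) for k in range(a, n1 + 1))
    return {
        (y1, y2): b(y1, n1) / tail * b(y2, n2)
        for y1 in range(a, n1 + 1)
        for y2 in range(n2 + 1)
    }


def naive_wald(y1, y2, n1, n2, z=1.96):
    """Placeholder interval from the sample proportion (ignores truncation)."""
    n = n1 + n2
    q = (y1 + y2) / n
    half = z * sqrt(q * (1 - q) / n)
    return q - half, q + half


def coverage_and_width(p, n1, n2, a, interval):
    """Return (CP(p), EW(p)) by summing over all outcomes."""
    cp = ew = 0.0
    for (y1, y2), prob in outcome_probs(p, n1, n2, a).items():
        lo, hi = interval(y1, y2, n1, n2)
        cp += prob * (lo <= p <= hi)  # indicator I(x, p)
        ew += prob * (hi - lo)        # width w(x)
    return cp, ew


# Hypothetical design: n1 = 10, n2 = 19, a = 4, evaluated at p = 0.3.
cp, ew = coverage_and_width(0.3, 10, 19, 4, naive_wald)
print(f"coverage = {cp:.3f}, expected width = {ew:.3f}")
```

Any interval function with the same signature can be passed in place of `naive_wald` to compare estimators on equal terms.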

In Phase II clinical trials, the investigators are asked to specify the largest (smallest) success probability, p1 (p0), which, if true, would clearly imply that the treatment is (is not) promising for further study. For given p0, p1, α, and β (type II error rate), Table I displays the actual coverage probabilities and the expected widths. The results show that the maximum likelihood estimator slightly underestimates p, whereas the sample proportion overestimates it. When n1 is considerably larger than n2, for example, n1=34 and n2=5, the maximum likelihood estimator significantly underestimates the true p, and the coverage probability of the score interval without continuity correction is considerably lower than the nominal confidence level, which is 0.95 in this comparison. Although the expected width of the score interval without continuity correction is almost the smallest, the difference between the expected widths of the two score interval estimators is negligible. Therefore, according to the mean coverage probability and expected interval width, this paper recommends the score interval with continuity correction for estimating p based on data from Simon’s two-stage designs.
Table I. The coverage probabilities, expected interval width and biases for Simon’s designs with α= 0.05 and β= 0.2 for p1-p0= 0.2

  1. Simon R. Optimal two-stage designs for Phase II clinical trials. Controlled Clinical Trials 1989; 10:1-10.
  2. Casella G, Berger R. Statistical inference. Duxbury: Pacific Grove, 2002.
  3. Wilson EB. Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 1927; 22:209-212.
Copyright National Cheng Kung University