---
title: "Confidence Intervals of $f_2$ Using Bootstrap Method"
author: "Zhengguo XU"
date: "`r format(Sys.Date())`"
output: 
  rmarkdown::html_vignette:
    keep_md: true
    toc: false
    toc_depth: 3
    fig_width: 7
    fig_height: 4.5
  highlight: "tango"
bibliography: ref.bib
notes-after-punctuation: false
link-citations: yes
csl: ref.csl
vignette: >
  %\VignetteIndexEntry{Confidence Intervals of $f_2$ Using Bootstrap Method}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup0, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#"
)
```

```{r setup, echo=FALSE}
library(bootf2)
```

## Introduction
To compare the dissolution profiles, the most widely used method is the
similarity factor $f_2$, which is usually defined in the regulatory 
guideline as 
$$f_2 = 50 \log\frac{100}{\sqrt{1 + \frac{\sum_{t=1}^{t=n}\left(\bar{R}_t-\bar{T}_t\right)^2}{n}}},$$
where $\bar{R}_t$ and $\bar{T}_t$ are mean dissolution profile of reference 
ad test product at time $t$; $n$ is the number of time points.

Nevertheless, there are several prerequisites for the use of f2 method according
to the regulatory guidelines, such as no more than one time point above 85%
dissolved should be used and the variability, expressed as coefficient of
variation (CV), should be no more than 20% and 10% for early and later time
points, respectively. See vignette *Introduction to bootf2* for details. 

Recently, several guidelines recommended the use of confidence interval of 
$f_2$ approach using bootstrap when the traditional $f_2$ method cannot be
applied [@EMA-2018-09-QA.MSD.DISSO; @Davit-2013-03-BA; @Lum-2019-05-WS;
@Mandula-2019-05-WS]. However, none of the guidelines specified in details 
which estimators or types of confidence intervals should be used. Therefore, 
the function `bootf2()` will output various confidence intervals by default
for several $f_2$ estimators. 

### Types of $f_2$

According to Shah et al, the population $f_2$ for the inference is 
$$
f_2 = 100-25\log\left(1 + \frac{1}{P}\sum_{i=1}^P
  \left(\mu_{\mathrm{T},i} - \mu_{\mathrm{R},i}\right)^2 \right)\, ,
$$ 
where $P$ is the number of time points; $\mu_{\mathrm{T},i}$ and
$\mu_{\mathrm{R},i}$, typically unknown, are population mean of test and
reference product at time point $i$, respectively [@Shah-1998-06-PR].
Several estimators for the population $f_2$ has been described in the
literature [@Shah-1998-06-PR; @Ma-2000-05-JBS; @Mendyk-2013-02-DT; 
@Mendyk-2019]. The function is able to calculate the following five
estimators for $f_2$, along with their corresponding 90% confidence intervals
using bootstrap.

#### The estimated $f_2$ ($\hat{f}_2$)

The estimated $f_2$ ($\hat{f}_2$, denoted by `est.f2` in the function) 
is the one written in various regulatory guidelines. It is expressed
differently, but mathematically equivalently, as
$$
\hat{f}_2 = 100-25\log\left(1 + \frac{1}{P}\sum_{i=1}^P\left(
  \bar{X}_{\mathrm{T},i} - \bar{X}_{\mathrm{R},i}\right)^2 \right)\:,
$$ 
where $P$ is the number of time points;
$\bar{X}_{\mathrm{T},i}$ and $\bar{X}_{\mathrm{R},i}$ are mean
dissolution data at the $i$th time point of *random samples* chosen from
the test and the reference population, respectively. 
Compared to the equation of population $f_2$ above, the only difference
is that in the equation of $\hat{f}_2$ the *sample means* of dissolution
profiles replace the *population means* for the approximation.
*In other words, a point estimate is used for the statistical inference in 
practice.*

#### The Bias-corrected $f_2$ ($\hat{f}_{2,\mathrm{bc}}$)

Bias-corrected $f_2$ ($\hat{f}_{2,\mathrm{bc}}$, denoted by `bc.f2` in the
function) was described in the article of Shah et al [@Shah-1998-06-PR], as
$$
\hat{f}_{2,\mathrm{bc}} = 100-25\log\left(1 + \frac{1}{P}
  \left(\sum_{i=1}^P\left(\bar{X}_{\mathrm{T},i} - 
    \bar{X}_{\mathrm{R},i}\right)^2 - \frac{1}{n}\sum_{i=1}^P
    \left(S_{\mathrm{T},i}^2 + S_{\mathrm{R},i}^2\right)\right)\right)\,,
$$ 
where $S_{\mathrm{T},i}^2$ and $S_{\mathrm{R},i}^2$ are
unbiased estimates of variance at the $i$th time points of random
samples chosen from test and reference population, respectively; and
$n$ is the sample size. $\bar{X}_{\mathrm{T},i}$,
$\bar{X}_{\mathrm{R},i}$ and $P$ are same as described previously.
As domain of the $\log(\cdot)$ function needs to be positive, when
variance is sufficiently high, i.e., when inequality 
$$\frac{1}{n}\sum_{i=1}^P\left(S_{\mathrm{T},i}^2 + S_{\mathrm{R},i}^2\right)\ge
\sum_{i=1}^P \left(\bar{X}_{\mathrm{T},i}-\bar{X}_{\mathrm{R},i}\right)^2+P$$ 
is valid, $\hat{f}_{2,\mathrm{bc}}$ cannot be calculated.

#### The variance- and bias-corrected $f_2$ ($\hat{f}_{2, \mathrm{vcbc}}$)

Bias-corrected $f_2$ described earlier assumes equal weight of 
variance between test and reference, which might not necessarily be true,
therefore, the function also includes the following variance- and 
bias-corrected $f_2$ ($\hat{f}_{2, \mathrm{vcbc}}$, denoted by `vc.bc.f2` in 
the function)
$$
\hat{f}_{2, \mathrm{vcbc}} = 100-25\log\left(1 +
  \frac{1}{P}\left(\sum_{i=1}^P \left(\bar{X}_{\mathrm{T},i} -
  \bar{X}_{\mathrm{R},i}\right)^2 -
  \frac{1}{n}\sum_{i=1}^P\left(w_{\mathrm{T},i}\cdot S_{\mathrm{T},i}^2 +
  w_{\mathrm{R},i}\cdot S_{\mathrm{R},i}^2\right)\right)\right)\,,
$$ 
where $w_{\mathrm{T},i}$ and $w_{\mathrm{R},i}$ are weighting factors
for variance of test and reference products, respectively, which can be
calculated as follows:
$$
w_{\mathrm{T},i} = 0.5 + \frac{S_{\mathrm{T},i}^2}
  {S_{\mathrm{T},i}^2 + S_{\mathrm{R},i}^2}\,,
$$
and
$$
w_{\mathrm{R},i} = 0.5 + \frac{S_{\mathrm{R},i}^2}
  {S_{\mathrm{T},i}^2 + S_{\mathrm{R},i}^2}\,.
$$
Similar to $\hat{f}_{2, \mathrm{bc}}$, the $\hat{f}_{2, \mathrm{vcbc}}$ 
cannot be estimated when the variance is sufficiently high, i.e., 
when the inequality 
$$
\frac{1}{n}\sum_{i=1}^P  \left(w_{\mathrm{T},i}\cdot S_{\mathrm{T},i}^2 +  w_{\mathrm{R},i}\cdot S_{\mathrm{R},i}^2\right) \ge \sum_{i=1}^P  \left(\bar{X}_{\mathrm{T},i}-\bar{X}_{\mathrm{R},i}\right)^2+P
$$
is valid.

#### The expected $f_2$ ($\hat{f}_{2, \mathrm{exp}}$)

The mathematical expectation of $\hat{f}_2$ is 
\begin{align}\label{eq:mathexp}
  \mathbb{E}\left(\hat{f}_2\right) &= \mathbb{E}\left(100 - 25\log\left(1 +
    \frac{1}{P} \sum_{i=1}^P\left(\bar{X}_{\mathrm{T},i} -
      \bar{X}_{\mathrm{R},i}\right)^2 \right)\right)\notag\\
     &\approx 100 - 25\log\left(1 + \mathbb{E}\left(\frac{1}{P} \sum_{i=1}^P
      \left(\bar{X}_{\mathrm{T},i} -
        \bar{X}_{\mathrm{R},i}\right)^2\right)\right) \notag\\
    &= 100 - 25\log\left(1 + \frac{1}{P}
    \left(\sum_{i=1}^P\left(\mu_{\mathrm{T},i} - \mu_{\mathrm{R},i}\right)^2
      + \frac{1}{n}\sum_{i=1}^P \left(\sigma_{\mathrm{T},i}^2 +
         \sigma_{\mathrm{R},i}^2\right)\right)\right),
\end{align}
where $\sigma_{\mathrm{T},i}^2$ and $\sigma_{\mathrm{R},i}^2$ are
*population variance* of the test and the reference product,
respectively.

The expected $f_2$ ($\hat{f}_{2, \mathrm{exp}}$, denoted by `exp.f2` in the
function) is calculated based on the mathematical expectation of
$\hat{f}_2$, using mean dissolution profiles and variance from samples 
for the approximation [@Ma-2000-05-JBS; @Mendyk-2013-02-DT; @Mendyk-2019]. 
$$
\hat{f}_{2, \mathrm{exp}} = 100-25\log\left(1 + \frac{1}{P}\left(\sum_{i=1}^P
  \left(\bar{X}_{\mathrm{T},i} - \bar{X}_{\mathrm{R},i}\right)^2 +
  \frac{1}{n}\sum_{i=1}^P\left(S_{\mathrm{T},i}^2 +
  S_{\mathrm{R},i}^2\right)\right)\right)\,.
$$

#### The variance-corrected expected $f_2$ ($\hat{f}_{2, \mathrm{vcexp}}$))

Similarly, since $\hat{f}_{2, \mathrm{exp}}$ assumes equal weight of 
variance between test and reference, the variance-corrected version
($\hat{f}_{2, \mathrm{vcexp}}$, denoted by `vc.exp.f2` in the function)
was also included in the function.
$$
\hat{f}_{2, \mathrm{vcexp}} = 100-25\log\left(1 + \frac{1}{P}\left(\sum_{i=1}^P
  \left(\bar{X}_{\mathrm{T},i} - \bar{X}_{\mathrm{R},i}\right)^2 +
  \frac{1}{n}\sum_{i=1}^P\left(w_{\mathrm{T},i}\cdot S_{\mathrm{T},i}^2 +
  w_{\mathrm{R},i}\cdot S_{\mathrm{R},i}^2\right)\right)\right)\,.
$$ 


### Types of confidence intervals

The following 90% confidence intervals are included in the function:

#### The Normal interval

The Normal interval (denoted by `normal` in the function)
with bias correction was estimated according to
Davison and Hinkley [@Davison-1997-Bootstrap],
\begin{align}
  \hat{f}_{2, \mathrm{L}} &= \hat{f}_{2, \mathrm{S}} -
    E_B - \sqrt{V_B}\cdot Z_{1-\alpha}\,, \\
  \hat{f}_{2, \mathrm{U}} &= \hat{f}_{2, \mathrm{S}} -
    E_B + \sqrt{V_B}\cdot Z_{1-\alpha}\,,
\end{align}
where $\hat{f}_{2, \mathrm{L}}$ and $\hat{f}_{2, \mathrm{U}}$
are the lower and upper limit of the confidence interval estimated
from bootstrap samples; $\hat{f}_{2, \mathrm{S}}$ denotes the
estimators as described in Section [Types of $f_2$] and
calculated using the *randomly selected samples from populations*;
$Z_{1-\alpha}$ is $\Phi^{-1}(1-\alpha)$, where $\Phi(\cdot)$
represents standard normal cumulative distribution function and
$\Phi^{-1}(\cdot)$ denotes its inverse, the quantile function;
$\alpha$ is the type I error rate and equal to 0.05 in the current
study; $E_B$ and $V_B$ are the *resampling estimates* of
bias and variance calculated as
\begin{align}
  E_B &= \frac{1}{B}\sum_{b=1}^{B}\hat{f}_{2,b}^\star - \hat{f}_{2, \mathrm{S}}
    = \bar{f}_2^\star - \hat{f}_{2, \mathrm{S}}\label{eq:eb}\,, \\
  V_B &= \frac{1}{B-1}\sum_{b=1}^{B}
    \left(\hat{f}_{2,b}^\star-\bar{f}_2^\star\right)^2\label{eq:vb}\,,
\end{align}
where superscript $^\star$ denotes that the $f_2$ estimates are
calculated from *bootstrap samples*; $B$ is the number of
bootstrap samples, which is equal to 10000 in the function as default value;
$\hat{f}_{2,b}^\star$ is the $f_2$ estimate with the $b$th bootstrap
sample and $\bar{f}_2^\star$ is the mean value.

#### The basic interval

The basic interval (denoted by `basic` in the function)
was calculated according to Davison and
Hinkley [@Davison-1997-Bootstrap],
\begin{align}
  \hat{f}_{2, \mathrm{L}} &= 2\hat{f}_{2, \mathrm{S}} -
    \hat{f}_{2,(B+1)(1-\alpha)}^\star\,, \\
  \hat{f}_{2, \mathrm{U}} &= 2\hat{f}_{2, \mathrm{S}} -
    \hat{f}_{2,(B+1)\alpha}^\star\,,
\end{align}
where $\hat{f}_{2,(B+1)\alpha}^\star$ and
$\hat{f}_{2,(B+1)(1-\alpha)}^\star$ are the $(B+1)\alpha$th
and the $(B+1)(1-\alpha)$th *ordered resampling estimates* of $f_2$,
respectively. When $(B+1)\alpha$ is not an integer, the following equation
is used for interpolation,
$$
  \hat{f}_{2,(B+1)\alpha}^\star = \hat{f}_{2,k}^\star 
  + \frac{\Phi^{-1}\left(\alpha\right)-\Phi^{-1}\left(\frac{k}{B+1}\right)}
  {\Phi^{-1}\left(\frac{k+1}{B+1}\right)-\Phi^{-1}\left(\frac{k}{B+1}\right)}
  \left(\hat{f}_{2,k+1}^\star-\hat{f}_{2,k}^\star\right),
$$
where $k = \left[(B+1)\alpha\right]$, the *integer part* of
$(B+1)\alpha$, $\hat{f}_{2,k+1}^\star$ and $\hat{f}_{2,k}^\star$
are the $(k+1)$th and the $k$th ordered resampling estimates of $f_2$,
respectively.

#### The percentile interval

The percentile intervals (denoted by `percentile` in the function)
were estimated using nine different types of
quantiles, percentile Type 1 to Type 9, as summarized in
Hyndman and Fan's article [@Hyndman-1996-AS]
and implemented in `R`'s `quantile` function.
Using `R`'s `boot` package, program `bootf2BCA` outputs
a percentile interval using the equation above for
interpolation. To be able to compare the results among different
programs, the same interval, denoted by `Percentile (boot)` in the function,
is also calculated in this study.

#### The BCa intervals

The bias-corrected and accelerated intervals were estimated as follows
according to literature [@Efron1993AIttB],
\begin{align}
  \hat{f}_{2, \mathrm{L}} &= \hat{f}_{2, \alpha_1}^\star\,,\\
  \hat{f}_{2, \mathrm{U}} &= \hat{f}_{2, \alpha_2}^\star\,,
\end{align}
where $\hat{f}_{2, \alpha_1}^\star$ and $\hat{f}_{2, \alpha_2}^\star$
are the $100\alpha_1$th and the $100\alpha_2$th percentile of the resampling
estimates of $f_2$, respectively. Type I errors $\alpha_1$ and $\alpha_2$ were
obtained as
\begin{align}
  \alpha_1 &= \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + \hat{z}_\alpha}
    {1-\hat{a}\left(\hat{z}_0 + \hat{z}_\alpha\right)}\right), \\
  \alpha_2 &= \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + \hat{z}_{1-\alpha}}
    {1-\hat{a}\left(\hat{z}_0 + \hat{z}_{1-\alpha}\right)}\right),
\end{align}
where $\hat{z}_0$ and $\hat{a}$ are called
*bias-correction* and *acceleration*, respectively.

There are different methods to estimate the $\hat{z}_0$ and $\hat{a}$.
Shah et al. [@Shah-1998-06-PR] used jackknife technique (denoted by 
`bca.jackknife`) as 
\begin{equation}\label{eq:z0}
  \hat{z}_0 = \Phi^{-1}\left(\frac{\#\left\{\hat{f}_{2,b}^\star <
    \hat{f}_{2,\mathrm{S}} \right\}}{B}\right),
\end{equation}
and 
\begin{equation}\label{eq:a0}
  \hat{a} = \frac{\sum_{i=1}^{n}\left(\hat{f}_{2,\mathrm{m}} -
    \hat{f}_{2, i}\right)^3}{6\left(\sum_{i=1}^{n}
    \left(\hat{f}_{2,\mathrm{m}} - \hat{f}_{2, i}\right)^2\right)^{3/2}}\,,
\end{equation}
where $\#\left\{\cdot\right\}$ denotes the number of element in the
set, $\hat{f}_{2, i}$ is the $i$th jackknife statistic,
$\hat{f}_{2,\mathrm{m}}$ is the mean of the jackknife statistics, and $n$ is
the sample size.

Program `bootf2BCA` gives a slightly different BCa interval with `R`'s 
`boot` package [@Mendyk-2019]. This approach, denoted by `bca.boot` 
in the function, was also implemented in the function `bootf2`
for estimating the interval.

## Usage

The complete list of arguments of the function is as follows:

```{r bootf2-code0, eval=FALSE}
bootf2(test, ref, path.in, file.in, path.out, file.out,
       n.boots = 10000L, seed = 306, digits = 2L, alpha = 0.05,
       regulation = c("EMA", "FDA", "WHO"), min.points = 1L,
       both.TR.85 = FALSE, print.report = TRUE,
       report.style = c("concise", "intermediate", "detailed"),
       f2.type = c("all", "est.f2", "exp.f2", "bc.f2",
                   "vc.exp.f2", "vc.bc.f2"),
       ci.type = c("all", "normal", "basic", "percentile",
                   "bca.jackknife", "bca.boot"),
       quantile.type = c("all", 1:9, "boot"),
       jackknife.type = c("nt+nr", "nt*nr", "nt=nr"),
       time.unit = c("min", "h"), output.to.screen = FALSE,
       sim.data.out = FALSE)
```

### Notes on function arguments

1. Data input: `test`, `ref`, `path.in`, `file.in`
    - In the typical interactive use, `test` and `ref` are *data frames* with
      the time as the first column, and individual units for the rest of 
      columns. In such cases, arguments `path.in` and `file.in` should not 
      be used.
    - Data can be read directly from an Excel file with extension `.xlsx` or
      `.xls`. In this case, data of test and reference should be stored in
      separate worksheets. The first column should be time, the rest columns 
      are individual units. *The first row* is the column head indicating the
      names, such as 'time', 'unit 01', unit 02', .... It doesn't matter what
      the names are as columns will be renamed internally by the function. The
      important point is that *the first row will not be read, so do not put
      dissolution data on the first row*. 
    - When `path.in` and `file.in` are provided, the argument `test` and
      `ref` should be the *worksheet names inside quotation mark*, e.g., 
      `"lot ABCD1234 pH 6.8"`. 
    - `path.in` can be an absolute path such as`"/home/myname/my.project/dat/"`,
      or a relative path such as `"../dat/"` if the working directory is in the
      folder `"/home/myname/my.project/analysis/"` and the data file is in the
      folder `"/home/myname/my.project/dat/"`. 
    - One more note for Windows user: As Windows use "\" instead of "/" to
      separate path, you have to either escape it by an additional "\", 
      e.g., `"C:\user\myname\my.project\dat\"` *cannot* be the `path.in`, 
      you have to changed it to `"C:\\user\\myname\\my.project\\dat\\"`, or
      to `"C:/user/myname/my.project/dat/"`, the same format as used in Linux
      system.
1. Output: `path.out` and `file.out`, `output.to.screen`:
    - If argument `path.out` and `file.out` are not provided, but argument
      `print.report` is `TRUE`, then a plain text report will be generated
      automatically in the current working directory with file name
      `test_vs_ref_TZ_YYYY-MM-DD_HHMMSS.txt`, where `test` and `ref` are data
      set names of test and reference, `TZ` is the time zone such as CEST, 
      `YYYY-MM-DD` is the numeric date format and `HHMMSS` is the numeric time
      format for hour, minute, and second.
    - For a quick check, set argument `output.to.screen = TRUE`, a summary
      report very similar to concise style report will be printed on screen.
1. Argument `jackknife.type`
    - For any sample with size $n$, the jackknife estimator is obtained by
      estimating the parameter for each subsample omitting the $i$th 
      observation. However, when two samples (e.g., test and reference) are
      involved, there are several possible ways to do it. Assuming sample size
      of test and reference are $n_\mathrm{T}$ and $n_\mathrm{R}$, the
      following three possibility are considered:
      - Estimated by removing one observation from both test and reference
        samples. In this case, the prerequisite is 
        $n_\mathrm{T} = n_\mathrm{R}$, denoted by `nt=nr` in the function.
        So if there are 12 units in test and reference data sets, there will
        be 12 jackknife estimators.
      - Estimate the jackknife for test sample while keeping the reference data
        unchanged; and then estimate jackknife for reference sample while
        keeping the test sample unchanged. This is denoted by `nt+nr` in the
        function. This is the default method. So if there are 12 units in test
        and reference data sets, there will be $12 + 12 = 24$ jackknife
        estimators.
      - For each observation deleted from test sample, estimate jackknife for
        reference sample. This is denoted by `nt*nr` in the function. So if
        there are 12 units in test and reference data sets, there will be 
        $12 \times 12 = 144$ jackknife estimators.
1. Arguments `f2.type` and `ci.type` are explained in the previous section.
1. By default, the individual bootstrapped data set are not included in the
   output as those data sets are only useful for validation purpose. To include
   the individual data sets, set the argument `sim.data.out = TRUE`. 
1. Read the function manual `help("bootf2")` for more details of each argument.

## Examples

The minimum required arguments are `test` and `ref`. First, let's 
simulate some dissolution profiles to play with.

```{r bootf2-sim01}
# time points
tp <- c(5, 10, 15, 20, 30, 45, 60)

# model.par for reference with low variability
par.r <- list(fmax = 100, fmax.cv = 3, mdt = 15, mdt.cv = 14, 
                  tlag = 0, tlag.cv = 0, beta = 1.5, beta.cv = 8)

# simulate reference data
dr <- sim.dp(tp, model.par = par.r, seed = 100, plot = FALSE)

# model.par for test 
par.t <- list(fmax = 100, fmax.cv = 3, mdt = 12.29, mdt.cv = 12,
                  tlag = 0, tlag.cv = 0, beta = 1.727, beta.cv = 9)

# simulate test data with low variability
dt <- sim.dp(tp, model.par = par.t, seed = 100, plot = FALSE)
```

Use default setting for most arguments. To reduce the compilation time,
bootstrap number is set to 100 only.

```{r bootf2-ex01, eval=FALSE}
# output file will be generated automatically
bootf2(dt$sim.disso, dr$sim.disso, n.boots = 100, print.report = FALSE,
       output.to.screen = TRUE)
```