Type: Package
Title: Privacy-Preserving Distributed Algorithms
Version: 1.2.8
Date: 2025-03-10
Description: A collection of privacy-preserving distributed algorithms for conducting multi-site data analyses. The regression analyses can be linear regression for continuous outcome, logistic regression for binary outcome, Cox proportional hazards regression for time-to-event outcome, Poisson regression for count outcome, or multi-categorical regression for nominal or ordinal outcome. The PDA algorithm runs on a lead site and only requires summary statistics from collaborating sites, with one or a few iterations. The package can be used together with the online system (https://pda-ota.pdamethods.org/) for safe and convenient collaboration. For more information, please visit our software websites: https://github.com/Penncil/pda and https://pdamethods.org/.
Maintainer: Yiwen Lu <yiwenlu@sas.upenn.edu>
License: Apache License 2.0
Suggests: imager, lme4
Depends: R (≥ 4.1.0)
Imports: Rcpp (≥ 0.12.19), stats, httr, rvest, jsonlite, data.table, survival, minqa, glmnet, MASS, numDeriv, metafor, ordinal, plyr
LinkingTo: Rcpp, RcppArmadillo
RoxygenNote: 7.2.3
Encoding: UTF-8
LazyData: true
NeedsCompilation: yes
Packaged: 2025-03-10 14:59:12 UTC; cjiajie
Author: Chongliang Luo [aut], Rui Duan [aut], Mackenzie Edmondson [aut], Jiayi Tong [aut], Xiaokang Liu [aut], Kenneth Locke [aut], Yiwen Lu [cre], Yong Chen [aut], Penn Computing Inference Learning (PennCIL) lab [cph]
Repository: CRAN
Date/Publication: 2025-03-10 15:30:01 UTC

ADAP derivatives

Description

ADAP derivatives

Usage

ADAP.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)


ADAP surrogate estimation

Description

ADAP surrogate estimation

Usage

ADAP.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Details

step-3: construct and solve surrogate objective function at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ADAP initialize

Description

ADAP initialize

Usage

ADAP.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init


ADAP simulated data

Description

A simulated data set for ADAP demonstration

Usage

ADAP_data

Format

A list containing the following elements:

sites

site id, 300 'site1', 300 'site2', 300 'site3'

status

binary outcome of length 900

x

900 by 49 matrix generated from a standard normal distribution, representing the covariates


PDA DLM estimation

Description

PDA DLM estimation

Usage

DLM.estimate(ipdata=NULL,control,config)

Arguments

ipdata

not needed at this step (defaults to NULL)

control

PDA control

config

cloud configuration

Details

DLM estimation: (1) Linear model, (2) Linear model with fixed effects, (3) Linear model with random effects (Linear mixed model)

Value

list(bhat, sebhat, sigmahat, uhat, seuhat)
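
The plain linear-model case (1) can be illustrated with a minimal sketch in base R, assuming each site shares only the aggregates X'X, X'y, y'y and its sample size. The helpers site_aggregate and pooled_lm are hypothetical names used for illustration and are not part of the pda package; the fixed-effects and random-effects variants are not covered.

```r
# Hypothetical helper: aggregated data computed locally at one site.
site_aggregate <- function(X, y) {
  list(XtX = crossprod(X), XtY = crossprod(X, y), YtY = sum(y^2), n = nrow(X))
}

# Hypothetical helper: sum the site aggregates and solve the normal equations.
pooled_lm <- function(aggs) {
  XtX <- Reduce(`+`, lapply(aggs, `[[`, "XtX"))
  XtY <- Reduce(`+`, lapply(aggs, `[[`, "XtY"))
  YtY <- sum(sapply(aggs, `[[`, "YtY"))
  n   <- sum(sapply(aggs, `[[`, "n"))
  bhat <- solve(XtX, XtY)
  # residual sum of squares is y'y - b'X'y when b solves the normal equations
  sigma2 <- (YtY - sum(bhat * XtY)) / (n - nrow(XtX))
  list(bhat = drop(bhat),
       sebhat = sqrt(sigma2 * diag(solve(XtX))),
       sigmahat = sqrt(sigma2))
}

# Simulated data for three sites (illustrative only)
set.seed(1)
n <- 300
dat <- data.frame(site = sample(c("site1", "site2", "site3"), n, replace = TRUE),
                  x1 = rnorm(n), x2 = rbinom(n, 1, 0.5))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)

aggs <- lapply(split(dat, dat$site),
               function(d) site_aggregate(model.matrix(~ x1 + x2, d), d$y))
fit_dist <- pooled_lm(aggs)               # uses aggregates only
fit_pool <- lm(y ~ x1 + x2, data = dat)   # uses pooled individual-level data
```

Because the aggregates are additive across sites, fit_dist reproduces the pooled lm fit up to numerical error, which is the sense in which such an approach is lossless.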


DLM initialize

Description

DLM initialize

Usage

DLM.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

Yixin Chen, et al. (2006) Regression cubes with lossless compression and aggregation. IEEE Transactions on Knowledge and Data Engineering, 18(12), pp.1585-1599.
(DLMM) Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data. medRxiv, doi:10.1101/2020.11.16.20230730.


DPQL derive

Description

DPQL derive

Usage

DPQL.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Details

This step calculates the intermediate aggregated data (XtWX, XtWY, and YtWY) for each site. It may need to be repeated until the prespecified number of rounds is reached.
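
As a rough illustration of the aggregated data (XtWX, XtWY, YtWY), the sketch below computes them for one site in base R. The helper weighted_aggregate is hypothetical, and mapping the three aggregates to the returned names SiX, SiXY and SiY is an assumption made here for readability, not something confirmed by the package source.

```r
# Hypothetical helper: weighted aggregates for one site.
# X: design matrix, y: (working) response, w: non-negative working weights.
weighted_aggregate <- function(X, y, w) {
  list(SiX  = crossprod(X, w * X),   # X' W X  (assumed meaning of SiX)
       SiXY = crossprod(X, w * y),   # X' W y  (assumed meaning of SiXY)
       SiY  = sum(w * y^2),          # y' W y  (assumed meaning of SiY)
       ni   = nrow(X))               # site sample size
}

set.seed(2)
X <- cbind(1, rnorm(50))
y <- rnorm(50)
w <- runif(50)
ad <- weighted_aggregate(X, y, w)
```

Multiplying by the weight vector row-wise avoids forming the n by n diagonal matrix W explicitly.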

Value

list(SiX, SiXY, SiY, ni)

References

Chongliang Luo, et al. (2021) dPQL: a lossless distributed algorithm for generalized linear mixed model with application to privacy-preserving hospital profiling. medRxiv, doi:10.1101/2021.05.03.21256561.
Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data. medRxiv, doi:10.1101/2020.11.16.20230730.


PDA DPQL estimation

Description

PDA DPQL estimation

Usage

DPQL.estimate(ipdata=NULL,control,config)

Arguments

ipdata

not needed at this step (defaults to NULL)

control

PDA control

config

cloud configuration

Details

DPQL estimation: (iterative) weighted DLMM using AD from all sites

Value

list(risk_factor, risk_factor_heterogeneity, bhat, sebhat, uhat, seuhat, Vhat)

References

Chongliang Luo, et al. (2021) dPQL: a lossless distributed algorithm for generalized linear mixed model with application to privacy-preserving hospital profiling. medRxiv, doi:10.1101/2021.05.03.21256561.
Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data. medRxiv, doi:10.1101/2020.11.16.20230730.


DPQL initialize

Description

DPQL initialize

Usage

DPQL.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Details

To initialize, fit a glm at each individual site and send the estimated effect sizes and variances to the lead site. This step may be skipped if zeros are used as the initial effect sizes to start the PQL algorithm.
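
A minimal sketch of this initialization in base R follows, assuming a binary outcome and simulated data. The list element names bhat_i and Vhat_i are chosen for illustration and may differ from what DPQL.initialize actually returns.

```r
# Simulated data for three sites (illustrative only)
set.seed(3)
n <- 600
dat <- data.frame(site = rep(c("site1", "site2", "site3"), each = 200),
                  x = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x))

# Each site fits its own glm and keeps only the estimates and their variances.
init <- lapply(split(dat, dat$site), function(d) {
  fit <- glm(y ~ x, family = binomial, data = d)
  list(bhat_i = coef(fit), Vhat_i = diag(vcov(fit)), site_size = nrow(d))
})

# Alternative when the step is skipped: start from zeros.
b_zero <- rep(0, length(init[[1]]$bhat_i))
```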

Value

init


Length of Stay data

Description

A simulated data set of hospitalization Length of Stay (LOS) from 3 sites

Usage

LOS

Format

A data frame with 1000 rows and 5 variables:

site

site id, 500 'site1', 400 'site2' and 100 'site3'

age

3 categories, 'young', 'middle', and 'old'

sex

2 categories, 'M' for male and 'F' for female

lab

lab test results, continuous value ranging from 0 to 100

los

LOS in days, ranging from 1 to 28. Treated as continuous outcome in DLM


Generate pda UWZ derivatives

Description

Generate pda UWZ derivatives

Usage

ODAC.derive(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Details

Calculate and broadcast the first- and second-order derivatives at the initial bbar for ODAC. This requires two substeps: first calculate the summary statistics (U, W, Z), then calculate the derivatives (logL_D1, logL_D2).

Value

list(T_all=T_all, b_meta=b_meta, site=control$mysite, site_size = nrow(ipdata), U=U, W=W, Z=Z, logL_D1=logL_D1, logL_D2=logL_D2)


Generate pda UWZ summary statistics before calculating derivatives

Description

Generate pda UWZ summary statistics before calculating derivatives

Usage

ODAC.deriveUWZ(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(T_all=T_all, b_meta=b_meta, site=control$mysite, site_size = nrow(ipdata), U=U, W=W, Z=Z, logL_D1=logL_D1, logL_D2=logL_D2)


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODAC.estimate(ipdata, control, config)

Arguments

ipdata

local data in data frame

control

pda control

config

cloud config

Details

step-4: construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODAC initialize

Description

ODAC initialize

Usage

ODAC.initialize(ipdata, control, config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(T_i = T_i, bhat_i = fit_i$coef, Vhat_i = summary(fit_i)$coef[,2]^2, site=control$mysite, site_size= nrow(ipdata))

References

Rui Duan, et al. "Learning from local to global: An efficient distributed algorithm for modeling time-to-event data". Journal of the American Medical Informatics Association, 2020, https://doi.org/10.1093/jamia/ocaa044
Chongliang Luo, et al. "ODACH: A One-shot Distributed Algorithm for Cox model with Heterogeneous Multi-center Data". medRxiv, 2021, https://doi.org/10.1101/2021.04.18.21255694


PDA synthesize surrogate estimates from all sites, optional

Description

PDA synthesize surrogate estimates from all sites, optional

Usage

ODAC.synthesize(ipdata, control, config)

Arguments

ipdata

local data in data frame

control

pda control

config

cloud config

Details

Optional step-4: synthesize the surrogate estimates btilde_i from all sites, if every site has broadcast its step-3 result

Value

list(btilde=btilde, Vtilde=Vtilde)


ODACAT derivatives

Description

ODACAT derivatives

Usage

ODACAT.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODACAT.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Details

step-3: construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODACAT initialize

Description

ODACAT initialize

Usage

ODACAT.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init


PDA synthesize surrogate estimates from all sites, optional

Description

PDA synthesize surrogate estimates from all sites, optional

Usage

ODACAT.synthesize(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

pda control

config

pda cloud configuration

Details

Optional step-4: synthesize the surrogate estimates btilde from all sites, if every site has broadcast its step-3 result

Value

list(btilde=btilde, Vtilde=Vtilde)


ODACATH derivatives

Description

ODACATH derivatives

Usage

ODACATH.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(site=config$site_id, site_size = n, S_site=S_site, eta=eta_mat[site,])


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODACATH.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Details

step-3: construct and solve surrogate efficient score at the master/lead site

Value

list(btilde=betanew, btilde.se=beta_SE,eta_mat=eta_mat,eta_mat_theta=NULL,site=config$site_id, site_size=n_site)


ODACATH initialize

Description

ODACATH initialize

Usage

ODACATH.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init


PDA synthesize surrogate estimates from all sites, optional

Description

PDA synthesize surrogate estimates from all sites, optional

Usage

ODACATH.synthesize(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

pda control

config

pda cloud configuration

Details

Optional step-4: synthesize the surrogate estimates btilde from all sites, if every site has broadcast its step-3 result

Value

list(btilde=btilde, Vtilde=Vtilde)


ODACAT simulated data

Description

A simulated data set for ODACAT demonstration

Usage

ODACAT_nominal

Format

A data frame with 300 rows and 5 variables:

id.site

site id, 102 'site1', 100 'site2', 98 'site3'

outcome

3-category outcome, possible values are 1,2,3. Category 3 will be used as reference

X1

the first covariate, continuous

X2

the second covariate, binary

X3

the third covariate, binary
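
To make the role of the reference category concrete, the sketch below computes class probabilities for a 3-category nominal outcome with category 3 as reference. It assumes a standard multinomial logit parameterization, which the text above does not state explicitly, and the helper multinom_probs is a hypothetical name.

```r
# Hypothetical helper: class probabilities with the last category as reference.
# X: n x p design matrix; B: p x 2 coefficients (columns for categories 1 and 2).
multinom_probs <- function(X, B) {
  expeta <- exp(X %*% B)            # n x 2 matrix of exp(linear predictors)
  denom  <- 1 + rowSums(expeta)     # the reference category contributes exp(0) = 1
  cbind(expeta / denom, 1 / denom)  # columns: P(Y=1), P(Y=2), P(Y=3)
}

X <- cbind(1, c(0, 1))
B <- matrix(0, nrow = 2, ncol = 2)  # all-zero coefficients give equal probabilities
P <- multinom_probs(X, B)
```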


ODACAT simulated data

Description

A simulated data set for ODACAT demonstration

Usage

ODACAT_ordinal

Format

A data frame with 300 rows and 5 variables:

id.site

site id, 105 'site1', 105 'site2', 90 'site3'

outcome

3-category outcome, possible values are 1,2,3. Category 3 will be used as reference

X1

the first covariate, continuous

X2

the second covariate, binary

X3

the third covariate, binary


ODAH derivatives

Description

ODAH derivatives

Usage

ODAH.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

derivatives list(site = config$site_id, site_size = nrow(ipdata), logL_D1_zero = logL_D1_zero, logL_D1_count = logL_D1_count, logL_D2_zero = logL_D2_zero, logL_D2_count = logL_D2_count)


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODAH.estimate(ipdata,control,config)

Arguments

ipdata

local data in a list(ipdata, X_count, X_zero)

control

PDA control

config

cloud configuration

Details

construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODAH initialize

Description

ODAH initialize

Usage

ODAH.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

TBD


ODAL derivatives

Description

ODAL derivatives

Usage

ODAL.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

list(site=config$site_id, site_size = nrow(ipdata), logL_D1=logL_D1, logL_D2=logL_D2)
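
For a logistic model, the quantities logL_D1 and logL_D2 can be sketched as the gradient and Hessian of the average log-likelihood evaluated at a given bbar. The helper logistic_derivs is hypothetical, and scaling by the site size is an assumption; the package may scale differently.

```r
# Hypothetical helper: derivatives of the average logistic log-likelihood at bbar.
logistic_derivs <- function(X, y, bbar) {
  p <- plogis(drop(X %*% bbar))
  list(site_size = nrow(X),
       logL_D1 = drop(crossprod(X, y - p)) / nrow(X),         # gradient
       logL_D2 = -crossprod(X, p * (1 - p) * X) / nrow(X))    # Hessian
}

set.seed(6)
X <- cbind(1, rnorm(200))
y <- rbinom(200, 1, plogis(drop(X %*% c(-0.2, 0.6))))

# Evaluated at the site's own MLE, the gradient should be numerically zero.
bbar <- coef(glm(y ~ X - 1, family = binomial))
d <- logistic_derivs(X, y, bbar)
```

In the distributed setting bbar would be a common initial value shared by all sites rather than each site's own MLE; the MLE is used here only as a sanity check.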


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODAL.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Details

step-3: construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))
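
The idea of a surrogate log-likelihood can be sketched in base R as follows. This is a simplified first-order version under stated assumptions: the initial value bbar is taken to be the lead site's own estimate, the site gradients are combined weighted by site size, and all function names are hypothetical. It is not the package's implementation.

```r
set.seed(4)
n <- 900
site <- rep(1:3, each = 300)
X <- cbind(1, rnorm(n), rbinom(n, 1, 0.5))
y <- rbinom(n, 1, plogis(drop(X %*% c(-0.3, 0.7, -0.5))))

# Average logistic log-likelihood and its gradient
avg_logL <- function(b, X, y) { eta <- drop(X %*% b); mean(y * eta - log1p(exp(eta))) }
avg_grad <- function(b, X, y) drop(crossprod(X, y - plogis(drop(X %*% b)))) / nrow(X)

# Lead site (site 1) data and initial value bbar (assumption: lead-site MLE)
X1 <- X[site == 1, ]; y1 <- y[site == 1]
bbar <- coef(glm(y1 ~ X1 - 1, family = binomial))

# Each site reports its average gradient at bbar; combine weighted by site size.
grads <- sapply(1:3, function(k) avg_grad(bbar, X[site == k, ], y[site == k]))
sizes <- as.numeric(table(site))
grad_all <- drop(grads %*% sizes) / sum(sizes)

# First-order surrogate: local log-likelihood plus a linear gradient correction.
shift <- grad_all - avg_grad(bbar, X1, y1)
neg_surrogate <- function(b) -(avg_logL(b, X1, y1) + sum(shift * b))
neg_surrogate_grad <- function(b) -(avg_grad(b, X1, y1) + shift)

sol <- optim(bbar, neg_surrogate, neg_surrogate_grad, method = "BFGS", hessian = TRUE)
btilde <- sol$par
Htilde <- sol$hessian
b_pool <- coef(glm(y ~ X - 1, family = binomial))  # pooled estimate, for comparison only
```

Only the lead site's individual-level data and the other sites' gradients enter the optimization; btilde can then be compared with b_pool to judge how close the surrogate estimate is in a given data set.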


ODAL initialize

Description

ODAL initialize

Usage

ODAL.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

Rui Duan, et al. "Learning from electronic health records across multiple sites: A communication-efficient and privacy-preserving distributed algorithm". Journal of the American Medical Informatics Association, 2020, https://doi.org/10.1093/jamia/ocz199


PDA synthesize surrogate estimates from all sites, optional

Description

PDA synthesize surrogate estimates from all sites, optional

Usage

ODAL.synthesize(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

pda control

config

pda cloud configuration

Details

Optional step-4: synthesize the surrogate estimates btilde_i from all sites, if every site has broadcast its step-3 result

Value

list(btilde=btilde, Vtilde=Vtilde)
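
One common way to synthesize site-level estimates is inverse-variance weighting. The sketch below assumes that rule, which is not stated in the text above, and uses a hypothetical helper name synthesize_iv; treat it as an illustration rather than the package's method.

```r
# Hypothetical helper: inverse-variance weighted average of site-level estimates.
# btilde_list: list of coefficient vectors; Vtilde_list: list of variance vectors.
synthesize_iv <- function(btilde_list, Vtilde_list) {
  B <- do.call(cbind, btilde_list)        # p x K matrix of estimates
  W <- 1 / do.call(cbind, Vtilde_list)    # p x K matrix of weights
  list(btilde = rowSums(B * W) / rowSums(W),
       Vtilde = 1 / rowSums(W))
}

# Two sites, two coefficients (made-up numbers for illustration)
res <- synthesize_iv(list(c(1, 2), c(3, 2)),
                     list(c(1, 1), c(1, 3)))
```

With equal variances the first coefficient is a simple average of 1 and 3; the second coefficient stays at 2 because both sites agree on it.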


ODAP derivatives

Description

ODAP derivatives

Usage

ODAP.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

derivatives list(site = config$site_id, site_size = nrow(ipdata), logL_D1 = logL_D1, logL_D2 = logL_D2)
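
For a Poisson model with log link, logL_D1 and logL_D2 can be sketched as the gradient and Hessian of the average log-likelihood at a given bbar. The helper poisson_derivs is hypothetical, no offset is included, and scaling by the site size is an assumption.

```r
# Hypothetical helper: derivatives of the average Poisson log-likelihood at bbar.
poisson_derivs <- function(X, y, bbar) {
  mu <- exp(drop(X %*% bbar))
  list(site_size = nrow(X),
       logL_D1 = drop(crossprod(X, y - mu)) / nrow(X),   # gradient
       logL_D2 = -crossprod(X, mu * X) / nrow(X))        # Hessian
}

set.seed(7)
X <- cbind(1, rnorm(200))
y <- rpois(200, exp(drop(X %*% c(0.3, 0.5))))

# Sanity check: at the site's own MLE the gradient is numerically zero.
bbar <- coef(glm(y ~ X - 1, family = poisson))
d <- poisson_derivs(X, y, bbar)
```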


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODAP.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame (generated in pda)

control

PDA control

config

cloud configuration

Details

construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODAP initialize

Description

ODAP initialize

Usage

ODAP.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

TBD


ODAPB derivatives

Description

ODAPB derivatives

Usage

ODAPB.derive(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

derivatives list(site = config$site_id, site_size = nrow(ipdata), logL_D1 = logL_D1, logL_D2 = logL_D2)


PDA surrogate estimation

Description

PDA surrogate estimation

Usage

ODAPB.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame (generated in pda)

control

PDA control

config

cloud configuration

Details

construct and solve surrogate logL at the master/lead site

Value

list(btilde = sol$par, Htilde = sol$hessian, site=control$mysite, site_size=nrow(ipdata))


ODAPB initialize

Description

ODAPB initialize

Usage

ODAPB.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

TBD


COVID-19 LOS and mortality data

Description

A simulated data set of hospitalization Length of Stay (LOS) and mortality from 6 sites

Usage

covid

Format

A data frame with 2100 rows and 6 variables:

site

site id, 600 'site1', 500 'site2', 400 'site3', 300 'site4', 200 'site5', 100 'site6'

age

continuous age in years, min 3 max 97

sex

2 categories, '1' for male and '0' for female

lab

lab test results, continuous value ranging from 2.3 to 97.4

los

LOS in days, ranging from 1 to 29

death

mortality status, '1' for death and '0' for alive.


CrabSatellites data

Description

A data set modified from the CrabSatellites data in countreg package (see demo(ODAH)).

Usage

cs

Format

A data frame containing 173 observations on 4 variables.

site

Simulated site id, 85 'site1' and 88 'site2'.

satellites

Number of satellites. Treated as (zero-inflated) count outcome in ODAH

width

Carapace width (cm).

weight

Weight (kg).

Source

https://rdrr.io/rforge/countreg/man/CrabSatellites.html


dGEM hospital-specific effect derivation

Description

dGEM hospital-specific effect derivation

Usage

dGEM.derive(ipdata,control,config,hosdata)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

hosdata

hospital-level data

Value

hospital_effect


dGEM standardized event rate estimation

Description

dGEM standardized event rate estimation

Usage

dGEM.estimate(ipdata,control,config)

Arguments

ipdata

local data in data frame

control

PDA control

config

cloud configuration

Details

step-3:

Value

event rate


dGEM initialize

Description

dGEM initialize

Usage

dGEM.initialize(ipdata,control,config)

Arguments

ipdata

individual participant data

control

pda control data

config

local site configuration

Value

init

References

NA


PDA dGEM synthesize

Description

PDA dGEM synthesize

Usage

dGEM.synthesize(control,config)

Arguments

control

pda control

config

pda cloud configuration

Details

Synthesis to get the standardized mortality rate

Value

list(final_event_rate=final_event_rate)


gather cloud settings into a list

Description

gather cloud settings into a list

Usage

getCloudConfig(site_id,dir,uri,secret)

Arguments

site_id

site identifier

dir

shared directory path if flat files

uri

web uri if web service

secret

web token if web service

Value

A list of cloud parameters: site_id, secret and uri
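
The sketch below shows one way such settings could be gathered, falling back to the environment variables PDA_USER, PDA_URI and PDA_SECRET that appear in the pda examples. The function gather_config is a hypothetical stand-in written for illustration; it is not the implementation of getCloudConfig and its fallback behavior is an assumption.

```r
# Hypothetical stand-in for illustration only; not the package's implementation.
gather_config <- function(site_id = Sys.getenv("PDA_USER"),
                          dir = NULL,
                          uri = Sys.getenv("PDA_URI"),
                          secret = Sys.getenv("PDA_SECRET")) {
  config <- list(site_id = site_id)
  if (!is.null(dir)) config$dir <- dir          # shared directory (flat files)
  if (nzchar(uri)) config$uri <- uri            # web uri (web service)
  if (nzchar(secret)) config$secret <- secret   # web token (web service)
  config
}

# Flat-file setup in a temporary directory
cfg <- gather_config(site_id = "site1", dir = tempdir())
```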

See Also

pda


Lung cancer survival time data

Description

A data set modified from the lung data in survival package (see demo(ODAC)).

Usage

lung2

Format

A data frame with 228 rows and 5 variables:

site

simulated site id, 86 'site1', 83 'site2' and 59 'site3'

time

survival time in days

status

censoring status 0=censored, 1=dead

age

age in years

sex

1 for female and 0 for male

Source

https://CRAN.R-project.org/package=survival


A flexible version of MASS::glmmPQL

Description

A flexible version of MASS::glmmPQL

Usage

myglmmPQL(formula.glm, formula, offset=NULL, family, data, fixef.init = NULL, 
                 weights=NULL, REML=T, niter=10, verbose=T)

Arguments

formula.glm

formula used to fit glm for initial fixed effects

formula

formula used to fit iterative lmer in PQL algorithm

offset

glm offset term

family

glm family

data

glm data

fixef.init

initial fixed effects estimates, set to zeros if NULL

weights

glm weights

REML

lmer logical scalar - Should the estimates be chosen to optimize the REML criterion (as opposed to the log-likelihood)?

niter

glmmPQL maximum number of iterations.

verbose

glmmPQL logical: print out record of iterations?

Details

Use lme4::lmer instead of nlme::varFixed in PQL iteration to allow REML
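
The core of a PQL iteration is a working response and working weights that turn the glm into a weighted linear fit. The base R sketch below shows this for a binomial logit model with the random effects omitted, so the weighted fit uses lm; in PQL proper the weighted fit would be a linear mixed model (lme4::lmer according to the note above). Variable names are illustrative.

```r
set.seed(5)
n <- 400
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.4 + 0.9 * x))

b <- c(0, 0)                          # start from zeros
for (iter in 1:25) {
  eta <- b[1] + b[2] * x              # current linear predictor
  mu  <- plogis(eta)                  # fitted mean
  w   <- mu * (1 - mu)                # working weights
  z   <- eta + (y - mu) / w           # working response
  b_new <- unname(coef(lm(z ~ x, weights = w)))
  delta <- max(abs(b_new - b))
  b <- b_new
  if (delta < 1e-10) break            # stop once the update is negligible
}

b_glm <- unname(coef(glm(y ~ x, family = binomial)))  # reference fit
```

Without random effects this iteration is ordinary iteratively reweighted least squares, so b should agree with the glm coefficients.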

Value

An object with the same format as lmer.


PDA: Privacy-preserving Distributed Algorithm

Description

Fit Privacy-preserving Distributed Algorithms for linear, logistic, Poisson and Cox PH regression with possible heterogeneous data across sites.

Usage

pda(ipdata,site_id,control,dir,uri,secret,hosdata)

Arguments

ipdata

Local individual participant data (IPD) in a data frame; should include at least one column for the outcome and one column for the covariates

site_id

Character site name

control

pda control data

dir

directory for shared flat file cloud

uri

Uniform Resource Identifier for this run

secret

password to authenticate as site_id on uri

hosdata

hospital-level data; variable names should match those defined in the control file

Value

control

control

References

Michael I. Jordan, Jason D. Lee & Yun Yang (2019) Communication-Efficient Distributed Statistical Inference,
Journal of the American Statistical Association, 114:526, 668-681
doi:10.1080/01621459.2018.1429274.
(DLM) Yixin Chen, et al. (2006) Regression cubes with lossless compression and aggregation. IEEE Transactions on Knowledge and Data Engineering, 18(12), pp.1585-1599.
(DLMM) Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data. medRxiv, doi:10.1101/2020.11.16.20230730.
(DPQL) Chongliang Luo, et al. (2021) dPQL: a lossless distributed algorithm for generalized linear mixed model with application to privacy-preserving hospital profiling.
medRxiv, doi:10.1101/2021.05.03.21256561.
(ODAL) Rui Duan, et al. (2020) Learning from electronic health records across multiple sites:
A communication-efficient and privacy-preserving distributed algorithm.
Journal of the American Medical Informatics Association, 27.3:376–385,
doi:10.1093/jamia/ocz199.
(ODAC) Rui Duan, et al. (2020) Learning from local to global: An efficient distributed algorithm for modeling time-to-event data.
Journal of the American Medical Informatics Association, 27.7:1028–1036,
doi:10.1093/jamia/ocaa044.
(ODACH) Chongliang Luo, et al. (2021) ODACH: A One-shot Distributed Algorithm for Cox model with Heterogeneous Multi-center Data.
medRxiv, doi:10.1101/2021.04.18.21255694.
(ODAH) Mackenzie J. Edmondson, et al. (2021) An Efficient and Accurate Distributed Learning Algorithm for Modeling Multi-Site Zero-Inflated Count Outcomes. medRxiv,
doi:10.1101/2020.12.17.20248194.
(ADAP) Xiaokang Liu, et al. (2021) ADAP: multisite learning with high-dimensional heterogeneous data via A Distributed Algorithm for Penalized regression.
(dGEM) Jiayi Tong, et al. (2022) dGEM: Decentralized Generalized Linear Mixed Effects Model

See Also

pdaPut, pdaList, pdaGet, getCloudConfig and pdaSync.

Examples

require(survival)
require(data.table)
require(pda)
data(lung)

## In the toy example below we analyze the association of lung status with
## age and sex using logistic regression. The data(lung) from 'survival' are randomly
## assigned to 3 sites: 'site1', 'site2', 'site3'. We demonstrate that PDA ODAL can
## obtain a surrogate estimator that is close to the pooled estimate. We run the
## example in a local directory. In actual collaboration, an account/password for the
## pda server will be assigned to the sites at the server https://pda.one.
## Each site can use a web browser to check the communication of the summary stats.

## for more examples, see demo(ODAC) and demo(ODAP)

# Create 3 sites, split the lung data amongst them
sites = c('site1', 'site2', 'site3')
set.seed(42)
lung2 <- lung[,c('status', 'age', 'sex')]
lung2$sex <- lung2$sex - 1
lung2$status <- ifelse(lung2$status == 2, 1, 0)
lung_split <- split(lung2, sample(1:length(sites), nrow(lung), replace=TRUE))
## fit logistic reg using pooled data
fit.pool <- glm(status ~ age + sex, family = 'binomial', data = lung2)


# ############################  STEP 1: initialize  ###############################
control <- list(project_name = 'Lung cancer study',
                step = 'initialize',
                sites = sites,
                heterogeneity = FALSE,
                model = 'ODAL',
                family = 'binomial',
                outcome = "status",
                variables = c('age', 'sex'),
                optim_maxit = 100,
                lead_site = 'site1',
                upload_date = as.character(Sys.time()) )


## run the example in local directory:
## specify your working directory, default is the tempdir
mydir <- tempdir()
## assume lead site1: enter "1" to allow transferring the control file  
pda(site_id = 'site1', control = control, dir = mydir)
## in actual collaboration, account/password for pda server will be assigned, thus:
## Not run: pda(site_id = 'site1', control = control, uri = 'https://pda.one', secret='abc123')
## you can also set your environment variables, and no need to specify them in pda:
## Not run: Sys.setenv(PDA_USER = 'site1', PDA_SECRET = 'abc123', PDA_URI = 'https://pda.one')
## Not run: pda(site_id = 'site1', control = control)

##' assume remote site3: enter "1" to allow transferring your local estimate
pda(site_id = 'site3', ipdata = lung_split[[3]], dir=mydir)

##' assume remote site2: enter "1" to allow transferring your local estimate
pda(site_id = 'site2', ipdata = lung_split[[2]], dir=mydir)

##' assume lead site1: enter "1" to allow transferring your local estimate
##' control.json is also automatically updated
pda(site_id = 'site1', ipdata = lung_split[[1]], dir=mydir)

##' if lead site1 initialized before other sites,
##' lead site1: uncomment to sync the control before STEP 2
## Not run: pda(site_id = 'site1', control = control)
## Not run: config <- getCloudConfig(site_id = 'site1')
## Not run: pdaSync(config)

# ############################  STEP 2: derivative  ############################
##' assume remote site3: enter "1" to allow transferring your derivatives
pda(site_id = 'site3', ipdata = lung_split[[3]], dir=mydir)

##' assume remote site2: enter "1" to allow transferring your derivatives
pda(site_id = 'site2', ipdata = lung_split[[2]], dir=mydir)

##' assume lead site1: enter "1" to allow transferring your derivatives
pda(site_id = 'site1', ipdata = lung_split[[1]], dir=mydir)


# ############################  STEP 3: estimate  ############################
##' assume lead site1: enter "1" to allow transferring the surrogate estimate
pda(site_id = 'site1', ipdata = lung_split[[1]], dir=mydir)

##' the PDA ODAL is now completed!
##' All the sites can still run their own surrogate estimates and broadcast them.

##' compare the surrogate estimate with the pooled estimate 
config <- getCloudConfig(site_id = 'site1', dir=mydir)
fit.odal <- pdaGet(name = 'site1_estimate', config = config)
cbind(b.pool=fit.pool$coef,
      b.odal=fit.odal$btilde,
      sd.pool=summary(fit.pool)$coef[,2],
      sd.odal=sqrt(diag(solve(fit.odal$Htilde)/nrow(lung2))))
      
## see demo(ODAL) for more optional steps


Function to download json and return as object

Description

Function to download json and return as object

Usage

pdaGet(name,config)

Arguments

name

name of the file

config

cloud configuration

Value

A list of data objects from the json file on the cloud

See Also

pda


Function to list available objects

Description

Function to list available objects

Usage

pdaList(config)

Arguments

config

a list of variables for cloud configuration

Value

A list of (json) files on the cloud

See Also

pda


Function to upload object to cloud as json

Description

Function to upload object to cloud as json

Usage

pdaPut(obj,name,config)

Arguments

obj

R object to encode as json and upload to the cloud

name

name of the file

config

a list of variables for cloud configuration

Value

NONE

See Also

pda


pda control synchronize

Description

update pda control if ready (run by lead)

Usage

pdaSync(config)

Arguments

config

cloud configuration

Value

control

See Also

pda