Version: | 1.0.0 |
Title: | Sparse Regression with Paired Covariates |
Description: | Implements sparse regression with paired covariates (<doi:10.1007/s11634-019-00375-6>). The paired lasso is designed for settings where each covariate in one set forms a pair with a covariate in the other set (one-to-one correspondence). For the optional correlation shrinkage, install ashr (https://github.com/stephens999/ashr) and CorShrink (https://github.com/kkdey/CorShrink) from GitHub (see README). |
Depends: | R (≥ 3.0.0) |
Imports: | glmnet, Matrix, survival |
Suggests: | knitr, testthat, rmarkdown, remotes, pROC, edgeR, ashr, CorShrink |
License: | GPL-3 |
Encoding: | UTF-8 |
VignetteBuilder: | knitr |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/rauschenberger/palasso, https://rauschenberger.github.io/palasso/ |
BugReports: | https://github.com/rauschenberger/palasso/issues |
NeedsCompilation: | no |
Packaged: | 2024-09-26 12:43:23 UTC; armin.rauschenberger |
Author: | Armin Rauschenberger |
Maintainer: | Armin Rauschenberger <armin.rauschenberger@uni.lu> |
Repository: | CRAN |
Date/Publication: | 2024-09-26 22:40:02 UTC |
Paired lasso
Description
The function palasso fits the paired lasso. Use this function if you have paired covariates and want a sparse model.
Usage
palasso(y = y, X = X, max = 10, ...)
Arguments
y |
response:
vector of length |
X |
covariates:
list of matrices,
each with |
max |
maximum number of non-zero coefficients:
positive numeric, or |
... |
Details
Let x denote one entry of the list X. See glmnet for alternative specifications of y and x. Among the further arguments, family must equal "gaussian", "binomial", "poisson", or "cox", and penalty.factor must not be used.
Hidden arguments: Deactivate adaptive lasso by setting adaptive to FALSE, activate standard lasso by setting standard to TRUE, and activate shrinkage by setting shrink to TRUE.
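As a minimal sketch of how the hidden arguments might be passed, the following reuses the simulated data layout from the Examples section of this page. It assumes that the palasso package is installed (the fit is skipped otherwise); the object names fit_default and fit_standard are illustrative only.

```r
set.seed(1)
n <- 50; p <- 20
# binary response and two covariate sets;
# column j of X[[1]] forms a pair with column j of X[[2]]
y <- rbinom(n = n, size = 1, prob = 0.5)
X <- lapply(1:2, function(i) matrix(rnorm(n * p), nrow = n, ncol = p))

fit_default <- fit_standard <- NULL
if (requireNamespace("palasso", quietly = TRUE)) {
  # settings as in the Examples section (adaptive=TRUE, standard=FALSE)
  fit_default <- palasso::palasso(y = y, X = X, family = "binomial")
  # additionally activate the standard lasso via the hidden argument
  fit_standard <- palasso::palasso(y = y, X = X, family = "binomial",
                                   standard = TRUE)
}
```

Hidden arguments travel through the dots, so they are named explicitly in the call rather than appearing in the Usage line.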
Value
This function returns an object of class palasso. Available methods include predict, coef, weights, fitted, residuals, deviance, logLik, and summary.
References
Armin Rauschenberger, Iuliana Ciocanea-Teodorescu, Marianne A. Jonker, Renee X. Menezes, and Mark A. van de Wiel (2020). "Sparse classification with paired covariates." Advances in Data Analysis and Classification 14:571-588. doi:10.1007/s11634-019-00375-6. (Contact: armin.rauschenberger@uni.lu.)
Examples
set.seed(1)
n <- 50; p <- 20
y <- rbinom(n=n,size=1,prob=0.5)
X <- lapply(1:2,function(x) matrix(rnorm(n*p),nrow=n,ncol=p))
object <- palasso(y=y,X=X,family="binomial") # adaptive=TRUE,standard=FALSE
names(object)
Arguments
Description
Checks the validity of the provided arguments.
Usage
.args(...)
Arguments
... |
Arguments supplied to |
Value
Returns the arguments as a list, including default values for missing arguments.
Examples
NA
Combining p-values
Description
This function combines local p-values to a global p-value.
Usage
.combine(x, method = "simes")
Arguments
x |
local |
method |
character |
Value
These functions return a numeric vector of length p (main effects), or a numeric matrix with p rows and p columns (interaction effects).
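The default method = "simes" presumably refers to the Simes procedure. A minimal base R sketch of that procedure follows; the helper simes is hypothetical and is not the package's internal .combine, whose implementation may differ.

```r
# Hypothetical helper (not part of palasso): Simes combination of
# n local p-values into one global p-value, min_i n * p_(i) / i
simes <- function(x) {
  stopifnot(is.numeric(x), !anyNA(x), all(x >= 0 & x <= 1))
  n <- length(x)
  sorted <- sort(x)              # p_(1) <= ... <= p_(n)
  min(n * sorted / seq_len(n))
}

p_local <- c(0.01, 0.02, 0.03)
p_global <- simes(p_local)       # all three terms equal 0.03 here
```

Because the last term equals the largest local p-value, the global p-value never exceeds one.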
References
Westfall, P. H. (2005). "Combining p-values". Encyclopedia of Biostatistics doi:10.1002/0470011815.b2a15181
Examples
# independence
p <- runif(10)
palasso:::.combine(p)
## dependence
#runif <- function(n,cor=0){
# Sigma <- matrix(cor,nrow=n,ncol=n)
# diag(Sigma) <- 1
# mu <- rep(0,times=n)
# q <- MASS::mvrnorm(n=1,mu=mu,Sigma=Sigma)
# stats::pnorm(q=q)
#}
#p <- runif(n=10,cor=0.8)
#palasso:::.combine(p)
Correlation
Description
Calculates the correlation between the response and the covariates. Shrinks the correlation coefficients for each covariate set separately.
Usage
.cor(y, x, args)
Arguments
y |
vector of length |
x |
matrix with |
args |
options for paired lasso: list of arguments (output from .dims and .args) |
Value
list of vectors
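The unshrunken part of this calculation can be sketched in base R as follows. The data and the object name cors are illustrative assumptions; the optional shrinkage step (which relies on CorShrink) is omitted, and the internal function may organise its input differently.

```r
set.seed(1)
n <- 30; p <- 5
y <- rnorm(n)
# two covariate sets with n rows and p columns each
X <- lapply(1:2, function(i) matrix(rnorm(n * p), nrow = n, ncol = p))

# one vector of p correlation coefficients per covariate set
cors <- lapply(X, function(x) as.vector(stats::cor(x, y)))
```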
Examples
NA
Cross-validation
Description
Repeatedly leaves out samples, and predicts their response.
Usage
.cv(y, x, foldid, lambda, args)
Arguments
y |
response:
vector of length |
x |
covariates:
matrix with |
foldid |
fold identifiers:
vector of length |
lambda |
lambda sequence: vector of decreasing positive values |
args |
options for paired lasso: list of arguments (output from .dims and .args) |
Value
Returns matrix of predicted values (except "cox")
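The shape of such a result can be illustrated with a small base R sketch. To stay self-contained, it swaps in closed-form ridge regression for the lasso fits that the package obtains from glmnet, and it covers the "gaussian" case only; all object names are assumptions.

```r
set.seed(1)
n <- 40; p <- 5
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- drop(x %*% c(1, -1, 0, 0, 0)) + rnorm(n)
lambda <- c(10, 1, 0.1)                             # decreasing positive values
foldid <- sample(rep(seq_len(5), length.out = n))   # fold identifiers

# out-of-fold predictions: one row per sample, one column per lambda
pred <- matrix(NA_real_, nrow = n, ncol = length(lambda))
for (k in sort(unique(foldid))) {
  train <- foldid != k
  x_tr <- cbind(1, x[train, , drop = FALSE])        # intercept column
  x_te <- cbind(1, x[!train, , drop = FALSE])
  for (j in seq_along(lambda)) {
    pen <- diag(c(0, rep(lambda[j], p)))            # intercept not penalised
    beta <- solve(crossprod(x_tr) + pen, crossprod(x_tr, y[train]))
    pred[!train, j] <- x_te %*% beta                # predict held-out samples
  }
}
```

Each sample is predicted exactly once, by the model that did not see it during fitting.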
Examples
NA
Dimensionality
Description
This function extracts the dimensions.
Usage
.dims(y, X, args = NULL)
Arguments
y |
response:
vector of length |
X |
covariates:
list of matrices,
each with |
args |
options for paired lasso: list of arguments (output from .dims and .args) |
Value
The function .dims extracts the dimensionality. It returns the numbers of samples, covariate pairs and covariate sets. It also returns the number of weighting schemes, and the names of these weighting schemes.
Examples
NA
Extraction
Description
Extracts cv.glmnet-like object.
Usage
.extract(fit, lambda, cvm, type.measure)
Arguments
fit |
matrix with one row for each sample
("gaussian", "binomial" and "poisson"),
or one row for each fold (only "cox"),
and one column for each |
lambda |
lambda sequence: vector of decreasing positive values |
cvm |
mean cross-validated loss:
vector of same length as |
type.measure |
... loss function: character "deviance", "mse", "mae", "class", or "auc" |
Examples
NA
Model bag
Description
Fits all models from the chosen bag.
Usage
.fit(y, x, args)
Arguments
y |
response:
vector of length |
x |
covariates:
matrix with |
args |
options for paired lasso: list of arguments (output from .dims and .args) |
Value
list of glmnet-like objects
Examples
NA
Cross-validation folds
Description
Assigns samples to cross-validation folds, balancing the folds in the case of a binary or survival response.
Usage
.folds(y, nfolds, foldid = NULL)
Arguments
y |
response:
vector of length |
nfolds |
number of folds:
positive integer
( |
foldid |
fold identifiers:
vector of length |
Value
Returns the fold identifiers.
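One way to balance folds for a binary response is to assign fold identifiers separately within each class. The helper balanced_folds below is a hypothetical base R sketch of that idea, not the package's internal .folds; the survival case is not covered.

```r
# Hypothetical helper (not part of palasso): balanced fold assignment
balanced_folds <- function(y, nfolds) {
  foldid <- integer(length(y))
  for (level in unique(y)) {
    idx <- which(y == level)
    # cycle through 1..nfolds within the class, then shuffle the labels
    ids <- rep(seq_len(nfolds), length.out = length(idx))
    foldid[idx] <- ids[sample.int(length(idx))]
  }
  foldid
}

set.seed(1)
y <- rbinom(n = 50, size = 1, prob = 0.3)
foldid <- balanced_folds(y, nfolds = 5)
table(y, foldid)   # per class, fold sizes differ by at most one
```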
Examples
NA
Cross-validation loss
Description
Calculates mean cross-validated loss.
Usage
.loss(y, fit, family, type.measure, foldid = NULL)
Arguments
y |
response:
vector of length |
fit |
matrix with one row for each sample
("gaussian", "binomial" and "poisson"),
or one row for each fold (only "cox"),
and one column for each |
family |
model family: character "gaussian", "binomial", "poisson", or "cox" |
type.measure |
... loss function: character "deviance", "mse", "mae", "class", or "auc" |
foldid |
fold identifiers:
vector of length |
Value
Returns list of vectors, one for each model.
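For a binary response, the loss per column of fit could be computed along the following lines. The helper cv_loss is a hypothetical sketch under textbook definitions of the losses; it takes predicted probabilities as input, omits "auc", and may differ from the package's internal scaling.

```r
# Hypothetical helper (not part of palasso): mean cross-validated loss
# for a binary response, one value per column (lambda) of 'fit'
cv_loss <- function(y, fit, type.measure = "deviance") {
  apply(fit, 2, function(prob) {
    prob <- pmin(pmax(prob, 1e-05), 1 - 1e-05)   # avoid log(0)
    switch(type.measure,
      deviance = mean(-2 * (y * log(prob) + (1 - y) * log(1 - prob))),
      mse = mean((y - prob)^2),
      mae = mean(abs(y - prob)),
      class = mean(y != (prob > 0.5)),
      stop("unsupported type.measure"))
  })
}

y <- c(0, 1, 1, 0)
fit <- cbind(c(0.5, 0.5, 0.5, 0.5),   # uninformative predictions
             c(0.1, 0.9, 0.8, 0.2))   # informative predictions
loss_mse <- cv_loss(y, fit, "mse")
loss_class <- cv_loss(y, fit, "class")
```

The second column has the smaller loss under both measures, so it would be the preferred lambda in this toy case.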
Examples
NA
Weighting schemes
Description
Calculates the weighting schemes.
Usage
.weight(cor, args)
Arguments
cor |
correlation coefficients:
list of |
args |
options for paired lasso: list of arguments (output from .dims and .args) |
Value
list of named vectors (one for each weighting scheme)
Examples
NA
Arguments for "palasso"
Description
This page lists the arguments for the (internal) "palasso" function(s).
Arguments
y |
response:
vector of length |
X |
covariates:
list of matrices,
each with |
max |
maximum number of non-zero coefficients:
positive numeric, or |
... |
|
x |
covariates:
matrix with |
args |
options for paired lasso: list of arguments (output from .dims and .args) |
nfolds |
number of folds:
positive integer
( |
foldid |
fold identifiers:
vector of length |
cor |
correlation coefficients:
list of |
lambda |
lambda sequence: vector of decreasing positive values |
family |
model family: character "gaussian", "binomial", "poisson", or "cox" |
type.measure |
... loss function: character "deviance", "mse", "mae", "class", or "auc" |
fit |
matrix with one row for each sample
("gaussian", "binomial" and "poisson"),
or one row for each fold (only "cox"),
and one column for each |
cvm |
mean cross-validated loss:
vector of same length as |
Methods for class "palasso"
Description
This page lists the main methods for class "palasso".
Usage
## S3 method for class 'palasso'
predict(object, newdata, model = "paired", s = "lambda.min", max = NULL, ...)
## S3 method for class 'palasso'
coef(object, model = "paired", s = "lambda.min", max = NULL, ...)
## S3 method for class 'palasso'
weights(object, model = "paired", max = NULL, ...)
## S3 method for class 'palasso'
fitted(object, model = "paired", s = "lambda.min", max = NULL, ...)
## S3 method for class 'palasso'
residuals(object, model = "paired", s = "lambda.min", max = NULL, ...)
## S3 method for class 'palasso'
deviance(object, model = "paired", max = NULL, ...)
## S3 method for class 'palasso'
logLik(object, model = "paired", max = NULL, ...)
## S3 method for class 'palasso'
summary(object, model = "paired", ...)
Arguments
object |
palasso object |
newdata |
covariates:
list of matrices, each with |
model |
character |
s |
penalty parameter:
character |
max |
maximum number of non-zero coefficients,
positive integer,
or |
... |
further arguments for
|
Details
By default, the function predict returns the linear predictor (type="link"). Consider predicting the response (type="response").
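The difference between the two scales can be illustrated for logistic regression, where the response is the inverse logit of the linear predictor. The values of eta are made up for illustration, and the binomial family is an assumption of this sketch.

```r
# linear predictor, as returned with type = "link" (illustrative values)
eta <- c(-2, 0, 2)
# predicted probabilities, as returned with type = "response"
# under the binomial family: inverse logit of the linear predictor
prob <- stats::plogis(eta)

# with a fitted object, the corresponding call would be of the form
# predict(object, newdata = X, type = "response")
```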
See Also
Use palasso to fit the paired lasso.
Analysis functions for manuscript
Description
Functions for the palasso manuscript.
Usage
.prepare(X, filter = 1, cutoff = "zero", scale = TRUE)
.simulate(x, effects)
.predict(
y,
X,
nfolds.ext = 5,
nfolds.int = 5,
adaptive = TRUE,
standard = TRUE,
elastic = TRUE,
shrink = TRUE,
family = "binomial",
...
)
.select(y, X, index, nfolds = 5, standard = TRUE, adaptive = TRUE, ...)
Arguments
X |
covariates:
matrix with |
filter |
numeric, multiplying the sample size |
cutoff |
character "zero", "knee", or "half" |
scale |
logical |
x |
covariates:
list of length |
effects |
number of causal covariates:
vector of length |
y |
response:
vector of length |
nfolds.ext |
number of external folds |
... |
arguments for palasso |
index |
indices of causal covariates:
list of length |
Details
.prepare: pre-processes sequencing data by removing features with a low total abundance, and adjusting for different library sizes; obtains two transformations of the same data by (1) binarising the counts with some cutoff and (2) taking the Anscombe transform; scales all covariates to mean zero and unit variance.
.simulate: simulates the response by exploiting two experimental covariate matrices; allows for different numbers of non-zero coefficients for X and Z.
.predict: estimates the predictive performance of different lasso models (standard X and/or Z, adaptive X and/or Z, paired X and Z); minimises the loss function "deviance", but also returns other loss functions; supports logistic and Cox regression.
.select: estimates the selective performance of different lasso models (standard X and/or Z, adaptive X and/or Z, paired X and Z); limits the number of covariates to 10; returns the number of selected covariates, and the number of correctly selected covariates.
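The two transformations described for .prepare can be sketched in base R. This is an illustration under assumptions, not the internal code: the cutoff "zero" is read as "count above zero", the filtering and library-size adjustment are omitted, and the names Z1, Z2 and keep are made up.

```r
set.seed(1)
n <- 30; p <- 40
X <- matrix(rpois(n * p, lambda = 3), nrow = n, ncol = p)

# (1) binarise the counts (assumed cutoff: count above zero)
Z1 <- 1 * (X > 0)
# (2) Anscombe transform for Poisson counts
Z2 <- 2 * sqrt(X + 3 / 8)

# drop pairs with a constant column, then scale both sets
# to mean zero and unit variance (pairing of columns is preserved)
keep <- apply(Z1, 2, sd) > 0 & apply(Z2, 2, sd) > 0
Z1 <- scale(Z1[, keep, drop = FALSE])
Z2 <- scale(Z2[, keep, drop = FALSE])
```

Dropping a column from both sets at once keeps the one-to-one correspondence that the paired lasso relies on.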
See Also
Use palasso to fit the paired lasso.
Examples
## Not run:
set.seed(1)
n <- 30; p <- 40
X <- matrix(rpois(n*p,lambda=3),nrow=n,ncol=p)
x <- palasso:::.prepare(X)
y <- palasso:::.simulate(x,effects=c(1,2))
predict <- palasso:::.predict(y,x)
select <- palasso:::.select(y,x,attributes(y))
## End(Not run)
Plot functions for manuscript
Description
Functions for the palasso manuscript.
Usage
plot_score(X, choice = NULL, ylab = "count")
plot_table(
X,
margin = 2,
labels = TRUE,
colour = TRUE,
las = 1,
cex = 1,
cutoff = NA
)
plot_circle(b, w, cutoff = NULL, group = NULL)
plot_box(
X,
choice = NULL,
ylab = "",
ylim = NULL,
zero = FALSE,
invert = FALSE
)
plot_pairs(x, y = NULL, ...)
plot_diff(x, y, prob = 0.95, ylab = "", xlab = "", ...)
Arguments
X |
matrix with |
choice |
numeric between |
margin |
|
cutoff |
numeric between |
b |
between-group correlation:
vector of length |
w |
within-group correlation:
matrix with |
group |
vector of length |
x, y |
vectors of equal length |
... |
additional arguments |
prob |
confidence interval:
numeric between |
Details
The function plot_score compares a selected column to each of the other columns. It counts the number of rows where the entry in the selected column is smaller (blue), equal (white), or larger (red).
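The counting step described above can be reproduced in base R without any plotting. The matrix and the object names choice, others and counts are illustrative assumptions, not the function's internals.

```r
X <- cbind(c(1, 2, 3, 4),
           c(2, 3, 4, 4),
           c(0, 2, 5, 1))
choice <- 1                                   # selected column
others <- setdiff(seq_len(ncol(X)), choice)

# for each other column, count the rows where the selected
# column is smaller, equal, or larger
counts <- sapply(others, function(j) {
  c(smaller = sum(X[, choice] < X[, j]),
    equal   = sum(X[, choice] == X[, j]),
    larger  = sum(X[, choice] > X[, j]))
})
colnames(counts) <- others
counts
```

Each column of counts sums to the number of rows of X, since every row falls into exactly one of the three categories.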
Value
to do
See Also
Use palasso to fit the paired lasso.
Examples
### score ###
n <- 10; p <- 4
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
palasso:::plot_score(X)
### table ###
n <- 5; p <- 3
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
palasso:::plot_table(X,margin=2)
### circle ###
n <- 50; p <- 25
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
Z <- matrix(rnorm(n*p),nrow=n,ncol=p)
b <- sapply(seq_len(p),function(i) abs(cor(X[,i],Z[,i])))
w <- pmax(abs(cor(X)),abs(cor(Z)),na.rm=TRUE)
palasso:::plot_circle(b,w,cutoff=0)
### box ###
n <- 10; p <- 5
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
palasso:::plot_box(X,choice=5)
### pairs ###
n <- 10
x <- runif(n)
y <- runif(n)
palasso:::plot_pairs(x,y)
### diff ###
n <- 100
x <- runif(n)
y <- runif(n)
palasso:::plot_diff(x,y)