Type: | Package |
Title: | Fast Logistic Regression Wrapper |
Date: | 2023-08-07 |
Version: | 1.2.0 |
License: | GPL-3 |
Description: | Provides very fast logistic regression with coefficient inference plus other useful methods such as a forward stepwise model generator (see the benchmarks on the GitHub page at the URL below). The inputs are flexible enough to accommodate GPU computations. The coefficient estimation employs the fastLR() method in the 'RcppNumerical' package by Yixuan Qiu et al. This package allows their work to be more useful to a wider community that consumes inference. |
Encoding: | UTF-8 |
Depends: | R (≥ 4.0.0) |
Imports: | RcppNumerical, Rcpp, checkmate, stats, MASS, methods |
LinkingTo: | Rcpp, RcppEigen |
URL: | https://github.com/kapelner/fastLogisticRegressionWrap |
BugReports: | https://github.com/kapelner/fastLogisticRegressionWrap/issues |
RoxygenNote: | 7.2.3 |
NeedsCompilation: | yes |
Packaged: | 2023-08-06 13:08:24 UTC; kapel |
Author: | Adam Kapelner |
Maintainer: | Adam Kapelner <kapelner@qc.cuny.edu> |
Repository: | CRAN |
Date/Publication: | 2023-08-08 15:30:02 UTC |
Asymmetric Cost Explorer
Description
Given a set of desired proportions of predicted outcomes, what is the error rate for each of those models?
Usage
asymmetric_cost_explorer(
phat,
ybin,
steps = seq(from = 0.001, to = 0.999, by = 0.001),
outcome_of_analysis = 0,
proportions_desired = seq(from = 0.1, to = 0.9, by = 0.1),
proportion_tolerance = 0.01
)
Arguments
phat |
The vector of probability estimates to be thresholded to make a binary decision |
ybin |
The true binary responses |
steps |
All possible thresholds, which must be a vector of numbers in (0, 1). Default is seq(from = 0.001, to = 0.999, by = 0.001). |
outcome_of_analysis |
Which class's performance do you care about? Either 0 or 1 for the negative class or positive class, respectively. Default is 0. |
proportions_desired |
Which proportions of |
proportion_tolerance |
If no threshold yields a proportion within this amount of a value in proportions_desired, that model's performance is not returned. Default is 0.01. |
Value
A table with column 1: proportions_desired, column 2: actual proportions (as close as possible), column 3: error rate, column 4: probability threshold.
Author(s)
Adam Kapelner
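The threshold sweep described in this entry can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name sweep_thresholds_sketch is hypothetical, as are the rule "predict 1 when phat >= threshold" and the use of the overall misclassification rate as the error rate.

```r
# Hypothetical helper (not part of the package): sweep thresholds and report,
# for each desired proportion, the closest achievable model.
sweep_thresholds_sketch = function(phat, ybin, steps, outcome = 0,
                                   proportions_desired = seq(0.1, 0.9, by = 0.1),
                                   tolerance = 0.01) {
  # For every candidate threshold, record the share of predictions equal to
  # `outcome` and the overall misclassification rate (assumed error definition).
  per_step = t(sapply(steps, function(thr) {
    yhat = as.numeric(phat >= thr)
    c(threshold = thr,
      proportion = mean(yhat == outcome),
      error_rate = mean(yhat != ybin))
  }))
  # For every desired proportion, keep the closest threshold if it is within tolerance.
  rows = lapply(proportions_desired, function(prop) {
    i = which.min(abs(per_step[, "proportion"] - prop))
    if (abs(per_step[i, "proportion"] - prop) > tolerance) return(NULL)
    data.frame(proportion_desired = prop,
               proportion_actual = per_step[i, "proportion"],
               error_rate = per_step[i, "error_rate"],
               threshold = per_step[i, "threshold"])
  })
  do.call(rbind, Filter(Negate(is.null), rows))
}

# Simulated, well-calibrated probabilities (illustrative data only)
set.seed(1)
n = 2000
phat = runif(n)
ybin = rbinom(n, 1, phat)
res = sweep_thresholds_sketch(phat, ybin, steps = seq(0.001, 0.999, by = 0.001))
```

Each row of res pairs a desired proportion with the achieved proportion, the error rate, and the threshold that produced it, mirroring the four columns listed under Value.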
Binary Confusion Table and Errors
Description
Provides a binary confusion table and error metrics
Usage
confusion_results(yhat, ybin, skip_argument_checks = FALSE)
Arguments
yhat |
The binary predictions |
ybin |
The true binary responses |
skip_argument_checks |
If |
Value
A list of raw results
Examples
library(MASS); data(Pima.te)
ybin = as.numeric(Pima.te$type == "Yes")
flr = fast_logistic_regression(
Xmm = model.matrix(~ . - type, Pima.te),
ybin = ybin
)
phat = predict(flr, model.matrix(~ . - type, Pima.te))
confusion_results(phat > 0.5, ybin)
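For readers who want to see what a binary confusion table and its error metrics involve, here is a base R sketch on simulated data. The metric names below are assumptions chosen for illustration; this page does not list the entries of the list that confusion_results() actually returns.

```r
set.seed(1)
ybin = rbinom(200, 1, 0.4)
# Simulated predictions that agree with the truth about 80% of the time
yhat = ifelse(runif(200) < 0.8, ybin, 1 - ybin)

# Fix the levels so the table is always 2 x 2, even if a class is never predicted
tab = table(actual = factor(ybin, levels = 0:1),
            predicted = factor(yhat, levels = 0:1))

tn = tab["0", "0"]; fp = tab["0", "1"]
fn = tab["1", "0"]; tp = tab["1", "1"]

# Illustrative metric names (assumed, not the package's output names)
metrics = c(
  misclassification_error = (fp + fn) / sum(tab),
  precision = tp / (tp + fp),
  recall = tp / (tp + fn),
  false_positive_rate = fp / (fp + tn)
)
```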
A fast Xt [times] diag(w) [times] X function
Description
Computed via the Eigen C++ library (linked through RcppEigen).
Usage
eigen_Xt_times_diag_w_times_X(X, w, num_cores = 1)
Arguments
X |
A numeric matrix of size n x p |
w |
A numeric vector of length p |
num_cores |
The number of cores to use. Unless p is large, keep to the default of 1. |
Value
The resulting matrix
Examples
n = 100
p = 10
X = matrix(rnorm(n * p), nrow = n, ncol = p)
w = rnorm(p)
eigen_Xt_times_diag_w_times_X(t(X), w)
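The underlying linear algebra can be checked in base R. In this sketch the weight vector has one entry per row of the matrix whose transpose appears on the left; how that maps onto this function's X and w arguments should be checked against the Arguments and Examples above. The point of the sketch is only that the n x n diagonal matrix never needs to be formed.

```r
set.seed(1)
n = 100
p = 10
X = matrix(rnorm(n * p), nrow = n, ncol = p)
w = runif(n)  # one weight per row of X (assumption of this sketch)

# Naive version: materializes an n x n diagonal matrix
naive = t(X) %*% diag(w) %*% X

# Cheaper version: X * w recycles w down each column, scaling row i by w[i],
# and crossprod(A, B) computes t(A) %*% B
fast = crossprod(X * w, X)
```

Both results are the same p x p symmetric matrix; the second form avoids the O(n^2) memory of diag(w).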
Compute Single Value of the Diagonal of a Symmetric Matrix's Inverse
Description
Computed via the conjugate gradient algorithm in the Eigen C++ library (linked through RcppEigen).
Usage
eigen_compute_single_entry_of_diagonal_matrix(M, j, num_cores = 1)
Arguments
M |
The symmetric matrix to invert (one element of the inverse's diagonal is then extracted) |
j |
The diagonal entry of |
num_cores |
The number of cores to use. Default is 1. |
Value
The value of the (j, j) entry of M^-1
Author(s)
Adam Kapelner
Examples
n = 500
X = matrix(rnorm(n^2), nrow = n, ncol = n)
M = t(X) %*% X
j = 137
eigen_compute_single_entry_of_diagonal_matrix(M, j)
solve(M)[j, j] #to ensure it's the same value
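The idea behind extracting a single diagonal entry without a full inversion can be sketched in base R: the j-th column of M^-1 is the solution x of M x = e_j, so only one linear system needs to be solved, and conjugate gradient does that for a symmetric positive definite M. The function cg_solve_sketch below is a hypothetical textbook implementation written for this illustration, not the package's Eigen-based code.

```r
# Hypothetical textbook conjugate gradient solver for symmetric positive definite M
cg_solve_sketch = function(M, b, tol = 1e-10, max_iter = 10 * length(b)) {
  x = rep(0, length(b))
  r = b - as.vector(M %*% x)  # residual
  d = r                       # search direction
  rs_old = sum(r^2)
  for (iter in seq_len(max_iter)) {
    Md = as.vector(M %*% d)
    alpha = rs_old / sum(d * Md)      # step length along d
    x = x + alpha * d
    r = r - alpha * Md
    rs_new = sum(r^2)
    if (sqrt(rs_new) < tol) break     # residual norm small enough
    d = r + (rs_new / rs_old) * d     # next M-conjugate direction
    rs_old = rs_new
  }
  x
}

set.seed(1)
X = matrix(rnorm(200 * 20), nrow = 200, ncol = 20)
M = crossprod(X)            # symmetric positive definite, 20 x 20
j = 7
e_j = replace(rep(0, ncol(M)), j, 1)
x = cg_solve_sketch(M, e_j) # j-th column of the inverse
entry = x[j]                # (j, j) entry of the inverse
```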
A fast det(X) function
Description
Computed via the Eigen C++ library (linked through RcppEigen).
Usage
eigen_det(X, num_cores = 1)
Arguments
X |
A numeric matrix of size p x p |
num_cores |
The number of cores to use. Unless p is large, keep to the default of 1. |
Value
The determinant as a scalar numeric value
Examples
p = 30
eigen_det(matrix(rnorm(p^2), nrow = p))
A fast solve(X) function
Description
Computed via the Eigen C++ library (linked through RcppEigen).
Usage
eigen_inv(X, num_cores = 1)
Arguments
X |
A numeric matrix of size p x p |
num_cores |
The number of cores to use. Unless p is large, keep to the default of 1. |
Value
The resulting matrix
Examples
p = 10
eigen_inv(matrix(rnorm(p^2), nrow = p))
A Wrapper for FastLR
Description
Provides very fast logistic regression with coefficient inference by wrapping the fastLR() method in the 'RcppNumerical' package.
Author(s)
Adam Kapelner kapelner@qc.cuny.edu
References
Kapelner, A
FastLR Wrapper
Description
Returns most of what you get from glm
Usage
fast_logistic_regression(
Xmm,
ybin,
drop_collinear_variables = FALSE,
lm_fit_tol = 1e-07,
do_inference_on_var = "none",
Xt_times_diag_w_times_X_fun = NULL,
sqrt_diag_matrix_inverse_fun = NULL,
num_cores = 1,
...
)
Arguments
Xmm |
The model.matrix for X (you need to create this yourself before) |
ybin |
The binary response vector |
drop_collinear_variables |
Should we drop perfectly collinear variables? Default is FALSE. |
lm_fit_tol |
When |
do_inference_on_var |
For which variables should we compute approximate standard errors of the coefficients and approximate p-values for the test of no linear log-odds probability effect? Default is "none". |
Xt_times_diag_w_times_X_fun |
A custom function whose arguments are |
sqrt_diag_matrix_inverse_fun |
A custom function that returns a numeric vector which is square root of the diagonal of the inverse of the inputted matrix. Its arguments are |
num_cores |
Number of cores to use to speed up matrix multiplication and matrix inversion (used only during inference computation). Default is 1.
Unless the number of variables, i.e. |
... |
Other arguments to be passed to |
Value
A list of raw results
Examples
library(MASS); data(Pima.te)
flr = fast_logistic_regression(
Xmm = model.matrix(~ . - type, Pima.te),
ybin = as.numeric(Pima.te$type == "Yes")
)
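The arguments Xt_times_diag_w_times_X_fun and sqrt_diag_matrix_inverse_fun suggest the usual Wald-style inference for logistic regression: standard errors are the square roots of the diagonal of the inverse of X^T diag(w) X with w = phat * (1 - phat). The base R sketch below reproduces that computation using glm() as the reference fit. It is an illustration of the standard formula under that assumption, not a transcript of the package's internals.

```r
library(MASS); data(Pima.te)
Xmm = model.matrix(~ . - type, Pima.te)
ybin = as.numeric(Pima.te$type == "Yes")

# Reference fit; Xmm already contains the intercept column, hence "0 +"
mod = glm(ybin ~ 0 + Xmm, family = binomial)
b = coef(mod)

phat = plogis(as.vector(Xmm %*% b))  # fitted probabilities
w = phat * (1 - phat)                # logistic regression weights

XtWX = crossprod(Xmm * w, Xmm)       # X^T diag(w) X without forming diag(w)
se = sqrt(diag(solve(XtWX)))         # approximate standard errors
z = b / se                           # Wald z statistics
pval = 2 * pnorm(-abs(z))            # two-sided approximate p-values
```

The vector se should agree closely with the "Std. Error" column of summary(mod)$coefficients.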
Rapid Forward Stepwise Logistic Regression
Description
Roughly duplicates the glm-style code shown under Details.
Usage
fast_logistic_regression_stepwise_forward(
Xmm,
ybin,
mode = "aic",
pval_threshold = 0.05,
use_intercept = TRUE,
verbose = TRUE,
drop_collinear_variables = FALSE,
lm_fit_tol = 1e-07,
...
)
Arguments
Xmm |
The model.matrix for X (you need to create this yourself before). |
ybin |
The binary response vector. |
mode |
"aic" (default, fast) or "pval" (slow, but possibly yields a better model). |
pval_threshold |
The significance threshold to include a new variable. Default is 0.05. |
use_intercept |
Should we automatically begin with an intercept? Default is TRUE. |
verbose |
Print out messages during the loop? Default is TRUE. |
drop_collinear_variables |
Parameter used in |
lm_fit_tol |
Parameter used in |
... |
Other arguments to be passed to |
Details
nullmod = glm(ybin ~ 0, data.frame(Xmm), family = binomial)
fullmod = glm(ybin ~ 0 + ., data.frame(Xmm), family = binomial)
forwards = step(nullmod, scope = list(lower = formula(nullmod), upper = formula(fullmod)), direction = "forward", trace = 0)
Value
A list of raw results
Examples
library(MASS); data(Pima.te)
flr = fast_logistic_regression_stepwise_forward(
Xmm = model.matrix(~ . - type, Pima.te),
ybin = as.numeric(Pima.te$type == "Yes")
)
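For comparison, the glm-style code from Details can be applied to the same data. The sketch below adapts it slightly by placing the response in a data frame named dat (a name introduced here for illustration) so that step() can re-evaluate the model calls. It is offered as a reference point under those assumptions, not as a benchmark.

```r
library(MASS); data(Pima.te)
Xmm = model.matrix(~ . - type, Pima.te)

# data.frame() renames the "(Intercept)" column to "X.Intercept."
dat = data.frame(Xmm, ybin = as.numeric(Pima.te$type == "Yes"))

nullmod = glm(ybin ~ 0, data = dat, family = binomial)      # no terms at all
fullmod = glm(ybin ~ 0 + ., data = dat, family = binomial)  # every column of Xmm

# Forward selection by AIC between the empty and the full model
forwards = step(nullmod,
  scope = list(lower = formula(nullmod), upper = formula(fullmod)),
  direction = "forward", trace = 0)

selected = names(coef(forwards))  # variables chosen by the forward search
```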
General Confusion Table and Errors
Description
Provides a confusion table and error metrics for general factor vectors. The two vectors need not share the same levels.
Usage
general_confusion_results(yhat, yfac, proportions_scaled_by_column = FALSE)
Arguments
yhat |
The factor predictions |
yfac |
The true factor responses |
proportions_scaled_by_column |
When returning the proportion table, scale by column? Default is FALSE. |
Value
A list of raw results
Examples
library(MASS); data(Pima.te)
ybin = as.numeric(Pima.te$type == "Yes")
flr = fast_logistic_regression(
Xmm = model.matrix(~ . - type, Pima.te),
ybin = ybin
)
phat = predict(flr, model.matrix(~ . - type, Pima.te))
yhat = array(NA, length(ybin))
yhat[phat <= 1/3] = "no"
yhat[phat >= 2/3] = "yes"
yhat[is.na(yhat)] = "maybe"
general_confusion_results(factor(yhat, levels = c("no", "yes", "maybe")), factor(ybin))
#you want the "no" to align with 0, the "yes" to align with 1 and the "maybe" to be
#last to align with nothing
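A confusion table for factors with different level sets, and the effect of scaling proportions by column, can be sketched with base R's table() and prop.table(). The orientation used here (predictions in rows, true responses in columns) is an assumption of the sketch and the toy vectors are made up for illustration.

```r
# Toy data: two true classes, three predicted labels
yfac = factor(c(0, 0, 1, 1, 1, 0, 1, 0))
yhat = factor(c("no", "maybe", "yes", "yes", "maybe", "no", "no", "no"),
              levels = c("no", "yes", "maybe"))

# Cross-tabulate; the level sets need not match
counts = table(predicted = yhat, actual = yfac)

# Proportions of the grand total versus proportions within each true class
props_overall = prop.table(counts)
props_by_col = prop.table(counts, margin = 2)
```

With column scaling, each column of props_by_col sums to 1, so an entry reads as the share of a true class that received a given predicted label.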
FastLR Wrapper Predictions
Description
Predicts, returning p-hats (estimates of P(Y = 1)).
Usage
## S3 method for class 'fast_logistic_regression'
predict(object, newdata, type = "response", ...)
Arguments
object |
The object built using the |
newdata |
A matrix of observations where you wish to predict the binary response. |
type |
The type of prediction required. The default is "response". |
... |
Further arguments passed to or from other methods |
Value
A numeric vector of length nrow(newdata) of estimates of P(Y = 1) for each unit in newdata.
Examples
library(MASS); data(Pima.te)
flr = fast_logistic_regression(
Xmm = model.matrix(~ . - type, Pima.te),
ybin = as.numeric(Pima.te$type == "Yes")
)
phat = predict(flr, model.matrix(~ . - type, Pima.te))
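A prediction on the response scale is the inverse logit of the linear predictor. The base R sketch below shows that relationship using glm() as a stand-in for the fitted model; it illustrates the standard formula and does not reproduce the package's predict method.

```r
library(MASS); data(Pima.te)
Xmm = model.matrix(~ . - type, Pima.te)
ybin = as.numeric(Pima.te$type == "Yes")

# Stand-in fit; Xmm already contains the intercept column, hence "0 +"
mod = glm(ybin ~ 0 + Xmm, family = binomial)
b = coef(mod)

eta = as.vector(Xmm %*% b)  # linear predictor (log-odds)
phat = plogis(eta)          # inverse logit: 1 / (1 + exp(-eta)), i.e. P(Y = 1)
```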
FastLR Wrapper Predictions
Description
Predicts, returning p-hats (estimates of P(Y = 1)).
Usage
## S3 method for class 'fast_logistic_regression_stepwise'
predict(object, newdata, type = "response", ...)
Arguments
object |
The object built using the |
newdata |
A matrix of observations where you wish to predict the binary response. |
type |
The type of prediction required. The default is "response". |
... |
Further arguments passed to or from other methods |
Value
A numeric vector of length nrow(newdata) of estimates of P(Y = 1) for each unit in newdata.
Examples
library(MASS); data(Pima.te)
flr = fast_logistic_regression_stepwise_forward(
Xmm = model.matrix(~ . - type, Pima.te),
ybin = as.numeric(Pima.te$type == "Yes")
)
phat = predict(flr, model.matrix(~ . - type, Pima.te))
FastLR Wrapper Print
Description
Returns the summary table a la glm
Usage
## S3 method for class 'fast_logistic_regression'
print(x, ...)
Arguments
x |
The object built using the |
... |
Other arguments to be passed to print |
Value
The summary as a data.frame
Examples
library(MASS); data(Pima.te)
flr = fast_logistic_regression(
Xmm = model.matrix(~ . - type, Pima.te),
ybin = as.numeric(Pima.te$type == "Yes"))
print(flr)
FastLR Wrapper Print
Description
Returns the summary table a la glm
Usage
## S3 method for class 'fast_logistic_regression_stepwise'
print(x, ...)
Arguments
x |
The object built using the |
... |
Other arguments to be passed to print |
Value
The summary as a data.frame
Examples
library(MASS); data(Pima.te)
flr = fast_logistic_regression_stepwise_forward(
Xmm = model.matrix(~ . - type, Pima.te),
ybin = as.numeric(Pima.te$type == "Yes"))
print(flr)
FastLR Wrapper Summary
Description
Returns the summary table a la glm
Usage
## S3 method for class 'fast_logistic_regression'
summary(object, alpha_order = TRUE, ...)
Arguments
object |
The object built using the |
alpha_order |
Should the coefficients be ordered in alphabetical order? Default is TRUE. |
... |
Other arguments to be passed to |
Value
The summary as a data.frame
Examples
library(MASS); data(Pima.te)
flr = fast_logistic_regression(
Xmm = model.matrix(~ . - type, Pima.te),
ybin = as.numeric(Pima.te$type == "Yes"))
summary(flr)
FastLR Wrapper Summary
Description
Returns the summary table a la glm
Usage
## S3 method for class 'fast_logistic_regression_stepwise'
summary(object, ...)
Arguments
object |
The object built using the |
... |
Other arguments to be passed to |
Value
The summary as a data.frame
Examples
library(MASS); data(Pima.te)
flr = fast_logistic_regression_stepwise_forward(
Xmm = model.matrix(~ . - type, Pima.te),
ybin = as.numeric(Pima.te$type == "Yes"))
summary(flr)