---
title: "Extending lolR for Arbitrary Embedding Algorithms"
author: "Eric Bridgeford"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{extend_embedding}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Writing New Embedding Algorithms

Consider, for example, the implementation of `lol.project.lol` below:

```
#' Linear Optimal Low-Rank Projection (LOL)
#'
#' A function for implementing the Linear Optimal Low-Rank Projection (LOL) Algorithm.
#'
#' @param X \code{[n, d]} the data with \code{n} samples in \code{d} dimensions.
#' @param Y \code{[n]} the labels of the samples with \code{K} unique labels.
#' @param r the rank of the projection. Note that \code{r >= K}, and \code{r < d}.
#' @param ... trailing args.
#' @return A list of class \code{embedding} containing the following:
#' \item{A}{\code{[d, r]} the projection matrix from \code{d} to \code{r} dimensions.}
#' \item{ylabs}{\code{[K]} vector containing the \code{K} unique, ordered class labels.}
#' \item{centroids}{\code{[K, d]} centroid matrix of the \code{K} unique, ordered classes in native \code{d} dimensions.}
#' \item{priors}{\code{[K]} vector containing the \code{K} prior probabilities for the unique, ordered classes.}
#' \item{Xr}{\code{[n, r]} the \code{n} data points in reduced dimensionality \code{r}.}
#' \item{cr}{\code{[K, r]} the \code{K} centroids in reduced dimensionality \code{r}.}
#' @author Eric Bridgeford
#' @examples
#' library(lolR)
#' data <- lol.sims.rtrunk(n=200, d=30)  # 200 examples of 30 dimensions
#' X <- data$X; Y <- data$Y
#' model <- lol.project.lol(X=X, Y=Y, r=5)  # use lol to project into 5 dimensions
#' @export
lol.project.lol <- function(X, Y, r, ...) {
  # class data
  info <- lol.utils.info(X, Y)
  priors <- info$priors; centroids <- info$centroids
  K <- info$K; ylabs <- info$ylabs
  n <- info$n; d <- info$d
  deltas <- lol.utils.deltas(centroids, priors)
  centroids <- t(centroids)

  nv <- r - (K)
  if (nv > 0) {
    A <- cbind(deltas, lol.project.cpca(X, Y, nv)$A)
  } else {
    A <- deltas[, 1:r, drop=FALSE]
  }

  # orthogonalize and normalize
  A <- qr.Q(qr(A))
  return(list(A=A, centroids=centroids, priors=priors, ylabs=ylabs,
              Xr=lol.embed(X, A), cr=lol.embed(centroids, A)))
}
```

As we can see in the above segment, the function `lol.project.lol` returns a list of items. To use much of the `lol` functionality, researchers can write an `embedding` method following the spec below:

```
Inputs:
  keyworded arguments for:
    - X: a [n, d] data matrix with n samples in d dimensions.
    - Y: a [n] vector of class labels for each sample.
Outputs:
  a list containing the following:
    - : a [d, r] embedding matrix from d dimensions to r << d dimensions.
```

Note that the inputs MUST be named `X` and `Y`. In the above example, I call my embedding matrix `A`, but you can call it whatever you want.

# Embedding with your algorithm

After you have written your algorithm ``, you may be interested in embedding with it. With your algorithm in your `namespace`, you can embed points as follows, noting that `` will be additional arguments you pass to your function:

```
# given: X, Y contain the data matrix and class labels, respectively
result <- (X, Y, )
# embed new points in your testing set, Xt
Xr <- lol.embed(Xt, result$A)
```

# Performing Cross-Validation with your Algorithm

With your new algorithm, you may want to perform some sort of cross-validation. Following the above spec, this is straightforward. Your algorithm may, for instance, require its own hyperparameters. In my example above, I have a hyperparameter `r`, the rank of the embedding. I can define the following list of optional arguments:

```
alg = lol.project.lol
r = # the desired rank I want to embed into
alg.opts = list(r=r)
embed = "A"  # the name of the embedding matrix produced
alg.return = embed
```

I can then pass my algorithm into the `lol.xval.eval` function:

```
xval.out <- lol.xval.eval(X, Y, alg=alg, alg.opts=alg.opts, alg.return=alg.return, k=)
```

where `` specifies the desired cross-validation method to use. For more details, see the `xval` vignette.

See the tutorial vignette `extend_classification` for how to specify the `classifier`, `classifier.opts`, and `classifier.return`. Alternatively, omit these keyword arguments to `lol.xval.eval` to use the default `lda` classifier.

Now, you should be able to use your user-defined embedding method with the `lol` package.
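
To make the spec concrete, here is a minimal sketch of a user-defined embedding function. The name `my.project.pca`, the simulated data, and the choice of an uncentered-label PCA via `svd` are all assumptions made for illustration; none of them is part of `lolR`. The sketch only demonstrates the contract described above: inputs named `X` and `Y`, and a returned list holding a `[d, r]` embedding matrix. A plain matrix product stands in for the embedding step so that the example runs with base R alone.

```r
# Hypothetical embedding function following the spec: the inputs are named
# X and Y, and the returned list contains a [d, r] embedding matrix.
my.project.pca <- function(X, Y, r, ...) {
  # Y is accepted to satisfy the spec, but this unsupervised sketch ignores it
  Xc <- sweep(X, 2, colMeans(X), "-")  # center each column of X
  # the top r right singular vectors span the leading principal directions
  A <- svd(Xc, nu=0, nv=r)$v
  return(list(A=A))
}

# simulated data, for illustration only
set.seed(1)
n <- 100; d <- 10
X <- matrix(rnorm(n * d), nrow=n, ncol=d)
Y <- sample(c(0, 1), n, replace=TRUE)

result <- my.project.pca(X, Y, r=3)
# plain matrix product, used here so the sketch runs without lolR installed
Xr <- X %*% result$A
dim(result$A)  # 10 3
dim(Xr)        # 100 3
```

With `lolR` installed, a function of this shape could presumably be supplied to cross-validation the same way as `lol.project.lol` above, for instance with `alg=my.project.pca`, `alg.opts=list(r=3)`, and `alg.return="A"`.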