Type: | Package |
Title: | Multivariate Analysis Using Biplots in R |
Version: | 23.11.0 |
Date: | 2023-11-16 |
Author: | Jose Luis Vicente-Villardon, Laura Vicente-Gonzalez, Elisa Frutos-Bernal |
Maintainer: | Jose Luis Vicente Villardon <villardon@usal.es> |
Description: | Several multivariate techniques from a biplot perspective. It is the translation (with many improvements) into R of the previous package developed in 'Matlab'. The package contains some of the main developments of my team during the last 30 years together with some more standard techniques. Package includes: Classical Biplots, HJ-Biplot, Canonical Biplots, MANOVA Biplots, Correspondence Analysis, Canonical Correspondence Analysis, Canonical STATIS-ACT, Logistic Biplots for binary and ordinal data, Multidimensional Unfolding, External Biplots for Principal Coordinates Analysis or Multidimensional Scaling, among many others. References can be found in the help of each procedure. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Encoding: | UTF-8 |
Repository: | CRAN |
Depends: | R (≥ 4.0.0) |
Imports: | MASS, scales, geometry, deldir, mirt, GPArotation, Hmisc, car, dunn.test, gplots, lattice, polycor, dae, xtable, mvtnorm, psych, ThreeWay, knitr |
LazyData: | yes |
Archs: | i386, x64 |
NeedsCompilation: | no |
Packaged: | 2023-11-20 11:36:24 UTC; joseluis |
Date/Publication: | 2023-11-21 15:00:06 UTC |
Multivariate Analysis using Biplots
Description
Classical PCA biplot with additional features such as non-standard data transformations and scales for the variables, together with many graphical aids such as sizes or colors of the points according to their qualities of representation or predictiveness. The package also includes Alternating Least Squares (ALS) or Criss-Cross procedures for calculating the reduced-rank approximation; these can deal with missing data, differential weights for each element of the data matrix, or even robust versions of the procedure.
This is part of a bigger project called MULTBIPLOT that contains many other biplot techniques and is a translation to R of the package MULTBIPLOT programmed in MATLAB. A GUI for the package is also in preparation.
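As a rough illustration of the reduced-rank approximation behind a classical biplot, the following base R sketch builds row and column markers from the singular value decomposition. It does not use the package's own functions; the names `A`, `B`, `approx2` and `fit2` are hypothetical, and the scaling of the markers (all singular values assigned to the rows) is only one of several possible choices.

```r
# Sketch only: rank-2 biplot markers from the SVD of a column-standardized matrix.
X <- scale(iris[, 1:4])                 # center and scale the variables
s <- svd(X)

A <- s$u[, 1:2] %*% diag(s$d[1:2])      # row markers (principal coordinates)
B <- s$v[, 1:2]                         # column markers (standard coordinates)

approx2 <- A %*% t(B)                   # rank-2 approximation of X
fit2 <- sum(s$d[1:2]^2) / sum(s$d^2)    # proportion of variability accounted for
```

The inner products of the rows of `A` with the rows of `B` reproduce the entries of `approx2`, which is the property a biplot displays graphically.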
Details
Package: | MultBiplot |
Type: | Package |
Version: | 0.1.00 |
Date: | 2015-01-14 |
License: | GPL(>=2) |
Author(s)
Jose Luis Vicente Villardon
Maintainer: Jose Luis Vicente Villardon <villardon@usal.es>
References
Vicente-Villardon, J.L. (2010). MULTBIPLOT: A package for Multivariate Analysis using Biplots. Departamento de Estadistica. Universidad de Salamanca. (http://biplot.usal.es/ClassicalBiplot/index.html).
Vicente-Villardon, J. L. (1992). Una alternativa a las técnicas factoriales clasicas basada en una generalización de los metodos Biplot (Doctoral dissertation, Tesis. Universidad de Salamanca. España. 248 pp.).
Gabriel KR (1971) The biplot graphic display of matrices with application to principal component analysis. Biometrika 58(3):453-467
Gabriel KR (1998) Generalised bilinear regression, J. L. (1998). Use of biplots to diagnose independence models in three-way contingency tables. Visualization of Categorical Data. Academic Press. London.
Gabriel, K. R. (2002). Le biplot-outil d'exploration de donnes multidimensionnelles. Journal de la Societe francaise de statistique, 143(3-4).
Gabriel KR, Zamir S (1979) Lower rank approximation of matrices by least squares with any choice of weights. Technometrics 21(4):489-498.
Gower J, Hand D (1996) Biplots. Monographs on statistics and applied probability. 54. London: Chapman and Hall., 277 pp.
Galindo Villardon, M. (1986). Una alternativa de representacion simultanea: HJ-Biplot. Qüestiió. 1986, vol. 10, núm. 1.
Demey J, Vicente-Villardon JL, Galindo MP, Zambrano A (2008) Identifying molecular markers associated with classification of genotypes using external logistic biplots. Bioinformatics 24(24):2832-2838.
Vicente-Villardon JL, Galindo MP, Blazquez-Zaballos A (2006) Logistic biplots. Multiple Correspondence Analysis and related methods pp 491-509.
Santos, C., Munoz, S. S., Gutierrez, Y., Hebrero, E., Vicente, J. L., Galindo, P., Rivas, J. C. (1991). Characterization of young red wines by application of HJ biplot analysis to anthocyanin profiles. Journal of Agricultural and food chemistry, 39(6), 1086-1090.
Rivas-Gonzalo, J. C., Gutierrez, Y., Polanco, A. M., Hebrero, E., Vicente, J. L., Galindo, P., Santos-Buelga, C. (1993). Biplot analysis applied to enological parameters in the geographical classification of young red wines. American journal of enology and viticulture, 44(3), 302-308.
Examples
data(iris)
bip=PCA.Biplot(iris[,1:4])
plot(bip)
Add supplementary binary variables to a biplot
Description
Adds supplementary binary variables to a biplot of any kind.
Usage
AddBinVars2Biplot(bip, Y, IncludeConst = TRUE, penalization = 0.2,
freq = NULL, tolerance = 1e-05, maxiter = 100)
Arguments
bip |
A biplot object |
Y |
Matrix of binary variables to add |
IncludeConst |
Should a constant be included in the fit? |
penalization |
Penalization for the fit |
freq |
Frequencies for each row of Y. By default, each frequency is 1. |
tolerance |
Tolerance for the fit |
maxiter |
Maximum number of iterations |
Details
Fits binary variables to an existing biplot using penalized logistic regression.
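The idea of a penalized logistic fit on biplot coordinates can be sketched in base R as follows. This is not the package's implementation: the helper `ridge_logit`, the simulated coordinates `A` and the choice to leave the constant unpenalized are all assumptions made for illustration.

```r
# Sketch only: ridge-penalized logistic fit of one binary variable on
# two-dimensional row coordinates. All names here are hypothetical.
set.seed(1)
A <- matrix(rnorm(200), ncol = 2)                 # stand-in for biplot row coordinates
y <- rbinom(100, 1, plogis(0.5 + 1.5 * A[, 1]))   # simulated binary variable

ridge_logit <- function(A, y, penalization = 0.2) {
  Z <- cbind(1, A)                                # constant plus coordinates
  loss <- function(b) {
    eta <- drop(Z %*% b)
    # negative log-likelihood plus a ridge penalty on the slopes only
    sum(log1p(exp(eta)) - y * eta) + penalization * sum(b[-1]^2)
  }
  grad <- function(b) {
    p <- plogis(drop(Z %*% b))
    drop(crossprod(Z, p - y)) + 2 * penalization * c(0, b[-1])
  }
  optim(rep(0, ncol(Z)), loss, grad, method = "BFGS")$par
}

b <- ridge_logit(A, y)   # b[1]: constant, b[2:3]: direction of the variable
```

The slopes `b[2:3]` give the direction in the plane along which the predicted probability of the supplementary variable increases.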
Value
The biplot object with supplementary binary variables added.
Author(s)
Jose Luis Vicente Villardon
References
Vicente-Villardón, J. L., & Hernández-Sánchez, J. C. (2020). External Logistic Biplots for Mixed Types of Data. In Advanced Studies in Classification and Data Science (pp. 169-183). Springer, Singapore.
Examples
## No examples yet
Add clusters to a biplot object
Description
The function adds clusters to a biplot object to be represented on the biplot. The clusters can be defined by a nominal variable provided by the user, or obtained from the hclust function of the base package or from the kmeans function.
Usage
AddCluster2Biplot(Bip, NGroups=3, ClusterType="hi", Groups=NULL,
Original=FALSE, ClusterColors=NULL, ...)
Arguments
Bip |
A Biplot object obtained from any biplot procedure. It has to be a list containing a field called |
NGroups |
Number of groups or clusters. Only necessary when hierarchical or k-means procedures are used. |
ClusterType |
The type of cluster to add. There are four possibilities: "us" (user defined), "hi" (hierarchical clusters), "km" (k-means clustering) or "gm" (Gaussian mixture). |
Groups |
A factor defining the groups provided by the user. |
Original |
Should the clusters be calculated using the original data rather than the reduced dimensions? |
ClusterColors |
Colors for the clusters. |
... |
Any other parameter for the |
Details
One of the main shortcomings of cluster analysis is that it is not easy to search for the variables associated to the obtained classification; representing the clusters on the biplot can help to perform that interpretation. If you consider the technique for dimension reduction as a way to separate the signal from the noise, clusters should be constructed using the dimensions retained in the biplot, otherwise the complete original data matrix can be used. The colors used by each cluster should match the color used in the Dendrogram. User defined clusters can also be plotted, for example, to investigate the relation of the biplot solution to an external nominal variable.
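The difference between clustering on the retained dimensions and clustering on the complete data matrix can be sketched with base R alone. The example below uses prcomp as a stand-in for the biplot coordinates; it does not call the package, and the object names are hypothetical.

```r
# Sketch only: hierarchical clusters on reduced coordinates versus original data.
X <- scale(iris[, 1:4])
scores <- prcomp(X)$x[, 1:2]            # coordinates retained in a 2-D display

hc_reduced  <- hclust(dist(scores), method = "ward.D")
hc_original <- hclust(dist(X), method = "ward.D")

cl_reduced  <- cutree(hc_reduced, k = 3)
cl_original <- cutree(hc_original, k = 3)

# Cross-tabulation showing how far the two solutions agree
agreement <- table(cl_reduced, cl_original)
```

Large off-diagonal counts in `agreement` (after matching labels) would suggest that the discarded dimensions carry information that changes the classification.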
Value
The function returns the biplot object with the information about the clusters added in new fields
ClusterType |
The method of clustering as defined in the argument |
Clusters |
A factor containing the solution or the user defined clusters |
ClusterNames |
The names of the clusters |
ClusterColors |
The colors of the clusters |
Dendrogram |
The Dendrogram, if hierarchical clustering has been used |
ClusterObject |
The object obtained from |
Author(s)
Jose Luis Vicente Villardon
References
Demey, J. R., Vicente-Villardon, J. L., Galindo-Villardon, M. P., & Zambrano, A. Y. (2008). Identifying molecular markers associated with classification of genotypes by External Logistic Biplots. Bioinformatics, 24(24), 2832-2838.
Gallego-Alvarez, I., & Vicente-Villardon, J. L. (2012). Analysis of environmental indicators in international companies by applying the logistic biplot. Ecological Indicators, 23, 250-261.
Galindo, P. V., Vaz, T. D. N., & Nijkamp, P. (2011). Institutional capacity to dynamically innovate: an application to the Portuguese case. Technological Forecasting and Social Change, 78(1), 3-12.
Vazquez-de-Aldana, B. R., Garcia-Criado, B., Vicente-Tavera, S., & Zabalgogeazcoa, I. (2013). Fungal Endophyte (Epichloë festucae) Alters the Nutrient Content of Festuca rubra Regardless of Water Availability. PloS one, 8(12), e84539.
See Also
For clusters not provided by the user the function uses the standard procedures in hclust and kmeans.
Examples
data(Protein)
bip=PCA.Biplot(Protein[,3:11])
plot(bip)
# Add user defined clusters containing the region (North, South, Center)
bip=AddCluster2Biplot(bip, ClusterType="us", Groups=Protein$Region)
plot(bip, mode="a", margin=0.1, PlotClus=TRUE)
# Hierarchical clustering on the biplot coordinates using the Ward method
bip=AddCluster2Biplot(bip, ClusterType="hi", method="ward.D")
op <- par(mfrow=c(1,2))
plot(bip, mode="s", margin=0.1, PlotClus=TRUE)
plot(bip$Dendrogram)
par(op)
# K-means clustering on the biplot coordinates
bip=AddCluster2Biplot(bip, ClusterType="km")
plot(bip, mode="s", margin=0.1, PlotClus=TRUE)
Adds supplementary continuous variables to a biplot object
Description
Adds supplementary continuous variables to a biplot object
Usage
AddContVars2Biplot(bip, X, dims = NULL, Scaling = 5, Fit = NULL)
Arguments
bip |
A biplot object |
X |
Matrix containing the supplementary continuous variables |
dims |
Dimension of the solution |
Scaling |
Transformation to apply to X |
Fit |
Type of fit. Linear by default. |
Details
More types of fit will be added in the future
Value
A biplot object with the coordinates for the supplementary variables added.
Author(s)
Jose Luis Vicente Villardon
Examples
# Not yet
Adds supplementary ordinal variables to an existing biplot object
Description
Adds supplementary ordinal variables to an existing biplot object.
Usage
AddOrdVars2Biplot(bip, Y, tol = 1e-06, maxiterlogist = 100,
penalization = 0.2, showiter = TRUE, show = FALSE)
Arguments
bip |
A biplot object. |
Y |
A matrix of ordinal variables. |
tol |
Tolerance. |
maxiterlogist |
Maximum number of iterations for the logistic fit. |
penalization |
Penalization for the logistic fit |
showiter |
Should the iterations be shown on screen? |
show |
Show details. |
Details
Adds supplementary ordinal variables to an existing biplot object.
Value
An object with the information of the fits
Author(s)
Jose Luis Vicente-Villardon
References
Vicente-Villardon, J. L., & Hernandez-Sanchez, J. C. (2020). External Logistic Biplots for Mixed Types of Data. In Advanced Studies in Classification and Data Science (pp. 169-183). Springer, Singapore.
Examples
# not yet
Adds supplementary variables to a biplot object
Description
Adds supplementary variables to a biplot object constructed with any of the biplot methods of the package. The new variables are fitted using the coordinates for the rows. Each variable is fitted using the procedure appropriate for its type.
Usage
AddSupVars2Biplot(bip, X)
Arguments
bip |
The biplot object |
X |
A data frame with the supplementary variables. |
Details
Binary, nominal or ordinal variables are fitted using logistic biplots. Continuous variables are fitted with linear regression.
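For the continuous case, fitting a supplementary variable by linear regression on the row coordinates can be sketched in base R. The example uses prcomp scores in place of a biplot object and does not call the package; `A`, `sup`, `direction` and `r2` are hypothetical names.

```r
# Sketch only: regress a supplementary continuous variable on the row coordinates.
pca <- prcomp(iris[, 1:3], scale. = TRUE)
A <- pca$x[, 1:2]                           # row coordinates in two dimensions
sup <- as.numeric(scale(iris$Petal.Width))  # supplementary variable, standardized

fit <- lm(sup ~ A)
direction <- coef(fit)[-1]                  # arrow for the supplementary variable
r2 <- summary(fit)$r.squared                # quality of its representation
```

The regression slopes give the direction of the new variable on the plot, and the R-squared indicates how well the plane predicts it.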
Value
A biplot object with the coordinates for the supplementary variables added.
Author(s)
Jose Luis Vicente Villardon
Examples
# Not yet
Bartlett tests
Description
Bartlett tests for the columns of a matrix and a grouping variable
Usage
Bartlett.Tests(X, groups = NULL)
Arguments
X |
A data frame or a matrix containing several numerical variables |
groups |
A factor with the groups |
Details
Bartlett tests for the columns of a matrix and a grouping variable
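A comparable column-by-column calculation can be sketched with the bartlett.test function from the stats package. This is only an illustration of the idea with the iris data; the layout of the resulting matrix is an assumption and need not match what Bartlett.Tests returns.

```r
# Sketch only: Bartlett test of homogeneity of variances for each column.
X <- iris[, 1:4]
groups <- iris$Species

tests <- t(sapply(X, function(col) {
  bt <- bartlett.test(col, groups)
  c(statistic = unname(bt$statistic),
    df        = unname(bt$parameter),
    p.value   = bt$p.value)
}))
```

Each row of `tests` corresponds to one variable, with the test statistic, its degrees of freedom and the p-value.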
Value
A matrix with the tests for each column
Author(s)
Jose Luis Vicente Villardon
References
Bartlett, M. S. (1937). "Properties of sufficiency and statistical tests". Proceedings of the Royal Society of London, Series A 160, 268-282. JSTOR 96803
Examples
data(wine)
Bartlett.Tests(wine[,4:8], groups = wine$Origin)
Basic descriptive statistics
Description
Basic descriptive statistics of several variables by the categories of a factor.
Usage
BasicDescription(X, groups = NULL, SortByGroups = FALSE, na.rm = FALSE, Intervals = TRUE)
Arguments
X |
A data frame or a matrix containing several numerical variables |
groups |
A factor with the groupings |
SortByGroups |
Sorting by groups |
na.rm |
a logical value indicating whether NA values should be stripped before the computation proceeds. |
Intervals |
Should the confidence intervals be calculated? |
Details
Basic descriptive statistics of several variables by the categories of a factor.
Value
A list with the description of each variable.
Author(s)
Jose Luis Vicente Villardon
Examples
data(wine)
BasicDescription(wine[,4:8], groups = wine$Origin)
Binary Distances
Description
Calculates distances among rows of a binary data matrix or among the rows of two binary matrices. The end user will use BinaryProximities rather than this function. Input must be a matrix with 0 or 1 values.
Usage
BinaryDistances(x, y = NULL, coefficient= "Simple_Matching", transformation="sqrt(1-S)")
Arguments
x |
Main binary data matrix. Distances among rows are calculated if y=NULL. |
y |
Second binary data matrix. If not NULL the distances among the rows of x and y are calculated |
coefficient |
Similarity coefficient. Use the name (see details) |
transformation |
Transformation of the similarities. Use the name (see details) |
Details
The following coefficients are calculated
1.- Kulezynski = a/(b + c)
2.- Russell_and_Rao = a/(a + b + c+d)
3.- Jaccard = a/(a + b + c)
4.- Simple_Matching = (a + d)/(a + b + c + d)
5.- Anderberg = a/(a + 2 * (b + c))
6.- Rogers_and_Tanimoto = (a + d)/(a + 2 * (b + c) + d)
7.- Sorensen_Dice_and_Czekanowski = a/(a + 0.5 * (b + c))
8.- Sneath_and_Sokal = (a + d)/(a + 0.5 * (b + c) + d)
9.- Hamman = (a - (b + c) + d)/(a + b + c + d)
10.- Kulezynski = 0.5 * ((a/(a + b)) + (a/(a + c)))
11.- Anderberg2 = 0.25 * (a/(a + b) + a/(a + c) + d/(c + d) + d/(b + d))
12.- Ochiai = a/sqrt((a + b) * (a + c))
13.- S13 = (a * d)/sqrt((a + b) * (a + c) * (d + b) * (d + c))
14.- Pearson_phi = (a * d - b * c)/sqrt((a + b) * (a + c) * (d + b) * (d + c))
15.- Yule = (a * d - b * c)/(a * d + b * c)
The following transformations of the similarities are calculated
1.- 'Identity' dis=sim
2.- '1-S' dis=1-sim
3.- 'sqrt(1-S)' dis=sqrt(1-sim)
4.- '-log(s)' dis=-1*log(sim)
5.- '1/S-1' dis=1/sim-1
6.- 'sqrt(2(1-S))' dis=sqrt(2*(1-sim))
7.- '1-(S+1)/2' dis=1-(sim+1)/2
8.- '1-abs(S)' dis=1-abs(sim)
9.- '1/(S+1)' dis=1/(sim+1)
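In the formulas above, a counts the characters present in both rows, b and c those present in only one of them, and d those absent from both. The following base R sketch computes these counts with matrix products and derives two of the coefficients and the 'sqrt(1-S)' transformation. It does not call the package, and the object names are hypothetical.

```r
# Sketch only: matching counts and two similarity coefficients for binary rows.
X <- matrix(c(1, 1, 0, 0,
              1, 0, 1, 0,
              0, 0, 1, 1), nrow = 3, byrow = TRUE)

n_a <- X %*% t(X)               # a: present in both rows
n_b <- X %*% t(1 - X)           # b: present in row i only
n_c <- (1 - X) %*% t(X)         # c: present in row j only
n_d <- (1 - X) %*% t(1 - X)     # d: absent from both rows

jaccard         <- n_a / (n_a + n_b + n_c)
simple_matching <- (n_a + n_d) / (n_a + n_b + n_c + n_d)

D <- sqrt(1 - simple_matching)  # 'sqrt(1-S)' transformation
```

For rows 1 and 2 of this toy matrix the counts are a = b = c = d = 1, so the simple matching coefficient is 0.5 and the Jaccard coefficient is 1/3.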
Value
An object of class proximities.
Author(s)
Jose Luis Vicente-Villardon
References
Gower, J. C. (2006) Similarity dissimilarity and Distance, measures of. Encyclopedia of Statistical Sciences. 2nd. ed. Volume 12. Wiley
Examples
data(spiders)
Binary logistic biplot with the EM algorithm.
Description
Binary logistic biplot with the EM algorithm
Usage
BinaryLogBiplotEM(x, freq = matrix(1, nrow(x), 1), aini = NULL,
dimens = 2, nnodos = 15, tol = 1e-04, maxiter = 100, penalization = 0.2)
Arguments
x |
A binary data matrix |
freq |
A vector of frequencies. |
aini |
Initial values for the row coordinates. |
dimens |
Dimension of the solution. |
nnodos |
Number of nodes for the Gaussian quadrature |
tol |
Tolerance |
maxiter |
Maximum number of iterations. |
penalization |
Penalization for the fit (ridge) |
Details
Binary logistic biplot with the EM algorithm based on marginal maximum likelihood.
Value
A logistic biplot object.
Author(s)
Jose Luis Vicente-Villardon
References
Vicente-Villardón, J. L., Galindo-Villardón, M. P., & Blázquez-Zaballos, A. (2006). Logistic biplots. Multiple correspondence analysis and related methods. London: Chapman & Hall, 503-521.
Examples
# Not yet
Binary Logistic Biplot with Gradient Descent Estimation
Description
Binary Logistic Biplot with Gradient Descent Estimation. An external optimization function is used to calculate the parameters.
Usage
BinaryLogBiplotGD(X, freq = matrix(1, nrow(X), 1), dim = 2, tolerance =
1e-07, penalization = 0.01, num_max_iters = 100,
RotVarimax = FALSE, seed = 0, OptimMethod = "CG",
Initial = "random", Orthogonalize = FALSE, Algorithm =
"Joint", ...)
Arguments
X |
A binary data matrix |
freq |
Frequencies of each row, where applicable. |
dim |
Dimension of the final solution. |
tolerance |
Tolerance for convergence of the algorithm. |
penalization |
Ridge penalization constant. |
num_max_iters |
Maximum number of iterations of the algorithm. |
RotVarimax |
Should the final solution be rotated? |
seed |
Seed for the random numbers. Used for reproducibility. |
OptimMethod |
Optimization method used by |
Initial |
Initial configuration to start the iterations. |
Orthogonalize |
Should the solution be orthogonalized? |
Algorithm |
Algorithm for estimation: Joint or alternated. |
... |
Additional parameters used by the optimization function. |
Details
Fits a binary logistic biplot using gradient descent. The general function optim is used to optimize the loss function. Conjugate gradient is used by default, although other alternatives can be used.
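A joint fit of this kind can be sketched with optim and the "CG" method in base R. The loss below (Bernoulli negative log-likelihood plus a ridge penalty on row and column coordinates) is an assumption about the general form of the problem, not the package's actual loss; `unpack`, `loss` and the simulated data are hypothetical, and gradients are left to optim's numerical approximation for brevity.

```r
# Sketch only: joint estimation of a 2-D logistic biplot with optim(method = "CG").
set.seed(0)
n <- 30; p <- 6; k <- 2
X <- matrix(rbinom(n * p, 1, 0.5), n, p)     # simulated binary data

# Split the parameter vector into row coordinates, constants and column coordinates
unpack <- function(par) {
  list(A  = matrix(par[1:(n * k)], n, k),
       b0 = par[n * k + 1:p],
       B  = matrix(par[n * k + p + 1:(p * k)], p, k))
}

loss <- function(par, penalization = 0.01) {
  th  <- unpack(par)
  eta <- sweep(th$A %*% t(th$B), 2, th$b0, "+")   # linear predictor
  sum(log1p(exp(eta)) - X * eta) +
    penalization * (sum(th$A^2) + sum(th$B^2))
}

start <- rnorm(n * k + p + p * k, sd = 0.1)
fit <- optim(start, loss, method = "CG", control = list(maxit = 100))

th <- unpack(fit$par)
P <- plogis(sweep(th$A %*% t(th$B), 2, th$b0, "+"))  # fitted probabilities
```

Supplying an analytic gradient to optim would normally be preferable for larger problems; it is omitted here to keep the sketch short.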
Value
An object of class "Binary.Logistic.Biplot".
Author(s)
Jose Luis Vicente-Villardon
References
Vicente-Villardon, J. L., Galindo, M. P. and Blazquez, A. (2006) Logistic Biplots. In Multiple Correspondence Analysis and Related Methods. Greenacre, M. & Blasius, J., Eds, Chapman and Hall, Boca Raton.
Demey, J., Vicente-Villardon, J. L., Galindo, M. P. and Zambrano, A. (2008) Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics, 24(24): 2832-2838.
Examples
data(spiders)
X=Dataframe2BinaryMatrix(spiders)
logbip=BinaryLogBiplotGD(X,penalization=0.1)
plot(logbip, Mode="a")
summary(logbip)
Binary Logistic Biplot with Recursive Gradient Descent Estimation
Description
Binary Logistic Biplot with Recursive Gradient Descent Estimation. An external optimization function is used to calculate the parameters.
Usage
BinaryLogBiplotGDRecursive(X, freq = matrix(1, nrow(X), 1), dim = 2, tolerance = 1e-04,
penalization = 0.2, num_max_iters = 100,
RotVarimax = FALSE, OptimMethod = "CG",
Initial = "random", ...)
Arguments
X |
A binary data matrix |
freq |
Frequencies of each row, where applicable. |
dim |
Dimension of the final solution. |
tolerance |
Tolerance for convergence of the algorithm. |
penalization |
Ridge penalization constant. |
num_max_iters |
Maximum number of iterations of the algorithm. |
RotVarimax |
Should the final solution be rotated? |
OptimMethod |
Optimization method used by |
Initial |
Initial configuration to start the iterations. |
... |
Additional parameters used by the optimization function. |
Details
Fits a binary logistic biplot using recursive gradient descent. The general function optim is used to optimize the loss function. Conjugate gradient is used by default, although other alternatives can be used. It can be considered a generalization of the NIPALS algorithm for a matrix of binary data.
Value
An object of class "Binary.Logistic.Biplot".
Author(s)
José Luis Vicente Villardon
References
Vicente-Villardon, J. L., Galindo, M. P. and Blazquez, A. (2006) Logistic Biplots. In Multiple Correspondence Analysis and Related Methods. Greenacre, M. & Blasius, J., Eds, Chapman and Hall, Boca Raton.
Demey, J., Vicente-Villardon, J. L., Galindo, M. P. and Zambrano, A. (2008) Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics, 24(24): 2832-2838.
Examples
data(spiders)
X=Dataframe2BinaryMatrix(spiders)
logbip=BinaryLogBiplotGDRecursive(X,penalization=0.1)
plot(logbip, Mode="a")
summary(logbip)
Binary logistic biplot with a gradient descent algorithm.
Description
Binary logistic biplot with a gradient descent algorithm.
Usage
BinaryLogBiplotJoint(x, freq = matrix(1, nrow(x), 1), dim = 2,
ainit = NULL, tolerance = 1e-04, maxiter = 30, penalization = 0.2,
maxcond = 7, RotVarimax = FALSE, lambda = 0.1, ...)
Arguments
x |
A binary data matrix |
freq |
A vector of frequencies. |
dim |
Dimension of the solution |
ainit |
Initial values for the row coordinates. |
tolerance |
Tolerance |
maxiter |
Maximum number of iterations. |
penalization |
Penalization for the fit (ridge) |
maxcond |
Maximum condition number |
RotVarimax |
Should a Varimax Rotation be used? |
lambda |
Penalization argument |
... |
Additional arguments |
Details
Binary logistic biplot with a gradient descent algorithm. Estimates row and column parameters at the same time.
Value
A logistic biplot object.
Author(s)
Jose Luis Vicente-Villardon
References
Vicente-Villardón, J. L., Galindo-Villardón, M. P., & Blázquez-Zaballos, A. (2006). Logistic biplots. Multiple correspondence analysis and related methods. London: Chapman & Hall, 503-521.
Vicente-Villardon, J. L., & Vicente-Gonzalez, L. Redundancy Analysis for Binary Data Based on Logistic Responses in Data Analysis and Rationality in a Complex World. Springer.
Examples
# not yet
Binary logistic biplot with Item Response Theory.
Description
Binary logistic biplot with Item Response Theory.
Usage
BinaryLogBiplotMirt(x, dimens = 2, tolerance = 1e-04,
maxiter = 30, penalization = 0.2, Rotation = "varimax", ...)
Arguments
x |
The binary Data matrix |
dimens |
Dimension of the solution |
tolerance |
Tolerance of the algorithm |
maxiter |
Maximum number of iterations |
penalization |
Ridge penalization |
Rotation |
Should a rotation be applied? |
... |
Additional arguments. |
Details
Binary logistic biplot with Item Response Theory.
Value
A logistic biplot object.
Author(s)
Jose Luis Vicente Villardon
References
Vicente-Villardón, J. L., Galindo-Villardón, M. P., & Blázquez-Zaballos, A. (2006). Logistic biplots. Multiple correspondence analysis and related methods. London: Chapman & Hall, 503-521.
Examples
# Not yet
Binary Logistic Biplot
Description
Fits a binary logistic biplot to a binary data matrix.
Usage
BinaryLogisticBiplot(x, dim = 2, compress = FALSE, init = "mca",
method = "EM", rotation = "none", tol = 1e-04,
maxiter = 100, penalization = 0.2, similarity = "Simple_Matching", ...)
Arguments
x |
The binary data matrix |
dim |
Dimension of the solution |
compress |
Compress the data before the fitting (not yet implemented) |
init |
Type of initial configuration. ("random", "mirt", "PCoA", "mca") |
method |
Method to fit the logistic biplot ("EM", "Joint", "mirt", "JointGD", "AlternatedGD", "External", "Recursive") |
rotation |
Rotation of the solution ("none", "oblimin", "quartimin", "oblimax", "entropy", "quartimax", "varimax", "simplimax"); see GPArotation |
tol |
Tolerance for the algorithm |
maxiter |
Maximum number of iterations. |
penalization |
Penalization for the different algorithms |
similarity |
Similarity coefficient for the initial configuration or the external model |
... |
Any other argument for each particular method. |
Details
Fits a binary logistic biplot to a binary data matrix.
Different Initial configurations can be selected:
1.- random : Random coordinates for each point.
2.- mirt: scores of the procedure mirt (Multidimensional Item Response Theory)
3.- PCoA: Principal Coordinates Analysis
4.- mca: Multiple Correspondence Analysis
Different estimation methods can also be used:
1.- Joint: Joint estimation of the row and column parameters. The initial algorithm.
2.- EM: Marginal Maximum Likelihood
3.- mirt: Similar to the previous one, but fitted using the package mirt.
4.- JointGD: Joint estimation of the row and column parameters using the gradient descent method.
5.- AlternatedGD: Alternated estimation of the row and column parameters using the gradient descent method.
6.- External: Logistic fits on the Principal Coordinates Analysis.
7.- Recursive: Recursive (one axis at a time) estimation of the row and column parameters using the gradient descent method. This is similar to the NIPALS algorithm for PCA.
Value
A Logistic Biplot object.
Author(s)
Jose Luis Vicente Villardon
References
Vicente-Villardon, J. L., Galindo, M. P. and Blazquez, A. (2006) Logistic Biplots. In Multiple Correspondence Analysis and Related Methods. Greenacre, M. & Blasius, J., Eds, Chapman and Hall, Boca Raton.
Demey, J., Vicente-Villardon, J. L., Galindo, M. P. and Zambrano, A. (2008) Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics, 24(24): 2832-2838.
See Also
BinaryLogBiplotJoint, BinaryLogBiplotEM, BinaryLogBiplotGD, BinaryLogBiplotMirt
Examples
# data(spiders)
# X=Dataframe2BinaryMatrix(spiders)
# logbip=BinaryLogBiplotGD(X,penalization=0.1)
# plot(logbip, Mode="a")
# summary(logbip)
Binary PLS Regression.
Description
Fits Binary PLS regression.
Usage
BinaryPLSFit(Y, X, S = 2, tolerance = 5e-06, maxiter = 100, show = FALSE,
penalization = 0.1, OptimMethod = "CG", seed = 0)
Arguments
Y |
The response |
X |
The matrix of independent variables |
S |
The Dimension of the solution |
tolerance |
Tolerance for convergence of the algorithm |
maxiter |
Maximum Number of iterations |
show |
Show the steps of the algorithm |
penalization |
Penalization for the Ridge Logistic Regression |
OptimMethod |
Optimization methods from optimr |
seed |
Seed. By default is 0. |
Details
Fits Binary PLS Regression. It is used by a higher-level function.
Value
The PLS fit used by the BinaryPLSR function.
Author(s)
Jose Luis Vicente Villardon
References
Ugarte Fajardo, J., Bayona Andrade, O., Criollo Bonilla, R., Cevallos‐Cevallos, J., Mariduena‐Zavala, M., Ochoa Donoso, D., & Vicente Villardon, J. L. (2020). Early detection of black Sigatoka in banana leaves using hyperspectral images. Applications in plant sciences, 8(8), e11383.
Vicente-Gonzalez, L., & Vicente-Villardon, J. L. (2022). Partial Least Squares Regression for Binary Responses and Its Associated Biplot Representation. Mathematics, 10(15), 2580.
Examples
## Not yet
Partial Least Squares Regression with Binary Data
Description
Fits Partial Least Squares Regression with Binary Data
Usage
BinaryPLSR(Y, X, S = 2, tolerance = 5e-05, maxiter = 100, show = FALSE,
penalization = 0.1, OptimMethod = "CG", seed = 0)
Arguments
Y |
The response |
X |
The matrix of independent variables |
S |
The Dimension of the solution |
tolerance |
Tolerance for convergence of the algorithm |
maxiter |
Maximum Number of iterations |
show |
Show the steps of the algorithm |
penalization |
Penalization for the Ridge Logistic Regression |
OptimMethod |
Optimization methods from optim |
seed |
Seed. By default is 0. |
Details
The function fits the PLSR method for the case when there are two sets of binary variables, using logistic rather than linear fits to take into account the nature of responses. We term the method BPLSR (Binary Partial Least Squares Regression). This can be considered as a generalization of the NIPALS algorithm when the data are all binary.
Value
Method |
Description of 'comp1' |
X |
The predictors matrix |
Y |
The responses matrix |
ScaledX |
The scaled X matrix |
tolerance |
Tolerance used in the algorithm |
maxiter |
Maximum number of iterations used |
penalization |
Ridge penalization |
XScores |
Scores of the X matrix, used later for the biplot |
XLoadings |
Loadings of the X matrix |
YScores |
Scores of the Y matrix |
YLoadings |
Loadings of the Y matrix |
XStructure |
Correlations among the X variables and the PLS scores |
InterceptsY |
Intercepts for the Y loadings |
InterceptsX |
Intercepts for the X loadings |
LinTerm |
Linear terms for each response |
Expected |
Expected probabilities for the responses |
Predictions |
Binary predictions of the responses |
PercentCorrect |
Global percent of correct predictions |
PercentCorrectCols |
Percent of correct predictions for each column |
Author(s)
José Luis Vicente Villardon
References
Ugarte Fajardo, J., Bayona Andrade, O., Criollo Bonilla, R., Cevallos‐Cevallos, J., Mariduena‐Zavala, M., Ochoa Donoso, D., & Vicente Villardon, J. L. (2020). Early detection of black Sigatoka in banana leaves using hyperspectral images. Applications in plant sciences, 8(8), e11383.
Vicente-Gonzalez, L., & Vicente-Villardon, J. L. (2022). Partial Least Squares Regression for Binary Responses and Its Associated Biplot Representation. Mathematics, 10(15), 2580.
Examples
X=as.matrix(wine[,4:21])
Y=cbind(Factor2Binary(wine[,1])[,1], Factor2Binary(wine[,2])[,1])
rownames(Y)=wine[,3]
colnames(Y)=c("Year", "Origin")
pls=BinaryPLSR(Y,X, penalization=0.1, show=TRUE, S=2)
Proximity Measures for Binary Data
Description
Calculation of proximities among rows or columns of a binary data matrix or a data frame that will be converted into a binary data matrix.
Usage
BinaryProximities(x, y = NULL, coefficient = "Jaccard", transformation =
NULL, transpose = FALSE, ...)
Arguments
x |
A data frame or a binary data matrix. Proximities among the rows of |
y |
Supplementary data. The proximities among the rows of |
coefficient |
Similarity coefficient. Use the number or the name (see details) |
transformation |
Transformation of the similarities. Use the number or the name (see details) |
transpose |
Logical. If |
... |
Used to provide additional parameters for the conversion of the dataframe into a binary matrix |
Details
A binary data matrix is a matrix with values 0 or 1 coding the absence or presence of several binary characters. When a data frame is provided, every variable in the data frame is converted to a binary variable using the function Dataframe2BinaryMatrix. Factors with two levels are converted directly to binary variables, factors with more than two levels are converted to a matrix with as many columns as levels, and numerical variables are converted to binary variables using a cut point that can be the median, the mean or a value provided by the user.
The following coefficients are calculated
1.- Kulezynski = a/(b + c)
2.- Russell_and_Rao = a/(a + b + c+d)
3.- Jaccard = a/(a + b + c)
4.- Simple_Matching = (a + d)/(a + b + c + d)
5.- Anderberg = a/(a + 2 * (b + c))
6.- Rogers_and_Tanimoto = (a + d)/(a + 2 * (b + c) + d)
7.- Sorensen_Dice_and_Czekanowski = a/(a + 0.5 * (b + c))
8.- Sneath_and_Sokal = (a + d)/(a + 0.5 * (b + c) + d)
9.- Hamman = (a - (b + c) + d)/(a + b + c + d)
10.- Kulezynski = 0.5 * ((a/(a + b)) + (a/(a + c)))
11.- Anderberg2 = 0.25 * (a/(a + b) + a/(a + c) + d/(c + d) + d/(b + d))
12.- Ochiai = a/sqrt((a + b) * (a + c))
13.- S13 = (a * d)/sqrt((a + b) * (a + c) * (d + b) * (d + c))
14.- Pearson_phi = (a * d - b * c)/sqrt((a + b) * (a + c) * (d + b) * (d + c))
15.- Yule = (a * d - b * c)/(a * d + b * c)
The following transformations of the similarities are calculated
1.- 'Identity' dis=sim
2.- '1-S' dis=1-sim
3.- 'sqrt(1-S)' dis=sqrt(1-sim)
4.- '-log(s)' dis=-1*log(sim)
5.- '1/S-1' dis=1/sim-1
6.- 'sqrt(2(1-S))' dis=sqrt(2*(1-sim))
7.- '1-(S+1)/2' dis=1-(sim+1)/2
8.- '1-abs(S)' dis=1-abs(sim)
9.- '1/(S+1)' dis=1/(sim+1)
Note that, after transformation the similarities are converted to distances except for "Identity". Not all the transformations are suitable for all the coefficients. Use them at your own risk. The default values are admissible combinations.
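The conversion rules for data frames described in these details can be sketched in base R. The helper `to_binary` below is hypothetical and is not the package's Dataframe2BinaryMatrix; in particular, coding the second factor level as 1, using a strict "greater than" comparison at the cut point and the column naming scheme are assumptions.

```r
# Sketch only: convert a data frame to a 0/1 matrix following the stated rules.
to_binary <- function(df, cutpoint = median) {
  cols <- lapply(names(df), function(nm) {
    v <- df[[nm]]
    if (is.numeric(v)) {
      # numerical variable: 1 when above the cut point (median by default)
      out <- matrix(as.integer(v > cutpoint(v)), ncol = 1,
                    dimnames = list(NULL, nm))
    } else {
      v <- factor(v)
      if (nlevels(v) == 2) {
        # two-level factor: a single binary column
        out <- matrix(as.integer(v == levels(v)[2]), ncol = 1,
                      dimnames = list(NULL, nm))
      } else {
        # more than two levels: one indicator column per level
        out <- sapply(levels(v), function(l) as.integer(v == l))
        colnames(out) <- paste(nm, levels(v), sep = "_")
      }
    }
    out
  })
  do.call(cbind, cols)
}

toy <- data.frame(size = c(1, 5, 9, 3),
                  sex  = factor(c("f", "m", "m", "f")),
                  zone = factor(c("n", "s", "c", "n")))
B <- to_binary(toy)
```

Passing `cutpoint = mean` would use the mean instead of the median as the threshold for numerical variables.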
Value
An object of class proximities. This has components:
TypeData |
Binary, Continuous or Mixed. Binary in this case. |
Coefficient |
Coefficient used to calculate the proximities |
Transformation |
Transformation used to calculate the proximities |
Data |
Data used to calculate the proximities |
SupData |
Supplementary Data, if any |
Proximities |
Proximities among rows of |
SupProximities |
Proximities among rows of |
Author(s)
Jose Luis Vicente-Villardon
References
Gower, J. C. (2006) Similarity dissimilarity and Distance, measures of. Encyclopedia of Statistical Sciences. 2nd. ed. Volume 12. Wiley
See Also
BinaryDistances, Dataframe2BinaryMatrix
Examples
data(spiders)
D=BinaryProximities(spiders, coefficient="Jaccard", transformation="sqrt(1-S)")
D2=BinaryProximities(spiders, coefficient=3, transformation=3)
Biplot for a PLSR model with binary data
Description
Builds a Biplot for a PLSR model with binary data
Usage
Biplot.BinaryPLSR(plsr, BinBiplotType=1)
Arguments
plsr |
A BinaryPLSR object |
BinBiplotType |
The type of biplot. 1: the biplot resulting from the fit, for the binary data. 2: the biplot for the coefficients. |
Details
Builds a Biplot for a PLSR model with binary data. The result is a biplot for the matrix of binary predictors (X), with the binary responses added as supplementary variables. There are two possible types: 1 for the biplot obtained directly in the fit (the default) and 2 for the biplot obtained after refitting the binary variables using Ridge Logistic Regression.
Value
An object of class Binary.Logistic.Biplot
Author(s)
Jose Luis Vicente Villardon
References
Ugarte Fajardo, J., Bayona Andrade, O., Criollo Bonilla, R., Cevallos‐Cevallos, J., Mariduena‐Zavala, M., Ochoa Donoso, D., & Vicente Villardon, J. L. (2020). Early detection of black Sigatoka in banana leaves using hyperspectral images. Applications in plant sciences, 8(8), e11383.
Vicente-Gonzalez, L., & Vicente-Villardon, J. L. (2022). Partial Least Squares Regression for Binary Responses and Its Associated Biplot Representation. Mathematics, 10(15), 2580.
Examples
X=as.matrix(wine[,4:21])
Y=cbind(Factor2Binary(wine[,1])[,1], Factor2Binary(wine[,2])[,1])
rownames(Y)=wine[,3]
colnames(Y)=c("Year", "Origin")
pls=PLSRBin(Y,X, penalization=0.1, show=TRUE, S=2)
plsbip=Biplot.PLSRBIN(pls, BinBiplotType=1)
plsbip=AddCluster2Biplot(plsbip, ClusterType = "us",
Groups = wine$Group)
plot(plsbip, margin=0.05, mode="s", PlotClus = TRUE,
ModeSupBinVars = "s", ShowAxis = FALSE,
ColorSupBinVars = "blue", CexInd=0.5,
ClustCenters = TRUE, LabelInd = FALSE, ShowBox = TRUE)
Partial Least Squares Biplot
Description
Adds a Biplot to a Partial Least Squares (plsr) object.
Usage
Biplot.PLSR(plsr)
Arguments
plsr |
A plsr object from the PLSR function |
Details
Adds a Biplot to a Partial Least Squares (plsr) object. The biplot is constructed from the matrix of predictors; the dependent variable is projected onto the biplot as a continuous supplementary variable.
Value
An object of class ContinuousBiplot with the dependent variables as supplementary.
Author(s)
Jose Luis Vicente Villardon
References
Oyedele, O. F., & Lubbe, S. (2015). The construction of a partial least-squares biplot. Journal of Applied Statistics, 42(11), 2449-2460.
Examples
X=as.matrix(wine[,4:21])
y=as.numeric(wine[,2])-1
mifit=PLSR(y,X, Validation="None")
mibip=Biplot.PLSR(mifit)
plot(mibip, PlotVars=TRUE, IndLabels = y, ColorInd=y+1)
Biplot for a PLSR model with a binary response
Description
Biplot for a PLSR model with a binary response
Usage
Biplot.PLSR1BIN(plsr)
Arguments
plsr |
An object of class PLSR1BIN. |
Details
Biplot for a PLSR model with a binary response
Value
The biplot for the independent variables with the response as supplementary binary variable.
Author(s)
Jose Luis Vicente Villardon
References
Ugarte-Fajardo, J., Bayona-Andrade, O., Criollo-Bonilla, R., Cevallos-Cevallos, J., Mariduena-Zavala, M., Ochoa-Donoso, D., & Vicente-Villardon, J. L. (2020). Early detection of black Sigatoka in banana leaves using hyperspectral images. Applications in plant sciences, 8(8), e11383.
Examples
# Not Yet
Biplot for a PLSR model with binary responses
Description
Builds a Biplot for a PLSR model with binary responses
Usage
Biplot.PLSRBIN(plsr, BinBiplotType = 1)
Arguments
plsr |
A PLSRBin object |
BinBiplotType |
The type of biplot. 1: the biplot resulting from the fit, for the binary responses. 2: the biplot for the coefficients. |
Details
Builds a Biplot for a PLSR model with binary responses. The result is a biplot for the matrix of predictors (X), with the binary responses added as supplementary variables. There are two possible types: 1 for the biplot obtained directly in the fit (the default) and 2 for the biplot obtained after refitting the binary variables using Ridge Logistic Regression.
Value
An object of class ContinuousBiplot
Author(s)
Jose Luis Vicente Villardon
References
Ugarte Fajardo, J., Bayona Andrade, O., Criollo Bonilla, R., Cevallos‐Cevallos, J., Mariduena‐Zavala, M., Ochoa Donoso, D., & Vicente Villardon, J. L. (2020). Early detection of black Sigatoka in banana leaves using hyperspectral images. Applications in plant sciences, 8(8), e11383.
Examples
X=as.matrix(wine[,4:21])
Y=cbind(Factor2Binary(wine[,1])[,1], Factor2Binary(wine[,2])[,1])
rownames(Y)=wine[,3]
colnames(Y)=c("Year", "Origin")
pls=PLSRBin(Y,X, penalization=0.1, show=TRUE, S=2)
plsbip=Biplot.PLSRBIN(pls, BinBiplotType=1)
plsbip=AddCluster2Biplot(plsbip, ClusterType = "us",
Groups = wine$Group)
plot(plsbip, margin=0.05, mode="s", PlotClus = TRUE,
ModeSupBinVars = "s", ShowAxis = FALSE,
ColorSupBinVars = "blue", CexInd=0.5,
ClustCenters = TRUE, LabelInd = FALSE, ShowBox = TRUE)
External Biplot for functional data from a functional PCA object.
Description
The function calculates a biplot from a functional PCA object and the data used to calculate it.
Usage
BiplotFPCA(FPCA, X)
Arguments
FPCA |
Functional PCA object |
X |
Data used to calculate the functional PCA |
Details
The function calculates a biplot from a functional PCA object and the data used to calculate it. At this moment the function calculates only an external biplot by regressing X on the functional components. Future versions will include the internal biplot.
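The idea of an external biplot obtained by regression can be sketched in base R. The example below is only an illustration under stated assumptions: ordinary PCA scores of the iris measurements stand in for the functional components, and the names `A`, `B` and `markers` are hypothetical; it does not reproduce the internals of BiplotFPCA.

```r
# Scores on two components (plain PCA is used here as a stand-in for the
# functional components; this is an assumption made for illustration only)
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
sv <- svd(X)
A <- sv$u[, 1:2] %*% diag(sv$d[1:2])   # coordinates of the individuals

# Regress every column of X on the scores (no intercept, because X is
# column-centered); each column of B holds the slopes for one variable
B <- solve(crossprod(A), crossprod(A, X))

# Variable markers for the biplot: one row per variable
markers <- t(B)

# Rank-2 approximation of X implied by the biplot
Xhat <- A %*% B
```

With PCA scores the regression slopes coincide with the PCA loadings, so in that special case the external and internal biplots agree; with other score matrices the regression step is what places the variables on the plot.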
Value
A Continuous biplot object
Author(s)
José Luis Vicente Villardón
Examples
# not yet
Bootstrap on the distance matrices used for Principal Coordinates Analysis (PCoA)
Description
Obtains bootstrap replicates of a distance matrix using random samples or permutations of the residual matrix from a Principal Coordinates (Components) Analysis. The objective is to estimate the sampling variability of absorbed variances, coordinates and qualities of representation in a PCoA.
Usage
BootstrapDistance(D, W=diag(nrow(D)), nB=200, dimsol=2,
ProcrustesRot=TRUE, method=c("Sampling", "Permutation"))
Arguments
D |
A distance matrix |
W |
A diagonal matrix containing weights for the rows of D |
nB |
Number of Bootstrap replications |
dimsol |
Dimension of the solution |
ProcrustesRot |
Should each replication be rotated to match the initial solution? |
method |
The replications are obtained by "Sampling" or "Permutation" of the residuals. |
Details
The function calculates bootstrap confidence intervals for the inertia, coordinates and qualities of representation of a Principal Coordinates Analysis using a distance matrix as a basis. The function uses random sampling or permutations of the residuals to obtain the bootstrap replications. The procedure preserves the length of the points in the multidimensional space, perturbing only the angles among the vectors. This preserves the positiveness of the diagonal elements of the scalar product matrices. The procedure may result in a scalar product matrix that does not have a Euclidean configuration and therefore has some negative eigenvalues; to avoid this problem, the negative eigenvalues are removed so that the perturbed matrix is approximated by the closest matrix with the required properties.
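The removal of negative eigenvalues can be illustrated with a small base R sketch. It uses a made-up 3 x 3 distance matrix that is deliberately non-Euclidean, converts it to scalar products by double centering, and sets the negative eigenvalues to zero, which gives the closest positive semidefinite matrix in the least squares sense. This is a generic sketch, not the code used by BootstrapDistance.

```r
# Toy distance matrix that is not Euclidean (it violates the triangle inequality)
D <- matrix(c(0, 1, 5,
              1, 0, 1,
              5, 1, 0), nrow = 3, byrow = TRUE)
n <- nrow(D)

# Double centering: B = -1/2 * J D^2 J, with J the centering matrix
J <- diag(n) - matrix(1 / n, n, n)
B <- -0.5 * J %*% (D^2) %*% J

e <- eigen(B, symmetric = TRUE)

# Keep only the non-negative eigenvalues (negative ones are set to zero)
lambda <- pmax(e$values, 0)
B_psd <- e$vectors %*% diag(lambda) %*% t(e$vectors)

# Principal coordinates in two dimensions
coords <- e$vectors[, 1:2] %*% diag(sqrt(lambda[1:2]))
```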
It is well known that the eigenvectors of a matrix are unique except for reflections, that is, if we change the sign of each component of an eigenvector we have the same solution. If that happens in a replicate, the resulting artificial increase in variability may invalidate the results. To avoid this we can calculate the scalar product of each eigenvector of the initial matrix with the corresponding eigenvector of the bootstrap replicate and change the signs of the latter if the result is negative.
Another artifact of the procedure may arise when the dimension of the solution is higher than 1, because the eigenvectors of a replicate may generate the same subspace without pointing in the same directions, i.e., the subspace is referred to a different system of axes. That may also produce an unwanted increase of the variability that invalidates the results. To avoid this, every replicate may be rotated to match as closely as possible the subspace generated by the eigenvectors of the initial matrix. This is done by Procrustes Analysis, taking the rotated matrix as the solution. The solution to this problem is also a solution to the reflection problem, so only the rotation is considered.
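The Procrustes step can be sketched as an orthogonal rotation computed from a singular value decomposition. The function `procrustes_rotate` below is a hypothetical helper written for this illustration (no translation or scaling is applied); the package may implement the rotation differently.

```r
# Orthogonal Procrustes: find the orthogonal matrix R minimising ||Y R - X||
# and return the rotated configuration Y R (hypothetical helper)
procrustes_rotate <- function(X, Y) {
  s <- svd(crossprod(Y, X))      # Y'X = U D V'
  R <- s$u %*% t(s$v)            # optimal orthogonal matrix
  Y %*% R
}

set.seed(1)
X <- matrix(rnorm(20), ncol = 2)          # "initial" configuration
theta <- pi / 3
Rot <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
Y <- X %*% Rot %*% diag(c(1, -1))         # rotated and reflected copy of X

Y_aligned <- procrustes_rotate(X, Y)
```

Because the optimal matrix may include a reflection, the same step also undoes sign changes of the axes, which matches the remark that solving the rotation problem solves the reflection problem as well.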
Value
Returns an object of class "PCoABootstrap" with the information for each bootstrap replication.
Eigenvalues |
A matrix with dimensions in rows and replicates in columns containing the eigenvalues of each replicate in columns |
Inertias |
A matrix with dimensions in rows and replicates in columns containing the inertias of each replicate in columns |
Coordinates |
A list with a component for each object. A component contains the coordinates of an object for each replicate (in columns) |
Values-Table |
A list with a component for each object. A component contains the qualities of an object for each replicate (in columns) |
NReplicates |
Number of bootstrap replicates |
Author(s)
Jose L. Vicente-Villardon
References
Efron, B.; Tibshirani, RJ. (1993). An introduction to the bootstrap. New York: Chapman and Hall. 436p.
Ringrose, T. J. (1992). Bootstrapping and correspondence analysis in archaeology. Journal of Archaeological Science, 19(6), 615-629.
Milan, L., & Whittaker, J. (1995). Application of the parametric bootstrap to models that incorporate a singular value decomposition. Applied statistics, 44(1), 31-49.
See Also
BootstrapScalar
Examples
data(spiders)
D=BinaryProximities(spiders, coefficient="Jaccard", transformation="sqrt(1-S)")
DB=BootstrapDistance(D$Proximities)
Bootstrap on the scalar product matrices used for Principal Coordinates Analysis (PCoA)
Description
Obtains bootstrap replicates of a scalar product matrix using random samples or permutations of the residual matrix from a Principal Coordinates (Components) Analysis. The objective is to estimate the sampling variability of absorbed variances, coordinates and qualities of representation in a PCoA.
Usage
BootstrapScalar(B, W=diag(nrow(B)), nB=200, dimsol=2,
ProcrustesRot=TRUE, method=c("Sampling", "Permutation"))
Arguments
B |
A scalar product matrix |
W |
A diagonal matrix containing weights for the rows of B |
nB |
Number of Bootstrap replications |
dimsol |
Dimension of the solution |
ProcrustesRot |
Should each replication be rotated to match the initial solution? |
method |
The replications are obtained by "Sampling" or "Permutation" of the residuals. |
Details
The function calculates bootstrap confidence intervals for the inertia, coordinates and qualities of representation of a Principal Coordinates Analysis using a scalar product matrix as a basis. The function uses random sampling or permutations of the residuals to obtain the bootstrap replications. The procedure preserves the length of the points in the multidimensional space, perturbing only the angles among the vectors. This preserves the positiveness of the diagonal elements of the scalar product matrices. The procedure may result in a scalar product matrix that does not have a Euclidean configuration and therefore has some negative eigenvalues; to avoid this problem, the negative eigenvalues are removed so that the perturbed matrix is approximated by the closest matrix with the required properties.
It is well known that the eigenvectors of a matrix are unique except for reflections, that is, if we change the sign of each component of an eigenvector we have the same solution. If that happens in a replicate, the resulting artificial increase in variability may invalidate the results. To avoid this we can calculate the scalar product of each eigenvector of the initial matrix with the corresponding eigenvector of the bootstrap replicate and change the signs of the latter if the result is negative.
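The sign correction described in this paragraph can be sketched in a few lines of base R. The helper `align_signs` and the small symmetric matrix are hypothetical and serve only to illustrate the rule; they are not taken from the package.

```r
# Flip the columns of V_rep whose scalar product with the matching
# column of V_ref is negative (hypothetical helper)
align_signs <- function(V_ref, V_rep) {
  s <- sign(colSums(V_ref * V_rep))
  s[s == 0] <- 1                     # leave orthogonal columns untouched
  sweep(V_rep, 2, s, `*`)
}

S <- matrix(c(4, 2, 0,
              2, 3, 1,
              0, 1, 2), nrow = 3, byrow = TRUE)
V_ref <- eigen(S, symmetric = TRUE)$vectors
V_rep <- V_ref %*% diag(c(-1, 1, -1))   # same eigenvectors, two of them reflected

V_fixed <- align_signs(V_ref, V_rep)
```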
Another artifact of the procedure may arise when the dimension of the solution is higher than 1, because the eigenvectors of a replicate may generate the same subspace without pointing in the same directions, i.e., the subspace is referred to a different system of axes. That may also produce an unwanted increase of the variability that invalidates the results. To avoid this, every replicate may be rotated to match as closely as possible the subspace generated by the eigenvectors of the initial matrix. This is done by Procrustes Analysis, taking the rotated matrix as the solution. The solution to this problem is also a solution to the reflection problem, so only the rotation is considered.
Value
Returns an object of class "PCoABootstrap" with the information for each bootstrap replication.
Eigenvalues |
A matrix with dimensions in rows and replicates in columns containing the eigenvalues of each replicate in columns |
Inertias |
A matrix with dimensions in rows and replicates in columns containing the inertias of each replicate in columns |
Coordinates |
A list with a component for each object. A component contains the coordinates of an object for each replicate (in columns) |
Values-Table |
A list with a component for each object. A component contains the qualities of an object for each replicate (in columns) |
NReplicates |
Number of bootstrap replicates |
Author(s)
Jose L. Vicente-Villardon
References
Efron, B.; Tibshirani, RJ. (1993). An introduction to the bootstrap. New York: Chapman and Hall. 436p.
Ringrose, T. J. (1992). Bootstrapping and correspondence analysis in archaeology. Journal of Archaeological Science, 19(6), 615-629.
Milan, L., & Whittaker, J. (1995). Application of the parametric bootstrap to models that incorporate a singular value decomposition. Applied statistics, 44(1), 31-49.
Examples
## Not yet
Bootstrap on the distance matrices used for MDS with Smacof
Description
Obtains bootstrap replicates of a distance matrix using random samples or permutations of a distance matrix. The objective is to estimate the sampling variability of the results of the Smacof algorithm.
Usage
BootstrapSmacof(D, W=NULL, Model=c("Identity", "Ratio", "Interval", "Ordinal"),
dimsol=2, maxiter=100, maxerror=0.000001, StandardizeDisparities=TRUE,
ShowIter=TRUE, nB=200, ProcrustesRot=TRUE,
method=c("Sampling", "Permutation"))
Arguments
D |
A distance matrix |
W |
A diagonal matrix containing weights for the rows of D |
Model |
Measurement level of the distances |
dimsol |
Dimension of the solution |
maxiter |
Maximum number of iterations for the smacof algorithm |
maxerror |
Tolerance for the smacof algorithm |
StandardizeDisparities |
Should the disparities be standardized in the smacof algorithm? |
ShowIter |
Should the information on each iteration be printed on the screen? |
nB |
Number of Bootstrap replications |
ProcrustesRot |
Should each replication be rotated to match the initial solution? |
method |
The replications are obtained by "Sampling" or "Permutation" of the residuals. |
Details
The function calculates bootstrap confidence intervals for coordinates and different stress measures using a distance matrix as a basis. The function uses random sampling or permutations of the residuals to obtain the bootstrap replications. The procedure preserves the length of the points in the multidimensional space, perturbing only the angles among the vectors. This preserves the positiveness of the diagonal elements of the scalar product matrices. The procedure may result in a scalar product matrix that does not have a Euclidean configuration and therefore has some negative eigenvalues; to avoid this problem, the negative eigenvalues are removed so that the perturbed matrix is approximated by the closest matrix with the required properties.
It is well known that the eigenvectors of a matrix are unique except for reflections, that is, if we change the sign of each component of an eigenvector we have the same solution. If that happens in a replicate, the resulting artificial increase in variability may invalidate the results. To avoid this we can calculate the scalar product of each eigenvector of the initial matrix with the corresponding eigenvector of the bootstrap replicate and change the signs of the latter if the result is negative.
Another artifact of the procedure may arise when the dimension of the solution is higher than 1, because the eigenvectors of a replicate may generate the same subspace without pointing in the same directions, i.e., the subspace is referred to a different system of axes. That may also produce an unwanted increase of the variability that invalidates the results. To avoid this, every replicate may be rotated to match as closely as possible the subspace generated by the eigenvectors of the initial matrix. This is done by Procrustes Analysis, taking the rotated matrix as the solution. The solution to this problem is also a solution to the reflection problem, so only the rotation is considered.
Value
Returns an object of class "PCoABootstrap" with the information for each bootstrap replication.
Info |
Information about the procedure |
InitialDistance |
Initial distance |
RawStress |
A vector containing the raw stress for all the bootstrap replicates |
stress1 |
A vector containing the value of the stress1 formula for all the bootstrap replicates |
stress2 |
A vector containing the value of the stress2 formula for all the bootstrap replicates |
sstress1 |
A vector containing the value of the sstress1 formula for all the bootstrap replicates |
sstress2 |
A vector containing the value of the sstress2 formula for all the bootstrap replicates |
Coordinates |
A list with a component for each object. A component contains the coordinates of an object for all the bootstrap replicates (in columns) |
NReplicates |
Number of bootstrap replicates |
Author(s)
Jose L. Vicente-Villardon
References
Efron, B.; Tibshirani, RJ. (1993). An introduction to the bootstrap. New York: Chapman and Hall. 436p.
Ringrose, T. J. (1992). Bootstrapping and correspondence analysis in archaeology. Journal of Archaeological Science, 19(6), 615-629.
Milan, L., & Whittaker, J. (1995). Application of the parametric bootstrap to models that incorporate a singular value decomposition. Applied statistics, 44(1), 31-49.
Jacoby, W. G., & Armstrong, D. A. (2014). Bootstrap Confidence Regions for Multidimensional Scaling Solutions. American Journal of Political Science, 58(1), 264-278.
Examples
data(spiders)
D=BinaryProximities(spiders, coefficient="Jaccard", transformation="sqrt(1-S)")
DB=BootstrapDistance(D$Proximities)
Panel of box plots
Description
Panel of box plots for a set of numerical variables and a grouping factor.
Usage
BoxPlotPanel(X, groups = NULL, nrows = NULL, panel = TRUE,
notch = FALSE, GroupsTogether = TRUE, ...)
Arguments
X |
The matrix of continuous variables |
groups |
The grouping factor |
nrows |
Number of rows of the panel. |
panel |
Should the plots be organized into a panel? (or separated) |
notch |
Should notches be used in the box plots? |
GroupsTogether |
Should all the groups be together in the same plot? |
... |
Other graphical arguments |
Details
Panel of box plots for a set of numerical variables and a grouping factor.
Value
The box plot panel
Author(s)
Jose Luis Vicente Villardon
Examples
data(wine)
BoxPlotPanel(wine[,4:7], groups = wine$Origin, nrows = 2, ylab="")
Correspondence Analysis
Description
Correspondence Analysis for a frequency or abundance data matrix.
Usage
CA(x, dim = 2, alpha = 1)
Arguments
x |
The frequency or abundance data matrix. |
dim |
Dimension of the final solution |
alpha |
Alpha to determine the kind of biplot to use. |
Details
Calculates Correspondence Analysis for a two-way frequency or abundance table.
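For readers who want to see what the calculation involves, the following base R sketch carries out a textbook Correspondence Analysis through the singular value decomposition of the standardized residuals. The table `N` contains made-up counts, and the scaling shown (rows in principal coordinates, columns in standard coordinates) is one common choice that is assumed here; how it maps to the alpha argument of CA is not stated in this help page.

```r
# Small two-way table of counts (made-up numbers, for illustration only)
N <- matrix(c(20,  5,  2,
               8, 15,  6,
               3,  7, 18,
               1,  4, 12), nrow = 4, byrow = TRUE)

P <- N / sum(N)                 # correspondence matrix
r <- rowSums(P)                 # row masses
cm <- colSums(P)                # column masses

# Standardized residuals: Dr^(-1/2) (P - r c') Dc^(-1/2)
S <- diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm))
sv <- svd(S)

k <- 2
# Principal coordinates of the rows and standard coordinates of the columns
row_coords <- diag(1 / sqrt(r)) %*% sv$u[, 1:k] %*% diag(sv$d[1:k])
col_coords <- diag(1 / sqrt(cm)) %*% sv$v[, 1:k]

# Total inertia equals the chi-squared statistic divided by the grand total
inertia <- sum(sv$d^2)
```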
Value
Correspondence analysis solution
Author(s)
Jose Luis Vicente Villardon
References
Benzécri, J. P. (1992). Correspondence analysis handbook. New York: Marcel Dekker.
Greenacre, M. J. (1984). Theory and applications of correspondence analysis. Academic Press.
Examples
data(SpidersSp)
cabip=CA(SpidersSp)
plot(cabip)
Canonical Correspondence Analysis
Description
Calculates the solution of a Canonical Correspondence Analysis Biplot
Usage
CCA(P, Z, alpha = 1, dimens = 4)
Arguments
P |
Abundance Matrix of sites by species. |
Z |
Environmental variables measured at the same sites |
alpha |
Alpha for the biplot decomposition [0,1]. With alpha=1 the emphasis is on the sites and with alpha=0 the emphasis is on the species |
dimens |
Dimension of the solution |
Details
A pair of ecological tables, made of a species abundance matrix and an environmental variables matrix measured at the same sampling sites, is usually analyzed by Canonical Correspondence Analysis (CCA) (Ter Braak, 1986). CCA can be considered as a Correspondence Analysis (CA) in which the ordination axes are constrained to be linear combinations of the environmental variables. Recently the procedure has been extended to other disciplines such as Sociology or Psychology, and it is potentially useful in many other fields.
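The constraint on the ordination axes can be sketched as a weighted projection followed by a singular value decomposition. The example below uses made-up abundances and environmental values, and follows the usual textbook formulation (CA residuals projected on the space spanned by the weighted, centered environmental variables); it is an assumption-laden illustration, not the code of the CCA function.

```r
# Made-up abundance table (5 sites x 3 species) and 2 environmental variables
N <- matrix(c(10,  2,  0,
               8,  4,  1,
               3,  9,  2,
               1,  6,  7,
               0,  3, 12), nrow = 5, byrow = TRUE)
Z <- cbind(moisture = c(1, 2, 3, 4, 5),
           pH       = c(6.5, 6.0, 6.8, 5.5, 5.9))

P <- N / sum(N)
r <- rowSums(P)                 # site masses
cm <- colSums(P)                # species masses
S <- diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm))

# Center Z using the site masses, then weight its rows by sqrt(r)
Zc <- sweep(Z, 2, colSums(r * Z))
Zw <- sqrt(r) * Zc

# Project the CA residuals on the column space of the weighted Z
Q <- qr.Q(qr(Zw))
S_constrained <- Q %*% crossprod(Q, S)

# The constrained axes come from the SVD of the projected residuals
sv <- svd(S_constrained)
total_inertia <- sum(S^2)
constrained_inertia <- sum(sv$d^2)
```

The ratio of `constrained_inertia` to `total_inertia` gives the share of the CA inertia that can be expressed through linear combinations of the environmental variables.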
Value
A CCA solution object
Author(s)
Jose Luis Vicente Villardon
References
Ter Braak, C. J. (1986). Canonical correspondence analysis: a new eigenvector technique for multivariate direct gradient analysis. Ecology, 67(5), 1167-1179.
Johnson, K. W., & Altman, N. S. (1999). Canonical correspondence analysis as an approximation to Gaussian ordination. Environmetrics, 10(1), 39-52.
Graffelman, J. (2001). Quality statistics in canonical correspondence analysis. Environmetrics, 12(5), 485-497.
Graffelman, J., & Tuft, R. (2004). Site scores and conditional biplots in canonical correspondence analysis. Environmetrics, 15(1), 67-80.
Greenacre, M. (2010). Canonical correspondence analysis in social science research (pp. 279-286). Springer Berlin Heidelberg.
Examples
data(riano)
Sp=riano[,3:15]
Env=riano[,16:25]
ccabip=CCA(Sp, Env)
plot(ccabip)
Biplot representation of a Canonical Variate Analysis or a Manova (Canonical-Biplot or MANOVA-Biplot)
Description
Calculates a canonical biplot with confidence regions for the means.
Usage
Canonical.Variate.Analysis(X, group, InitialTransform = 5)
Arguments
X |
A data matrix |
group |
A factor containing the groups |
InitialTransform |
Initial transformation of the data matrix |
Details
The Biplot method (Gabriel, 1971; Galindo, 1986; Gower and Hand, 1996) is becoming one of the most popular techniques for analysing multivariate data. Biplot methods are techniques for simultaneous representation of the n rows and p columns of a data matrix X, in reduced dimensions, where the rows represent individuals, objects or samples and the columns the variables measured on them. Classical Biplot methods are a graphical representation of a Principal Components Analysis (PCA), which is used to obtain linear combinations that successively maximize the total variability.
PCA is not considered an appropriate approach when there is a known a priori group structure in the data. The most general methodology for discrimination among groups, using multiple observed variables, is Canonical Variate Analysis (CVA). CVA allows us to derive linear combinations that successively maximize the ratio of "between-groups" to "pooled within-group" sample variance. Several authors propose a Biplot representation for CVA, called Canonical Biplot (CB) (Vicente-Villardon, 1992; Gower & Hand, 1996) when it is oriented to the discrimination between groups, or MANOVA-Biplot (Gabriel, 1972, 1995) when the aim is to study the variables responsible for the discrimination. The main advantage of the Biplot version of the technique is that it is possible not only to establish the differences between groups but also to characterise the variables responsible for them. The methodology is not yet widely used, mainly because it is still not available in the major statistical packages.
Amaro, Vicente-Villardon & Galindo (2004) extend the methodology for two-way designs and propose confidence circles based on univariate and multivariate tests to perform post-hoc analysis of each variable.
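The maximization of the ratio of between-groups to within-groups variability can be sketched in base R as an eigenproblem. The example uses the iris data and a generic formulation (eigenvectors of the within-groups matrix inverted times the between-groups matrix); it ignores the initial transformation and the confidence regions, and is not the implementation used by the package.

```r
X <- as.matrix(iris[, 1:4])
group <- iris$Species
n <- nrow(X)
g <- nlevels(group)

# Within-groups (W) and between-groups (B) sums of squares and products
fit <- lm(X ~ group)
W <- crossprod(residuals(fit))
B <- crossprod(fitted(fit) - matrix(colMeans(X), n, ncol(X), byrow = TRUE))

# Canonical directions: eigenvectors of W^{-1} B
e <- eigen(solve(W) %*% B)
k <- min(g - 1, ncol(X))
V <- Re(e$vectors[, 1:k, drop = FALSE])
lambda <- Re(e$values[1:k])

# Canonical variates (scores of the individuals on the canonical axes)
scores <- scale(X, scale = FALSE) %*% V
```

Each value in `lambda` is the ratio of between-groups to within-groups sum of squares along the corresponding direction, so the first canonical axis is the direction that separates the group means best relative to the within-group spread.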
Value
An object of class "Canonical.Biplot"
Author(s)
Jose Luis Vicente Villardon
References
Amaro, I. R., Vicente-Villardon, J. L., & Galindo-Villardon, M. P. (2004). Manova Biplot para arreglos de tratamientos con dos factores basado en modelos lineales generales multivariantes. Interciencia, 29(1), 26-32.
Vicente-Villardón, J. L. (1992). Una alternativa a las técnicas factoriales clásicas basada en una generalización de los métodos Biplot (Doctoral dissertation, Tesis. Universidad de Salamanca. España. 248 pp.).
Gabriel KR (1971) The biplot graphic display of matrices with application to principal component analysis. Biometrika 58(3):453-467.
Gabriel, K. R. (1995). MANOVA biplots for two-way contingency tables. WJ Krzanowski (Ed.), Recent advances in descriptive multivariate analysis, Oxford University Press, Toronto. 227-268.
Galindo Villardon, M. (1986). Una alternativa de representacion simultanea: HJ-Biplot. Qüestiió. 1986, vol. 10, núm. 1.
Gower, J. C., & Hand, D. J. (1996). Biplots. Chapman & Hall.
Varas, M. J., Vicente-Tavera, S., Molina, E., & Vicente-Villardon, J. L. (2005). Role of canonical biplot method in the study of building stones: an example from Spanish monumental heritage. Environmetrics, 16(4), 405-419.
Santana, M. A., Romay, G., Matehus, J., Villardon, J. L., & Demey, J. R. (2009). Simple and low-cost strategy for micropropagation of cassava (Manihot esculenta Crantz). African Journal of Biotechnology, 8(16).
Examples
data(wine)
X=wine[,4:21]
canbip=CanonicalBiplot(X, group=wine$Group)
plot(canbip, mode="s")
Biplot representation of a Canonical Variate Analysis or a Manova (Canonical-Biplot or MANOVA-Biplot)
Description
Calculates a canonical biplot with confidence regions for the means.
Usage
CanonicalBiplot(X, group, SUP = NULL, InitialTransform = 5, LDA=FALSE, MANOVA = FALSE)
Arguments
X |
A data matrix |
group |
A factor containing the groups |
SUP |
Supplementary observations to project on the biplot |
InitialTransform |
Initial transformation of the data matrix |
LDA |
A logical to indicate if the discriminant analysis should also be included |
MANOVA |
A logical to indicate if MANOVA should also be included |
Details
The Biplot method (Gabriel, 1971; Galindo, 1986; Gower and Hand, 1996) is becoming one of the most popular techniques for analysing multivariate data. Biplot methods are techniques for simultaneous representation of the n rows and p columns of a data matrix X, in reduced dimensions, where the rows represent individuals, objects or samples and the columns the variables measured on them. Classical Biplot methods are a graphical representation of a Principal Components Analysis (PCA), which is used to obtain linear combinations that successively maximize the total variability.
PCA is not considered an appropriate approach when there is a known a priori group structure in the data. The most general methodology for discrimination among groups, using multiple observed variables, is Canonical Variate Analysis (CVA). CVA allows us to derive linear combinations that successively maximize the ratio of "between-groups" to "pooled within-group" sample variance. Several authors propose a Biplot representation for CVA, called Canonical Biplot (CB) (Vicente-Villardon, 1992; Gower & Hand, 1996) when it is oriented to the discrimination between groups, or MANOVA-Biplot (Gabriel, 1972, 1995) when the aim is to study the variables responsible for the discrimination. The main advantage of the Biplot version of the technique is that it is possible not only to establish the differences between groups but also to characterise the variables responsible for them. The methodology is not yet widely used, mainly because it is still not available in the major statistical packages.
Amaro, Vicente-Villardon & Galindo (2004) extend the methodology for two-way designs and propose confidence circles based on univariate and multivariate tests to perform post-hoc analysis of each variable.
Value
An object of class "Canonical.Biplot"
Author(s)
Jose Luis Vicente Villardon
References
Amaro, I. R., Vicente-Villardon, J. L., & Galindo-Villardon, M. P. (2004). Manova Biplot para arreglos de tratamientos con dos factores basado en modelos lineales generales multivariantes. Interciencia, 29(1), 26-32.
Vicente-Villardón, J. L. (1992). Una alternativa a las técnicas factoriales clásicas basada en una generalización de los métodos Biplot (Doctoral dissertation, Tesis. Universidad de Salamanca. España. 248 pp.).
Gabriel KR (1971) The biplot graphic display of matrices with application to principal component analysis. Biometrika 58(3):453-467.
Gabriel, K. R. (1995). MANOVA biplots for two-way contingency tables. WJ Krzanowski (Ed.), Recent advances in descriptive multivariate analysis, Oxford University Press, Toronto. 227-268.
Galindo Villardon, M. (1986). Una alternativa de representacion simultanea: HJ-Biplot. Qüestiió. 1986, vol. 10, núm. 1.
Gower, J. C., & Hand, D. J. (1996). Biplots. Chapman & Hall.
Varas, M. J., Vicente-Tavera, S., Molina, E., & Vicente-Villardon, J. L. (2005). Role of canonical biplot method in the study of building stones: an example from Spanish monumental heritage. Environmetrics, 16(4), 405-419.
Santana, M. A., Romay, G., Matehus, J., Villardon, J. L., & Demey, J. R. (2009). Simple and low-cost strategy for micropropagation of cassava (Manihot esculenta Crantz). African Journal of Biotechnology, 8(16).
Examples
data(wine)
X=wine[,4:21]
canbip=CanonicalBiplot(X, group=wine$Group)
plot(canbip, mode="s")
MANOVA and Canonical Analysis of Distances
Description
Performs a MANOVA and a Canonical Analysis based on distance matrices (usually for continuous data).
Usage
CanonicalDistanceAnalysis(Prox, group, dimens = 2, Nsamples = 1000,
PCoA = "Standard", ProjectInd = TRUE)
Arguments
Prox |
An object containing proximities |
group |
A factor with the group structure of the rows |
dimens |
The dimension of the solution |
Nsamples |
Number of samples for the permutation test. Number of permutations. |
PCoA |
Type of Principal Coordinates for the Canonical Analysis, calculated from the principal coordinates of the mean matrix. "Standard": Standard Principal Coordinates Analysis; "Weighted": Weighted Principal Coordinates Analysis; "WPCA": weighted Principal Components Analysis of the matrix of means. |
ProjectInd |
Should the individual points be projected onto the representation? For the moment only available for continuous data. |
Details
Performs a MANOVA and a Canonical Analysis based on distance matrices (usually for continuous data). The MANOVA statistic is calculated from a decomposition of the distance matrix based on a factor from an external classification. The significance of the test is calculated using a permutation test. The approach depends only on the distances and is therefore useful with any kind of data.
The Canonical Representation is calculated from a Principal Coordinates Analysis of the distance matrix among the means. Thus, it is only possible for continuous data. The PCoA representation can be "Standard", using the means directly; "Weighted", weighting each group by its sample size; or based on a weighted Principal Components Analysis of the matrix of means.
A measure of the quality of representation of the groups is provided. When possible, the measure is also provided for the individual points.
Soon, a biplot representation will also be developed.
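The decomposition of the distances and the permutation test can be sketched in base R as follows. The helpers `ss_from_dist` and `pseudo_F` are hypothetical names introduced for this illustration, squared Euclidean distances on the iris data are assumed, and the exact statistic computed by CanonicalDistanceAnalysis may differ in detail.

```r
# Sum of squares computed from squared distances only (hypothetical helper)
ss_from_dist <- function(D2) sum(D2[upper.tri(D2)]) / nrow(D2)

# Total, between and within sums of squares and the pseudo F statistic
pseudo_F <- function(D2, group) {
  n <- nrow(D2)
  g <- nlevels(group)
  TSS <- ss_from_dist(D2)
  WSS <- sum(sapply(levels(group), function(l) {
    idx <- which(group == l)
    ss_from_dist(D2[idx, idx, drop = FALSE])
  }))
  BSS <- TSS - WSS
  c(TSS = TSS, BSS = BSS, WSS = WSS,
    Fexp = (BSS / (g - 1)) / (WSS / (n - g)))
}

X <- as.matrix(iris[, 1:4])
group <- iris$Species
D2 <- as.matrix(dist(X))^2          # squared Euclidean distances

obs <- pseudo_F(D2, group)

# Permutation test: shuffle the group labels and recompute the statistic
set.seed(123)
Fperm <- replicate(199, pseudo_F(D2, sample(group))["Fexp"])
pvalue <- (sum(Fperm >= obs["Fexp"]) + 1) / (length(Fperm) + 1)
```

Only the distance matrix and the grouping factor enter the computation, which is why the test can be used with any kind of data for which a distance is available.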
Value
An object of class "CanonicalDistanceAnalysis" with:
Distances |
The Matrix of Distances from which the Analysis has been made |
Groups |
A factor containing the group structure of the individuals |
TSS |
Total sum of squares |
BSS |
Between groups sum of squares |
WSS |
Within groups sum of squares |
Fexp |
Experimental pseudo F-value |
pvalue |
p value based on the permutation test |
Nsamples |
Number of permutations used in the test |
ExplainedVariance |
Variances explained by the PCoA |
MeanCoordinates |
Coordinates of the groups for the graphical representation |
Qualities |
Qualities of the representation of the groups |
CummulativeQualities |
Cumulative qualities of the representation of the groups |
RowCoordinates |
Coordinates of the individuals for the graphical representation |
Note
The MANOVA and the representation of the means can be applied to any distance, although the projection of the individuals can be made only for continuous data.
Author(s)
Jose Luis Vicente Villardon
References
Gower, J. C., & Krzanowski, W. J. (1999). Analysis of distance for structured multivariate data and extensions to multivariate analysis of variance. Journal of the Royal Statistical Society: Series C (Applied Statistics), 48(4), 505-519.
Krzanowski, W. J. (2004). Biplots for multifactorial analysis of distance. Biometrics, 60(2), 517-524.
Examples
data(iris)
group=iris[,5]
X=as.matrix(iris[1:4])
D=ContinuousProximities(X, coef = 1)
CDA=CanonicalDistanceAnalysis(D, group, dimens=2)
summary(CDA)
plot(CDA)
CANONICAL STATIS-ACT for multiple tables with common rows and its associated Biplot
Description
The procedure performs STATIS-ACT methodology for multiple tables with common rows and its associated biplot
Usage
CanonicalStatisBiplot(X, Groups, InitTransform = "Standardize columns", dimens = 2,
SameVar = FALSE)
Arguments
X |
A list containing multiple tables with common rows |
Groups |
A factor containing the groups |
InitTransform |
Initial transformation of the data matrices |
dimens |
Dimension of the final solution |
SameVar |
Are the variables the same for all occasions? |
Details
The procedure performs Canonical STATIS-ACT methodology for multiple tables with common rows and its associated biplot. When the variables are the same for all occasions trajectories for the variables can also be plotted.
Value
An object of class StatisBiplot
Author(s)
Jose Luis Vicente Villardon
References
Vallejo-Arboleda, A., Vicente-Villardon, J. L., & Galindo-Villardon, M. P. (2007). Canonical STATIS: Biplot analysis of multi-table group structured data based on STATIS-ACT methodology. Computational statistics & data analysis, 51(9), 4193-4205.
Abdi, H., Williams, L.J., Valentin, D., & Bennani-Dosse, M. (2012). STATIS and DISTATIS: optimum multitable principal component analysis and three way metric multidimensional scaling. WIREs Comput Stat, 4, 124-167.
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman and Hall. 436p.
Escoufier, Y. (1976). Operateur associe a un tableau de donnees. Annales de l'Insee, 22-23, 165-178.
Escoufier, Y. (1987). The duality diagram: a means for better practical applications. In P. Legendre & L. Legendre (Eds.), Developments in Numerical Ecology, pp. 139-156, NATO Advanced Institute, Serie G. Berlin: Springer.
L'Hermier des Plantes, H. (1976). Structuration des Tableaux a Trois Indices de la Statistique. [These de Troisieme Cycle]. University of Montpellier, France.
Ringrose, T.J. (1992). Bootstrapping and Correspondence Analysis in Archaeology. Journal of Archaeological Science. 19:615-629.
Examples
data(Chemical)
x= Chemical[37:144,5:9]
weeks=as.factor(as.numeric(Chemical$WEEKS[37:144]))
levels(weeks)=c("W2" , "W3", "W4")
X=Convert2ThreeWay(x,weeks, columns=FALSE)
Groups=Chemical$Treatment[1:36]
canstbip=CanonicalStatisBiplot(X, Groups, SameVar = TRUE)
plot(canstbip, mode="s", PlotVars=TRUE, ShowBox=TRUE)
Distances among individuals using nominal variables.
Description
Distances among individuals using nominal variables.
Usage
CategoricalDistances(x, y = NULL, coefficient = "GOW", transformation = "sqrt(1-S)")
Arguments
x |
Matrix of Categorical Data |
y |
A second matrix of categorical data with the same variables as x |
coefficient |
Similarity coefficient to use (see details) |
transformation |
Transformation of the similarity into a distance |
Details
The function calculates similarities and dissimilarities among a set of objects characterized by a set of nominal variables. The function computes similarities and converts them into dissimilarities using a variety of transformations controlled by the user.
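As an illustration of the similarity-to-distance step, the following base R sketch computes a simple matching similarity (the proportion of variables on which two rows agree) and applies the "sqrt(1-S)" transformation named in the Usage. The helper simple_matching is hypothetical, and the assumption that simple matching is representative of the coefficients offered by the package is ours.

```r
# Hypothetical helper: proportion of matching categories between rows
simple_matching <- function(x) {
  x <- as.matrix(x)            # categories are compared as character strings
  n <- nrow(x)
  S <- matrix(1, n, n)
  for (i in seq_len(n)) for (j in seq_len(n)) {
    S[i, j] <- mean(x[i, ] == x[j, ])
  }
  S
}

x <- data.frame(a = c("r", "r", "b"), b = c("u", "v", "v"))
S <- simple_matching(x)
D <- sqrt(1 - S)               # similarity converted into a distance
D
```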
Value
A matrix with distances among the rows of x and y. If y is NULL the interdistances among the rows of x are calculated.
Author(s)
Jose Luis Vicente Villardon
References
dos Santos, T. R., & Zarate, L. E. (2015). Categorical data clustering: What similarity measure to recommend?. Expert Systems with Applications, 42(3), 1247-1260.
Boriah, S., Chandola, V., & Kumar, V. (2008). Similarity measures for categorical data: A comparative evaluation. red, 30(2), 3.
Examples
##---- Should be DIRECTLY executable !! ----
Proximities among individuals using nominal variables.
Description
Proximities among individuals using nominal variables.
Usage
CategoricalProximities(Data, SUP = NULL, coefficient = "GOW", transformation = 3, ...)
Arguments
Data |
A data frame containing categorical (nominal) variables |
SUP |
Supplementary data (Used to project supplementary individuals onto the PCoA configuration, for example) |
coefficient |
Similarity coefficient to use (see details) |
transformation |
Transformation of the similarity into a distance |
... |
Extra parameters |
Details
The function calculates similarities and dissimilarities among a set of objects characterized by a set of nominal variables. The function computes similarities and converts them into dissimilarities using a variety of transformations controlled by the user.
Value
A list of Values
Author(s)
Jose Luis Vicente Villardon
References
dos Santos, T. R., & Zarate, L. E. (2015). Categorical data clustering: What similarity measure to recommend?. Expert Systems with Applications, 42(3), 1247-1260.
Boriah, S., Chandola, V., & Kumar, V. (2008). Similarity measures for categorical data: A comparative evaluation. red, 30(2), 3.
Examples
data(Doctors)
Dis=CategoricalProximities(Doctors, SUP=NULL, coefficient="GOW" , transformation=3)
pco=PrincipalCoordinates(Dis)
plot(pco, RowCex=0.7, RowColors=as.integer(Doctors[[1]]), RowLabels=as.character(Doctors[[1]]))
Checks if a data matrix is binary
Description
Checks if a data matrix is binary
Usage
CheckBinaryMatrix(x)
Arguments
x |
Matrix to check. |
Details
Checks if all the entries of the matrix are either 0 or 1.
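A check of this kind can be written in one line of base R. The sketch below is an assumption about the behaviour, not the package code; in particular, treating missing values as "not binary" is our choice, and is_binary is a hypothetical name.

```r
# Hypothetical helper: TRUE only when every entry is 0 or 1 (NA counts as failure)
is_binary <- function(x) {
  x <- as.matrix(x)
  !anyNA(x) && all(x == 0 | x == 1)
}

is_binary(matrix(c(0, 1, 1, 0), 2, 2))
is_binary(c(0, 0, 1, 2))
```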
Value
TRUE
if the matrix is binary.
Author(s)
Jose Luis Vicente-Villardon
Examples
data(spiders)
sp=Dataframe2BinaryMatrix(spiders)
CheckBinaryMatrix(sp)
Checks if a vector is binary
Description
Checks if all the entries of a vector are 0 or 1
Usage
CheckBinaryVector(x)
Arguments
x |
The vector to check |
Value
The logical result
Author(s)
Jose Luis Vicente Villardon
Examples
x=c(0, 0, 0, 0, 1, 1, 1, 2)
CheckBinaryVector(x)
Chemical data
Description
Ecological data
Usage
data("Chemical")
Format
A data frame with 324 observations on the following 16 variables.
Treatment
a factor with levels
F0N0
F0N1
F0N2
F0N3
F1N0
F1N1
F1N2
F1N3
F2N0
F2N1
F2N2
F2N3
FISH
a factor with levels
F0
F1
F2
NUTRIENTS
a factor with levels
N0
N1
N2
N3
WEEKS
a factor with levels
W1
W2
W3
W4
W5
W6
W7
W8
W9
TEMPERATURE
a numeric vector
pH
a numeric vector
ALKALINITYmeql
a numeric vector
CO2free
a numeric vector
NNH4mgl
a numeric vector
NNO3mgl
a numeric vector
SRPmglP
a numeric vector
TPmgl
a numeric vector
TSSmgl
a numeric vector
CONDUCTIVITYmScm
a numeric vector
TSPmglP
a numeric vector
Chlorophyllamgl
a numeric vector
Details
Chemical Data
Source
Department of Ecology. University of Leon. (Spain)
References
To add
Examples
data(Chemical)
## maybe str(Chemical) ; plot(Chemical) ...
Draws a circle
Description
Draws a circle for a given radius at the specified center with the given color
Usage
Circle(radius = 1, origin = c(0, 0), col = 1, ...)
Arguments
radius |
radius of the circle |
origin |
Centre of the circle |
col |
Color of the circle |
... |
Additional graphical parameters |
Details
Draws a circle for a given radius at the specified center with the given color
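For readers who want the geometry, a circle can be drawn by evaluating its parametric form on a grid of angles. The sketch below only computes the contour points; circle_points and its npoints argument are hypothetical and not part of the package.

```r
# Hypothetical helper: points on a circle, ready to be passed to lines()
circle_points <- function(radius = 1, origin = c(0, 0), npoints = 200) {
  theta <- seq(0, 2 * pi, length.out = npoints)
  cbind(x = origin[1] + radius * cos(theta),
        y = origin[2] + radius * sin(theta))
}

pts <- circle_points(2, c(1, -1))
# plot(pts, type = "l", asp = 1) would draw it with equal axis scales
```

Using asp = 1 matters: without equal scales on both axes the circle is displayed as an ellipse.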
Value
No value is returned
Author(s)
Jose Luis Vicente Villardon
Examples
plot(0,0)
Circle(1,c(0,0))
Coinertia Analysis.
Description
Calculates a Coinertia Analysis for two matrices of continuous data
Usage
Coinertia(X, Y, ScalingX = 5, ScalingY = 5, dimsol = 3)
Arguments
X |
The first matrix in the analysis |
Y |
The second matrix in the analysis |
ScalingX |
Transformation of the X matrix |
ScalingY |
Transformation of the Y matrix |
dimsol |
Dimension of the solution |
Details
Coinertia analysis for two continuous data matrices.
Value
An object of class Coinertia.SOL
Author(s)
Jose Luis Vicente Villardon
References
Doledec, S., & Chessel, D. (1994). Co-inertia analysis: an alternative method for studying species-environment relationships. Freshwater biology, 31(3), 277-294.
Examples
SSI$Year == "a2006"
SSI2D=SSI[SSI$Year == "a2006",3:23]
rownames(SSI2D)=as.character(SSI$Country[SSI$Year == "a2006"])
SSIHuman2D=SSI2D[,1:9]
SSIEnvir2D=SSI2D[,10:16]
SSIEcon2D=SSI2D[,17:21]
Coin=Coinertia(SSIHuman2D, SSIEnvir2D)
Plots the contributions of a biplot
Description
Plots the contributions of a biplot
Usage
ColContributionPlot(bip, A1 = 1, A2 = 2, Colors = NULL, Labs = NULL,
MinQuality = 0, CorrelationScale = FALSE, ContributionScale = TRUE,
AddSigns2Labs = TRUE, ...)
Arguments
bip |
An object of class ContinuousBiplot |
A1 |
First dimension to plot |
A2 |
Second dimension to plot |
Colors |
Colors for the variables |
Labs |
Labels for the variables |
MinQuality |
Min quality to plot |
CorrelationScale |
Scales for correlation |
ContributionScale |
Scales for contributions |
AddSigns2Labs |
Add the signs of the correlations to the labels |
... |
Any other graphical parameter |
Details
Plots the contributions on a plot that also shows the sum for both axes.
Value
The contribution plot
Author(s)
Jose Luis Vicente Villardon
Examples
## Simple Biplot with arrows
data(Protein)
bip=PCA.Biplot(Protein[,3:11])
# Plot of the Variable Contributions
ColContributionPlot(bip, cex=1)
Concentration ellipse for a set of two-dimensional points
Description
The function calculates a non-parametric concentration ellipse for a set of two-dimensional points.
Usage
ConcEllipse(data, confidence=1, npoints=100)
Arguments
data |
The set of two-dimensional points |
confidence |
Percentage of points to be included in the ellipse |
npoints |
Number of points used to draw the ellipse contour. The higher the number of points, the smoother the ellipse. |
Details
The procedure uses the Mahalanobis distances to determine the points that will be used for the calculations.
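One way to build such an ellipse is sketched below in base R. It is an illustration under our own assumptions, not the package's algorithm: the points with the smallest Mahalanobis distances are kept, and the ellipse is the covariance contour that just encloses them. The name conc_ellipse is hypothetical.

```r
# Hypothetical helper: ellipse enclosing the `confidence` fraction of points
conc_ellipse <- function(data, confidence = 1, npoints = 100) {
  data <- as.matrix(data)
  center <- colMeans(data)
  d2 <- mahalanobis(data, center, cov(data))
  # keep the fraction `confidence` of points closest to the center
  keep <- d2 <= quantile(d2, confidence)
  sub <- data[keep, , drop = FALSE]
  center <- colMeans(sub)
  S <- cov(sub)
  # radius: largest Mahalanobis distance among the retained points
  r <- sqrt(max(mahalanobis(sub, center, S)))
  theta <- seq(0, 2 * pi, length.out = npoints)
  unit <- cbind(cos(theta), sin(theta))
  # map the unit circle through the Cholesky factor of S, then shift to the center
  ellipse <- sweep(r * unit %*% chol(S), 2, center, "+")
  list(data = sub, confidence = confidence, ellipse = ellipse, center = center)
}

E <- conc_ellipse(iris[1:50, 1:2], 0.95)
# lines(E$ellipse) would add the contour to an existing scatter plot
```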
Value
A list with the following fields
data |
Data Used for the calculations |
confidence |
The confidence level used |
ellipse |
The points on the ellipse contour to be plotted |
center |
The center of the points |
Author(s)
Jose Luis Vicente Villardon
References
Meulman, J. J., & Heiser, W. J. (1983). The display of bootstrap solutions in multidimensional scaling. Murray Hill, NJ: Bell Laboratories.
Linting, M., Meulman, J. J., Groenen, P. J., & Van der Kooij, A. J. (2007). Stability of nonlinear principal components analysis: An empirical study using the balanced bootstrap. Psychological Methods, 12(3), 359.
Examples
data(iris)
dat=as.matrix(iris[1:50,1:2])
plot(iris[,1], iris[,2],col=iris[,5], asp=1)
E=ConcEllipse(dat, 0.95)
plot(E)
Confidence Interval for the mean
Description
Calculates Confidence Interval for the mean of a Numerical Variable.
Usage
ConfidenceInterval(x, Desv = NULL, df = NULL, Confidence = 0.95)
Arguments
x |
The numerical variable |
Desv |
Standard deviation. If NULL, the sd is calculated from the data |
df |
Degrees of freedom |
Confidence |
Confidence Level |
Details
Calculates Confidence Interval for the mean of a Numerical Variable.
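Since no example is provided yet, here is a minimal base R sketch of a t-based interval for a mean. It assumes the usual formula (mean plus or minus a t quantile times the standard error) and reuses the argument names of the Usage; mean_ci is a hypothetical name and the package may handle Desv and df differently.

```r
# Hypothetical helper: t-based confidence interval for the mean
mean_ci <- function(x, Desv = NULL, df = NULL, Confidence = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  if (is.null(Desv)) Desv <- sd(x)      # standard deviation from the data
  if (is.null(df)) df <- n - 1          # default degrees of freedom
  half <- qt(1 - (1 - Confidence) / 2, df) * Desv / sqrt(n)
  c(lower = mean(x) - half, upper = mean(x) + half)
}

ci <- mean_ci(iris$Sepal.Length)
ci
```

Supplying Desv and df explicitly is useful when a pooled estimate, for example the residual standard deviation of an ANOVA, should replace the sample standard deviation.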
Value
The confidence Interval for the mean
Author(s)
Jose Luis Vicente Villardon
Examples
# Not yet
Constrained Binary Logistic Biplot
Description
Constrained Binary Logistic Biplot or Redundancy Analysis for Binary Data based on logistic responses
Usage
ConstrainedLogisticBiplot(Y, X, dim = 2, Scaling = 5, tolerance = 1e-05,
maxiter = 100, penalization = 0.1)
Arguments
Y |
A binary data matrix |
X |
A matrix of predictors |
dim |
Dimension of the Solution |
Scaling |
Transformation of the columns of the predictor matrix. |
tolerance |
Tolerance for the algorithm |
maxiter |
Maximum number of iterations. |
penalization |
Penalization for the fit (ridge) |
Details
Constrained Binary Logistic Biplot or Redundancy Analysis for Binary Data based on logistic responses.
Value
A logistic biplot with the response and the predictive variables projected onto it.
Author(s)
Jose Luis Vicente-Villardon
References
Vicente-Villardon, J. L., & Vicente-Gonzalez, L. Redundancy Analysis for Binary Data Based on Logistic Responses in Data Analysis and Rationality in a Complex World. Springer.
Examples
# not yet
Constrained Ordinal Logistic Biplot
Description
Constrained Ordinal Logistic Biplot or Redundancy Analysis for Ordinal Data based on logistic responses
Usage
ConstrainedOrdinalLogisticBiplot(Y, X, dim = 2, Scaling = 5,
tolerance = 1e-05, maxiter = 100, penalization = 0.1, show = FALSE)
Arguments
Y |
An ordinal data matrix |
X |
A matrix of predictors |
dim |
Dimension of the Solution |
Scaling |
Transformation of the columns of the predictor matrix. |
tolerance |
Tolerance for the algorithm |
maxiter |
Maximum number of iterations. |
penalization |
Penalization for the fit (ridge) |
show |
Show each step of the fit |
Details
Constrained Ordinal Logistic Biplot or Redundancy Analysis for Ordinal Data based on logistic responses.
Value
An ordinal logistic biplot with the response and the predictive variables projected onto it.
Author(s)
Jose Luis Vicente-Villardon
References
Vicente-Villardon, J. L., & Vicente-Gonzalez, L. Redundancy Analysis for Binary Data Based on Logistic Responses in Data Analysis and Rationality in a Complex World. Springer.
Examples
# not yet
Distances for Continuous Data
Description
Calculates distances among rows of a continuous data matrix or among the rows of two continuous matrices.
Usage
ContinuousDistances(x, y = NULL, coef = "Pythagorean", r = 1)
Arguments
x |
Main data matrix. Distances among rows are calculated if y=NULL. |
y |
Supplementary data matrix. If not NULL the distances among the rows of x and y are calculated |
coef |
Distance coefficient. Use the name or the number (see details) |
r |
Exponent for the Minkowski distance |
Details
The following coefficients are calculated
1.- Pythagorean = sqrt(sum((y[i, ] - x[j, ])^2)/p)
2.- Taxonomic = sqrt(sum(((y[i,]-x[j,])^2)/r^2)/p)
3.- City = sum(abs(y[i,]-x[j,])/r)/p
4.- Minkowski = (sum((abs(y[i,]-x[j,])/r)^t)/p)^(1/t)
5.- Divergence = sqrt(sum((y[i,]-x[j,])^2/(y[i,]+x[j,])^2)/p)
6.- dif_sum = sum(abs(y[i,]-x[j,])/abs(y[i,]+x[j,]))/p
7.- Camberra = sum(abs(y[i,]-x[j,])/(abs(y[i,])+abs(x[j,])))
8.- Bray_Curtis = sum(abs(y[i,]-x[j,]))/sum(y[i,]+x[j,])
9.- Soergel = sum(abs(y[i,]-x[j,]))/sum(apply(rbind(y[i,],x[j,]),2,max))
10.- Ware_hedges = sum(abs(y[i,]-x[j,]))/sum(apply(rbind(y[i,],x[j,]),2,max))
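Two of the coefficients listed above can be evaluated directly from their formulas. The base R sketch below is illustrative only: cross_dist is a hypothetical helper, and placing the rows of y in the rows of the result is our assumption about the layout.

```r
# Hypothetical helper: distances between the rows of y (rows) and x (columns)
cross_dist <- function(x, y = x, coef = c("Pythagorean", "Bray_Curtis")) {
  coef <- match.arg(coef)
  x <- as.matrix(x); y <- as.matrix(y)
  p <- ncol(x)
  f <- switch(coef,
    Pythagorean = function(a, b) sqrt(sum((a - b)^2) / p),
    Bray_Curtis = function(a, b) sum(abs(a - b)) / sum(a + b))
  D <- matrix(0, nrow(y), nrow(x))
  for (i in seq_len(nrow(y))) for (j in seq_len(nrow(x)))
    D[i, j] <- f(y[i, ], x[j, ])
  D
}

x <- as.matrix(iris[1:5, 1:4])
D <- cross_dist(x)                         # distances among the rows of x
B <- cross_dist(x, coef = "Bray_Curtis")
```

Note that the Pythagorean coefficient as written divides by p inside the square root, so it equals the ordinary Euclidean distance divided by sqrt(p).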
Value
A list with:
Data |
A matrix with the initial data (x matrix). |
SupData |
A matrix with the supplementary data (y matrix). |
D |
The matrix of distances |
Coefficient |
The coefficient used. |
Author(s)
Jose Luis Vicente-Villardon
References
Gower, J. C. (2006) Similarity dissimilarity and Distance, measures of. Encyclopedia of Statistical Sciences. 2nd. ed. Volume 12. Wiley
Examples
data(wine)
dis=ContinuousDistances(wine[,4:21])
Proximities for Continuous Data
Description
Calculates proximities among rows of a continuous data matrix or among the rows of two continuous matrices.
Usage
ContinuousProximities(x, y = NULL, ysup = FALSE,
transpose = FALSE, coef = "Pythagorean", r = 1)
Arguments
x |
Main data matrix. Distances among rows are calculated if y=NULL. |
y |
Supplementary data matrix. If not NULL the distances among the rows of x and y are calculated |
ysup |
Supplementary Y data |
transpose |
Transpose rows and columns |
coef |
Distance coefficient. Use the name or the number (see details) |
r |
Exponent for the Minkowski distance |
Details
The following coefficients are calculated
1.- Pythagorean = sqrt(sum((y[i, ] - x[j, ])^2)/p)
2.- Taxonomic = sqrt(sum(((y[i,]-x[j,])^2)/r^2)/p)
3.- City = sum(abs(y[i,]-x[j,])/r)/p
4.- Minkowski = (sum((abs(y[i,]-x[j,])/r)^t)/p)^(1/t)
5.- Divergence = sqrt(sum((y[i,]-x[j,])^2/(y[i,]+x[j,])^2)/p)
6.- dif_sum = sum(abs(y[i,]-x[j,])/abs(y[i,]+x[j,]))/p
7.- Camberra = sum(abs(y[i,]-x[j,])/(abs(y[i,])+abs(x[j,])))
8.- Bray_Curtis = sum(abs(y[i,]-x[j,]))/sum(y[i,]+x[j,])
9.- Soergel = sum(abs(y[i,]-x[j,]))/sum(apply(rbind(y[i,],x[j,]),2,max))
10.- Ware_hedges = sum(abs(y[i,]-x[j,]))/sum(apply(rbind(y[i,],x[j,]),2,max))
Value
Data |
A matrix with the initial data (x matrix). |
SupData |
A matrix with the supplementary data (y matrix). |
D |
The matrix of distances |
Coefficient |
The coefficient used. |
Author(s)
Jose Luis Vicente-Villardon
References
Gower, J. C. (2006) Similarity dissimilarity and Distance, measures of. Encyclopedia of Statistical Sciences. 2nd. ed. Volume 12. Wiley
Examples
data(wine)
dis=ContinuousProximities(wine[,4:21])
Three way array from a two way matrix
Description
Converts a two-dimensional matrix into a list where each cell is the two dimensional data matrix for an occasion or group.
Usage
Convert2ThreeWay(x, groups, columns = FALSE, RowNames = NULL)
Arguments
x |
The two dimensional matrix |
groups |
A factor defining the groups |
columns |
Are the groups defined for columns? |
RowNames |
Names for the rows of each table. |
Details
Converts a two dimensional matrix into a multitable list according to the groups provided by the user. Each field of the list has the name of the corresponding group.
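The row-wise case (columns = FALSE) corresponds closely to what base R's split() does for data frames. The sketch below illustrates the idea and is not the package code.

```r
# Split the rows of a table by a factor; each list element is named after its group
x <- iris[, 1:4]
groups <- iris$Species
X <- lapply(split(x, groups), as.matrix)
names(X)
sapply(X, dim)
```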
Value
A multitable list. Each field is the data matrix for a group.
X |
The multitable list |
Author(s)
Jose Luis Vicente Villardon
Examples
data(Chemical)
x= Chemical[,5:16]
X=Convert2ThreeWay(x,Chemical$WEEKS, columns=FALSE)
Converts a three way array into a list
Description
Converts a three way array into a list
Usage
Convert3wArray2List(X)
Arguments
X |
A three way array |
Details
Converts a three way array into a list
Value
A list
Author(s)
Jose Luis Vicente-Villardon
Examples
#No examples yet
Convert a factor to integer numbers
Description
Convert a factor to integer numbers
Usage
ConvertFactors2Integers(x)
Arguments
x |
A vector with a factor |
Details
Convert a factor to integer numbers
Value
a vector with the converted values
Author(s)
Jose Luis Vicente Villardon
Examples
##---- Should be DIRECTLY executable !! ----
Converts a list of matrices into a three way array
Description
Converts a list of matrices into a three way array. All the matrices in the list must have the same size.
Usage
ConvertList23wArray(X)
Arguments
X |
A list with data matrices. |
Details
Converts a list of matrices into a three way array. All the matrices in the list must have the same size.
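In the absence of an example, the conversion and its inverse can be sketched in base R as follows. This shows the idea only; the object names are illustrative and the package functions may differ in how they handle dimension names.

```r
# Two matrices of the same size stored in a list
mats <- list(A = matrix(1:6, 2, 3), B = matrix(7:12, 2, 3))

# List -> three-way array: stack the matrices along a third dimension
arr <- array(unlist(mats), dim = c(dim(mats[[1]]), length(mats)))

# Three-way array -> list: one slice per position of the third dimension
back <- lapply(seq_len(dim(arr)[3]), function(k) arr[, , k])
names(back) <- names(mats)
```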
Value
A three-way array
Author(s)
Jose Luis Vicente-Villardon
Examples
# No examples yet
Circle of correlations
Description
Circle of correlations among the manifest variables and the principal components (or dimensions of any biplot).
Usage
CorrelationCircle(bip, A1 = 1, A2 = 2, Colors = NULL, Labs = NULL, ...)
Arguments
bip |
A biplot object of any kind. |
A1 |
First dimension for the representation |
A2 |
Second dimension for the representation |
Colors |
Colors of the variables |
Labs |
Labels of the variables |
... |
Any other graphical parameters |
Details
Circle of correlations among the manifest variables and the principal components (or dimensions of any biplot).
Value
The plot of the circle of correlations
Author(s)
Jose Luis Vicente Villardon
Examples
bip=PCA.Biplot(wine[,4:21])
CorrelationCircle(bip)
Alternated Least Squares Biplot
Description
Alternated Least Squares Biplot with any choice of weights for each element of the data matrix
Usage
CrissCross(x, w = matrix(1, dim(x)[1], dim(x)[2]), dimens = 2, a0 = NULL,
b0 = NULL, maxiter = 100, tol = 1e-04, addsvd = TRUE, lambda = 0)
Arguments
x |
Data Matrix to be analysed |
w |
Weights matrix. Must be of the same size as x. |
dimens |
Dimension of the solution. |
a0 |
Starting row coordinates. Random coordinates are calculated if the argument is NULL. |
b0 |
Starting column coordinates. Random coordinates are calculated if the argument is NULL. |
maxiter |
Maximum number of iterations |
tol |
Tolerance for the algorithm to converge. |
addsvd |
Calculate an additional SVD at the end of the algorithm. That makes the final solution more readable. |
lambda |
Constant to add to the diagonal of the matrices to be inverted in order to improve stability when the matrices are ill-conditioned. |
Details
The function calculates an Alternated Least Squares Biplot with any choice of weights for each element of the data matrix. The function is useful when we want a low rank approximation of a data matrix in which each element of the matrix has a different weight, for example, when all the weights are 1 except for the missing elements, which have weight 0, or to model the logarithms of a frequency table using the frequencies as weights.
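The alternating scheme can be sketched in base R as follows. This is a simplified illustration in the spirit of Gabriel and Zamir (1979), not the package's code: weighted_als is a hypothetical name, the stopping rule is our own, and the final SVD step (addsvd) is omitted. Each pass regresses every row of x on the column markers and every column on the row markers, using the weights of that row or column.

```r
# Hypothetical helper: weighted low-rank approximation by alternating least squares
weighted_als <- function(x, w = matrix(1, nrow(x), ncol(x)), dimens = 2,
                         maxiter = 100, tol = 1e-4, lambda = 0) {
  n <- nrow(x); p <- ncol(x)
  x[w == 0] <- 0                      # zero-weight cells (e.g. missing) never enter the fit
  a <- matrix(rnorm(n * dimens), n, dimens)   # random starting row coordinates
  b <- matrix(rnorm(p * dimens), p, dimens)   # random starting column coordinates
  ridge <- lambda * diag(dimens)
  loss <- function() sum(w * (x - a %*% t(b))^2)
  old <- loss(); cur <- old
  for (iter in seq_len(maxiter)) {
    # rows: weighted regression of each row of x on the column markers b
    for (i in seq_len(n)) {
      wb <- b * w[i, ]
      a[i, ] <- solve(t(b) %*% wb + ridge, t(wb) %*% x[i, ])
    }
    # columns: weighted regression of each column of x on the row markers a
    for (j in seq_len(p)) {
      wa <- a * w[, j]
      b[j, ] <- solve(t(a) %*% wa + ridge, t(wa) %*% x[, j])
    }
    cur <- loss()
    if (old - cur < tol * old) break  # relative decrease of the weighted loss
    old <- cur
  }
  list(RowCoordinates = a, ColCoordinates = b, loss = cur, iterations = iter)
}

set.seed(123)
X <- scale(as.matrix(iris[, 1:4]), scale = FALSE)
fit <- weighted_als(X, dimens = 2, tol = 1e-8)
fit$loss
```

With unit weights the solution should approach the rank-2 SVD approximation. A row or column with fewer than dimens non-zero weights makes the corresponding system singular, in which case a positive lambda is needed.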
Value
An object of class .Biplot" with the following components:
n |
Number of Rows |
p |
Number of Columns |
dim |
Dimension of the Biplot |
EigenValues |
Eigenvalues |
Inertia |
Explained variance (Inertia) |
CumInertia |
Cumulative Explained variance (Inertia) |
RowCoordinates |
Coordinates for the rows |
ColCoordinates |
Coordinates for the columns |
RowContributions |
Contributions for the rows |
ColContributions |
Contributions for the columns |
Scale_Factor |
Scale factor for the traditional plot with points and arrows. The row coordinates are multiplied and the column coordinates divided by that scale factor. The look of the plot is better without changing the inner product. For the HJ-Biplot the scale factor is 1. |
Author(s)
Jose Luis Vicente Villardon
References
GABRIEL, K.R. and ZAMIR, S. (1979). Lower rank approximation of matrices by least squares with any choice of weights. Technometrics, 21: 489-498.
Examples
data(Protein)
X=as.matrix(Protein[,3:11])
X = InitialTransform(X, transform=5)$X
bip=CrissCross(X)
Cumulative sums
Description
Cumulative sums
Usage
CumSum(X, dimens = 1)
Arguments
X |
Data Matrix |
dimens |
Dimension for summing |
Details
Cumulative sums within rows (dimens=1) or columns (dimens=2) of a data matrix
Value
A matrix of the same size as X with cumulative sums within each row or each column
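The two directions can be reproduced with base R's apply() and cumsum(). The sketch below is an equivalent formulation for illustration, not the package code.

```r
X <- matrix(1:6, nrow = 2)          # rows: (1, 3, 5) and (2, 4, 6)
by_row <- t(apply(X, 1, cumsum))    # cumulative sums within each row
by_col <- apply(X, 2, cumsum)       # cumulative sums within each column
by_row
by_col
```

The transpose in the row-wise case is needed because apply() returns the result of each row as a column.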
Author(s)
Jose Luis Vicente Villardon
Examples
data(wine)
X=wine[,4:21]
CumSum(X,1)
CumSum(X,2)
Prepares a matrix for regression from a data frame
Description
Prepares a matrix for regression from a data frame
Usage
DataFrame2Matrix4Regression(X, last = TRUE, Intercept = FALSE)
Arguments
X |
A data frame |
last |
Logical to use the last category of nominal variables as baseline. |
Intercept |
Logical to tell the function if a constant must be present |
Details
Nominal variables are converted to a matrix of dummy variables for regression.
Value
A matrix ready to use as independent variables in a regression
Author(s)
Jose Luis Vicente Villardon
Examples
##---- Should be DIRECTLY executable !! ----
Converts a Data Frame into a Binary Data Matrix
Description
Converts a Data Frame into a Binary Data Matrix
Usage
Dataframe2BinaryMatrix(dataf, cuttype = "Median", cut = NULL, BinFact = TRUE)
Arguments
dataf |
data.frame to be converted |
cuttype |
Type of cut point for continuous variables. Must be "Median" or "Mean". Does not have any effect for factors |
cut |
Personalized cut value for continuous variables. |
BinFact |
Should a factor with two levels be treated as binary? If so, a single dummy variable is used rather than two. |
Details
The function converts a data frame into a Binary Data Matrix (A matrix with entries either 0 or 1).
Factors with two levels are directly transformed into a column of 0/1 entries.
Factors with more than two levels are converted into a binary submatrix (indicator matrix) with as many rows as the data frame and as many columns as levels or categories.
Integer Variables are treated as factors
Continuous Variables are converted into binary variables using a cut point that can be the median, the mean or a value provided by the user.
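The rules above can be sketched in base R as follows. This is an illustration under our own assumptions: to_binary is a hypothetical name, values strictly above the cut point are coded 1, and integer variables are cut like continuous ones rather than treated as factors, so the output may differ from the package's.

```r
# Hypothetical helper: data frame -> 0/1 matrix
to_binary <- function(dataf, cuttype = c("Median", "Mean")) {
  cuttype <- match.arg(cuttype)
  cols <- lapply(names(dataf), function(nm) {
    v <- dataf[[nm]]
    if (is.factor(v)) {
      if (nlevels(v) == 2) {
        # two levels: a single 0/1 column for the second level
        out <- matrix(as.integer(v == levels(v)[2]), ncol = 1)
        colnames(out) <- paste(nm, levels(v)[2], sep = "_")
      } else {
        # other factors: indicator matrix, one column per level
        out <- 1L * outer(as.character(v), levels(v), "==")
        colnames(out) <- paste(nm, levels(v), sep = "_")
      }
    } else {
      # numeric variables: cut at the median or the mean
      cutp <- if (cuttype == "Median") median(v) else mean(v)
      out <- matrix(as.integer(v > cutp), ncol = 1)
      colnames(out) <- nm
    }
    out
  })
  do.call(cbind, cols)
}

B <- to_binary(iris)
head(B)
```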
Value
A Binary Data Matrix.
Author(s)
Jose Luis Vicente Villardon
Examples
data(spiders)
Dataframe2BinaryMatrix(spiders)
Adds Non-parametric densities to a biplot. Separated densities are calculated for different clusters
Description
Adds Non-parametric densities to a biplot. Separated densities are calculated for different clusters
Usage
DensityBiplot(X, y = NULL, grouplabels = NULL, ncontours = 6,
groupcolors = NULL, ncolors=20, ColorType=4)
Arguments
X |
Two dimensional coordinates of the points in a biplot (or any other) |
y |
A factor containing clusters or groups for separate densities. |
grouplabels |
Labels for the groups |
ncontours |
Number of contours to represent on the biplot |
groupcolors |
Colors for the groups |
ncolors |
Number of colors for the density patterns |
ColorType |
One of the following: "1" = rainbow, "2" = heat.colors, "3" = terrain.colors, "4" = topo.colors, "5" = cm.colors |
Details
Non-parametric densities are used to investigate the concentration of row points on different areas of the biplot representation. The densities can be calculated for different groups or clusters in order to investigate whether individuals with different characteristics are concentrated on particular areas of the biplot. The procedure is particularly useful with a high number of individuals.
Value
No value returned. It has effect on the graph.
Author(s)
Jose Luis Vicente Villardon
References
Gower, J. C., Lubbe, S. G., & Le Roux, N. J. (2011). Understanding biplots. John Wiley & Sons.
Examples
bip=PCA.Biplot(iris[,1:4])
plot(bip, mode="s", CexInd=0.1)
Calculation of Disparities
Description
Calculation of Disparities for a MDS model
Usage
Dhats(P, D, W, Model = c("Identity", "Ratio", "Interval", "Ordinal"), Standardize = TRUE)
Arguments
P |
A matrix of proximities (usually dissimilarities) |
D |
A matrix of distances obtained from an euclidean configuration |
W |
A matrix of weights |
Model |
Measurement level of the proximities |
Standardize |
Should the Disparities be standardized? |
Details
Calculation of disparities using standard or monotone regression depending on the MDS model.
Value
Returns the disparities.
Author(s)
Jose L. Vicente Villardon
References
Borg, I., & Groenen, P. J. (2005). Modern multidimensional scaling: Theory and applications. Springer.
Examples
## Function is used inside MDS or smacof
Labels for the selected dimensions in a biplot
Description
Creates a character vector with labels for the dimensions of the biplot
Usage
DimensionLabels(dimens, Root = "Dim")
Arguments
dimens |
Number of dimensions |
Root |
Root of the label |
Details
An auxiliary function to create labels for the dimensions. Useful to label the matrices of results.
Value
Returns a vector of labels
Author(s)
Jose Luis Vicente Villardon
Examples
DimensionLabels(dimens=3, Root = "Dim")
Data set extracted from the Careers of doctorate holders survey carried out by Spanish Statistical Office in 2008.
Description
The sample data, as part of a large survey, corresponds to 100 people who have the PhD degree and it shows the level of satisfaction of the doctorate holders about some issues.
Usage
data(Doctors)
Format
This data frame contains 100 observations for the following 5 ordinal variables, with four categories each: (1 = "Very Satisfied", 2 = "Somewhat Satisfied", 3 = "Somewhat dissatisfied", 4 = "Very dissatisfied")
- Salary
- Benefits
- Job Security
- Job Location
- Working conditions
Source
Spanish Statistical Institute. Survey of PhD holders, 2006. URL: http://www.ine.es.
Examples
data(Doctors)
## maybe str(Doctors) ; plot(Doctors) ...
Plots a panel of error bars
Description
Plots a panel of error bars to compare the means of several variables in the levels of a factor using confidence intervals.
Usage
ErrorBarPlotPanel(X, groups = NULL, nrows = NULL, panel = TRUE,
GroupsTogether = TRUE, Confidence = 0.95, p.adjust.method = "None",
UseANOVA = FALSE, Colors = "blue", Title = "Error Bar Plot",
sort = TRUE, ...)
Arguments
X |
A matrix containing several variables |
groups |
A factor defining groups of individuals |
nrows |
Number of rows of the panel. The function calculates the number of columns needed. |
panel |
The plots are shown on a panel (TRUE) or in separated graphs (FALSE) |
GroupsTogether |
The groups appear together on the same plot |
Confidence |
Confidence levels for the error bars (confidence intervals) |
p.adjust.method |
Method for adjusting the p-value to cope with multiple comparisons ("None", "Bonferroni" or "Sidak"). |
UseANOVA |
If TRUE the function uses the residual variance of the ANOVA to calculate the confidence interval. |
Colors |
Colors to identify the groups |
Title |
Title of the graph |
sort |
Should the means be sorted before plotting? |
... |
Other graphical parameters |
Details
The function plots a panel of error bar plots to compare several groups for several variables.
Value
A panel of error bars plots.
Author(s)
Jose Luis Vicente Villardon
Examples
ErrorBarPlotPanel(wine[4:9], wine$Group, UseANOVA=TRUE, Title="", sort=FALSE)
Classical Euclidean Distance (Pythagorean Distance)
Description
Calculates the Euclidean distances among the rows of a Euclidean configuration in any dimension
Usage
EuclideanDistance(x)
Arguments
x |
A matrix containing the euclidean configuration |
Details
Euclidean distances among the rows of a Euclidean configuration in any dimension
Value
Returns the distance matrix
Author(s)
Jose Luis Vicente Villardon
Examples
x=matrix(runif(20),10,2)
D=EuclideanDistance(x)
Expands a compressed table of patterns and frequencies
Description
Expands a compressed table of patterns and frequencies
Usage
ExpandTable(table)
Arguments
table |
A compressed table of patterns and frequencies |
Details
To simplify the calculations of some of the algorithms we compress the tables by searching for the distinct patterns and their frequencies. This function recovers the original data. It also serves to assign the coordinates on the biplot to the original individuals.
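The compress and expand steps can be illustrated in base R. The structure used here (distinct patterns, their frequencies, and an index linking each original row to its pattern) is an assumption made for the sketch; the package's compressed table may be organised differently.

```r
x <- matrix(c(1, 0,  1, 0,  0, 1,  1, 0,  0, 1), ncol = 2, byrow = TRUE)

# Compress: distinct row patterns, their frequencies, and a row-to-pattern index
key <- apply(x, 1, paste, collapse = "")
patterns <- x[!duplicated(key), , drop = FALSE]
index <- match(key, key[!duplicated(key)])
freq <- as.vector(table(index))

# Expand: repeat each pattern at the positions where it originally occurred
expanded <- patterns[index, , drop = FALSE]
```

The same index can be applied to a matrix of pattern coordinates, which is how coordinates computed for the distinct patterns would be assigned back to the original individuals.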
Value
A matrix with the original data
Author(s)
Jose Luis Vicente Villardon
Examples
##---- Should be DIRECTLY executable !! ----
External Logistic Biplot for binary Data
Description
Fits an External Logistic Biplot to the results of a Principal Coordinates Analysis obtained from binary data.
Usage
ExternalBinaryLogisticBiplot(Pco, IncludeConst=TRUE, penalization=0.2, freq=NULL,
tolerance = 1e-05, maxiter = 100)
Arguments
Pco |
An object of class "Principal.Coordinates" |
IncludeConst |
Should the logistic fit include the constant term? |
penalization |
Penalization for the ridge regression |
freq |
frequencies for each observation or pattern (usually 1) |
tolerance |
Tolerance for convergence |
maxiter |
Maximum number of iterations |
Details
Let {\bf{X}}
be the matrix of binary data scored as present or absent (1 or 0), in which the rows correspond to n individuals or entries (for example, genotypes) and the columns to p binary characters (for example alleles or bands), let {\bf{S}} = ({s_{ij}})
be a matrix containing the similarities among rows, obtained from the binary data matrix , and let \Delta = ({\delta _{ij}})
be the corresponding dissimilarity/distance matrix, taking for example {\delta _{ij}} = \sqrt {1 - {s_{ij}}}
.
Despite the fact that, in Cluster Analysis and Principal Coordinates Analysis, interpretation of the variables responsible for grouping or ordination is not straightforward, those methods are normally used to classify individual in which binary variables have been measured.
we use a combination of Principal Coordinates Analysis (PCoA), Cluster Analysis (CA) and External Logistic Regression (ELB), as a better way to interpret the binary variables associated to the classification of genotypes. The combination of three standard techniques with some new ideas about the geometry of the procedures, allows to construct a External Logistic Regression (ELB), that helps the interpretation of the variables responsible for the classification or ordination.
Suppose we have a Euclidean configuration $\mathbf{Y}$ obtained from the Principal Coordinates Analysis (PCoA) of the similarity matrix.
To search for the variables associated to the ordination obtained in PCoA, we can look for the directions in the ordination diagram that better predict the probability of presence of each allele.
More formally, we define $\pi_{ij} = E(x_{ij}) = \frac{1}{1 + \exp\left(-\left(b_{j0} + \sum_{s=1}^{k} b_{js} y_{is}\right)\right)}$ as the expected probability that allele j is present for a genotype with coordinates $y_{is}$ (i = 1, ..., n; s = 1, ..., k) on the ordination diagram, where $b_{js}$ (j = 1, ..., p) are the logistic regression coefficients that correspond to the j-th variable (allele or band) in the s-th dimension. The model is a generalized linear model having the logit as a link function:
$\mathrm{logit}(\pi_{ij}) = \log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = b_{j0} + \sum_{s=1}^{k} b_{js} y_{is} = b_{j0} + \mathbf{y}_i' \mathbf{b}_j$
where $\mathbf{y}_i = (y_{i1}, \ldots, y_{ik})'$ and $\mathbf{b}_j = (b_{j1}, \ldots, b_{jk})'$. The y's and b's define a biplot in logit scale. This is called an External Logistic Biplot because the coordinates of the genotypes are calculated in an external procedure (PCoA). Given that the y's are known from PCoA, obtaining the b's is equivalent to performing a logistic regression using the j-th column of X as the response variable and the columns of Y as regressors.
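The column-by-column regression described above can be sketched in base R. This is only an illustration under stated assumptions: the toy matrix X, the simple matching similarity and the unpenalized glm fit are choices made here, whereas the package function works on a Principal.Coordinates object and adds a ridge penalization.

```r
# Hypothetical toy data: 40 individuals scored on 5 binary characters
set.seed(123)
n <- 40; p <- 5
X <- matrix(rbinom(n * p, 1, 0.5), n, p,
            dimnames = list(NULL, paste0("band", 1:p)))

# Simple matching similarity and the distance delta = sqrt(1 - s)
S <- (X %*% t(X) + (1 - X) %*% t(1 - X)) / p
Delta <- sqrt(1 - S)

# External step: principal coordinates (classical scaling) in k = 2 dimensions
Y <- cmdscale(Delta, k = 2)
colnames(Y) <- c("dim1", "dim2")

# One logistic regression per column of X, with the columns of Y as regressors.
# Plain glm is used here; it may warn if a column is perfectly separated.
fit_column <- function(x) coef(glm(x ~ Y, family = binomial()))
B <- t(apply(X, 2, fit_column))
colnames(B) <- c("b0", "b1", "b2")

# Expected probabilities pi_ij on the logit-scale biplot
P <- plogis(cbind(1, Y) %*% t(B))
B
```

Each row of B holds the constant and the direction of one variable on the ordination diagram, and P contains the fitted probabilities of presence.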
Value
An object of class External.Binary.Logistic.Biplot with the fields of the Principal.Coordinates object and the following fields added.
ColumnParameters |
Parameters resulting from fitting a logistic regression to each column of the original binary data matrix |
VarInfo |
Information of the fit for each variable |
VarInfo$Deviances |
A vector with the deviances of each variable calculated as the difference with the null model |
VarInfo$Dfs |
A vector with degrees of freedom for each variable |
VarInfo$pvalues |
A vector with the p-values for each variable |
VarInfo$Nagelkerke |
A vector with the Nagelkerke pseudo R-squared for each variable |
VarInfo$PercentsCorrec |
A vector with the percentage of correct classifications for each variable |
DevianceTotal |
Total Deviance as the difference with the null model |
p |
p value for the complete representation |
TotalPercent |
Total percentage of correct classification |
Author(s)
Jose Luis Vicente Villardon
References
Demey, J., Vicente-Villardon, J. L., Galindo, M. P. and Zambrano, A. (2008) Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics, 24(24): 2832-2838.
Vicente-Villardon, J. L., Galindo, M. P. and Blazquez, A. (2006) Logistic Biplots. In Multiple Correspondence Analysis And Related Methods. Greenacre, M. & Blasius, J., Eds, Chapman and Hall, Boca Raton.
Examples
data(spiders)
x2=Dataframe2BinaryMatrix(spiders)
colnames(x2)=colnames(spiders)
dist=BinaryProximities(x2)
pco=PrincipalCoordinates(dist)
pcobip=ExternalBinaryLogisticBiplot(pco)
Extracts unique patterns and their frequencies for a discrete data matrix (numeric)
Description
Extracts the patterns and the frequencies of a discrete data matrix reducing the size of the data matrix in order to accelerate calculations in some techniques.
Usage
ExtractTable(x)
Arguments
x |
A matrix of integers containing information of discrete variables. The input matrix must be numerical for the procedure to work properly. |
Details
For any numerical matrix, calculates the different patterns and the frequencies associated with each pattern. The result contains the pattern matrix, a vector with the frequencies, and a list with the rows sharing each pattern. The final pattern matrix is ordered differently from the original matrix.
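As a rough illustration of the idea, the following base R sketch groups equal rows of a small made-up matrix. The object names (key, equal_rows, patterns, freq) are hypothetical and are not claimed to match the internals of ExtractTable.

```r
# Hypothetical discrete matrix with repeated rows
x <- matrix(c(1, 0, 1,
              0, 1, 1,
              1, 0, 1,
              0, 1, 1,
              1, 1, 0,
              1, 0, 1), ncol = 3, byrow = TRUE)
rownames(x) <- paste0("r", 1:6)

# One string key per row, then group the row indices by key
key <- apply(x, 1, paste, collapse = "-")
equal_rows <- split(seq_len(nrow(x)), key)

# Reduced table (first row of each group) and pattern frequencies
patterns <- x[sapply(equal_rows, `[`, 1), , drop = FALSE]
freq <- lengths(equal_rows)

patterns
freq
```

Because split() orders the groups by key, the reduced table is not in the original row order; equal_rows keeps the information needed to rebuild the initial matrix.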
Value
OriginalNames |
Names before grouping the equal rows |
Patterns |
The reduced table with only unique patterns |
EqualRows |
A list with as many components as unique patterns specifying the original rows with each pattern. That will allow for the reconstruction of the initial matrix |
Author(s)
Jose Luis Vicente-Villardon
Examples
data(spiders)
spidersbin=Dataframe2BinaryMatrix(spiders)
spiderstable=ExtractTable(spidersbin)
Biplot for Factor Analysis.
Description
Biplot used as a graphical representation of Factor Analysis.
Usage
FA.Biplot(X, dimension = 3, Extraction="PC", Rotation="varimax",
InitComunal="A1", normalize=FALSE, Scores= "Regression",
MethodArgs=NULL, sup.rows = NULL, sup.cols = NULL, ...)
Arguments
X |
Data Matrix |
dimension |
Dimension of the solution |
Extraction |
Method for the extraction of the factors. Can be "PC", "IPF" or "ML" ("Principal Components", "Iterated Principal Factor" or "Maximum Likelihood") |
Rotation |
Method for the rotation of the factors, for example "varimax" (the default) or "oblimin" |
InitComunal |
Initial communalities for the iterated principal factor method. Can be "A1", "HSC" or "MC" ("All 1", "Highest Simple Correlation" or "Multiple Correlation") |
normalize |
Should the loadings be normalized? |
Scores |
Method to calculate the Row Scores. Must be "Regression" or "Bartlett". |
MethodArgs |
Additional arguments associated with the rotation method. |
sup.rows |
Supplementary or illustrative rows, if any. |
sup.cols |
Supplementary or illustrative columns, if any. |
... |
Additional arguments for the rotation procedure. |
Details
Biplots represent the rows and columns of a data matrix in reduced dimensions. Usually rows represent individuals, objects or samples and columns are variables measured on them. The most classical versions can be thought of as visualizations associated with Principal Components Analysis (PCA) or Factor Analysis (FA) obtained from a Singular Value Decomposition or a related method. From another point of view, Classical Biplots could be obtained from regressions and calibrations that are essentially an alternated least squares algorithm equivalent to an EM-algorithm when data are normal. This routine calculates a biplot as a graphical representation of a classical Factor Analysis, allowing for different extraction methods and different rotations.
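A minimal base R sketch of one such combination is given below: principal component extraction, varimax rotation and regression scores. It is an assumption-laden illustration using the iris data and stats::varimax, not a description of how FA.Biplot is implemented.

```r
X <- as.matrix(iris[, 1:4])
Z <- scale(X)                       # standardized data
R <- cor(X)                         # correlation matrix
k <- 2                              # number of factors kept

# Principal component extraction: loadings = eigenvectors * sqrt(eigenvalues)
e <- eigen(R, symmetric = TRUE)
L <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
rownames(L) <- colnames(X)

# Orthogonal varimax rotation of the loadings
rot <- varimax(L)
Lrot <- unclass(rot$loadings)

# Regression-method factor scores for the rows
scores <- Z %*% solve(R) %*% Lrot

round(Lrot, 3)
head(scores)
```

The rows of scores and the rows of Lrot can then be drawn together as points and arrows, for instance with stats::biplot(scores, Lrot).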
Value
An object of class "ContinuousBiplot" with the following components:
Title |
A general title |
Non_Scaled_Data |
Original Data Matrix |
Means |
Means of the original Variables |
Medians |
Medians of the original Variables |
Deviations |
Standard Deviations of the original Variables |
Minima |
Minima of the original Variables |
Maxima |
Maxima of the original Variables |
P25 |
25 Percentile of the original Variables |
P75 |
75 Percentile of the original Variables |
Gmean |
Global mean of the complete matrix |
Sup.Rows |
Supplementary rows (Non Transformed) |
Sup.Cols |
Supplementary columns (Non Transformed) |
Scaled_Data |
Transformed Data |
Scaled_Sup.Rows |
Supplementary rows (Transformed) |
Scaled_Sup.Cols |
Supplementary columns (Transformed) |
n |
Number of Rows |
p |
Number of Columns |
nrowsSup |
Number of Supplementary Rows |
ncolsSup |
Number of Supplementary Columns |
dim |
Dimension of the Biplot |
EigenValues |
Eigenvalues |
Inertia |
Explained variance (Inertia) |
CumInertia |
Cumulative Explained variance (Inertia) |
EV |
EigenVectors |
Structure |
Correlations of the Principal Components and the Variables |
RowCoordinates |
Coordinates for the rows, including the supplementary |
ColCoordinates |
Coordinates for the columns, including the supplementary |
RowContributions |
Contributions for the rows, including the supplementary |
ColContributions |
Contributions for the columns, including the supplementary |
Scale_Factor |
Scale factor for the traditional plot with points and arrows. The row coordinates are multiplied and the column coordinates divided by that scale factor. The look of the plot is better without changing the inner product. For the HJ-Biplot the scale factor is 1. |
Author(s)
Jose Luis Vicente Villardon
References
Gabriel, K.R.(1971): The biplot graphic display of matrices with applications to principal component analysis. Biometrika, 58, 453-467.
Gabriel, K. R. and Zamir, S. (1979). Lower rank approximation of matrices by least squares with any choice of weights. Technometrics, 21(4):489–498.
Gabriel, K.R.(1998): Generalised Bilinear Regression. Biometrika, 85, 3, 689-700.
Gower, J. C. and Hand, D. J. (1996): Biplots. Chapman & Hall.
Vicente-Villardon, J. L., Galindo, M. P. and Blazquez-Zaballos, A. (2006). Logistic Biplots. Multiple Correspondence Analysis and related methods 491-509.
Examples
data(Protein)
X=Protein[,3:11]
bip=FA.Biplot(X, Extraction="ML", Rotation="oblimin")
plot(bip, mode="s", margin=0.05, AddArrow=TRUE)
Converts a Factor into its indicator matrix
Description
Converts a factor into a binary matrix with as many columns as categories of the factor
Usage
Factor2Binary(y, Name = NULL)
Arguments
y |
A factor |
Name |
Name to use in the final matrix |
Value
An indicator binary matrix
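For illustration, an indicator matrix of this kind can be built in base R as follows. This is a sketch with a made-up factor, not the package code.

```r
y <- factor(c("a", "b", "b", "c", "a"))

# Compare each level code with the column index: TRUE/FALSE becomes 1/0
Z <- outer(as.integer(y), seq_len(nlevels(y)), "==") * 1
colnames(Z) <- levels(y)
Z
```

Each row contains a single 1, in the column of the category observed for that element.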
Author(s)
Jose Luis Vicente Villardon
Examples
y=factor(c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 2, 2, 2, 1, 1, 1))
Factor2Binary(y)
Selection of a fraction of the data
Description
Selects a percentage of the data eliminating the observations with higher Mahalanobis distances to the center.
Usage
Fraction(data, confidence = 1)
Arguments
data |
Two dimensional data set |
confidence |
Fraction of the data to retain (between 0 and 1) |
Details
The function is used to select a fraction of the data to be plotted for example when clusters are used. The function eliminates the extreme values.
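The selection can be sketched in base R with stats::mahalanobis. The cut-off rule used here (the empirical quantile of the squared distances) is an assumption made for the illustration and may differ from the rule used inside Fraction.

```r
set.seed(1)
a <- matrix(rnorm(200), 100, 2)     # hypothetical two-dimensional data
confidence <- 0.7                   # fraction of points to keep

# Squared Mahalanobis distance of every point to the centre
d2 <- mahalanobis(a, center = colMeans(a), cov = cov(a))

# Keep the points closest to the centre
keep <- d2 <= quantile(d2, confidence)
fraction <- a[keep, , drop = FALSE]
nrow(fraction)
```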
Value
An object of class fraction with the following fields
data |
The original data |
fraction |
The selected data |
confidence |
The percentage selected |
Author(s)
Jose Luis Vicente Villardon
References
Meulman, J. J., & Heiser, W. J. (1983). The display of bootstrap solutions in multidimensional scaling. Murray Hill, NJ: Bell Laboratories.
Linting, M., Meulman, J. J., Groenen, P. J., & Van der Kooij, A. J. (2007). Stability of nonlinear principal components analysis: An empirical study using the balanced bootstrap. Psychological Methods, 12(3), 359.
See Also
ConcEllipse, AddCluster2Biplot
Examples
a=matrix(runif(50), 25,2)
a2=Fraction(a, 0.7)
Biplot for continuous data based on gradient descent methods
Description
Biplot for continuous data based on gradient descent methods.
Usage
GD.Biplot(X, dimension = 2, Scaling = 5,
lambda = 0.01, OptimMethod = "CG",
Orthogonalize = FALSE, Algorithm = "Alternated",
sup.rows = NULL, sup.cols = NULL,
grouping = NULL, tolerance = 1e-04,
num_max_iters = 300, Initial = "random")
Arguments
X |
A data matrix with continuous variables. |
dimension |
Dimension of the final solution. |
Scaling |
Transformation of the raw data matrix before the calculation of the biplot. |
lambda |
Constant for the ridge Penalization |
OptimMethod |
Optimization method passed to the optim function. |
Orthogonalize |
Should the solution be orthogonalized? |
Algorithm |
Algorithm to calculate the Biplot. (Alternated, Joint, Recursive) |
sup.rows |
Supplementary Rows. (not working now) |
sup.cols |
Supplementary Columns. (not working now) |
grouping |
Grouping factor for the within groups transformation. |
tolerance |
Tolerance for convergence |
num_max_iters |
Maximum number of iterations. |
Initial |
Initial Configuration |
Details
The function calculates a biplot using gradient descent methods. The function optim is used to optimize the loss function. By default the CG (Conjugate Gradient) method is used, although other methods can be chosen.
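The following sketch shows how a rank-2 biplot could be obtained with optim and the CG method. The loss function (residual sum of squares plus a ridge term weighted by lambda) and the random start are assumptions made for the example; the loss actually used by GD.Biplot is not stated in this page.

```r
X <- scale(as.matrix(iris[, 1:4]))
n <- nrow(X); p <- ncol(X); k <- 2; lambda <- 0.01

# Parameters are stacked as c(A, B): row markers A (n x k), column markers B (p x k)
unpack <- function(par) {
  list(A = matrix(par[1:(n * k)], n, k),
       B = matrix(par[-(1:(n * k))], p, k))
}

# Assumed loss: ||X - A B'||^2 + lambda * (||A||^2 + ||B||^2)
loss <- function(par) {
  m <- unpack(par)
  sum((X - m$A %*% t(m$B))^2) + lambda * (sum(m$A^2) + sum(m$B^2))
}

# Analytical gradient of the loss with respect to A and B
grad <- function(par) {
  m <- unpack(par)
  E <- X - m$A %*% t(m$B)
  c(-2 * E %*% m$B + 2 * lambda * m$A,
    -2 * t(E) %*% m$A + 2 * lambda * m$B)
}

set.seed(1)
start <- rnorm((n + p) * k, sd = 0.1)
fit <- optim(start, loss, grad, method = "CG", control = list(maxit = 300))
fit$value
```

The residual sum of squares of the truncated SVD, sum(svd(X)$d[-(1:k)]^2), is a lower bound for this loss and gives a reference for how close the iterations get.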
Value
An object of class "ContinuousBiplot" is returned.
Author(s)
Jose Luis Vicente Villardon
Examples
data("Protein")
X=Protein[,3:11]
gdbip=GD.Biplot(X, dimension=2, Algorithm="Joint",
Orthogonalize=FALSE, lambda=0.3, Initial="random")
plot(gdbip)
summary(gdbip)
Games-Howell post-hoc tests for Welch's one-way analysis
Description
This function produces results from Games-Howell post-hoc tests for Welch's one-way analysis of variance (ANOVA) for a matrix of numeric data and a grouping variable.
Usage
Games_Howell(data, group)
Arguments
data |
The matrix of continuous data. |
group |
The grouping variable |
Details
This function produces results from Games-Howell post-hoc tests for Welch's one-way analysis of variance (ANOVA) for a matrix of numeric data and a grouping variable.
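A compact sketch of the usual Games-Howell computation for one variable is shown below (Welch standard error, Welch-Satterthwaite degrees of freedom and the studentized range distribution). The helper games_howell is hypothetical and its output layout is not claimed to match that of Games_Howell.

```r
games_howell <- function(y, g) {
  g <- factor(g)
  m <- as.vector(tapply(y, g, mean))
  v <- as.vector(tapply(y, g, var))
  n <- as.vector(tapply(y, g, length))
  k <- nlevels(g)
  pairs <- t(combn(k, 2))
  i <- pairs[, 1]; j <- pairs[, 2]
  se2 <- v[i] / n[i] + v[j] / n[j]                 # Welch variance of the difference
  tstat <- abs(m[i] - m[j]) / sqrt(se2)
  df <- se2^2 / ((v[i] / n[i])^2 / (n[i] - 1) +
                 (v[j] / n[j])^2 / (n[j] - 1))     # Welch-Satterthwaite df
  p <- ptukey(tstat * sqrt(2), nmeans = k, df = df, lower.tail = FALSE)
  data.frame(pair = paste(levels(g)[i], levels(g)[j], sep = " - "),
             diff = m[i] - m[j], t = tstat, df = df, p.value = p)
}

# One table of pairwise comparisons per column of the data matrix
res <- lapply(iris[, 1:4], games_howell, g = iris$Species)
res$Sepal.Width
```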
Value
The tests for each column of the data matrix
Author(s)
Jose Luis Vicente Villardon
References
Ruxton, G. D., & Beauchamp, G. (2008). Time for some a priori thinking about post hoc testing. Behavioral ecology, 19(3), 690-693.
Examples
# Not yet
Generalized Procrustes Analysis
Description
Generalized Procrustes Analysis
Usage
GeneralizedProcrustes(x, tolerance = 1e-05, maxiter = 100, Plot = FALSE)
Arguments
x |
Three dimensional array with the configurations. The first dimension contains the rows of the configurations, the second contains the columns and the third indexes the configurations, so x[,,i] is the i-th configuration |
tolerance |
Tolerance for the Procrustes algorithm. |
maxiter |
Maximum number of iterations |
Plot |
Should the results be plotted? |
Details
Generalized Procrustes Analysis for several configurations contained in a three-dimensional array.
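To make the iteration concrete, here is a simplified base R sketch: each configuration is centred and rotated towards the current consensus (the mean configuration) until the residual sum of squares stabilizes. The helpers procrustes_rotate and gpa are hypothetical, and the scaling step of the full method is deliberately left out.

```r
# Orthogonal Procrustes rotation of X towards a target configuration
procrustes_rotate <- function(X, target) {
  s <- svd(t(X) %*% target)
  X %*% (s$u %*% t(s$v))
}

gpa <- function(x, tolerance = 1e-5, maxiter = 100) {
  m <- dim(x)[3]
  for (i in seq_len(m)) x[, , i] <- scale(x[, , i], scale = FALSE)  # centre
  consensus <- apply(x, c(1, 2), mean)
  rss_old <- Inf
  for (iter in seq_len(maxiter)) {
    for (i in seq_len(m)) x[, , i] <- procrustes_rotate(x[, , i], consensus)
    consensus <- apply(x, c(1, 2), mean)
    rss <- sum(sweep(x, c(1, 2), consensus)^2)
    if (abs(rss_old - rss) < tolerance) break
    rss_old <- rss
  }
  list(RotatedX = x, consensus = consensus, rss = rss, iterations = iter)
}

# Three rotated copies of the same made-up configuration
set.seed(1)
base <- matrix(rnorm(20), 10, 2)
rot <- function(a) matrix(c(cos(a), sin(a), -sin(a), cos(a)), 2, 2)
x <- array(0, c(10, 2, 3))
x[, , 1] <- base
x[, , 2] <- base %*% rot(pi / 6)
x[, , 3] <- base %*% rot(-pi / 3)
fit <- gpa(x)
fit$rss
```

Since the three configurations differ only by a rotation, the rotated copies coincide and the residual sum of squares is numerically zero.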
Value
An object of class GenProcustes. This has components:
History |
History of Iterations |
X |
Initial configurations in a three dimensional array |
RotatedX |
Transformed configurations in a three dimensional array |
Scale |
Scale factors for each configuration |
Rotations |
Rotation Matrices in a three dimensional array |
rss |
Residual Sum of Squares |
Fit |
Goodness of fit as percent of explained variance |
Author(s)
Jose Luis Vicente-Villardon
References
Gower, J.C., (1975). Generalised Procrustes analysis. Psychometrika 40, 33-51.
Borg, I. & Groenen, P. J. F. (2005). Modern Multidimensional Scaling. Theory and Applications. Second Edition. Springer.
Examples
data(spiders)
n=dim(spiders)[1]
p=dim(spiders)[2]
prox=array(0,c(n,2,4))
p1=BinaryProximities(spiders,coefficient=5)
prox[,,1]=PrincipalCoordinates(p1)$RowCoordinates
p2=BinaryProximities(spiders,coefficient=2)
prox[,,2]=PrincipalCoordinates(p2)$RowCoordinates
p3=BinaryProximities(spiders,coefficient=3)
prox[,,3]=PrincipalCoordinates(p3)$RowCoordinates
p4=BinaryProximities(spiders,coefficient=4)
prox[,,4]=PrincipalCoordinates(p4)$RowCoordinates
GeneralizedProcrustes(prox)
Calculates the scales for the variables on a linear biplot
Description
Calculates the scales for the variables on a linear prediction biplot. There are several types of scales and values that can be shown on the graphical representation. See details.
Usage
GetBiplotScales(Biplot, nticks = 3, TypeScale = "Complete", ValuesScale = "Original")
Arguments
Biplot |
Object of class PCA.Biplot |
nticks |
Number of ticks for the biplot axes |
TypeScale |
Type of scale to use : "Complete", "StdDev" or "BoxPlot" |
ValuesScale |
Values to show on the scale: "Original" or "Transformed" |
Details
The function calculates the points on the biplot axes where the scales should be placed.
There are three types of scales when the transformations of the raw data are made by columns:
"Complete": Covers the whole range of the variable using the number of ticks specified in "nticks". A smaller number of points could be shown if some fall outside the range of the scatter.
"StdDev": The mean +/- 1, 2 and 3 times the standard deviation. A smaller number of points could be shown if some fall outside the range of the scatter.
"BoxPlot": The median, the 25th and 75th percentiles, and the maximum and minimum values are shown. The extremes of the interquartile range are connected with a thicker line. A smaller number of points could be shown if some fall outside the range of the scatter.
There are two kinds of values that can be shown on the biplot axis:
"Original": The values before transformation. Only makes sense when the transformations are for each column.
"Transformed": The values after transformation, for example, after standardization.
Although the function is public, the end user will not normally call it directly.
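The placement of the tick marks can be illustrated with a small base R sketch for a PCA biplot of standardized data. The rule used here, that the marker for a transformed value z on the axis with direction b is z * b / (b'b), is the standard calibration of a prediction biplot; all object names are hypothetical and the code is not taken from GetBiplotScales.

```r
X <- as.matrix(iris[, 1:4])
Z <- scale(X)                              # column standardization
s <- svd(Z)
A <- s$u[, 1:2] %*% diag(s$d[1:2])         # row markers
B <- s$v[, 1:2]                            # column markers (axis directions)

j <- 1                                     # variable to calibrate
ticks <- pretty(X[, j], n = 3)             # values on the original scale
z <- (ticks - mean(X[, j])) / sd(X[, j])   # same values after transformation

# Point on the biplot axis of variable j that predicts each value z
b <- B[j, ]
markers <- outer(z, b) / sum(b^2)
cbind(label = ticks, markers)
```

Projecting any marker onto b returns the transformed value it represents, which is why a row point can be read against the scale by orthogonal projection.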
Value
A list with the following components:
Ticks |
A list containing the ticks for each variable |
Labels |
A list containing the labels for each variable |
Author(s)
Jose Luis Vicente Villardon
Examples
data(iris)
bip=PCA.Biplot(iris[,1:4])
GetBiplotScales(bip)
Calculates scales for plotting the environmental variables in a Canonical Correspondence Analysis
Description
Calculates scales for plotting the environmental variables in a Canonical Correspondence Analysis
Usage
GetCCAScales(CCA, nticks = 7, TypeScale = "Complete", ValuesScale = "Original")
Arguments
CCA |
A CCA solution object |
nticks |
Number of ticks to represent |
TypeScale |
Type of scale to represent |
ValuesScale |
Values to represent (Original or Transformed) |
Details
Calculates scales for plotting the environmental variables in a Canonical Correspondence Analysis
Value
Returns the points and the labels for each biplot axis
Author(s)
Jose Luis Vicente Villardon
References
Gower, J. C., & Hand, D. J. (1995). Biplots (Vol. 54). CRC Press.
Gower, J. C., Lubbe, S. G., & Le Roux, N. J. (2011). Understanding biplots. John Wiley & Sons.
Vicente-Villardón, J. L., Galindo Villardón, M. P., & Blázquez Zaballos, A. (2006). Logistic biplots. Multiple correspondence analysis and related methods. London: Chapman & Hall, 503-521.
Examples
# No examples yet
Gower Dissimilarities for mixed types of data
Description
Gower Dissimilarities for mixed types of data
Usage
GowerProximities(x, y = NULL, Binary = NULL, Classes = NULL,
transformation = 3, IntegerAsOrdinal = FALSE, BinCoef
= "Simple_Matching", ContCoef = "Gower", NomCoef =
"GOW", OrdCoef = "GOW")
Arguments
x |
Main data. Distances among rows are calculated if y=NULL. Must be a data frame. |
y |
Supplementary data matrix. If not NULL, the distances among the rows of x and y are calculated. Must be a data frame with the same columns as x. |
Binary |
A vector containing the binary variables. |
Classes |
Vector with column types. If NULL the data frame types are used. |
transformation |
Transformation for the similarities. |
IntegerAsOrdinal |
Should integer variables be used as ordinal? |
BinCoef |
Coefficient for the binary data |
ContCoef |
Coefficient for the continuous data |
NomCoef |
Coefficient for the nominal data |
OrdCoef |
Coefficient for the ordinal data |
Details
The transformation sqrt(1-S) is applied to the similarity.
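As a hedged illustration of the general idea, the sketch below computes a Gower-type similarity for a tiny made-up data frame: range-scaled absolute differences for numeric columns and simple matching for the rest, averaged over variables and then transformed with sqrt(1-S). The helper gower_similarity is hypothetical and ignores the coefficient options of the package function.

```r
# Hypothetical mixed data: one continuous, one nominal and one binary variable
df <- data.frame(height = c(1.60, 1.75, 1.82, 1.68),
                 colour = factor(c("red", "blue", "red", "green")),
                 smoker = c(TRUE, FALSE, FALSE, TRUE))

gower_similarity <- function(df) {
  n <- nrow(df)
  S <- matrix(0, n, n)
  for (v in names(df)) {
    x <- df[[v]]
    if (is.numeric(x)) {
      s <- 1 - abs(outer(x, x, "-")) / diff(range(x))   # range-scaled difference
    } else {
      x <- as.character(x)
      s <- outer(x, x, "==") * 1                        # 1 if equal, 0 otherwise
    }
    S <- S + s
  }
  S / ncol(df)                                          # average over variables
}

S <- gower_similarity(df)
D <- sqrt(1 - S)     # dissimilarity obtained with the sqrt(1-S) transformation
round(D, 3)
```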
Value
An object of class proximities.
Author(s)
Jose Luis Vicente-Villardon
References
J. C. Gower. (1971) A General Coefficient of Similarity and Some of its Properties. Biometrics, Vol. 27, No. 4, pp. 857-871.
Examples
data(spiders)
Gower Dissimilarities for mixed types of data
Description
Gower Dissimilarities for mixed types of data
Usage
GowerSimilarities(x, y = NULL, Classes = NULL, transformation =
"sqrt(1-S)", BinCoef = "Simple_Matching", ContCoef =
"Gower", NomCoef = "GOW", OrdCoef = "GOW")
Arguments
x |
Main data. Distances among rows are calculated if y=NULL. Must be a data frame. |
y |
Supplementary data matrix. If not NULL, the distances among the rows of x and y are calculated. Must be a data frame with the same columns as x. |
Classes |
Vector containing the classes of each variable. |
transformation |
Transformation to apply to the similarities. |
BinCoef |
Coefficient for the binary data |
ContCoef |
Coefficient for the continuous data |
NomCoef |
Coefficient for the nominal data |
OrdCoef |
Coefficient for the ordinal data |
Details
Gower Dissimilarities for mixed types of data.
The transformation sqrt(1-S) is applied to the similarity by default.
Value
An object of class proximities.
Author(s)
Jose Luis Vicente-Villardon
References
J. C. Gower. (1971) A General Coefficient of Similarity and Some of its Properties. Biometrics, Vol. 27, No. 4, pp. 857-871.
Examples
data(spiders)
HJ Biplot with added features.
Description
HJ Biplot with added features.
Usage
HJ.Biplot(X, dimension = 3, Scaling = 5, sup.rows = NULL,
sup.cols = NULL, grouping = NULL)
Arguments
X |
Data Matrix |
dimension |
Dimension of the solution |
Scaling |
Transformation of the original data. See InitialTransform for available transformations. |
sup.rows |
Supplementary or illustrative rows, if any. |
sup.cols |
Supplementary or illustrative columns, if any. |
grouping |
Factor used to standardize with the within-groups variability |
Details
Biplots represent the rows and columns of a data matrix in reduced dimensions. Usually rows represent individuals, objects or samples and columns are variables measured on them. The most classical versions can be thought as visualizations associated to Principal Components Analysis (PCA) or Factor Analysis (FA) obtained from a Singular Value Decomposition or a related method. From another point of view, Classical Biplots could be obtained from regressions and calibrations that are essentially an alternated least squares algorithm equivalent to an EM-algorithm when data are normal.
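For reference, the coordinates of an HJ-Biplot can be sketched directly from the Singular Value Decomposition: rows and columns are both represented in principal coordinates. The example assumes column-standardized data and uses only base R; it is an illustration, not the package implementation.

```r
X <- scale(as.matrix(iris[, 1:4]))      # standardize the columns
s <- svd(X)
k <- 2

rows <- s$u[, 1:k] %*% diag(s$d[1:k])   # row markers: U D
cols <- s$v[, 1:k] %*% diag(s$d[1:k])   # column markers: V D
rownames(cols) <- colnames(X)

inertia <- 100 * s$d^2 / sum(s$d^2)     # explained variance per dimension
round(cumsum(inertia), 2)
round(cols, 3)
```

The row markers are the principal component scores (X times the eigenvectors), while the column markers are the eigenvectors stretched by the singular values.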
Value
An object of class ContinuousBiplot with the following components:
Title |
A general title |
Non_Scaled_Data |
Original Data Matrix |
Means |
Means of the original Variables |
Medians |
Medians of the original Variables |
Deviations |
Standard Deviations of the original Variables |
Minima |
Minima of the original Variables |
Maxima |
Maxima of the original Variables |
P25 |
25 Percentile of the original Variables |
P75 |
75 Percentile of the original Variables |
Gmean |
Global mean of the complete matrix |
Sup.Rows |
Supplementary rows (Non Transformed) |
Sup.Cols |
Supplementary columns (Non Transformed) |
Scaled_Data |
Transformed Data |
Scaled_Sup.Rows |
Supplementary rows (Transformed) |
Scaled_Sup.Cols |
Supplementary columns (Transformed) |
n |
Number of Rows |
p |
Number of Columns |
nrowsSup |
Number of Supplementary Rows |
ncolsSup |
Number of Supplementary Columns |
dim |
Dimension of the Biplot |
EigenValues |
Eigenvalues |
Inertia |
Explained variance (Inertia) |
CumInertia |
Cumulative Explained variance (Inertia) |
EV |
EigenVectors |
Structure |
Correlations of the Principal Components and the Variables |
RowCoordinates |
Coordinates for the rows, including the supplementary |
ColCoordinates |
Coordinates for the columns, including the supplementary |
RowContributions |
Contributions for the rows, including the supplementary |
ColContributions |
Contributions for the columns, including the supplementary |
Scale_Factor |
Scale factor for the traditional plot with points and arrows. The row coordinates are multiplied and the column coordinates divided by that scale factor. The look of the plot is better without changing the inner product. For the HJ-Biplot the scale factor is 1. |
Author(s)
Jose Luis Vicente Villardon
References
Galindo Villardon, M. (1986). Una alternativa de representacion simultanea: HJ-Biplot. Questiio, 10(1).
Examples
## Simple Biplot with arrows
data(Protein)
bip=HJ.Biplot(Protein[,3:11])
plot(bip)
Gauss-Hermite quadrature
Description
Find the Gauss-Hermite abscissae and weights.
Usage
Hermquad(N)
Arguments
N |
Number of nodes of the quadrature |
Details
Find the Gauss-Hermite abscissae and weights.
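One common way to obtain such nodes and weights is the Golub-Welsch eigenvalue method, sketched below in base R. The sketch assumes the weight function exp(-x^2); the helper gauss_hermite is hypothetical and is not claimed to reproduce the algorithm used in Hermquad.

```r
gauss_hermite <- function(N) {
  # Symmetric tridiagonal Jacobi matrix of the Hermite recurrence
  i <- seq_len(N - 1)
  J <- matrix(0, N, N)
  J[cbind(i, i + 1)] <- sqrt(i / 2)
  J[cbind(i + 1, i)] <- sqrt(i / 2)
  e <- eigen(J, symmetric = TRUE)
  ord <- order(e$values)
  list(X = e$values[ord],                        # abscissae (nodes)
       W = sqrt(pi) * e$vectors[1, ord]^2)       # weights
}

gh <- gauss_hermite(10)
sum(gh$W)             # integral of exp(-x^2), i.e. sqrt(pi)
sum(gh$W * gh$X^2)    # integral of x^2 * exp(-x^2), i.e. sqrt(pi) / 2
```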
Value
X |
A column vector containing the abscissae. |
W |
A vector containing the corresponding weights. |
Author(s)
Jose Luis Vicente Villardon (translated from a Matlab function by Greg von Winckel)
References
Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (1992). Numerical Recipes in C: The Art of Scientific Computing. New York: Cambridge University Press, 636-9.
http://www.mathworks.com/matlabcentral/fileexchange/8836-hermite-quadrature/content/hermquad.m
Examples
Hermquad(10)
Panel of histograms
Description
Panel of histograms for a set of numerical variables.
Usage
HistogramPanel(X, nrows = NULL, separated = FALSE, ...)
Arguments
X |
The matrix of continuous variables |
nrows |
Number of rows of the panel. |
separated |
Should the plots be drawn separately instead of being organized into a panel? |
... |
Additional graphical arguments |
Details
Panel of histograms for a set of numerical variables.
Value
The histogram panel.
Author(s)
Jose Luis Vicente Villardon
Examples
data(wine)
HistogramPanel(wine[,4:7], nrows = 2, xlab="")
Checks if a point is inside a box.
Description
Checks if a point is inside a box. The point is specified by its x and y coordinates and the box by the minimum and maximum values on both coordinate axes: xmin, xmax, ymin, ymax. The vertices of the box are then (xmin, ymin), (xmax, ymin), (xmax, ymax) and (xmin, ymax).
Usage
InBox(x, y, xmin, xmax, ymin, ymax)
Arguments
x |
x coordinate of the point |
y |
y coordinate of the point |
xmin |
minimum value of X |
xmax |
maximum value of X |
ymin |
minimum value of Y |
ymax |
maximum value of Y |
Value
Returns a logical value : TRUE if the point is inside the box and FALSE otherwise.
Author(s)
Jose Luis Vicente Villardon
Examples
InBox(0, 0, -1, 1, -1, 1)
Initial transformation of data
Description
Initial transformation of data before the construction of a biplot (or any other technique).
Usage
InitialTransform(X, sup.rows = NULL, sup.cols = NULL,
InitTransform = "None", transform = "Standardize columns",
grouping = NULL)
Arguments
X |
Original Raw Data Matrix |
sup.rows |
Supplementary or illustrative rows. |
sup.cols |
Supplementary or illustrative columns. |
InitTransform |
Previous transformation to use: none or log. |
transform |
Transformation to use. See details. |
grouping |
Factor used to standardize with the within-groups variability |
Details
Possible Transformations are:
1.- "Raw Data": When no transformation is required.
2.- "Substract the global mean": Eliminate an effect common to all the observations
3.- "Double centering" : Interaction residuals. When all the elements of the table are comparable. Useful for AMMI models.
4.- "Column centering": Remove the column means.
5.- "Standardize columns": Remove the column means and divide by its standard deviation.
6.- "Row centering": Remove the row means.
7.- "Standardize rows": Divide each row by its standard deviation.
8.- "Divide by the column means and center": The resulting dispersion is the coefficient of variation.
9.- "Normalized residuals from independence" for a contingency table.
The transformation can be provided to the function by using the string between the quotes or just the associated number.
The supplementary rows and columns are not used to calculate the parameters (means, standard deviations, etc). Some of the transformations are not compatible with supplementary data.
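A few of these transformations can be written in base R as follows. The sketch is only meant to show what each transformation does, including the point that supplementary rows are transformed with parameters estimated from the active rows; the split into active and supplementary rows is made up for the example.

```r
X <- as.matrix(iris[1:140, 1:4])       # active rows
sup <- as.matrix(iris[141:150, 1:4])   # hypothetical supplementary rows

means <- colMeans(X)
sds <- apply(X, 2, sd)

# 4.- Column centering and 5.- Standardize columns
col_centered <- sweep(X, 2, means)
standardized <- sweep(col_centered, 2, sds, "/")

# 3.- Double centering: remove row and column effects, add back the global mean
double_centered <- sweep(sweep(X, 1, rowMeans(X)), 2, means) + mean(X)

# Supplementary rows use the means and deviations of the active rows
sup_standardized <- sweep(sweep(sup, 2, means), 2, sds, "/")

round(colMeans(standardized), 10)
apply(standardized, 2, sd)
```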
Value
A list with the following components
X |
Transformed data matrix |
sup.rows |
Transformed supplementary rows |
sup.cols |
Transformed supplementary columns |
Author(s)
Jose Luis Vicente Villardon
References
M. J. Baxter (1995) Standardization and Transformation in Principal Component Analysis, with Applications to Archaeometry. Journal of the Royal Statistical Society. Series C (Applied Statistics). Vol. 44, No. 4 (1995) , pp. 513-527
Kroonenberg, P. M. (1983). Three-mode principal component analysis: Theory and applications (Vol. 2). DSWO press. (Chapter 6)
Examples
data(iris)
x=as.matrix(iris[,1:4])
x=InitialTransform(x, transform=4)
x
Transforms an Integer Variable into a Binary Variable
Description
Transforms an Integer Variable into a Binary Variable
Usage
Integer2Binary(y, name = "My_Factor")
Arguments
y |
Integer vector to transform |
name |
name of the factor |
Details
Transforms an Integer vector into a Binary Indicator Matrix
Value
A Binary Data Matrix
Author(s)
Jose Luis Vicente-Villardon
Examples
dat=c(1, 2, 2, 4, 1, 1, 4, 2, 4)
Integer2Binary(dat,"Myfactor")
Kruskal Wallis Tests
Description
Kruskal Wallis Tests for a matrix of continuous variables and a grouping factor.
Usage
Kruskal.Wallis.Tests(X, groups, posthoc = "none", alternative = "two.sided", digits = 4)
Arguments
X |
The matrix of continuous variables |
groups |
The factor with the groups |
posthoc |
Method used for multiple comparisons in the Dunn test |
alternative |
Kind of alternative hypothesis |
digits |
Number of digits for the output |
Details
Kruskal Wallis Tests for a matrix of continuous variables and a grouping factor, including the Dunn test for multiple comparisons.
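The core of the procedure, one Kruskal-Wallis test per column, can be sketched with stats::kruskal.test. The sketch uses the iris data and leaves out the Dunn post-hoc comparisons.

```r
X <- iris[, 1:4]
groups <- iris$Species

# One test per variable, collected in a matrix
kw <- t(sapply(X, function(x) {
  test <- kruskal.test(x, groups)
  c(statistic = unname(test$statistic),
    df = unname(test$parameter),
    p.value = test$p.value)
}))
round(kw, 4)
```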
Value
The organized output.
Author(s)
Jose Luis Vicente Villardon
Examples
data(wine)
Kruskal.Wallis.Tests(wine[,4:7], wine$Group, posthoc = "bonferroni")
Levene Tests
Description
Levene Tests for a matrix of continuous variables and a grouping factor.
Usage
Levene.Tests(X, groups = NULL)
Arguments
X |
The matrix of continuous variables |
groups |
The factor with the groups |
Details
Levene Tests for a matrix of continuous variables and a grouping factor.
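As an illustration, Levene's test for a single variable amounts to a one-way analysis of variance on the absolute deviations from the group centres. The helper below is a hypothetical base R sketch; centring on the median is an assumption and may not match the choice made in Levene.Tests.

```r
levene <- function(x, groups, center = median) {
  # Absolute deviation of each value from the centre of its group
  z <- abs(x - ave(x, groups, FUN = center))
  anova(lm(z ~ groups))
}

res <- lapply(iris[, 1:4], levene, groups = iris$Species)
res$Sepal.Length
```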
Value
The organized output
Author(s)
Jose Luis Vicente Villardon
Examples
data(wine)
Levene.Tests(wine[,4:7], wine$Group)
Weighted Biplot for a table of frequencies
Description
Biplot for the logarithms of the frequencies of a contingency table using the frequencies as weights.
Usage
LogFrequencyBiplot(x, Scaling = 2, logoffset = 1, freqoffset = logoffset, ...)
Arguments
x |
The frequency table to be biplotted |
Scaling |
Transformation of the matrix after the logarithms |
logoffset |
Constant to add to the frequencies before calculating the logarithms. This is to avoid calculating the logarithm of zero, so a convenient value for this argument is 1. |
freqoffset |
Constant to add to the frequencies before calculating the weights. This is usually the same as the offset added to the frequencies before taking logarithms, but it may be different when we do not want the zero frequencies to influence the biplot, i.e., when we want zero weights. |
... |
Any other parameter for the CrissCross procedure. |
Details
Biplot for the logarithms of the frequencies of a contingency table using the frequencies as weights.
Value
An object of class .Biplot" with the following components:
Title |
A general title |
Non_Scaled_Data |
Original Data Matrix |
Means |
Means of the original Variables |
Medians |
Medians of the original Variables |
Deviations |
Standard Deviations of the original Variables |
Minima |
Minima of the original Variables |
Maxima |
Maxima of the original Variables |
P25 |
25 Percentile of the original Variables |
P75 |
75 Percentile of the original Variables |
Gmean |
Global mean of the complete matrix |
Sup.Rows |
Supplementary rows (Non Transformed) |
Sup.Cols |
Supplementary columns (Non Transformed) |
Scaled_Data |
Transformed Data |
Scaled_Sup.Rows |
Supplementary rows (Transformed) |
Scaled_Sup.Cols |
Supplementary columns (Transformed) |
n |
Number of Rows |
p |
Number of Columns |
nrowsSup |
Number of Supplementary Rows |
ncolsSup |
Number of Supplementary Columns |
dim |
Dimension of the Biplot |
EigenValues |
Eigenvalues |
Inertia |
Explained variance (Inertia) |
CumInertia |
Cumulative Explained variance (Inertia) |
EV |
EigenVectors |
Structure |
Correlations of the Principal Components and the Variables |
RowCoordinates |
Coordinates for the rows, including the supplementary |
ColCoordinates |
Coordinates for the columns, including the supplementary |
RowContributions |
Contributions for the rows, including the supplementary |
ColContributions |
Contributions for the columns, including the supplementary |
Scale_Factor |
Scale factor for the traditional plot with points and arrows. The row coordinates are multiplied and the column coordinates divided by that scale factor. The look of the plot is better without changing the inner product. For the HJ-Biplot the scale factor is 1. |
Author(s)
Jose Luis Vicente Villardon
References
Gabriel, K. R., Galindo, M. P. & Vicente-Villardon, J. L. (1995) Use of Biplots to Diagnose Independence Models in Three-Way Contingency Tables. In: M. Greenacre & J. Blasius, eds. Visualization of Categorical Data. Academic Press. London.
Gabriel, K. R. and Zamir, S. (1979). Lower rank approximation of matrices by least squares with any choice of weights. Technometrics, 21: 489-498.
See Also
CrissCross
Examples
data(smoking)
logbip=LogFrequencyBiplot(smoking, Scaling=1, logoffset=0, freqoffset=0)
Multidimensional Scaling
Description
Multidimensional Scaling using the SMACOF algorithm and bootstrapping the coordinates.
Usage
MDS(Proximities, W = NULL, Model = c("Identity", "Ratio", "Interval", "Ordinal"),
dimsol = 2, maxiter = 100, maxerror = 1e-06, Bootstrap = FALSE, nB = 200,
ProcrustesRot = TRUE, BootstrapMethod = c("Sampling", "Permutation"),
StandardizeDisparities = FALSE, ShowIter = FALSE)
Arguments
Proximities |
An object of class proximities |
W |
A matrix of weights |
Model |
MDS model. "Identity", "Ratio", "Interval" or "Ordinal". |
dimsol |
Dimension of the solution |
maxiter |
Maximum number of iterations of the algorithm |
maxerror |
Tolerance for convergence of the algorithm |
Bootstrap |
Should bootstrapping be performed? |
nB |
Number of Bootstrap samples. |
ProcrustesRot |
Should the bootstrap replicates be rotated to match the initial configuration using Procrustes? |
BootstrapMethod |
Should the bootstrap be performed by sampling or by permuting the residuals? |
StandardizeDisparities |
Should the disparities be standardized? |
ShowIter |
Should the iteration process be shown? |
Details
Multidimensional scaling using the SMACOF algorithm, with optional bootstrapping of the coordinates. MDS performs multidimensional scaling of proximity data to find a least-squares representation of the objects in a low-dimensional space. A majorization algorithm guarantees monotone convergence for optionally transformed, metric and nonmetric data under a variety of models.
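The majorization step can be illustrated with a minimal sketch. It covers only the "Identity" model with unit weights, and `smacof_sketch` is a hypothetical name rather than a function of the package; the actual MDS function may differ in its starting configuration, weighting and stopping rule.

```r
# Minimal SMACOF sketch for the "Identity" model (disparities = dissimilarities),
# with unit weights. smacof_sketch is a hypothetical name, not a package function.
smacof_sketch <- function(D, dimsol = 2, maxiter = 100, maxerror = 1e-06) {
  D <- as.matrix(D)
  n <- nrow(D)
  X <- cmdscale(D, k = dimsol)            # classical scaling as a starting point
  stress <- function(X) sum((D - as.matrix(dist(X)))[upper.tri(D)]^2)
  history <- stress(X)
  for (iter in seq_len(maxiter)) {
    d <- as.matrix(dist(X))
    B <- -D / ifelse(d > 0, d, Inf)       # off-diagonal terms of the B matrix
    diag(B) <- 0
    diag(B) <- -rowSums(B)                # each row of B sums to zero
    X <- (B %*% X) / n                    # Guttman transform (unit weights)
    history <- c(history, stress(X))
    if (history[iter] - history[iter + 1] < maxerror) break
  }
  list(X = X, stress = history)
}

sol <- smacof_sketch(dist(scale(USArrests)))
```

The vector `sol$stress` holds the raw stress after each update; with this update rule it should never increase, which is the monotone convergence mentioned above.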
Value
An object of class Principal.Coordinates and MDS. The function adds the information of the MDS to the object of class proximities. Together with the information about the proximities, the object has:
Analysis |
The type of analysis performed, "MDS" in this case |
Model |
MDS model used |
RowCoordinates |
Coordinates for the objects in the MDS procedure |
RawStress |
Raw Stress values |
stress1 |
stress formula 1 |
stress2 |
stress formula 2 |
sstress1 |
sstress formula 1 |
sstress2 |
sstress formula 2 |
rsq |
Squared correlation between disparities and distances |
Spearman |
Spearman correlation between disparities and distances |
Kendall |
Kendall correlation between disparities and distances |
BootstrapInfo |
The result of the bootstrap calculations |
Author(s)
Jose Luis Vicente Villardon
References
Commandeur, J. J. F. and Heiser, W. J. (1993). Mathematical derivations in the proximity scaling (PROXSCAL) of symmetric data matrices (Tech. Rep. No. RR-93-03). Leiden, The Netherlands: Department of Data Theory, Leiden University.
Kruskal, J. B. (1964). Nonmetric multidimensional scaling: A numerical method. Psychometrika, 29, 115-129.
De Leeuw, J. & Mair, P. (2009). Multidimensional scaling using majorization: The R package smacof. Journal of Statistical Software, 31(3), 1-30, http://www.jstatsoft.org/v31/i03/
Borg, I., & Groenen, P. J. F. (2005). Modern Multidimensional Scaling (2nd ed.). Springer.
Borg, I., Groenen, P. J. F., & Mair, P. (2013). Applied Multidimensional Scaling. Springer.
Groenen, P. J. F., Heiser, W. J. and Meulman, J. J. (1999). Global optimization in least squares multidimensional scaling by distance smoothing. Journal of Classification, 16, 225-254.
Groenen, P. J. F., van Os, B. and Meulman, J. J. (2000). Optimal scaling by alternating length-constrained nonnegative least squares, with application to distance-based analysis. Psychometrika, 65, 511-524.
See Also
Examples
data(spiders)
Dis=BinaryProximities(spiders)
MDSSol=MDS(Dis, Bootstrap=FALSE)
plot(MDSSol)
Mixture Gaussian Clustering
Description
Model-based clustering using mixtures of Gaussian distributions.
Usage
MGC(x, NG = 2, init = "km", RemoveOutliers=FALSE, ConfidOutliers=0.995,
tolerance = 1e-07, maxiter = 100, show=TRUE, ...)
Arguments
x |
The data matrix |
NG |
Number of groups or clusters to obtain |
init |
Initial centers can be obtained from k-means ("km") or at random ("rd") |
RemoveOutliers |
Should the extreme values be removed to calculate the clusters? |
ConfidOutliers |
Percentage of the points to keep for the calculations when RemoveOutliers is true. |
tolerance |
Tolerance for convergence |
maxiter |
Maximum number of iterations |
show |
Should the likelihood at each iteration be shown? |
... |
Any other parameter that can affect k-means if that is the initial configuration |
Details
A basic algorithm for clustering with mixtures of Gaussians, with no restrictions on the covariance matrices.
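As an illustration of this kind of algorithm, the following sketch runs an EM iteration for a Gaussian mixture with unrestricted covariance matrices, started from k-means. The names `dmvn` and `em_gmm_sketch` are hypothetical, the sketch has no outlier removal, and it is not claimed to reproduce the internals of MGC.

```r
# Sketch of EM for a Gaussian mixture with unrestricted covariance matrices.
# dmvn and em_gmm_sketch are hypothetical names, not package functions.
dmvn <- function(x, mu, S) {
  p <- ncol(x)
  exp(-0.5 * mahalanobis(x, mu, S)) / sqrt((2 * pi)^p * det(S))
}

em_gmm_sketch <- function(x, NG = 2, tolerance = 1e-07, maxiter = 100) {
  x <- as.matrix(x)
  n <- nrow(x)
  cl <- kmeans(x, NG, nstart = 10)$cluster            # k-means start
  prop <- as.numeric(table(cl)) / n
  mu <- lapply(1:NG, function(g) colMeans(x[cl == g, , drop = FALSE]))
  S <- lapply(1:NG, function(g) cov(x[cl == g, , drop = FALSE]))
  loglik <- -Inf
  for (iter in seq_len(maxiter)) {
    # E-step: posterior probability of each group for each observation
    dens <- sapply(1:NG, function(g) prop[g] * dmvn(x, mu[[g]], S[[g]]))
    newloglik <- sum(log(rowSums(dens)))
    post <- dens / rowSums(dens)
    # M-step: weighted proportions, means and covariance matrices
    for (g in 1:NG) {
      w <- post[, g]
      prop[g] <- mean(w)
      mu[[g]] <- colSums(w * x) / sum(w)
      S[[g]] <- cov.wt(x, wt = w / sum(w), method = "ML")$cov
    }
    if (abs(newloglik - loglik) < tolerance) break
    loglik <- newloglik
  }
  list(Classification = max.col(post), loglik = newloglik)
}

set.seed(1)
fit <- em_gmm_sketch(iris[, 1:4], NG = 3)
table(iris[, 5], fit$Classification)
```

Each observation is assigned to the group with the highest posterior probability; the group labels themselves are arbitrary and depend on the k-means start.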
Value
Clusters
Author(s)
Jose Luis Vicente Villardon
References
Not yet provided.
Examples
X=as.matrix(iris[,1:4])
mod1=MGC(X,NG=3)
plot(iris[,1:4], col=mod1$Classification)
table(iris[,5],mod1$Classification)
Matrix to Proximities
Description
Converts a matrix of proximities into a Proximities object as used in Principal Coordinates or MDS
Usage
Matrix2Proximities(x, TypeData = "User Provided",
Type = c("dissimilarity", "similarity", "products"),
Coefficient = "None", Transformation = "None", Data = NULL)
Arguments
x |
The matrix of proximities (a symmetrical matrix) |
TypeData |
Type of data. The default is "User Provided", but it can be any type. |
Type |
Type of proximity: dissimilarity, similarity or scalar product. If not provided, the default is dissimilarity |
Coefficient |
Name of the procedure to calculate the proximities (if any). |
Transformation |
Transformation used to calculate dissimilarities from similarities (if any) |
Data |
Raw data used to calculate the proximity (if any). |
Details
Converts a matrix of proximities into a Proximities object as used in Principal Coordinates or MDS, adding some extra information about the procedure used to obtain the proximities. It is mainly used when the proximities matrix has been provided by the user and not calculated from raw data using BinaryProximities, ContinuousDistances or any other function.
Value
An object of class Proximities containing the proximities matrix and some extra information about it.
Author(s)
Jose Luis Vicente Villardon
Weighted Isotonic Regression (Weighted Monotone Regression)
Description
Performs weighted isotonic (monotone) regression using the non-negative weights in w. The function is a direct translation of the MATLAB function lsqisotonic.
Usage
MonotoneRegression(x, y, w = NULL)
Arguments
x |
The independent variable vector |
y |
The dependent variable vector |
w |
A vector of weights |
Details
YHAT = MonotoneRegression(X,Y) returns a vector of values that minimize the sum of squares (Y - YHAT).^2 under the monotonicity constraint that X(I) > X(J) => YHAT(I) >= YHAT(J), i.e., the values in YHAT are monotonically non-decreasing with respect to X (sometimes referred to as "weak monotonicity"). LSQISOTONIC uses the "pool adjacent violators" algorithm.
If X(I) == X(J), then YHAT(I) may be <, ==, or > YHAT(J) (sometimes referred to as the "primary approach"). If ties do occur in X, a plot of YHAT vs. X may appear to be non-monotonic at those points. In fact, the above monotonicity constraint is not violated, and a reordering within each group of ties, by ascending YHAT, will produce the desired appearance in the plot.
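The "pool adjacent violators" idea can be sketched in a few lines of R. The function `pava_sketch` below is a hypothetical illustration, not the package code: it sorts by x, keeps tied x values in their sorted order without any special treatment, and pools neighbouring blocks into their weighted mean whenever they violate monotonicity.

```r
# Sketch of weighted pool-adjacent-violators (PAVA).
# pava_sketch is a hypothetical name; ties in x are kept in their sorted order.
pava_sketch <- function(x, y, w = rep(1, length(y))) {
  ord <- order(x)
  val <- y[ord]; wt <- w[ord]; size <- rep(1, length(y))
  i <- 1
  while (i < length(val)) {
    if (val[i] > val[i + 1]) {
      # pool the two violating blocks into their weighted mean
      val[i] <- (wt[i] * val[i] + wt[i + 1] * val[i + 1]) / (wt[i] + wt[i + 1])
      wt[i] <- wt[i] + wt[i + 1]
      size[i] <- size[i] + size[i + 1]
      val <- val[-(i + 1)]; wt <- wt[-(i + 1)]; size <- size[-(i + 1)]
      if (i > 1) i <- i - 1               # re-check the previous block
    } else {
      i <- i + 1
    }
  }
  yhat <- numeric(length(y))
  yhat[ord] <- rep(val, times = size)     # expand blocks back to observations
  yhat
}

x <- 1:8
y <- c(1, 3, 2, 4, 3.5, 5, 7, 6)
pava_sketch(x, y)
```

With unit weights and distinct x values the result can be compared with `isoreg(x, y)$yf` from base R, which fits the unweighted version of the same problem.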
Value
The fitted values after the monotone regression
Note
The function is a direct translation of the MATLAB function lsqisotonic.
Author(s)
Jose L. Vicente Villardon (from a MATLAB function)
References
Kruskal, J.B. (1964) "Nonmetric multidimensional scaling: a numerical method", Psychometrika 29:115-129.
Cox, T.F. and Cox, M.A.A. (1994) Multidimensional Scaling, Chapman & Hall.
Examples
## Used inside MDS
Statistics for multiple tables
Description
Statistics for multiple tables
Usage
MultiTableStatistics(X, dual = FALSE)
Arguments
X |
A multiple table |
dual |
Is the transformation for the dual versions? |
Details
Statistics for multiple tables
Value
A list with vectors of statistics for each table
Author(s)
Jose Luis Vicente Villardon
Examples
##---- Should be DIRECTLY executable !! ----
Initial Transformation of a multi table object
Description
Initial Transformation of a multi table object
Usage
MultiTableTransform(X, InitTransform = "Standardize columns", dual = FALSE,
CommonSD = TRUE)
Arguments
X |
Multi-table object |
InitTransform |
Initial transformation |
dual |
Is the transformation for the dual versions? |
CommonSD |
Should a common standard deviation be used for all the groups? |
Details
Initial Transformation of a multi table object
Value
The transformed table
Author(s)
Jose Luis Vicente Villardon
Multidimensional Gauss-Hermite quadrature
Description
Multidimensional Gauss-Hermite quadrature
Usage
Multiquad(nnodes, dims)
Arguments
nnodes |
Number of nodes of the quadrature |
dims |
Dimension of the solution |
Details
Multidimensional Gauss-Hermite quadrature
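One common way to build such a rule is to compute a one-dimensional Gauss-Hermite rule and take the product grid over all dimensions. The sketch below does this with the Golub-Welsch eigenvalue method for the weight function exp(-x^2); `gauss_hermite` and `multiquad_sketch` are hypothetical names, and the structure and scaling of the object returned by Multiquad may be different.

```r
# Sketch: one-dimensional Gauss-Hermite rule (Golub-Welsch), then a product grid.
# gauss_hermite and multiquad_sketch are hypothetical names, not package functions.
gauss_hermite <- function(nnodes) {
  k <- seq_len(nnodes - 1)
  J <- matrix(0, nnodes, nnodes)
  J[cbind(k, k + 1)] <- sqrt(k / 2)       # Jacobi matrix for weight exp(-x^2)
  J[cbind(k + 1, k)] <- sqrt(k / 2)
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = sqrt(pi) * e$vectors[1, ]^2)
}

multiquad_sketch <- function(nnodes, dims) {
  gh <- gauss_hermite(nnodes)
  idx <- as.matrix(expand.grid(rep(list(seq_len(nnodes)), dims)))
  X <- matrix(gh$nodes[idx], ncol = dims)             # one row per grid point
  A <- apply(matrix(gh$weights[idx], ncol = dims), 1, prod)
  list(X = X, A = A)
}

q <- multiquad_sketch(5, 3)
sum(q$A)                     # integral of exp(-sum(x^2)) over R^3, i.e. pi^(3/2)
```

The grid has nnodes^dims points, so the cost grows quickly with the dimension; this is why the number of nodes is usually kept small for more than two or three dimensions.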
Value
Multidimensional Gauss-Hermite quadrature
Author(s)
Jose Luis Vicente Villardon
References
Jackel, P. (2005). A note on multivariate Gauss-Hermite quadrature. http://www.awdz65.dsl.pipex.com/ANoteOnMultivariateGaussHermiteQuadrature.pdf
Examples
Multiquad(5, 3)
Biplot using the NIPALS algorithm
Description
Biplot using the NIPALS algorithm including a truncated and a sparse version.
Usage
NIPALS.Biplot(X, alpha = 1, dimension = 3, Scaling = 5,
Type = "Regular", grouping = NULL, ...)
Arguments
X |
The data matrix |
alpha |
A number between 0 and 1. 0 for GH-Biplot, 1 for JK-Biplot and 0.5 for SQRT-Biplot. Use 2 or any other value not in the interval [0,1] for HJ-Biplot. |
dimension |
Dimension of the solution |
Scaling |
Transformation of the original data. See InitialTransform for available transformations. |
Type |
Type of biplot (Regular, Truncated or Sparse) |
grouping |
Grouping factor when the scaling is made with the within groups variability |
... |
Additional arguments for the different types of biplots. |
Details
Biplot using the NIPALS algorithm including a truncated and a sparse version.
Value
An object of class ContinuousBiplot with the following components:
Title |
A general title |
Type |
NIPALS |
call |
call |
Non_Scaled_Data |
Original Data Matrix |
Means |
Means of the original Variables |
Medians |
Medians of the original Variables |
Deviations |
Standard Deviations of the original Variables |
Minima |
Minima of the original Variables |
Maxima |
Maxima of the original Variables |
P25 |
25th Percentile of the original Variables |
P75 |
75th Percentile of the original Variables |
Gmean |
Global mean of the complete matrix |
Sup.Rows |
Supplementary rows (Non Transformed) |
Sup.Cols |
Supplementary columns (Non Transformed) |
Scaled_Data |
Transformed Data |
Scaled_Sup.Rows |
Supplementary rows (Transformed) |
Scaled_Sup.Cols |
Supplementary columns (Transformed) |
n |
Number of Rows |
p |
Number of Columns |
nrowsSup |
Number of Supplementary Rows |
ncolsSup |
Number of Supplementary Columns |
dim |
Dimension of the Biplot |
EigenValues |
Eigenvalues |
Inertia |
Explained variance (Inertia) |
CumInertia |
Cumulative Explained variance (Inertia) |
EV |
EigenVectors |
Structure |
Correlations of the Principal Components and the Variables |
RowCoordinates |
Coordinates for the rows, including the supplementary |
ColCoordinates |
Coordinates for the columns, including the supplementary |
RowContributions |
Contributions for the rows, including the supplementary |
ColContributions |
Contributions for the columns, including the supplementary |
Scale_Factor |
Scale factor for the traditional plot with points and arrows. The row coordinates are multiplied and the column coordinates divided by that scale factor. The look of the plot is better without changing the inner product. For the HJ-Biplot the scale factor is 1. |
Author(s)
Jose Luis Vicente Villardon
References
Wold, H. (1966). Estimation of principal components and related models by iterative least squares. Multivariate Analysis. Academic Press. 391-420.
Examples
bip1=NIPALS.Biplot(wine[,4:21], Type="Sparse", lambda=0.15)
plot(bip1)
NIPALS algorithm for PCA
Description
Classical NIPALS algorithm for PCA and Biplot.
Usage
NIPALSPCA(X, dimens = 2, tol = 1e-06, maxiter = 1000)
Arguments
X |
The data matrix. |
dimens |
The dimension of the solution |
tol |
Tolerance of the algorithm. |
maxiter |
Maximum number of iterations. |
Details
Classical NIPALS algorithm for the singular value decomposition that allows for the construction of PCA and Biplot.
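The classical iteration alternates two regressions and deflates the matrix after each dimension. The following sketch shows the idea under the assumption that X is already centred (and scaled if required); `nipals_sketch` is a hypothetical name and the starting vector and convergence rule are illustrative choices.

```r
# Sketch of the classical NIPALS iteration with deflation.
# nipals_sketch is a hypothetical name; X is assumed to be already centred/scaled.
nipals_sketch <- function(X, dimens = 2, tol = 1e-06, maxiter = 1000) {
  X <- as.matrix(X)
  u <- matrix(0, nrow(X), dimens)
  v <- matrix(0, ncol(X), dimens)
  d <- numeric(dimens)
  for (k in seq_len(dimens)) {
    sc <- X[, which.max(colSums(X^2))]    # start from the column with largest norm
    for (iter in seq_len(maxiter)) {
      ld <- crossprod(X, sc)              # regress columns on the current scores
      ld <- ld / sqrt(sum(ld^2))
      scnew <- X %*% ld                   # regress rows on the current loadings
      converged <- sum((scnew - sc)^2) < tol * sum(scnew^2)
      sc <- scnew
      if (converged) break
    }
    d[k] <- sqrt(sum(sc^2))
    u[, k] <- sc / d[k]
    v[, k] <- ld
    X <- X - tcrossprod(sc, ld)           # deflate before the next dimension
  }
  list(u = u, d = d, v = v)
}

X <- scale(as.matrix(iris[, 1:4]))
res <- nipals_sketch(X, dimens = 2)
```

Up to the sign of each dimension and the convergence tolerance, `res$d` should agree with the first singular values returned by `svd(X)`.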
Value
The singular value decomposition
u |
The coordinates of the rows (standardized) |
d |
The singular values |
v |
The coordinates of the columns (standardized) |
Author(s)
Jose Luis Vicente Villardon
References
Wold, H. (1966). Estimation of principal components and related models by iterative least squares. Multivariate Analysis. Academic Press. 391-420.
Examples
# Not yet
Nice numbers: simple decimal numbers
Description
Calculates a close nice number, i.e., a number with simple decimals.
Usage
NiceNumber(x = 6, round = TRUE)
Arguments
x |
A number |
round |
Should the number be rounded? |
Details
Calculates a close nice number, i.e., a number with simple decimals.
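The rule described by Heckbert (1990) picks the closest value of the form 1, 2 or 5 times a power of ten. A minimal sketch for positive numbers follows; `nice_number_sketch` is a hypothetical name and NiceNumber itself may handle edge cases differently.

```r
# Sketch of Heckbert's "nice number" rule; nice_number_sketch is a hypothetical name.
nice_number_sketch <- function(x, round = TRUE) {
  expo <- floor(log10(x))                 # power of ten just below x
  frac <- x / 10^expo                     # fraction in [1, 10)
  if (round) {
    nice <- if (frac < 1.5) 1 else if (frac < 3) 2 else if (frac < 7) 5 else 10
  } else {
    nice <- if (frac <= 1) 1 else if (frac <= 2) 2 else if (frac <= 5) 5 else 10
  }
  nice * 10^expo
}

nice_number_sketch(0.892345)   # fraction 8.92 rounds up to 10, giving 1
nice_number_sketch(37, round = FALSE)
```

With `round = FALSE` the function returns the smallest nice number that is not below x, which is the variant used when choosing the range of an axis.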
Value
A number with simple decimals
Author(s)
Jose Luis Vicente Villardon
References
Heckbert, P. S. (1990). Nice numbers for graph labels. In Graphics Gems (pp. 61-63). Academic Press Professional, Inc.
See Also
Examples
NiceNumber(0.892345)
Distances among individuals with nominal variables
Description
This function computes several measures of distance (or similarity) among individuals from a nominal data matrix.
Usage
NominalDistances(X, method = 1, diag = FALSE, upper = FALSE, similarity = TRUE)
Arguments
X |
Matrix or data.frame with the nominal variables. |
method |
An integer between 1 and 6. See details |
diag |
A logical value indicating whether the diagonal of the distance matrix should be printed. |
upper |
a logical value indicating whether the upper triangle of the distance matrix should be printed. |
similarity |
A logical value indicating whether the similarity matrix should be computed. |
Details
Let X be the table of nominal data. All these distances are of type d=\sqrt{1-s}, with s a similarity coefficient.
- 1 = Overlap method
The overlap measure simply counts the number of attributes that match in the two data instances.
- 2 = Eskin
Eskin et al. proposed a normalization kernel for record-based network intrusion detection data. The original measure is distance-based and assigns a weight of \frac{2}{n_{k}^{2}} for mismatches, where n_{k} is the number of values taken by attribute k; when adapted to similarity, this becomes a weight of \frac{n_{k}^{2}}{n_{k}^{2}+2}. This measure gives more weight to mismatches that occur on attributes that take many values.
- 3 = IOF (Inverse Occurrence Frequency)
This measure assigns lower similarity to mismatches on more frequent values. The IOF measure is related to the concept of inverse document frequency, which comes from information retrieval, where it is used to signify the relative number of documents that contain a specific word.
- 4 = OF (Occurrence Frequency)
This measure gives the opposite weighting of the IOF measure for mismatches, i.e., mismatches on less frequent values are assigned lower similarity and mismatches on more frequent values are assigned higher similarity.
- 5 = Goodall3
This measure assigns a high similarity if the matching values are infrequent, regardless of the frequencies of the other values.
- 6 = Lin
This measure gives higher weight to matches on frequent values, and lower weight to mismatches on infrequent values.
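As a small illustration of the first method in the list, the sketch below computes the overlap similarity as the proportion of matching attributes and then the distance d=\sqrt{1-s}. The normalization by the number of attributes, the function name `overlap_sketch` and the toy data are assumptions made for the example, not taken from the package.

```r
# Sketch of the overlap similarity (method 1) and the derived distance sqrt(1 - s).
# overlap_sketch is a hypothetical name, not a function of the package.
overlap_sketch <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  S <- matrix(1, n, n, dimnames = list(rownames(X), rownames(X)))
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # proportion of attributes on which individuals i and j agree
      S[i, j] <- S[j, i] <- mean(X[i, ] == X[j, ])
    }
  }
  list(similarity = S, distance = sqrt(1 - S))
}

# Toy data invented for the example
toy <- data.frame(soil  = c("clay", "sand", "clay"),
                  cover = c("high", "high", "low"),
                  water = c("yes", "yes", "no"))
overlap_sketch(toy)$distance
```

The other methods in the list differ only in how each match or mismatch is weighted before averaging over the attributes.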
Value
An object of class distance
Author(s)
Jose L. Vicente-Villardon
References
Boriah, S., Chandola, V. & Kumar, V. (2008). Similarity measures for categorical data: A comparative evaluation. In Proceedings of the Eighth SIAM International Conference on Data Mining, pp. 243–254.
See Also
BinaryDistances, ContinuousDistances
Examples
## Not run:
data(Env)
Distance<-NominalDistances(Env,upper=TRUE,diag=TRUE,similarity=FALSE,method=1)
## End(Not run)
Normality tests
Description
Normality tests for the columns of a matrix and a grouping variable.
Usage
NormalityTests(X, groups = NULL, plot = FALSE, SortByGroups = FALSE)
Arguments
X |
A data frame or a matrix containing several numerical variables |
groups |
A factor with the groups |
plot |
If TRUE the qqnorm plots are shown |
SortByGroups |
Should the results be sorted by groups? |
Details
Normality tests for the columns of a matrix and a grouping variable.
Value
The normality tests and the plots
Author(s)
Jose Luis Vicente Villardon
Examples
data(wine)
NormalityTests(wine[,4:6], groups = wine$Origin, plot=TRUE)
Converts a numeric variable into a binary one
Description
Converts a numeric variable into a binary one using a cut point
Usage
Numeric2Binary(y, name= "MyVar", cut = NULL)
Arguments
y |
Vector containing the numeric values |
name |
Name of the variable |
cut |
Cut point used to split the values of the variable. If it is NULL, the median is used. |
Details
Converts a numeric variable into a binary one using a cut point. If the cut is NULL the median is used.
Value
A binary Variable
Author(s)
Jose Luis Vicente-Villardon
See Also
Dataframe2BinaryMatrix
Examples
y=c(1, 1.2, 3.2, 2.4, 1.7, 2.2, 2.7, 3.1)
Numeric2Binary(y)
Alternated EM algorithm for Ordinal Logistic Biplots
Description
This function computes, with an alternated algorithm, the row and column parameters of an Ordinal Logistic Biplot for ordered polytomous data. The row coordinates (E-step) are computed as expected a posteriori (EAP) scores using multidimensional Gauss-Hermite quadrature. The parameters for each variable or item (M-step) are computed using ridge ordinal logistic regression, which solves the separation problem that arises when the points for different categories of a variable are completely separated on the representation plane and the usual fitting methods do not converge. The separation problem is present in almost every data set for which the goodness of fit is high.
Usage
OrdLogBipEM(Data, freq=NULL, dim = 2, nnodes = 15,
tol = 0.0001, maxiter = 100, maxiterlogist = 100,
penalization = 0.2, show = FALSE, initial = 1, alfa = 1,
Orthogonalize=TRUE, Varimax=TRUE, ...)
Arguments
Data |
Data frame with the ordinal data. All the variables must be ordered factors. |
freq |
Frequencies for compacted tables |
dim |
Dimension of the solution |
nnodes |
Number of nodes for the multidimensional Gauss-Hermite quadrature |
tol |
Value to stop the process of iterations. |
maxiter |
Maximum number of iterations for the biplot procedure. |
maxiterlogist |
Maximum number of iterations for the logistic regression step or the Mirt initial configuration. |
penalization |
Penalization used in the diagonal matrix to avoid singularities. |
show |
Boolean parameter to specify if the user wants to see every iteration. |
initial |
Method used to choose the initial ability in the algorithm. Default value is 1. |
alfa |
Optional parameter to calculate row and column coordinates in Simple correspondence analysis if the initial parameter is equal to 1. |
Orthogonalize |
Should the final row coordinates be orthogonalized? The column parameters have to be recalculated. |
Varimax |
Should the final row coordinates be rotated using the varimax procedure? |
... |
Additional arguments for mirt. |
Value
An object of class "Ordinal.Logistic.Biplot". This has components:
RowCoordinates |
Coordinates for the rows or the individuals |
ColumnParameters |
List with information about the Ordinal Logistic Models calculated for each variable, including: estimated parameters with thresholds, percents of correct classifications, and pseudo R-squared |
loadings |
factor loadings |
LogLikelihood |
Logarithm of the likelihood |
r2 |
R squared coefficient |
Ncats |
Number of the categories of each variable |
Author(s)
Jose Luis Vicente-Villardon
References
Bock, R. & Aitkin, M. (1981), Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm, Psychometrika 46(4), 443-459.
Examples
## Not run:
data(Doctors)
olb = OrdLogBipEM(Doctors,dim = 2, nnodes = 10, initial=4,
tol = 0.001, maxiter = 100, penalization = 0.1, show=TRUE)
olb
summary(olb)
PlotOrdinalResponses(olb)
## End(Not run)
Plots an ordinal variable on the biplot
Description
Plots an ordinal variable on the biplot from its fitted parameters
Usage
OrdVarBiplot(bi1, bi2, threshold, xmin = -3, xmax = 3, ymin = -3,
ymax = 3, label = "Point", mode = "a", CexPoint = 0.8,
PchPoint = 1, Color = "green", tl = 0.03, textpos = 1, CexScale= 0.5, ...)
Arguments
bi1 |
Slope for the first dimension to plot |
bi2 |
Slope for the second dimension to plot |
threshold |
Thresholds for each category of the variable |
xmin |
Minimum value of the X on the plot |
xmax |
Maximum value of the X on the plot |
ymin |
Minimum value of the Y on the plot |
ymax |
Maximum value of the Y on the plot |
label |
Label of the variable |
mode |
Mode of the plot (as in a regular biplot) |
CexPoint |
Size of the point |
PchPoint |
Mark for the point |
Color |
Color |
tl |
Tick Length |
textpos |
Position of the label |
CexScale |
Sizes of the scales |
... |
Any additional graphical parameter |
Details
Plots an ordinal variable on the biplot from its fitted parameters. The plot uses the same parameters as any other biplot.
Value
Returns a graphical representation of the ordinal variable on the current plot
Author(s)
Jose Luis Vicente Villardon
References
Vicente-Villardon, J. L., & Sanchez, J. C. H. (2014). Logistic Biplots for Ordinal Data with an Application to Job Satisfaction of Doctorate Degree Holders in Spain. arXiv preprint arXiv:1405.0294.
Examples
##---- Should be DIRECTLY executable !! ----
Coordinates of an ordinal variable on the biplot.
Description
Coordinates of an ordinal variable on the biplot.
Usage
OrdVarCoordinates(tr, b = c(1, 1), inf = -12, sup = 12, step = 0.01,
plotresponse = FALSE, label = "Item", labx = "z", laby
= "Probability", catnames = NULL, Legend = TRUE,
LegendPos = 1)
Arguments
tr |
A vector containing the thresholds of the model, that is, the constant for each category of the ordinal variable |
b |
Vector containing the common slopes for all categories of the ordinal variable |
inf |
The inferior limit of the values to be sampled on the biplot axis (it depends on the scale of the biplot). |
sup |
The superior limit of the values to be sampled on the biplot axis (it depends on the scale of the biplot). |
step |
Increment (step) of the sequence |
plotresponse |
Should the item be plotted? |
label |
Label of the item. |
labx |
Label for the X axis in the summary of the item. |
laby |
Label for the Y axis in the summary of the item. |
catnames |
Names of the categories. |
Legend |
Should a legend be plotted? |
LegendPos |
Position of the legend. |
Details
The function calculates the coordinates of the points that define the separation among the categories of an ordinal variable projected onto an ordinal logistic biplot.
Value
An object of class OrdVarCoord
z |
Values of the cut points on the scale of the biplot axis (not used) |
points |
The points for the marks to be represented on the biplot. |
labels |
The labels for the points |
hidden |
Are there any hidden categories? (Categories whose probability is never higher than the probabilities of the rest) |
cathidden |
Number of the hidden categories |
Author(s)
Jose Luis Vicente Villardon
References
Vicente-Villardon, J. L., & Sanchez, J. C. H. (2014). Logistic Biplots for Ordinal Data with an Application to Job Satisfaction of Doctorate Degree Holders in Spain. arXiv preprint arXiv:1405.0294.
Examples
# No examples
Fits an ordinal logistic regression with ridge penalization
Description
This function fits a logistic regression between a dependent ordinal variable y and some independent variables x, and solves the separation problem using ridge penalization.
Usage
OrdinalLogisticFit(y, x, penalization = 0.1, tol = 1e-04, maxiter = 200, show = FALSE)
Arguments
y |
Dependent variable. |
x |
A matrix with the independent variables. |
penalization |
Penalization used to avoid singularities. |
tol |
Tolerance for the iterations. |
maxiter |
Maximum number of iterations. |
show |
Should the iteration history be printed?. |
Details
The problem of the existence of the estimators in logistic regression is discussed in Albert & Anderson (1984); a solution for the binary case, based on Firth's method (Firth, 1993), is proposed by Heinze & Schemper (2002). These procedures were initially developed to remove bias, but they also work well to avoid the problem of separation. Here we have chosen a simpler solution based on ridge estimators for logistic regression (Le Cessie & Van Houwelingen, 1992).
Rather than maximizing L_j(\mathbf{G} \mid \mathbf{b}_{j0}, \mathbf{B}_j) we maximize
L_j(\mathbf{G} \mid \mathbf{b}_{j0}, \mathbf{B}_j) - \lambda \left( \left\| \mathbf{b}_{j0} \right\| + \left\| \mathbf{B}_j \right\| \right)
Changing the values of \lambda we obtain slightly different solutions not affected by the separation problem.
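The effect of the penalty is easiest to see in the binary case. The sketch below is a deliberately simplified illustration, not the ordinal model fitted by OrdinalLogisticFit: it uses a binary response, a squared-norm (ridge) penalty on all coefficients and a general-purpose optimizer, and `ridge_logit_sketch` is a hypothetical name.

```r
# Sketch: ridge-penalised binary logistic regression on completely separated data.
# The binary model and the squared-norm penalty are simplifying assumptions.
ridge_logit_sketch <- function(y, x, penalization = 0.1) {
  Z <- cbind(1, as.matrix(x))
  negpenloglik <- function(b) {
    eta <- drop(Z %*% b)
    # minus the log-likelihood plus the ridge penalty
    -sum(y * eta - log1p(exp(eta))) + penalization * sum(b^2)
  }
  optim(rep(0, ncol(Z)), negpenloglik, method = "BFGS")$par
}

x <- c(-2, -1.5, -1, 1, 1.5, 2)
y <- c(0, 0, 0, 1, 1, 1)                  # x separates the two groups perfectly
ridge_logit_sketch(y, x, penalization = 0.1)
```

Without the penalty the likelihood of these data has no finite maximum, because the slope can grow without bound; with the penalty the optimizer returns finite coefficients, and larger values of the penalization shrink them further towards zero.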
Value
An object of class "pordlogist". This has components:
nobs |
Number of observations |
J |
Maximum value of the dependent variable |
nvar |
Number of independent variables |
fitted.values |
Matrix with the fitted probabilities |
pred |
Predicted values for each item |
Covariances |
Covariances matrix |
clasif |
Matrix of classification of the items |
PercentClasif |
Percent of good classifications |
coefficients |
Estimated coefficients for the ordinal logistic regression |
thresholds |
Thresholds of the estimated model |
logLik |
Logarithm of the likelihood |
penalization |
Penalization used to avoid singularities |
Deviance |
Deviance of the model |
DevianceNull |
Deviance of the null model |
Dif |
Difference between the two calculated deviance values |
df |
Degrees of freedom |
pval |
p-value of the contrast |
CoxSnell |
Cox-Snell pseudo R squared |
Nagelkerke |
Nagelkerke pseudo R squared |
MacFaden |
McFadden pseudo R squared |
iter |
Number of iterations made |
Author(s)
Jose Luis Vicente-Villardon
References
Albert, A. & Anderson, J. A. (1984), On the existence of maximum likelihood estimates in logistic regression models, Biometrika 71(1), 1–10.
Bull, S. B., Mak, C. & Greenwood, C. M. (2002), A modified score function for multinomial logistic regression, Computational Statistics and Data Analysis 39, 57–74.
Firth, D. (1993), Bias reduction of maximum likelihood estimates, Biometrika 80(1), 27–38.
Heinze, G. & Schemper, M. (2002), A solution to the problem of separation in logistic regression, Statistics in Medicine 21, 2409–2419.
Le Cessie, S. & Van Houwelingen, J. (1992), Ridge estimators in logistic regression, Applied Statistics 41(1), 191–201.
Examples
# No examples yet
Orthogonalize a set of Scores calculated by other procedure
Description
Orthogonalize a set of Scores calculated by other procedure
Usage
OrthogonalizeScores(scores)
Arguments
scores |
A matrix containing the scores |
Details
Orthogonalizes a set of scores calculated by another procedure, projecting them onto the dimensions defined by the eigenvectors of their covariance matrix.
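The projection can be sketched as follows. Centring the scores before the rotation is an assumption of this sketch, and `orthogonalize_sketch` is a hypothetical name rather than the package function.

```r
# Sketch: rotate centred scores onto the eigenvectors of their covariance matrix.
# orthogonalize_sketch is a hypothetical name, not the package function.
orthogonalize_sketch <- function(scores) {
  scores <- scale(as.matrix(scores), center = TRUE, scale = FALSE)
  V <- eigen(cov(scores), symmetric = TRUE)$vectors
  scores %*% V
}

set.seed(1)
z1 <- rnorm(50)
scores <- cbind(z1, 0.6 * z1 + rnorm(50))   # two correlated score columns
round(cov(orthogonalize_sketch(scores)), 10)
```

The covariance matrix of the rotated scores is diagonal, so the new dimensions are uncorrelated while the distances among the rows are unchanged.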
Value
The orthogonalised scores.
Author(s)
Jose Luis Vicente Villardon
Examples
##---- Should be DIRECTLY executable !! ----
Classical PCA Biplot with added features.
Description
Classical PCA Biplot with added features.
Usage
PCA.Analysis(X, dimension = 3, Scaling = 5, ...)
Arguments
X |
Data Matrix |
dimension |
Dimension of the solution |
Scaling |
Transformation of the original data. See InitialTransform for available transformations. |
... |
Any other useful argument |
Details
Biplots represent the rows and columns of a data matrix in reduced dimensions. Usually rows represent individuals, objects or samples and columns are variables measured on them. The most classical versions can be thought of as visualizations associated with Principal Components Analysis (PCA) or Factor Analysis (FA), obtained from a Singular Value Decomposition or a related method. From another point of view, classical biplots can be obtained from regressions and calibrations that are essentially an alternating least squares algorithm, equivalent to an EM algorithm when data are normal.
Value
An object of class ContinuousBiplot with the following components:
Title |
A general title |
Non_Scaled_Data |
Original Data Matrix |
Means |
Means of the original Variables |
Medians |
Medians of the original Variables |
Deviations |
Standard Deviations of the original Variables |
Minima |
Minima of the original Variables |
Maxima |
Maxima of the original Variables |
P25 |
25th Percentile of the original Variables |
P75 |
75th Percentile of the original Variables |
Gmean |
Global mean of the complete matrix |
Sup.Rows |
Supplementary rows (Non Transformed) |
Sup.Cols |
Supplementary columns (Non Transformed) |
Scaled_Data |
Transformed Data |
Scaled_Sup.Rows |
Supplementary rows (Transformed) |
Scaled_Sup.Cols |
Supplementary columns (Transformed) |
n |
Number of Rows |
p |
Number of Columns |
nrowsSup |
Number of Supplementary Rows |
ncolsSup |
Number of Supplementary Columns |
dim |
Dimension of the Biplot |
EigenValues |
Eigenvalues |
Inertia |
Explained variance (Inertia) |
CumInertia |
Cumulative Explained variance (Inertia) |
EV |
EigenVectors |
Structure |
Correlations of the Principal Components and the Variables |
RowCoordinates |
Coordinates for the rows, including the supplementary |
ColCoordinates |
Coordinates for the columns, including the supplementary |
RowContributions |
Contributions for the rows, including the supplementary |
ColContributions |
Contributions for the columns, including the supplementary |
Scale_Factor |
Scale factor for the traditional plot with points and arrows. The row coordinates are multiplied and the column coordinates divided by that scale factor. The look of the plot is better without changing the inner product. For the HJ-Biplot the scale factor is 1. |
Author(s)
Jose Luis Vicente Villardon
References
Gabriel, K.R.(1971): The biplot graphic display of matrices with applications to principal component analysis. Biometrika, 58, 453-467.
Galindo Villardon, M. (1986). Una alternativa de representacion simultanea: HJ-Biplot. Questiio, vol. 10, no. 1.
Gabriel, K. R. and Zamir, S. (1979). Lower rank approximation of matrices by least squares with any choice of weights. Technometrics, 21: 489–498.
Gabriel, K.R.(1998): Generalised Bilinear Regression. Biometrika, 85, 3, 689-700.
Gower, J. C. and Hand, D. J. (1996): Biplots. Chapman & Hall.
Vicente-Villardon, J. L., Galindo, M. P. and Blazquez-Zaballos, A. (2006). Logistic Biplots. Multiple Correspondence Analysis and related methods 491-509.
Demey, J., Vicente-Villardon, J. L., Galindo, M. P. and Zambrano, A. (2008). Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics 24 2832-2838.
See Also
Examples
## Simple Biplot with arrows
data(Protein)
bip=PCA.Biplot(Protein[,3:11])
plot(bip)
## Biplot with scales on the variables
plot(bip, mode="s", margin=0.2)
# Structure plot (Correlations)
CorrelationCircle(bip)
# Plot of the Variable Contributions
ColContributionPlot(bip, cex=1)
Classical PCA Biplot with added features.
Description
Classical PCA Biplot with added features.
Usage
PCA.Biplot(X, alpha = 1, dimension = 2, Scaling = 5, sup.rows = NULL,
sup.cols = NULL, grouping = NULL)
Arguments
X |
Data Matrix |
alpha |
A number between 0 and 1. 0 for GH-Biplot, 1 for JK-Biplot and 0.5 for SQRT-Biplot. Use 2 or any other value not in the interval [0,1] for HJ-Biplot. |
dimension |
Dimension of the solution |
Scaling |
Transformation of the original data. See InitialTransform for available transformations. |
sup.rows |
Supplementary or illustrative rows, if any. |
sup.cols |
Supplementary or illustrative columns, if any. |
grouping |
A factor to standardize with the variability within groups |
Details
Biplots represent the rows and columns of a data matrix in reduced dimensions. Usually rows represent individuals, objects or samples and columns are variables measured on them. The most classical versions can be thought of as visualizations associated with Principal Components Analysis (PCA) or Factor Analysis (FA), obtained from a Singular Value Decomposition or a related method. From another point of view, classical biplots can be obtained from regressions and calibrations that are essentially an alternating least squares algorithm, equivalent to an EM algorithm when data are normal.
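For values of alpha between 0 and 1, the row and column markers can be obtained directly from the Singular Value Decomposition. The sketch below assumes column standardization as the initial transformation and ignores supplementary rows and columns; `svd_biplot_sketch` is a hypothetical name and only a few of the components listed under Value are reproduced.

```r
# Sketch: biplot markers from the SVD for a given alpha in [0, 1].
# svd_biplot_sketch is a hypothetical name; column standardisation is assumed.
svd_biplot_sketch <- function(X, alpha = 1, dimension = 2) {
  X <- scale(as.matrix(X))
  s <- svd(X)
  k <- seq_len(dimension)
  list(
    # alpha = 1 puts the singular values on the rows (JK), alpha = 0 on the columns (GH)
    RowCoordinates = s$u[, k, drop = FALSE] %*% diag(s$d[k]^alpha, dimension),
    ColCoordinates = s$v[, k, drop = FALSE] %*% diag(s$d[k]^(1 - alpha), dimension),
    Inertia = 100 * s$d[k]^2 / sum(s$d^2)
  )
}

X <- iris[, 1:4]
bip <- svd_biplot_sketch(X, alpha = 0.5)
approx2 <- bip$RowCoordinates %*% t(bip$ColCoordinates)   # rank-2 fit of scale(X)
```

Whatever the value of alpha, the inner products between row and column markers give the same reduced-rank approximation of the transformed data; alpha only changes how the singular values are shared between rows and columns.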
Value
An object of class ContinuousBiplot with the following components:
Title |
A general title |
Non_Scaled_Data |
Original Data Matrix |
Means |
Means of the original Variables |
Medians |
Medians of the original Variables |
Deviations |
Standard Deviations of the original Variables |
Minima |
Minima of the original Variables |
Maxima |
Maxima of the original Variables |
P25 |
25th percentile of the original Variables |
P75 |
75th percentile of the original Variables |
Gmean |
Global mean of the complete matrix |
Sup.Rows |
Supplementary rows (Non Transformed) |
Sup.Cols |
Supplementary columns (Non Transformed) |
Scaled_Data |
Transformed Data |
Scaled_Sup.Rows |
Supplementary rows (Transformed) |
Scaled_Sup.Cols |
Supplementary columns (Transformed) |
n |
Number of Rows |
p |
Number of Columns |
nrowsSup |
Number of Supplementary Rows |
ncolsSup |
Number of Supplementary Columns |
dim |
Dimension of the Biplot |
EigenValues |
Eigenvalues |
Inertia |
Explained variance (Inertia) |
CumInertia |
Cumulative Explained variance (Inertia) |
EV |
EigenVectors |
Structure |
Correlations of the Principal Components and the Variables |
RowCoordinates |
Coordinates for the rows, including the supplementary |
ColCoordinates |
Coordinates for the columns, including the supplementary |
RowContributions |
Contributions for the rows, including the supplementary |
ColContributions |
Contributions for the columns, including the supplementary |
Scale_Factor |
Scale factor for the traditional plot with points and arrows. The row coordinates are multiplied and the column coordinates divided by that scale factor. This improves the look of the plot without changing the inner product. For the HJ-Biplot the scale factor is 1. |
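The invariance of the inner product under the scale factor can be checked with a few lines of base R. The coordinates and the factor below are made-up values used only for the illustration.

```r
# Rescaling row and column markers by a common factor keeps A %*% t(B) unchanged.
set.seed(1)
A <- matrix(rnorm(10), ncol = 2)   # hypothetical row coordinates (5 rows)
B <- matrix(rnorm(6), ncol = 2)    # hypothetical column coordinates (3 columns)
sf <- 2.5                          # arbitrary scale factor chosen for the example
A_scaled <- A * sf                 # row coordinates are multiplied by the factor
B_scaled <- B / sf                 # column coordinates are divided by the factor
all.equal(A %*% t(B), A_scaled %*% t(B_scaled))
```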
Author(s)
Jose Luis Vicente Villardon
References
Gabriel, K. R. (1971). The biplot graphic display of matrices with applications to principal component analysis. Biometrika, 58, 453-467.
Galindo Villardon, M. (1986). Una alternativa de representacion simultanea: HJ-Biplot. Questiio, 10(1).
Gabriel, K. R. and Zamir, S. (1979). Lower rank approximation of matrices by least squares with any choice of weights. Technometrics, 21(4), 489-498.
Gabriel, K. R. (1998). Generalised Bilinear Regression. Biometrika, 85(3), 689-700.
Gower, J. C. and Hand, D. J. (1996). Biplots. Chapman & Hall.
Vicente-Villardon, J. L., Galindo, M. P. and Blazquez-Zaballos, A. (2006). Logistic Biplots. Multiple Correspondence Analysis and related methods 491-509.
Demey, J., Vicente-Villardon, J. L., Galindo, M. P. and Zambrano, A. (2008). Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics 24 2832-2838.
See Also
Examples
## Simple Biplot with arrows
data(Protein)
bip=PCA.Biplot(Protein[,3:11])
plot(bip)
## Biplot with scales on the variables
plot(bip, mode="s", margin=0.2)
# Structure plot (Correlations)
CorrelationCircle(bip)
# Plot of the Variable Contributions
ColContributionPlot(bip, cex=1)
Principal Components Analysis with bootstrap confidence intervals.
Description
Calculates a Principal Components Analysis with bootstrap confidence intervals for its parameters
Usage
PCA.Bootstrap(X, dimens = 2, Scaling = "Standardize columns", B = 1000, type = "np")
Arguments
X |
The original raw data matrix |
dimens |
Desired dimension of the solution. |
Scaling |
Transformation that should be applied to the raw data. |
B |
Number of Bootstrap samples to draw. |
type |
Type of Bootstrap ("np", "pa", "spper", "spres") |
Details
The types of bootstrap are:
- "np": Non-parametric.
- "pa": Parametric (data are drawn from a multivariate normal distribution).
- "spper": Semi-parametric; the residuals are permuted.
- "spres": Semi-parametric; the residuals are resampled.
For the moment, only the non-parametric bootstrap is implemented.
The principal components (eigenvectors) are obtained from the bootstrap samples.
The row scores are obtained by projecting the complete data matrix onto the bootstrap principal components. In this way all the individuals have the same number of replications.
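The procedure described above can be sketched in base R. This is a simplified illustration, not the code of PCA.Bootstrap: the built-in USArrests data, the recentering of each bootstrap sample and the sign alignment of the eigenvectors with the initial solution are assumptions of the sketch.

```r
# Sketch of a non-parametric bootstrap for PCA (base R only).
set.seed(123)
X <- scale(as.matrix(USArrests))   # standardized columns (assumed transformation)
n <- nrow(X); p <- ncol(X)
dimens <- 2
B <- 200                           # number of bootstrap samples

V0 <- svd(X)$v[, 1:dimens]         # principal components of the initial data
Vs <- array(NA_real_, c(p, dimens, B))
Scores <- array(NA_real_, c(n, dimens, B))

for (b in seq_len(B)) {
  idx <- sample(n, replace = TRUE)                     # resample rows with replacement
  Xb <- scale(X[idx, ], center = TRUE, scale = FALSE)  # recenter the bootstrap sample
  Vb <- svd(Xb)$v[, 1:dimens]
  # Eigenvector signs are arbitrary: align each one with the initial solution
  flip <- sign(colSums(Vb * V0))
  flip[flip == 0] <- 1
  Vb <- sweep(Vb, 2, flip, `*`)
  Vs[, , b] <- Vb
  Scores[, , b] <- X %*% Vb        # project the complete data matrix
}

# Percentile 95% intervals for the coefficients of the first component
t(apply(Vs[, 1, ], 1, quantile, probs = c(0.025, 0.975)))
```

Because the complete matrix is projected in every replication, each individual gets exactly B replicated scores, regardless of how often it was drawn.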
Value
Type |
The type of Bootstrap used |
InitTransform |
Transformation of the raw data |
InitData |
Initial data provided to the function |
TransformedData |
Transformed Data |
InitialSVD |
Singular value decomposition of the transformed data |
InitScores |
Row Scores for the initial Data |
InitCorr |
Correlation among variables and Principal Components for the Initial Data |
Samples |
Matrix containing the members of the Bootstrap Samples |
EigVal |
Matrix containing the eigenvalues (columns) for each bootstrap sample (columns) |
Inertia |
Matrix containing the proportions of accounted variance (columns) for each bootstrap sample (columns) |
Us |
Three-dimensional array containing the left singular vectors for each bootstrap sample |
Vs |
Three-dimensional array containing the right singular vectors for each bootstrap sample |
As |
Projection of the bootstrap sampled matrix onto the bootstrap principal components |
Bs |
Projection of the bootstrap sampled matrix onto the bootstrap principal coordinates |
Scores |
Projection of the original matrix onto the bootstrap principal components |
Struct |
Correlation of the Initial Variables and the PCs for each bootstrap sample |
Author(s)
Jose Luis Vicente Villardon
References
Daudin, J. J., Duby, C., & Trecourt, P. (1988). Stability of principal component analysis studied by the bootstrap method. Statistics: A journal of theoretical and applied statistics, 19(2), 241-258.
Chateau, F., & Lebart, L. (1996). Assessing sample variability in the visualization techniques related to principal component analysis: bootstrap and alternative simulation methods. COMPSTAT, Physica-Verlag, 205-210.
Babamoradi, H., van den Berg, F., & Rinnan, Å. (2013). Bootstrap based confidence limits in principal component analysis—A case study. Chemometrics and Intelligent Laboratory Systems, 120, 97-105.
Fisher, A., Caffo, B., Schwartz, B., & Zipunnikov, V. (2016). Fast, exact bootstrap principal component analysis for p> 1 million. Journal of the American Statistical Association, 111(514), 846-860.
See Also
Examples
## Not run:
X=wine[,4:21]
grupo=wine$Group
rownames(X)=paste(1:45, grupo, sep="-")
pcaboot=PCA.Bootstrap(X, dimens=2, Scaling = "Standardize columns", B=1000)
plot(pcaboot, ColorInd=as.numeric(grupo))
summary(pcaboot)
## End(Not run)
Partial Least Squares Regression
Description
Partial Least Squares Regression for numerical variables.
Usage
PLSR(Y, X, S = 2, InitTransform = 5, grouping = NULL,
centerY = TRUE, scaleY = TRUE, tolerance = 5e-06,
maxiter = 100, show = FALSE, Validation = NULL, nB = 500)
Arguments
Y |
Matrix of Dependent Variables |
X |
Matrix of Independent Variables |
S |
Dimension of the solution |
InitTransform |
Initial transformation of the independent variables. |
grouping |
Factor used when the initial transformation is the standardization with the within-groups deviation. |
centerY |
Should the dependent variables be centered? |
scaleY |
Should the dependent variables be standardized? |
tolerance |
Tolerance for the algorithm |
maxiter |
Maximum number of iterations |
show |
Show the progress of the algorithm? |
Validation |
Validation (None, Cross, Bootstrap) |
nB |
Number of samples for the bootstrap validation |
Details
Partial Least Squares Regression for numerical variables.
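For a single numerical response, the computations behind a PLS regression can be sketched as follows. This is a generic PLS1 illustration in base R, not the code of PLSR: the helper name pls1, the standardization of the predictors and the use of the built-in mtcars data are assumptions made for the example.

```r
# Minimal PLS1 sketch (single response) in base R; pls1() is a hypothetical helper.
pls1 <- function(X, y, S = 2) {
  X <- scale(as.matrix(X))                  # standardized predictors (assumed)
  y <- as.numeric(y) - mean(y)              # centered response
  E <- X; f <- y
  W <- P <- matrix(0, ncol(X), S)
  Tm <- matrix(0, nrow(X), S)
  q <- numeric(S)
  for (a in seq_len(S)) {
    w <- crossprod(E, f)
    w <- w / sqrt(sum(w^2))                 # X weights (unit length)
    tsc <- E %*% w                          # X scores
    tt <- sum(tsc^2)
    P[, a] <- crossprod(E, tsc) / tt        # X loadings
    q[a] <- sum(f * tsc) / tt               # Y loading
    E <- E - tsc %*% t(P[, a, drop = FALSE])  # deflate X
    f <- f - drop(tsc) * q[a]               # deflate y
    W[, a] <- w
    Tm[, a] <- tsc
  }
  coefs <- W %*% solve(crossprod(P, W), q)  # coefficients for the scaled predictors
  list(XScores = Tm, XWeights = W, XLoadings = P, YLoadings = q,
       RegParameters = drop(coefs))
}

fit <- pls1(mtcars[, c("wt", "hp", "disp")], mtcars$mpg, S = 2)
fit$RegParameters
```

The component names in the returned list mirror some of the names used by the package only to ease comparison; their contents are those of the sketch.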
Value
An object of class plsr with fields
Method |
PLSR |
X |
The X matrix |
Y |
The Y matrix |
centerY |
Is the Y matrix centered |
scaleY |
Is the Y matrix scaled |
Initial_Transformation |
Initial transformation of the X matrix |
ScaledX |
Transformed X matrix |
ScaledY |
Transformed Y matrix |
Intercept |
Intercept of the model |
XScores |
Scores for the individuals from the X matrix |
XWeights |
Weights for the X set |
XLoadings |
Loadings for the X set |
YScores |
Scores for the individuals from the Y matrix |
YWeights |
Weights for the Y set |
YLoadings |
Loadings for the Y set |
RegParameters |
Final Regression Parameters |
ExpectedY |
Expected values of Y |
R2 |
R-squared |
XStructure |
Relation of the X variables with its structure |
YStructure |
Relation of the Y variables with its structure |
YXStructure |
Relation of the Y variables with the X components |
Author(s)
Jose Luis Vicente Villardon
References
H. Abdi, Partial least squares regression and projection on latent structure regression (PLS regression), WIREs Comput. Stat. 2 (2010), pp. 97-106.
See Also
Examples
X=as.matrix(wine[,4:21])
y=as.numeric(wine[,2])-1
mifit=PLSR(y,X, Validation="None")
Partial Least Squares Regression with Binary Response
Description
Fits Partial Least Squares Regression with Binary Response
Usage
PLSR1Bin(Y, X, S = 2, InitTransform = 5, grouping = NULL,
tolerance = 5e-06, maxiter = 100, show = FALSE, penalization = 0,
cte = TRUE, Algorithm = 1, OptimMethod = "CG")
Arguments
Y |
The response |
X |
The matrix of independent variables |
S |
The Dimension of the solution |
InitTransform |
Initial transform for the X matrix |
grouping |
Factor for grouping the observations |
tolerance |
Tolerance for convergence of the algorithm |
maxiter |
Maximum Number of iterations |
show |
Show the steps of the algorithm |
penalization |
Penalization for the Ridge Logistic Regression |
cte |
Should a constant be included in the model? |
Algorithm |
Algorithm used in the calculations |
OptimMethod |
Optimization methods from optim |
Details
The procedure uses the algorithm proposed by Bastien et al () to fit a Partial Least Squares Regression when the response is binary. The result will later be converted into a biplot to visualize the results.
Value
Still to be finished
Author(s)
Jose Luis Vicente Villardon
Examples
# No examples yet
Partial Least Squares Regression with several Binary Responses
Description
Fits Partial Least Squares Regression with several Binary Responses
Usage
PLSRBin(Y, X, S = 2, InitTransform = 5, grouping = NULL,
tolerance = 5e-05, maxiter = 100, show = FALSE, penalization = 0.1,
cte = TRUE, OptimMethod = "CG", Multiple = FALSE)
Arguments
Y |
The response |
X |
The matrix of independent variables |
S |
The Dimension of the solution |
InitTransform |
Initial transform for the X matrix |
grouping |
Grouping variable when the initial transformation is standardization within groups. |
tolerance |
Tolerance for convergence of the algorithm |
maxiter |
Maximum Number of iterations |
show |
Show the steps of the algorithm |
penalization |
Penalization for the Ridge Logistic Regression |
cte |
Should a constant be included in the model? |
OptimMethod |
Optimization methods from optim |
Multiple |
The responses are the indicators of a multinomial variable? |
Details
The function fits the PLSR method for the case in which there is a set of binary dependent variables, using logistic rather than linear fits to take into account the nature of the responses. We term the method PLS-BLR (Partial Least Squares Binary Logistic Regression). It can be considered a generalization of the NIPALS algorithm for the case in which all the responses are binary.
Value
Method |
Description of 'comp1' |
X |
The predictors matrix |
Y |
The responses matrix |
Initial_Transformation |
Initial Transformation of the X matrix |
ScaledX |
The scaled X matrix |
tolerance |
Tolerance used in the algorithm |
maxiter |
Maximum number of iterations used |
penalization |
Ridge penalization |
IncludeConst |
Is the constant included in the model? |
XScores |
Scores of the X matrix, used later for the biplot |
XLoadings |
Loadings of the X matrix |
YScores |
Scores of the Y matrix |
YLoadings |
Loadings of the Y matrix |
Coefficients |
Regression coefficients |
XStructure |
Correlations among the X variables and the PLS scores |
Intercepts |
Intercepts for the Y loadings |
LinTerm |
Linear terms for each response |
Expected |
Expected probabilities for the responses |
Predictions |
Binary predictions of the responses |
PercentCorrect |
Global percent of correct predictions |
PercentCorrectCols |
Percent of correct predictions for each column |
Maxima |
Column with the maximum probability. Useful when the responses are the indicators of a multinomial variable |
Author(s)
José Luis Vicente Villardon
References
Ugarte Fajardo, J., Bayona Andrade, O., Criollo Bonilla, R., Cevallos‐Cevallos, J., Mariduena‐Zavala, M., Ochoa Donoso, D., & Vicente Villardon, J. L. (2020). Early detection of black Sigatoka in banana leaves using hyperspectral images. Applications in plant sciences, 8(8), e11383.
Examples
X=as.matrix(wine[,4:21])
Y=cbind(Factor2Binary(wine[,1])[,1], Factor2Binary(wine[,2])[,1])
rownames(Y)=wine[,3]
colnames(Y)=c("Year", "Origin")
pls=PLSRBin(Y,X, penalization=0.1, show=TRUE, S=2)
PLS binary regression.
Description
Fits PLS binary regression.
Usage
PLSRBinFit(Y, X, S = 2, tolerance = 5e-06, maxiter = 100,
show = FALSE, penalization = 0.1, cte = TRUE, OptimMethod = "CG")
Arguments
Y |
The response |
X |
The matrix of independent variables |
S |
The Dimension of the solution |
tolerance |
Tolerance for convergence of the algorithm |
maxiter |
Maximum Number of iterations |
show |
Show the steps of the algorithm |
penalization |
Penalization for the Ridge Logistic Regression |
cte |
Should a constant be included in the model? |
OptimMethod |
Optimization methods from optim |
Details
Fits PLS binary regression. It is used by a higher-level function.
Value
The PLS fit used by the PLSRBin function.
Author(s)
Jose Luis Vicente Villardon
References
Ugarte Fajardo, J., Bayona Andrade, O., Criollo Bonilla, R., Cevallos‐Cevallos, J., Mariduena‐Zavala, M., Ochoa Donoso, D., & Vicente Villardon, J. L. (2020). Early detection of black Sigatoka in banana leaves using hyperspectral images. Applications in plant sciences, 8(8), e11383.
Examples
## Not yet
Partial Least Squares Regression (PLSR)
Description
Fits a Partial Least Squares Regression (PLSR) to two continuous data matrices
Usage
PLSRfit(Y, X, S = 2, tolerance = 5e-06,
maxiter = 100, show = FALSE)
Arguments
Y |
The matrix of dependent variables |
X |
The Matrix of Independent Variables |
S |
Dimension of the solution. The default is 2 |
tolerance |
Tolerance for the algorithm. |
maxiter |
Maximum number of iterations for the algorithm. |
show |
Logical. Should the calculation process be shown on the screen |
Details
Fits a Partial Least Squares Regression (PLSR) to a set of two continuous data matrices
Value
An object of class "PLSR"
Method |
PLSR1 |
X |
Independent Variables |
Y |
Dependent Variables |
center |
Are data centered? |
scale |
Are data scaled? |
ScaledX |
Scaled Independent Variables |
ScaledY |
Scaled Dependent Variables |
XScores |
Scores for the Independent Variables |
XWeights |
Weights for the Independent Variables - coefficients of the linear combination |
XLoadings |
Factor loadings for the Independent Variables |
YScores |
Scores for the Dependent Variables |
YWeights |
Weights for the Dependent Variables - coefficients of the linear combination |
YLoadings |
Factor loadings for the Dependent Variables |
XStructure |
Structure Correlations for the Independent Variables |
YStructure |
Structure Correlations for the Dependent Variables |
YXStructure |
Structure correlations between the two groups |
Author(s)
Jose Luis Vicente Villardon
References
Wold, S., Sjöström, M., & Eriksson, L. (2001). PLS-regression: a basic tool of chemometrics. Chemometrics and intelligent laboratory systems, 58(2), 109-130.
Plot clusters on a biplot.
Description
Highlights several groups or clusters on a biplot representation.
Usage
PlotBiplotClusters(A, Groups = ones(c(nrow(A), 1)), TypeClus = "st",
ClusterColors = NULL, ClusterNames = NULL, centers =
TRUE, ClustConf = 1, Legend = FALSE, LegendPos =
"topright", CexClustCenters = 1, ...)
Arguments
A |
Coordinates of the points in the scattergram |
Groups |
Factor defining the groups to be highlighted |
TypeClus |
Type of representation of the clusters. For the moment just a convex hull but in the future ellipses and stars will be added. |
ClusterColors |
A vector of colors with as many elements as clusters. If |
ClusterNames |
A vector of names with as many elements as clusters. |
centers |
Logical variable to control if centres of the clusters are plotted |
ClustConf |
Percent of points included in the cluster. Only the ClustConf percent of the points nearest to the center will be used to calculate the cluster |
Legend |
Should a legend be plotted |
LegendPos |
Position of the legend. |
CexClustCenters |
Size of the cluster centres. |
... |
Any other graphical parameters |
Details
The clusters to plot should be added to the biplot object using the function AddCluster2Biplot.
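The convex hull representation, restricted to the points nearest to each cluster center, can be sketched in base R. This is only an illustration of the idea and does not call PlotBiplotClusters: the helper hull_for_group is hypothetical, the coordinates come from prcomp on the iris data, and conf is treated as a proportion between 0 and 1.

```r
# Sketch: convex hull per group using the share of points nearest to the center.
# hull_for_group() is a hypothetical helper; it relies on base R (chull).
hull_for_group <- function(A, conf = 1) {
  center <- colMeans(A)
  d <- sqrt(rowSums(sweep(A, 2, center)^2))       # distance of each point to the center
  keep <- order(d)[seq_len(ceiling(conf * nrow(A)))]
  pts <- A[keep, , drop = FALSE]
  list(center = center, hull = pts[chull(pts), , drop = FALSE])
}

A <- prcomp(iris[, 1:4], scale. = TRUE)$x[, 1:2]  # coordinates of the points
groups <- iris$Species
cols <- c("red", "darkgreen", "blue")

plot(A, col = cols[as.integer(groups)], pch = 16, asp = 1)
for (g in seq_along(levels(groups))) {
  h <- hull_for_group(A[groups == levels(groups)[g], ], conf = 0.9)
  polygon(h$hull, border = cols[g])               # cluster outline
  points(h$center[1], h$center[2], pch = 3, col = cols[g])  # cluster center
}
```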
Value
The function takes effect on a plot.
Author(s)
Jose Luis Vicente Villardon
See Also
Examples
data(iris)
bip=PCA.Biplot(iris[,1:4])
bip=AddCluster2Biplot(bip, NGroups=3, ClusterType="us", Groups=iris[,5], Original=FALSE)
plot(bip, PlotClus = TRUE)
Plot the response functions along the directions of best fit.
Description
Plot the response functions along the directions of best fit for the selected dimensions
Usage
PlotOrdinalResponses(olb, A1 = 1, A2 = 2, inf = -12, sup = 12,
Legend = TRUE, WhatVars=NULL)
Arguments
olb |
An object of class "Ordinal.Logistic.Biplot" |
A1 |
First dimension of the plot. |
A2 |
Second dimension of the plot |
inf |
Lower limit of the representation |
sup |
Upper limit of the representation |
Legend |
Should a legend be plotted |
WhatVars |
A vector with the numbers of the variables to be plotted. If NULL all the variables are plotted. |
Details
Plot the response functions along the directions of best fit for the selected dimensions
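To give an idea of what a set of response functions looks like, the sketch below draws the category probabilities of a proportional odds model along a single direction. The thresholds and the slope are made-up values, and the parametrization is an assumption of the example; it may differ from the one used inside the package.

```r
# Sketch: category response curves of a proportional odds model along one direction.
# Thresholds and slope are hypothetical values chosen for the illustration.
thresholds <- c(-2, 0, 1.5)            # 3 thresholds, so 4 ordered categories
slope <- 0.8
z <- seq(-12, 12, length.out = 400)    # same limits as the defaults of inf and sup

# Cumulative probabilities P(Y <= k | z), then category probabilities by differencing
cum <- sapply(thresholds, function(d) plogis(d - slope * z))
probs <- cbind(cum, 1) - cbind(0, cum)

matplot(z, probs, type = "l", lty = 1, col = 1:4,
        xlab = "Position along the direction", ylab = "Probability")
legend("right", legend = paste("Category", 1:4), col = 1:4, lty = 1)
```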
Value
A plot describing the behaviour of the variable
Author(s)
Jose Luis Vicente Villardon
Examples
data(Doctors)
olb = OrdLogBipEM(Doctors,dim = 2, nnodes = 10, initial=4, tol = 0.001,
maxiter = 100, penalization = 0.1, show=TRUE)
PlotOrdinalResponses(olb, WhatVars=c(1,2,3,4))
Political Figures in the USA
Description
Does the American public actively differentiate political stimuli along ideological lines? Dissimilarities among 13 political figures in the USA.
Usage
data("PoliticalFigures")
Format
A data frame with the dissimilarities among 13 political figures in the USA.
G._W._Bush
a numeric vector with the dissimilarities with the other figures
John_Kerry
a numeric vector with the dissimilarities with the other figures
Ralph_Nader
a numeric vector with the dissimilarities with the other figures
Dick_Cheney
a numeric vector with the dissimilarities with the other figures
John_Edwards
a numeric vector with the dissimilarities with the other figures
Laura_Bush
a numeric vector with the dissimilarities with the other figures
Hillary_Clinton
a numeric vector with the dissimilarities with the other figures
Bill_Clinton
a numeric vector with the dissimilarities with the other figures
Colin_Powell
a numeric vector with the dissimilarities with the other figures
John_Ashcroft
a numeric vector with the dissimilarities with the other figures
John_McCain
a numeric vector with the dissimilarities with the other figures
Democ._Party
a numeric vector with the dissimilarities with the other figures
Repub._Party
a numeric vector with the dissimilarities with the other figures
Details
We have taken information from the 2004 CPS American National Election Study, specifically the feeling thermometer ratings given by 711 NES respondents to thirteen prominent political figures from the period of the 2004 election: George W. Bush; John Kerry; Ralph Nader; Richard Cheney; John Edwards; Laura Bush; Hillary Clinton; Bill Clinton; Colin Powell; John Ashcroft; John McCain; the Democratic party; and the Republican party. From the respondent scores, a dissimilarity between each pair of figures was calculated.
Source
Jacoby, W. G., & Armstrong, D. A. (2014). Bootstrap Confidence Regions for Multidimensional Scaling Solutions. American Journal of Political Science, 58(1), 264-278.
References
Jacoby, W. G., & Armstrong, D. A. (2014). Bootstrap Confidence Regions for Multidimensional Scaling Solutions. American Journal of Political Science, 58(1), 264-278.
Examples
# Not yet
Factor Analysis Biplot based on polychoric correlations
Description
Calculates a biplot for ordinal data based on polychoric correlations
Usage
PolyOrdinalLogBiplot(X, dimension = 3, method = "principal",
rotate = "varimax", RescaleCoordinates = TRUE, ...)
Arguments
X |
A matrix of ordinal data |
dimension |
Number of dimensions to retain |
method |
Principal components (principal) or factor analysis (fa) |
rotate |
Rotation for the analysis |
RescaleCoordinates |
Rescale coordinates as in a continuous data biplot |
... |
Any additional arguments for the principal and fa functions |
Details
The procedure calculates
Value
A biplot (Continuous or ordinal)
Author(s)
Jose Luis Vicente Villardon
See Also
fa, principal
Examples
## Not Yet
Calculates loose axis ticks and labels using nice numbers
Description
Calculates axis ticks and labels using nice numbers
Usage
PrettyTicks(min = -3, max = 3, ntick = 5)
Arguments
min |
Minimum value on the axis |
max |
Maximum value on the axis. |
ntick |
Approximate number of desired ticks |
Details
Calculates axis ticks and labels using nice numbers. The resulting labels are known as loose labels.
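The idea of loose labels can be sketched with the nice numbers scheme of Heckbert (1990). The code below is an independent base R illustration; nice_num and loose_ticks are hypothetical names, and the result is not claimed to coincide with the output of PrettyTicks in every case.

```r
# Sketch of Heckbert's "nice numbers" for loose axis labels (base R).
# nice_num() and loose_ticks() are hypothetical names, not package functions.
nice_num <- function(x, rounding = FALSE) {
  expo <- floor(log10(x))
  f <- x / 10^expo                       # fraction between 1 and 10
  nf <- if (rounding) {
    if (f < 1.5) 1 else if (f < 3) 2 else if (f < 7) 5 else 10
  } else {
    if (f <= 1) 1 else if (f <= 2) 2 else if (f <= 5) 5 else 10
  }
  nf * 10^expo
}

loose_ticks <- function(amin, amax, ntick = 5) {
  rng <- nice_num(amax - amin, rounding = FALSE)
  step <- nice_num(rng / (ntick - 1), rounding = TRUE)
  lo <- floor(amin / step) * step        # loose limits enclose the data range
  hi <- ceiling(amax / step) * step
  ticks <- seq(lo, hi, by = step)
  list(ticks = ticks, labels = format(ticks))
}

loose_ticks(-4, 4, 5)$ticks
# -4 -2  0  2  4
```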
Value
A list with the following fields
ticks |
Ticks for the axis |
labels |
The corresponding labels |
Author(s)
Jose Luis Vicente Villardon
References
Heckbert, P. S. (1990). Nice numbers for graph labels. In Graphics Gems (pp. 61-63). Academic Press Professional, Inc.
See Also
Examples
PrettyTicks(-4, 4, 5)
Principal Coordinates Analysis
Description
Principal coordinates Analysis for a matrix of proximities obtained from binary, categorical, continuous or mixed data
Usage
PrincipalCoordinates(Proximities, w = NULL, dimension = 2,
method = "eigen", tolerance = 1e-04, Bootstrap = FALSE,
BootstrapType = c("Distances", "Products"), nB = 200,
ProcrustesRot = TRUE, BootstrapMethod = c("Sampling", "Permutation"))
Arguments
Proximities |
An object of class proximities |
w |
A set of weights. |
dimension |
Dimension of the solution |
method |
Method to calculate the eigenvalues and eigenvectors. The default is the usual eigen function, although the Power Method, which calculates only the first eigenvectors, can also be used. |
tolerance |
Tolerance for the eigenvalues |
Bootstrap |
Should Bootstrap be calculated? |
BootstrapType |
Bootstrap on the residuals of the "distance" or "scalar products" matrix. |
nB |
Number of Bootstrap replications |
ProcrustesRot |
Should each replication be rotated to match the initial solution? |
BootstrapMethod |
The replications are obtained by "Sampling" or "Permutation" of the residuals. |
Details
Principal Coordinates Analysis for a proximity matrix previously calculated from a matrix of raw data or from directly observed proximities.
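The core computation of a classical Principal Coordinates Analysis can be sketched in base R as double centering of the squared dissimilarities followed by an eigendecomposition. The helper pcoa_sketch and the example distances are assumptions made for the illustration; the sketch does not use the proximities objects of the package.

```r
# Sketch of classical Principal Coordinates Analysis via double centering (base R).
# pcoa_sketch() is a hypothetical helper, not a function of the package.
pcoa_sketch <- function(D, dimension = 2) {
  D <- as.matrix(D)
  n <- nrow(D)
  J <- diag(n) - matrix(1 / n, n, n)       # centering matrix
  Bm <- -0.5 * J %*% D^2 %*% J             # scalar products from squared distances
  e <- eigen(Bm, symmetric = TRUE)
  k <- seq_len(dimension)
  coords <- sweep(e$vectors[, k, drop = FALSE], 2, sqrt(e$values[k]), `*`)
  rownames(coords) <- rownames(D)
  list(Eigenvalues = e$values,
       Inertia = 100 * e$values[k] / sum(e$values[e$values > 0]),
       RowCoordinates = coords)
}

D <- dist(scale(USArrests))                # Euclidean distances as example proximities
pco <- pcoa_sketch(D, dimension = 2)
head(pco$RowCoordinates)
```

Up to the sign of each axis, the coordinates agree with those returned by the base function cmdscale for the same distances.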
Value
An object of class Principal.Coordinates. The function adds the information of the Principal Coordinates to the object of class proximities. Together with the information about the proximities the object has:
Analysis |
The type of analysis performed, "Principal Coordinates" in this case |
Eigenvalues |
The eigenvalues of the PCoA |
Inertia |
The Inertia of the PCoA |
RowCoordinates |
Coordinates for the objects in the PCoA |
RowQualities |
Qualities of representation for the objects in the PCoA |
RawStress |
Raw Stress values |
stress1 |
stress formula 1 |
stress2 |
stress formula 2 |
sstress1 |
sstress formula 1 |
sstress2 |
sstress formula 2 |
rsq |
Squared correlation between disparities and distances |
Spearman |
Spearman correlation between disparities and distances |
Kendall |
Kendall correlation between disparities and distances |
BootstrapInfo |
The result of the bootstrap calculations |
Author(s)
Jose Luis Vicente-Villardon
References
Gower, J. C. (2006) Similarity dissimilarity and Distance, measures of. Encyclopedia of Statistical Sciences. 2nd. ed. Volume 12. Wiley
Gower, J.C. (1966). Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53: 325-338.
J.R. Demey, J.L. Vicente-Villardon, M.P. Galindo, A.Y. Zambrano, Identifying molecular markers associated with classifications of genotypes by external logistic biplot, Bioinformatics 24 (2008) 2832.
See Also
BinaryProximities, BootstrapDistance
Examples
data(spiders)
# Proximities calculated with BinaryProximities
Dis=BinaryProximities(spiders)
# Principal Coordinates Analysis
pco=PrincipalCoordinates(Dis)
# The same analysis with bootstrap
pco=PrincipalCoordinates(Dis, Bootstrap=TRUE)
Protein consumption data.
Description
Protein consumption in twenty-five European countries for nine food groups.
Usage
data(Protein)
Format
A data frame with 25 observations on the following 11 variables.
Comunist
a factor with levels No Yes
Region
a factor with levels North Center South
Red_Meat
a numeric vector
White_Meat
a numeric vector
Eggs
a numeric vector
Milk
a numeric vector
Fish
a numeric vector
Cereal
a numeric vector
Starch
a numeric vector
Nuts
a numeric vector
Fruits_Vegetables
a numeric vector
Details
These data measure protein consumption in twenty-five European countries for nine food groups. It is possible to use multivariate methods to determine whether there are groupings of countries and whether meat consumption is related to that of other foods.
Source
http://lib.stat.cmu.edu/DASL/Datafiles/Protein.html
References
Weber, A. (1973) Agrarpolitik im Spannungsfeld der internationalen Ernaehrungspolitik, Institut fuer Agrarpolitik und marktlehre, Kiel.
Gabriel, K.R. (1981) Biplot display of multivariate matrices for inspection of data and diagnosis. In Interpreting Multivariate Data (Ed. V. Barnett), New York: John Wiley & Sons, 147-173.
Hand, D.J., et al. (1994) A Handbook of Small Data Sets, London: Chapman & Hall, 297-298.
Examples
data(Protein)
## maybe str(Protein) ; plot(Protein) ...
Sugar Cane Data
Description
Molecular characteristics of 50 varieties of sugar cane.
Usage
data(RAPD)
Format
A data frame with 50 observations on 168 variables. 1-120: Random amplified polymorphic DNA (RAPD) and 121-168: Microsatellites
Details
Data are coded as presence or absence of the dominant marker.
Examples
data(RAPD)
## maybe str(RAPD) ; plot(RAPD) ...
Remove rows that contain NaNs (missing data)
Description
Removes the rows that contain NaNs to obtain a matrix without missing data
Usage
RemoveRowsWithNaNs(x, cols = NULL)
Arguments
x |
The matrix to be arranged |
cols |
A set of columns to check as a vector of integers |
Details
Removes the rows that contain NaNs to obtain a matrix without missing data
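The operation can be sketched with base R as follows. The helper drop_incomplete_rows is hypothetical and only illustrates the idea; it is not the code of RemoveRowsWithNaNs.

```r
# Sketch: drop rows with missing values, optionally checking only some columns.
# drop_incomplete_rows() is a hypothetical helper built on base R.
drop_incomplete_rows <- function(x, cols = NULL) {
  if (is.null(cols)) cols <- seq_len(ncol(x))
  ok <- stats::complete.cases(x[, cols, drop = FALSE])  # TRUE for rows without NA or NaN
  x[ok, , drop = FALSE]
}

m <- matrix(c(1, NaN, 3, 4, 5, NA, 7, 8, 9), ncol = 3)
drop_incomplete_rows(m)            # keeps only the fully observed rows
drop_incomplete_rows(m, cols = 1)  # checks only the first column
```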
Value
x |
Matrix without missing data |
Author(s)
Jose Luis Vicente-Villardon
Ridge Binary Logistic Regression for Binary data
Description
This function performs a logistic regression between a dependent binary variable y and some independent variables x, solving the separation problem in this type of regression using ridge penalization.
Usage
RidgeBinaryLogistic(y, X = NULL, data = NULL, freq = NULL,
tolerance = 1e-05, maxiter = 100, penalization = 0.2,
cte = FALSE, ref = "first", bootstrap = FALSE, nmB = 100,
RidgePlot = FALSE, MinLambda = 0, MaxLambda = 2, StepLambda = 0.1)
Arguments
y |
A binary dependent variable or a formula |
X |
A set of independent variables when y is not a formula. |
data |
data frame for the formula |
freq |
frequencies for each observation (usually 1) |
tolerance |
Tolerance for convergence |
maxiter |
Maximum number of iterations |
penalization |
Ridge penalization: a non-negative constant. Penalization used in the diagonal matrix to avoid singularities. |
cte |
Should the model have a constant? |
ref |
Category of reference |
bootstrap |
Should bootstrap confidence intervals be calculated? |
nmB |
Number of bootstrap samples. |
RidgePlot |
Should the ridge plot be plotted? |
MinLambda |
Minimum value of lambda for the ridge plot |
MaxLambda |
Maximum value of lambda for the ridge plot |
StepLambda |
Step for increasing the values of lambda |
Details
Logistic regression is a widely used technique in applied work when a binary, nominal or ordinal response variable is available, because classical regression methods are not applicable to this kind of variable. The method is available in most statistical packages, commercial or free. Maximum likelihood, together with a numerical method such as Newton-Raphson, is used to estimate the parameters of the model. In logistic regression, when the space generated by the independent variables contains hyperplanes that separate the individuals belonging to the different groups defined by the response, maximum likelihood does not converge and the estimates tend to infinity. This is known in the literature as the separation problem in logistic regression. Even when the separation is not complete, the numerical solution of the maximum likelihood has stability problems. From a practical point of view, this means that the estimated model is not accurate precisely when there should be a perfect, or almost perfect, fit to the data.
The problem of the existence of the estimators in logistic regression is discussed in Albert and Anderson (1984). A solution for the binary case, based on the Firth method (Firth, 1993), is proposed by Heinze and Schemper (2002), and the extension to the nominal logistic model was made by Bull et al. (2002). All these procedures were initially developed to remove bias but work well to avoid the problem of separation. Here we have chosen a simpler solution based on ridge estimators for logistic regression (Le Cessie and Van Houwelingen, 1992).
Rather than maximizing $L_j(\mathbf{G} \mid \mathbf{b}_{j0}, \mathbf{B}_j)$ we maximize
$$L_j(\mathbf{G} \mid \mathbf{b}_{j0}, \mathbf{B}_j) - \lambda \left( \left\| \mathbf{b}_{j0} \right\| + \left\| \mathbf{B}_j \right\| \right)$$
Changing the value of $\lambda$ we obtain slightly different solutions not affected by the separation problem.
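The effect of the penalty on a completely separated data set can be illustrated with a short Newton-Raphson sketch in base R. This is not the code of RidgeBinaryLogistic: the helper ridge_logistic is hypothetical, the penalty is taken as lambda times the squared Euclidean norm of all coefficients (intercept included), and the data are made up so that x separates the two groups perfectly.

```r
# Sketch of ridge-penalized logistic regression fitted by Newton-Raphson (base R).
# ridge_logistic() is a hypothetical helper; the squared-norm penalty is an assumption.
ridge_logistic <- function(X, y, lambda = 0.2, tol = 1e-8, maxiter = 100) {
  X <- cbind(Intercept = 1, as.matrix(X))
  beta <- rep(0, ncol(X))
  for (iter in seq_len(maxiter)) {
    p <- plogis(drop(X %*% beta))
    W <- p * (1 - p)
    grad <- crossprod(X, y - p) - 2 * lambda * beta          # penalized score
    hess <- crossprod(X, W * X) + 2 * lambda * diag(ncol(X)) # penalized information
    step <- drop(solve(hess, grad))
    beta <- beta + step
    if (max(abs(step)) < tol) break
  }
  list(beta = beta, fitted = plogis(drop(X %*% beta)), iterations = iter)
}

# Completely separated data: unpenalized maximum likelihood would diverge here
x <- c(-3, -2, -1, 1, 2, 3)
y <- c(0, 0, 0, 1, 1, 1)
fit_small <- ridge_logistic(x, y, lambda = 0.1)
fit_large <- ridge_logistic(x, y, lambda = 1)
rbind(lambda_0.1 = fit_small$beta, lambda_1 = fit_large$beta)
```

With the penalty the estimates stay finite, and larger values of lambda shrink them further towards zero.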
Value
An object of class RidgeBinaryLogistic with the following components
beta |
Estimates of the coefficients |
fitted |
Fitted probabilities |
residuals |
Residuals of the model |
Prediction |
Predictions of presences and absences |
Covariances |
Covariances among the estimates |
Deviance |
Deviance of the current model |
NullDeviance |
Deviance of the null model |
Dif |
Difference between the deviances of the current and null models |
df |
Degrees of freedom of the difference |
p |
p-value |
CoxSnell |
Cox-Snell pseudo R-squared |
Nagelkerke |
Nagelkerke pseudo R-squared |
MacFaden |
McFadden pseudo R-squared |
R2 |
Pseudo R-squared using the residuals |
Classification |
Classification table |
PercentCorrect |
Percentage of correct classification |
Author(s)
Jose Luis Vicente Villardon
References
Agresti, A. (1990) An Introduction to Categorical Data Analysis. John Wiley and Sons, Inc.
Albert, A. and Anderson, J. A. (1984) On the existence of maximum likelihood estimates in logistic regression models. Biometrika, 71(1): 1-10.
Anderson, J. A. (1972), Separate sample logistic discrimination. Biometrika, 59(1): 19-35.
Anderson, J. A. & Philips P. R. (1981) Regression, discrimination and measurement models for ordered categorical variables. Appl. Statist, 30: 22-31.
Bull, S. B., Mak, C. & Greenwood, C. M. (2002) A modified score function for multinomial logistic regression. Computational Statistics and Data Analysis, 39: 57-74.
Cortinhas Abrantes, J. & Aerts, M. (2012) A solution to separation for clustered binary data. Statistical Modelling, 12 (1): 3-27.
Cox, D. R. (1970), Analysis of Binary Data. Methuen. London.
Demey, J., Vicente-Villardon, J. L., Galindo, M.P. AND Zambrano, A. (2008) Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics, 24(24): 2832-2838.
Firth, D. (1993) Bias Reduction of Maximum Likelihood Estimates. Biometrika, 80(1): 27-38.
Fox, J. (1984) Linear Statistical Models and Related Methods. Wiley. New York.
Harrell, F. E. (2012). rms: Regression Modeling Strategies. R package version 3.5-0. http://CRAN.R-project.org/package=rms
Harrell, F. E. (2001). Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis (Springer Series in Statistics). Springer. New York.
Heinze, G. and Schemper, M. (2002) A solution to the problem of separation in logistic regression. Statist. Med., 21: 2409-2419.
Heinze, G. and Ploner, M. (2004) Fixing the nonconvergence bug in logistic regression with SPLUS and SAS. Computer Methods and Programs in Biomedicine, 71: 181-187.
Heinze, G. (2006) A comparative investigation of methods for logistic regression with separated or nearly separated data. Statist. Med., 25:4216-4226.
Heinze, G. and Puhr, R. (2010) Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets. Statist. Med. 29: 770-777.
Hoerl, A. E. and Kennard, R. W. (1971) Ridge Regression: biased estimators for nonorthogonal problems. Technometrics, 21: 55-67.
Sun, H. and Wang S. Penalized logistic regression for high-dimensional DNA methylation data with case-control studies. Bioinformatics. 28 (10): 1368-1375.
Hosmer, D. and Lemeshow, L. (1989) Applied Logistic Regression. John Wiley and Sons. Inc.
Le Cessie, S. and Van Houwelingen, J.C. (1992) Ridge Estimators in Logistic Regression. Appl. Statist. 41 (1): 191-201.
Malo, N., Libiger, O. and Schork, N. J. (2008) Accommodating Linkage Disequilibrium in Genetic-Association Analyses via Ridge Regression. Am J Hum Genet. 82(2): 375-385.
Silvapulle, M. J. (1981) On the existence of maximum likelihood estimates for the binomial response models. J. R. Statist. Soc. B 43: 310-3.
Vicente-Villardon, J. L., Galindo, M. P. and Blazquez, A. (2006) Logistic Biplots. In Multiple Correspondence Analysis and Related Methods. Greenacre, M. & Blasius, J., Eds. Chapman and Hall, Boca Raton.
Walter, S. and Duncan, D. (1967) Estimation of the probability of an event as a function of several variables. Biometrika. 54:167-79.
Wedderburn, R. W. M. (1976) On the existence and uniqueness of the maximum likelihood estimates for certain generalized linear models. Biometrika 63, 27-32.
Zhu, J. and Hastie, T. (2004) Classification of gene microarrays by penalized logistic regression. Biostatistics. 5(3):427-43.
Examples
# not yet
Fits a binary logistic regression with ridge penalization
Description
This function fits a logistic regression between a dependent variable y and some independent variables x, and solves the separation problem in this type of regression using ridge regression and penalization.
Usage
RidgeBinaryLogisticFit(y, xd, freq, tolerance = 1e-05, maxiter = 100, penalization = 0.2)
Arguments
y |
A vector with the values of the dependent variable |
xd |
A matrix with the independent variables |
freq |
Frequencies of each pattern |
tolerance |
Tolerance for the iterations. |
maxiter |
Maximum number of iterations for convergence |
penalization |
Penalization used in the diagonal matrix to avoid singularities. |
Details
Fits a binary logistic regression with ridge penalization
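To make the idea concrete, here is a minimal base R sketch of a ridge-penalized Newton-Raphson fit. It is not the package's implementation: `ridge_logit_sketch` is a hypothetical helper, the objective is assumed to be the log-likelihood minus half the penalization times the squared norm of the coefficients, the constant is assumed to be penalized as well, and the `freq` argument is ignored.

```r
# Sketch only: Newton-Raphson for a ridge-penalized binary logistic regression.
ridge_logit_sketch <- function(y, X, penalization = 0.2,
                               tolerance = 1e-5, maxiter = 100) {
  X <- cbind(1, as.matrix(X))            # add the constant
  beta <- rep(0, ncol(X))
  P <- penalization * diag(ncol(X))      # assumption: the constant is penalized too
  for (iter in seq_len(maxiter)) {
    prob <- as.vector(plogis(X %*% beta))
    w <- prob * (1 - prob)
    gradient <- crossprod(X, y - prob) - P %*% beta
    hessian  <- crossprod(X, X * w) + P  # invertible whenever penalization > 0
    step <- as.vector(solve(hessian, gradient))
    beta <- beta + step
    if (max(abs(step)) < tolerance) break
  }
  list(beta = beta, iterations = iter)
}

# Perfectly separated data: unpenalized estimates would diverge
x <- c(-3, -2, -1, 1, 2, 3)
y <- c(0, 0, 0, 1, 1, 1)
fit <- ridge_logit_sketch(y, x)
fit$beta
```

The penalty added to the diagonal of the Hessian is what keeps the system solvable under separation, which matches the role described for the `penalization` argument.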
Value
The parameters of the fit
Author(s)
Jose Luis Vicente Villardon
See Also
Examples
##---- Should be DIRECTLY executable !! ----
Multinomial logistic regression with ridge penalization
Description
This function fits a multinomial logistic regression between a nominal dependent variable y and some independent variables x, and solves the separation problem in this type of regression using ridge penalization.
Usage
RidgeMultinomialLogisticFit(y, x, penalization = 0.2,
tol = 1e-04, maxiter = 200, show = FALSE)
Arguments
y |
Dependent variable. |
x |
A matrix with the independent variables. |
penalization |
Penalization used in the diagonal matrix to avoid singularities. |
tol |
Tolerance for the iterations. |
maxiter |
Maximum number of iterations. |
show |
Should the iteration history be printed? |
Details
The problem of the existence of the estimators in logistic regression is discussed in Albert and Anderson (1984). A solution for the binary case, based on Firth's method (Firth, 1993), was proposed by Heinze and Schemper (2002), and the extension to the nominal logistic model was made by Bull et al. (2002). All these procedures were initially developed to remove the bias of the estimates, but they also work well to avoid the problem of separation. Here we have chosen a simpler solution based on ridge estimators for logistic regression (Le Cessie and Van Houwelingen, 1992).
Rather than maximizing
$$L_j(\mathbf{G} \mid \mathbf{b}_{j0}, \mathbf{B}_j)$$
we maximize
$$L_j(\mathbf{G} \mid \mathbf{b}_{j0}, \mathbf{B}_j) - \lambda \left( \left\| \mathbf{b}_{j0} \right\| + \left\| \mathbf{B}_j \right\| \right)$$
Changing the value of $\lambda$ we obtain slightly different solutions not affected by the separation problem.
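The effect of the penalty can be sketched in base R by maximizing a penalized multinomial log-likelihood with `optim()`. This is an illustration under stated assumptions, not the package's algorithm: `pen_negloglik` and `fit_for` are hypothetical helpers, the first category is taken as the baseline, and squared Euclidean norms are used for the penalty, as is usual for ridge estimators.

```r
# Sketch only: ridge-penalized multinomial logit fitted with optim().
pen_negloglik <- function(par, X, Y, lambda) {
  B <- cbind(0, matrix(par, nrow = ncol(X)))  # first category is the baseline
  eta <- X %*% B
  eta <- eta - apply(eta, 1, max)             # guard against overflow
  logp <- eta - log(rowSums(exp(eta)))        # log of the fitted probabilities
  -sum(Y * logp) + lambda * sum(par^2)        # assumption: squared norms
}

# Three perfectly separated groups along x
x <- -4:4
g <- factor(rep(c("a", "b", "c"), each = 3))
X <- cbind(1, x)
Y <- model.matrix(~ g - 1)                    # indicator matrix of the response

fit_for <- function(lambda) {
  optim(rep(0, 4), pen_negloglik, X = X, Y = Y, lambda = lambda,
        method = "BFGS")$par
}

lambdas <- c(0.1, 0.5, 2)
sizes <- sapply(lambdas, function(l) sqrt(sum(fit_for(l)^2)))
round(sizes, 3)
```

Even though the groups are completely separated, every fit is finite, and the length of the coefficient vector shrinks as the penalty grows.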
Value
An object of class "rmlr" with components
fitted |
Matrix with the fitted probabilities |
cov |
Covariance matrix among the estimates |
Y |
Indicator matrix for the dependent variable |
beta |
Estimated coefficients for the multinomial logistic regression |
stderr |
Standard error of the estimates |
logLik |
Logarithm of the likelihood |
Deviance |
Deviance of the model |
AIC |
Akaike information criterion indicator |
BIC |
Bayesian information criterion indicator |
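For reference, the information criteria in this list follow the usual definitions based on the log-likelihood. The base R sketch below checks them against `AIC()` and `BIC()` for an unpenalized `glm` fit on simulated data. How the package counts parameters under penalization is not stated here, so the plain coefficient count `k` is an assumption.

```r
# Sketch only: deviance, AIC and BIC from the log-likelihood (base R).
set.seed(2)
n <- 80
x <- rnorm(n)
y <- rbinom(n, 1, plogis(x))
fit <- glm(y ~ x, family = binomial)

ll <- as.numeric(logLik(fit))   # logarithm of the likelihood
k  <- length(coef(fit))         # number of estimated coefficients

Deviance   <- -2 * ll
AIC_manual <- Deviance + 2 * k
BIC_manual <- Deviance + log(n) * k

c(AIC = AIC_manual, BIC = BIC_manual)
```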
Author(s)
Jose Luis Vicente-Villardon
References
Albert, A. & Anderson, J. A. (1984), On the existence of maximum likelihood estimates in logistic regression models, Biometrika 71(1), 1–10.
Bull, S. B., Mak, C. & Greenwood, C. M. (2002), A modified score function for multinomial logistic regression, Computational Statistics and Data Analysis 39, 57–74.
Firth, D. (1993), Bias reduction of maximum likelihood estimates, Biometrika 80(1), 27–38.
Heinze, G. & Schemper, M. (2002), A solution to the problem of separation in logistic regression, Statistics in Medicine 21, 2409–2419.
Le Cessie, S. & Van Houwelingen, J. (1992), Ridge estimators in logistic regression, Applied Statistics 41(1), 191–201.
Examples
# No examples yet
Ridge Multinomial Logistic Regression
Description
Fits a multinomial logistic regression with ridge penalization for a nominal response and returns an object with the results. The fitted model is compared with the null model, so that the improvement in fit can be assessed.
Usage
RidgeMultinomialLogisticRegression(formula, data, penalization = 0.2,
cte = TRUE, tol = 1e-04, maxiter = 200, showIter = FALSE)
Arguments
formula |
The usual formula notation (or the dependent variable) |
data |
The dataframe used by the formula. (or a matrix with the independent variables). |
penalization |
Penalization used in the diagonal matrix to avoid singularities. |
cte |
Should the model have a constant? |
tol |
Value to stop the process of iterations. |
maxiter |
Maximum number of iterations. |
showIter |
Should the iteration history be printed? |
Value
An object that has the following components:
fitted |
Matrix with the fitted probabilities |
cov |
Covariance matrix among the estimates |
Y |
Indicator matrix for the dependent variable |
beta |
Estimated coefficients for the multinomial logistic regression |
stderr |
Standard error of the estimates |
logLik |
Logarithm of the likelihood |
Deviance |
Deviance of the model |
AIC |
Akaike information criterion indicator |
BIC |
Bayesian information criterion indicator |
NullDeviance |
Deviance of the null model |
Difference |
Difference between the two deviance values |
df |
Degrees of freedom |
p |
p-value associated to the chi-squared estimate |
CoxSnell |
Cox and Snell pseudo R squared |
Nagelkerke |
Nagelkerke pseudo R squared |
MacFaden |
McFadden pseudo R squared |
Table |
Cross classification of observed and predicted responses |
PercentCorrect |
Percentage of correct classifications |
Author(s)
Jose Luis Vicente-Villardon
References
Albert, A. & Anderson, J. A. (1984), On the existence of maximum likelihood estimates in logistic regression models, Biometrika 71(1), 1–10.
Bull, S. B., Mak, C. & Greenwood, C. M. (2002), A modified score function for multinomial logistic regression, Computational Statistics and Data Analysis 39, 57–74.
Firth, D. (1993), Bias reduction of maximum likelihood estimates, Biometrika 80(1), 27–38.
Heinze, G. & Schemper, M. (2002), A solution to the problem of separation in logistic regression, Statistics in Medicine 21, 2409–2419.
Le Cessie, S. & Van Houwelingen, J. (1992), Ridge estimators in logistic regression, Applied Statistics 41(1), 191–201.
See Also
Examples
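# Nominal response (second column) and two predictors from the Protein data
data(Protein)
y = Protein[[2]]
X = Protein[, c(3, 11)]
# Fit with the penalization set to zero and summarize the result
rmlr = RidgeMultinomialLogisticRegression(y, X, penalization = 0.0)
summary(rmlr)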
Ordinal logistic regression with ridge penalization
Description
This function performs a logistic regression between a dependent ordinal variable y and some independent variables x, and solves the separation problem using ridge penalization.
Usage
RidgeOrdinalLogistic(y, x, penalization = 0.1, tol = 1e-04, maxiter = 200, show = FALSE)
Arguments
y |
Dependent variable. |
x |
A matrix with the independent variables. |
penalization |
Penalization used to avoid singularities. |
tol |
Tolerance for the iterations. |
maxiter |
Maximum number of iterations. |
show |
Should the iteration history be printed? |
Details
The problem of the existence of the estimators in logistic regression is discussed in Albert and Anderson (1984). A solution for the binary case, based on Firth's method (Firth, 1993), was proposed by Heinze and Schemper (2002). All these procedures were initially developed to remove the bias of the estimates, but they also work well to avoid the problem of separation. Here we have chosen a simpler solution based on ridge estimators for logistic regression (Le Cessie and Van Houwelingen, 1992).
Rather than maximizing
$$L_j(\mathbf{G} \mid \mathbf{b}_{j0}, \mathbf{B}_j)$$
we maximize
$$L_j(\mathbf{G} \mid \mathbf{b}_{j0}, \mathbf{B}_j) - \lambda \left( \left\| \mathbf{b}_{j0} \right\| + \left\| \mathbf{B}_j \right\| \right)$$
Changing the value of $\lambda$ we obtain slightly different solutions not affected by the separation problem.
Value
An object of class "pordlogist". This has components:
nobs |
Number of observations |
J |
Maximum value of the dependent variable |
nvar |
Number of independent variables |
fitted.values |
Matrix with the fitted probabilities |
pred |
Predicted values for each item |
Covariances |
Covariance matrix |
clasif |
Matrix of classification of the items |
PercentClasif |
Percent of good classifications |
coefficients |
Estimated coefficients for the ordinal logistic regression |
thresholds |
Thresholds of the estimated model |
logLik |
Logarithm of the likelihood |
penalization |
Penalization used to avoid singularities |
Deviance |
Deviance of the model |
DevianceNull |
Deviance of the null model |
Dif |
Difference between the two deviance values |
df |
Degrees of freedom |
pval |
p-value of the contrast |
CoxSnell |
Cox-Snell pseudo R squared |
Nagelkerke |
Nagelkerke pseudo R squared |
MacFaden |
McFadden pseudo R squared |
iter |
Number of iterations made |
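The way thresholds and coefficients combine into fitted probabilities can be sketched in base R for a cumulative logit (proportional odds) model. The sign convention logit P(y <= j) = threshold_j - x'b is an assumption, since the parametrization used by the package is not stated here, and `ordinal_probs_sketch` is a hypothetical helper.

```r
# Sketch only: category probabilities in a cumulative logit model.
ordinal_probs_sketch <- function(X, coefficients, thresholds) {
  eta <- as.vector(as.matrix(X) %*% coefficients)
  # P(y <= j) for each row and each threshold (assumed sign convention)
  cum <- 1 / (1 + exp(-outer(-eta, thresholds, "+")))
  cum <- cbind(cum, 1)                          # P(y <= J) = 1
  # Category probabilities are differences of consecutive cumulative ones
  cum - cbind(0, cum[, -ncol(cum), drop = FALSE])
}

X <- matrix(c(-1, 0, 1), ncol = 1)
probs <- ordinal_probs_sketch(X, coefficients = 1.5, thresholds = c(-1, 1))
pred  <- apply(probs, 1, which.max)             # most probable category per row
round(probs, 3)
```

Each row of `probs` sums to one, and `pred` plays the role of the predicted value for each item.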
Author(s)
Jose Luis Vicente-Villardon
References
Albert, A. & Anderson, J. A. (1984), On the existence of maximum likelihood estimates in logistic regression models, Biometrika 71(1), 1–10.
Bull, S. B., Mak, C. & Greenwood, C. M. (2002), A modified score function for multinomial logistic regression, Computational Statistics and Data Analysis 39, 57–74.
Firth, D. (1993), Bias reduction of maximum likelihood estimates, Biometrika 80(1), 27–38.
Heinze, G. & Schemper, M. (2002), A solution to the problem of separation in logistic regression, Statistics in Medicine 21, 2409–2419.
Le Cessie, S. & Van Houwelingen, J. (1992), Ridge estimators in logistic regression, Applied Statistics 41(1), 191–201.
Examples
data(Doctors)
olb = OrdLogBipEM(Doctors,dim = 2, nnodos = 10,
tol = 0.001, maxiter = 100, penalization = 0.2)
model = RidgeOrdinalLogistic(Doctors[, 1], olb$RowCoordinates, tol = 0.001,
maxiter = 100, penalization = 0.2)
model
SMACOF
Description
SMACOF algorithm for symmetric proximity matrices
Usage
SMACOF(P, X = NULL, W = NULL,
Model = c("Identity", "Ratio", "Interval", "Ordinal"),
dimsol = 2, maxiter = 100, maxerror = 1e-06,
StandardizeDisparities = TRUE, ShowIter = FALSE)
Arguments
P |
A matrix of proximities |
X |
Initial configuration |
W |
A matrix of weights |
Model |
MDS model. |
dimsol |
Dimension of the solution |
maxiter |
Maximum number of iterations of the algorithm |
maxerror |
Tolerance for convergence of the algorithm |
StandardizeDisparities |
Should the disparities be standardized |
ShowIter |
Show the iteration process |
Details
SMACOF performs multidimensional scaling of proximity data to find a least-squares representation of the objects in a low-dimensional space. A majorization algorithm guarantees monotone convergence for optionally transformed, metric and nonmetric data under a variety of models.
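The majorization step can be sketched in a few lines of base R for the simplest setting: unit weights and untransformed dissimilarities. This is not the package's code. `smacof_sketch` is a hypothetical helper whose argument names only mirror `dimsol`, `maxiter` and `maxerror`, and the classical scaling start is an assumption.

```r
# Sketch only: SMACOF with unit weights and untransformed dissimilarities.
smacof_sketch <- function(delta, dimsol = 2, maxiter = 100, maxerror = 1e-6) {
  delta <- as.matrix(delta)
  n <- nrow(delta)
  raw_stress <- function(X) {
    sum((delta - as.matrix(dist(X)))[upper.tri(delta)]^2)
  }
  X <- cmdscale(delta, k = dimsol)         # classical scaling as starting point
  history <- raw_stress(X)
  for (iter in seq_len(maxiter)) {
    D <- as.matrix(dist(X))
    B <- -delta / ifelse(D > 0, D, Inf)    # b_ij = -delta_ij / d_ij, 0 if d_ij = 0
    diag(B) <- 0
    diag(B) <- -rowSums(B)
    X <- (B %*% X) / n                     # Guttman transform for unit weights
    history <- c(history, raw_stress(X))
    if (history[iter] - history[iter + 1] < maxerror) break
  }
  list(X = X, stress = history[length(history)], history = history)
}

fit <- smacof_sketch(eurodist)             # road distances between 21 cities
round(range(fit$history), 1)
```

The `history` vector never increases from one iteration to the next, which is the monotone convergence property mentioned above.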
Value
An object of class Principal.Coordinates and MDS. The function adds the information of the MDS to the object of class proximities. Together with the information about the proximities the object has:
Analysis |
The type of analysis performed, "MDS" in this case |
X |
Coordinates for the objects |
D |
Distances |
Dh |
Disparities |
stress |
Raw Stress |
stress1 |
stress formula 1 |
stress2 |
stress formula 2 |
sstress1 |
sstress formula 1 |
sstress2 |
sstress formula 2 |
rsq |
Squared correlation between disparities and distances |
rho |
Spearman correlation between disparities and distances |
tau |
Kendall correlation between disparities and distances |
Author(s)
Jose Luis Vicente-Villardon
References
Commandeur, J. J. F. and Heiser, W. J. (1993). Mathematical derivations in the proximity scaling (PROXSCAL) of symmetric data matrices (Tech. Rep. No. RR-93-03). Leiden, The Netherlands: Department of Data Theory, Leiden University.
Kruskal, J. B. (1964). Nonmetric multidimensional scaling: A numerical method. Psychometrika, 29, 28-42.
De Leeuw, J. & Mair, P. (2009). Multidimensional scaling using majorization: The R package smacof. Journal of Statistical Software, 31(3), 1-30, http://www.jstatsoft.org/v31/i03/
Borg, I., & Groenen, P. J. F. (2005). Modern Multidimensional Scaling (2nd ed.). Springer.
Borg, I., Groenen, P. J. F., & Mair, P. (2013). Applied Multidimensional Scaling. Springer.
Groenen, P. J. F., Heiser, W. J. and Meulman, J. J. (1999). Global optimization in least squares multidimensional scaling by distance smoothing. Journal of Classification, 16, 225-254.
Groenen, P. J. F., van Os, B. and Meulman, J. J. (2000). Optimal scaling by alternating length-constrained nonnegative least squares, with application to distance-based analysis. Psychometrika, 65, 511-524.
See Also
Examples
data(spiders)
Dis=BinaryProximities(spiders)
MDSSol=SMACOF(Dis$Proximities)
Sustainability Society Index
Description
Sustainability Society Index
Usage
data("SSI")
Format
A data frame with 924 observations on the following 23 variables.
Year
a factor with levels
a2006
a2008
a2010
a2012
a2014
a2016
Country
a factor with levels
Albania
Algeria
Angola
Argentina
Armenia
Australia
Austria
Azerbaijan
Bangladesh
Belarus
Belgium
Benin
Bhutan
Bolivia
Bosnia-Herzegovina
Botswana
Brazil
Bulgaria
Burkina_Faso
Burundi
Cambodia
Cameroon
Canada
Central_African_Republic
Chad
Chile
China
Colombia
Congo
Congo_Democratic_Rep.
Costa_Rica
Cote_dIvoire
Croatia
Cuba
Cyprus
Czech_Republic
Denmark
Dominican_Republic
Ecuador
Egypt
El_Salvador
Estonia
Ethiopia
Finland
France
Gabon
Gambia
Georgia
Germany
Ghana
Greece
Guatemala
Guinea
Guinea-Bissau
Guyana
Haiti
Honduras
Hungary
Iceland
India
Indonesia
Iran
Iraq
Ireland
Israel
Italy
Jamaica
Japan
Jordan
Kazakhstan
Kenya
Korea._North
Korea._South
Kuwait
Kyrgyz_Republic
Laos
Latvia
Lebanon
Lesotho
Liberia
Libya
Lithuania
Luxembourg
Macedonia
Madagascar
Malawi
Malaysia
Mali
Malta
Mauritania
Mauritius
Mexico
Moldova
Mongolia
Montenegro
Morocco
Mozambique
Myanmar
Namibia
Nepal
Netherlands
New_Zealand
Nicaragua
Niger
Nigeria
Norway
Oman
Pakistan
Panama
Papua_New_Guinea
Paraguay
Peru
Philippines
Poland
Portugal
Qatar
Romania
Russia
Rwanda
Saudi_Arabia
Senegal
Serbia
Sierra_Leone
Singapore
Slovak_Republic
Slovenia
South_Africa
Spain
Sri_Lanka
Sudan
Sweden
Switzerland
Syria
Taiwan
Tajikistan
Tanzania
Thailand
Togo
Trinidad_and_Tobago
Tunisia
Turkey
Turkmenistan
Uganda
Ukraine
United_Arab_Emirates
United_Kingdom
United_States
Uruguay
Uzbekistan
Venezuela
Vietnam
Yemen
Zambia
Zimbabwe
Sufficient_Food
a numeric vector
Sufficient_to_Drink
a numeric vector
Safe_Sanitation
a numeric vector
Education_
a numeric vector
Healthy_Life
a numeric vector
Gender_Equality
a numeric vector
Income_Distribution
a numeric vector
Population_Growth
a numeric vector
Good_Governance
a numeric vector
Biodiversity_
a numeric vector
Renewable_Water_Resources
a numeric vector
Consumption
a numeric vector
Energy_Use
a numeric vector
Energy_Savings
a numeric vector
Greenhouse_Gases
a numeric vector
Renewable_Energy
a numeric vector
Organic_Farming
a numeric vector
Genuine_Savings
a numeric vector
GDP
a numeric vector
Employment
a numeric vector
Public_Debt
a numeric vector
Details
Sustainability Society Index
Source
https://ssi.wi.th-koeln.de
References
Gallego-Alvarez, I., Galindo-Villardon, M. P., & Rodriguez-Rosa, M. (2015). Analysis of the Sustainable Society Index Worldwide: A Study from the Biplot Perspective. Social Indicators Research, 120(1), 29-65. https://doi.org/10.1007/s11205-014-0579-9
Examples
data(SSI)
## maybe str(SSI) ; plot(SSI) ...
Sustainability Society Index (3w)
Description
Sustainability Society Index, Three way table
Usage
data("SSI3w")
Format
The format is: List of 6 $ a2006: num [1:154, 1:21] 10 9.3 6.6 10 8.9 10 10 10 8.3 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:21] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ... $ a2008: num [1:154, 1:21] 10 9.4 7.1 10 9.3 10 10 10 8.3 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:21] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ... $ a2010: num [1:154, 1:21] 10 9.4 7.7 10 9.4 10 10 10 8.3 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:21] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ... $ a2012: num [1:154, 1:21] 10 10 8.1 10 9.3 10 10 10 8.3 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:21] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ... $ a2014: num [1:154, 1:21] 10 10 8.4 10 9.3 10 10 10 8.3 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:21] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ... $ a2016: num [1:154, 1:21] 10 10 8.6 10 9.4 10 10 10 8.4 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:21] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ...
Details
Sustainability Society Index
Source
https://ssi.wi.th-koeln.de
References
Gallego-Alvarez, I., Galindo-Villardon, M. P., & Rodriguez-Rosa, M. (2015). Analysis of the Sustainable Society Index Worldwide: A Study from the Biplot Perspective. Social Indicators Research, 120(1), 29-65. https://doi.org/10.1007/s11205-014-0579-9
Examples
data(SSI3w)
## maybe str(SSI3w) ; plot(SSI3w) ...
Sustainability Society Index
Description
Sustainability Society Index: economic indicators arranged as a three-way table (one matrix per year).
Usage
data("SSIEcon3w")
Format
The format is: List of 6 $ a2006: num [1:154, 1:5] 1.2 1 1 4.6 1 5.4 9.9 1.9 1 1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:5] "Organic_Farming" "Genuine_Savings" "GDP" "Employment" ... $ a2008: num [1:154, 1:5] 1 1 1 4.2 1 5.6 9.9 1.9 1 1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:5] "Organic_Farming" "Genuine_Savings" "GDP" "Employment" ... $ a2010: num [1:154, 1:5] 1.1 1 1 5.8 1.1 5.6 9.9 2 1 1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:5] "Organic_Farming" "Genuine_Savings" "GDP" "Employment" ... $ a2012: num [1:154, 1:5] 1.1 1 1 5.7 1.1 5.7 9.9 2 1 1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:5] "Organic_Farming" "Genuine_Savings" "GDP" "Employment" ... $ a2014: num [1:154, 1:5] 1.1 1 1 5.3 1.1 5.7 9.9 2.1 1.2 1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:5] "Organic_Farming" "Genuine_Savings" "GDP" "Employment" ... $ a2016: num [1:154, 1:5] 1.1 1 1 4.8 1.1 6.8 9.9 2 1.2 1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:5] "Organic_Farming" "Genuine_Savings" "GDP" "Employment" ...
Details
Sustainability Society Index
Source
https://ssi.wi.th-koeln.de
References
Gallego-Alvarez, I., Galindo-Villardon, M. P., & Rodriguez-Rosa, M. (2015). Analysis of the Sustainable Society Index Worldwide: A Study from the Biplot Perspective. Social Indicators Research, 120(1), 29-65. https://doi.org/10.1007/s11205-014-0579-9
Examples
data(SSIEcon3w)
## maybe str(SSIEcon3w) ; plot(SSIEcon3w) ...
Sustainability Society Index
Description
Sustainability Society Index: environmental indicators arranged as a three-way table (one matrix per year).
Usage
data("SSIEnvir3w")
Format
The format is: List of 6 $ a2006: num [1:154, 1:7] 4.2 6.5 4 4.9 7.7 5.7 8.1 4.9 2.8 6.3 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:7] "Biodiversity_" "Renewable_Water_Resources" "Consumption" "Energy_Use" ... $ a2008: num [1:154, 1:7] 4.8 6.5 4 5.1 7.7 5.7 8 5.7 2.8 6 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:7] "Biodiversity_" "Renewable_Water_Resources" "Consumption" "Energy_Use" ... $ a2010: num [1:154, 1:7] 5.4 6.6 4 5.2 7.7 5.7 8 6.4 2.8 5.8 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:7] "Biodiversity_" "Renewable_Water_Resources" "Consumption" "Energy_Use" ... $ a2012: num [1:154, 1:7] 5.3 6.6 4 5.3 7.7 6.1 8 6.8 2.8 5.8 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:7] "Biodiversity_" "Renewable_Water_Resources" "Consumption" "Energy_Use" ... $ a2014: num [1:154, 1:7] 5.6 6.6 4 5.3 7.7 7 7.9 7.3 2.8 6 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:7] "Biodiversity_" "Renewable_Water_Resources" "Consumption" "Energy_Use" ... $ a2016: num [1:154, 1:7] 5.5 6.6 4.1 5.4 7.8 7.3 7.9 7.3 2.9 5.9 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:7] "Biodiversity_" "Renewable_Water_Resources" "Consumption" "Energy_Use" ...
Details
Sustainability Society Index
Source
https://ssi.wi.th-koeln.de
References
Gallego-Alvarez, I., Galindo-Villardon, M. P., & Rodriguez-Rosa, M. (2015). Analysis of the Sustainable Society Index Worldwide: A Study from the Biplot Perspective. Social Indicators Research, 120(1), 29-65. https://doi.org/10.1007/s11205-014-0579-9
Examples
data(SSIEnvir3w)
## maybe str(SSIEnvir3w) ; plot(SSIEnvir3w) ...
Sustainability Society Index
Description
Sustainability Society Index: human indicators arranged as a three-way table (one matrix per year).
Usage
data("SSIHuman3w")
Format
The format is: List of 6 $ a2006: num [1:154, 1:9] 10 9.3 6.6 10 8.9 10 10 10 8.3 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:9] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ... $ a2008: num [1:154, 1:9] 10 9.4 7.1 10 9.3 10 10 10 8.3 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:9] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ... $ a2010: num [1:154, 1:9] 10 9.4 7.7 10 9.4 10 10 10 8.3 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:9] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ... $ a2012: num [1:154, 1:9] 10 10 8.1 10 9.3 10 10 10 8.3 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:9] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ... $ a2014: num [1:154, 1:9] 10 10 8.4 10 9.3 10 10 10 8.3 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:9] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ... $ a2016: num [1:154, 1:9] 10 10 8.6 10 9.4 10 10 10 8.4 10 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:154] "Albania" "Algeria" "Angola" "Argentina" ... .. ..$ : chr [1:9] "Sufficient_Food" "Sufficient_to_Drink" "Safe_Sanitation" "Education_" ...
Details
Sustainability Society Index
Source
https://ssi.wi.th-koeln.de
References
Gallego-Alvarez, I., Galindo-Villardon, M. P., & Rodriguez-Rosa, M. (2015). Analysis of the Sustainable Society Index Worldwide: A Study from the Biplot Perspective. Social Indicators Research, 120(1), 29-65. https://doi.org/10.1007/s11205-014-0579-9
Examples
data(SSIHuman3w)
## maybe str(SSIHuman3w) ; plot(SSIHuman3w) ...
Separation of different types of variables into a list
Description
The procedure creates a list in which each field contains the variables of the same type.
Usage
SeparateVarTypes(X, TypeVar = NULL, TypeFit = NULL)
Arguments
X |
A data frame |
TypeVar |
A vector of characters defining the type of each variable. If not provided, the procedure tries to guess the type of each variable. See details for types |
TypeFit |
A vector of characters defining the type of fit for each variable. If not provided, the procedure tries to guess the type of fit for each variable. See details for types |
Details
The procedure creates a list in which each field contains the variables of the same type. The type of Variable can be specified in a vector TypeVar and the type of fit in a vector TypeFit. The TypeVar is a vector of characters with as many components as variables with types coded as:
"c" - Continuous (1)
"b" - Binary (2)
"n" - Nominal (3)
"o" - Ordinal (4)
"f" - Frequency (5)
"a" - Abundance (5)
Numbers rather than characters can also be used. Unless specified in TypeVar, numerical variables are "Continuous", factors are "Nominal" and ordered factors are "Ordinal". Factors with just two values are considered "Binary". "Frequencies" and "Abundances" must be specified by the user. If TypeVar has length 1, all the variables are assumed to have the same type.
TypeFit is a vector of characters containing the type of fit used for each variable, coded as:
"a" - Average (1)
"wa" - Weighted Average (2)
"r" - Regression (Linear or logistic depending on the type of variable) (3)
"g" - Gaussian (Equal tolerances) (4)
"g1" - Gaussian (Different tolerances) (5)
Numbers rather than characters can also be used. Unless specified, numerical variables are fitted with linear regression, factors with logistic biplots, frequencies with weighted averages and abundances with Gaussian regression.
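The default guessing rules described above can be sketched in base R as follows. `guess_type_sketch` is a hypothetical helper, not the package's code, and treating a two-level ordered factor as binary is an assumption, since that case is not spelled out.

```r
# Sketch only: guess the variable type code from the column class.
guess_type_sketch <- function(v) {
  if (is.factor(v) && nlevels(v) == 2) return("b")  # two values: binary
  if (is.ordered(v)) return("o")                    # ordered factor: ordinal
  if (is.factor(v))  return("n")                    # other factors: nominal
  if (is.numeric(v)) return("c")                    # numeric: continuous
  NA_character_
}

df <- data.frame(
  height = c(1.6, 1.7, 1.8),
  sex    = factor(c("f", "m", "f")),
  colour = factor(c("r", "g", "b")),
  grade  = factor(c("lo", "mid", "hi"), levels = c("lo", "mid", "hi"),
                  ordered = TRUE)
)

types  <- vapply(df, guess_type_sketch, character(1))
# One data frame per type, collected in a list
groups <- lapply(split(names(df), types), function(nm) df[nm])
types
```

Frequencies and abundances cannot be told apart from continuous variables by class alone, which is why they have to be declared by the user.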
Value
A list containing the following fields
Continuous |
A list containing a data frame with the numeric variables and a character vector with the type of fit for each variable |
Binary |
A list containing a data frame with the binary variables and a character vector with the type of fit for each variable |
Nominal |
A list containing a data frame with the nominal variables and a character vector with the type of fit for each variable |
Ordinal |
A list containing a data frame with the ordinal variables and a character vector with the type of fit for each variable |
Frequency |
A list containing a data frame with the frequency variables and a character vector with the type of fit for each variable |
Abundance |
A list containing a data frame with the abundance variables and a character vector with the type of fit for each variable |
Author(s)
Jose Luis Vicente Villardon
Examples
# Not yet
Simple Procrustes Analysis
Description
Simple Procrustes Analysis for two matrices
Usage
SimpleProcrustes(X, Y, centre = FALSE)
Arguments
X |
Matrix of the first configuration. |
Y |
Matrix of the second configuration. |
centre |
Should the matrices be centred before the calculations? |
Details
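Orthogonal Procrustes Analysis for two configurations X and Y. The first configuration X is used as a reference and the second, Y, is transformed to match the reference as closely as possible. The model is X = s Y T + 1t + E = Z + E, where s is a scale factor, T a rotation matrix, t a translation vector, Z = s Y T + 1t the transformed second configuration and E the matrix of residuals.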
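A least-squares solution of this model can be sketched in base R with a singular value decomposition. `procrustes_sketch` is a hypothetical helper written for illustration. It always centres the configurations and estimates the translation, which may differ from what the package does for a given value of `centre`.

```r
# Sketch only: orthogonal Procrustes fit of Y to the reference X via the SVD.
procrustes_sketch <- function(X, Y) {
  Xc <- scale(X, scale = FALSE)                # column-centred configurations
  Yc <- scale(Y, scale = FALSE)
  sv <- svd(crossprod(Yc, Xc))
  Tm <- sv$u %*% t(sv$v)                       # rotation matrix T
  s  <- sum(sv$d) / sum(Yc^2)                  # scale factor
  tr <- colMeans(X) - s * as.vector(colMeans(Y) %*% Tm)   # translation vector
  Z  <- s * Y %*% Tm + matrix(tr, nrow(Y), ncol(X), byrow = TRUE)
  list(T = Tm, s = s, t = tr, Yrot = Z, rss = sum((X - Z)^2))
}

# X is an exactly rotated, rescaled and shifted copy of Y
set.seed(3)
Y <- matrix(rnorm(20), 10, 2)
theta <- pi / 6
R <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
X <- 2 * Y %*% R + matrix(c(1, -1), 10, 2, byrow = TRUE)

fit <- procrustes_sketch(X, Y)
fit$s
```

Because X was built from Y without noise, the sketch recovers the rotation, scale and translation and the residual sum of squares is numerically zero.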
Value
An object of class Procrustes. This has components:
X |
First Configuration |
Y |
Second Configuration |
Yrot |
Second Configuration after the transformation |
T |
Rotation Matrix |
t |
Translation Vector |
s |
Scale Factor |
rsss |
Residual Sum of Squares |
fit |
Goodness of fit as percent of explained variance |
correlations |
Correlations among the columns of X and Z |
Author(s)
Jose Luis Vicente-Villardon
References
Borg, I. & Groenen, P. J. F. (2005). Modern Multidimensional Scaling. Theory and Applications. Second Edition. Springer.
See Also
Examples
data(spiders)
Sparse version of the NIPALS algorithm for PCA.
Description
Sparse version of the NIPALS algorithm for PCA.
Usage
Sparse.NIPALSPCA(X, dimens = 2, tol = 1e-06, maxiter = 1000, lambda = 0.02)
Arguments
X |
The data matrix. |
dimens |
The dimension of the solution |
tol |
Tolerance of the algorithm. |
maxiter |
Maximum number of iterations. |
lambda |
Value used for sparsity |
Details
Sparse version of the NIPALS algorithm for the singular value decomposition that allows for the construction of PCA and Biplot.
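One common way to obtain sparse loadings within NIPALS is to soft-threshold them at every iteration. The base R sketch below illustrates that idea. It is an assumption about the general technique, not a description of how the package applies `lambda`, and `soft_threshold` and `sparse_nipals_sketch` are hypothetical helpers.

```r
# Sketch only: NIPALS with soft-thresholded loadings.
soft_threshold <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

sparse_nipals_sketch <- function(X, dimens = 2, tol = 1e-6,
                                 maxiter = 1000, lambda = 0.02) {
  X <- as.matrix(X)
  u <- matrix(0, nrow(X), dimens)
  v <- matrix(0, ncol(X), dimens)
  d <- numeric(dimens)
  for (k in seq_len(dimens)) {
    uk <- X[, which.max(colSums(X^2))]          # start from the largest column
    uk <- uk / sqrt(sum(uk^2))
    for (iter in seq_len(maxiter)) {
      vk <- as.vector(crossprod(X, uk))
      vk <- vk / sqrt(sum(vk^2))
      vk <- soft_threshold(vk, lambda)          # shrink small loadings to zero
      if (all(vk == 0)) stop("lambda removes every loading")
      vk <- vk / sqrt(sum(vk^2))
      unew <- as.vector(X %*% vk)
      dk <- sqrt(sum(unew^2))
      unew <- unew / dk
      converged <- sqrt(sum((unew - uk)^2)) < tol
      uk <- unew
      if (converged) break
    }
    u[, k] <- uk
    v[, k] <- vk
    d[k] <- dk
    X <- X - dk * tcrossprod(uk, vk)            # deflate before the next dimension
  }
  list(u = u, d = d, v = v)
}

X <- scale(as.matrix(iris[, 1:4]))
dense  <- sparse_nipals_sketch(X, lambda = 0)   # ordinary NIPALS
sparse <- sparse_nipals_sketch(X, lambda = 0.3)
round(sparse$v, 2)
```

With `lambda = 0` the sketch reduces to the ordinary NIPALS iteration and reproduces the leading singular values of `svd(X)`. Larger values set small loadings exactly to zero.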
Value
The singular value decomposition
u |
The coordinates of the rows (standardized) |
d |
The singular values |
v |
The coordinates of the columns (standardized) |
Author(s)
Jose Luis Vicente Villardon
References
Have to be written
Examples
# Not yet
Hunting spiders environmental data.
Description
Hunting spiders environmental data.
Usage
data("SpidersEnv")
Format
A data frame with 28 observations on the following 6 variables.
Watcont
Water content
Barsand
Bare sand
Covmoss
Cover moss
Ligrefl
Light reflection
Falltwi
Fallen twigs
Coverher
Cover Herbs
Details
Hunting spiders environmental data.
Source
van der Aart, P. J. M., and Smeenk-Enserink, N. (1975) Correlations between distributions of hunting spiders (Lycosidae, Ctenidae) and environmental characteristics in a dune area. Netherlands Journal of Zoology 25, 1-45.
References
Ter Braak, C. J. (1986). Canonical correspondence analysis: a new eigenvector technique for multivariate direct gradient analysis. Ecology, 67(5), 1167-1179.
Examples
data(SpidersEnv)
## maybe str(SpidersEnv) ; plot(SpidersEnv) ...
Hunting Spiders Data
Description
Hunting spiders abundances data.
Usage
data("SpidersSp")
Format
A data frame with 28 observations of abundance of 12 hunting spider species
- Alopacce
Abundance of the species Alopecosa accentuata
- Alopcune
Abundance of the species Alopecosa cuneata
- Alopfabr
Abundance of the species Alopecosa fabrilis
- Arctlute
Abundance of the species Arctosa lutetiana
- Arctperi
Abundance of the species Arctosa perita
- Auloalbi
Abundance of the species Aulonia albimana
- Pardlugu
Abundance of the species Pardosa lugubris
- Pardmont
Abundance of the species Pardosa monticola
- Pardnigr
Abundance of the species Pardosa nigriceps
- Pardpull
Abundance of the species Pardosa pullata
- Trocterr
Abundance of the species Trochosa terricola
- Zoraspin
Abundance of the species Zora spinimana
Source
van der Aart, P. J. M., and Smeenk-Enserink, N. (1975) Correlations between distributions of hunting spiders (Lycosidae, Ctenidae) and environmental characteristics in a dune area. Netherlands Journal of Zoology 25, 1-45.
References
Ter Braak, C. J. (1986). Canonical correspondence analysis: a new eigenvector technique for multivariate direct gradient analysis. Ecology, 67(5), 1167-1179.
Examples
data(SpidersSp)
## maybe str(SpidersSp) ; plot(SpidersSp) ...
STATIS-ACT for multiple tables with common rows and its associated Biplot
Description
The procedure performs STATIS-ACT methodology for multiple tables with common rows and its associated biplot
Usage
StatisBiplot(X, InitTransform = "Standardize columns", dimens = 2,
SameVar = FALSE)
Arguments
X |
A list containing multiple tables with common rows. |
InitTransform |
Initial transformation of the data matrices |
dimens |
Dimension of the final solution |
SameVar |
Are the variables the same for all occasions? If so, Biplot trajectories for each variable will be calculated. |
Details
The procedure performs the STATIS-ACT methodology for multiple tables with common rows and builds its associated biplot. When the variables are the same for all occasions, trajectories for the variables can also be plotted. Basic plotting includes the consensus individuals and all the variables. Traditional trajectories for individuals and biplot trajectories for variables (when adequate) are optional. The original data must be provided as a list in which each element is the data matrix for one occasion; the number of rows must be the same for every occasion.
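As a minimal sketch of the expected input, the list of tables can be built and checked in base R before it is passed to StatisBiplot. The occasion names (occ1 to occ3), the helper make_table and the simulated values are hypothetical and only illustrate the structure.

```r
set.seed(1)
n <- 10                      # common number of rows (individuals)
vars <- c("V1", "V2", "V3")  # same variables on every occasion

# Hypothetical helper: one simulated table per occasion
make_table <- function() {
  matrix(rnorm(n * length(vars)), nrow = n,
         dimnames = list(paste0("ind", 1:n), vars))
}
X <- list(occ1 = make_table(), occ2 = make_table(), occ3 = make_table())

# Every table must have the same number of rows
same_rows <- length(unique(sapply(X, nrow))) == 1
# SameVar = TRUE only makes sense when the columns also coincide
same_vars <- all(sapply(X, function(m) identical(colnames(m), colnames(X[[1]]))))
```

If both checks are TRUE, a call such as StatisBiplot(X, SameVar = TRUE) should be appropriate for this kind of list.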
Value
An object of class StatisBiplot
Author(s)
Jose Luis Vicente Villardon
References
Abdi, H., Williams, L.J., Valentin, D., & Bennani-Dosse, M. (2012). STATIS and DISTATIS: optimum multitable principal component analysis and three way metric multidimensional scaling. WIREs Comput Stat, 4, 124-167.
Efron, B., Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman and Hall. 436p.
Escoufier, Y. (1976). Operateur associe a un tableau de donnees. Annales de l'Insee, 22-23, 165-178.
Escoufier, Y. (1987). The duality diagram: a means for better practical applications. In P. Legendre & L. Legendre (Eds.), Developments in Numerical Ecology, pp. 139-156, NATO Advanced Institute, Series G. Berlin: Springer.
L'Hermier des Plantes, H. (1976). Structuration des Tableaux a Trois Indices de la Statistique. [These de Troisieme Cycle]. University of Montpellier, France.
Ringrose, T. J. (1992). Bootstrapping and Correspondence Analysis in Archaeology. Journal of Archaeological Science, 19: 615-629.
Examples
data(Chemical)
# Extract continuous data from the original data frame.
x= Chemical[,5:16]
# Obtaining the three way table as a list
X=Convert2ThreeWay(x,Chemical$WEEKS, columns=FALSE)
# Calculating the Biplot associated to STATIS-ACT
stbip=StatisBiplot(X, SameVar=TRUE)
# Basic plot of the results
plot(stbip)
# Colors By Table
plot(stbip, VarColorType="ByTable")
# Colors By Variable
plot(stbip, VarColorType="ByVar", mode="s", MinQualityVars = 0.5)
plot(stbip, PlotRowTraj = TRUE, PlotVars=FALSE, RowColors=1:36)
Dual STATIS-ACT for binary data based on Tetrachoric Correlations
Description
Dual STATIS-ACT for binary data based on Tetrachoric Correlations
Usage
TetraDualStatis(X, dimens = 2, SameInd = FALSE, RotVarimax = FALSE,
OptimMethod = "L-BFGS-B", penalization = 0.01)
Arguments
X |
A three way binary data matrix |
dimens |
Dimension of the solution |
SameInd |
Are the individuals the same in all occasions? |
RotVarimax |
Should the solution be rotated? |
OptimMethod |
Optimization method for the gradients |
penalization |
Penalization for the ridge solution |
Details
The general aim of STATIS-ACT methods is to extract the information common to a set of datasets measured on the same individuals. The individuals can also be represented as a Euclidean configuration or map of points (or vectors), in the same way as in Principal Component Analysis (PCA) or Principal Coordinate Analysis (PCoA). If the objective is to analyze the variables and the correlation structures between them, we use a Factor Analysis (FA). When we have tables in which we measure a set of common variables and we want to obtain a consensus structure of all of them, we use the so-called Dual STATIS.
The method was initially designed to work with individuals common to all the tables, but in this work, we will focus on the dual version, which works with variables common to all of them.
When we have several tables of binary data, the classical methods for continuous data are not suitable. If the individuals are the same in all tables, we can use a STATIS based on distances, also known as DISTATIS. The procedure consists of computing a distance matrix from a similarity coefficient for binary data. The distances are converted into scalar products, as in PCoA, and the analysis then proceeds from them as in traditional STATIS.
When we have common variables, and we are interested in the association between them, we could use a coefficient that, instead of similarity, shows the association between the variables. In this work we propose the use of the tetrachoric correlation matrix for each table and develop the necessary adaptations to the method.
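The distance-based step mentioned above (similarity, then distance, then scalar products) can be sketched in base R. This is only an illustration under stated assumptions: the simple matching coefficient and the transformation d^2 = 1 - s are one common choice, not necessarily the one used by the package, and the toy table is invented.

```r
# Toy binary table: 5 individuals, 4 binary variables (invented values)
X <- matrix(c(1, 0, 1, 1,
              1, 1, 0, 1,
              0, 0, 1, 0,
              1, 1, 1, 1,
              0, 1, 0, 0), nrow = 5, byrow = TRUE)
p <- ncol(X)

# Simple matching similarity: proportion of variables on which two rows agree
S <- (X %*% t(X) + (1 - X) %*% t(1 - X)) / p

# Squared distances derived from the similarities (assumed transformation)
D2 <- 1 - S

# Double centering converts squared distances into scalar products, as in PCoA
n <- nrow(X)
J <- diag(n) - matrix(1 / n, n, n)
B <- -0.5 * J %*% D2 %*% J
```

The matrix B plays the role that the cross-product matrix of each table plays in the traditional method; its rows and columns sum to zero by construction.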
Value
An object with the results
Author(s)
Laura Vicente-Gonzalez, José Luis Vicente-Villardon
Examples
# Not yet
Converts a multitable list to a two way matrix
Description
Takes a multitable list of matrices X and converts it to a two way matrix with the structure required by the Statis programs, using a _ to separate variable and occasion or study.
Usage
Three2TwoWay(X, whatlines = 2)
Arguments
X |
The multitable list. |
whatlines |
Concatenate the rows (1) or the columns (2) |
Details
Takes a multitable list of matrices X and converts it to a two way matrix with the structure required by the Statis programs, using a _ to separate variable and occasion or study. When whatlines is 1 the rows of the tables are stacked, so the columns must be the same for all studies. When whatlines is 2 the columns are concatenated, so the number of rows must be the same for all studies.
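The two layouts can be mimicked in base R as follows. This is a sketch of the idea, not the package's implementation; the list, the study names and the object names wide and long are hypothetical.

```r
X <- list(
  study1 = matrix(1:6,  nrow = 3, dimnames = list(NULL, c("a", "b"))),
  study2 = matrix(7:12, nrow = 3, dimnames = list(NULL, c("a", "b")))
)

# Analogue of whatlines = 2: concatenate columns, so the rows must match
stopifnot(length(unique(sapply(X, nrow))) == 1)
wide <- do.call(cbind, X)
# Name each column as variable_study
colnames(wide) <- unlist(lapply(names(X), function(s) {
  paste(colnames(X[[s]]), s, sep = "_")
}))

# Analogue of whatlines = 1: stack rows, so the columns must match
stopifnot(length(unique(sapply(X, ncol))) == 1)
long <- do.call(rbind, X)
```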
Value
A two way matrix
x |
A two way matrix |
Author(s)
Jose Luis Vicente Villardon
Examples
# No examples yet
Three to two way data
Description
Three to two way data.
Usage
ThreeWay2FrontalSlices(X, Slice = 1)
Arguments
X |
A three way array. |
Slice |
The mode for the rows |
Details
Three to two way data. The provided mode is placed on the rows. The columns are the result of interactively coding the other two modes.
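The unfolding described above can be sketched with aperm() and matrix() in base R. The function name unfold and the ordering of the combined columns are assumptions made for illustration; the package may order the columns differently.

```r
A <- array(1:24, dim = c(2, 3, 4))   # three modes of sizes 2, 3 and 4

unfold <- function(A, mode = 1) {
  d <- dim(A)
  others <- setdiff(seq_along(d), mode)
  # Bring the chosen mode to the front, then flatten the other two modes
  matrix(aperm(A, c(mode, others)), nrow = d[mode])
}

M1 <- unfold(A, 1)   # 2 x 12: first mode on the rows
M2 <- unfold(A, 2)   # 3 x 8: second mode on the rows
```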
Value
A two way matrix.
Author(s)
José Luis Vicente-Villardon
Examples
# No examples yet
Initial transformation of a data matrix
Description
Initial transformation of data before the construction of a biplot (or any other technique).
Usage
TransformIni(X, InitTransform = "None", transform = "Standardize columns")
Arguments
X |
Original Raw Data Matrix |
InitTransform |
Initial transform of the data (usually logarithm) |
transform |
Transformation to use. See details. |
Details
Possible Transformations are:
1.- "Raw Data": When no transformation is required.
2.- "Substract the global mean": Eliminate an effect common to all the observations
3.- "Double centering" : Interaction residuals. When all the elements of the table are comparable. Useful for AMMI models.
4.- "Column centering": Remove the column means.
5.- "Standardize columns": Remove the column means and divide by its standard deviation.
6.- "Row centering": Remove the row means.
7.- "Standardize rows": Divide each row by its standard deviation.
8.- "Divide by the column means and center": The resulting dispersion is the coefficient of variation.
9.- "Normalized residuals from independence" for a contingency table.
The transformation can be provided to the function by using the string between the quotes or just the associated number.
The supplementary rows and columns are not used to calculate the parameters (means, standard deviations, etc). Some of the transformations are not compatible with supplementary data.
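For orientation, transformations 3, 4 and 5 can be reproduced approximately in base R. This is a sketch, not the package's code: in particular, scale() divides by the n - 1 standard deviation, and whether TransformIni uses n or n - 1 is an assumption left open here.

```r
x <- as.matrix(iris[, 1:4])

# 4.- Column centering: remove the column means
col_centered <- sweep(x, 2, colMeans(x))

# 5.- Standardize columns: center and divide by the standard deviation
col_std <- scale(x)

# 3.- Double centering: remove row and column effects, add back the global mean
dbl <- x - outer(rowMeans(x), rep(1, ncol(x))) -
  outer(rep(1, nrow(x)), colMeans(x)) + mean(x)
```

After double centering, both the row means and the column means of the table are zero, which is why the result can be read as interaction residuals.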
Value
X |
Transformed data matrix |
Author(s)
Jose Luis Vicente Villardon
References
Baxter, M. J. (1995). Standardization and Transformation in Principal Component Analysis, with Applications to Archaeometry. Journal of the Royal Statistical Society, Series C (Applied Statistics), 44(4), 513-527.
Kroonenberg, P. M. (1983). Three-mode principal component analysis: Theory and applications (Vol. 2). DSWO press. (Chapter 6)
Examples
data(iris)
x=as.matrix(iris[,1:4])
x=TransformIni(x, transform=4)
x
Truncated version of the NIPALS algorithm for PCA.
Description
Truncated version of the NIPALS algorithm for PCA.
Usage
Truncated.NIPALSPCA(X, dimens = 2, tol = 1e-06, maxiter = 1000, lambda = 0.02)
Arguments
X |
The data matrix. |
dimens |
The dimension of the solution |
tol |
Tolerance of the algorithm. |
maxiter |
Maximum number of iterations. |
lambda |
Value used for truncation |
Details
Classical NIPALS algorithm for the singular value decomposition that allows for the construction of PCA and Biplot.
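A minimal sketch of the classical NIPALS iteration is given below for orientation. It does not implement the truncation controlled by lambda, and the function name nipals_svd, the starting vector and the convergence rule are assumptions rather than the package's choices.

```r
nipals_svd <- function(X, dimens = 2, tol = 1e-6, maxiter = 1000) {
  X <- as.matrix(X)
  u <- matrix(0, nrow(X), dimens)
  v <- matrix(0, ncol(X), dimens)
  d <- numeric(dimens)
  for (k in seq_len(dimens)) {
    # Start from the column with the largest variance
    t_k <- X[, which.max(apply(X, 2, var))]
    for (iter in seq_len(maxiter)) {
      p_k <- crossprod(X, t_k)            # loadings: X' t
      p_k <- p_k / sqrt(sum(p_k^2))       # normalise to unit length
      t_new <- X %*% p_k                  # scores: X p
      converged <- sum((t_new - t_k)^2) < tol^2 * sum(t_new^2)
      t_k <- t_new
      if (converged) break
    }
    d[k] <- sqrt(sum(t_k^2))              # singular value
    u[, k] <- t_k / d[k]                  # standardized row coordinates
    v[, k] <- p_k                         # standardized column coordinates
    X <- X - t_k %*% t(p_k)               # deflate before the next dimension
  }
  list(u = u, d = d, v = v)
}

x <- scale(as.matrix(iris[, 1:4]), scale = FALSE)
res <- nipals_svd(x)
```

On complete data the singular values in res$d should agree closely with the leading values of svd(x)$d.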
Value
The singular value decomposition
u |
The coordinates of the rows (standardized) |
d |
The singular values |
v |
The coordinates of the columns (standardized) |
Author(s)
Jose Luis Vicente Villardon
References
Have to be written
See Also
Examples
# Not yet
Multidimensional Unfolding
Description
Multidimensional Unfolding with some adaptations for vegetation analysis
Usage
Unfolding(A, ENV = NULL, TransAbund = "Gaussian Columns", offset = 0.5,
weight = "All_1", Constrained = FALSE,
TransEnv = "Standardize columns",
InitConfig = "SVD", model = "Ratio",
condition = "Columns", Algorithm = "SMACOF",
OptimMethod = "CG", r = 2, maxiter = 100,
tolerance = 1e-05, lambda = 1, omega = 0, plot = FALSE)
Arguments
A |
The original proximities matrix |
ENV |
The matrix of environmental variables |
TransAbund |
Initial transformation of the abundances: "None", "Gaussian", "Column Percent", "Gaussian Columns", "Inverse Square Root" or "Divide by Column Maximum" |
offset |
offset is the quantity added to the zeros of the table |
weight |
A matrix of weights for each cell of the table |
Constrained |
Should fit a constrained analysis |
TransEnv |
Transformation of the environmental variables |
InitConfig |
Init configuration for the algorithm |
model |
Type of model to be fitted: "Identity", "Ratio", "Interval" or "Ordinal". |
condition |
"Matrix", "Columns" to condition to the whole matrix or to each column |
Algorithm |
Algorithm to fit the model: "SMACOF", "GD", "Genefold" |
OptimMethod |
Optimization method for gradient descent |
r |
Dimension of the solution |
maxiter |
Maximum number of iterations in the algorithm |
tolerance |
Tolerance for the algorithm |
lambda |
First penalization parameter |
omega |
Second penalization parameter |
plot |
Should the results be plotted? |
Details
ological data
Value
An object of class "Unfolding"
Author(s)
Jose Luis Vicente Villardon
References
See the articles
Examples
unf=Unfolding(SpidersSp, ENV=SpidersEnv, model="Ratio", Constrained = FALSE, condition="Matrix")
plot(unf, PlotTol=TRUE, PlotEnv = FALSE)
plot(unf, PlotTol=TRUE, PlotEnv = TRUE)
cbind(unf$QualityVars, unf$Var_Fit)
unf2=Unfolding(SpidersSp, ENV=SpidersEnv, model="Ratio", Constrained = TRUE, condition="Matrix")
plot(unf2, PlotTol=FALSE, PlotEnv = TRUE, mode="s")
cbind(unf2$QualityVars, unf2$Var_Fit)
Draws a variable on a biplot
Description
Draws a continuous variable on a biplot
Usage
VarBiplot(bi1, bi2, b0 = 0, xmin = -3, xmax = 3, ymin = -3, ymax
= 3, label = "Point", mode = "a", CexPoint = 0.8,
PchPoint = 1, Color = "blue", ticks = c(-3, -2.5, -2,
-1.5, -1, -0.5, 0.5, 1, 1.5, 2, 2.5, 3), ticklabels =
round(ticks, digits = 2), tl = 0.04, ts = "Complete",
Position = "Angle", AddArrow=FALSE, CexScale=0.8, ...)
Arguments
bi1 |
First component of the direction vector |
bi2 |
Second component of the direction vector |
b0 |
Constant for the regression adjusted biplots |
xmin |
Minimum value of the x axis |
xmax |
Maximum value of the x axis |
ymin |
Minimum value of the y axis |
ymax |
Maximum value of the y axis |
label |
Label of the variable |
mode |
Mode of the biplot: "p", "a", "b", "h", "ah" and "s". |
CexPoint |
Size for the symbols and labels of the variables |
PchPoint |
Symbols for the variable (when represented as a point) |
Color |
Color for the variable |
ticks |
Ticks when the variable is represented as a graded scale |
ticklabels |
Labels for the ticks when the variable is represented as a graded scale |
tl |
Tick length |
ts |
Size of the mark in the graded scale |
Position |
If the Position is "Angle" the label of the variable is placed using the angle of the vector |
AddArrow |
Add an arrow to the representation of other modes of the biplot. |
CexScale |
Sizes of the scales |
... |
Any other graphical parameters |
Details
See plot.PCA.Biplot
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
See Also
Examples
data(Protein)
bip=PCA.Biplot(Protein[,3:11])
plot(bip)
Weighted Principal Coordinates Analysis
Description
Weighted Principal Coordinates Analysis
Usage
WeightedPCoA(Proximities,
weigths = matrix(1,dim(Proximities$Proximities)[1],1),
dimension = 2, tolerance=0.0001)
Arguments
Proximities |
A matrix containing the proximities among a set of objects |
weigths |
Weights |
dimension |
Dimension of the solution |
tolerance |
Tolerance for the eigenvalues |
Details
Weighted Principal Coordinates Analysis
Value
An object of class Principal.Coordinates
Author(s)
Jose Luis Vicente-Villardon
References
Gower, J. C. (2006). Similarity, dissimilarity and distance, measures of. Encyclopedia of Statistical Sciences. 2nd ed. Volume 12. Wiley.
Gower, J.C. (1966). Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53: 325-338.
J.R. Demey, J.L. Vicente-Villardon, M.P. Galindo, A.Y. Zambrano, Identifying molecular markers associated with classifications of genotypes by external logistic biplot, Bioinformatics 24 (2008) 2832.
Cuadras, C. M., Fortiana, J. Metric scaling graphical representation of Categorical Data. Proceedings of Statistics Day, The Center for Multivariate Analysis, Pennsylvania State University, Part 2, pp.1-27, 1995.
See Also
Examples
data(spiders)
dist=BinaryProximities(spiders)
pco=WeightedPCoA(dist)
Compares two binary logistic models
Description
Anova for comparing two binary logistic models
Usage
## S3 method for class 'RidgeBinaryLogistic'
anova(object, object2, ...)
Arguments
object |
The first model |
object2 |
The second model |
... |
Any additional arguments |
Details
Anova for comparing two binary logistic models
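The usual way to compare two nested binary logistic models is a likelihood ratio test on the difference of deviances. The sketch below uses unpenalized glm() fits on simulated data, not RidgeBinaryLogistic objects, so it illustrates the idea only; the package method may differ in detail and in output.

```r
set.seed(123)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x1))   # simulated binary response

m0 <- glm(y ~ 1, family = binomial)          # smaller model
m1 <- glm(y ~ x1 + x2, family = binomial)    # larger model

# Likelihood ratio statistic: difference in deviances
lr <- deviance(m0) - deviance(m1)
df <- df.residual(m0) - df.residual(m1)
p_value <- pchisq(lr, df = df, lower.tail = FALSE)
```

The same statistic is reported by anova(m0, m1, test = "Chisq") for glm objects.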
Value
The comparison of the two models.
Author(s)
Jose Luis Vicente Villardon
Examples
# Not yet
Diagonal matrix from a vector
Description
Creates a diagonal matrix from a vector
Usage
diagonal(d)
Arguments
d |
A numerical vector |
Value
A diagonal matrix with the values of the vector on the diagonal and zeros elsewhere
Author(s)
Jose Luis Vicente Villardon
Examples
diagonal(c(1, 2, 3, 4, 5))
Connects two sets of points by lines
Description
Connects two sets of points by lines in a rowwise manner. Adapted from Graffelman (2013).
Usage
dlines(SetA, SetB, lin = "dotted", color = "black", ...)
Arguments
SetA |
First set of points |
SetB |
Second set of points |
lin |
Line style. |
color |
Line color |
... |
Any other graphical parameters |
Details
Connects two sets of points by lines
Value
NULL
Author(s)
Based on Graffelman (2013)
References
Jan Graffelman (2013). calibrate: Calibration of Scatterplot and Biplot Axes. R package version 1.7.2. http://CRAN.R-project.org/package=calibrate
Examples
## No examples
G inverse
Description
Calculates the g-inverse of a square matrix using the eigen decomposition and removing the eigenvalues smaller than a tolerance.
Usage
ginv(X, tol = sqrt(.Machine$double.eps))
Arguments
X |
Matrix to calculate the g-inverse |
tol |
Tolerance. |
Details
The function is useful to avoid singularities.
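A minimal sketch of an eigen-based g-inverse is shown below. It assumes a symmetric matrix and a threshold relative to the largest eigenvalue; both are assumptions, and ginv_eigen is a hypothetical name, not the package function.

```r
ginv_eigen <- function(X, tol = sqrt(.Machine$double.eps)) {
  e <- eigen(X, symmetric = TRUE)
  # Keep only eigenvalues clearly above zero (relative threshold, assumed)
  keep <- e$values > tol * max(abs(e$values))
  V <- e$vectors[, keep, drop = FALSE]
  # V diag(1 / lambda) V'
  V %*% (t(V) / e$values[keep])
}

x <- as.matrix(iris[, 1:4])
x <- cbind(x, x[, 1] + x[, 2])   # add a linearly dependent column
S <- crossprod(x)                # 5 x 5 cross-product matrix of rank 4
G <- ginv_eigen(S)
```

Although S is singular, G satisfies the defining property S G S = S up to rounding error.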
Value
Returns the g-inverse
Author(s)
Jose Luis Vicente Villardon
Examples
data(iris)
x=as.matrix(iris[,1:4])
S= t(x) %*% x  # square cross-product matrix
ginv(S)
Logit function
Description
Calculates the logit of a probability
Usage
logit(p)
Arguments
p |
A probability |
Details
Calculates the logit of a probability
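The logit of a probability p is log(p / (1 - p)). A small base R sketch, with the hypothetical name logit_sketch to avoid clashing with the package function:

```r
logit_sketch <- function(p) log(p / (1 - p))

p <- c(0.1, 0.5, 0.9)
z <- logit_sketch(p)

# plogis() is the inverse transformation in base R
back <- plogis(z)
```

Base R also provides qlogis(), which gives the same values as this sketch.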
Value
The logit of the provided probability
Author(s)
Jose Luis Vicente Villardón
Matrix square root
Description
Matrix square root of a matrix using the eigendecomposition.
Usage
matrixsqrt(S, tol = sqrt(.Machine$double.eps))
Arguments
S |
A square matrix |
tol |
Tolerance for the eigenvalues |
Details
Matrix square root of a matrix using the eigendecomposition and removing the eigenvalues smaller than a tolerance
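The construction can be sketched in a few lines of base R, assuming a symmetric positive semidefinite matrix. The name matrixsqrt_sketch is hypothetical and the absolute threshold on the eigenvalues is an assumption.

```r
matrixsqrt_sketch <- function(S, tol = sqrt(.Machine$double.eps)) {
  e <- eigen(S, symmetric = TRUE)
  keep <- e$values > tol                  # drop eigenvalues below the tolerance
  V <- e$vectors[, keep, drop = FALSE]
  # V diag(sqrt(lambda)) V'
  V %*% (t(V) * sqrt(e$values[keep]))
}

S <- cov(iris[, 1:4])
R <- matrixsqrt_sketch(S)
```

Multiplying R by itself recovers S when no eigenvalue has been removed.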
Value
The matrix square root of the argument
Author(s)
Jose Luis Vicente Villardon
Examples
data(iris)
x=as.matrix(iris[,1:4])
S= t(x) %*% x  # square cross-product matrix
matrixsqrt(S)
Inverse of the matrix square root
Description
Inverse of the Matrix square root of a matrix using the eigendecomposition.
Usage
matrixsqrtinv(S, tol = sqrt(.Machine$double.eps))
Arguments
S |
A square matrix |
tol |
Tolerance for the eigenvalues |
Details
Inverse of the Matrix square root of a matrix using the eigendecomposition and removing the eigenvalues smaller than a tolerance
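One typical use of an inverse square root, given here as an illustration rather than as the stated purpose of the function, is to transform centered data so that its covariance matrix becomes the identity. The name matrixsqrtinv_sketch and the absolute eigenvalue threshold are assumptions.

```r
matrixsqrtinv_sketch <- function(S, tol = sqrt(.Machine$double.eps)) {
  e <- eigen(S, symmetric = TRUE)
  keep <- e$values > tol                  # drop eigenvalues below the tolerance
  V <- e$vectors[, keep, drop = FALSE]
  # V diag(1 / sqrt(lambda)) V'
  V %*% (t(V) / sqrt(e$values[keep]))
}

x <- scale(as.matrix(iris[, 1:4]), scale = FALSE)   # centered data
W <- matrixsqrtinv_sketch(cov(x))
z <- x %*% W                                        # transformed data
```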
Value
The inverse matrix square root of the argument
Author(s)
Jose Luis Vicente Villardon
See Also
Examples
data(iris)
x=as.matrix(iris[,1:4])
S= t(x) %*% x  # square cross-product matrix
matrixsqrtinv(S)
Moth data
Description
Moth data
Usage
data("moth")
Format
A data frame with 12 observations on the following 14 variables.
s1
a numeric vector
s2
a numeric vector
s3
a numeric vector
s4
a numeric vector
s5
a numeric vector
s6
a numeric vector
s7
a numeric vector
s8
a numeric vector
s9
a numeric vector
s10
a numeric vector
s11
a numeric vector
s12
a numeric vector
s13
a numeric vector
s14
a numeric vector
Details
Moth data
Source
Whittaker
References
Milan, L. and Whittaker, J. (1995). Application of the Parametric Bootstrap to Models that Incorporate a Singular Value Decomposition. Applied Statistics, 44(1), 31-49.
Examples
data(moth)
## maybe str(moth) ; plot(moth) ...
Matrix of ones
Description
Square matrix of ones
Usage
ones(n)
Arguments
n |
Order of the matrix |
Details
Square matrix of ones
Value
A matrix of ones of order n.
Author(s)
Jose Luis Vicente Villardon
Examples
ones(6)
Plots the results of a Binary Logistic Biplot
Description
Plots the results of a Binary Logistic Biplot
Usage
## S3 method for class 'Binary.Logistic.Biplot'
plot(x, F1 = 1, F2 = 2, ShowAxis = FALSE, margin = 0,
PlotVars = TRUE, PlotInd = TRUE, WhatRows = NULL, WhatCols = NULL,
LabelRows = TRUE, LabelCols = TRUE, ShowBox = FALSE, RowLabels = NULL,
ColLabels = NULL, RowColors = NULL, ColColors = NULL, Mode = "s",
TickLength = 0.01, RowCex = 0.8, ColCex = 0.8, SmartLabels = FALSE,
MinQualityRows = 0, MinQualityCols = 0, dp = 0, PredPoints = 0,
SizeQualRows = FALSE, SizeQualCols = FALSE, ColorQualRows = FALSE,
ColorQualCols = FALSE, PchRows = NULL, PchCols = NULL, PlotClus = FALSE,
TypeClus = "ch", ClustConf = 1, Significant = TRUE, alpha = 0.05,
Bonferroni = TRUE, PlotSupVars = TRUE, AbbreviateLabels = FALSE, MainTitle = TRUE, Title =
NULL, RemoveXYlabs = FALSE, CenterCex = 1.5, ...)
Arguments
x |
An object of class Binary.Logistic.Biplot |
F1 |
Dimension for the first axis of the representation. Default = 1 |
F2 |
Dimension for the second axis of the representation. Default = 2 |
ShowAxis |
Should the axis of the representation be shown? |
margin |
Margin of the plot as a percentage. It gets some space for the labels. |
PlotVars |
Should the variables be plotted? |
PlotInd |
Should the individuals be plotted? |
WhatRows |
What Rows should be plotted. A binary vector containing which rows (individuals) should be plotted (1) and which should not (0). |
WhatCols |
What Columns should be plotted. A binary vector containing which columns (variables) should be plotted (1) and which should not (0). |
LabelRows |
Should the individuals be labeled? |
LabelCols |
Should the variables be labeled? |
ShowBox |
Should a box around the points be plotted? |
RowLabels |
A vector of row labels. If NULL the labels contained in the object will be used. |
ColLabels |
A vector of column labels. If NULL the labels contained in the object will be used. |
RowColors |
A vector of alternative row colors. |
ColColors |
A vector of alternative column colors. |
Mode |
Mode of the biplot: "p", "a", "b", "h", "ah" and "s". |
TickLength |
Length of the scale ticks for the biplot variables. |
RowCex |
Cex (Size) of the rows (marks and labels). Can be a single common size for all the points or a vector with individual sizes. |
ColCex |
Cex (Size) of the columns (marks and labels). Can be a single common size for all the points or a vector with individual sizes. |
SmartLabels |
Should the labels be placed in a smart way? |
MinQualityRows |
Minimum quality of the rows to be plotted. (Between 0 and 1) |
MinQualityCols |
Minimum quality of the columns to be plotted. (Between 0 and 1) |
dp |
A vector of variable indices to project all the individuals onto each variable of the vector. |
PredPoints |
A vector of row indices to project onto each variable. |
SizeQualRows |
Should the size of the Row points be related to its quality? |
SizeQualCols |
Should the size of the Column points be related to its quality? |
ColorQualRows |
Should the color of the Row points be related to its quality? |
ColorQualCols |
Should the color of the Column points be related to its quality? |
PchRows |
Marks for the rows (numbers). Can be a single common mark for all the points or a vector with individual marks. |
PchCols |
Marks for the columns (numbers). Can be a single common mark for all the points or a vector with individual marks. |
PlotClus |
Should the clusters be plotted? |
TypeClus |
Type of plot for the clusters. ("ch"- Convex Hull, "el"- Ellipse or "st"- Star) |
ClustConf |
Percent of points included in the cluster. Only the ClustConf percent of the points nearest to the center will be used to calculate the cluster. |
Significant |
Should only the significant variables be plotted? |
alpha |
Significance level. |
Bonferroni |
Should the Bonferroni correction be used? |
PlotSupVars |
Should the Supplementary variables be plotted? |
AbbreviateLabels |
Should labels be abbreviated? |
MainTitle |
Should the main title be displayed? |
Title |
Title to display. |
RemoveXYlabs |
Should the axis labels be removed? |
CenterCex |
Size of the point for 0.5 probability. |
... |
Any other graphical parameter. |
Details
Plots a biplot for binary data. The biplot for binary data is taken as the basis of the plot. If there is a mixture of different types of variables (binary, nominal, abundance, ...), the other types are added to the biplot as supplementary parts.
There are several modes for plotting the biplot:
"p" .- Points (Rows and Columns are represented by points)
"a" .- Arrows (The traditional representation with points for rows and arrows for columns)
"b" .- The arrows for the columns are extended to both extremes of the plot and labeled outside the plot area.
"h" .- The arrows for the columns are extended to the positive extreme of the plot and labeled outside the plot area.
"ah" .- Same as arrows but labeled outside the plot area.
"s" .- The directions (or biplot axes) have a graded scale for prediction of the original values.
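For mode "s", the position of each scale mark on a variable's direction can be sketched as follows, assuming the usual logistic biplot model logit(p) = b0 + b1 x + b2 y. The parameter values are hypothetical and the code is not the package's plotting routine.

```r
b0 <- -0.4                 # hypothetical intercept of one binary variable
b <- c(1.5, 0.8)           # hypothetical slopes on the two plotted dimensions
probs <- c(0.25, 0.5, 0.75)

# Distance factor along the direction b for each target probability
scale_factor <- (qlogis(probs) - b0) / sum(b^2)
markers <- outer(scale_factor, b)    # one row (x, y) per probability

# Any point that projects onto a marker is predicted that probability
predicted <- plogis(b0 + markers %*% b)
```

The marker for probability 0.5 is the point that the CenterCex argument refers to.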
Value
The plot of the biplot.
Author(s)
Jose Luis Vicente Villardon
References
Vicente-Villardon, J. L., Galindo, M. P. and Blazquez, A. (2006) Logistic Biplots. In Multiple Correspondence Analysis and Related Methods. Greenacre, M. & Blasius, J. (Eds.), Chapman and Hall, Boca Raton.
Demey, J., Vicente-Villardon, J. L., Galindo, M. P. and Zambrano, A. (2008) Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics, 24(24): 2832-2838.
Examples
data(spiders)
X=Dataframe2BinaryMatrix(spiders)
logbip=BinaryLogBiplotGD(X,penalization=0.1)
plot(logbip, Mode="a")
summary(logbip)
Plots the solution of a Correspondence Analysis
Description
Plots the solution of a Correspondence Analysis
Usage
## S3 method for class 'CA.sol'
plot(x, ...)
Arguments
x |
A CA.sol object |
... |
Any other biplot and graphical parameters |
Details
Plots the solution of a Correspondence Analysis
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
References
Add some references here
See Also
Examples
data(riano)
Sp=riano[,3:15]
cabip=CA(Sp)
plot(cabip)
Plots the solution of a Canonical Correspondence Analysis
Description
Plots the solution of a Canonical Correspondence Analysis using similar parameters to the continuous biplot
Usage
## S3 method for class 'CCA.sol'
plot(x, A1 = 1, A2 = 2, ShowAxis = FALSE, margin = 0,
PlotSites = TRUE, PlotSpecies = TRUE, PlotEnv = TRUE,
LabelSites = TRUE, LabelSpecies = TRUE, LabelEnv =
TRUE, TypeSites = "wa", SpeciesQuality = FALSE,
MinQualityVars = 0.3, dp = 0, pr = 0, PlotAxis =
FALSE, TypeScale = "Complete", ValuesScale =
"Original", mode = "a", CexSites = NULL, CexSpecies =
NULL, CexVar = NULL, ColorSites = NULL, ColorSpecies =
NULL, ColorVar = NULL, PchSites = NULL, PchSpecies =
NULL, PchVar = NULL, SizeQualSites = FALSE,
SizeQualSpecies = FALSE, SizeQualVars = FALSE,
ColorQualSites = FALSE, ColorQualSpecies = FALSE,
ColorQualVars = FALSE, SmartLabels = FALSE, ...)
Arguments
x |
The results of a CCA model |
A1 |
Dimension for the first axis |
A2 |
Dimension for the second axis |
ShowAxis |
Logical variable to control if the coordinate axes should appear in the plot. The default value is FALSE because for most of the biplots its presence is irrelevant. |
margin |
Margin for the labels in some of the biplot modes (percentage of the plot width). Default is 0. Increase the value if the labels are not completely plotted. |
PlotSites |
Should the sites be plotted? |
PlotSpecies |
Should the species be plotted? |
PlotEnv |
Should the environmental variables be plotted? |
LabelSites |
Should the sites be labeled? |
LabelSpecies |
Should the species be labeled? |
LabelEnv |
Should the environmental variables be labeled? |
TypeSites |
Type for the sites plot |
SpeciesQuality |
Quality for the species |
MinQualityVars |
Minimum quality to plot a variable |
dp |
A set of indices with the variables that will show the projections of the individuals. |
pr |
A set of indices with the individuals to show the projections on the variables. |
PlotAxis |
Should the axis be plotted? |
TypeScale |
Type of scale to use : "Complete", "StdDev" or "BoxPlot" |
ValuesScale |
Values to show on the scale: "Original" or "Transformed" |
mode |
Mode of the biplot: "p", "a", "b", "h", "ah" and "s". |
CexSites |
Size for the symbols and labels of the sites. Can be a single common size for all the points or a vector with individual sizes. |
CexSpecies |
Size for the symbols and labels of the species. Can be a single common size for all the points or a vector with individual sizes. |
CexVar |
Size for the symbols and labels of the variables. Can be a single common size for all the points or a vector with individual sizes. |
ColorSites |
Color for the symbols and labels of the sites. Can be a single common color for all the points or a vector with individual colors. |
ColorSpecies |
Color for the symbols and labels of the species. Can be a single common color for all the points or a vector with individual colors. |
ColorVar |
Color for the symbols and labels of the variables. Can be a single common color for all the points or a vector with individual colors. |
PchSites |
Symbol for the sites points. See |
PchSpecies |
Symbol for the species points. See |
PchVar |
Symbol for the variables points. See |
SizeQualSites |
Should the size of the site points be related to their qualities of representation (predictiveness)? |
SizeQualSpecies |
Should the size of the species points be related to their qualities of representation (predictiveness)? |
SizeQualVars |
Should the size of the variables points be related to their qualities of representation (predictiveness)? |
ColorQualSites |
Should the color of the sites points be related to their qualities of representation (predictiveness)? |
ColorQualSpecies |
Should the color of the species points be related to their qualities of representation (predictiveness)? |
ColorQualVars |
Should the color of the variables points be related to their qualities of representation (predictiveness)? |
SmartLabels |
Plot the labels in a smart way |
... |
Additional graphical parameters. |
Details
The plotting procedure is similar to the one used for continuous biplots including the calibration of the environmental variables.
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
References
CCA
See Also
Examples
# No examples yet
Plot of a Canonical Variate Analysis
Description
Plot of a Canonical Variate Analysis
Usage
## S3 method for class 'CVA'
plot(x, A1 = 1, A2 = 2, ...)
Arguments
x |
Object of class CVA |
A1 |
Dimension for the first axis of the representation |
A2 |
Dimension for the second axis of the representation |
... |
Additional arguments |
Details
Plot of a Canonical Variate Analysis
Value
The canonical variate plot
Author(s)
Jose Luis Vicente Villardon
Plots a Canonical Biplot
Description
Plots a Canonical Biplot
Usage
## S3 method for class 'Canonical.Biplot'
plot(x, A1 = 1, A2 = 2, ScaleGraph = TRUE, PlotGroups =
TRUE, PlotVars = TRUE, PlotInd = TRUE, WhatInds =
NULL, WhatVars = NULL, WhatGroups = NULL, IndLabels =
NULL, VarLabels = NULL, GroupLabels = NULL,
AbbreviateLabels = FALSE, LabelInd = TRUE, LabelVars =
TRUE, CexGroup = 1, PchGroup = 16, margin = 0.1,
AddLegend = FALSE, ShowAxes = FALSE, LabelAxes =
FALSE, LabelGroups = TRUE, PlotCircle = TRUE,
ConvexHulls = FALSE, TypeCircle = "M", ColorGroups =
NULL, ColorVars = NULL, LegendPos = "topright",
ColorInd = NULL, voronoi = TRUE, mode = "a", TypeScale
= "Complete", ValuesScale = "Original", MinQualityVars
= 0, dpg = 0, dpi = 0, dp = 0, PredPoints = 0,
PlotAxis = FALSE, CexInd = NULL, CexVar = NULL, PchInd
= NULL, PchVar = NULL, ColorVar = NULL, ShowAxis =
FALSE, VoronoiColor = "black", ShowBox = FALSE,
ShowTitle = TRUE, PlotClus = FALSE, TypeClus = "ch",
ClustConf = 1, ClustCenters = FALSE, UseClusterColors
= TRUE, CexClustCenters = 1, ...)
Arguments
x |
An object of class "Canonical.Biplot" |
A1 |
Dimension for the first axis. 1 is the default. |
A2 |
Dimension for the second axis. 2 is the default. |
ScaleGraph |
Rescale the coordinates to optimal matching. |
PlotGroups |
Should the group centers be plotted? |
PlotVars |
Should the variables be plotted? |
PlotInd |
Should the individuals be plotted? |
WhatInds |
Logical vector to control what individuals (Rows) are plotted. (Can be also a binary vector) |
WhatVars |
Logical vector to control what variables (Columns) are plotted. (Can be also a binary vector) |
WhatGroups |
Logical vector to control what groups are plotted. (Can be also a binary vector) |
IndLabels |
A set of labels for the individuals. If NULL the default object labels are used |
VarLabels |
A set of labels for the variables. If NULL the default object labels are used |
GroupLabels |
A set of labels for the groups. If NULL the default object labels are used |
AbbreviateLabels |
Should labels be abbreviated? |
LabelInd |
Should the individuals be labeled? |
LabelVars |
Should the variables be labeled? |
CexGroup |
Sizes of the points for the groups |
PchGroup |
Markers for the group |
margin |
margin for the graph |
AddLegend |
Should a legend with the groups be added? |
ShowAxes |
Should outside axes be shown? |
LabelAxes |
Should outside axes be labelled? |
LabelGroups |
Should the groups be labeled? |
PlotCircle |
Should the confidence regions for the groups be plotted? |
ConvexHulls |
Should the convex hulls containing the individuals for each group be plotted? |
TypeCircle |
Type of confidence region: Univariate (U), Bonferroni(B), Multivariate (M) or Classical (C) |
ColorGroups |
User colors for the groups. Default colors will be used if NULL. |
ColorVars |
User colors for the variables. Default colors will be used if NULL. |
LegendPos |
Position of the legend. |
ColorInd |
User colors for the individuals. Default colors will be used if NULL. |
voronoi |
Should the Voronoi diagram with the prediction regions for each group be plotted? |
mode |
Mode of the biplot: "p", "a", "b", "h", "ah" and "s". |
TypeScale |
Type of scale to use : "Complete", "StdDev" or "BoxPlot" |
ValuesScale |
Values to show on the scale: "Original" or "Transformed" |
MinQualityVars |
Minimum quality of representation for a variable to be plotted |
dpg |
A set of indices with the variables that will show the projections of the groups |
dpi |
A set of indices with the individuals that will show the projections on the variables |
dp |
A set of indices with the variables that will show the projections of the individuals |
PredPoints |
A vector with integers. The group centers listed in the vector are projected onto all the variables. |
PlotAxis |
Not Used |
CexInd |
Size of the points for individuals. |
CexVar |
Size of the points for variables. |
PchInd |
Markers of the points for individuals. |
PchVar |
Markers of the points for variables. |
ColorVar |
Colors of the points for variables. |
ShowAxis |
Should axis scales be shown? |
VoronoiColor |
Color for the Voronoi diagram |
ShowBox |
Should a box around the points be plotted? |
ShowTitle |
Should the title be shown? |
PlotClus |
Should the clusters be plotted? |
TypeClus |
Type of plot for the clusters. ("ch"- Convex Hull, "el"- Ellipse or "st"- Star) |
ClustConf |
Percent of points included in the cluster. Only the ClustConf percent of the points nearest to the center will be used to calculate the cluster |
ClustCenters |
Should the cluster centers be plotted? |
UseClusterColors |
Should the cluster colors be used in the plot |
CexClustCenters |
Size of the cluster centres |
... |
Any other graphical parameters |
Details
The function plots the results of a Canonical Biplot. The coordinates for Groups, Individuals and Variables can be shown or not on the plot, and each of the three can also be labeled separately. There are parameters to control the way each set of coordinates is plotted and labeled.
There are several modes for plotting the biplot.
"p".- Points (Rows and Columns are represented by points)
"a" .- Arrows (The traditional representation with points for rows and arrows for columns)
"b" .- The arrows for the columns are extended to both extremes of the plot and labeled outside the plot area.
"h" .- The arrows for the columns are extended to the positive extreme of the plot and labeled outside the plot area.
"ah" .- Same as arrows but labeled outside the plot area.
"s" .- The directions (or biplot axes) have a graded scale for prediction of the original values.
The TypeScale argument applies only to the "s" mode. There are three types:
"Complete" .- An equally spaced scale covering the whole range of the data is calculated.
"StdDev" .- Mean with one, two and three standard deviations
"BoxPlot" .- Box-Plot like Scale (Median, 25 and 75 percentiles, maximum and minimum values.)
The ValuesScale argument applies only to the "s" mode and controls whether the labels show the Original or Transformed values.
Some of the initial transformations are not compatible with some of the types of biplots and scales. For example, it is not possible to recover the original values by projection when you double centre the data. In that case you have the residuals for interaction and only the transformed values make sense.
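As an illustration of the three TypeScale options, the following base R sketch computes candidate tick values for a single variable. The helper scale_ticks and its nticks argument are hypothetical names introduced only for this example; the package may compute its scales differently.

```r
# Hypothetical helper: candidate tick values for one variable under the
# three scale types described above. This is not the package's internal code.
scale_ticks <- function(x, type = c("Complete", "StdDev", "BoxPlot"), nticks = 5) {
  type <- match.arg(type)
  switch(type,
    # Equally spaced values covering the whole range of the data
    Complete = seq(min(x), max(x), length.out = nticks),
    # Mean plus and minus one, two and three standard deviations
    StdDev = mean(x) + (-3:3) * sd(x),
    # Minimum, 25th percentile, median, 75th percentile and maximum
    BoxPlot = as.numeric(quantile(x, c(0, 0.25, 0.5, 0.75, 1)))
  )
}

x <- c(2, 4, 4, 5, 7, 9, 10, 12)
ticks_complete <- scale_ticks(x, "Complete")
ticks_sd <- scale_ticks(x, "StdDev")
ticks_box <- scale_ticks(x, "BoxPlot")
```

On a biplot axis these values would be placed along the direction of the variable, in the original units when ValuesScale is "Original".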
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
References
Amaro, I. R., Vicente-Villardon, J. L., & Galindo-Villardon, M. P. (2004). Manova Biplot para arreglos de tratamientos con dos factores basado en modelos lineales generales multivariantes. Interciencia, 29(1), 26-32.
Varas, M. J., Vicente-Tavera, S., Molina, E., & Vicente-Villardon, J. L. (2005). Role of canonical biplot method in the study of building stones: an example from Spanish monumental heritage. Environmetrics, 16(4), 405-419.
Santana, M. A., Romay, G., Matehus, J., Villardon, J. L., & Demey, J. R. (2009). A simple and low-cost strategy for micropropagation of cassava (Manihot esculenta Crantz). African Journal of Biotechnology, 8(16).
Examples
data(wine)
X=wine[,4:21]
canbip=CanonicalBiplot(X, group=wine$Group)
plot(canbip, TypeCircle="U")
Plots a Canonical Distance Analysis
Description
Plots a Canonical Distance Analysis
Usage
## S3 method for class 'CanonicalDistanceAnalysis'
plot(x, A1 = 1, A2 = 2, ScaleGraph = TRUE,
ShowAxis = FALSE, ShowAxes = FALSE, LabelAxes = TRUE, margin = 0.1,
PlotAxis = FALSE, ShowBox = TRUE, PlotGroups = TRUE, LabelGroups = TRUE,
CexGroup = 1.5, PchGroup = 16, ColorGroup = NULL, voronoi = TRUE,
VoronoiColor = "black", PlotInd = TRUE, LabelInd = TRUE, CexInd = 0.8,
PchInd = 3, ColorInd = NULL, WhatInds = NULL, IndLabels = NULL,
PlotVars = TRUE, LabelVar = TRUE, CexVar = NULL, PchVar = NULL,
ColorVar = NULL, WhatVars = NULL, VarLabels = NULL, mode = "a",
TypeScale = "Complete", ValuesScale = "Original", SmartLabels = TRUE,
AddLegend = TRUE, LegendPos = "topright", PlotCircle = TRUE,
ConvexHulls = FALSE, TypeCircle = "M", MinQualityVars = 0, dpg = 0,
dpi = 0, PredPoints = 0, PlotClus = TRUE, TypeClus = "ch", ClustConf = 1,
CexClustCenters = 1, ClustCenters = FALSE, UseClusterColors = TRUE, ...)
Arguments
x |
An object of class "CanonicalDistanceAnalysis" |
A1 |
Dimension for the first axis. 1 is the default. |
A2 |
Dimension for the second axis. 2 is the default. |
ScaleGraph |
Rescale the coordinates for optimal matching. |
ShowAxis |
Should the axis be shown? |
ShowAxes |
Not used |
LabelAxes |
Should the axis be labelled? |
margin |
Margin of the plot |
PlotAxis |
Should the axis be plotted? |
ShowBox |
Show a box around the plot |
PlotGroups |
Should the groups be plotted? |
LabelGroups |
Should the groups be labelled? |
CexGroup |
Sizes for the groups |
PchGroup |
Marks for the groups |
ColorGroup |
Colors for the groups |
voronoi |
Should a voronoi diagram separating the groups be plotted? |
VoronoiColor |
Color for the voronoi diagram |
PlotInd |
Should the individuals be plotted? |
LabelInd |
Should the individuals be labelled? |
CexInd |
Sizes for the individuals |
PchInd |
Marks for the individuals |
ColorInd |
Colors for the individuals |
WhatInds |
What individuals are plotted |
IndLabels |
Labels for the individuals |
PlotVars |
Should the variables be plotted? |
LabelVar |
Should the variables be labelled? |
CexVar |
Sizes for the variables |
PchVar |
Marks for the variables |
ColorVar |
User colors for the variables. Default colors will be used if NULL. |
WhatVars |
What Variables are plotted |
VarLabels |
User labels for the variables |
mode |
Mode of the biplot: "p", "a", "b", "h", "ah" and "s". |
TypeScale |
Type of scale to use : "Complete", "StdDev" or "BoxPlot" |
ValuesScale |
Values to show on the scale: "Original" or "Transformed" |
SmartLabels |
Plot the labels in a smart way |
AddLegend |
Should a legend be added? |
LegendPos |
Position of the legend |
PlotCircle |
Should the confidence regions for the groups be plotted? |
ConvexHulls |
Should the convex hulls containing the individuals for each group be plotted? |
TypeCircle |
Type of confidence region: Univariate (U), Bonferroni(B), Multivariate (M) or Classical (C) |
MinQualityVars |
Minimum quality of representation for a variable to be plotted |
dpg |
A set of indices with the variables that will show the projections of the groups |
dpi |
A set of indices with the individuals that will show the projections on the variables |
PredPoints |
A vector with integers. The group centers listed in the vector are projected onto all the variables. |
PlotClus |
Should the clusters be plotted? |
TypeClus |
Type of plot for the clusters. ("ch"- Convex Hull, "el"- Ellipse or "st"- Star) |
ClustConf |
Percent of points included in the cluster. Only the ClustConf percent of the points nearest to the center will be used to calculate the cluster |
CexClustCenters |
Size of the cluster centers. |
ClustCenters |
Should the cluster centers be plotted? |
UseClusterColors |
Should the cluster colors be used in the plot |
... |
Any other graphical parameters |
Details
Plots a Canonical Distance Analysis
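To show what the regions drawn by the voronoi argument mean, here is a minimal base R sketch under the assumption of Euclidean distances in the plotted plane: each Voronoi cell contains the points that are closer to one group centre than to any other. The centres, the new points and the helper predict_group are hypothetical; the package draws the diagram itself.

```r
# Hypothetical coordinates for three group centres (one row per group).
centres <- rbind(A = c(-2, 0), B = c(2, 0), C = c(0, 3))

# Assign each point to the group whose centre is nearest (Euclidean distance).
# The set of points assigned to one centre is that centre's Voronoi cell.
predict_group <- function(points, centres) {
  points <- matrix(points, ncol = 2)
  d2 <- sapply(seq_len(nrow(centres)), function(g) {
    rowSums(sweep(points, 2, centres[g, ])^2)   # squared distances to centre g
  })
  d2 <- matrix(d2, nrow = nrow(points))          # keep matrix shape for one point
  rownames(centres)[max.col(-d2, ties.method = "first")]
}

new_points <- rbind(c(-1.5, 0.2), c(1.0, -1.0), c(0.1, 2.5))
predicted <- predict_group(new_points, centres)
```

Here the three points fall in the cells of groups A, B and C respectively.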
Value
The plot of a Canonical Distance Analysis
Author(s)
Jose Luis Vicente Villardon
References
Gower, J. C. and Krzanowski, W. J. (1999). Analysis of distance for structured multivariate data and extensions to multivariate analysis of variance. Journal of the Royal Statistical Society: Series C (Applied Statistics), 48(4):505-519.
Examples
# Not yet
Plots a biplot for continuous data.
Description
Plots a biplot for continuous data.
Usage
## S3 method for class 'ContinuousBiplot'
plot(x, A1 = 1, A2 = 2, ShowAxis = FALSE, margin = 0,
PlotVars = TRUE, PlotInd = TRUE, WhatInds = NULL,
WhatVars = NULL, LabelVars = TRUE, LabelInd = TRUE,
IndLabels = NULL, VarLabels = NULL, mode = "a", CexInd
= NULL, CexVar = NULL, ColorInd = NULL, ColorVar =
NULL, LabelPos = 1, SmartLabels = FALSE,
AbbreviateLabels = FALSE, MinQualityInds = 0,
MinQualityVars = 0, dp = 0, PredPoints = 0, PlotAxis =
FALSE, TypeScale = "Complete", ValuesScale =
"Original", SizeQualInd = FALSE, SizeQualVars = FALSE,
ColorQualInd = FALSE, ColorQualVars = FALSE, PchInd =
NULL, PchVar = NULL, PlotClus = FALSE, TypeClus =
"ch", ClustConf = 1, ClustLegend = FALSE,
ClustLegendPos = "topright", ClustCenters = FALSE,
UseClusterColors = TRUE, CexClustCenters = 1,
PlotSupVars = TRUE, SupMode = "a", ShowBox = FALSE,
nticks = 5, NonSelectedGray = FALSE, PlotUnitCircle =
TRUE, PlotContribFA = TRUE, AddArrow = FALSE,
ColorSupContVars = "red", ColorSupBinVars = "red",
ColorSupOrdVars = "red", ModeSupContVars="a",
ModeSupBinVars="a", ModeSupOrdVars="a",
WhatSupBinVars = NULL, Title = NULL, Xlab = NULL,
Ylab = NULL, add = FALSE, PlotTrajVars = FALSE,
PlotTrajInds = FALSE, LabelTraj = "end", Limits = NULL,
PlotSupInds = FALSE, WhatSupInds = NULL,
ColorSupInd = "black", CexSupInd = 0.8, PchSupInd =
16, LabelSupInd = TRUE, PredSupPoints = 0, CexScale =
0.5, ...)
Arguments
x |
An object of class "ContinuousBiplot" |
A1 |
Dimension for the first axis. 1 is the default. |
A2 |
Dimension for the second axis. 2 is the default. |
ShowAxis |
Logical variable to control if the coordinate axes should appear in the plot. The default value is FALSE because for most of the biplots its presence is irrelevant. |
margin |
Margin for the labels in some of the biplot modes (percentage of the plot width). Default is 0. Increase the value if the labels are not completely plotted. |
PlotVars |
Logical to control if the Variables (Columns) are plotted. |
PlotInd |
Logical to control if the Individuals (Rows) are plotted. |
WhatInds |
Logical vector to control what individuals (Rows) are plotted. (Can also be a binary vector) |
WhatVars |
Logical vector to control what variables (Columns) are plotted. (Can also be a binary vector) |
LabelVars |
Logical to control if the labels for the Variables are shown |
LabelInd |
Logical to control if the labels for the individuals are shown |
IndLabels |
A set of labels for the individuals. If NULL the default object labels are used |
VarLabels |
A set of labels for the variables. If NULL the default object labels are used |
mode |
Mode of the biplot: "p", "a", "b", "h", "ah" and "s". |
CexInd |
Size for the symbols and labels of the individuals. Can be a single common size for all the points or a vector with individual sizes. |
CexVar |
Size for the symbols and labels of the variables. Can be a single common size for all the points or a vector with individual sizes. |
ColorInd |
Color for the symbols and labels of the individuals. Can be a single common color for all the points or a vector with individual colors. |
ColorVar |
Color for the symbols and labels of the variables. Can be a single common color for all the points or a vector with individual colors. |
LabelPos |
Position of the labels in relation to the point. (See the graphical parameter
SmartLabels |
Plot the labels in a smart way |
AbbreviateLabels |
Should labels be abbreviated? |
MinQualityInds |
Minimum quality of representation for an individual to be plotted. |
MinQualityVars |
Minimum quality of representation for a variable to be plotted. |
dp |
A set of indices with the variables that will show the projections of the individuals. |
PredPoints |
A vector with integers. The row points listed in the vector are projected onto all the variables. |
PlotAxis |
Not Used |
TypeScale |
Type of scale to use : "Complete", "StdDev" or "BoxPlot" |
ValuesScale |
Values to show on the scale: "Original" or "Transformed" |
SizeQualInd |
Should the size of the row points be related to their qualities of representation (predictiveness)? |
SizeQualVars |
Should the size of the column points be related to their qualities of representation (predictiveness)? |
ColorQualInd |
Should the color of the row points be related to their qualities of representation (predictiveness)? |
ColorQualVars |
Should the color of the column points be related to their qualities of representation (predictiveness)? |
PchInd |
Symbol for the row points. See |
PchVar |
Symbol for the column points. See |
PlotClus |
Should the clusters be plotted? |
TypeClus |
Type of plot for the clusters. ("ch"- Convex Hull, "el"- Ellipse or "st"- Star) |
ClustConf |
Percent of points included in the cluster. Only the ClustConf percent of the points nearest to the center will be used to calculate the cluster |
ClustLegend |
Should a legend for the clusters be plotted? Default FALSE |
ClustLegendPos |
Position of the legend for the clusters. Default "topright" |
ClustCenters |
Should the cluster centers be plotted |
UseClusterColors |
Should the cluster colors be used in the plot |
CexClustCenters |
Size of the cluster centres |
PlotSupVars |
Should the supplementary variables be plotted? |
SupMode |
Mode of the supplementary variables. |
ShowBox |
Should a box around the points be plotted? |
nticks |
Number of ticks for the representation of the variables |
NonSelectedGray |
The non-selected individuals and variables are plotted in light gray |
PlotUnitCircle |
Plot the unit circle in the biplot for a Factor Analysis, in which the length of the column arrows is smaller than 1 and is the quality of representation. |
PlotContribFA |
Plot circles in the biplot for a Factor Analysis with different values of the quality of representation. |
AddArrow |
Add an arrow to the representation of other modes of the biplot. |
ColorSupContVars |
Colors for the continuous supplementary variables. |
ColorSupBinVars |
Colors for the binary supplementary variables. |
ColorSupOrdVars |
Colors for the ordinal supplementary variables. |
ModeSupContVars |
Mode for the continuous supplementary variables. |
ModeSupBinVars |
Mode for the binary supplementary variables. |
ModeSupOrdVars |
Mode for the ordinal supplementary variables. |
WhatSupBinVars |
What supplementary binary variables should be plotted? |
Title |
Title of the plot. |
Xlab |
Label for the X axis |
Ylab |
Label for the Y axis |
add |
Should the plot be added to an existing plot? |
PlotTrajVars |
Plot trajectories for the variables (when appropriate)? |
PlotTrajInds |
Plot trajectories for the individuals (when appropriate)? |
LabelTraj |
Label trajectories for the variables (when appropriate)? |
Limits |
Limits of the axis for the plot |
PlotSupInds |
Should the supplementary individuals be plotted? |
WhatSupInds |
What supplementary individuals are going to be plotted |
ColorSupInd |
Colors for the supplementary individuals |
CexSupInd |
Sizes for the supplementary individuals |
PchSupInd |
Symbols for the supplementary individuals |
LabelSupInd |
Labels for the supplementary individuals |
PredSupPoints |
Predictions for the supplementary individuals |
CexScale |
Sizes of the scales |
... |
Any other graphical parameters. |
Details
Plots a biplot for continuous data. The biplot for continuous data is taken as the basis of the plot. If there is a mixture of different types of variables (binary, nominal, abundance, ...), they are added to the biplot as supplementary parts.
There are several modes for plotting the biplot.
"p".- Points (Rows and Columns are represented by points)
"a" .- Arrows (The traditional representation with points for rows and arrows for columns)
"b" .- The arrows for the columns are extended to both extremes of the plot and labeled outside the plot area.
"h" .- The arrows for the columns are extended to the positive extreme of the plot and labeled outside the plot area.
"ah" .- Same as arrows but labeled outside the plot area.
"s" .- The directions (or biplot axes) have a graded scale for prediction of the original values.
The TypeScale argument applies only to the "s" mode. There are three types:
"Complete" .- An equally spaced scale covering the whole range of the data is calculated.
"StdDev" .- Mean with one, two and three standard deviations
"BoxPlot" .- Box-Plot like Scale (Median, 25 and 75 percentiles, maximum and minimum values.)
The ValuesScale argument applies only to the "s" mode and controls whether the labels show the Original or Transformed values.
Some of the initial transformations are not compatible with some of the types of biplots and scales. For example, it is not possible to recover the original values by projection when you double centre the data. In that case you have the residuals for interaction and only the transformed values make sense.
It is possible to associate the color and the size of the points with the quality of representation. Bigger points correspond to better representation quality.
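The link between quality of representation and point size or color can be sketched in base R as follows. This assumes that the quality of a row in the plotted plane is the share of its squared distance to the origin reproduced by the two dimensions (a squared cosine); the package may define or rescale it differently, and the size and gray-level mappings are arbitrary choices for this example.

```r
# PCA of standardized columns through the singular value decomposition.
X <- scale(as.matrix(USArrests))
dec <- svd(X)
row_coords <- dec$u %*% diag(dec$d)      # row coordinates on all dimensions

# Assumed definition: quality of a row in the plane of dimensions 1 and 2 is
# the share of its squared distance to the origin reproduced by that plane.
quality <- rowSums(row_coords[, 1:2]^2) / rowSums(row_coords^2)

# Illustrative mappings: bigger and darker points for better represented rows.
cex_points <- 0.5 + 1.5 * quality
col_points <- gray(0.8 * (1 - quality))
```

The vectors cex_points and col_points could then be passed as cex and col to plot(row_coords[, 1:2], pch = 16, asp = 1).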
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
References
Gabriel, K. R. (1971). The biplot graphic display of matrices with application to principal component analysis. Biometrika, 58(3), 453-467.
Galindo Villardon, M. (1986). Una alternativa de representacion simultanea: HJ-Biplot. Questiio. 1986, vol. 10, num. 1.
Vicente-Villardon, J. L., Galindo Villardon, M. P., & Blazquez Zaballos, A. (2006). Logistic biplots. Multiple correspondence analysis and related methods. London: Chapman & Hall, 503-521.
Gower, J. C., & Hand, D. J. (1995). Biplots (Vol. 54). CRC Press.
Gower, J. C., Lubbe, S. G., & Le Roux, N. J. (2011). Understanding biplots. John Wiley & Sons.
Blasius, J., Eilers, P. H., & Gower, J. (2009). Better biplots. Computational Statistics & Data Analysis, 53(8), 3145-3158.
Examples
data(Protein)
bip=PCA.Biplot(Protein[,3:11])
plot(bip, mode="s", margin=0.2, ShowAxis=FALSE)
Plots an External Logistic Biplot for binary data
Description
Plot of an External Binary Logistic Biplot with many arguments controlling different aspects of the representation
Usage
## S3 method for class 'External.Binary.Logistic.Biplot'
plot(x, F1 = 1, F2 = 2,
ShowAxis = FALSE, margin = 0.1,
PlotVars = TRUE, PlotInd = TRUE, WhatRows = NULL,
WhatCols = NULL, LabelRows = TRUE, LabelCols = TRUE,
RowLabels = NULL, ColLabels = NULL, RowColors = NULL,
ColColors = NULL, Mode = "s", TickLength = 0.01,
RowCex = 0.8, ColCex = 0.8, SmartLabels = FALSE,
MinQualityRows = 0, MinQualityCols = 0, dp = 0,
PredPoints = 0, SizeQualRows = FALSE, ShowBox = FALSE,
SizeQualCols = FALSE, ColorQualRows = FALSE,
ColorQualCols = FALSE, PchRows = NULL, PchCols = NULL,
PlotClus = FALSE, TypeClus = "ch", ClustConf = 1,
Significant = FALSE, alpha = 0.05, Bonferroni = FALSE,
PlotSupVars = TRUE, ...)
Arguments
x |
An object of type |
F1 |
Latent factor to represent at the X axis |
F2 |
Latent factor to represent at the Y axis |
ShowAxis |
Should the axis be plotted? |
margin |
Margin for the labels in some of the biplot modes (percentage of the plot width). Default is 0.1. Increase the value if the labels are not completely plotted. |
PlotVars |
Should Variables be plotted |
PlotInd |
Should Individuals be plotted |
WhatRows |
A binary vector (0 and 1) that indicates if each individual row should be plotted or not |
WhatCols |
A binary vector (0 and 1) that indicates if each individual column should be plotted or not |
LabelRows |
Should the rows (individuals) be labelled |
LabelCols |
Should the columns (variables) be labelled |
RowLabels |
A vector of Labels for the rows if you do not want to use the data labels |
ColLabels |
A vector of Labels for the columns if you do not want to use the data labels |
RowColors |
A vector of colors for the rows |
ColColors |
A vector of colors for the columns |
Mode |
Mode of the biplot: "p", "a", "b", "ah" and "s". See details. |
TickLength |
Length of the tick marks. Depends on the scale of the graph. |
RowCex |
A scalar or a vector containing the sizes of the points and labels for the rows. Default value is 0.8 if the sizes are not provided. |
ColCex |
A scalar or a vector containing the sizes of the points and labels for the columns. Default value is 0.8 if the sizes are not provided. |
SmartLabels |
Plot the labels in a smart way |
MinQualityRows |
Minimum quality of representation for a row or individual to be plotted |
MinQualityCols |
Minimum quality of representation for a column or variable to be plotted |
dp |
"Drop Points" on the variables, a vector with integers. The row points are projected on the directions of the variables listed in the vector. |
PredPoints |
A vector with integers. The row points listed in the vector are projected onto all the variables. |
SizeQualRows |
Should the size of the row points be related to their qualities of representation (predictiveness)? |
ShowBox |
Should a box around the points be displayed? |
SizeQualCols |
Should the size of the column points be related to their qualities of representation (predictiveness)? |
ColorQualRows |
Should the color of the row points be related to their qualities of representation (predictiveness)? |
ColorQualCols |
Should the color of the column points be related to their qualities of representation (predictiveness)? |
PchRows |
Symbol for the row points. See |
PchCols |
Symbol for the column points. See |
PlotClus |
Should the clusters be plotted? |
TypeClus |
Type of plot for the clusters. ("ch"- Convex Hull, "el"- Ellipse or "st"- Star) |
ClustConf |
Percent of points included in the cluster. Only the ClustConf percent of the points nearest to the center will be used to calculate the cluster |
Significant |
If TRUE, only the significant variables are plotted |
alpha |
Significance Level |
Bonferroni |
Should the Bonferroni correction be used |
PlotSupVars |
Should supplementary variables be plotted |
... |
Any other graphical parameter you want to use |
Details
The logistic regression equation predicts the probability that a character will be present in an individual. Geometrically, the y's can be represented as points in the reduced dimension space and the b's are the vectors showing the directions that best predict the probability of presence of each allele. For a complete explanation of the geometrical properties of the ELB see Vicente-Villardón et al. (2006). The prediction of the probabilities is made in the same way as in a linear biplot, i.e., the projection of a genotype point onto the direction of a variable vector predicts the probability of presence of that variable in the individual. To facilitate the interpretation of the graph, points with fixed prediction probabilities are situated on each allele vector. To simplify the graph, in our application a vector joining the points for 0.5 and 0.75 is placed; this shows the cut point for prediction of presence and the direction of increasing probabilities. The length of the vector can be interpreted as an inverse measure of the discriminatory power of the alleles or bands, in the sense that shorter vectors correspond to alleles that better differentiate individuals. Two alleles pointing in the same direction are highly correlated, two alleles pointing in opposite directions are negatively correlated, and two alleles forming an angle close to 90° are not correlated. A more complete scale with probabilities from 0.1 to 0.9 can also be plotted with this function. For each variable, the ordination diagram can be divided into two separate regions predicting presence or absence; the two regions are separated by the line that is perpendicular to the variable vector in the biplot and cuts the vector at the point predicting 0.5. The variables associated with the configuration are those that predict the presences adequately. In a practical situation not all the variables are associated with the ordination.
Due to the high number of variables usually studied, it is convenient to situate on the graph only those that are related to the configuration, i.e., those that have an adequate goodness of fit after adjusting the logistic regression.
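The position of the 0.5 and 0.75 marks on a variable vector can be worked out from the logistic regression coefficients. The sketch below uses base R only, with hypothetical coefficients b0 and b for one binary variable and a hypothetical helper prob_mark; it illustrates the geometry described above and is not taken from the package code.

```r
# Hypothetical fitted coefficients for one binary variable:
# logit(p) = b0 + b[1] * x + b[2] * y, with (x, y) the coordinates of an individual.
b0 <- -0.4
b <- c(1.2, 0.9)

# Point on the direction of b whose predicted probability equals p.
prob_mark <- function(p, b0, b) {
  ((qlogis(p) - b0) / sum(b^2)) * b
}

mark_50 <- prob_mark(0.50, b0, b)   # cut point between absence and presence
mark_75 <- prob_mark(0.75, b0, b)   # end of the plotted segment

# Length of the segment joining both marks: log(3) divided by the norm of b,
# so steeper (more discriminating) variables get shorter segments.
segment_length <- sqrt(sum((mark_75 - mark_50)^2))
```

Any individual whose projection onto the direction of b falls beyond mark_50 has a predicted probability above 0.5, which is the presence region described in the text.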
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
References
Demey, J., Vicente-Villardon, J. L., Galindo, M.P. AND Zambrano, A. (2008) Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics, 24(24): 2832-2838.
Vicente-Villardon, J. L., Galindo, M. P. and Blazquez, A. (2006) Logistic Biplots. In Multiple Correspondence Analysis And Related Methods. Greenacre, M & Blasius, J, Eds, Chapman and Hall, Boca Raton.
Examples
data(spiders)
dist=BinaryProximities(spiders)
pco=PrincipalCoordinates(dist)
pcobip=ExternalBinaryLogisticBiplot(pco)
plot(pcobip, Mode="s")
pcobip=AddCluster2Biplot(pcobip, NGroups=3, ClusterType="hi")
op <- par(mfrow=c(1,2))
plot(pcobip, Mode="s", PlotClus = TRUE)
plot(pcobip$Dendrogram)
par(op)
Plot the results of Model-Based Gaussian Clustering algorithms
Description
Plots an object of type MGC (Model-based Gaussian Clustering)
Usage
## S3 method for class 'MGC'
plot(x, vars = NULL, groups = x$Classification, CexPoints = 0.2, Confidence = 0.95, ...)
Arguments
x |
An object of type MGC |
vars |
A subset of indices of the variables to be plotted |
groups |
A factor containing groups to represent. Usually the clusters obtained from the algorithm. |
CexPoints |
Size of the points. |
Confidence |
Confidence of the ellipses |
... |
Any additional graphical parameters |
Details
Plots an object of type MGC (Model-based Gaussian Clustering) using a splom plot.
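For reference, an ellipse of a given confidence for one pair of variables can be built in base R as in the sketch below. It assumes the usual chi-squared construction for a bivariate normal distribution; the helper confidence_ellipse is hypothetical and the package may build its ellipses in another way.

```r
# Points on the ellipse that contains a proportion `confidence` of a bivariate
# normal distribution with the given centre and covariance matrix.
confidence_ellipse <- function(centre, S, confidence = 0.95, n = 100) {
  angles <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(angles), sin(angles))
  radius <- sqrt(qchisq(confidence, df = 2))
  # chol(S) is the upper triangular factor R with t(R) %*% R equal to S
  sweep(radius * circle %*% chol(S), 2, centre, "+")
}

# Example with two iris variables for one group
setosa <- iris[iris$Species == "setosa", c("Sepal.Length", "Sepal.Width")]
centre <- colMeans(setosa)
S <- cov(setosa)
ell <- confidence_ellipse(centre, S, confidence = 0.95)
```

The resulting matrix can be added to a scatterplot of the two variables with lines(ell).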
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
Examples
data(iris)
Plots an ordinal Logistic Biplot
Description
Plots an ordinal Logistic Biplot
Usage
## S3 method for class 'Ordinal.Logistic.Biplot'
plot(x, A1 = 1, A2 = 2,
ShowAxis = FALSE, margin = 0, PlotVars = TRUE, PlotInd = TRUE,
LabelVars = TRUE, LabelInd = TRUE, mode = "a", CexInd = NULL,
CexVar = NULL, ColorInd = NULL, ColorVar = NULL, SmartLabels = TRUE,
MinQualityVars = 0, dp = 0, PredPoints = 0, PlotAxis = FALSE,
TypeScale = "Complete", ValuesScale = "Original",
SizeQualInd = FALSE, SizeQualVars = FALSE, ColorQualInd = FALSE,
ColorQualVars = FALSE, PchInd = NULL, PchVar = NULL,
PlotClus = FALSE, TypeClus = "ch", ClustConf = 1,
ClustCenters = FALSE, UseClusterColors = TRUE, ClustLegend = TRUE,
ClustLegendPos = "topright", TextVarPos = 1, PlotSupVars = FALSE,...)
Arguments
x |
An object of type "Ordinal.Logistic.Biplot" |
A1 |
First dimension to plot |
A2 |
Second dimension to plot |
ShowAxis |
Should the axis be shown |
margin |
Margin for the graph (in order to have space for the variable levels) |
PlotVars |
Should the variables be plotted? |
PlotInd |
Should the individuals be plotted? |
LabelVars |
Should the variables be labelled? |
LabelInd |
Should the individuals be labelled? |
mode |
Mode of the biplot (see the classical biplot) |
CexInd |
Size of the markers used for the individuals |
CexVar |
Size of the markers used for the variables |
ColorInd |
Colors used for the individuals |
ColorVar |
Colors used for the variables |
SmartLabels |
Should smart placement for the labels be used? |
MinQualityVars |
Minimum quality of representation for a variable to be displayed |
dp |
Set of variables in which the individuals are projected |
PredPoints |
Set of points that will be projected on all the variables |
PlotAxis |
Should the axis be plotted? |
TypeScale |
See continuous biplots |
ValuesScale |
See continuous biplots |
SizeQualInd |
Should the size of the labels and points be related to the quality of representation for individuals? |
SizeQualVars |
Should the size of the labels and points be related to the quality of representation for variables? |
ColorQualInd |
Should the intensity of the color of the labels and points be related to the quality of representation for individuals? |
ColorQualVars |
Should the intensity of the color of the labels and points be related to the quality of representation for variables? |
PchInd |
Markers for the individuals |
PchVar |
Markers for the variables |
PlotClus |
Should the added clusters for the individuals be plotted? |
TypeClus |
Type of plot for the clusters. The types are "ch", "el" and "st" for "Convex Hull", "Ellipse" and "Star" respectively. |
ClustConf |
Confidence level for the cluster |
ClustCenters |
Should the centers of the clusters be plotted |
UseClusterColors |
Should the colors of the clusters be used to plot the individuals. |
ClustLegend |
Should a legend for the clusters be added? |
ClustLegendPos |
Position of the legend |
TextVarPos |
Position of the labels for the variables |
PlotSupVars |
Should the supplementary variables be plotted |
... |
Any other additional parameters |
Details
Plots an ordinal Logistic Biplot
Value
The plot ....
Author(s)
Jose Luis Vicente Villardon
References
Vicente-Villardón, J. L., & Sánchez, J. C. H. (2014). Logistic Biplots for Ordinal Data with an Application to Job Satisfaction of Doctorate Degree Holders in Spain. arXiv preprint arXiv:1405.0294.
Examples
data(Doctors)
olb = OrdLogBipEM(Doctors,dim = 2, nnodes = 10, initial=4, tol = 0.001,
maxiter = 100, penalization = 0.1, show=TRUE)
plot(olb, mode="s", ColorInd="gray", ColorVar=1:5)
Plots a Principal Component Analysis
Description
Plots the results of a Principal Component Analysis.
Usage
## S3 method for class 'PCA.Analysis'
plot(x, A1 = 1, A2 = 2, CorrelationCircle = FALSE, ...)
Arguments
x |
The object with the results of a PCA |
A1 |
Dimension for the first axis of the representation |
A2 |
Dimension for the second axis of the representation |
CorrelationCircle |
Should the correlation circle be plotted? If false the scores plot is done. |
... |
Any other arguments of the function plot.ContinuousBiplot |
Details
Plots the results of a Principal Component Analysis. The plot can be the correlation circle containing the correlations of the variables with the components or a plot of the scores of the individuals.
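The coordinates shown in a correlation circle can be reproduced with base R, as in the sketch below. It uses prcomp on standardized variables rather than the package's own PCA object, so it is an assumption about what is displayed, not a description of the plotting method itself.

```r
# PCA on standardized variables
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Correlation circle: correlations of the original variables with the
# scores on the two selected components (A1 = 1 and A2 = 2 here).
A1 <- 1
A2 <- 2
circle_coords <- cor(USArrests, pca$x[, c(A1, A2)])

# Squared distance to the origin: share of each variable's variance
# explained by the two components, which cannot exceed 1.
explained <- rowSums(circle_coords^2)
```

Variables whose points lie close to the unit circle are well represented on the two chosen components.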
Value
The PCA plot.
Author(s)
Jose Luis Vicente Villardon
Examples
# Not yet
Plots the Bootstrap information for Principal Components Analysis (PCA)
Description
Plots an object of class "PCA.Bootstrap"
Usage
## S3 method for class 'PCA.Bootstrap'
plot(x, Eigenvalues = TRUE,
Inertia = FALSE, EigenVectors = TRUE, Structure = TRUE,
Squared = TRUE, Scores = TRUE, ColorInd = "black", TypeScores = "ch", ...)
Arguments
x |
An object of class "PCA.Bootstrap" |
Eigenvalues |
Should the information for the eigenvalues be plotted? |
Inertia |
Should the information for the inertia be plotted? |
EigenVectors |
Should the information for the eigenvectors be plotted? |
Structure |
Should the information for the correlations (variables-dimensions) be plotted? |
Squared |
Should the information for the squared correlations (variables-dimensions) be plotted? |
Scores |
Should the row (individual) scores be plotted? |
ColorInd |
Colors for the rows |
TypeScores |
Type of plot for the scores |
... |
Any other graphical argument |
Details
For each parameter, box-plots and confidence intervals are plotted. The initial estimator and the bootstrap mean are plotted.
For the eigenvectors, loadings and contributions, the graph is divided into as many rows as dimensions; each row contains a plot of the whole set of variables.
The scores are plotted on a two dimensional
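The kind of information summarized in these plots can be sketched with a plain row bootstrap of the eigenvalues in base R. The number of resamples, the percentile intervals and the helper pca_eigenvalues are choices made for this example; PCA.Bootstrap may resample and summarize differently.

```r
set.seed(1)                       # reproducible resampling
X <- as.matrix(USArrests)
B <- 200                          # number of bootstrap samples (kept small here)

# Eigenvalues of the correlation matrix, i.e. PCA on standardized columns
pca_eigenvalues <- function(M) {
  eigen(cor(M), symmetric = TRUE, only.values = TRUE)$values
}

initial <- pca_eigenvalues(X)     # estimate from the original sample

# Resample rows with replacement and recompute the eigenvalues each time
boot <- t(replicate(B, {
  rows <- sample(nrow(X), replace = TRUE)
  pca_eigenvalues(X[rows, , drop = FALSE])
}))

boot_mean <- colMeans(boot)
# Percentile confidence intervals, one column per eigenvalue
ci <- apply(boot, 2, quantile, probs = c(0.025, 0.975))
```

A call such as boxplot(boot) gives one box per eigenvalue, on which the initial estimate and the bootstrap mean can be marked with points().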
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
References
Daudin, J. J., Duby, C., & Trecourt, P. (1988). Stability of principal component analysis studied by the bootstrap method. Statistics: A journal of theoretical and applied statistics, 19(2), 241-258.
Chateau, F., & Lebart, L. (1996). Assessing sample variability in the visualization techniques related to principal component analysis: bootstrap and alternative simulation methods. COMPSTAT, Physica-Verlag, 205-210.
Babamoradi, H., van den Berg, F., & Rinnan, Å. (2013). Bootstrap based confidence limits in principal component analysis: A case study. Chemometrics and Intelligent Laboratory Systems, 120, 97-105.
Fisher, A., Caffo, B., Schwartz, B., & Zipunnikov, V. (2016). Fast, exact bootstrap principal component analysis for p > 1 million. Journal of the American Statistical Association, 111(514), 846-860.
See Also
Examples
X=wine[,4:21]
grupo=wine$Group
rownames(X)=paste(1:45, grupo, sep="-")
pcaboot=PCA.Bootstrap(X, dimens=2, Scaling = "Standardize columns", B=1000)
plot(pcaboot, ColorInd=as.numeric(grupo))
summary(pcaboot)
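The idea behind the box-plots and confidence intervals mentioned in the Details can be sketched without the package. The code below is a hedged base-R illustration of a nonparametric bootstrap of PCA eigenvalues on the iris data; the number of resamples, the percentile intervals and all object names are assumptions, and it does not reproduce what PCA.Bootstrap computes internally.

```r
# Nonparametric bootstrap of PCA eigenvalues (base-R sketch, not PCA.Bootstrap itself).
set.seed(1)
X <- as.matrix(iris[, 1:4])
B <- 200                                   # number of bootstrap samples (assumed value)
eig0 <- prcomp(X, scale. = TRUE)$sdev^2    # initial estimates on the full sample

boot_eig <- t(replicate(B, {
  idx <- sample(nrow(X), replace = TRUE)   # resample rows with replacement
  prcomp(X[idx, ], scale. = TRUE)$sdev^2
}))

# Percentile confidence intervals, one column per dimension
ci <- apply(boot_eig, 2, quantile, probs = c(0.025, 0.975))

boxplot(boot_eig, names = paste0("Dim", 1:4), main = "Bootstrap eigenvalues (sketch)")
points(1:4, eig0, pch = 19, col = "red")                 # initial estimator
points(1:4, colMeans(boot_eig), pch = 4, col = "blue")   # bootstrap mean
```

Plotting the initial estimator next to the bootstrap mean, as the method documented here does, gives a quick visual check of the bias of each estimate.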
Plots an object of class PCoABootstrap
Description
Plots an object of class PCoABootstrap
Usage
## S3 method for class 'PCoABootstrap'
plot(x, F1=1, F2=2, Move2Center=TRUE,
BootstrapPlot="Ellipse", confidence=0.95, Colors=NULL, ...)
Arguments
x |
An object of class "PCoABootstrap" |
F1 |
First dimension to plot |
F2 |
Second dimension to plot |
Move2Center |
Translate the ellipse center to the coordinates |
BootstrapPlot |
Type of Bootstrap plot to draw: "Ellipse", "ConvexHull", "Star" |
confidence |
Confidence level for the bootstrap plot |
Colors |
Colors of the objects |
... |
Additional parameters for graphical representations |
Details
Draws the bootstrap confidence regions for the coordinates of the points obtained from a Principal Coordinates Analysis.
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
References
J.R. Demey, J.L. Vicente-Villardon, M.P. Galindo, A.Y. Zambrano, Identifying molecular markers associated with classifications of genotypes by external logistic biplot, Bioinformatics 24 (2008) 2832.
Examples
data(spiders)
Dis=BinaryProximities(spiders)
pco=PrincipalCoordinates(Dis, Bootstrap=TRUE, BootstrapType="Products")
plot(pco, Bootstrap=TRUE)
Plots an object of class Principal.Coordinates
Description
Plots an object of class Principal.Coordinates
Usage
## S3 method for class 'Principal.Coordinates'
plot(x, A1 = 1, A2 = 2, LabelRows = TRUE,
WhatRows = NULL, RowCex = 1, RowPch = 16, Title = "", RowLabels = NULL,
RowColors = NULL, ColColors = NULL, ColLabels = NULL, SizeQualInd = FALSE,
SmartLabels = TRUE, ColorQualInd = FALSE, ColorQual = "black", PlotSup = TRUE,
Bootstrap = FALSE, BootstrapPlot = c("Ellipse", "CovexHull", "Star"),
margin = 0, PlotClus = FALSE, TypeClus = "ch", ClustConf = 1,
CexClustCenters = 1, LegendClust = TRUE, ClustCenters = FALSE,
UseClusterColors = TRUE, ShowAxis = FALSE, PlotBinaryMeans = FALSE,
MinIncidence = 0, ShowBox = FALSE, ColorSupContVars = NULL,
ColorSupBinVars = NULL, ColorSupOrdVars = NULL, TypeScale = "Complete",
SupMode = "s", PlotSupVars = FALSE, ...)
Arguments
x |
Object of class "Principal.Coordinates" |
A1 |
First dimension of the plot |
A2 |
Second dimension of the plot |
LabelRows |
Controls if the points are labelled. Usually TRUE. |
WhatRows |
What Rows to plot. A vector of 0/1 elements. If NULL all rows are plotted |
RowCex |
Size of the points. Can be a single number or a vector. |
RowPch |
Symbols for the points. |
Title |
Title for the graph |
RowLabels |
Labels for the rows. If NULL row names of the data matrix are used. |
RowColors |
Colors for the rows. If NULL, default colors are assigned. Can be a single value or a vector of colors. |
ColColors |
Colors for the columns (Variables) |
ColLabels |
Labels for the columns (Variables) |
SizeQualInd |
Controls if the size of points depends on the quality of representation. |
SmartLabels |
Controls the way labels are plotted on the graph. If TRUE, labels for points with positive x values are placed to the right of the point and labels for points with negative x values to the left. |
ColorQualInd |
Controls if the color of the points depends on the quality of representation. |
ColorQual |
Darker color for the quality scale. |
PlotSup |
Controls if the supplementary points are plotted. |
Bootstrap |
Controls if the bootstrap points are plotted. |
BootstrapPlot |
Type of plot of the Bootstrap Information. The types are "Ellipse", "CovexHull" or "Star". |
margin |
Margin for the graph. |
PlotClus |
Should the clusters be plotted? |
TypeClus |
Type of plot for the clusters. ("ch"- Convex Hull, "el"- Ellipse or "st"- Star) |
ClustConf |
Percent of points included in the cluster. Only the ClustConf percent of the points nearest to the center are used to calculate the cluster. |
CexClustCenters |
Size of the cluster centers |
LegendClust |
Legends for the clusters |
ClustCenters |
Should the cluster centers be plotted |
UseClusterColors |
Should the cluster colors be used in the plot |
ShowAxis |
Logical variable to control if the coordinate axes should appear in the plot. The default value is FALSE because for most of the biplots its presence is irrelevant. |
PlotBinaryMeans |
Plot the mean of the presence points for each variable |
MinIncidence |
Minimum incidence to keep a variable |
ShowBox |
Should a box around the points be plotted? |
ColorSupContVars |
Colors for the supplementary continuous variables |
ColorSupBinVars |
Colors for the supplementary binary variables |
ColorSupOrdVars |
Colors for the supplementary ordinal variables |
TypeScale |
Type of scales for the plot |
SupMode |
Mode of the supplementary variables |
PlotSupVars |
Should the supplementary variables be plotted |
... |
Additional parameters for graphical representations |
Details
Graphical representation of a Principal Coordinates Analysis, controlling visual aspects of the plot such as colors, symbols or sizes of the points.
Value
No value is returned
Author(s)
Jose Luis Vicente-Villardon
References
J.R. Demey, J.L. Vicente-Villardon, M.P. Galindo, A.Y. Zambrano, Identifying molecular markers associated with classifications of genotypes by external logistic biplot, Bioinformatics 24 (2008) 2832.
See Also
Examples
data(spiders)
dist=BinaryProximities(spiders)
pco=PrincipalCoordinates(dist)
plot(pco)
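For readers without the package at hand, the kind of configuration being plotted can be approximated in base R. This sketch uses simulated presence/absence data, dist() with the "binary" method and cmdscale(); these are stand-ins chosen for illustration and are not claimed to match BinaryProximities or PrincipalCoordinates.

```r
# Classical principal coordinates on a binary dissimilarity (base-R sketch).
set.seed(2)
Y <- matrix(rbinom(28 * 12, 1, 0.4), 28, 12)          # simulated presence/absence table
Y[cbind(1:28, sample(12, 28, replace = TRUE))] <- 1   # guarantee one presence per row
rownames(Y) <- paste0("site", 1:28)

D <- dist(Y, method = "binary")                       # Jaccard-type dissimilarity
pco <- cmdscale(D, k = 2, eig = TRUE)                 # principal coordinates

plot(pco$points, pch = 16, asp = 1, xlab = "Dim 1", ylab = "Dim 2",
     main = "Principal coordinates (sketch)")
text(pco$points, labels = rownames(Y), pos = 3, cex = 0.7)
```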
Plots an object of class "Procrustes"
Description
Plots Simple Procrustes Analysis
Usage
## S3 method for class 'Procrustes'
plot(x, F1=1, F2=2, ...)
Arguments
x |
Object of class "Procrustes" |
F1 |
First dimension of the plot |
F2 |
Second dimension of the plot |
... |
Additional parameters for graphical representations |
Details
Graphical representation of an Orthogonal Procrustes Analysis.
Value
No value is returned
Author(s)
Jose Luis Vicente-Villardon
See Also
Examples
data(spiders)
dist=BinaryProximities(spiders)
pco=PrincipalCoordinates(dist)
plot(pco)
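The orthogonal Procrustes fit that this method displays can be illustrated with a short base-R sketch. It rotates one simulated configuration onto another through the singular value decomposition; the data, the noise level and the object names are assumptions, and the package's own Procrustes routine may also handle scaling or translation differently.

```r
# Orthogonal Procrustes rotation via the SVD (base-R sketch).
set.seed(3)
X <- matrix(rnorm(40), 20, 2)                         # target configuration
ang <- pi / 6
R0 <- matrix(c(cos(ang), sin(ang), -sin(ang), cos(ang)), 2, 2)
Y <- X %*% R0 + matrix(rnorm(40, sd = 0.05), 20, 2)   # rotated, noisy copy of X

Xc <- scale(X, scale = FALSE)                         # center both configurations
Yc <- scale(Y, scale = FALSE)
s <- svd(crossprod(Yc, Xc))                           # Yc'Xc = U D V'
Q <- s$u %*% t(s$v)                                   # orthogonal Q minimizing ||Xc - Yc Q||
Yfit <- Yc %*% Q                                      # fitted configuration

plot(Xc, pch = 16, asp = 1, xlab = "Dim 1", ylab = "Dim 2")
points(Yfit, pch = 1, col = "red")
segments(Xc[, 1], Xc[, 2], Yfit[, 1], Yfit[, 2], col = "grey")
```

The grey segments join each target point to its fitted counterpart, so short segments indicate a good match between the two configurations.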
Plots a Statis Biplot Object
Description
Plots a Statis Biplot Object
Usage
## S3 method for class 'StatisBiplot'
plot(x, A1 = 1, A2 = 2, PlotType = "Biplot",
PlotRowTraj = FALSE, PlotVarTraj = FALSE, LabelTraj = "Begining",
VarColorType = "ByVar", VarColors = NULL, VarLabels = NULL,
RowColors = NULL, TableColors = NULL, RowRandomColors = FALSE,
TypeTraj = "line", ...)
Arguments
x |
A Statis object |
A1 |
First dimension of the plot |
A2 |
Second dimension of the plot |
PlotType |
Type of plot: Interstructure, Correlations, Contributions or Biplot |
PlotRowTraj |
Should the row trajectories be plotted? |
PlotVarTraj |
Should the variables trajectories be plotted? |
LabelTraj |
Where the trajectories should be labelled: "Begining" or "End". |
VarColorType |
The colors for the variables should be set by table (ByTable) or by variable (ByVar) |
VarColors |
Colors for the variables. |
VarLabels |
Labels for the variables |
RowColors |
Colors for the rows |
TableColors |
Colors for each table |
RowRandomColors |
Use random colors for the rows. |
TypeTraj |
Type of trajectory to plot: Lines or stars |
... |
Additional parameters |
Details
Plots a Statis Biplot Object. The arguments of the general biplot are as in a Continuous Biplot.
Value
A biplot
Author(s)
Jose Luis Vicente Villardon
References
Vallejo-Arboleda, A., Vicente-Villardon, J. L., & Galindo-Villardon, M. P. (2007). Canonical STATIS: Biplot analysis of multi-table group structured data based on STATIS-ACT methodology. Computational statistics & data analysis, 51(9), 4193-4205.
See Also
Examples
data(Chemical)
x= Chemical[,5:16]
X=Convert2ThreeWay(x,Chemical$WEEKS, columns=FALSE)
stbip=StatisBiplot(X)
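The "Interstructure" plot type compares the tables with each other. A common way to do this in STATIS is through RV coefficients between the cross-product matrices of the tables; the base-R sketch below illustrates that idea on simulated tables. The helper rv() and all object names are hypothetical, and the package may compute the interstructure with different options.

```r
# Interstructure of several tables through RV coefficients (base-R sketch).
set.seed(7)
tables <- lapply(1:4, function(k) matrix(rnorm(60), 20, 3))   # four tables, same 20 rows

# Hypothetical helper: RV coefficient between two column-centered tables
rv <- function(A, B) {
  SA <- tcrossprod(scale(A, scale = FALSE))   # cross-products between rows of A
  SB <- tcrossprod(scale(B, scale = FALSE))   # cross-products between rows of B
  sum(SA * SB) / sqrt(sum(SA^2) * sum(SB^2))
}

idx <- seq_along(tables)
RV <- outer(idx, idx, Vectorize(function(i, j) rv(tables[[i]], tables[[j]])))

# Euclidean image of the tables from the eigendecomposition of the RV matrix
inter <- eigen(RV, symmetric = TRUE)
coords <- inter$vectors[, 1:2] %*% diag(sqrt(inter$values[1:2]))
plot(coords, pch = 16, asp = 1, xlab = "Dim 1", ylab = "Dim 2",
     main = "Interstructure (sketch)")
text(coords, labels = paste0("T", idx), pos = 3)
```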
Plots an object of class "TetraDualStatis".
Description
Plots the results of TetraDualStatis.
Usage
## S3 method for class 'TetraDualStatis'
plot(x, A1 = 1, A2 = 2, PlotType = "InterStructure",
PlotRowTraj = FALSE, PlotVarTraj = FALSE, LabelTraj = "Begining",
VarColorType = "Biplot", VarColors = NULL, VarLabels = NULL,
RowColors = NULL, TableColors = NULL, RowRandomColors = FALSE,
TypeTraj = "line", ...)
Arguments
x |
An object of class TetraDualStatis |
A1 |
Dimension for the X-axis |
A2 |
Dimension for the Y-axis |
PlotType |
Type of plot: "Biplot", "Compromise", "Correlations", "Contributions", "InterStructure". |
PlotRowTraj |
Should the row trajectories be plotted? |
PlotVarTraj |
Should the variables trajectories be plotted? |
LabelTraj |
Where the trajectories should be labelled. |
VarColorType |
One of the following: "Biplot", "ByTable", "ByVar". |
VarColors |
User colors for the variables. |
VarLabels |
User labels for the variables. |
RowColors |
User colors for the rows. |
TableColors |
User colors for the different tables. |
RowRandomColors |
Should random colors be used for the rows? |
TypeTraj |
Type of trajectory. Normally a line. |
... |
Additional graphical arguments. |
Details
Plots the results of TetraDualStatis.
Value
The plot of the results
Author(s)
Laura Vicente-Gonzalez, Jose Luis Vicente-Villardon
Examples
Plots an Unfolding Representation
Description
Plots an Unfolding Representation
Usage
## S3 method for class 'Unfolding'
plot(x, A1 = 1, A2 = 2, ShowAxis = FALSE,
margin = 0.1, PlotSites = TRUE, PlotSpecies = TRUE, PlotEnv = TRUE,
LabelSites = TRUE, LabelSpecies = TRUE, LabelEnv = TRUE,
SpeciesQuality = FALSE, MinQualityVars = 0, dp = 0,
PlotAxis = FALSE, TypeScale = "Complete", ValuesScale = "Original",
mode = "h", CexSites = NULL, CexSpecies = NULL, CexVar = NULL,
ColorSites = NULL, ColorSpecies = NULL, ColorVar = NULL,
PchSites = NULL, PchSpecies = NULL, PchVar = NULL,
SizeQualSites = FALSE, SizeQualSpecies = FALSE,
SizeQualVars = FALSE, ColorQualSites = FALSE,
ColorQualSpecies = FALSE, ColorQualVars = FALSE, SmartLabels = FALSE,
PlotTol = FALSE, ...)
Arguments
x |
An object of class Unfolding |
A1 |
Axis 1 of the representation. |
A2 |
Axis 2 of the representation. |
ShowAxis |
Should the axis be shown? |
margin |
Margin for the plot (percentage) |
PlotSites |
Should the sites be plotted? |
PlotSpecies |
Should the species be plotted? |
PlotEnv |
Should the environmental variables be plotted? |
LabelSites |
Should the sites be labelled? |
LabelSpecies |
Should the species be labelled? |
LabelEnv |
Should the environmental variables be labelled? |
SpeciesQuality |
Min species quality to plot |
MinQualityVars |
Minimum quality of a var to be plotted. |
dp |
A set of indices with the variables that will show the projections of the individuals. |
PlotAxis |
Should the axis be plotted? |
TypeScale |
Type of scale to use : "Complete", "StdDev" or "BoxPlot" |
ValuesScale |
Values to show on the scale: "Original" or "Transformed" |
mode |
Mode of the biplot: "p", "a", "b", "h", "ah" and "s". |
CexSites |
Size for the symbols and labels of the sites. Can be a single common size for all the points or a vector with individual sizes. |
CexSpecies |
Size for the symbols and labels of the species. Can be a single common size for all the points or a vector with individual sizes. |
CexVar |
Size for the symbols and labels of the variables. Can be a single common size for all the points or a vector with individual sizes. |
ColorSites |
Color for the symbols and labels of the sites. Can be a single common color for all the points or a vector with individual colors. |
ColorSpecies |
Color for the symbols and labels of the species. Can be a single common color for all the points or a vector with individual colors. |
ColorVar |
Color for the symbols and labels of the variables. Can be a single common color for all the points or a vector with individual colors. |
PchSites |
Symbol for the sites points. See |
PchSpecies |
Symbol for the species points. See |
PchVar |
Symbol for the variables points. See |
SizeQualSites |
Should the size of the site points be related to their qualities of representation (predictiveness)? |
SizeQualSpecies |
Should the size of the species points be related to their qualities of representation (predictiveness)? |
SizeQualVars |
Should the size of the variables points be related to their qualities of representation (predictiveness)? |
ColorQualSites |
Should the color of the sites points be related to their qualities of representation (predictiveness)? |
ColorQualSpecies |
Should the color of the species points be related to their qualities of representation (predictiveness)? |
ColorQualVars |
Should the color of the variables points be related to their qualities of representation (predictiveness)? |
SmartLabels |
Plot the labels in a smart way |
PlotTol |
Should the tolerances be plotted |
... |
Additional graphical parameters. |
Details
Plots an Unfolding Representation
Value
A plot of the unfolding representation.
Author(s)
Jose Luis Vicente-Villardon
References
de Leeuw, J. (2005). Multidimensional unfolding. Encyclopedia of statistics in behavioral science.
Examples
# Not yet
Plot a concentration ellipse.
Description
Plot a concentration ellipse obtained from ConcEllipse.
Usage
## S3 method for class 'ellipse'
plot(x, add=TRUE, labeled= FALSE ,
center=FALSE, centerlabel="Center", initial=FALSE, ...)
Arguments
x |
An object with class "ellipse" |
add |
Should the ellipse be added to the current plot? |
labeled |
Should the ellipse be labelled with the confidence level? |
center |
Should the center be plotted? |
centerlabel |
Label for the center. |
initial |
Should the initial data be plotted? |
... |
Any other graphical parameter that can affect the plot (such as color) |
Details
Plots an ellipse containing a specified percentage of the data.
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
References
Meulman, J. J., & Heiser, W. J. (1983). The display of bootstrap solutions in multidimensional scaling. Murray Hill, NJ: Bell Laboratories.
Linting, M., Meulman, J. J., Groenen, P. J., & Van der Kooij, A. J. (2007). Stability of nonlinear principal components analysis: An empirical study using the balanced bootstrap. Psychological Methods, 12(3), 359.
See Also
ConcEllipse
Examples
data(iris)
dat=as.matrix(iris[1:50,1:2])
plot(iris[,1], iris[,2],col=iris[,5], asp=1)
E=ConcEllipse(dat, 0.95)
plot(E, labeled=TRUE, center=TRUE)
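A concentration ellipse of this kind can also be drawn directly from the sample mean and covariance. The sketch below is a base-R illustration that assumes approximate bivariate normality and uses a chi-square quantile for the radius; how ConcEllipse itself determines the ellipse is not stated here, so the construction and the object names should be read as assumptions.

```r
# Concentration ellipse from the mean and covariance (base-R sketch).
dat <- as.matrix(iris[1:50, 1:2])
conf <- 0.95
center <- colMeans(dat)
S <- cov(dat)
radius <- sqrt(qchisq(conf, df = 2))           # assumes approximate bivariate normality

theta <- seq(0, 2 * pi, length.out = 200)
circle <- cbind(cos(theta), sin(theta))
# Map the unit circle to the ellipse: chol(S) is the factor R with R'R = S
ell <- sweep(radius * circle %*% chol(S), 2, center, "+")

plot(dat, asp = 1, pch = 16)
lines(ell, col = "blue")
points(center[1], center[2], pch = 3, col = "blue")
```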
Plots a fraction of the data as a cluster
Description
Plots a convex hull or a star containing a specified percentage of the data. Used to plot clusters.
Usage
## S3 method for class 'fraction'
plot(x, add = TRUE, center = FALSE,
centerlabel = "Center", initial = FALSE, type = "ch", ...)
Arguments
x |
An object with class "fraction" |
add |
Should the fraction be added to the current plot? |
center |
Should the center be plotted? |
centerlabel |
Label for the center. |
initial |
Should the initial data be plotted? |
type |
Type of plot. Can be: "ch"- Convex Hull or "st" - Star (Joining each point with the center) |
... |
Any other graphical parameter that can affect the plot (such as color) |
Details
Plots a convex hull or a star containing a specified percentage of the data.
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
See Also
Examples
a=matrix(runif(50), 25,2)
a2=Fraction(a, 0.7)
plot(a2, add=FALSE, type="ch", initial=TRUE, center=TRUE, col="blue")
plot(a2, add=TRUE, type="st", col="red")
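The two plot types can be mimicked in base R to make the idea concrete. This sketch keeps the fraction of points nearest to the centroid and draws their convex hull and a star; the use of the mean as center and of Euclidean distance are assumptions for illustration, not a description of how Fraction selects the points.

```r
# Convex hull and star for the fraction of points nearest to the center (base-R sketch).
set.seed(4)
a <- matrix(runif(50), 25, 2)
frac <- 0.7                                     # fraction of points to keep
center <- colMeans(a)                           # assumed center: the centroid
d <- sqrt(rowSums(sweep(a, 2, center)^2))       # Euclidean distance to the center
keep <- order(d)[seq_len(ceiling(frac * nrow(a)))]
inner <- a[keep, , drop = FALSE]
hull <- inner[chull(inner), , drop = FALSE]     # vertices of the convex hull

plot(a, asp = 1, pch = 16, col = "grey")
polygon(hull, border = "blue")                                        # convex hull outline
segments(center[1], center[2], inner[, 1], inner[, 2], col = "red")   # star from the center
```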
Prints the results of Model-Based Gaussian Clustering algorithms
Description
Prints the results of Model-Based Gaussian Clustering algorithms
Usage
## S3 method for class 'MGC'
print(x, ...)
Arguments
x |
An object of class "MGC" |
... |
Any additional parameters |
Details
Prints the results of Model-Based Gaussian Clustering algorithms
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
Examples
Prints an object of class RidgeBinaryLogistic
Description
Prints an object of class RidgeBinaryLogistic
Usage
## S3 method for class 'RidgeBinaryLogistic'
print(x, ...)
Arguments
x |
An object of class "RidgeBinaryLogistic" |
... |
Additional arguments |
Details
Prints an object of class RidgeBinaryLogistic
Value
The main results of a binary logistic regression
Author(s)
Jose Luis Vicente Villardon
Examples
# Not yet
Ecological data from Riano (Spain)
Description
Ecological data from Riano (Spain)
Usage
data("riano")
Format
A data frame with 70 observations on the following 25 variables.
Week
a factor with levels A B C D E F G H I J
Depth
a factor with levels 0 2 5 10 15 20 Bottom
Cianof
a numeric vector
Crisof
a numeric vector
Haptof
a numeric vector
Crasp
a numeric vector
Cripto
a numeric vector
Dinof
a numeric vector
Diatom
a numeric vector
Euglen
a numeric vector
Prasin
a numeric vector
Clorof
a numeric vector
Zigofi
a numeric vector
Xantof
a numeric vector
malgas
a numeric vector
Ta
a numeric vector
X02
a numeric vector
pH
a numeric vector
COND
a numeric vector
SiO2
a numeric vector
P.PO4
a numeric vector
Chla
a numeric vector
Chlb
a numeric vector
Chlc
a numeric vector
IM
a numeric vector
Details
Ecological data from Riano (Spain): abundance of several algae taxonomic groups and several environmental variables.
Source
Department of Ecology. University of Leon. Spain
Examples
data(riano)
Extract the scores of a CCA solution object
Description
Extract the scores of a CCA solution object
Usage
scores.CCA.sol(CCA.sol)
Arguments
CCA.sol |
The results of a CCA model |
Details
Extract the scores of a CCA solution object
Value
The species, sites and environmental variables scores of a CCA solution
Author(s)
Jose Luis Vicente Villardon
See Also
Examples
Smoking habits
Description
Frequency table representing smoking habits of different employees in a company
Usage
data(smoking)
Format
A data frame with 5 observations on the following 4 variables.
None
a numeric vector
Light
a numeric vector
Medium
a numeric vector
Heavy
a numeric vector
Details
Frequency table representing smoking habits of different employees in a company
Source
http://orange.biolab.si/docs/latest/reference/rst/Orange.projection.correspondence/
References
Greenacre, Michael (1983). Theory and Applications of Correspondence Analysis. London: Academic Press.
Examples
data(smoking)
Hunting Spiders Data
Description
Hunting spiders data transformed into Presence/Absence.
Usage
data(spiders)
Format
A data frame with 28 observations of presence/absence of 12 hunting spider species
- Alopacce
Presence/Absence of the species Alopecosa accentuata
- Alopcune
Presence/Absence of the species Alopecosa cuneata
- Alopfabr
Presence/Absence of the species Alopecosa fabrilis
- Arctlute
Presence/Absence of the species Arctosa lutetiana
- Arctperi
Presence/Absence of the species Arctosa perita
- Auloalbi
Presence/Absence of the species Aulonia albimana
- Pardlugu
Presence/Absence of the species Pardosa lugubris
- Pardmont
Presence/Absence of the species Pardosa monticola
- Pardnigr
Presence/Absence of the species Pardosa nigriceps
- Pardpull
Presence/Absence of the species Pardosa pullata
- Trocterr
Presence/Absence of the species Trochosa terricola
- Zoraspin
Presence/Absence of the species Zora spinimana
Source
van der Aart, P. J. M., and Smeenk-Enserink, N. (1975) Correlations between distributions of hunting spiders (Lycosidae, Ctenidae) and environmental characteristics in a dune area. Netherlands Journal of Zoology 25, 1-45.
Examples
data(spiders)
Summary of the solution of a CCA
Description
Summary of the solution of a CCA
Usage
## S3 method for class 'CCA.sol'
summary(object, ...)
Arguments
object |
An object of class CCA.sol |
... |
Additional arguments |
Details
Summary of the solution of a CCA
Value
The main results of a CCA
Author(s)
Jose Luis Vicente Villardon
See Also
Examples
Summary of a Canonical Variate Analysis
Description
Summary of a Canonical Variate Analysis
Usage
## S3 method for class 'CVA'
summary(object, ...)
Arguments
object |
An object of class CVA |
... |
Any additional arguments |
Details
Summary of a Canonical Variate Analysis
Value
The summary
Author(s)
Jose Luis Vicente Villardon
Examples
# Not yet
Summary of the solution of a Canonical Biplot Analysis
Description
Summary of the solution of a Canonical Biplot Analysis
Usage
## S3 method for class 'Canonical.Biplot'
summary(object, ...)
Arguments
object |
The result of a Canonical Biplot |
... |
Additional arguments |
Details
Summary of the results of a Canonical Biplot
Value
The summary
Author(s)
Jose Luis Vicente Villardon
Examples
Summary of the solution of a Biplot for Continuous Data
Description
Summary of the solution of a Biplot for Continuous Data
Usage
## S3 method for class 'ContinuousBiplot'
summary(object, latex = FALSE, ...)
Arguments
object |
An object of class "ContinuousBiplot" |
latex |
Should the results be in latex tables |
... |
Any additional parameters |
Details
Summary of the solution of a Biplot for Continuous Data
Value
The summary
Author(s)
Jose Luis Vicente Villardon
Examples
## Simple Biplot with arrows
data(Protein)
bip=PCA.Biplot(Protein[,3:11])
summary(bip)
Summary of Model-Based Gaussian Clustering results
Description
Summarizes the results of Model-Based Gaussian Clustering algorithms
Usage
## S3 method for class 'MGC'
summary(object, Centers = TRUE, Covariances = TRUE, ...)
Arguments
object |
An object of class "MGC" |
Centers |
Should the Centers be shown |
Covariances |
Should the Covariances be shown |
... |
Any additional parameters |
Details
Summarizes the results of Model-Based Gaussian Clustering algorithms
Value
No value returned
Author(s)
Jose Luis Vicente Villardon
Examples
Summary of the results of a PCA.
Description
Summarizes the results of a PCA Analysis.
Usage
## S3 method for class 'PCA.Analysis'
summary(object, latex = FALSE, ...)
Arguments
object |
The object with the results of a PCA Analysis. |
latex |
Should return latex tables? |
... |
Additional arguments. |
Details
Summarizes the results of a PCA Analysis, including latex tables for presentation.
Value
A summary of the main results
Author(s)
Jose Luis Vicente Villardon
Examples
# Not yet
Summary of a PCA.Bootstrap object
Description
Summary of a PCA.Bootstrap object
Usage
## S3 method for class 'PCA.Bootstrap'
summary(object, ...)
Arguments
object |
An object of class PCA.Bootstrap |
... |
Additional arguments |
Details
Summary of a PCA.Bootstrap object
Value
The summary
Author(s)
Jose Luis Vicente Villardon
Summary of a PLSR object
Description
Summary of a PLSR object
Usage
## S3 method for class 'PLSR'
summary(object, ...)
Arguments
object |
An object of class PLSR |
... |
Additional arguments |
Details
Summary of a PLSR object
Value
The summary of the object
Author(s)
Jose Luis Vicente Villardon
Summary of PLSR with a Binary Response
Description
Summary of PLSR with a single binary Response
Usage
## S3 method for class 'PLSR1Bin'
summary(object, ...)
Arguments
object |
An object of class PLSR1Bin |
... |
Additional arguments |
Details
Summary of PLSR with a single binary Response
Value
The summary
Author(s)
Jose Luis Vicente Villardon
Examples
#Not yet
Summary of the results of a Principal Coordinates Analysis
Description
Summary of the results of a Principal Coordinates Analysis
Usage
## S3 method for class 'Principal.Coordinates'
summary(object, printdata=FALSE, printproximities=FALSE,
printcoordinates=FALSE, printqualities=FALSE,...)
Arguments
object |
An object of class "Principal.Coordinates" |
printdata |
Should original data be printed. Default is FALSE |
printproximities |
Should proximities be printed. Default is FALSE |
printcoordinates |
Should coordinates be printed. Default is FALSE |
printqualities |
Should qualities of representation be printed. Default is FALSE |
... |
Additional parameters to summary. |
Details
This function is a method for the generic function summary() for class "Principal.Coordinates". It can be invoked by calling summary(x) for an object x of the appropriate class.
Value
The summary
Author(s)
Jose Luis Vicente-Villardon
Examples
data(spiders)
dist=BinaryProximities(spiders)
pco=PrincipalCoordinates(dist)
summary(pco)
Summary of a Binary Logistic Regression with Ridge Penalization
Description
Summarizes the results of a Binary Logistic Regression with Ridge Penalization
Usage
## S3 method for class 'RidgeBinaryLogistic'
summary(object, ...)
Arguments
object |
The object with the results of the logistic regression. |
... |
Any other arguments |
Details
Summarizes the results of a Binary Logistic Regression with Ridge Penalization.
Value
The summary
Author(s)
Jose Luis Vicente Villardon
Examples
# Not Yet
Summary of the results of TetraDualStatis
Description
Summary of the results of TetraDualStatis
Usage
## S3 method for class 'TetraDualStatis'
summary(object, ...)
Arguments
object |
The result of a Tetra Dual Statis Analysis |
... |
Additional arguments |
Details
Summarizes the results of TetraDualStatis
Value
No value returned
Author(s)
Laura Vicente-Gonzalez, José Luis Vicente-Villardon
Examples
# No examples yet
Tucker 3 Principal Covariates Regression
Description
Tucker 3 Principal Covariates Regression
Usage
t3pcovr(X, Y, I, J, K, L, r1 = 2, r2 = 2, r3 = 2,
conv = 1e-06, OriginalAlfa = 0.5, AlternativeLossF = 1,
nRuns = 100, StartSeed = 0)
Arguments
X |
A two way data matrix with the predictors. |
Y |
A three way data matrix with the responses. |
I |
Number of elements of first mode of 3D/2D (the common mode: rows) |
J |
number of elements of second mode of 3D (columns 3D) |
K |
number of elements of third mode of 3D (slabs) |
L |
number of elements of second mode of 2D (columns 2D) |
r1 |
Number of extracted components for the A-mode |
r2 |
Number of extracted components for the B-mode |
r3 |
Number of extracted components for the C-mode |
conv |
value for convergence (tolerance value) |
OriginalAlfa |
(0-1): importance that degree reduction and prediction have in the analysis |
AlternativeLossF |
Use the alternative loss function? 0 = no (use the original loss function: weighted SSQ, weighted with alfa); 1 = yes (use the weighted loss function with scaled SSQ, scaled by the SSQ in X and Y). |
nRuns |
Number of runs |
StartSeed |
Seed for the analysis |
Details
In behavioral research it is very common to deal with several data sets that contain information on the same set of individuals, in such a way that one data set tries to explain the others. The class of models known as PCovR focuses on the analysis of a three-way data array explained by a two-way data matrix. The Tucker3-PCovR model is a particular case of the PCovR class. It reduces the predictors to a few components and predicts the criterion from these components while, at the same time, the three-way data are fitted through the Tucker3 model. The reduction of the predictors and the prediction of the criterion are done simultaneously. The model is estimated with an alternating least squares algorithm, and a biplot representation facilitates the interpretation of the results. The paper proposing the model illustrates it with applications to coupled empirical data sets from the field of psychology.
Value
A |
Component matrix for the A-mode |
B1 |
Component matrix for the B-mode |
C |
Component matrix for the C-mode |
H |
Matrized core array (frontal slices) |
B2 |
Loading matrix of components (components x predictors) |
... |
Further arguments |
Author(s)
Elisa Frutos Bernal (efb@usal.es)
References
De Jong, S., & Kiers, H. A. (1992). Principal covariates regression: Part I. Theory. Chemometrics and Intelligent Laboratory Systems, 155-164.
Marlies Vervloet, Henk A. Kiers, Wim Van den Noortgate, Eva Ceulemans (2015). PCovR: An R Package for Principal Covariates Regression. Journal of Statistical Software, 65(8), 1-14. URL http://www.jstatsoft.org/v65/i08/.
Smilde, A. K., Bro, R., & Geladi, P. (2004). Multi-way analysis with applications in the chemical sciences. Chichester, UK: Wiley.
Examples
#Not yet
Labels of a Scatter
Description
Plots labels of points in a scattergram. Labels for points with positive x values are placed on the right of the points, and labels for points with negative x values on the left.
Usage
textsmart(A, Labels, CexPoints, ColorPoints, ...)
Arguments
A |
Coordinates of the points for the scattergram |
Labels |
Labels for the points |
CexPoints |
Size of the labels |
ColorPoints |
Colors of the labels |
... |
Additional graphical arguments |
Details
The function is used to improve the readability of the labels in a scattergram.
Value
No value returned
Author(s)
Jose Luis Vicente-Villardon
See Also
Examples
data(spiders)
dist=BinaryProximities(spiders)
pco=PrincipalCoordinates(dist)
plot(pco, SmartLabels =TRUE)
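The placement rule in the Description is easy to reproduce with the pos argument of text(). The following base-R sketch uses simulated coordinates and hypothetical labels; it illustrates the rule only and is not the body of textsmart.

```r
# Side-dependent label placement (base-R sketch of the rule described above).
set.seed(5)
A <- matrix(rnorm(20), 10, 2)          # simulated point coordinates
labs <- paste0("p", 1:10)              # hypothetical labels
side <- ifelse(A[, 1] >= 0, 4, 2)      # pos = 4: right of the point, pos = 2: left

plot(A, pch = 16, asp = 1, xlab = "Dim 1", ylab = "Dim 2")
abline(v = 0, lty = 3)
text(A[, 1], A[, 2], labels = labs, pos = side, cex = 0.8)
```

Placing labels away from the origin keeps them from piling up near the center of the plot, where most biplot points usually lie.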
Extracts the weighted averages of a CCA solution
Description
Extracts the weighted averages of a CCA solution
Usage
wa(CCA.sol, transformed = FALSE)
Arguments
CCA.sol |
The solution of a CCA |
transformed |
Average of the transformed or the original data? |
Details
Extracts the weighted averages of a CCA solution
Value
A matrix with the averages
Author(s)
Jose Luis Vicente Villardon
Examples
Weighted correlations
Description
Weighted correlations
Usage
wcor(d1, d2, w = rep(1, nrow(d1))/nrow(d1))
Arguments
d1 |
First Vector |
d2 |
Second vector to correlate |
w |
Weights for each element of the vectors |
Details
Weighted correlations
Value
Weighted correlation
Author(s)
Jose Luis Vicente Villardon
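A weighted correlation can be written from weighted means, variances and covariances. The helper below, wcor_sketch, is a hypothetical base-R stand-in that works on plain vectors; it is offered as an illustration of the quantity and is not claimed to reproduce wcor line by line.

```r
# Weighted correlation from weighted moments (base-R sketch).
wcor_sketch <- function(d1, d2, w = rep(1, length(d1)) / length(d1)) {
  w <- w / sum(w)                          # normalize the weights to sum to one
  m1 <- sum(w * d1)                        # weighted means
  m2 <- sum(w * d2)
  cov12 <- sum(w * (d1 - m1) * (d2 - m2))  # weighted covariance
  cov12 / sqrt(sum(w * (d1 - m1)^2) * sum(w * (d2 - m2)^2))
}

set.seed(6)
x <- rnorm(30)
y <- 0.5 * x + rnorm(30)
w <- runif(30)
wcor_sketch(x, y)          # equal weights: agrees with cor(x, y)
wcor_sketch(x, y, w = w)   # unequal weights
```

With equal weights the result coincides with the ordinary Pearson correlation, which is a convenient sanity check.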
Weighted quantiles
Description
Weighted quantiles
Usage
weighted.quantile(x, w, q = 0.5)
Arguments
x |
The numerical variable. |
w |
Weights |
q |
Quantile |
Value
The quantile
Author(s)
Jose Luis Vicente Villardon
Examples
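One simple definition of a weighted quantile takes the smallest value whose cumulative share of the weights reaches q. The helper below, weighted_quantile_sketch, is a hypothetical base-R illustration of that definition; the package function may use a different rule, for instance interpolation between neighbouring values.

```r
# Weighted quantile from cumulative weight shares (base-R sketch, no interpolation).
weighted_quantile_sketch <- function(x, w, q = 0.5) {
  o <- order(x)                    # sort the values
  x <- x[o]
  cw <- cumsum(w[o]) / sum(w)      # cumulative share of the weights
  x[which(cw >= q)[1]]             # first value whose share reaches q
}

# Equal weights, then a heavy weight on the largest value
weighted_quantile_sketch(c(1, 2, 3, 4), w = c(1, 1, 1, 1), q = 0.5)
weighted_quantile_sketch(c(1, 2, 3, 4), w = c(1, 1, 1, 5), q = 0.5)
```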
Wine data
Description
Comparison of young wines of Ribera de Duero and Toro
Usage
data("wine")
Format
A data frame with 45 observations on the following 21 variables.
Year
A factor with levels 1986 1987
Origin
A factor with levels Ribera Toro
Group
A factor with levels R86 R87 T86 T87
A
Alcoholic content (percentage)
VA
Volatile acidity - g acetic acid/l
TA
Total titratable acidity - g tartaric acid/l
FA
Fixed acidity - g tartaric acid/l
pH
pH
TPR
Total phenolics - g gallic acid /l - Folin
TPS
Total phenolics - Somers
V
Substances reactive to vanillin - mg catechin/l
PC
Procyanidins - mg cyanidin/l
ACR
Total Anthocyanins - mg/l - method 1
ACS
Total Anthocyanins - mg/l - method 2
ACC
Malvidin - malvidin-3-glucoside mg/l
CI
Color density
CI2
Color density 2
H
Wine Hue Color
I
Degree of Ionization - Percent
CA
Chemical Age
VPC
ratio V/PC
Details
Comparison of young wines of Ribera de Duero and Toro
Source
Rivas-Gonzalo, J. C., Gutierrez, Y., Polanco, A. M., Hebrero, E., Vicente-Villardon, J. L., Galindo, P., & Santos-Buelga, C. (1993). Biplot analysis applied to enological parameters in the geographical classification of young red wines. American journal of enology and viticulture, 44(3), 302-308.
References
Rivas-Gonzalo, J. C., Gutierrez, Y., Polanco, A. M., Hebrero, E., Vicente-Villardon, J. L., Galindo, P., & Santos-Buelga, C. (1993). Biplot analysis applied to enological parameters in the geographical classification of young red wines. American journal of enology and viticulture, 44(3), 302-308.
Examples
data(wine)
Matrix of zeros as in Matlab
Description
Matrix of zeros
Usage
zeros(n)
Arguments
n |
Dimension of the matrix |
Value
A matrix of zeros
Author(s)
Jose Luis Vicente Villardon
Examples
zeros(6)