Title: | Various Andrews Curves |
---|---|
Description: | Visualisation of multidimensional data through different Andrews curves: Andrews, D. F. (1972) Plots of High-Dimensional Data. Biometrics, 28(1), 125-136. <doi:10.2307/2528964>. |
Authors: | Jaroslav Myslivec [aut], Sigbert Klinke [cre, ctb] |
Maintainer: | Sigbert Klinke <[email protected]> |
License: | GPL-3 |
Version: | 1.1.2 |
Built: | 2025-02-19 04:44:22 UTC |
Source: | https://github.com/sigbertklinke/andrews |
Andrews curves for visualization of multidimensional data. For colouring the curves see the details. For differences between andrews and andrews0 see the vignette("andrews"). Called with the same parameters, both functions should create the same plot.
type==5 is a modification of type==3 and type==6 is a modification of type==4.
andrews( df, type = 1, clr = NULL, step = 100, ymax = 10, alpha = NULL, palcol = NULL, lwd = 1, lty = "solid", ... )
df | data frame or an R object that can be converted into a data frame with |
---|---|
type | type of curve |
clr | number/name of column in the data frame for color of curves |
step | smoothness of curves |
ymax | maximum of |
alpha | semi-transparent color ( |
palcol | a function which generates a set of colors, see details |
lwd | line width, a positive number, defaulting to 1. |
lty | line type, can either be specified as an integer (0=blank, 1=solid (default), 2=dashed, 3=dotted, 4=dotdash, 5=longdash, 6=twodash) or as one of the character strings "blank", "solid", "dashed", "dotted", "dotdash", "longdash", or "twodash", where "blank" uses ‘invisible lines’ (i.e., does not draw them). |
... | further named parameters given to |
If clr has length one then it is used as the column number or column name for colouring the curves:
If df[,clr] is numeric then palcol must be a function which returns colors for values in the range \[0, 1\], using the normalized variable. The default is function(v) { hsv(0,1,v) }.
Otherwise df[,clr] is converted to a factor and palcol must be a function which returns a color for each level. The parameter for palcol is the number of levels and the default is grDevices::rainbow().
If the length of clr is the number of rows of df then clr is interpreted as colors.
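The colouring rules above can be sketched in base R. This is a minimal illustration and not the package's internal code: the helper name `curve_colors` is hypothetical, and reading "normalized" as min-max scaling to \[0, 1\] is an assumption.

```r
# Hypothetical helper illustrating the colouring rules; not part of the package.
curve_colors <- function(df, clr, palcol = NULL) {
  # clr already holds one colour per row: use it as is
  if (length(clr) == nrow(df)) return(clr)
  v <- df[, clr]
  if (is.numeric(v)) {
    if (is.null(palcol)) palcol <- function(v) grDevices::hsv(0, 1, v)
    # assumption: "normalized" means min-max scaling to [0, 1]
    palcol((v - min(v)) / (max(v) - min(v)))
  } else {
    if (is.null(palcol)) palcol <- grDevices::rainbow
    f <- as.factor(v)
    # palcol receives the number of levels; each row gets its level's colour
    palcol(nlevels(f))[as.integer(f)]
  }
}

num_cols <- curve_colors(iris, 1)          # numeric column
fac_cols <- curve_colors(iris, "Species")  # factor column
```

Any function with the same input and output shape could be passed as `palcol`, for example `grDevices::heat.colors` for the factor case.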
Andrews curves transform multidimensional data into curves. This package presents four types of curves.
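As a rough illustration of the transformation, the classic curve from Andrews (1972) maps an observation x to f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + ... The sketch below implements only that classic form in base R; the function name `andrews_curve` is hypothetical, and which `type` value it corresponds to in this package is not stated here.

```r
# Classic Andrews (1972) curve for one observation x, evaluated at angles t.
andrews_curve <- function(x, t) {
  y <- rep(x[1] / sqrt(2), length(t))
  for (i in seq_along(x)[-1]) {
    k <- i %/% 2                       # frequency: 1, 1, 2, 2, 3, ...
    # even positions use sine terms, odd positions use cosine terms
    y <- y + x[i] * (if (i %% 2 == 0) sin(k * t) else cos(k * t))
  }
  y
}

tt <- seq(-pi, pi, length.out = 101)   # evaluation grid
x  <- as.numeric(iris[1, 1:4])         # first iris observation
y  <- andrews_curve(x, tt)
# plot(tt, y, type = "l") would draw the curve for this observation
```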
nothing
Sigbert Klinke [email protected], Jaroslav Myslivec [email protected]
Andrews, D. F. (1972) Plots of High-Dimensional Data. Biometrics, vol. 28, no. 1, pp. 125-136.
Khattree, R., Naik, D. N. (2002) Andrews Plots for Multivariate Data: Some New Suggestions and Applications. Journal of Statistical Planning and Inference, vol. 100, no. 2, pp. 411-425.
data(iris)
op <- par(mfrow=c(1,2))
andrews0(iris,clr=5,ymax=3)
andrews(iris,clr=5,ymax=3)
par(op)
andrews(iris,type=4,clr=5,ymax=NA)
Andrews curves for visualization of multidimensional data. For differences between andrews and andrews0 see the vignette("andrews"). For colouring the curves see the details.
andrews0( df, type = 1, clr = NULL, step = 100, ymax = 10, main = NULL, sub = NULL )
df | data frame |
---|---|
type | type of curve |
clr | number/name of column in the data frame for color of curves |
step | smoothness of curves |
ymax | maximum of |
main | main title for the plot |
sub | sub title for the plot |
Andrews curves transform multidimensional data into curves. This package presents four types of curves.
If df[,clr] is numeric then hsv(1,1,v) with the normalized values (on \[0, 1\]) of df[,clr] is used. Otherwise the number of unique values in nuv <- unique(df[,clr]) is used in connection with rainbow(nuv).
nothing
Jaroslav Myslivec [email protected]
Andrews, D. F. (1972) Plots of High-Dimensional Data. Biometrics, vol. 28, no. 1, pp. 125-136.
Khattree, R., Naik, D. N. (2002) Andrews Plots for Multivariate Data: Some New Suggestions and Applications. Journal of Statistical Planning and Inference, vol. 100, no. 2, pp. 411-425.
data(iris)
andrews0(iris,clr=5,ymax=3)
andrews0(iris,type=4,clr=5,ymax=2)
The data set contains six measurements made on 100 genuine and 100 counterfeit old-Swiss 1000-franc bank notes. The data frame and the documentation are a copy of mclust::banknote.
banknote
A data frame with 200 rows and 7 columns:
the status of the banknote: genuine or counterfeit
Length of bill (mm)
Width of left edge (mm)
Width of right edge (mm)
Bottom margin width (mm)
Top margin width (mm)
Length of diagonal (mm)
Flury, B. and Riedwyl, H. (1988). Multivariate Statistics: A practical approach. London: Chapman & Hall, Tables 1.1 and 1.2, pp. 5-8.
Defines a function which can be used as a basis for Andrews curves.
deftype(index = NULL, FUN = NULL, xlim = c(-pi, pi))
index | index/name of the function |
---|---|
FUN | function of the form |
xlim | default range for displaying curves (default: |
either a list of all functions or a single function
# define a new andrews curve, just with sine curves
deftype("sine", function(n, t) {
  n <- as.integer(if (n<1) 1 else n)
  m <- matrix(NA, nrow=length(t), ncol=n)
  for (i in 1:n) m[,i] <- sin(i*t)
  m
})
andrews(iris, "sine")
# query
deftype()
deftype("sine")
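To see how a basis function of this shape might be turned into curves, the sketch below multiplies a numeric data matrix by the transposed basis matrix, so that each row of the result is one curve. This is an assumption about how such a basis is used, not a description of the package's internals; `sine_basis`, `tt` and `curves` are hypothetical names.

```r
# Basis function with the same shape as the "sine" example above:
# returns a length(t) x n matrix, one column per variable.
sine_basis <- function(n, t) {
  n <- as.integer(if (n < 1) 1 else n)
  sin(outer(t, seq_len(n)))            # entry [i, j] is sin(j * t[i])
}

tt     <- seq(-pi, pi, length.out = 101)  # evaluation grid over the default xlim
x      <- as.matrix(iris[, 1:4])          # 150 observations, 4 variables
basis  <- sine_basis(ncol(x), tt)         # 101 x 4
curves <- x %*% t(basis)                  # 150 x 101: one curve per observation
# matplot(tt, t(curves), type = "l") would draw all curves at once
```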
Generates a vector of the first n primes using gmp::nextprime().
generate_n_primes(n, one = FALSE)
n | the number of primes to generate. |
---|---|
one | should |
an integer vector of prime numbers
generate_n_primes(5)
generate_n_primes(5, TRUE)
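For readers without gmp installed, the idea of collecting the first n primes can be sketched in base R with trial division. This is a stand-in technique, not the gmp::nextprime() approach the package uses, and `first_n_primes` is a hypothetical name; the `one` argument is not modelled.

```r
# Hypothetical base-R stand-in: first n primes by trial division.
first_n_primes <- function(n) {
  primes <- integer(0)
  cand <- 2L
  while (length(primes) < n) {
    # cand is prime if no already-found prime up to sqrt(cand) divides it
    small <- primes[primes <= sqrt(cand)]
    if (all(cand %% small != 0L)) primes <- c(primes, cand)
    cand <- cand + 1L
  }
  primes
}

first_n_primes(5)
```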
Normalization of a variable: type==1: ar is normalized into \[0, 1\], type==2: ar is standardized, otherwise no normalization is done.
normalize(ar, type = 1)
ar | numeric variable. |
---|---|
type | integer: type of normalization (default: |
Normalization of variable: ar<-(ar-min(ar))/(max(ar)-min(ar))
Returns normalized variable.
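The normalization rules can be written out in a few lines of base R. This is a sketch under stated assumptions: `normalize_sketch` is a hypothetical name, and "standardized" is read as subtracting the mean and dividing by the standard deviation.

```r
# Sketch of the normalization rules; `normalize_sketch` is a hypothetical name.
normalize_sketch <- function(ar, type = 1) {
  if (type == 1) {
    (ar - min(ar)) / (max(ar) - min(ar))   # min-max scaling, as in the formula above
  } else if (type == 2) {
    (ar - mean(ar)) / sd(ar)               # assumption: z-score standardization
  } else {
    ar                                     # any other type: unchanged
  }
}

range(normalize_sketch(iris[, 1]))         # smallest and largest scaled value
```

Min-max scaling keeps the shape of the distribution but is sensitive to extreme values, since a single outlier compresses all other values towards one end of the range.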
Jaroslav Myslivec [email protected], Sigbert Klinke [email protected]
normalize(iris[,1])
Extracts numeric array from data frame.
numarray(df)
df | data frame. |
Extracts numeric array from data frame.
Returns numeric array.
Jaroslav Myslivec [email protected], Sigbert Klinke [email protected]
numarray(iris)
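A comparable extraction can be sketched in base R, assuming that "numeric array" means a matrix built from the numeric columns only; `numeric_part` is a hypothetical name and the package's exact return format may differ.

```r
# Sketch: keep only the numeric columns of a data frame and return them as a matrix.
numeric_part <- function(df) {
  keep <- vapply(df, is.numeric, logical(1))  # TRUE for numeric columns
  as.matrix(df[keep])
}

m <- numeric_part(iris)   # drops the factor column Species
dim(m)
```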
Computes the Stahel-Donoho outlyingness. If type is any of the available types by andrews() then the projection vectors are generated along the andrews curves. Otherwise step random directions will be used. Note that the projection vectors are always normalized to length one.
outlyingness(x, type = 1, step = 100, xlim = NULL, normalize = 1)
x | data frame |
---|---|
type | type of curve, see |
step | smoothness of curves |
xlim | the x limits (x1, x2) |
normalize | type of normalization, see |
the Stahel-Donoho outlyingness
Stahel, W. (1981), Robuste Schätzungen: infinitesimale Optimalität und Schätzungen von Kovarianzmatrizen, PhD thesis, ETH Zürich.
Donoho, D. (1982), Breakdown properties of multivariate location estimators, Ph.D. Qualifying paper, Dept. Statistics, Harvard University, Boston.
# use projection vectors from the Andrews curve
sdo <- outlyingness(iris)
col <- gray(1-sdo/max(sdo))
andrews(iris, clr=col, ymax=NA)
# use 1000 random projection vectors
sdo <- outlyingness(iris, type=0, step=1000)
col <- gray(1-sdo/max(sdo))
andrews(iris, clr=col, ymax=NA)
# use 1000 random projection vectors with adjusted outlyingness
library("robustbase")
x <- numarray(iris)
x <- scale(x, center=apply(x, 2, min), scale=apply(x, 2, max)-apply(x, 2, min))
sdo <- adjOutlyingness(x, ndir=1000, only.outlyingness=TRUE)
col <- gray(1-sdo/max(sdo))
andrews(as.data.frame(x), clr=col, ymax=NA)
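The random-direction variant of the Stahel-Donoho outlyingness can be sketched in base R: project the data onto unit vectors, measure each point's distance from the median in units of the MAD, and keep the largest value over all directions. This is a minimal sketch of the textbook definition, not the package's implementation; `sdo_sketch` and `ndir` are hypothetical names, and R's default MAD scaling constant is an assumption.

```r
# Stahel-Donoho outlyingness with random unit directions (sketch).
sdo_sketch <- function(x, ndir = 200) {
  x <- as.matrix(x)
  a <- matrix(rnorm(ndir * ncol(x)), ncol = ndir)   # one direction per column
  a <- sweep(a, 2, sqrt(colSums(a^2)), "/")         # normalize to length one
  proj <- x %*% a                                   # n x ndir projections
  med  <- apply(proj, 2, median)                    # robust centre per direction
  mads <- apply(proj, 2, mad)                       # robust scale per direction
  dev  <- abs(sweep(proj, 2, med, "-"))
  dev  <- sweep(dev, 2, mads, "/")
  apply(dev, 1, max)                                # worst direction per observation
}

set.seed(1)
x <- rbind(matrix(rnorm(100), ncol = 2), c(20, 20)) # 50 points plus one planted outlier
sdo <- sdo_sketch(x)
which.max(sdo)                                      # index of the most outlying row
```

More directions give a better approximation of the supremum over all projections, at a cost that grows linearly in `ndir`.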
Selecting object utility in Andrews curves
selectand(df, type = 1, step = 100, ncol = 0, from = 0, to = 1, col = 2)
df | data frame. |
---|---|
type | type of curve. |
step | smoothness of curves. |
ncol | number of column in data frame for selection. |
from | from value. |
to | to value. |
col | color of selected objects. |
Define which objects will be selected (colored) in Andrews curves.
Nothing
Jaroslav Myslivec [email protected]
data(iris)
andrews(iris,clr=5,ymax=3)
selectand(iris,ncol=1,from=5,to=5.5,col=1)
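The selection step in the example above can be pictured as a simple range filter on one column. The sketch below assumes that `from` and `to` are compared against the raw column values and that both bounds are inclusive; neither is confirmed here, and `select_rows` and the two colours are hypothetical.

```r
# Sketch: flag rows whose value in column `ncol` lies in [from, to].
select_rows <- function(df, ncol, from, to) {
  v <- df[, ncol]
  v >= from & v <= to     # assumption: both bounds are inclusive
}

sel  <- select_rows(iris, ncol = 1, from = 5, to = 5.5)
cols <- ifelse(sel, "black", "grey80")   # highlight the selection, mute the rest
table(sel)                               # how many rows were selected
```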
Creates and displays a temporary PDF file with different diagrams comparing andrews and andrews0 plots.
zzz()
nothing
if (interactive()) zzz()