Title: | Access to Teaching Materials from a ZIP File or GitHub |
---|---|
Description: | Provides access to teaching materials for various statistics courses, including R and Python programs, Shiny apps, data, and PDF/HTML documents. These materials are stored on the Internet as a ZIP file (e.g., in a GitHub repository) and can be downloaded and displayed or run locally. The content of the ZIP file is stored temporarily or permanently. By default, the package uses the GitHub repository 'sigbertklinke/mmstat4.data'. Additionally, the package includes 'association_measures.R' from the archived package 'ryouready' by Mark Heckmann and some auxiliary functions. |
Authors: | Sigbert Klinke [aut, cre], Jekaterina Zukovska [ctb] |
Maintainer: | Sigbert Klinke <[email protected]> |
License: | GPL-3 |
Version: | 0.2.3 |
Built: | 2024-11-06 05:28:25 UTC |
Source: | https://github.com/sigbertklinke/mmstat4 |
askUser provides a way to ask the user a yes/no/cancel question (default). A * after a number indicates the default option.
askUser(msg, choices = c("Yes", "No", "Cancel"), default = 1, col = crayon::black)
msg |
character: the prompt message for the user |
choices |
character: vector of choices (default: c("Yes", "No", "Cancel")) |
default |
character/integer: default option if only |
col |
function: a color function (default: crayon::black) |
the integer number chosen by the user
if (interactive()) askUser("Do you want to use askUser?")
Various association coefficients for nominal and ordinal data; the input format follows stats::chisq.test().
concordant: concordant pairs
discordant: discordant pairs
ties.row: pairs tied on rows
ties.col: pairs tied on columns
nom.phi: Phi coefficient
nom.cc: Contingency coefficient (Pearson's C) and Sakoda's adjusted Pearson's C
nom.TT: Tschuprow's T (not meaningful for non-square tables)
nom.CV: Cramer's V (for 2 x 2 tables V = Phi)
nom.lambda: Goodman and Kruskal's Lambda with
  lambda.cr: The row variable is used as independent, the column variable as dependent variable.
  lambda.rc: The column variable is used as independent, the row variable as dependent variable.
  lambda.symmetric: Symmetric Lambda (the mean of both above).
nom.uncertainty: Uncertainty coefficient (Theil's U) with
  uc.cr: The row variable is used as independent, the column variable as dependent variable.
  uc.rc: The column variable is used as independent, the row variable as dependent variable.
  uc.symmetric: Symmetric uncertainty coefficient.
ord.gamma: Gamma coefficient
ord.tau: a vector with Kendall-Stuart Tau's
  tau.a: Tau-a (for square tables only)
  tau.b: Tau-b
  tau.c: Tau-c
ord.somers.d: Somers' d
eta: Eta coefficient for nominal/interval data
concordant(x, y = NULL)
discordant(x, y = NULL)
ties.row(x, y = NULL)
ties.col(x, y = NULL)
nom.phi(x, y = NULL)
nom.cc(x, y = NULL)
nom.TT(x, y = NULL)
nom.CV(x, y = NULL)
nom.lambda(x, y = NULL)
nom.uncertainty(x, y = NULL)
ord.gamma(x, y = NULL)
ord.tau(x, y = NULL)
ord.somers.d(x, y = NULL)
eta(x, y, breaks = NULL)
x |
a numeric vector, table or matrix. |
y |
a numeric vector; ignored if |
breaks |
either a numeric vector of two or more unique cut points or a single number (greater than or equal to 2)
giving the number of intervals into which |
the association coefficient(s)
From the archived ryouready package by Mark Heckmann.
The code for the calculation of nom.lambda, nom.uncertainty, ord.gamma, ord.tau, ord.somers.d was supplied by Marc Schwartz (under GPL 2) and checked against SPSS results.
## Nominal data
# remove gender from the table
hec <- apply(HairEyeColor, 1:2, sum)
nom.phi(hec)
nom.cc(hec)
nom.TT(hec)
nom.CV(hec)
nom.lambda(hec)
nom.uncertainty(hec)
## Ordinal data
# create a fake data set
ordx <- sample(5, size=100, replace=TRUE)
ordy <- sample(5, size=100, replace=TRUE)
concordant(ordx, ordy)
discordant(ordx, ordy)
ties.row(ordx, ordy)
ties.col(ordx, ordy)
ord.gamma(ordx, ordy)
ord.tau(ordx, ordy)
ord.somers.d(ordx, ordy)
## Interval/nominal data
eta(iris$Species, iris$Sepal.Length)
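As a rough cross-check of what a coefficient such as nom.CV reports, Cramer's V can be computed from the chi-squared statistic in base R. This is a minimal sketch: the helper name cramers_v is hypothetical and not part of mmstat4, and the package's own implementation may differ in detail.

```r
# Hypothetical helper, not part of mmstat4: Cramer's V from a contingency table
cramers_v <- function(tab) {
  # chi-squared statistic without continuity correction
  chi2 <- suppressWarnings(stats::chisq.test(tab, correct = FALSE))$statistic
  n <- sum(tab)            # total number of observations
  k <- min(dim(tab)) - 1   # smaller table dimension minus one
  unname(sqrt(chi2 / (n * k)))
}

hec <- apply(HairEyeColor, 1:2, sum)  # collapse the Sex dimension
v <- cramers_v(hec)
print(v)
```

The result should lie between 0 (no association) and 1 (perfect association).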
Generates and plots a cumulative distribution function.
cdf(x, ...)
## Default S3 method:
cdf(x, y, discrete = TRUE, ...)
## S3 method for class 'cdf'
plot(x, y, ..., col.01line = "black", pch = 19)
x |
numeric: x-values |
... |
further parameters given to |
y |
numeric: y-values |
discrete |
logical: if distribution is discrete |
col.01line |
color: color of horizontal lines at 0 and 1 (default: "black") |
pch |
point type: See |
returns a cdf object
# Binomial distribution
x <- cdf(0:10, pbinom(0:10, 10, 0.5))
plot(x)
# Exponential distribution
x <- seq(0, 5, by=0.01)
x <- cdf(x, pexp(x), discrete=FALSE)
plot(x)
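To illustrate what a discrete distribution function looks like as an object, the following base-R sketch builds a right-continuous step function with stats::stepfun(). It is an illustration only and makes no claim about how mmstat4 represents its cdf object internally.

```r
# Base-R sketch of a discrete CDF as a right-continuous step function
# (illustration only; not how mmstat4 builds its cdf object)
xs <- 0:10
Fx <- stats::stepfun(xs, c(0, pbinom(xs, 10, 0.5)))

Fx(-1)   # below the support: 0
Fx(5)    # equals pbinom(5, 10, 0.5)
Fx(5.5)  # constant between jump points, same value as Fx(5)
```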
checkFiles verifies whether all specified files are valid source files that can be executed independently of each other. If an error occurs, the following actions are taken:
If open is either a function name or a function with a file parameter, then checkFiles will attempt to open the faulty source file; otherwise, it will not.
The execution of checkFiles is stopped.
If you do not want the faulty source file to be opened immediately, use open=0.
Three modes are available for checking a file:
exist: Does the source file exist?
parse: (default) Is parse(file) (in R) or python -m py_compile "file" (in Python) successful?
run: Is Rscript "file" (in R) or reticulate::py_run_file(file) (in Python) successful?
If source files have side effects, e.g., generating an image or producing other outputs, and mode == "run", these side effects will occur during the check.
To prevent a script from being executed during the check, add a ## Not check: comment at the top of the script.
checkFiles(files, index = seq_along(files), path = NULL, open = openFile, mode = c("parse", "run", "exist"), ...)
Rsolo(files, index = seq_along(files), path = NULL, open = openFile, mode = c("parse", "run", "exist"), ...)
files |
character: file name(s) |
index |
integer(s): if |
path |
character: path to start from (default: NULL) |
open |
function: function or function name to call after an error occurs (default: openFile) |
mode |
character: which check to perform (default: "parse") |
... |
further parameters given to the function in |
nothing
if (interactive()) {
  # "\\.(R|py)$" matches files ending in .R or .py
  files <- list.files(pattern="\\.(R|py)$", full.names=TRUE, recursive=TRUE)
  checkFiles(files)
}
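The "parse" mode for R files can be pictured with base::parse() wrapped in tryCatch(). The sketch below uses a hypothetical helper parses_ok, which is not part of mmstat4, and only shows the idea of checking syntax without evaluating the code.

```r
# Hypothetical helper, not part of mmstat4: does an R file parse without error?
parses_ok <- function(file) {
  tryCatch({
    parse(file = file)   # only parses; the code is not evaluated
    TRUE
  }, error = function(e) FALSE)
}

good <- tempfile(fileext = ".R")
bad  <- tempfile(fileext = ".R")
writeLines("x <- 1 + 1", good)
writeLines("x <- (1 + ", bad)   # unbalanced parenthesis

res <- vapply(c(good, bad), parses_ok, logical(1), USE.NAMES = FALSE)
print(res)
unlink(c(good, bad))
```

Because parse() never evaluates the file, a script that only fails at run time would still pass this kind of check.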
Tries to open the given file with the default application of the operating system using base::system2().
Only Windows (windows), macOS (darwin), Linux (linux) and FreeBSD (freebsd) are supported.
defaultApp(file, wait = FALSE, ...)
file |
character: file name |
wait |
logical: indicates whether the R interpreter should wait for the command to finish, or run it asynchronously (default: FALSE) |
... |
further arguments passed to |
Result of try(system2, ...), invisibly
if (interactive()) {
  ghget()
  defaultApp(ghlist("dataanalysis.pdf", full.names = TRUE))
}
dupFiles computes checksums to find duplicate files.
dupFiles(files, ...)
Rdups(files, ...)
files |
character: file name(s) |
... |
further parameters given to |
a list of file names with the same checksum or NULL
if (interactive()) {
  # "\\.R$" matches files ending in .R
  files <- list.files(pattern="\\.R$", full.names=TRUE, recursive=TRUE)
  dupFiles(files)
}
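Checksum-based duplicate detection can be sketched with tools::md5sum(). The helper find_dups below is hypothetical, and MD5 is an assumption made for illustration; the checksum that dupFiles actually uses is not stated here.

```r
# Hypothetical helper, not part of mmstat4: group files by MD5 checksum
find_dups <- function(files) {
  sums   <- tools::md5sum(files)               # named vector: file -> checksum
  groups <- split(names(sums), unname(sums))   # files sharing a checksum
  dups   <- Filter(function(g) length(g) > 1, groups)
  if (length(dups) == 0) NULL else unname(dups)
}

d <- tempfile("dup")
dir.create(d)
f <- file.path(d, c("a.R", "b.R", "c.R"))
writeLines("x <- 1", f[1])
writeLines("x <- 1", f[2])   # same content as a.R
writeLines("y <- 2", f[3])

dups <- find_dups(f)
print(lapply(dups, basename))
unlink(d, recursive = TRUE)
```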
Creates a list with element names replaced by getText().
getList(...)
... |
named elements of a list |
renamed list
getList(BOSTON=1, MTCARS=2)
Provides access to the package-internal mmstat environment.
getMMstat(...)
... |
elements |
the chosen element
getMMstat('version')
Translates a given message into another language.
getText(msg)
msg |
character vector |
vector of translated messages
getText('Test')
The function gh carries out the following operations on a file named x.
It searches for a match for x within the active repository, utilizing fuzzy string matching. If no unique match is identified, an error is thrown along with suggestions for potential "best" matches.
Otherwise, one of the following operations is performed:
gh(x, 'open') or ghopen(x): Opens a file in the local browser if the file extension is html or pdf, otherwise in the RStudio editor.
gh(x, 'load') or ghload(x): Loads the contents of a file with import and trust=TRUE.
gh(x, 'source') or ghsource(x): Executes the contents of a file with source.
gh(x, 'app') or ghapp(x): Tries to open the file with the default application of the OS, see defaultApp().
ghdata(x, pkg): Helper function to load data sets from R packages into Python, simulates pkg::x.
gh(x, what = c("open", "load", "source", "app"), ..., .call = NULL)
ghopen(x, ...)
ghload(x, ...)
ghsource(x, ...)
ghapp(x, ...)
x |
character(1): name of the file, app or data set |
what |
character or function: a name of a predefined function or another function. The function must have a formal parameter |
... |
further parameters used in |
.call |
the original function call (default: NULL) |
invisibly the result of utils::browseURL(), openFile(), rio::import(), or base::source().
if (interactive()) {
  x <- ghopen("bank2.SAV")
  x <- ghload("bank2.SAV")
  str(x)
  x <- ghsource("univariate/example_ecdf.R")
}
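The rule described for gh(x, 'open'), browser for html and pdf files and editor otherwise, amounts to a dispatch on the file extension. A minimal sketch, assuming a case-insensitive comparison (an assumption, not confirmed for mmstat4) and using a hypothetical helper open_action:

```r
# Hypothetical helper, not part of mmstat4: choose an "open" action
# from the file extension
open_action <- function(file) {
  ext <- tolower(tools::file_ext(file))   # assumption: case is ignored
  if (ext %in% c("html", "pdf")) "browser" else "editor"
}

open_action("dataanalysis.pdf")
open_action("univariate/example_ecdf.R")
```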
Runs a Shiny app from the downloaded zip file.
ghappAddin()
nothing
if (interactive()) ghappAddin()
ghc creates a ghdecompose object from a list of file names using ghdecompose() and removes missing files.
ghc(...)
... |
list(s) of file names |
a ghdecompose object
ghc(list.files(system.file(package="mmstat4"), recursive=TRUE))
Decomposes the paths of a set of files (or dirs) into several parts:
ghdecompose(files, dirs = FALSE)
files |
character vector: path of files |
dirs |
logical: directory or file names (default: FALSE) |
outpath: the path part which is common to all files (basically the place where the ZIP file was extracted),
inpath: the path part which is not necessary for a unique address in the ZIP file,
minpath: the minimal path part such that all files are addressable in a unique manner,
filename: the basename of the file, and
source: the input to shortpath.
a data frame with five variables
ghget("local")
pdf <- ghdecompose(ghlist(full.names=TRUE))
pdf
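The outpath component, the part of the path shared by all files, can be pictured as the longest common leading directory. The sketch below uses a hypothetical helper common_dir and made-up file names; it is not the package's algorithm.

```r
# Hypothetical helper, not part of mmstat4: longest common leading directory
common_dir <- function(paths) {
  parts <- strsplit(dirname(paths), "/", fixed = TRUE)
  n <- min(lengths(parts))
  keep <- 0
  for (i in seq_len(n)) {
    # stop at the first level where the directories differ
    if (length(unique(vapply(parts, `[`, character(1), i))) > 1) break
    keep <- i
  }
  paste(parts[[1]][seq_len(keep)], collapse = "/")
}

# made-up paths for illustration
files <- c("tmp/repo/data/bank2.sav", "tmp/repo/data/boston.csv", "tmp/repo/R/ecdf.R")
common_dir(files)
```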
Finds either a unique match in the list of files or throws an error with possible candidate files.
ghfile(x, n = 6, silent = FALSE, msg = "%s")
x |
character: a single file name |
n |
logical: if |
silent |
logical: if no (unique) match is found, then |
msg |
character: template for how the file name(s) are put into the error message (default: "%s") |
the full matching file
ghfile("data/BANK2.sav")
if (interactive()) ghfile("data/BANK2.SAV") # throws an error
Makes a repository the active repository and downloads it if necessary.
If the parameter .tempdir is TRUE (default), the repository is stored in the temporary directory tempdir(); otherwise it is stored in the application directory rappdirs::user_data_dir() for mmstat4.
If the parameter .tempdir is not logical, its value is used as the installation path.
ghget(..., .force = FALSE, .tempdir = TRUE, .quiet = !interactive())
... |
parameters to set and activate a repository |
.force |
logical: download and unzip in any case? (default: FALSE) |
.tempdir |
logical or character: store the download temporarily or permanently (default: TRUE) |
.quiet |
logical: show repository read attempts (default: !interactive()) |
Note, the list of repository names, directories and urls is stored in the installation directory, too.
the name of the current key or nothing if unsuccessful
if (interactive()) {
  # get one of the default ZIP files from the internet
  ghget("hu.data")
  # get a locally stored ZIP file
  ghget(dummy2=system.file("zip", "mmstat4.dummy.zip", package="mmstat4"))
  # get from a URL
  ghget(dummy.url="https://github.com/sigbertklinke/mmstat4.dummy/archive/refs/heads/main.zip")
}
If the user agrees, it installs additional software necessary for running a script.
Currently, only type=="py" for Python scripts and type=="R" for R scripts are supported. When a repository is downloaded, ghinstall is called once. If the user calls ghinstall for an update, the parameter force=TRUE must be set.
ghinstall(type = c("py", "R"), force = FALSE)
type |
character: program type (default: "py") |
force |
logical: should the installation really be done? (default: FALSE) |
R: mmstat4 opens init_R.R if it is present in the active repository.
py: mmstat4 internally utilizes a virtual environment named mmstat4.xxxx, where xxxx varies depending on the repository. When ghinstall is invoked, it verifies the existence of the virtual environment mmstat4.xxxx. If it does not exist, the environment is created, and init_py.R is opened if present in the active repository.
NULL if type is not found, otherwise type
# to delete the virtual environment use
# reticulate::virtualenv_remove('mmstat4')
if (interactive()) ghinstall()
Both functions return unique (short) names for accessing each file in the repository according to a regular expression. For details about regular expressions, see base::regex.
ghlist(pattern = ".", ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE, full.names = FALSE)
ghgrep(pattern = ".", ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE, full.names = FALSE)
pattern |
character string containing a regular expression
(or character string for |
ignore.case |
if |
perl |
logical. Should Perl-compatible regexps be used? |
fixed |
logical. If |
useBytes |
logical. If |
full.names |
logical: should full names be returned instead of short names (default: FALSE) |
character vector of short names
if (interactive()) ghgrep()
An RStudio addin to open a file from the downloaded ZIP file.
ghopenAddin()
nothing
if (interactive()) ghopenAddin()
Returns a path for files based on ghdecompose.
ghpath(df, from = c("outpath", "inpath", "minpath", "filename"))
df |
data frame: returned from |
from |
character: either |
a character vector with file paths
ghget("dummy")
pdf <- ghdecompose(ghlist(full.names=TRUE))
ghpath(pdf)
ghpath(pdf, 'o') # equals the input to ghdecompose
ghpath(pdf, 'i')
ghpath(pdf, 'm')
ghpath(pdf, 'f')
Queries the unique (short) names for each file in the repository. Several query methods are available, see Details.
ghquery( query, n = 6, full.names = FALSE, method = c("fpdist", "overlap", "tfidf"), costs = NULL, counts = FALSE, useBytes = FALSE )
query |
character: query string |
n |
integer: maximal number of matches to return |
full.names |
logical: should full names be used instead of short names (default: FALSE) |
method |
character: method to be used (default: "fpdist") |
costs |
a numeric vector or list with names partially matching
‘insertions’, ‘deletions’ and ‘substitutions’ giving
the respective costs for computing the Levenshtein distance, or
|
counts |
a logical indicating whether to optionally return the
transformation counts (numbers of insertions, deletions and
substitutions) as the |
useBytes |
a logical. If |
The following query methods are available:
fpdist: uses a partial backward matching distance based on utils::adist()
overlap: uses the overlap distance for query and file names
character vector of short names fitting best to the query
if (interactive()) ghquery("bank")
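Distance-based querying can be sketched with utils::adist() and partial matching, which scores the best-matching substring of each candidate. The file names and the helper closest below are hypothetical; the package's "fpdist" method is described only as being based on adist() and may rank candidates differently.

```r
# Hypothetical file names and helper; mmstat4's own "fpdist" method may differ
files <- c("data/bank2.sav", "data/boston.csv", "univariate/example_ecdf.R")

closest <- function(query, candidates, n = 2) {
  # partial = TRUE scores the best-matching substring of each candidate
  d <- drop(utils::adist(query, candidates, partial = TRUE, ignore.case = TRUE))
  candidates[order(d)][seq_len(min(n, length(candidates)))]
}

best <- closest("bank", files)
print(best)
```

The first returned name is the candidate that contains the query as a substring, since its partial distance is zero.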
If key is NULL, then it returns the known repositories and where they are stored.
If key is not NULL, then possible addresses for a repository are returned.
ghrepos(key = NULL)
key |
character: "name" of the repository to find (default: NULL) |
a data frame with the data about the repositories
ghrepos()
ghzip creates a ZIP file (if dest has the extension zip) or copies the files to the destination directory.
If dest is NULL, a temporary directory will be used.
Please note that an already existing ZIP file is not deleted and an already existing target directory is not cleaned beforehand.
ghzip(files, dest = NULL)
files |
|
dest |
character: ZIP file name or destination directory (default: NULL) |
the name of the destination directory or the ZIP file
if (interactive()) {
  zipfile <- tempfile(fileext='.zip')
  files <- list.files(system.file(package="mmstat4"), recursive=TRUE)
  ghzip(files, zipfile)
}
Checks if a Shiny app runs locally or on a server.
isLocal()
logical
isLocal()
Returns a list with normalized paths.
normpathes(x)
x |
file paths |
A list of the same length as x, the i-th element of which contains the vector of splits of x[i].
normpathes("CRAN/../mmstat4/python/./ghdist.R")
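One way to picture path normalization is to split a path into its parts and resolve "." and ".." without touching the file system. The helper norm_parts below is a hypothetical sketch; the actual output of normpathes may be organized differently.

```r
# Hypothetical helper, not mmstat4's implementation: split a path into parts
# and resolve "." and ".." without touching the file system
norm_parts <- function(path) {
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]
  out <- character(0)
  for (p in parts) {
    if (p == "" || p == ".") next            # skip empty and current-dir parts
    if (p == ".." && length(out) > 0 && out[length(out)] != "..") {
      out <- out[-length(out)]               # step back one directory
    } else {
      out <- c(out, p)
    }
  }
  out
}

res <- lapply("CRAN/../mmstat4/python/./ghdist.R", norm_parts)
print(res[[1]])
```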
note internally stores a colored message, while display utilizes base::cat() to present the stored messages and resets the internal message stack.
note(msg, col = crayon::green)
display()
msg |
character: message |
col |
function: a color function (default: crayon::green) |
note returns invisibly the number of notes
notetest <- function(msg) {
  on.exit({ display() })
  note(msg)
  # do some complex computation
  x <- 1+1
}
notetest("Hello world!")
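The store-then-flush behavior of note and display can be sketched with a private environment acting as a message stack. The names .notes, my_note and my_display are hypothetical, and coloring is left out so that the sketch needs only base R.

```r
# Hypothetical re-implementation with a private environment as message stack
.notes <- new.env()
.notes$msgs <- character(0)

my_note <- function(msg) {
  .notes$msgs <- c(.notes$msgs, msg)   # push the message
  invisible(length(.notes$msgs))       # number of stored notes
}

my_display <- function() {
  if (length(.notes$msgs) > 0) cat(.notes$msgs, sep = "\n")
  .notes$msgs <- character(0)          # reset the stack
  invisible(NULL)
}

n1 <- my_note("first")
n2 <- my_note("second")
my_display()
n3 <- my_note("third")  # the stack was reset, so the count starts again
```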
The function attempts to open a file either in RStudio or in a text editor, depending on the environment.
If the session is interactive, it tries to open the file in RStudio using rstudioapi::navigateToFile().
If RStudio is not available or the attempt fails, it opens the file in a text editor using utils::edit().
If the session is not interactive, it simply returns the contents of the file.
openFile(file, ...)
file |
character: name of the file |
... |
further parameters given to |
invisibly the result from try(rstudioapi::navigateToFile(file)) or try(utils::edit(file)).
openFile(system.file("rstudio", "addins.dcf", package = "mmstat4"))
library and require calls in R and import calls from Python
pkglist counts the number of library/require/import calls for R and Python commands within the files.
It checks the availability of a package/module via utils::available.packages() (for R) and via PyPI (for Python).
If code=TRUE is set, it returns R/Python code for installing packages/modules.
Otherwise, a table with the number of library or import calls is returned.
pkglist(files, code = TRUE, repos = getOption("repos"))
Rlibs(files, code = TRUE, repos = getOption("repos"))
modlist(files, code = TRUE, repos = getOption("repos"))
files |
character: file name(s) |
code |
logical: should names be returned or code for init scripts? (default: TRUE) |
repos |
character: the base URL(s) of the repositories to use (default: getOption("repos")) |
a table of how frequently the packages are called, or R code to install them
if (interactive()) {
  # "\\.(R|py)$" matches files ending in .R or .py
  files <- list.files(pattern="\\.(R|py)$", full.names=TRUE, recursive=TRUE)
  pkglist(files)
}
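Counting library and require calls can be approximated with a regular expression over the source lines. The helper count_libs below is a hypothetical sketch, not the package's parser; it only sees the first call on each line and does not handle Python import statements.

```r
# Hypothetical helper, not mmstat4's parser: count library()/require() calls
count_libs <- function(lines) {
  # group 1: function name, group 2: package name (optionally quoted)
  pat <- "(library|require)\\([[:space:]]*[\"']?([A-Za-z][A-Za-z0-9.]*)"
  m <- regmatches(lines, regexec(pat, lines))
  pkgs <- vapply(Filter(function(z) length(z) == 3, m),
                 function(z) z[3], character(1))
  table(pkgs)
}

code <- c("library(ggplot2)", "require('MASS')", "x <- 1", "library(\"ggplot2\")")
tab <- count_libs(code)
print(tab)
```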
Checks if a package is available.
pkgMissing(package)
package |
character: string naming the package/name space to load. |
a logical value
pkgMissing("tools")
pkgMissing("A3")
Name of the currently used virtual environment.
py_env()
the name of the virtual Python environment currently used by mmstat4
py_env()
Converts x to an integer.
If the conversion fails or the integer is outside min and max, then NA_integer_ is returned.
toInt(x, min = -Inf, max = +Inf)
x |
input object |
min |
numeric: minimal value |
max |
numeric: maximal value |
a single integer value
toInt(3.0)
toInt("3.0")
toInt("test")
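The conversion rule, return an integer or NA_integer_ when the conversion fails or the value is out of range, can be sketched in base R. The helper to_int is hypothetical; in particular, returning NA for non-integral input such as 3.5 is an assumption of this sketch and not a documented property of toInt.

```r
# Hypothetical helper, not mmstat4's code: convert to a single integer or NA
to_int <- function(x, min = -Inf, max = Inf) {
  num <- suppressWarnings(as.numeric(x))[1]   # failed conversion gives NA
  # assumption of this sketch: non-integral values count as failures
  if (is.na(num) || num != round(num)) return(NA_integer_)
  if (num < min || num > max) return(NA_integer_)
  as.integer(num)
}

to_int(3.0)          # 3L
to_int("3.0")        # 3L
to_int("test")       # NA
to_int(7, max = 5)   # NA, outside the allowed range
```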
Converts x to a numeric.
If the conversion fails or the value is outside min and max, then NA is returned.
toNum(x, min = -Inf, max = +Inf)
x |
input object |
min |
numeric: minimal value |
max |
numeric: maximal value |
a single numeric value
toNum(3.0)
toNum("3.0")
toNum("test")
Verifies whether a provided url is downloadable, without detecting redirections in the URL.
urlExists(url)
url |
a vector of text URLs |
TRUE if URL exists otherwise FALSE
if (interactive()) {
  urlExists("https://hu-berlin.de/sk")
  urlExists("https://huglawurza.de")
}