? or help gives the documentation of a specific function.
?? or help.search searches the help system for a given keyword (or regex pattern).
Logic
xor indicates elementwise exclusive OR.
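A minimal sketch of xor on two logical vectors; the vectors a and b are made up for illustration.

```r
# Two illustrative logical vectors (hypothetical data).
a <- c(TRUE, TRUE, FALSE, FALSE)
b <- c(TRUE, FALSE, TRUE, FALSE)

# xor is TRUE where exactly one of the two elements is TRUE.
res <- xor(a, b)
print(res)  # FALSE TRUE TRUE FALSE
```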
File management
setwd & getwd: Set and get the working directory.
list.files & list.dirs: List all files or directories in the given path.
file.exists, file.copy, file.rename, file.remove: System-level file manipulation.
download.file: Download files from the Internet in an R session.
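The file functions above can be tried safely inside a temporary directory. This is only a sketch: the folder and file names are hypothetical, and download.file is left out because it needs a network connection.

```r
# Create a fresh scratch directory so nothing else on disk is touched.
work_dir <- tempfile("file_demo_")            # hypothetical folder name
dir.create(work_dir)

# Write a small file to work with.
src <- file.path(work_dir, "a.txt")
writeLines("hello", src)

existed <- file.exists(src)                               # TRUE
copied  <- file.copy(src, file.path(work_dir, "b.txt"))   # TRUE on success
renamed <- file.rename(file.path(work_dir, "b.txt"),
                       file.path(work_dir, "c.txt"))      # TRUE on success
files_now <- sort(list.files(work_dir))                   # "a.txt" "c.txt"

removed <- file.remove(src)                               # TRUE on success
files_after <- list.files(work_dir)                       # "c.txt"

# getwd() reports the current working directory without changing it.
current_dir <- getwd()
```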
Data cleaning and manipulation
anyNA, complete.cases, is.na, and na.omit are useful when finding or excluding NAs.
order sorts a data frame by the values in one or more of its columns. For example, airquality[order(airquality$Month),] and airquality[order(airquality$Day),] sort that data frame by Month and by Day respectively. order accepts multiple arguments; later ones break ties in earlier ones.
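As a rough illustration of the NA helpers and of order with more than one key, using the built-in airquality data set mentioned above (the variable names are only for this sketch):

```r
# airquality ships with R (datasets package) and contains missing values.
has_na <- anyNA(airquality)                      # TRUE

# complete.cases flags rows without any NA; na.omit drops the other rows.
n_complete <- sum(complete.cases(airquality))
cleaned <- na.omit(airquality)

# is.na works elementwise, so it can count NAs in a single column.
n_missing_ozone <- sum(is.na(airquality$Ozone))

# Sort by Month, then by Temp in decreasing order within each month.
sorted <- airquality[order(airquality$Month, -airquality$Temp), ]
head(sorted)
```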
Heat maps with ggplot2
A tutorial for creating heat maps in R, using both the base and ggplot2 systems.
R programming
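Arguments that come after an ellipsis (...) must be passed by their full names; they cannot be matched by position or by partial name. Giving them default values is common, but not required.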
Arguments can be passed by position or by name. A name can be written either bare or as a quoted string. For instance, mean(x = 1:3) is equivalent to mean("x" = 1:3).
return returns the result of an expression and exits the function immediately, so any lines after it in that function are not run.
Generating messages for function users:
message is used for generating a diagnostic message
warning and stop generate warnings and fatal errors respectively.
stopifnot: "If any of the expressions in ... are not all TRUE, stop is called, producing an error message indicating the first of the elements of ... which were not true."
missing can be used to test whether a value was specified as an argument to a function. For instance, test <- function(y = 1) {if (missing(y)) {print(y)}}.
on.exit records the expression given as its argument as needing to be executed when the current function exits (either naturally or as the result of an error).
exists tests whether the named object exists in the specified environment.
readline reads a line from the terminal (in interactive use).
:: calls a function (once) without loading the package. For example, reshape2::melt is equivalent to calling melt after library(reshape2) or require(reshape2).
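Several of the points above can be seen together in a short sketch. The functions join_all, describe, and safe_sqrt are hypothetical helpers written for this illustration, not part of any package.

```r
# An argument placed after ... has to be passed by its full name.
join_all <- function(..., sep = "-") {
  paste(..., sep = sep)
}
joined_default <- join_all("a", "b", "c")        # "a-b-c"
joined_plus    <- join_all("a", "b", sep = "+")  # "a+b"

# missing() reports whether the caller supplied the argument.
describe <- function(y = 1) {
  if (missing(y)) "y not supplied" else "y supplied"
}

# stopifnot, warning, message, on.exit, and return in one function.
safe_sqrt <- function(x) {
  on.exit(message("safe_sqrt finished"))   # runs on normal exit and on error
  stopifnot(is.numeric(x))                 # error if the input is not numeric
  if (any(x < 0)) {
    warning("negative values replaced by NA")
    x[x < 0] <- NA
  }
  return(sqrt(x))
  print("never reached")                   # skipped: return() already exited
}

roots <- suppressWarnings(safe_sqrt(c(4, -1)))   # 2 NA
bad   <- tryCatch(safe_sqrt("a"), error = function(e) "error caught")

# exists() takes the object name as a string; :: calls without attaching.
mean_found <- exists("mean")                     # TRUE
sd_value   <- stats::sd(c(1, 2, 3))              # 1
```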
grep, grepl, regexpr, gregexpr and regexec search for matches to argument pattern within each element of a character vector: they differ in the format of and amount of detail in the results.
sub and gsub perform replacement of the first and all matches respectively.
sprintf returns a character vector containing a formatted combination of text and variable values.
substr extracts or replaces substrings in a character vector.
strsplit splits the elements of a character vector x into substrings according to the matches to substring split within them.
tolower and toupper convert upper-case characters in a character vector to lower-case, or vice versa. Non-alphabetic characters are left unchanged.
nchar takes a character vector as an argument and returns a vector whose elements contain the sizes of the corresponding elements of x.
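The string functions above, applied to one small character vector. The vector x is made up for this sketch.

```r
x <- c("apple pie", "banana split", "cherry tart")   # illustrative data

idx      <- grep("an", x)       # 2: index of the element that matches
hits     <- grepl("an", x)      # FALSE TRUE FALSE
first_a  <- sub("a", "A", x)    # "Apple pie" "bAnana split" "cherry tArt"
all_a    <- gsub("a", "A", x)   # "Apple pie" "bAnAnA split" "cherry tArt"

# sprintf fills placeholders: %s for strings, %d for integers.
msg      <- sprintf("%s has %d characters", x[1], nchar(x[1]))

prefix   <- substr(x, 1, 3)     # "app" "ban" "che"
words    <- strsplit(x, " ")    # a list with one character vector per element
upper    <- toupper(x[1])       # "APPLE PIE"
lengths_ <- nchar(x)            # 9 12 11
```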
Functions for loops or parallel operations
split divides the data in the vector x into the groups defined by f.
apply, sapply, lapply, tapply, and mapply ("apply" family). See an example of mapply since it's more complicated.
by is an object-oriented wrapper for tapply applied to data frames.
Reduce uses a binary function to successively combine the elements of a given vector and a possibly given initial value.
do.call constructs and executes a function call from a name or a function and a list of arguments to be passed to it, while call only constructs the function call.
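A small sketch of the functions in this section, with made-up data (scores, group, and m are hypothetical). mapply is the multivariate one: it walks several vectors in parallel and passes one element of each to the function.

```r
scores <- c(90, 80, 70, 60)
group  <- c("a", "b", "a", "b")

by_group    <- split(scores, group)           # list(a = c(90, 70), b = c(80, 60))
group_means <- tapply(scores, group, mean)    # a = 80, b = 70

m        <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)                  # 9 12 (one value per row)

squares_vec  <- sapply(1:3, function(i) i^2)  # simplified to a vector: 1 4 9
squares_list <- lapply(1:3, function(i) i^2)  # always a list

# mapply pairs up elements: (1, 4), (2, 5), (3, 6).
pair_sums <- mapply(function(x, y) x + y, 1:3, 4:6)   # 5 7 9
repeated  <- mapply(rep, 1:3, 3:1)            # list: 1 1 1 / 2 2 / 3

# Reduce folds a binary function over a vector, optionally from an initial value.
running   <- Reduce(`+`, 1:4, accumulate = TRUE)      # 1 3 6 10
with_init <- Reduce(`+`, 1:4, 100)                    # 110

# do.call builds and runs the call; call only builds it until eval is used.
pasted   <- do.call(paste, list("a", "b", sep = "-")) # "a-b"
unevaled <- call("sum", 1, 2)
summed   <- eval(unevaled)                            # 3
```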
car
Short for "Companion to Applied Regression". Two of the useful functions are Anova and Manova, which can calculate type-II or type-III ANOVA and MANOVA respectively.
caret
Short for "Classification And REgression Training". A package that integrates multiple machine learning algorithm packages. In addition, it helps with data preprocessing and cross-validation, and its confusionMatrix function summarizes classification results.
cowplot
Merges multiple ggplots into one figure and labels each of them.
dendextend
Extended functions for built-in dendrograms in R.
dplyr
Some other ways to manipulate or cleanse data.
MCMCglmm
A package for fitting Bayesian mixed models in R. More introduction and tutorial here.
plotly
A powerful package for building interactive plots. Its plot_ly function creates various types of plots, and ggplotly turns most ggplot2 objects into interactive plots.
rattle
A wonderful GUI for machine learning analyses. The author emphasizes that it creates logs as users click through the GUI, and that these logs can be exported as a starting point for further argument tuning. Programming is still encouraged.
reshape2
Melts the data into long format or casts it into wide format. An example is provided here.
shiny
Builds interactive interfaces for presenting data to others, even if they don't know R. Its tutorial is well worth reading.