
R Cheatsheet
This is a cheat sheet for the R programming language that provides a basic overview, key concepts, and commonly useful features to make your data analysis and programming tasks easier.
Table of Contents
- Overview of Basic Structure
- Comments
- Data Types
- Variables
- Arithmetic and Logical Operators
- Special Values
- Type Conversion in R
- Getting Help with R
- Vectors
- Lists
- R Matrices
- Arrays
- Factors
- Data Frames
- Tibbles
- Dates and Times
- Subsetting Data
- Sorting and Ordering
- Data Transformation
- Data Summarisation
- Data Reshaping
- Joins
- Base R Graphics
- ggplot2 Basics
- Interactive Plots
- Descriptive Statistics
- Probability Distributions
- Hypothesis Testing
- Linear Regression
- ANOVA Test in R Programming
- Correlation and Covariance
- Time Series Analysis
- Control Structures
- Functions
- Apply Family
- Error Handling
- Reading Data
- Writing Data
1. Get Started with R
This basic R program shows how to display output: a bare expression prints its value automatically, and print() does so explicitly.
    "Tutorialspoint!"
    print("Hello World")
    5
    10 + 10
2. Comments
Comments explain what lines of code do and are ignored by R. A comment starts with #.
    # This is a comment
Note: R has no multi-line comment syntax. For a comment that spans two or more lines, start each line with #.
3. Data Types
A data type defines what kind of value is stored. To check the type of a value, use the class() function. The basic types are shown in the code block below −
    # numeric data type
    num <- 76.5
    class(num)
    # integer data type
    num <- 5000L  # L used for integer
    class(num)
    # complex data type
    num <- 56i + 32  # i used for complex
    class(num)
    # character/string data type
    num <- "Welcome to Tutorialspoint"
    class(num)
    # logical/boolean data type
    num <- TRUE
    class(num)
4. Variables
Variables store data values. To assign a value to a variable, use the <- operator.
    emp <- "Soham"
    id <- 41625
    print(emp)
    print(id)
    # Assign the same value to multiple variables
    var1 <- var2 <- var3 <- 99.5
    # Display the values
    var1
    var2
    var3
5. Arithmetic and Logical Operators
In R, these operators play an important role in data analysis, statistical modelling, and decision-making.
Arithmetic Operators
Below is the list of arithmetic operators used to perform mathematical operations.
Operators | Description | Examples |
---|---|---|
+ | Addition | 5 + 3 = 8 |
- | Subtraction | 7 - 3 = 4 |
* | Multiplication | 5 * 5 = 25 |
/ | Division | 8 / 2 = 4 |
^ | Exponentiation (power) | 2 ^ 4 = 16 |
%% | Modulus (remainder) | 10 %% 3 = 1 |
%/% | Integer division (quotient) | 10 %/% 3 = 3 |
The example below illustrates each arithmetic operator.
    x <- 10
    y <- 3
    # Addition
    sum <- x + y
    print(sum)  # 13
    # Subtraction
    difference <- x - y
    print(difference)  # 7
    # Multiplication
    product <- x * y
    print(product)  # 30
    # Division
    quotient <- x / y
    print(quotient)  # 3.333333 (approx)
    # Modulus
    modulus <- x %% y
    print(modulus)  # 1
    # Exponentiation
    power <- x^2
    print(power)  # 100
    # Integer Division
    qi <- x %/% y
    print(qi)  # 3
Logical Operators
Logical operators combine or negate TRUE and FALSE values. The comparison operators == and != are listed with them because they also return TRUE or FALSE.
Operators | Description | Examples |
---|---|---|
& | Logical AND (element-wise) | TRUE & FALSE = FALSE |
&& | Logical AND (single values only, short-circuit) | TRUE && FALSE = FALSE |
\| | Logical OR (element-wise) | TRUE \| FALSE = TRUE |
\|\| | Logical OR (single values only, short-circuit) | TRUE \|\| FALSE = TRUE |
! | Logical NOT | !TRUE = FALSE |
== | Equal to | 8 == 8 = TRUE |
!= | Not Equal to | 5 != 3 = TRUE |
    a <- TRUE
    b <- FALSE
    c <- 5
    d <- 5
    # Logical AND
    result_and <- a & b
    print(result_and)  # FALSE
    # Logical OR
    result_or <- a | b
    print(result_or)  # TRUE
    # Logical NOT
    result_not <- !a
    print(result_not)  # FALSE
    # Equality
    result_eq <- c == d
    print(result_eq)  # TRUE
    # Inequality
    result_neq <- c != d
    print(result_neq)  # FALSE
6. Special Values
Special values represent particular states of data, such as missing or undefined results. They are neither data types nor ordinary characters. The table below lists the special values.
Special Values | Description | Examples |
---|---|---|
NA | Missing value | NA |
NaN | Not a number | 0/0 |
Inf or -Inf | Positive or negative infinity | 1/0 , -1/0 |
NULL | Absence of value | NULL |
TRUE or FALSE | Logical true or false | TRUE or FALSE |
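As a brief sketch, these values can be detected with the base R functions is.na(), is.nan(), is.infinite(), and is.null(). The variable names below are illustrative only.

```r
# Illustrative vector holding an ordinary number, NA, NaN, and Inf
vals <- c(1, NA, 0/0, 1/0)

na_flags  <- is.na(vals)        # TRUE for NA and also for NaN
nan_flags <- is.nan(vals)       # TRUE only for NaN
inf_flags <- is.infinite(vals)  # TRUE for Inf and -Inf
empty     <- is.null(NULL)      # TRUE, because NULL is the absence of a value

print(na_flags)
print(nan_flags)
print(inf_flags)
print(empty)
```

Note that is.na() also reports NaN as missing, so use is.nan() when the two cases need to be told apart.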
7. Type Conversion in R
Type conversion is the process of converting a value from one data type to another. For example, a number can be converted to a string −
    x <- 123
    # numeric to string
    x_str <- as.character(x)
    print(x_str)
Note: R provides type conversion functions such as as.numeric(), as.integer(), as.character(), as.logical(), as.factor(), as.Date(), as.matrix(), and as.data.frame().
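The following sketch shows a few of these conversion functions in use, including what happens when a value cannot be converted. The variable names are made up for illustration.

```r
# Character to integer
int_val <- as.integer("42")

# Numeric to integer drops the fractional part (it does not round)
trunc_val <- as.integer(3.9)

# Character to logical
flag <- as.logical("TRUE")

# A string that is not a number becomes NA (R also raises a warning,
# which is suppressed here to keep the output clean)
bad_val <- suppressWarnings(as.numeric("abc"))

print(int_val)    # 42
print(trunc_val)  # 3
print(flag)       # TRUE
print(bad_val)    # NA
```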
8. Getting Help with R
Use the help() function or the ? operator to open the documentation for a function whose name you already know. To search the documentation by keyword instead, use help.search() or the ?? operator.
    # to get help on the mean() function
    help(mean)
    ?mean
9. Vectors
A vector is an ordered collection of elements that all have the same type. To create a vector, use the c() function and separate the elements with commas.
    # vector of strings
    emp <- c("Raj", "Suresh", "Bhaskar")
    print(emp)
    # vector of numbers
    num <- c(11, 34, 23, 34)
    print(num)
10. Lists
In R, a list can hold values of different types in a single object. It is created with the list() function.
    # List with mixed types (character, numeric, complex)
    list_data <- list("book", 15, 45.6, 56i+32)
    # Print the list
    print(list_data)
11. R Matrices
A matrix is a two-dimensional data set with rows and columns in which all elements have the same type. To set the number of rows and columns, use the nrow and ncol parameters of matrix(). Below is the representation −
    # Six values fill a 2 x 3 matrix column by column
    data_matrix <- matrix(c(11, 12, 13, 14, 15, 16), nrow = 2, ncol = 3)
    print(data_matrix)
12. Arrays
An array is a multi-dimensional data structure that stores elements of the same data type. It is created with the array() function.
    # Create a 2D array
    arr_two_d <- array(1:6, dim = c(2, 3))  # 2 rows, 3 columns
    print(arr_two_d)
    # Create a 3D array
    # Create a 2*3*2 array (2 rows, 3 columns, 2 layers)
    arr_three_d <- array(1:12, dim = c(2, 3, 2))
    print(arr_three_d)
Note: Common functions for working with arrays include dim(), length(), array(), apply(), and sum().
13. Factors
In R, a factor is a special data type used to store categorical data. For example, a vector of colors such as "Red", "Green", and "Yellow" can be converted into a factor so that these categories can be managed during data analysis.
    # Create a vector
    gen <- c("female", "male", "male", "female")
    print(gen)
    # Convert the vector gen into a factor
    gender <- factor(gen)
    print(gender)
    # Levels: female male
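A short sketch of how a factor can be inspected once it exists, using the base R functions levels(), table(), and as.integer(). It rebuilds the same gender factor so that it runs on its own.

```r
# Rebuild the factor from the example above
gender <- factor(c("female", "male", "male", "female"))

# The distinct categories, in alphabetical order by default
lvls <- levels(gender)

# How many observations fall into each category
counts <- table(gender)

# The integer codes R stores internally (positions within the levels)
codes <- as.integer(gender)

print(lvls)    # "female" "male"
print(counts)
print(codes)   # 1 2 2 1
```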
14. Data Frames
Data frames store data in a table format, where each column can have a different type.
    Data_Frame <- data.frame(
      Emp_name = c("Tapas", "Aman", "Ravi"),
      Id = c(1421, 1546, 1127),
      State = c("Jharkhand", "UP", "Bihar")
    )
    # Display the data frame
    Data_Frame
15. Tibbles
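A tibble is a modern variant of the data frame, provided by the tibble package. It makes working with data safer because it behaves more predictably: for example, it never converts strings to factors and does not partially match column names.
    library(tibble)
    tibble(col1 = val1, col2 = val2, ...)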
16. Dates and Times
To work with dates and times in R, use functions such as Sys.time(), strftime(), and as.POSIXlt(). Below is the description −
- Sys.time(): Returns the current date and time.
- strftime(): Formats a date-time object as a string.
- as.POSIXlt(): Converts a string into a date-time object.
    # Current time
    current_time <- Sys.time()
    print(current_time)
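The sketch below uses fixed dates so that the results are reproducible. Note that the base R conversion function is named as.POSIXlt(); the dates and variable names are illustrative.

```r
# Parse a date from a string in the default YYYY-MM-DD format
d <- as.Date("2024-03-15")

# Format the date as day/month/year
formatted <- format(d, "%d/%m/%Y")

# Convert a string into a date-time object and read one of its fields
lt <- as.POSIXlt("2024-03-15 10:30:00", tz = "UTC")
hour_part <- lt$hour

# Subtracting two dates gives the number of days between them
days_between <- as.numeric(as.Date("2024-03-20") - d)

print(formatted)     # "15/03/2024"
print(hour_part)     # 10
print(days_between)  # 5
```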
17. Subsetting Data
R has strong indexing features that allow users to easily access and work with elements in objects like vectors, matrices, and data frames. Selecting part of an object in this way is called subsetting.
    # Create vector
    x <- 1:15
    # Print vector
    cat("Original vector: ", x, "\n")
    # Subsetting vector
    cat("First 6 values of vector: ", x[1:6], "\n")
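Beyond position ranges, subsetting also works with logical conditions, negative indices, and row and column positions. The following is a minimal sketch with made-up data.

```r
scores <- c(5, 12, 8, 20)

# Logical subsetting keeps the elements where the condition is TRUE
above_seven <- scores[scores > 7]

# A negative index drops the element at that position
without_first <- scores[-1]

# Matrices are indexed as [row, column]
m <- matrix(1:6, nrow = 2)
cell <- m[2, 3]

# Data frames: filter rows by a condition and pick a column by name
df <- data.frame(name = c("A", "B", "C"),
                 score = c(90, 65, 78),
                 stringsAsFactors = FALSE)
high_names <- df[df$score > 70, "name"]

print(above_seven)    # 12 8 20
print(without_first)  # 12 8 20
print(cell)           # 6
print(high_names)     # "A" "C"
```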
18. Sorting and Ordering
R can sort data in ascending or descending order using sort() or order(). sort() returns the sorted values, while order() returns the index positions that would put the data in sorted order.
    order(x, decreasing = FALSE, na.last = TRUE, method = c("auto", "shell", "radix"))
    Or,
    sort(x, decreasing = FALSE, na.last = TRUE)
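A small sketch of the difference between the two functions: sort() returns values, whereas the positions returned by order() can be used to rearrange a second, related vector. The data and names are illustrative.

```r
scores <- c(70, 95, 82)
students <- c("Asha", "Binod", "Chitra")

# sort() returns the values themselves
ascending  <- sort(scores)
descending <- sort(scores, decreasing = TRUE)

# order() returns positions, which can reorder a related vector
idx <- order(scores, decreasing = TRUE)
ranked_students <- students[idx]

print(ascending)        # 70 82 95
print(descending)       # 95 82 70
print(idx)              # 2 3 1
print(ranked_students)  # "Binod" "Chitra" "Asha"
```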
19. Data Transformation
Data transformation in R changes the structure or values of data to make it more suitable for analysis.
    # Given data
    m <- "123.45"
    n <- "TRUE"
    # Convert to numeric and logical
    m_numeric <- as.numeric(m)
    n_logical <- as.logical(n)
    print(m_numeric)  # 123.45
    print(n_logical)  # TRUE
20. Data Summarisation
Data summarisation produces a summary of an object such as a data frame. For numeric columns, summary() reports statistics such as the minimum, quartiles, median, mean, and maximum.
    summary(object)
21. Data Reshaping
The reshape() function in R can be used to reshape data between long and wide formats. Data reshaping can also be performed using functions such as rbind(), cbind(), and melt() and dcast() from the reshape2 package.
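As a minimal sketch of a wide-to-long conversion with base R's reshape(), assume a small table with one row per student and one column per test round. The data, the column names, and the new column names score and round are made up for illustration.

```r
# Wide format: one row per id, one column per round
wide <- data.frame(id = 1:2,
                   score_1 = c(10, 20),
                   score_2 = c(30, 40))

# Long format: one row per id and round
long <- reshape(wide,
                direction = "long",
                varying   = c("score_1", "score_2"),  # columns to stack
                v.names   = "score",                  # name of the stacked value column
                timevar   = "round",                  # column recording the source column
                times     = 1:2,
                idvar     = "id")

print(long)
```

The result has four rows (two ids times two rounds) and the columns id, round, and score.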
22. Joins
In R, joins combine two datasets based on a common key or condition, much like SQL joins. The most common tools are the merge() function in base R and the dplyr package.
    # syntax using dplyr
    library(dplyr)
    # join_function is a placeholder for inner_join(), left_join(), right_join(), or full_join()
    result <- join_function(x = left_data, y = right_data, by = "key_column")
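For a version that runs without extra packages, the sketch below uses base R's merge() to perform an inner join and a left join. The two data frames and their values are made up for illustration.

```r
employees <- data.frame(id = c(1, 2, 3),
                        name = c("Tapas", "Aman", "Ravi"))
salaries  <- data.frame(id = c(2, 3, 4),
                        salary = c(500, 650, 700))

# Inner join: keep only ids present in both tables
inner <- merge(employees, salaries, by = "id")

# Left join: keep every employee; missing salaries become NA
left <- merge(employees, salaries, by = "id", all.x = TRUE)

print(inner)
print(left)
```

Setting all.y = TRUE gives a right join, and all = TRUE gives a full join.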
23. Base R Graphics
Base R graphics is a set of built-in functions for creating static plots. Some of the common plotting functions are plot(), hist(), boxplot(), and pie(). The table below lists the plot types with their syntax and examples.
Plot Type | Function | Syntax | Examples |
---|---|---|---|
Basic Plot | plot() | plot(x, y, type = "p", main = "Title", xlab = "X-axis", ylab = "Y-axis") | plot(x, y, type = "p", main = "Scatter Plot") |
Histogram | hist() | hist(x, main = "Title", xlab = "X-axis", col = "blue", border = "black") | hist(x, main = "Histogram", col = "skyblue") |
Boxplot | boxplot() | boxplot(x, main = "Title", ylab = "Y-axis", col = "lightgreen") | boxplot(x, main = "Boxplot", col = "lightgreen") |
Line Plot | plot() with type = "l" | plot(x, y, type = "l", main = "Title", xlab = "X-axis", ylab = "Y-axis") | plot(x, y, type = "l", main = "Sine Wave") |
Bar Plot | barplot() | barplot(height, main = "Title", xlab = "X-axis", ylab = "Y-axis", col = "blue") | barplot(counts, main = "Barplot", col = "lightblue") |
Pie Chart | pie() | pie(x, main = "Title", col = rainbow(length(x))) | pie(sizes, labels = labels, main = "Pie Chart") |
24. ggplot2 Basics
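ggplot2 is a widely used R package for producing data visualizations. A plot is built by passing data and aesthetic mappings to ggplot() and then adding layers such as geom_point().
    library(ggplot2)
    ggplot(data, aes(x, y)) + geom_point()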
25. Interactive Plots
Interactive plots in R let users zoom, pan, and interact with the elements of a visualization. To create them, install a library of your choice − plotly, shiny, ggiraph, highcharter, or leaflet.
    # plotly library (example)
    library(plotly)
    # Generate random data points for the x and y axes
    n <- 100  # number of points (assumed value)
    x <- rnorm(n)
    y <- rnorm(n)
    # Create the scatter plot
    plot <- plot_ly(
      x = ~x,
      y = ~y,
      type = 'scatter',
      mode = 'markers'
    )
    # Display the plot
    plot
26. Descriptive Statistics
Descriptive statistics is a branch of statistics that summarizes and describes the main features of a dataset.
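A short sketch of the most common descriptive statistics in base R, computed on a small made-up sample.

```r
values <- c(12, 15, 11, 18, 14)

avg     <- mean(values)    # arithmetic mean
mid     <- median(values)  # middle value
spread  <- sd(values)      # sample standard deviation
varianc <- var(values)     # sample variance
limits  <- range(values)   # minimum and maximum

print(avg)      # 14
print(mid)      # 14
print(varianc)  # 7.5
print(spread)
print(limits)   # 11 18

# summary() reports several of these at once
print(summary(values))
```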
27. Probability Distributions
A probability distribution describes how likely each value of a random variable is.
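In base R, each distribution comes with four functions that share a prefix: d for the density, p for the cumulative probability, q for the quantile, and r for random draws. The sketch below uses the normal and binomial distributions; the seed value is arbitrary.

```r
# Normal distribution (mean 0, standard deviation 1 by default)
density_at_zero <- dnorm(0)       # height of the density curve at 0
prob_below      <- pnorm(1.96)    # P(X <= 1.96), about 0.975
cutoff          <- qnorm(0.975)   # value with 97.5% of the mass below it, about 1.96

# Random draws; set.seed() makes them reproducible
set.seed(42)
draws <- rnorm(5)

# Binomial distribution: probability of exactly 2 successes in 4 fair trials
two_heads <- dbinom(2, size = 4, prob = 0.5)

print(density_at_zero)
print(prob_below)
print(cutoff)
print(draws)
print(two_heads)  # 0.375
```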
28. Hypothesis Testing
Hypothesis testing is used by researchers to check whether sample data support or contradict a stated hypothesis.
    # Given data
    data <- c(23, 25, 28, 22, 30, 29, 32, 28)
    # One-sample t-test against a hypothesized mean of 25
    res <- t.test(data, mu = 25, alternative = "two.sided")
    print(res)
29. Linear Regression
In R, linear regression is a statistical model that uses a straight line to represent the relationship between a dependent variable and one or more independent variables.
    # Load the built-in mtcars dataset, which contains the mpg and wt columns
    data(mtcars)
    # Fit a linear regression model to predict mpg based on weight
    model <- lm(mpg ~ wt, data = mtcars)
    # Display the model summary
    summary(model)
30. ANOVA Test in R programming
ANOVA stands for analysis of variance. It tests whether the mean of a continuous variable differs across the groups of a categorical variable. An ANOVA test can be performed using the aov() function.
    aov_output <- aov(dependent_variable ~ independent_variable, data = dataset)
    summary(aov_output)
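As a runnable sketch of the same pattern, the example below uses PlantGrowth, a dataset that ships with R, in which weight is the continuous variable and group is a categorical variable with three levels. The choice of dataset is an assumption made for illustration.

```r
# Built-in dataset: 30 plants, 3 treatment groups
data(PlantGrowth)

# One-way ANOVA: does mean weight differ between groups?
fit <- aov(weight ~ group, data = PlantGrowth)

# The summary table lists degrees of freedom, sums of squares,
# the F statistic, and the p-value
anova_table <- summary(fit)
print(anova_table)
```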
31. Correlation and Covariance in R
These are two statistical measures that can be used to describe the relationship between two variables. Covariance measures the directional relationship between two variables (whether they increase or decrease together), but its magnitude depends on the scale of the variables. Correlation, on the other hand, standardizes this relationship, providing a value between -1 and 1 that indicates both the strength and direction of the linear relationship.
Correlation
    x <- c(1, 2, 3, 4, 5)
    y <- c(5, 4, 3, 2, 1)
    # Calculate the correlation
    cor_result <- cor(x, y)
    print(cor_result)
Covariance
    x_ax <- c(11, 2, 31, 4, 51)
    y_ax <- c(5, 41, 3, 21, 1)
    # Calculate the covariance
    cov_result <- cov(x_ax, y_ax)
    print(cov_result)
32. Time Series Analysis in R
A time series is a collection of data points recorded at successive time intervals. In R, the ts() function creates a time series object.
    objectName <- ts(data, start, end, frequency)
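A minimal sketch of ts() with monthly data. The sales figures and the starting year are made up; frequency = 12 tells R that there are twelve observations per year.

```r
# Twelve illustrative monthly values
monthly_sales <- c(120, 135, 150, 160, 155, 170,
                   180, 175, 165, 190, 200, 210)

# Monthly series starting in January 2023
sales_ts <- ts(monthly_sales, start = c(2023, 1), frequency = 12)
print(sales_ts)

# Inspect the time attributes
obs_per_year <- frequency(sales_ts)  # 12
last_period  <- end(sales_ts)        # 2023 12

# Extract a sub-period: April to June 2023
second_quarter <- window(sales_ts, start = c(2023, 4), end = c(2023, 6))
print(second_quarter)
```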
33. Control Structures
The following control structures are available in R −
- if and else: Test a condition and act on it.
- for: Runs a loop a fixed number of times.
- while: Runs a loop as long as a condition is true.
- repeat: Runs an infinite loop until break is called.
- break: Stops the execution of a loop.
- next: Skips one iteration of a loop.
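The control structures listed above can be sketched as follows. The variable names and numbers are illustrative only.

```r
# for with next and break: add up the odd numbers up to 7
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) next  # skip even numbers
  if (i > 7) break       # stop once i exceeds 7
  total <- total + i
}
print(total)  # 16

# if and else used as an expression
label <- if (total > 10) "large" else "small"
print(label)  # "large"

# while: runs as long as the condition is true
count <- 0
while (count < 3) {
  count <- count + 1
}
print(count)  # 3

# repeat: loops forever until break is reached
n <- 1
repeat {
  n <- n * 2
  if (n >= 20) break
}
print(n)  # 32
```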
34. Functions
To create a function, use the function keyword and assign the result to a name −
    my_function <- function() {
      print("Welcome to Tutorialspoint!")
    }
    # Call the function
    my_function()
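Functions usually take arguments and return a value. The sketch below defines a hypothetical area_rectangle() function with one required argument and one argument that has a default value.

```r
# height defaults to 1 when it is not supplied
area_rectangle <- function(width, height = 1) {
  # The value of the last expression is returned automatically
  width * height
}

full_area    <- area_rectangle(4, 5)            # both arguments by position
default_area <- area_rectangle(4)               # height falls back to 1
named_area   <- area_rectangle(height = 2, width = 3)  # arguments by name

print(full_area)     # 20
print(default_area)  # 4
print(named_area)    # 6
```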
35. Apply Family
The apply family is a set of functions that apply a function to the rows or columns of a matrix, or to the elements of a vector or list, without writing an explicit loop.
    mat <- matrix(1:16, nrow = 4)
    apply(mat, 1, sum)   # Sum of each row
    apply(mat, 2, mean)  # Mean of each column
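Other members of the family work on lists and grouped data. The following sketch shows lapply(), sapply(), and tapply() on made-up inputs.

```r
nums <- list(a = 1:3, b = 4:6)

# lapply() always returns a list
sums_list <- lapply(nums, sum)

# sapply() simplifies the result to a vector when it can
sums_vec <- sapply(nums, sum)

# tapply() applies a function to groups defined by a second vector
amounts <- c(10, 20, 30, 40)
groups  <- c("x", "y", "x", "y")
group_means <- tapply(amounts, groups, mean)

print(sums_list)
print(sums_vec)     # a = 6, b = 15
print(group_means)  # x = 20, y = 30
```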
36. Error Handling
Use tryCatch() to handle errors in an R program. It lets you define custom behaviour for errors, warnings, and messages.
    tryCatch(
      expr = {
        # Code to evaluate
      },
      error = function(e) {
        # Handle the error
      },
      warning = function(w) {
        # Handle the warning
      },
      finally = {
        # Code that always runs
      }
    )
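As a concrete sketch of this template, the hypothetical safe_log() function below returns NA instead of stopping when log() raises a warning or an error.

```r
# safe_log() is a made-up helper for illustration
safe_log <- function(x) {
  tryCatch(
    expr = {
      log(x)
    },
    warning = function(w) {
      # log() of a negative number warns that NaNs were produced
      NA_real_
    },
    error = function(e) {
      # log() of a non-numeric value raises an error
      NA_real_
    },
    finally = {
      # Runs whether or not a condition was raised
      message("safe_log() finished")
    }
  )
}

ok_result      <- safe_log(exp(1))  # 1
warning_result <- safe_log(-1)      # NA
error_result   <- safe_log("a")     # NA

print(ok_result)
print(warning_result)
print(error_result)
```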
37. Reading Data from Files
Below are two ways to read data from a text file.
    read.delim(file, header = TRUE, sep = "\t", dec = ".", ...)
    Or,
    library(readr)
    data <- read_csv("file_path.csv")
38. Writing Data to Files
In R, writing data to files means saving data objects to external files such as CSV, Excel, JSON, and text.
    # csv
    write.csv(data, "file_path.csv", row.names = FALSE)
    # excel
    library(writexl)
    write_xlsx(data, "file_path.xlsx")
    # json
    library(jsonlite)
    write(toJSON(data, pretty = TRUE), "file_path.json")
    # text
    write.table(data, "file_path.txt", sep = "\t", row.names = FALSE)
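To check that writing and reading agree, the sketch below saves a small data frame to a temporary CSV file and reads it back using only base R. The data frame is made up, and tempfile() is used so that no real file path is assumed.

```r
df <- data.frame(id = 1:3,
                 city = c("Ranchi", "Lucknow", "Patna"))

# Write to a temporary file, then read it back
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)
df_back <- read.csv(path)

# Remove the temporary file
unlink(path)

print(df_back)
```

Passing row.names = FALSE prevents R from writing the row numbers as an extra column, which would otherwise appear as an unwanted first column when the file is read back.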