R Cheatsheet



This is a cheat sheet for the R programming language that provides a basic overview, key concepts, and commonly useful features to make your data analysis and programming tasks easier.

Table of Contents

  1. Overview of Basic Structure
  2. Comments
  3. Data Types
  4. Variables
  5. Arithmetic and Logical Operators
  6. Special Values
  7. Type Conversion in R
  8. Getting Help with R
  9. Vectors
  10. Lists
  11. R Matrices
  12. Arrays
  13. Factors
  14. Data Frames
  15. Tibbles
  16. Dates and Times
  17. Subsetting Data
  18. Sorting and Ordering
  19. Data Transformation
  20. Data Summarisation
  21. Data Reshaping
  22. Joins
  23. Base R Graphics
  24. ggplot2 Basics
  25. Interactive Plots
  26. Descriptive Statistics
  27. Probability Distributions
  28. Hypothesis Testing
  29. Linear Regression
  30. ANOVA Test in R Programming
  31. Correlation and Covariance
  32. Time Series Analysis
  33. Control Structures
  34. Functions
  35. Apply Family
  36. Error Handling
  37. Reading Data
  38. Writing Data

1. Get Started with R

This is a basic R program that illustrates how to display the output.

"Tutorialspoint!" 
print("Hello World") 
5
10 + 10 

2. Comments

Comments explain what the code does and are ignored when the program runs. A comment starts with #.

# This is a comment

Note: R has no separate multi-line comment syntax. For a comment that spans several lines, start each line with #.

3. Data Types

A data type determines what kind of value an object holds. To check the type of a value, use the class() function. The code block below shows the basic data types −

# numeric data type
num <- 76.5
class(num) 

# integer data type
num <- 5000L # L used for integer
class(num) 

# complex data type
num <- 56i + 32 # i used for complex
class(num) 

# character/string data type
num <- "Welcome to Tutorialspoint"
class(num) 

# logical/boolean data type
num <- TRUE
class(num) 

4. Variables

Variables store data values. To assign a value to a variable, use the <- operator.

emp <- "Soham"
id <- 41625
print(emp)
print(id)

# Assign the same value to multiple variables
var1 <- var2 <- var3 <- 99.5

# Display the values
var1
var2
var3

5. Arithmetic and Logical Operators

In R, these operators play an important role in data analysis, statistical modelling, and decision-making.

Arithmetic Operators

Below is the list of arithmetic operators that are used to perform mathematical operations.

Operator  Description                  Example
+         Addition                     5 + 3 = 8
-         Subtraction                  7 - 3 = 4
*         Multiplication               5 * 5 = 25
/         Division                     8 / 2 = 4
^         Exponentiation (power)       2 ^ 4 = 16
%%        Modulus (remainder)          10 %% 3 = 1
%/%       Integer division (quotient)  10 %/% 3 = 3

The following example illustrates each arithmetic operator.

x <- 10
y <- 3

# Addition
sum <- x + y      
print(sum)         # 13

# Subtraction
difference <- x - y  
print(difference)  # 7

# Multiplication
product <- x * y   
print(product)     # 30

# Division
quotient <- x / y  
print(quotient)    # 3.333333 (approx)

# Modulus
modulus <- x %% y  
print(modulus)     # 1

# Exponential
power <- x^2      
print(power)       # 100

# Integer Division
qi <- x %/% y
print(qi)          # 3

Logical Operators

Logical operators return either TRUE or FALSE as a result.

Operator  Description                        Example
&         Logical AND (element-wise)         TRUE & FALSE = FALSE
&&        Logical AND (single values only)   TRUE && FALSE = FALSE
|         Logical OR (element-wise)          TRUE | FALSE = TRUE
!         Logical NOT                        !TRUE = FALSE
==        Equal to                           8 == 8 = TRUE
!=        Not equal to                       5 != 3 = TRUE

a <- TRUE
b <- FALSE
c <- 5
d <- 5

# Logical AND
result_and <- a & b   
print(result_and)

# Logical OR
result_or <- a | b    
print(result_or)

# Logical NOT
result_not <- !a      
print(result_not)

# Equality
result_eq <- c == d  
print(result_eq)

# Inequality
result_neq <- c != d  
print(result_neq)

6. Special Values

Special values represent particular states of data, such as a missing or undefined result. They are not data types or ordinary characters. The table below lists the special values.

Special Value  Description                    Example
NA             Missing value                  NA
NaN            Not a number                   0/0
Inf or -Inf    Positive or negative infinity  1/0, -1/0
NULL           Absence of a value             NULL
TRUE or FALSE  Logical true or false          TRUE or FALSE
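
As a brief sketch of how these values can be detected, base R provides is.na(), is.nan(), is.infinite(), and is.null(). The vector v below is a made-up example, not part of the table above.

```r
# Hypothetical vector containing several special values
v <- c(1, NA, 0/0, 1/0)

is.na(v)        # TRUE for both NA and NaN
is.nan(v)       # TRUE only for NaN
is.infinite(v)  # TRUE for Inf and -Inf
is.null(NULL)   # TRUE

# Many functions return NA when the input has missing values;
# na.rm = TRUE drops NA and NaN before computing
mean(c(1, NA, 3), na.rm = TRUE)   # 2
```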

7. Type Conversion in R

Type conversion is the process of converting one type of data into another. For example, a number can be converted to a string −

x <- 123  
# numeric to string
x_str <- as.character(x) 
print(x_str)

Note: In R, use the type conversion functions such as as.numeric(), as.integer(), as.character(), as.logical(), as.factor(), as.Date(), as.matrix(), and as.data.frame().
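
A few more conversions, as a minimal sketch with made-up input values. Note that a conversion that cannot succeed yields NA together with a warning rather than an error.

```r
# String to number
as.numeric("3.14")      # 3.14

# Number to integer (the fractional part is dropped)
as.integer(7.9)         # 7

# Number to logical (0 is FALSE, any other number is TRUE)
as.logical(0)           # FALSE

# A string that is not a number becomes NA, with a warning
bad <- suppressWarnings(as.numeric("abc"))
is.na(bad)              # TRUE
```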

8. Getting Help with R

The help() function or the ? operator displays the documentation of a function, which is useful when a user wants to know what the function does and which arguments it takes.

# to get help on the mean() function
help(mean)   
?mean        

9. Vectors

A vector is an ordered collection of elements of the same type. To combine elements into a vector, use the c() function and separate the elements with commas.

# vector of strings
emp <- c("Raj", "Suresh", "Bhaskar")
print(emp)

# vector of numbers
num <- c(11, 34, 23, 34)
print(num)
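
The following sketch shows common vector operations. The vector scores and its values are assumptions chosen for illustration.

```r
scores <- c(10, 20, 30, 40)   # hypothetical example values

scores[2]          # second element (indexing starts at 1): 20
scores[-1]         # everything except the first element
scores * 2         # arithmetic is applied element by element
length(scores)     # number of elements: 4
seq(1, 9, by = 2)  # 1 3 5 7 9

# Mixing types coerces every element to a common type
mixed <- c(1, "a", TRUE)
class(mixed)       # "character"
```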

10. Lists

In R, a list can hold values of different types within a single object. It is created with the list() function.

# List with values of different types
list_data <- list("book", 15, 45.6, 56i+32)

# Print the list
print(list_data)
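
List elements can also be named and accessed individually. A minimal sketch follows; the list emp and its fields are hypothetical.

```r
# Hypothetical employee record stored as a named list
emp <- list(name = "Raj", age = 30, skills = c("R", "SQL"))

emp$name           # access by name: "Raj"
emp[["age"]]       # double brackets return the element itself: 30
emp["age"]         # single brackets return a list of length 1
emp$skills[2]      # "SQL"

# Add a new element
emp$city <- "Pune"
names(emp)         # "name" "age" "skills" "city"
```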

11. R Matrices

A matrix is a two-dimensional data set with rows and columns. To set the number of rows and columns, use the parameters nrow and ncol in matrix(). Below is the representation −

data_matrix <- matrix(c(11, 12, 13, 14, 15, 16), nrow = 2, ncol = 3)
print(data_matrix)
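
A short sketch of common matrix operations, assuming a small example matrix m that is filled row by row with byrow = TRUE.

```r
# Fill a 2 x 3 matrix row by row (the default is column by column)
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

dim(m)       # 2 3
m[1, 2]      # element in row 1, column 2: 2
m[, 3]       # third column: 3 6
t(m)         # transpose, a 3 x 2 matrix
m %*% t(m)   # matrix multiplication, a 2 x 2 matrix
```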

12. Arrays

An array is a multi-dimensional data structure that stores elements of the same data type. It is created with the array() function.

# create a 2d array
arr_two_d <- array(1:6, dim = c(2, 3))  # 2 rows, 3 columns
print(arr_two_d)

# create a 3d array
# Create a 2*3*2 array (2 rows, 3 columns, 2 layers)
arr_three_d <- array(1:12, dim = c(2, 3, 2))
print(arr_three_d)

Note: Arrays can be inspected and processed with functions such as dim(), length(), apply(), and sum().

13. Factors

In R, a factor is a special data type that is used to categorize data. For example, if we have a list of colors like "Red", "Green", and "Yellow", we can convert them into a factor to manage these categories during data analysis.

# Create a vector
gen <- c("female", "male", "male", "female")
print(gen)

# Convert the vector gen into a factor 
gender <- factor(gen)
print(gender)  # Levels: female male
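
As a sketch of how a factor can be inspected, the following uses levels(), table(), and an ordered factor. The vector size and its level order are assumptions made for illustration.

```r
gen <- c("female", "male", "male", "female")
gender <- factor(gen)

levels(gender)      # "female" "male"
table(gender)       # counts per category
as.integer(gender)  # underlying integer codes: 1 2 2 1

# An ordered factor with a custom level order (made-up sizes)
size <- factor(c("low", "high", "medium"),
               levels = c("low", "medium", "high"),
               ordered = TRUE)
size[1] < size[2]   # TRUE, because low < high
```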

14. Data Frames

A data frame stores data in a table format, where each column can hold a different data type.

Data_Frame <- data.frame(
  Emp_name = c("Tapas", "Aman", "Ravi"),
  Id = c(1421, 1546, 1127),
  State = c("Jharkhand", "UP", "Bihar")
)

# Display the data frame
Data_Frame
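
A data frame can be inspected and extended as sketched below. The Salary column and its values are made up for this example.

```r
df <- data.frame(
  Emp_name = c("Tapas", "Aman", "Ravi"),
  Id = c(1421, 1546, 1127)
)

str(df)          # compact overview of columns and types
nrow(df)         # 3
ncol(df)         # 2
df$Emp_name      # one column as a vector
df[2, ]          # second row

# Add a column (the salary values are made up)
df$Salary <- c(50000, 62000, 58000)
names(df)        # "Emp_name" "Id" "Salary"
```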

15. Tibbles

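A tibble is a modern variant of the data frame, provided by the tibble package. It is stricter than a data frame (for example, it never changes the types of its inputs or the names of its columns), which makes working with data safer.

library(tibble)
tibble(col1 = val1, col2 = val2, ...)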

16. Dates and Times

To work with dates and times in R, use functions such as Sys.time(), strftime(), and as.POSIXlt(). Below is the description −

  • Sys.time(): Returns the current date and time.
  • strftime(): Formats a date-time object as a string.
  • as.POSIXlt(): Converts a string into a date-time object.

# Current time
current_time <- Sys.time()
print(current_time)
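
A minimal sketch of converting, formatting, and subtracting dates with base R. The dates used here are arbitrary examples.

```r
# Convert a string to a Date (the date itself is an arbitrary example)
d <- as.Date("2024-03-15")

format(d, "%d/%m/%Y")    # "15/03/2024"
d + 30                   # 30 days later: "2024-04-14"

# Difference between two dates in days
as.numeric(as.Date("2024-12-31") - d)   # 291

# Convert a string to a date-time and extract a component
dt <- as.POSIXlt("2024-03-15 14:30:00", tz = "UTC")
dt$hour                  # 14
```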

17. Subsetting Data

R has strong indexing features that allow users to easily access and work with elements in objects like vectors, matrices, and data frames. Selecting part of an object in this way is called subsetting.

# Create vector
x <- 1:15
# Print vector
cat("Original vector: ", x, "\n")
# Subsetting vector
cat("First 6 values of vector: ", x[1:6], "\n")
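
Beyond position ranges, subsetting also works with logical conditions and on data frames. The data frame df below is a made-up example.

```r
x <- 1:15

x[x > 10]            # logical condition: 11 12 13 14 15
x[c(1, 3, 5)]        # specific positions: 1 3 5
x[-(1:10)]           # drop the first ten values

# Data frame subsetting (made-up data)
df <- data.frame(name = c("A", "B", "C"), score = c(70, 85, 90))
df[df$score > 80, ]             # rows that satisfy a condition
df[df$score > 80, "name"]       # one column of those rows: "B" "C"
subset(df, score > 80, select = name)
```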

18. Sorting and Ordering

R provides two functions to sort data in ascending or descending order: sort() returns the sorted values, while order() returns the positions that would sort the data.

order(x, decreasing = FALSE, na.last = TRUE, method = c("auto", "shell", "radix"))
Or,
sort(x, decreasing = FALSE, na.last = TRUE)
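
The difference between the two functions can be sketched as follows. The vector x and the data frame df are made-up examples.

```r
x <- c(30, 10, NA, 20)   # hypothetical example values

sort(x)                      # 10 20 30 (NA is removed by default)
sort(x, decreasing = TRUE)   # 30 20 10
sort(x, na.last = TRUE)      # 10 20 30 NA
order(x)                     # 2 4 1 3

# order() is typically used to sort a data frame by a column
df <- data.frame(name = c("A", "B", "C"), score = c(85, 70, 90))
df[order(df$score, decreasing = TRUE), ]
```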

19. Data Transformation

Data transformation changes the structure or the values of data to make it better suited for analysis.

# Given data
m <- "123.45"
n <- "TRUE"

# Convert to numeric and logical
m_numeric <- as.numeric(m)
n_logical <- as.logical(n)

print(m_numeric)  # 123.45
print(n_logical)  # TRUE

20. Data Summarisation

Data summarisation produces a summary of an object such as a data frame. For numeric data, summary() shows statistics like the minimum, quartiles, median, mean, and maximum.

summary(object)
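
As a small sketch, summary() can be called on a vector or on a data frame. The numeric vector is made up; mtcars is a data set that ships with R.

```r
# Summary of a numeric vector (made-up values)
s <- summary(c(2, 4, 6, 8, 100))
print(s)   # Min, 1st Qu., Median, Mean, 3rd Qu., Max

# Summary of selected columns of the built-in mtcars data set
summary(mtcars[, c("mpg", "wt")])
```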

21. Data Reshaping

The reshape() function in R can be used to reshape data between long and wide formats. Data reshaping can also be performed using functions such as rbind() and cbind(), or melt() and dcast() from the reshape2 package.
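
A minimal base R sketch of reshaping from long to wide format, followed by cbind() and rbind(). All data and column names here are assumptions made for illustration.

```r
# Made-up data in long format: one row per id and time point
long <- data.frame(
  id    = c(1, 1, 2, 2),
  time  = c(1, 2, 1, 2),
  score = c(10, 12, 20, 22)
)

# Long to wide: one row per id, one score column per time point
wide <- reshape(long, idvar = "id", timevar = "time", direction = "wide")
print(wide)   # columns: id, score.1, score.2

# cbind() adds columns, rbind() adds rows
a <- data.frame(x = 1:2)
b <- data.frame(y = c("p", "q"))
cbind(a, b)
rbind(a, data.frame(x = 3))
```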

22. Joins

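In R, we can perform joins to combine two datasets based on a common key or condition. SQL-style joins are available through merge() in base R and through the join functions of the dplyr package, such as inner_join() and left_join().

# syntax using dplyr; join_function stands for inner_join(), left_join(), right_join(), or full_join()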
library(dplyr)
result <- join_function(x = left_data, y = right_data, by = "key_column")
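
The same kinds of joins can be sketched in base R with merge(), which needs no extra package. The two tables and their values are made up for this example.

```r
# Made-up tables that share the key column "id"
employees <- data.frame(id = c(1, 2, 3), name = c("Tapas", "Aman", "Ravi"))
salaries  <- data.frame(id = c(2, 3, 4), salary = c(62000, 58000, 45000))

inner <- merge(employees, salaries, by = "id")                # ids 2 and 3
left  <- merge(employees, salaries, by = "id", all.x = TRUE)  # ids 1, 2, 3
full  <- merge(employees, salaries, by = "id", all = TRUE)    # ids 1 to 4

print(left)   # salary is NA for id 1, which has no match
```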

23. Base R Graphics

Base R graphics is a set of functions that creates static graphs. Some of the common plotting functions are plot(), hist(), boxplot(), barplot(), and pie(). The table below lists the plot types with their syntax and examples.

Plot Type   Function                Syntax                                                                           Example
Basic Plot  plot()                  plot(x, y, type = "p", main = "Title", xlab = "X-axis", ylab = "Y-axis")         plot(x, y, type = "p", main = "Scatter Plot")
Histogram   hist()                  hist(x, main = "Title", xlab = "X-axis", col = "blue", border = "black")         hist(x, main = "Histogram", col = "skyblue")
Boxplot     boxplot()               boxplot(x, main = "Title", ylab = "Y-axis", col = "lightgreen")                  boxplot(x, main = "Boxplot", col = "lightgreen")
Line Plot   plot() with type = "l"  plot(x, y, type = "l", main = "Title", xlab = "X-axis", ylab = "Y-axis")         plot(x, y, type = "l", main = "Sine Wave")
Bar Plot    barplot()               barplot(height, main = "Title", xlab = "X-axis", ylab = "Y-axis", col = "blue")  barplot(counts, main = "Barplot", col = "lightblue")
Pie Chart   pie()                   pie(x, main = "Title", col = rainbow(length(x)))                                 pie(sizes, labels = labels, main = "Pie Chart")

24. ggplot2 Basics

ggplot2 is a popular R package for producing data visualizations.

library(ggplot2)
ggplot(data, aes(x, y)) + geom_point()

25. Interactive Plots

Interactive plots in R allow users to zoom, pan, and interact with the elements of a visualization. To create them, install a library of your choice − plotly, shiny, ggiraph, highcharter, or leaflet.

# Example using the plotly library
library(plotly)

# Generate random data points for x and y axes
n <- 100                 # number of points
x <- rnorm(n)            
y <- rnorm(n)           

# Create the scatter plot
plot <- plot_ly(
  x = ~x,                
  y = ~y,                
  type = 'scatter',      
  mode = 'markers'       
)

# Display the plot
plot

26. Descriptive Statistics

Descriptive statistics is a branch of statistics that summarizes and describes the main features of a dataset.
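
The most common descriptive statistics are available as base R functions. The following is a minimal sketch on a made-up sample.

```r
values <- c(12, 15, 11, 19, 15, 22, 18)   # made-up sample

mean(values)       # arithmetic mean: 16
median(values)     # middle value: 15
var(values)        # sample variance
sd(values)         # sample standard deviation
range(values)      # minimum and maximum: 11 22
quantile(values)   # 0%, 25%, 50%, 75%, 100% quantiles
```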

27. Probability Distributions

The probability distribution shows the distribution of random variable values.
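
R names its distribution functions with a one-letter prefix: d for the density, p for the cumulative probability, q for the quantile, and r for random generation. A short sketch for the normal and binomial distributions, with arbitrary example arguments:

```r
dnorm(0)                 # density of the standard normal at 0: about 0.3989
pnorm(1.96)              # P(X <= 1.96): about 0.975
qnorm(0.975)             # quantile for probability 0.975: about 1.96

# Binomial: probability of exactly 3 successes in 10 trials with p = 0.5
dbinom(3, size = 10, prob = 0.5)   # 120 / 1024 = 0.1171875

# Random draws; set.seed() makes them reproducible
set.seed(42)
r <- rnorm(5, mean = 10, sd = 2)
length(r)                # 5
```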

28. Hypothesis Testing

Hypothesis testing is used by researchers to check whether sample data supports or contradicts a stated hypothesis.

# Given data
data <- c(23, 25, 28, 22, 30, 29, 32, 28)

# one-sample t-test
res <- t.test(data, mu = 25, alternative = "two.sided")
print(res)

29. Linear Regression

In R, linear regression is a statistical model that uses a straight line to represent the relationship between dependent and independent variables. There may be more than one independent variable.

# dataset_file is a placeholder for a data set with mpg and wt columns, such as the built-in mtcars
data(dataset_file)

# Fit a linear regression model to predict mpg based on weight
model <- lm(mpg ~ wt, data = dataset_file)

# Display the model summary
summary(model)
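
As a runnable sketch, the same model can be fitted on mtcars, a data set that ships with R and contains the columns mpg and wt. Using mtcars here is an assumption for illustration; the weight value passed to predict() is made up.

```r
# mtcars ships with R; mpg is miles per gallon, wt is the car's weight
model <- lm(mpg ~ wt, data = mtcars)

coef(model)   # intercept and slope; the slope is negative

# Predict mpg for a hypothetical car with wt = 3
predict(model, newdata = data.frame(wt = 3))

# Proportion of variance explained by the model
summary(model)$r.squared
```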

30. ANOVA Test in R Programming

ANOVA stands for analysis of variance. It tests whether the mean of a continuous variable differs across the groups of a categorical variable. The user can perform an ANOVA test using the aov() function.

aov_output <- aov(dependent_variable ~ independent_variable, data = dataset)
summary(aov_output)
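
A runnable sketch, assuming the built-in PlantGrowth data set (plant weight under a control and two treatments) as example data. TukeyHSD() is one possible follow-up test, shown here only as an illustration.

```r
# PlantGrowth ships with R: columns weight (numeric) and group (factor)
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)

# Extract the p-value of the group effect from the summary table
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]
p_value

# If the ANOVA is significant, a post-hoc test compares pairs of groups
TukeyHSD(fit)
```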

31. Correlation and Covariance in R

These are two statistical measures that can be used to describe the relationship between two variables. Covariance measures the directional relationship between two variables (whether they increase or decrease together), but its magnitude depends on the scale of the variables. Correlation, on the other hand, standardizes this relationship, providing a value between -1 and 1 that indicates both the strength and direction of the linear relationship.

Correlation

x <- c(1, 2, 3, 4, 5)
y <- c(5, 4, 3, 2, 1)

# Calculate the correlation
cor_result <- cor(x, y)
print(cor_result)

Covariance

x_ax <- c(11, 2, 31, 4, 51)
y_ax <- c(5, 41, 3, 21, 1)

# Calculate the covariance
cov_result <- cov(x_ax, y_ax)
print(cov_result)

32. Time Series Analysis in R

A time series is a collection of data points recorded at successive time intervals. In R, the ts() function creates a time series object.

objectName <- ts(data, start, end, frequency)
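
A minimal sketch with made-up monthly values. Here frequency = 12 marks the data as monthly and start = c(2023, 1) sets the first observation to January 2023; both are assumptions for this example.

```r
# Made-up monthly values for 2023
sales <- c(120, 135, 128, 150, 160, 155, 170, 165, 180, 175, 190, 200)
sales_ts <- ts(sales, start = c(2023, 1), frequency = 12)

print(sales_ts)      # printed as a year-by-month table
frequency(sales_ts)  # 12
start(sales_ts)      # 2023 1
end(sales_ts)        # 2023 12

# Extract a sub-period: April to June 2023
window(sales_ts, start = c(2023, 4), end = c(2023, 6))
```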

33. Control Structures

The following list shows the control structures in R −

  • if and else: Test a condition and act on it.
  • for: Executes a loop a fixed number of times.
  • while: Executes a loop as long as a condition is true.
  • repeat: Executes an infinite loop, which has to be stopped with break.
  • break: Stops the execution of a loop.
  • next: Skips an iteration of a loop.
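
The control structures listed above can be sketched as follows. All variable names and values are made up for illustration.

```r
x <- 7

# if and else
if (x %% 2 == 0) {
  print("even")
} else {
  print("odd")
}

# for with next: skip 3 and sum the remaining values of 1 to 5
total <- 0
for (i in 1:5) {
  if (i == 3) next
  total <- total + i
}
print(total)   # 12

# while: count down to zero
n <- 3
while (n > 0) {
  n <- n - 1
}

# repeat with break: stop once the counter reaches 4
count <- 0
repeat {
  count <- count + 1
  if (count >= 4) break
}
print(count)   # 4
```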

34. Functions

To create a function, use the keyword function −

my_function <- function() { 
  print("Welcome to Tutorialspoint!")
}
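
Functions usually take arguments and return a value. The sketch below uses two hypothetical helpers, add_tax() and safe_divide(), to show default arguments and return().

```r
# A function with two arguments; rate has a default value
add_tax <- function(amount, rate = 0.18) {
  amount * (1 + rate)   # the value of the last expression is returned
}

add_tax(100)              # uses the default rate: 118
add_tax(100, rate = 0.05) # 105

# return() exits the function early
safe_divide <- function(a, b) {
  if (b == 0) {
    return(NA)
  }
  a / b
}
safe_divide(10, 2)        # 5
safe_divide(10, 0)        # NA
```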

35. Apply Family

The R apply family is a set of functions that apply a function to the rows or columns of a matrix, or to the elements of a vector or list, without writing an explicit loop.

mat <- matrix(1:16, nrow = 4)
apply(mat, 1, sum)  # Sum of each row
apply(mat, 2, mean) # Mean of each column
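
Other members of the family work on vectors and lists. A brief sketch with made-up inputs:

```r
nums <- list(a = 1:3, b = 4:6)   # made-up input

lapply(nums, sum)    # returns a list: a = 6, b = 15
sapply(nums, sum)    # simplifies to a named vector: a = 6, b = 15
sapply(1:3, function(i) i^2)   # 1 4 9

# vapply() is like sapply() but checks the type of each result
vapply(nums, mean, numeric(1)) # a = 2, b = 5

# tapply() applies a function to groups
tapply(c(10, 20, 30, 40), c("x", "y", "x", "y"), sum)  # x = 40, y = 60

# mapply() iterates over several vectors in parallel
mapply(function(p, q) p + q, 1:3, 4:6)   # 5 7 9
```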

36. Error Handling

The user can use tryCatch() to handle errors in an R program. This function lets you specify custom behaviour for errors, warnings, and messages.

tryCatch(
   expr = {
     # Code evaluation
   },
   error = function(e) {
     # Handle the error
   },
   warning = function(w) {
     # Handle the warning
   },
   finally = {
     # always execute the code
   }
)
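
A concrete sketch of the template above. The helper parse_number() is hypothetical and only serves to trigger the error and warning handlers.

```r
# Hypothetical helper that converts text to a number and reports problems
parse_number <- function(txt) {
  tryCatch(
    expr = {
      if (!is.character(txt)) stop("input must be a string")
      as.numeric(txt)   # emits a warning when txt is not a number
    },
    error = function(e) {
      message("Error: ", conditionMessage(e))
      NA_real_
    },
    warning = function(w) {
      message("Warning: ", conditionMessage(w))
      NA_real_
    },
    finally = {
      message("Finished parsing")
    }
  )
}

parse_number("42")    # 42
parse_number("abc")   # NA, after the warning handler runs
parse_number(42)      # NA, after the error handler runs
```

The value returned by a handler becomes the value of tryCatch(), so callers can check the result with is.na() instead of stopping the program.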

37. Reading Data from files

Here are two methods for reading data from a text file.

read.delim(file, header = TRUE, sep = "\t", dec = ".")
Or,
library(readr)
data <- read_csv("file_path.csv")

38. Writing Data to files

In R, writing data to files means saving data objects to external files such as CSV, Excel, JSON, and text.

# csv
write.csv(data, "file_path.csv", row.names = FALSE)
# excel
library(writexl)
write_xlsx(data, "file_path.xlsx")
# json
library(jsonlite)
write(toJSON(data, pretty = TRUE), "file_path.json")
# text
write.table(data, "file_path.txt", sep = "\t", row.names = FALSE)
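
Reading and writing can be checked together with a small round trip in base R. The data frame is made up, and tempfile() is used so that no real file path is assumed.

```r
# Made-up data written to a temporary file
df <- data.frame(id = 1:3, name = c("Tapas", "Aman", "Ravi"))
path <- tempfile(fileext = ".csv")

write.csv(df, path, row.names = FALSE)
file.exists(path)          # TRUE

# Read the file back and compare it with the original
back <- read.csv(path)
print(back)
identical(names(back), names(df))   # TRUE

unlink(path)               # delete the temporary file
```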