Intro to R, Using R and RStudio

Field Ecology

Connor Brown

Oct 6, 2025

What is R?

  • R is an open-source software environment for statistical computing and graphics, popular among ecologists and data scientists.

  • There’s almost nothing you can’t do with R and RStudio!

  • Since R does not have a graphical user interface (GUI) for point-and-click interactions like Excel or SPSS, most people use an Integrated Development Environment (IDE) called RStudio.

    • RStudio provides a consistent way to organize files, access information quickly, and display data.

R in your Browser - Posit Cloud

https://posit.cloud/

Create A New Project

Name Your Project

What Are We Looking At?

Console panel
  • The Console shows the output of your code, along with any conflicts and errors. You can also run individual lines of code in the Console to test variations or options before adding them to your script and running the entire chunk.
Environment panel
  • The Environment panel lists your current R objects, variable values, and custom functions. Previously executed commands are listed in the “History” tab.
Files, Plots, Help
  • The Files panel displays files, plots, the library of R packages that are installed, and R Help resources.

Open a File For Code

Before you start writing code, you need to open an R Script file. This is where you’ll be able to save code for later.

New File

Script panel
  • This is where you input and run your code. Your graphics display below the chunk of code inside the panel.

Functions and Packages

  • A function is a reusable piece of code that performs a task.

  • Functions often take inputs (called arguments or parameters) and return an output.

    • For example:

      sqrt(x = 16)
      [1] 4
    • Here, sqrt() is the function, 16 is the argument, and the output is 4.

  • You can think of a function like a math function:

    • In math: \(y = f(x)\)

    • In R: f(x)

      log(x = 100)
      [1] 4.60517
    • The function log() takes 100 as input and returns its natural logarithm (base e), which is why the result is about 4.6 rather than 2.
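As a brief sketch of how arguments work, the calls below show that an argument can be given by name or by position, and that log() accepts an optional base argument whose default is e. The celsius_to_fahrenheit() function is a hypothetical example, added here only to show what writing a function of your own might look like.

```r
# Arguments can be matched by name or by position
sqrt(x = 16)  # named argument: returns 4
sqrt(16)      # positional argument: also returns 4

# log() has a second argument, base, that defaults to e
log(100)             # natural logarithm, about 4.60517
log(100, base = 10)  # base-10 logarithm: returns 2

# A hypothetical user-defined function: one input, one output
celsius_to_fahrenheit <- function(temp_c) {
  temp_c * 9 / 5 + 32
}

celsius_to_fahrenheit(100)  # returns 212
```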

Functions and Packages

  • Packages are collections of functions, data, and documentation bundled together.

  • They extend R so you can do tasks like data wrangling, visualization, or machine learning.

  • R comes with a set of base packages, but thousands of additional packages are available through repositories like CRAN and GitHub.

  • Examples of popular packages:

Install Packages

To install a package, type the following commands in the Console at the bottom of RStudio and press Enter after each one:

Or navigate to Tools > Install Packages … and type the packages you want to install

This will download and install:

Shortcut

You can also install multiple packages at once using the function c():

install.packages(c("readr", "dplyr", "ggplot2"))

c() stands for combine or concatenate.
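A minimal sketch of c() in action: it combines individual values into one vector, which is what lets install.packages() receive several package names at once. The missing_packages() helper is hypothetical (it is not part of R or RStudio); it only reports which packages are not yet installed, so its result could be passed to install.packages().

```r
# c() combines individual values into a single vector
pkgs <- c("readr", "dplyr", "ggplot2")
length(pkgs)  # 3

# Hypothetical helper: return the names in `pkgs` that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install only what is missing (uncomment to run)
# install.packages(missing_packages(pkgs))
```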

Load Packages

After installation, load the packages into your R session by typing these lines of code into your R Script:

You can comment out anything you don’t want to run with #. This is useful for annotating code or leaving notes to yourself.

# Load Libraries
library(readr) # for reading files
library(dplyr) # for data manipulation
library(ggplot2) # for plotting

You cannot combine or concatenate packages in a single library() call:

library(c(readr, dplyr, ggplot2))
Error in library(c(readr, dplyr, ggplot2)): 'package' must be of length 1
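One way around this, sketched below, is to loop over a character vector of package names and call library() with character.only = TRUE, which tells library() to treat its argument as a string rather than a bare name. The example uses packages that ship with R (tools, stats, utils) so it runs anywhere; swapping in readr, dplyr, and ggplot2 assumes you have already installed them.

```r
# Package names stored as strings
pkgs <- c("tools", "stats", "utils")

# Load each package in turn
for (pkg in pkgs) {
  library(pkg, character.only = TRUE)
}

# Check that every package is now attached to the session
all(paste0("package:", pkgs) %in% search())
```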

Running your Script

  • You write your code in the script, but it won’t run automatically.

    • To run a single line, place your cursor on it and click the Run button; the line is sent to the console and run.

    • To run multiple lines, highlight them and click Run; the entire highlighted section is sent to the console.

    • You can also type code directly into the console, but that code and its output are not saved.

Loading Data

# Load data file into a new data frame
konzaData <- read_csv("https://connorb.github.io/data/konza_fqi.csv")
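read_csv() works the same way on a file stored on your computer or in your project. The sketch below writes a tiny made-up CSV to a temporary file and reads it back; the file contents and the name toyData are hypothetical and unrelated to the Konza dataset.

```r
library(readr)

# Write a small, made-up CSV file to a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Plot,N,FQI",
             "p1,10,20.5",
             "p2,12,18.0"),
           csv_path)

# Read the file into a data frame
# (show_col_types = FALSE hides the column-type message)
toyData <- read_csv(csv_path, show_col_types = FALSE)
toyData
```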

What is a Data Frame?

A data frame is one of the most common ways to store data in R. Think of it like a spreadsheet or a table:

  • Each row is an observation (like one student, one site, or one day).

  • Each column is a variable (like name, age, or temperature).

Almost every dataset you import into R (CSV, Excel, database) becomes a data frame.

Functions in R (and in packages like dplyr and ggplot2) are designed to work naturally with data frames.
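As a small illustration, the sketch below builds a data frame by hand. The sites and temperatures are made-up values, and the name toySites is hypothetical.

```r
# Each column is a variable; each row is one observation (one site)
toySites <- data.frame(
  site        = c("A", "B", "C"),
  temperature = c(21.5, 19.0, 23.2)
)

nrow(toySites)  # number of observations: 3
ncol(toySites)  # number of variables: 2

# Use $ to pull out one column as a vector
toySites$temperature
mean(toySites$temperature)
```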

Check Out the Data

Or in the console:

str(konzaData)
spc_tbl_ [80 × 7] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ Watershed: chr [1:80] "004b" "004b" "004b" "004b" ...
 $ Plot     : chr [1:80] "a1" "a2" "a3" "a4" ...
 $ N        : num [1:80] 27 24 23 24 32 26 27 33 26 28 ...
 $ FQI      : num [1:80] 22.9 21.3 20.7 21.9 23.5 ...
 $ D        : num [1:80] 7.98 6.82 5.58 7 8.6 ...
 $ Bison    : chr [1:80] "No Graze" "No Graze" "No Graze" "No Graze" ...
 $ BurnCycle: num [1:80] 4 4 4 4 4 4 4 4 4 4 ...
 - attr(*, "spec")=
  .. cols(
  ..   Watershed = col_character(),
  ..   Plot = col_character(),
  ..   N = col_double(),
  ..   FQI = col_double(),
  ..   D = col_double(),
  ..   Bison = col_character(),
  ..   BurnCycle = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 
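str() is one of several base R functions for taking a quick look at a data frame. The sketch below uses a tiny made-up data frame (toyPlots, a hypothetical name) so that it runs without downloading anything; the same calls should work on konzaData.

```r
# Made-up data for illustration only
toyPlots <- data.frame(
  Plot  = c("p1", "p2", "p3", "p4"),
  N     = c(10, 12, 9, 15),
  Bison = c("Graze", "Graze", "No Graze", "No Graze")
)

dim(toyPlots)          # number of rows and columns
names(toyPlots)        # column names
head(toyPlots, 2)      # first two rows
summary(toyPlots$N)    # min, quartiles, mean, and max of one column
table(toyPlots$Bison)  # number of rows in each group
```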

Manipulating Data

Let’s calculate the mean and standard deviation of FQI in each bison group:

# Calculate the mean and standard deviation
konzaSummary <- konzaData %>% 
  # we want the mean and standard deviation within each Bison group
  group_by(Bison) %>% 
  summarise(FQI_Avg = mean(FQI),
            FQI_sd = sd(FQI),
            count = n())
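To see what this pattern produces, here is a minimal sketch that applies the same group_by() and summarise() steps to a small made-up data frame. The name toyData and its values are hypothetical and are not taken from the Konza dataset.

```r
library(dplyr)

# Made-up data: two grazing groups with three FQI values each
toyData <- data.frame(
  Bison = c("Graze", "Graze", "Graze", "No Graze", "No Graze", "No Graze"),
  FQI   = c(20, 22, 24, 18, 19, 20)
)

# One row of output per group
toySummary <- toyData %>%
  group_by(Bison) %>%
  summarise(FQI_Avg = mean(FQI),
            FQI_sd  = sd(FQI),
            count   = n())

toySummary
```

The result has one row per group: a mean of 22 and standard deviation of 2 for "Graze", and a mean of 19 and standard deviation of 1 for "No Graze", each with a count of 3.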

The tidyverse pipe (%>%) is a shortcut that makes R code easier to read and write when you want to do multiple steps in a row.

  • %>% takes the thing on the left-hand side and passes it as the first argument to the function on the right-hand side.

  • In RStudio, the keyboard shortcut for inserting the tidyverse pipe (%>%) is:

    • Windows/Linux: Ctrl + Shift + M

    • Mac: Cmd + Shift + M
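A short sketch of what the pipe does: the piped version and the nested version below are the same calculation written two ways. The values in scores are made up for illustration.

```r
library(dplyr)  # makes %>% available

scores <- c(4, 9, 16)  # made-up values

# Piped: read left to right, one step at a time
piped <- scores %>% sqrt() %>% sum()

# Nested: the same steps, read from the inside out
nested <- sum(sqrt(scores))

piped                     # 9
identical(piped, nested)  # TRUE
```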

Plotting Data

konzaData %>%
  # Set your x and y axis
  ggplot(aes(x = Bison, y = FQI)) +
  # What kind of plot do you want to make?
  geom_boxplot()

In ggplot2, you build a plot in layers.

  • The + is how you add a new layer or element to your plot.
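The layering idea can be sketched by storing a plot in an object and adding to it one step at a time. The data and object names below (toyData, p, p_box, p_box_pts) are hypothetical and used only for illustration.

```r
library(ggplot2)

# Made-up data for illustration only
toyData <- data.frame(
  Bison = rep(c("Graze", "No Graze"), each = 4),
  FQI   = c(20, 22, 24, 23, 18, 19, 20, 21)
)

# Start with the data and axes only; nothing is drawn yet
p <- ggplot(toyData, aes(x = Bison, y = FQI))

# Each + adds one more layer on top of what is already there
p_box     <- p + geom_boxplot()
p_box_pts <- p_box + geom_point()

length(p$layers)          # 0 layers
length(p_box$layers)      # 1 layer
length(p_box_pts$layers)  # 2 layers
```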

What makes a “good” plot?

Plotting Data

konzaData %>%
  # Set your x and y axis
  ggplot(aes(x = Bison, y = FQI)) +
  # What kind of plot do you want to make?
  geom_boxplot() +
  # How do we want to label our plot
  labs(title = "Effect of Bison Grazing on FQI",
       x = "Grazing Treatment",
       y = "Floristic Quality Index (FQI)")

What if we want to add all the observations as points to the boxplot?

Plotting Data

konzaData %>%
  # Set your x and y axis
  ggplot(aes(x = Bison, y = FQI)) +
  # What kind of plot do you want to make?
  geom_boxplot() +
  # How do we want to label our plot
  labs(title = "Effect of Bison Grazing on FQI",
       x = "Grazing Treatment",
       y = "Floristic Quality Index (FQI)") +
  # Let's make the points brown
  geom_jitter(color = "brown")

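That works, but some of the observations are now drawn twice: geom_boxplot() already plots outliers as points, and geom_jitter() plots every observation again. Hint: Look at the outliers.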

Plotting Data

konzaData %>%
  # Set your x and y axis
  ggplot(aes(x = Bison, y = FQI)) +
  # What kind of plot do you want to make?
  geom_boxplot(outliers = FALSE) +
  # How do we want to label our plot
  labs(title = "Effect of Bison Grazing on FQI",
       x = "Grazing Treatment",
       y = "Floristic Quality Index (FQI)") +
  # Let's make the points brown
  geom_jitter(color = "brown")

Saving Your Plot

konzaData %>%
  # Set your x and y axis
  ggplot(aes(x = Bison, y = FQI)) +
  # What kind of plot do you want to make?
  geom_boxplot(outliers = FALSE) +
  # How do we want to label our plot
  labs(title = "Effect of Bison Grazing on FQI",
       x = "Grazing Treatment",
       y = "Floristic Quality Index (FQI)") +
  # Let's make the points brown
  geom_jitter(color = "brown")
# Save your plot as a PNG file
ggsave("Konza_FQI_Boxplot.png")
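By default, ggsave() saves the most recently displayed plot at the size of the current graphics device. A sketch of a more explicit call is shown below: store the plot in an object, then pass it to ggsave() along with the size and resolution you want. The toy data, object names, dimensions, and output path are all hypothetical.

```r
library(ggplot2)

# Made-up data for illustration only
toyData <- data.frame(
  Bison = rep(c("Graze", "No Graze"), each = 4),
  FQI   = c(20, 22, 24, 23, 18, 19, 20, 21)
)

# Store the plot in an object instead of only displaying it
toyPlot <- ggplot(toyData, aes(x = Bison, y = FQI)) +
  geom_boxplot()

# Save to a temporary folder; use a path inside your project for real work
out_file <- file.path(tempdir(), "toy_boxplot.png")
ggsave(out_file, plot = toyPlot, width = 6, height = 4, units = "in", dpi = 300)

file.exists(out_file)
```

Setting width, height, and dpi explicitly means the saved file looks the same no matter how large the Plots panel happened to be.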

Saving Your R Script

Learning More

  • The Carpentries

    • Teaches foundational coding and data science skills to researchers worldwide; all lessons are published online.

    • The Carpentries Workshops @ KU:

      • Software Carpentry

        • Automate tasks using the Unix Shell, track and share your work using version control, and write software in Python or R that is readable, reusable, and reliable.
      • Data Carpentry

        • Organize, clean, and query your data using open-source tools. Reproducibly analyze and visualize your data using Python or R.
  • Spring 2026 - EVRN 420: Environmental Data Science with Dr. Zimmerman.