Style Guide

Some things about writing code are necessary for the code to run. Examples:

Other things about writing code are best-practice for readability, or are particular to the coder. Examples:

But the most important part about your code is getting it to run!

Comments

Comment your code. Just do it. All the cool kids do it.

Anything after a # on a line is a comment. R ignores it when the code runs, but it will allow you to remember what you were doing when you go back to it later. Trust me, you will forget what you were doing unless you put comments in it. Also, when you try to show your code to someone else, it will not make any sense without comments.

“But I’m the only one who reads my code! Do I still need comments?”

Yes, yes you do. Also, I highly recommend showing other people your code. They probably have the answer to the problem you spent three hours trying to solve yesterday.

Useful tip: if you want to save a bit of code for later, you can “comment out” a whole section by selecting it and pressing Ctrl+Shift+C in RStudio.
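Here is a small sketch of what commented code can look like. The variable names and numbers are made up for illustration; the point is only to show a full-line comment, an end-of-line comment, and a commented-out line.

```r
# A full-line comment: calculate the mean of some made-up fork lengths
forkLength <- c(200, 34, 58)    # an end-of-line comment works too
meanLength <- mean(forkLength)

# A "commented out" line stays in the script but never runs:
# meanLength <- median(forkLength)

meanLength
```

Removing the # from the commented-out line (or pressing the shortcut again) brings it back to life.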

Even Gandalf needs to comment his code.

Projects and Working directories

Hopefully you are already working in projects. If not, start one now! They will make working directories and file handling much, much easier.

getwd()

This prints the absolute path of our working directory. From now on we can use relative paths to refer to any file within this working directory. Anything we export using a relative path (or just a file name) will be saved here.

Using a project in RStudio will set your working directory so you don’t have to do it at the top of each script.
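To make the difference between absolute and relative paths concrete, here is a minimal sketch. The folder "data" and the file "sturgeon.csv" are hypothetical names, and nothing is read or written; the code only builds the paths.

```r
# Absolute path of the current working directory
wd <- getwd()
wd

# A relative path: "data" and "sturgeon.csv" are hypothetical names
relPath <- file.path("data", "sturgeon.csv")
relPath

# The absolute location that relative path refers to
absPath <- file.path(wd, relPath)
absPath
```

The relative path is the same on every computer, which is why scripts written inside a project are easier to share.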

Functions

R uses functions for most of its data analysis and manipulation methods. Becoming comfortable with the syntax of these functions is super important. RStudio makes this easier!

Each function requires that certain arguments be passed to it, and returns a value.

You can access the help page for a function using ? or the Help pane in RStudio.

?mean

Notice that when you start typing, RStudio has a drop down that suggests what you might be looking for.

The function ‘mean’ only requires one argument, ‘x’, which will usually be a vector of numbers.

The help page also lists two other arguments, trim and na.rm, but they have defaults. If we want to accept the defaults for those arguments (which we usually do), we just need x.

mean(c(1,2,3,4,5))

You can also code this as:

x = c(1,2,3,4,5)

mean(x)

It doesn’t have to be called “x” either:

foo = c(1,2,3,4,5)

mean(x = foo)

Because x is the first (and only required) argument of mean, we don’t have to name it. R will know what you are talking about. If there is more than one argument, unnamed arguments are matched by position, so they must be listed in the order the function defines them or R will get confused.

mean(foo)
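As a rough illustration of defaults and argument order, here is what happens when we override na.rm and trim. The vector vals is made up for this example.

```r
# Made-up data with one missing value and one outlier
vals <- c(1, 2, 3, 4, 100, NA)

# Default na.rm = FALSE: a single NA makes the result NA
withNA <- mean(vals)
withNA

# Override a default by naming the argument
noNA <- mean(vals, na.rm = TRUE)
noNA    # mean of 1, 2, 3, 4, 100 is 22

# Named arguments can be given in any order
sameAgain <- mean(na.rm = TRUE, x = vals)

# Unnamed arguments are matched by position: the second one is trim,
# so this drops the lowest and highest 20% of values before averaging
trimmed <- mean(vals, 0.2, na.rm = TRUE)
trimmed # mean of 2, 3, 4 is 3
```

Naming arguments takes a little more typing, but it makes it much harder to pass a value to the wrong argument by accident.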

You can also create your own functions. When you write your own function, you assign “function(arguments) {code}” to the name you want the function to have.

#we can quickly make some data
Sturgeon <- data.frame(species = c("white", "white", "white", "green", "green"), 
                       forkLength = c(200, 34, 58, 22, 46))

# Create a function. Be sure to use comments to say exactly what your
#arguments are
LengthFrequency <- function(data, # the data I want to analyze, as a data frame
                            speciesName) { #the species we want to analyze, as a character
  
  #first I'll subset my data to select the species I want
  df = data[which(data$species == speciesName),]
  
  #now I'll calculate mean length
  #(called meanLength so it isn't confused with base R's length() function)
  meanLength = mean(df$forkLength)
  
  #print a histogram of all the fork lengths
  hist(df$forkLength)
  
  #return the mean length
  return(meanLength) 
}
# end LengthFrequency

#now we can use our function over and over!
white = LengthFrequency(Sturgeon, "white")
white

green = LengthFrequency(Sturgeon, "green")
green

Error messages

You will get errors. Don’t be afraid of them. For beginners, error messages are often impenetrable, but if you learn to read them they can be very helpful! You can also always type them into Google.

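You will also get warnings. Warnings are printed in red, like errors, but they may or may not be bad: an error stops your code, while a warning lets it finish and flags something that might be wrong. Always read and understand them, but sometimes they are nothing to worry about.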
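Here is a minimal sketch of the difference between the two. It uses tryCatch() to capture the message text instead of printing it in red, which is just one way to look at it; the exact wording of the messages depends on your R version and language settings.

```r
# A warning: as.numeric() still returns a value (NA) but flags a problem
warnText <- tryCatch(as.numeric("abc"),
                     warning = function(cond) conditionMessage(cond))
warnText

# An error: log() cannot work with text at all, so R stops
errText <- tryCatch(log("abc"),
                    error = function(cond) conditionMessage(cond))
errText
```

If you run as.numeric("abc") and log("abc") directly in the console, you will see the same messages in red.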

#each of these produces an error or a warning; see if you can fix them (10 mins)

#create a variable 'x' and assign the letter 'y' to it
x <- y

#select the fourth column from the sturgeon data set
Sturgeon[,4]

#select the first row of the sturgeon data set
sturgeon[1,]

#calculate the mean fork length of all the sturgeon
mean(Sturgeon$species)

Packages

By now, you’ve probably already been using packages. Packages are one of the great wonders of R. Get used to them! There are lots of them, and many of them do the same things. Everyone has favorites, so use whichever you are most comfortable with. Because there are so many, different packages often have functions with the same name, which can lead to conflicts. Fortunately, when you load a package, R prints a message listing any functions it masks. You can use :: to specify which package a function comes from if there are conflicts.

library(tidyverse)

#these are two I mix up all the time
?stats::filter
?dplyr::filter
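To see why the two are easy to mix up, here is a rough sketch that calls each one with ::. The data frame fish is made up for this example, and the dplyr call is wrapped in a check so the code still runs if dplyr is not installed.

```r
# stats::filter() applies a linear filter to a numeric series
# (here, a 3-point moving average)
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
smoothed

# dplyr::filter() keeps the rows of a data frame that match a condition
fish <- data.frame(species = c("white", "white", "green"),
                   forkLength = c(200, 34, 22))

if (requireNamespace("dplyr", quietly = TRUE)) {
  whiteOnly <- dplyr::filter(fish, species == "white")
  print(whiteOnly)
}
```

Writing the package name in front of the function means the result does not depend on the order in which you loaded your packages.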

So that’s an introduction. Now let’s dive into it!